Week 2 Goal: Create a Python program that uses if/else statements, loops, and functions to process order and supplier data. 
Load the supply chain dataset for the first time and explore its structure.

Task 1 : Using if/else statements, write code that: 
•   Takes an order value as input 
•	Checks if the order exceeds a threshold (10000) 
•	Prints an appropriate approval message

In [1]:
order_value = int(input("Enter the order value: "))

if order_value > 10000:
    print("Order value exceeded")
else:
    print("Order value approved")

Order value approved


Task 2: Create a list of order values and use a for loop to: 
•	Check each order 
•	Apply the same logic from Task 1 to each one 
•	Print results for all orders 

In [2]:
order_list = [2000, 56000, 3400, 1000, 54990, 10000] # Sample list of order values

for order_value in order_list: # iterate through every item in the list
    if order_value > 10000:
        print(f"{order_value}: Order value exceeded")
    else:
        print(f"{order_value}: Order value approved")

2000: Order value approved
56000: Order value exceeded
3400: Order value approved
1000: Order value approved
54990: Order value exceeded
10000: Order value approved


Task 3: Create a function called check_supplier_status that: 
•	Takes supplier_name and on_time_rate as parameters 
•	Returns a status based on the on_time_rate 
•	Test it with at least 3 different suppliers 

In [3]:
def check_supplier_status(supplier_name, on_time_rate): # a check_supplier_status function was defined
        
        """
        Checks the status of the supplier based on the time rate
        Takes supplier name and on time rate as parameters
        Returns a status based on the time rate.

        """
        if on_time_rate >= 0.90:
            return f"{supplier_name}: Preferred supplier"
        elif on_time_rate >= 0.80:
            return f"{supplier_name}: Approved supplier"
        else:
            return f"{supplier_name}: Pending approval"
        
print(check_supplier_status.__doc__) # prints the documentation of the function
print(check_supplier_status("SupplierA", 0.50))
print(check_supplier_status("SupplierB", 0.97))
print(check_supplier_status("SupplierC", 0.81))
print(check_supplier_status("SupplierD", 0.45))
print(check_supplier_status("SupplierE", 0.67))
print(check_supplier_status("SupplierF", 1))


Checks the status of the supplier based on the time rate
Takes supplier name and on time rate as parameters
Returns a status based on the time rate.


SupplierA: Pending approval
SupplierB: Preferred supplier
SupplierC: Approved supplier
SupplierD: Pending approval
SupplierE: Pending approval
SupplierF: Preferred supplier


Task 4: You have been provided with a CSV file with supply chain data containing columns like: order_id, product_name, quantity, supplier_name, cost, delivery_date, status. 
Write code to: 
•	Open and read the CSV file 
•	Print the headers 
•	Print the total number of lines 
•	Print the first 5 data rows 

In [4]:
import csv

with open ("supply_chain_dataset.csv", "r") as csv_file: # Opens the file and closes it when done to avoid data corruption
    csv_reader = csv.reader(csv_file) # Creates an object that reads the line by line

    # To read the headers
    headers = next(csv_reader) # Parsing the very first row into a variable for display and keeping it
    print("Headers:", headers)

#     # To print the total number of lines
    data = list(csv_reader)  # We must read all data first to be able to access its content
    total_number = len(data) # Total number of rows
    print(f"The total number of lines in this file: {total_number}")

    # To print the first 5 data rows
    for row in data[:5]:
        print(row)

        # OR

    # Using an incremental value
    counter = 1
    print("The first five rows: ")
    for row in data[:5]:
        print(f"Row {counter}: {row}")
        counter += 1

        # OR

    # Using the while loop
    row_num = 0
    while row_num < 5:
        print(f"Row {row_num + 1}, {data[row_num]}")
        row_num += 1

Headers: ['order_id', 'order_date', 'product_category', 'quantity', 'unit_price', 'total_cost', 'supplier_name', 'supplier_location', 'lead_time_days', 'warehouse_destination', 'delivery_date', 'actual_delivery_days', 'delivery_status', 'on_time', 'product_defects', 'return_status']
The total number of lines in this file: 3000
['ORD-0001', '11/28/2022', 'Fasteners', '94', '64.49', '6062.06', 'SupplierE', 'Mexico', '32', 'Warehouse_3', '2022-12-30', '32', 'Delayed', 'Yes', '0.0', 'No Return']
['ORD-0002', '2022-02-20', 'Fasteners', '16', '76.63', '1226.08', 'Supplier E', 'Mexico', '17', 'Warehouse_2', '2022-03-09', '17', 'Delivered', 'Yes', '2.0', 'Partial Return']
['ORD-0003', '2022-11-03', 'Control Systems', '25', '2159.06', '53976.5', 'Supplier A', 'USA', '17', 'Warehouse_1', '2022-11-20', '17', 'Delivered', 'Yes', '0.0', '']
['ORD-0004', '2022-12-15', 'Valves', '33', '774.64', '25563.12', 'supplierc', 'China', '36', 'Warehouse_1', '2023-01-20', '36', 'Delivered', 'Yes', '0.0', 'No R

Task 5: Using loops and logic, answer these questions by writing code: 
•	How many total rows of data are in the file? 
•	How many unique suppliers are there? 
•	What is the highest order cost? 
•	What is the lowest order cost? 
•	How many orders have a specific status (e.g., "Delivered")? 
Print each answer clearly labeled. 

In [15]:
import csv
with open ("supply_chain_dataset.csv", "r") as file: # Open and close the file when done
    csv_reader = csv.reader(file)  # converts each rows to a dictionary
    data = list(csv_reader) # Converts it to a list of dictionaries
    total_num = len(data)
    print(f"Total number of rows in the file: {total_num}")

# To print the unique suppliers 
# We use the set data type to select distinct rows

unique_suppliers = set(row[6] for row in data) # Row is used to iterate through the dictionary "supplier_name"
total_unique_suppliers = len(unique_suppliers)
print(f"Total unique suppliers are {total_unique_suppliers}")

# To print the highest order cost
# highest_order_cost = max(float(row['total_cost']) for row in data) # Data is always in strings so it is necessary to change it to float
# String comparison uses alphabetical order so apparently 500 is higher than 10000 while number comparison uses actual value
# print(f"The highest order cost: {highest_order_cost} ")

# OR

with open ("supply_chain_dataset.csv", "r") as file:
    csv_reader = csv.reader(file)
    next(csv_reader) # No variable because I'm not keeping the headers
    data = list(csv_reader)

    highest_order_cost = max(float(row[5]) for row in data)
    print(f"The highest order cost: {highest_order_cost}")

# To print lowest order cost

with open("supply_chain_dataset.csv", "r") as file:
    csv_reader = csv.reader(file)
    next(csv_reader)
    data = list(csv_reader)

    lowest_order_cost = min(float(row[5]) for row in data) 
    print(f"The lowest order cost: {lowest_order_cost:.2f}")

# Number of orders with specific status
with open ("supply_chain_dataset.csv", "r") as file:
    csv_reader = csv.DictReader(file)
    data = list(csv_reader)

    delivered = 0
    cancelled = 0
    delayed = 0
    for row in data:
        if row['delivery_status'] == "Delivered":
            delivered += 1
        elif row['delivery_status'] == "Cancelled":
            cancelled += 1
        else:
            delayed += 1

    print(f"Number of delivered orders: {delivered} orders")
    print(f"Number of cancelled orders: {cancelled} orders")
    print(f"Number of delayed orders: {delayed} orders")

Total number of rows in the file: 3001
Total unique suppliers are 16
The highest order cost: 1720595.4
The lowest order cost: -207.32
Number of delivered orders: 2340 orders
Number of cancelled orders: 64 orders
Number of delayed orders: 596 orders


EXTRA CHALLENGE 1
Write a function that takes an order value and supplier reliability score, then returns different approval levels based on both factors. 

In [16]:
def approval_level (order_value, supplier_reliability_score):
    """
    Takes an order value and supplier reliability score, then returns different approval levels based on both factors
    """
    if order_value > 10000 or supplier_reliability_score > 70:
        return "Approved"
    elif order_value > 5000 and supplier_reliability_score > 50:
        return "Pending"
    elif order_value < 5000 or supplier_reliability_score == 0:
        return "Rejected"
    else:
        return "Cancelled"
    
print(approval_level(20000, 45))
print(approval_level(100000, 78))

Approved
Approved


Challenge 2: 
Loop through the dataset and find all orders from a specific supplier. Count how many and calculate their total value

In [None]:
import csv

with open ("supply_chain_dataset.csv", "r") as file:
    csv_reader = csv.DictReader(file)
    data = list(csv_reader)

    count = 0
    total_value = 0
    for order in data:
        # print(order['supplier_name'])
        if order['supplier_name'] in ['SupplierE', 'Supplier E', 'suppliere']:
            count =+ 1
            total_value += float(order['total_cost'])

    print(f"Number of orders from SupplierE: {count}")
    print(f"The total value of orders from SupplierE is {total_value}")


Number of orders from SupplierA: 1
The total value of orders from SupplierA is 4766913.091896022


Challenge 3: 
Loop through the dataset and identify any rows with missing data. Print which rows have problems and what's missing. 

In [5]:
import csv # imports the python's csv library
with open ("supply_chain_dataset.csv", "r") as file: # opens and closes the csv file in read mode
    csv_reader = csv.DictReader(file) # converts each row to a dictionary
    data = list(csv_reader) # converts the dictionaries to a list of dictionaries

    for key, value in enumerate(data): # gives both the row number(incremental) and the data
        if "" in value.values() or None in value.values():
            missing_column = [] # stores all the columns with missing data 
            for column_name, column_value in value.items():
                if column_value == "" or column_value == None:
                    missing_column.append(column_name)
            print(f"The row {key} has a missing value{missing_column}")

The row 2 has a missing value['return_status']
The row 9 has a missing value['product_defects']
The row 29 has a missing value['return_status']
The row 35 has a missing value['product_defects']
The row 49 has a missing value['product_defects']
The row 51 has a missing value['product_defects', 'return_status']
The row 54 has a missing value['return_status']
The row 57 has a missing value['return_status']
The row 63 has a missing value['product_defects']
The row 65 has a missing value['return_status']
The row 90 has a missing value['return_status']
The row 91 has a missing value['product_defects']
The row 95 has a missing value['return_status']
The row 108 has a missing value['product_defects']
The row 138 has a missing value['return_status']
The row 139 has a missing value['product_defects']
The row 140 has a missing value['product_defects']
The row 170 has a missing value['return_status']
The row 180 has a missing value['product_defects']
The row 200 has a missing value['return_status'