## Step 1: Hello, Data!

In this step, I load the raw sales transactions CSV file into a DataFrame and display the first three rows to understand the structure of the data.


In [1]:
# ---------- importing the pandas library ----------
import pandas as pd

# ---------- reading the CSV file ----------
file1 = r"D:\Conestoga\Machine Learning Programming\Lab-2\sales_data_with_coupons.csv"
data1 = pd.read_csv(file1)

# ---------- displaying the first 3 rows of the data ----------
print(data1.head(3))



   sno                            Region Country   Item Type Sales Channel  \
0  1.0      middle east and north africa   Libya   Cosmetics       Offline   
1  2.0                     north america  CANADA  Vegetables        Online   
2  3.0    Middle East and North Africa     LIBYA   Baby Food       Offline   

  Order Priority  Order Date     Order ID   Ship Date  Units Sold  Unit Price  \
0              M   18-Oct-14  686800706.0  31-10-2014      8446.0      437.20   
1              M  07-11-2011  185941302.0  2011-12-08      3018.0      154.06   
2              C   31-Oct-16  246222341.0  2016-12-09      1517.0      255.28   

   Unit Cost  Total Revenue  Total Cost  Total Profit coupon_code  
0     263.33     3692591.20  2224085.18    1468506.02      GF24TA  
1      90.93      464953.08   274426.74     190526.34      10AMSP  
2     159.42      387259.76   241840.14     145419.62      TEYPEU  


## Step 2: Pick the Right Container

A `dict` is simple, but it doesn't group behavior with the data. A `namedtuple` is cleaner than a dict, but its fields are immutable and adding custom methods means subclassing it.
A `class` is the best fit here because it organizes the data and lets me add my own methods, such as `clean()` and `total()`.
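
To make the comparison concrete, here is a minimal sketch of the same record in all three containers. The names `sale_dict`, `SaleTuple`, and `Sale` are hypothetical and used only for illustration; the values are copied from the first row of the preview in Step 1.

```python
from collections import namedtuple

# Option 1: a dict stores the values, but any calculation lives outside it.
sale_dict = {"product": "Cosmetics", "price": 437.20, "quantity": 8446.0}
dict_total = sale_dict["price"] * sale_dict["quantity"]

# Option 2: a namedtuple adds attribute access, but its fields are read-only.
SaleTuple = namedtuple("SaleTuple", ["product", "price", "quantity"])
sale_tuple = SaleTuple("Cosmetics", 437.20, 8446.0)
tuple_total = sale_tuple.price * sale_tuple.quantity


# Option 3: a class keeps the data and the behavior in one place.
class Sale:
    def __init__(self, product, price, quantity):
        self.product = product
        self.price = price
        self.quantity = quantity

    def total(self):
        # Revenue for this sale: unit price multiplied by units sold.
        return self.price * self.quantity


sale_obj = Sale("Cosmetics", 437.20, 8446.0)
print(round(dict_total, 2), round(tuple_total, 2), round(sale_obj.total(), 2))
```

All three hold the same values; only the class version carries its own `total()` logic with it.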

## Step 3: Transaction Class and OO Structure


In [3]:
# ---------- creating a class to represent one transaction ----------

class Transaction:
    def __init__(self, row):
        # ---------- saving each column into the object ----------
        self.order_date = row["Order Date"]
        self.customer_id = row["Order ID"]  # using Order ID as "customer id" substitute
        self.product = row["Item Type"]
        self.price = row["Unit Price"]
        self.quantity = row["Units Sold"]
        self.coupon_code = row["coupon_code"]
        self.shipping_city = row["Country"]  # using Country as "city" substitute

# ---------- creating a list to store all transaction objects ----------
transaction_list = []

# ---------- going through each row and converting to a Transaction object ----------
for index, row in data1.iterrows():
    obj = Transaction(row)
    transaction_list.append(obj)

# ---------- printing the first transaction as a check ----------
print(vars(transaction_list[0]))


{'order_date': '18-Oct-14', 'customer_id': 686800706.0, 'product': 'Cosmetics', 'price': 437.2, 'quantity': 8446.0, 'coupon_code': 'GF24TA', 'shipping_city': 'Libya'}


### Transaction Class and OO Data Structure
In this step, I created a Python class to represent each row in the sales data.
I used real column names like "Order Date", "Item Type", and "coupon_code" from the CSV.
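
Step 2 names `clean()` and `total()` as methods the class could carry. The block below is only a sketch of what they might look like: the class name `TransactionSketch`, the cleaning rule (trim whitespace and title-case the text fields), and the definition of `total()` as price times quantity are all assumptions, not the notebook's implementation. A plain dict stands in for a DataFrame row so the block runs on its own.

```python
class TransactionSketch:
    """Hypothetical variant of Transaction with the two methods named in Step 2."""

    def __init__(self, row):
        # row can be any mapping keyed by the CSV column names
        self.product = row["Item Type"]
        self.price = row["Unit Price"]
        self.quantity = row["Units Sold"]
        self.shipping_city = row["Country"]

    def clean(self):
        # Assumed cleaning rule: trim whitespace and use title case,
        # so that "LIBYA" and "Libya" end up identical.
        self.product = str(self.product).strip().title()
        self.shipping_city = str(self.shipping_city).strip().title()
        return self

    def total(self):
        # Assumed definition: unit price multiplied by units sold.
        return self.price * self.quantity


# Values copied from the third row of the preview in Step 1.
sample_row = {
    "Item Type": "Baby Food",
    "Unit Price": 255.28,
    "Units Sold": 1517.0,
    "Country": "LIBYA",
}
txn = TransactionSketch(sample_row).clean()
print(txn.shipping_city, round(txn.total(), 2))
```

Returning `self` from `clean()` is a design choice that lets construction and cleaning be chained in one expression.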


## Step 4: Bulk Loader


In [5]:
# ---------- importing List for return type hint ----------
from typing import List

# ---------- defining a function to load all transactions ----------
def load_transactions(dataframe) -> List[Transaction]:
    # ---------- creating a list to hold transaction objects ----------
    result = []

    # ---------- going through each row in the dataframe ----------
    for index, row in dataframe.iterrows():
        obj = Transaction(row)      # ---------- make Transaction object ----------
        result.append(obj)          # ---------- add to list ----------

    return result

# ---------- using the function to load data ----------
transactions = load_transactions(data1)

# ---------- print one item to check ----------
print(vars(transactions[0]))


{'order_date': '18-Oct-14', 'customer_id': 686800706.0, 'product': 'Cosmetics', 'price': 437.2, 'quantity': 8446.0, 'coupon_code': 'GF24TA', 'shipping_city': 'Libya'}


In this step, I wrote a function called `load_transactions()` that turns each row of the data into a `Transaction` object.
The function goes through the DataFrame row by row and builds a list of all transactions.
It keeps the code clean, and I can now call it any time I want to load the data as objects.
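
The row-by-row loop can also be written as a list comprehension. The following is a sketch under stated assumptions rather than the notebook's loader: `load_rows` and `MiniTransaction` are hypothetical names, and the function accepts any iterable of row mappings, so a list of dicts stands in for the DataFrame here. With pandas, the records returned by `DataFrame.to_dict(orient="records")` would have the same shape.

```python
from typing import Any, Iterable, List, Mapping


class MiniTransaction:
    """Hypothetical stand-in for Transaction, trimmed to three fields."""

    def __init__(self, row: Mapping[str, Any]) -> None:
        self.product = row["Item Type"]
        self.price = row["Unit Price"]
        self.quantity = row["Units Sold"]


def load_rows(rows: Iterable[Mapping[str, Any]]) -> List[MiniTransaction]:
    # Build one object per row mapping, preserving the input order.
    return [MiniTransaction(row) for row in rows]


# Two rows copied from the preview in Step 1.
sample_rows = [
    {"Item Type": "Cosmetics", "Unit Price": 437.20, "Units Sold": 8446.0},
    {"Item Type": "Vegetables", "Unit Price": 154.06, "Units Sold": 3018.0},
]
loaded = load_rows(sample_rows)
print(len(loaded), loaded[0].product)
```

Accepting any iterable of mappings keeps the loader easy to test with a handful of hand-written rows before pointing it at the full dataset.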
