## Week 1: Python Foundations & Tabular Thinking

- Goal: Understand how Python represents tabular business data and how to extract insights from it.


In [74]:
# dataset
sales = [
    ("Alice", "Phone", 300),
    ("Bob", "Laptop", 700),
    ("Alice", "Tablet", 200),
    ("Mary", "Phone", 300),
    ("Bob", "Phone", 300)
]


### Basic Aggregation (Customer Revenue)

In [75]:
# total sales per customer

customer_sales = {}

for customer, product, amount in sales:
    customer = customer.lower()
    if customer in customer_sales:
        customer_sales[customer] += amount
    else:
        customer_sales[customer] = amount

In [76]:
for customer, total in customer_sales.items():
    print(f"{customer.capitalize()} total sales ${total}")

Alice total sales $500
Bob total sales $1000
Mary total sales $300
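
Once the totals exist, the top customer can be read off programmatically instead of by eye. The following is a minimal sketch that rebuilds the same totals and picks the largest one; the name `top_customer` and the use of `dict.get` are illustrative choices, not part of the original cells.

```python
# Same sample rows as the dataset at the top of this notebook.
sales = [
    ("Alice", "Phone", 300),
    ("Bob", "Laptop", 700),
    ("Alice", "Tablet", 200),
    ("Mary", "Phone", 300),
    ("Bob", "Phone", 300),
]

# Rebuild per-customer totals; dict.get supplies 0 the first time a customer appears.
customer_sales = {}
for customer, product, amount in sales:
    key = customer.lower()
    customer_sales[key] = customer_sales.get(key, 0) + amount

# max() with key=... returns the dictionary key whose total is largest.
top_customer = max(customer_sales, key=customer_sales.get)
print(f"{top_customer.capitalize()} leads with ${customer_sales[top_customer]}")
# Bob leads with $1000
```

If two customers tie, `max()` returns the first one it encounters, so a tie-breaking rule would need to be stated separately.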


**Business Interpretation**
- Who generated the most revenue?

Bob generated the highest revenue: $1,000 in total, ahead of Alice ($500) and Mary ($300).

- Why is this structure similar to an Excel table?

It’s similar to Excel because each tuple is a row and each position in the tuple is a column.
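
The row/column analogy can be made concrete with a short sketch. The column names below are hypothetical labels (the original tuples are unlabeled), and `zip(*sales)` is one possible way to pull out a whole column.

```python
# Same sample rows as the dataset at the top of this notebook.
sales = [
    ("Alice", "Phone", 300),
    ("Bob", "Laptop", 700),
    ("Alice", "Tablet", 200),
    ("Mary", "Phone", 300),
    ("Bob", "Phone", 300),
]

# Hypothetical header row; position 0, 1, 2 play the role of Excel columns A, B, C.
columns = ("customer", "product", "amount")

# One tuple is one row; pairing it with the header labels each cell.
first_row = dict(zip(columns, sales[0]))
print(first_row)
# {'customer': 'Alice', 'product': 'Phone', 'amount': 300}

# zip(*sales) transposes rows into columns, like selecting an entire column.
customers, products, amounts = zip(*sales)
print(amounts)       # (300, 700, 200, 300, 300)
print(sum(amounts))  # 1800
```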

- How could this be grouped differently?

***Grouping*** means deciding which question the business is asking.

*This data could be grouped in many ways, including:*

1. Group by Product (Which product makes the most money?)
2. Group by Customer + Product (What does each customer prefer?)
3. Group by Frequency (How often does each item appear?)

Businesses use the answers to these questions to make data-driven decisions about inventory, marketing focus, product expansion, personalization, recommendations, bundling, and retention strategy.

Grouping = choosing a business question.
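
To illustrate how a different question leads to a different grouping key, here is a hedged sketch of two of the alternatives: grouping by customer and product together, and grouping by frequency. It uses `collections.defaultdict` and `collections.Counter` from the standard library, which the original cells do not use; the variable names are illustrative.

```python
from collections import Counter, defaultdict

# Same sample rows as the dataset at the top of this notebook.
sales = [
    ("Alice", "Phone", 300),
    ("Bob", "Laptop", 700),
    ("Alice", "Tablet", 200),
    ("Mary", "Phone", 300),
    ("Bob", "Phone", 300),
]

# Question: what does each customer buy? Use a (customer, product) tuple as the key.
revenue_by_pair = defaultdict(int)
for customer, product, amount in sales:
    revenue_by_pair[(customer, product)] += amount

# Question: how often is each product sold? Count rows instead of summing amounts.
orders_per_product = Counter(product for _, product, _ in sales)

for (customer, product), total in revenue_by_pair.items():
    print(f"{customer} / {product}: ${total}")
print(orders_per_product["Phone"])  # 3
```

The loop is the same in both cases; only the key (and whether amounts are summed or rows are counted) changes with the question.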

**Note**
Data does not have meaning.
Questions give data meaning.

This is why analysts are paid more than people who just “know Python.”

### Product Revenue Analysis

In [77]:
# Mini exercise: total sales per product

product_sales = {}
for customer, product, amount in sales:
    product = product.lower()
    if product in product_sales:
        product_sales[product] += amount
    else:
        product_sales[product] = amount

In [78]:
# display

for product, total in product_sales.items():
    print(f"{product.capitalize()} total sales ${total}")

Phone total sales $900
Laptop total sales $700
Tablet total sales $200


**Questions for mini exercise**

What question does this answer?

- This answers the question "Which product generates the most revenue?"

What decision could a business make from it?

- The business should prioritize inventory availability and marketing spend for phones, while investigating whether tablets need repositioning, bundling, or discontinuation.

If tablets continue underperforming for 3 months, what two actions could the business take?

- If tablets continue underperforming, the business could bundle them with higher-performing products to increase exposure, or consider discontinuation to reduce inventory and opportunity costs.
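
A claim such as "tablets are underperforming" is easier to defend with a ranking and a revenue share. The sketch below is one way to compute both from the sample data; treating share of total revenue as the performance measure is an assumption made for illustration, not a rule stated in this notebook.

```python
# Same sample rows as the dataset at the top of this notebook.
sales = [
    ("Alice", "Phone", 300),
    ("Bob", "Laptop", 700),
    ("Alice", "Tablet", 200),
    ("Mary", "Phone", 300),
    ("Bob", "Phone", 300),
]

# Total revenue per product.
product_sales = {}
for _, product, amount in sales:
    product_sales[product] = product_sales.get(product, 0) + amount

total_revenue = sum(product_sales.values())

# Sort products from highest to lowest revenue.
ranked = sorted(product_sales.items(), key=lambda item: item[1], reverse=True)

for product, total in ranked:
    share = total / total_revenue
    print(f"{product}: ${total} ({share:.1%} of revenue)")
# Phone: $900 (50.0% of revenue)
# Laptop: $700 (38.9% of revenue)
# Tablet: $200 (11.1% of revenue)
```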

### Customer Behavior Interpretation

In [79]:
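# Build a profile per customer: products bought and total spend
customer_preference = {}
for customer, product, amount in sales:
    # Normalize capitalization so "alice" and "Alice" share one entry
    customer = customer.capitalize()
    product = product.capitalize()
    # First time this customer appears: start an empty profile
    if customer not in customer_preference:
        customer_preference[customer] = {
            "products": [],
            "total_spend": 0
        }
    customer_preference[customer]["products"].append(product)
    customer_preference[customer]["total_spend"] += amount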


### Customer Behavior Classification

Assumptions/Thresholds:
- Many products: 2 or more distinct products

- High spend: >= 500

- Low spend: < 300


**NOTE**
Thresholds represent business-defined assumptions that translate qualitative concepts (such as “high value” or “low spend”) into quantitative rules. Without thresholds, code may execute correctly but produce results that are not meaningful or actionable.

***Key idea:***
Code enforces rules; **thresholds define meaning**.
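
One way to keep thresholds visible as assumptions is to give them names rather than scattering numbers through the logic. This is a minimal sketch using the thresholds listed above; the constant names and the `classify` function are hypothetical and not part of the original cells.

```python
# Business-defined assumptions, named so they are easy to find and revisit.
MANY_PRODUCTS_MIN = 2   # "many products": 2 or more distinct products
HIGH_SPEND_MIN = 500    # "high spend": total spend >= 500
LOW_SPEND_MAX = 300     # "low spend": total spend strictly below 300


def classify(product_count, total_spend):
    """Return a behavior label; rules are checked in order, first match wins."""
    if product_count >= MANY_PRODUCTS_MIN and total_spend < LOW_SPEND_MAX:
        return "price sensitive"
    if total_spend >= HIGH_SPEND_MIN:
        return "high value"
    return "normal"


print(classify(2, 500))  # high value
print(classify(1, 300))  # normal
print(classify(2, 250))  # price sensitive
```

The last call uses made-up numbers, since no customer in the sample data falls below the low-spend threshold.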


In [80]:
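for customer, data in customer_preference.items():
    # set() drops repeat purchases so each product is counted once
    product_count = len(set(data["products"]))
    total_spend = data["total_spend"]
    # Rules are checked in order, so the first matching rule wins
    if product_count >= 2 and total_spend < 300:
        behavior = "price sensitive"
    elif total_spend >= 500:
        behavior = "high value"
    else:
        behavior = "normal"
    # Store the label on the customer's profile for later display
    data["behavior_flag"] = behavior
    print(product_count, total_spend, behavior)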

2 500 high value
2 1000 high value
1 300 normal


In [81]:
# Display the result

for customer, data in customer_preference.items():
    print(
        f"{customer.capitalize()} → "
        f"Products: {set(data['products'])}, "
        f"Spend: {data['total_spend']}, "
        f"Behavior: {data['behavior_flag']}"
    )


Alice → Products: {'Phone', 'Tablet'}, Spend: 500, Behavior: high value
Bob → Products: {'Phone', 'Laptop'}, Spend: 1000, Behavior: high value
Mary → Products: {'Phone'}, Spend: 300, Behavior: normal


**Changing thresholds can completely change conclusions without changing the data.**

That’s why analysts:

- Document assumptions
- Revisit thresholds with stakeholders
- Never pretend thresholds are “facts”
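
The sensitivity to thresholds can be checked directly. The sketch below classifies the same three customers twice: once with the high-spend threshold of 500 used in this notebook, and once with a hypothetical stricter value of 600. The function names and the alternative threshold are assumptions chosen for illustration.

```python
# Per-customer summary taken from the output above: (distinct products, total spend).
customers = {"Alice": (2, 500), "Bob": (2, 1000), "Mary": (1, 300)}


def classify(product_count, total_spend, high_spend_min, low_spend_max=300):
    """Same rule order as the notebook, with the thresholds passed in as parameters."""
    if product_count >= 2 and total_spend < low_spend_max:
        return "price sensitive"
    if total_spend >= high_spend_min:
        return "high value"
    return "normal"


def label_all(high_spend_min):
    """Label every customer under one choice of high-spend threshold."""
    return {
        name: classify(count, spend, high_spend_min)
        for name, (count, spend) in customers.items()
    }


current = label_all(500)   # threshold used in this notebook
stricter = label_all(600)  # hypothetical alternative

for name in customers:
    print(f"{name}: {current[name]} -> {stricter[name]}")
# Alice: high value -> normal
# Bob: high value -> high value
# Mary: normal -> normal
```

Alice's data is identical in both runs; only the definition of "high value" moved.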