# 07 – Mini Project

A short worked example that combines NumPy and pandas to simulate and analyze product sales.

---

Part of the [Foundations: Python, R & SQL](../README.md) repository.


## Objective

Simulate a small dataset of product sales and compute basic indicators:
- total revenue per region
- units sold per product
- revenue impact of a 10% discount scenario

## 1. Simulate Data

In [1]:
import pandas as pd
import numpy as np

np.random.seed(42)  # fix the seed so the simulated data is reproducible

products = ['USB-C Cable', 'HDMI Adapter', 'Wireless Mouse']
regions = ['North', 'South', 'East', 'West']

data = {
    'product': np.random.choice(products, size=20),
    'region': np.random.choice(regions, size=20),
    'units_sold': np.random.randint(1, 10, size=20),  # upper bound is exclusive: values 1 to 9
    'unit_price': np.random.uniform(15, 50, size=20).round(2)  # prices rounded to cents
}

df = pd.DataFrame(data)
df.head()

|   | product | region | units_sold | unit_price |
|---|---|---|---|---|
| 0 | Wireless Mouse | West | 3 | 36.35 |
| 1 | USB-C Cable | West | 7 | 44.16 |
| 2 | Wireless Mouse | North | 4 | 21.07 |
| 3 | Wireless Mouse | North | 9 | 28.69 |
| 4 | USB-C Cable | West | 3 | 21.38 |
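
Because the seed is fixed, re-running the simulation should reproduce the same table. The following is a minimal sketch of how that could be checked, together with the value ranges implied by the sampling calls. The `simulate_sales` helper and its parameters are hypothetical wrappers introduced for illustration; they are not part of the original notebook.

```python
import numpy as np
import pandas as pd

PRODUCTS = ['USB-C Cable', 'HDMI Adapter', 'Wireless Mouse']
REGIONS = ['North', 'South', 'East', 'West']


def simulate_sales(seed=42, n_rows=20):
    """Hypothetical helper wrapping the simulation cell above."""
    np.random.seed(seed)  # reset the global legacy RNG so the draws repeat
    return pd.DataFrame({
        'product': np.random.choice(PRODUCTS, size=n_rows),
        'region': np.random.choice(REGIONS, size=n_rows),
        # randint's upper bound is exclusive, so units fall in 1..9
        'units_sold': np.random.randint(1, 10, size=n_rows),
        'unit_price': np.random.uniform(15, 50, size=n_rows).round(2),
    })


first = simulate_sales()
second = simulate_sales()

# Same seed, same draws: the two frames should be identical
print(first.equals(second))
# Units should stay within 1..9 and prices within 15..50
print(first['units_sold'].between(1, 9).all())
print(first['unit_price'].between(15, 50).all())
```

Wrapping the cell in a function is only one option; it mainly makes it easy to try other seeds or row counts without editing the notebook cell.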


## 2. Revenue Calculation

In [2]:
df['revenue'] = df['units_sold'] * df['unit_price']
df.head()

|   | product | region | units_sold | unit_price | revenue |
|---|---|---|---|---|---|
| 0 | Wireless Mouse | West | 3 | 36.35 | 109.05 |
| 1 | USB-C Cable | West | 7 | 44.16 | 309.12 |
| 2 | Wireless Mouse | North | 4 | 21.07 | 84.28 |
| 3 | Wireless Mouse | North | 9 | 28.69 | 258.21 |
| 4 | USB-C Cable | West | 3 | 21.38 | 64.14 |
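
Multiplying floats can leave tiny representation errors in the stored values even when the displayed table looks clean. As a sketch, the snippet below rebuilds the five rows displayed above by hand and rounds the product to cents. The `sample` frame is an illustrative stand-in for `df`, and rounding is an optional extra step that the notebook does not perform.

```python
import pandas as pd

# The five rows displayed in the output above, typed in by hand
sample = pd.DataFrame({
    'product': ['Wireless Mouse', 'USB-C Cable', 'Wireless Mouse',
                'Wireless Mouse', 'USB-C Cable'],
    'region': ['West', 'West', 'North', 'North', 'West'],
    'units_sold': [3, 7, 4, 9, 3],
    'unit_price': [36.35, 44.16, 21.07, 28.69, 21.38],
})

# Vectorized element-wise multiply, rounded to cents
sample['revenue'] = (sample['units_sold'] * sample['unit_price']).round(2)

print(sample[['units_sold', 'unit_price', 'revenue']])
```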


## 3. Aggregated Indicators

In [3]:
# Total revenue per region
df.groupby('region')['revenue'].sum().sort_values(ascending=False)

region
West     1330.40
North     954.71
East      915.59
South     377.74
Name: revenue, dtype: float64
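
If several indicators per region are needed at once, named aggregation can compute them in a single `groupby` pass. The sketch below re-creates the simulated data so that it runs on its own. The output column names (`total_revenue`, `total_units`, `n_sales`, `revenue_share`) are illustrative choices, not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Re-create the simulated data from section 1 so the block is self-contained
np.random.seed(42)
products = ['USB-C Cable', 'HDMI Adapter', 'Wireless Mouse']
regions = ['North', 'South', 'East', 'West']
df = pd.DataFrame({
    'product': np.random.choice(products, size=20),
    'region': np.random.choice(regions, size=20),
    'units_sold': np.random.randint(1, 10, size=20),
    'unit_price': np.random.uniform(15, 50, size=20).round(2),
})
df['revenue'] = df['units_sold'] * df['unit_price']

# Named aggregation: several indicators per region in one pass
region_summary = (
    df.groupby('region')
      .agg(total_revenue=('revenue', 'sum'),
           total_units=('units_sold', 'sum'),
           n_sales=('revenue', 'count'))
      .sort_values('total_revenue', ascending=False)
)

# Each region's share of overall revenue
region_summary['revenue_share'] = (
    region_summary['total_revenue'] / region_summary['total_revenue'].sum()
)

print(region_summary.round(2))
```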

In [4]:
# Units sold per product
df.groupby('product')['units_sold'].sum().sort_values(ascending=False)

product
Wireless Mouse    51
USB-C Cable       28
HDMI Adapter      27
Name: units_sold, dtype: int32
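
The two groupings above can also be crossed to show units sold per product within each region. A minimal sketch using `pivot_table` follows; the simulation is repeated so that the block is self-contained, and `units_matrix` is a name introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Re-create the simulated data from section 1 so the block is self-contained
np.random.seed(42)
products = ['USB-C Cable', 'HDMI Adapter', 'Wireless Mouse']
regions = ['North', 'South', 'East', 'West']
df = pd.DataFrame({
    'product': np.random.choice(products, size=20),
    'region': np.random.choice(regions, size=20),
    'units_sold': np.random.randint(1, 10, size=20),
    'unit_price': np.random.uniform(15, 50, size=20).round(2),
})

# Rows: products, columns: regions, cells: total units sold.
# fill_value=0 covers product/region pairs with no sales.
units_matrix = df.pivot_table(
    index='product',
    columns='region',
    values='units_sold',
    aggfunc='sum',
    fill_value=0,
)

print(units_matrix)
```

Summing each row of `units_matrix` should give back the per-product totals from the `groupby` above.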

## 4. Promotion Scenario

Apply a 10% discount to every product and recompute revenue, holding units sold constant.

In [5]:
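# 10% discount: equivalent to unit_price - 0.1 * unit_price
df['discounted_price'] = df['unit_price'] * 0.9
# Revenue at the discounted price, with units sold unchanged
df['new_revenue'] = df['discounted_price'] * df['units_sold']
# Difference between original and discounted total revenue
impact = df['revenue'].sum() - df['new_revenue'].sum()
print(f"Revenue loss due to discount: ${impact:.2f}")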

Revenue loss due to discount: $357.84
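
With units held constant, a flat discount at rate r lowers every row's revenue by the same fraction, so the loss is simply r times total revenue. The regional totals above sum to 3578.44, and 10% of that is 357.84, which matches the printed figure. The sketch below generalizes the cell to several rates. The `discount_loss` helper is hypothetical, and like the scenario above it assumes that demand does not respond to the price change.

```python
import numpy as np
import pandas as pd

# Re-create the simulated data from section 1 so the block is self-contained
np.random.seed(42)
products = ['USB-C Cable', 'HDMI Adapter', 'Wireless Mouse']
regions = ['North', 'South', 'East', 'West']
df = pd.DataFrame({
    'product': np.random.choice(products, size=20),
    'region': np.random.choice(regions, size=20),
    'units_sold': np.random.randint(1, 10, size=20),
    'unit_price': np.random.uniform(15, 50, size=20).round(2),
})
df['revenue'] = df['units_sold'] * df['unit_price']


def discount_loss(frame, rate):
    """Hypothetical helper: revenue lost at a flat discount rate.

    Assumes units sold do not change when the price drops.
    """
    discounted_price = frame['unit_price'] * (1 - rate)
    new_revenue = (discounted_price * frame['units_sold']).sum()
    return frame['revenue'].sum() - new_revenue


# Compare a few candidate rates side by side
for rate in (0.05, 0.10, 0.20):
    print(f"{rate:.0%} discount -> revenue loss ${discount_loss(df, rate):.2f}")
```

A more realistic scenario would let `units_sold` respond to the lower price, but that requires a demand assumption that the original example does not make.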


## Summary

- Simulated a 20-row product sales dataset with NumPy
- Calculated key metrics with pandas
- Applied a pricing scenario and measured the impact