# Sales Data Analysis

This notebook contains tasks for analyzing sales data. Fill in the code cells with your own solutions.

## Task 1: Load Data
- Load `sales.csv` into a DataFrame.
- Display the first 5 rows.

In [1]:
import pandas as pd

df = pd.read_csv('csv/sales.csv')
df.head()

Unnamed: 0,date,region,product,quantity,price,revenue
0,2023-02-08,East,Product D,3,64.33,193.0
1,2023-02-21,East,Product B,11,26.36,290.0
2,2023-01-29,West,Product A,11,18.55,204.0
3,2023-01-15,South,Product D,18,20.06,361.0
4,2023-02-12,South,Product E,15,13.6,204.0
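
The `Unnamed: 0` column in the output above usually means the CSV was saved together with its row index. A minimal sketch of one way to avoid it, assuming that is the case for `sales.csv`: the helper name `load_sales` and the inline `SAMPLE_CSV` text are illustrative only, with rows copied from the output above.

```python
import io

import pandas as pd

# Illustrative sample: the empty first header field is the saved row index.
SAMPLE_CSV = """,date,region,product,quantity,price,revenue
0,2023-02-08,East,Product D,3,64.33,193.0
1,2023-02-21,East,Product B,11,26.36,290.0
2,2023-01-29,West,Product A,11,18.55,204.0
"""


def load_sales(source):
    """Read sales data, using the first column as the index and parsing dates."""
    return pd.read_csv(source, index_col=0, parse_dates=['date'])


sample = load_sales(io.StringIO(SAMPLE_CSV))
print(sample.columns.tolist())
print(sample['date'].dtype)
```

With `index_col=0` the saved index becomes the DataFrame index instead of a data column, and `parse_dates` turns `date` into a datetime column, which helps if the dates are later resampled or filtered.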


## Task 2: Total Revenue per Product
- Group by product and calculate total revenue.

In [2]:
df.groupby('product')['revenue'].sum()

product
Product A     9250.0
Product B     8749.0
Product C     6487.0
Product D     9128.0
Product E    10579.0
Name: revenue, dtype: float64

## Task 3: Average Daily Revenue
- Calculate the mean revenue for each date.

In [3]:
df.groupby('date')['revenue'].mean()

date
2023-01-01    146.750000
2023-01-02    304.500000
2023-01-03    237.000000
2023-01-04    303.750000
2023-01-05    251.500000
2023-01-06    236.000000
2023-01-07    196.750000
2023-01-08    368.400000
2023-01-09    219.600000
2023-01-10    450.000000
2023-01-11    307.000000
2023-01-12    207.000000
2023-01-13    205.500000
2023-01-14    204.000000
2023-01-15    235.714286
2023-01-16    234.000000
2023-01-17    395.000000
2023-01-18    317.333333
2023-01-19     70.000000
2023-01-20    395.000000
2023-01-21    146.500000
2023-01-22    193.000000
2023-01-23    194.666667
2023-01-24    213.142857
2023-01-25    193.000000
2023-01-26    220.000000
2023-01-27    266.000000
2023-01-28    203.333333
2023-01-29    214.750000
2023-01-30    184.500000
2023-01-31     95.000000
2023-02-01    292.666667
2023-02-02    281.333333
2023-02-03    203.500000
2023-02-04    220.000000
2023-02-05    133.750000
2023-02-06    331.500000
2023-02-07     21.000000
2023-02-08    180.571429
2023-02-09    108.00
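
"Average daily revenue" can be read in two ways: the mean revenue per sale within each date (what the cell above computes), or the average of the daily totals. The sketch below contrasts the two on a small hypothetical DataFrame named `toy`; the numbers are made up for illustration and do not come from `sales.csv`.

```python
import pandas as pd

# Hypothetical toy data: two sales on the first day, one on the second.
toy = pd.DataFrame({
    'date': ['2023-01-01', '2023-01-01', '2023-01-02'],
    'revenue': [100.0, 300.0, 50.0],
})

# Reading 1: mean revenue per sale, computed separately for each date.
mean_per_sale_by_date = toy.groupby('date')['revenue'].mean()

# Reading 2: total revenue per date, then the average of those daily totals.
daily_totals = toy.groupby('date')['revenue'].sum()
average_daily_total = daily_totals.mean()

print(mean_per_sale_by_date)
print(average_daily_total)
```

The first reading gives one value per date; the second gives a single number for the whole period.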

## Task 4: Multiple Aggregations
- Group by region and perform `sum`, `mean`, `max`, and `count` aggregations on revenue.

In [4]:
df.groupby('region')['revenue'].agg(['sum', 'mean', 'max', 'count'])

region,sum,mean,max,count
Central,6673.0,256.653846,495.0,26
East,10547.0,234.377778,483.0,45
North,10573.0,199.490566,483.0,53
South,6797.0,205.969697,450.0,33
West,9603.0,228.642857,494.0,42
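
If more descriptive column names are wanted, pandas also supports named aggregation, where each keyword argument becomes an output column. A small sketch on hypothetical data: the DataFrame `toy` and the column names `total`, `average`, `largest`, and `orders` are chosen for illustration.

```python
import pandas as pd

# Hypothetical toy data: two sales in East, one in West.
toy = pd.DataFrame({
    'region': ['East', 'East', 'West'],
    'revenue': [100.0, 300.0, 50.0],
})

# Each keyword names an output column; each value names the aggregation to apply.
summary = toy.groupby('region')['revenue'].agg(
    total='sum',
    average='mean',
    largest='max',
    orders='count',
)
print(summary)
```

The result has the same values as the list form `agg(['sum', 'mean', 'max', 'count'])`, only with column labels that describe the business meaning.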


## Task 5: Sort Products by Revenue
- Sort products in descending order of total revenue.

In [5]:
df.groupby('product')['revenue'].sum().sort_values(ascending=False)

product
Product E    10579.0
Product A     9250.0
Product D     9128.0
Product B     8749.0
Product C     6487.0
Name: revenue, dtype: float64

## (Optional) Task 6: Portfolio Function
- Create a function `sales_report(filename)` that:
  - Returns a DataFrame with revenue per product.
  - Saves the result to `sales_report.csv`.

In [7]:
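from pandas import DataFrame

def sales_report(filename: str) -> DataFrame:
    df = pd.read_csv(filename)
    report = df.groupby('product')['revenue'].sum().reset_index()
    # Save inside the function, as the task asks; index=False omits the row index
    report.to_csv('csv/sales_report.csv', index=False)
    return report

report = sales_report('csv/sales.csv')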
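
One possible variant of the function takes the output location as an argument, which makes it easier to check on small inputs. This is a sketch only: the `output_path` parameter and the toy data are assumptions added for illustration, and the round trip runs in a temporary directory so no real files are touched.

```python
import tempfile
from pathlib import Path

import pandas as pd


def sales_report(filename, output_path='sales_report.csv'):
    """Return total revenue per product and write it to output_path.

    output_path is a hypothetical parameter added for this sketch.
    """
    df = pd.read_csv(filename)
    report = df.groupby('product', as_index=False)['revenue'].sum()
    # index=False keeps the row index out of the saved file.
    report.to_csv(output_path, index=False)
    return report


# Round trip on toy data inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / 'sales.csv'
    target = Path(tmp) / 'sales_report.csv'
    pd.DataFrame({
        'product': ['Product A', 'Product B', 'Product A'],
        'revenue': [10.0, 5.0, 2.5],
    }).to_csv(source, index=False)

    report = sales_report(source, target)
    saved = pd.read_csv(target)

print(saved)
```

Reading the file back confirms that the saved CSV contains only the `product` and `revenue` columns.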