Title: Grouping & Aggregating Data using Pandas
Objective: Learn how to group data and perform aggregations on these groups.

Task 1: Grouping by a Single Column

Task: Group the dataset by 'region' and calculate total sales per region.
Steps:
1. Load the dataset into a DataFrame.
2. Use groupby('region') on the DataFrame.
3. Apply .sum() to the 'sales' column.

In [4]:
import pandas as pd

data = {
    'region': ['North', 'South', 'East', 'North', 'East'],
    'sales': [250, 300, 150, 400, 500],
    'profit': [50, 60, 30, 80, 90],
    'quantity': [2, 3, 1, 5, 4]
}

df = pd.DataFrame(data)
grouped = df.groupby('region')
total_sales_per_region = grouped['sales'].sum()
print("Total Sales per Region:")
print(total_sales_per_region)


Total Sales per Region:
region
East     650
North    650
South    300
Name: sales, dtype: int64
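
The result above is a Series indexed by region, not a DataFrame. As a minimal sketch on a trimmed copy of the same data, passing as_index=False to groupby keeps 'region' as a regular column, which is often easier to merge or export. The variable names sales_series and sales_frame are illustrative and do not appear in the original cell.

```python
import pandas as pd

df = pd.DataFrame({
    'region': ['North', 'South', 'East', 'North', 'East'],
    'sales': [250, 300, 150, 400, 500],
})

# Default behaviour: the group keys become the index of a Series.
sales_series = df.groupby('region')['sales'].sum()

# as_index=False keeps 'region' as an ordinary column of a DataFrame.
sales_frame = df.groupby('region', as_index=False)['sales'].sum()

print(type(sales_series).__name__)
print(sales_frame)
```

Calling sales_series.reset_index() should give an equivalent flat table if the Series has already been computed.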


Task 2: Grouping by Multiple Columns

Task: Group the dataset by 'region' and 'category', then find the average sales.
Steps:
1. Group by ['region', 'category'].
2. Use .mean() on the 'sales' column.
3. Examine the structure of the result: a Series indexed by a (region, category) MultiIndex.

In [5]:
import pandas as pd

data = {
    'region': ['North', 'South', 'East', 'North', 'East', 'South'],
    'category': ['A', 'A', 'B', 'B', 'A', 'B'],
    'sales': [250, 300, 150, 400, 500, 200],
    'profit': [50, 60, 30, 80, 90, 40],
    'quantity': [2, 3, 1, 5, 4, 2]
}

df = pd.DataFrame(data)
grouped_avg_sales = df.groupby(['region', 'category'])['sales'].mean()
print("Average Sales by Region and Category:")
print(grouped_avg_sales)


Average Sales by Region and Category:
region  category
East    A           500.0
        B           150.0
North   A           250.0
        B           400.0
South   A           300.0
        B           200.0
Name: sales, dtype: float64
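
To examine the structure of this result, the sketch below rebuilds the same Series and inspects its two-level index. It then uses unstack to pivot the 'category' level into columns, which is one possible way to view the result as a table. The name wide is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'region': ['North', 'South', 'East', 'North', 'East', 'South'],
    'category': ['A', 'A', 'B', 'B', 'A', 'B'],
    'sales': [250, 300, 150, 400, 500, 200],
})

grouped_avg_sales = df.groupby(['region', 'category'])['sales'].mean()

# The index has one level per grouping column.
print(list(grouped_avg_sales.index.names))

# A single group is selected with a (region, category) tuple.
print(grouped_avg_sales.loc[('East', 'A')])

# Pivot the 'category' level into columns: one row per region.
wide = grouped_avg_sales.unstack('category')
print(wide)
```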


Task 3: Aggregating Multiple Functions

Task: Group data by 'category' and apply multiple aggregation functions (sum and count) on 'quantity'.
Steps:
1. Group by 'category'.
2. Use .agg(['sum', 'count']) on 'quantity'.
3. Inspect the result: each aggregation function becomes its own column, with one row per category.

In [6]:
import pandas as pd

data = {
    'region': ['North', 'South', 'East', 'North', 'East', 'South'],
    'category': ['A', 'B', 'A', 'B', 'B', 'A'],
    'sales': [250, 300, 150, 400, 500, 200],
    'quantity': [2, 3, 1, 5, 4, 2]
}

df = pd.DataFrame(data)

grouped = df.groupby('category')
aggregated_quantity = grouped['quantity'].agg(['sum', 'count'])
print("Aggregated Quantity by Category:")
print(aggregated_quantity)


Aggregated Quantity by Category:
          sum  count
category            
A           5      3
B          12      3
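
When different columns need different functions, named aggregation (available in pandas 0.25 and later) is one option. The sketch below reuses the Task 3 data; the output column names total_quantity, order_count, and mean_sales are hypothetical labels chosen for illustration, not part of the original exercise.

```python
import pandas as pd

df = pd.DataFrame({
    'region': ['North', 'South', 'East', 'North', 'East', 'South'],
    'category': ['A', 'B', 'A', 'B', 'B', 'A'],
    'sales': [250, 300, 150, 400, 500, 200],
    'quantity': [2, 3, 1, 5, 4, 2],
})

# Each keyword is an output column: (source column, aggregation function).
summary = df.groupby('category').agg(
    total_quantity=('quantity', 'sum'),
    order_count=('quantity', 'count'),
    mean_sales=('sales', 'mean'),
)
print(summary)
```

Compared with passing a list of function names, this form lets each output column be named explicitly and draw on a different source column.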
