Title: Grouping & Aggregating Data using Pandas<br>
Objective: Learn how to group data and perform aggregations on these groups.

Task 1: Grouping by a Single Column<br>

Task: Group the dataset by 'region' and calculate total sales per region.<br>
Steps:<br>
10. Load the dataset.<br>
11. Use groupby('region') on the DataFrame.<br>
12. Apply .sum() to the 'sales' column.

In [None]:
import pandas as pd

# Step 10: Sample dataset including 'region'
data = {
    'region': ['North', 'South', 'East', 'North', 'East', 'South', 'West'],
    'sales': [100, 150, 200, 130, 170, 180, 160],
    'profit': [20, 30, 40, 25, 35, 45, 33],
    'quantity': [1, 2, 2, 1, 3, 2, 1]
}

df = pd.DataFrame(data)

# Display the dataset
print("Dataset:")
print(df)

# Step 11: Group by 'region'
grouped = df.groupby('region')

# Step 12: Calculate total sales per region
total_sales_per_region = grouped['sales'].sum()

print("\nTotal sales per region:")
print(total_sales_per_region)
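
The grouped result above is a Series indexed by region. As a minimal sketch of a common variation, the cell below keeps 'region' as an ordinary column with `as_index=False` and sorts the totals from highest to lowest. The variable name `sales_by_region` is an illustrative choice, not part of the original exercise.

```python
import pandas as pd

# Same sample values as Task 1 (only the columns needed here)
df = pd.DataFrame({
    'region': ['North', 'South', 'East', 'North', 'East', 'South', 'West'],
    'sales': [100, 150, 200, 130, 170, 180, 160],
})

# as_index=False keeps 'region' as a regular column instead of the index
sales_by_region = (
    df.groupby('region', as_index=False)['sales']
      .sum()
      .sort_values('sales', ascending=False)  # largest total first
      .reset_index(drop=True)                 # renumber rows after sorting
)

print(sales_by_region)
```

Keeping the group key as a column is often convenient when the result feeds into a merge or a plot, while the default indexed form is handy for lookups by region.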


Task 2: Grouping by Multiple Columns<br>

Task: Group the dataset by 'region' and 'category', then find the average sales.<br>
Steps:<br>
13. Group by ['region', 'category'].<br>
14. Use .mean() on the 'sales' column.<br>
15. Examine the structure of the result: a Series indexed by a ('region', 'category') MultiIndex.

In [None]:
import pandas as pd

# Sample dataset including 'region' and 'category'
data = {
    'region': ['North', 'South', 'East', 'North', 'East', 'South', 'West', 'West'],
    'category': ['A', 'A', 'B', 'B', 'A', 'B', 'A', 'B'],
    'sales': [100, 150, 200, 130, 170, 180, 160, 140],
    'profit': [20, 30, 40, 25, 35, 45, 33, 22],
    'quantity': [1, 2, 2, 1, 3, 2, 1, 2]
}

df = pd.DataFrame(data)

# Display the dataset
print("Dataset:")
print(df)

# Step 13 & 14: Group by ['region', 'category'] and calculate average sales
grouped_mean_sales = df.groupby(['region', 'category'])['sales'].mean()

# Step 15: Examine the result
print("\nAverage sales grouped by region and category:")
print(grouped_mean_sales)

# Optional: reset_index() turns the MultiIndex levels into columns, giving a flat DataFrame
grouped_df = grouped_mean_sales.reset_index()
print("\nGrouped result as DataFrame:")
print(grouped_df)
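
To make the structure of a multi-column grouping more concrete, the sketch below inspects the index of the result, selects a single group by its key, and reshapes the Series into a region-by-category table with `unstack()`. The names `mean_sales` and `sales_table` are illustrative, and the data repeats the Task 2 sample.

```python
import pandas as pd

# Same sample values as Task 2 (only the columns needed here)
df = pd.DataFrame({
    'region': ['North', 'South', 'East', 'North', 'East', 'South', 'West', 'West'],
    'category': ['A', 'A', 'B', 'B', 'A', 'B', 'A', 'B'],
    'sales': [100, 150, 200, 130, 170, 180, 160, 140],
})

mean_sales = df.groupby(['region', 'category'])['sales'].mean()

# The result is a Series whose index has two levels, one per grouping column
print(type(mean_sales).__name__)
print(list(mean_sales.index.names))

# Select one group by its (region, category) key
print(mean_sales.loc[('North', 'A')])

# unstack() moves the 'category' level into columns: rows are regions
sales_table = mean_sales.unstack('category')
print(sales_table)
```

Whether to use `reset_index()` or `unstack()` depends on the next step: the flat form suits further filtering and merging, while the table form is easier to read side by side.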


Task 3: Aggregating Multiple Functions<br>

Task: Group data by 'category' and apply multiple aggregation functions (sum and count) on 'quantity'.<br>
Steps:<br>
16. Group by 'category'.<br>
17. Use .agg(['sum', 'count']) on 'quantity'.<br>
18. Inspect the result: a DataFrame with one row per category and one column per aggregation function ('sum' and 'count').

In [None]:
import pandas as pd

# Sample dataset including 'category'
data = {
    'region': ['North', 'South', 'East', 'North', 'East', 'South', 'West', 'West'],
    'category': ['A', 'A', 'B', 'B', 'A', 'B', 'A', 'B'],
    'sales': [100, 150, 200, 130, 170, 180, 160, 140],
    'profit': [20, 30, 40, 25, 35, 45, 33, 22],
    'quantity': [1, 2, 2, 1, 3, 2, 1, 2]
}

df = pd.DataFrame(data)

# Step 16 & 17: Group by 'category' and aggregate with sum and count on 'quantity'
agg_result = df.groupby('category')['quantity'].agg(['sum', 'count'])

# Step 18: Analyze the result
print("Aggregation result (sum and count of quantity grouped by category):")
print(agg_result)
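
Passing a list to `.agg()` names the output columns after the functions. As a hedged sketch of an alternative, pandas' named aggregation lets each output column have a descriptive name and draw on a different source column. The column names `total_quantity`, `order_count`, and `mean_sales` are illustrative choices, not part of the original exercise.

```python
import pandas as pd

# Same sample values as Task 3 (only the columns needed here)
df = pd.DataFrame({
    'category': ['A', 'A', 'B', 'B', 'A', 'B', 'A', 'B'],
    'sales': [100, 150, 200, 130, 170, 180, 160, 140],
    'quantity': [1, 2, 2, 1, 3, 2, 1, 2],
})

# Named aggregation: output_name=(source_column, function)
summary = df.groupby('category').agg(
    total_quantity=('quantity', 'sum'),
    order_count=('quantity', 'count'),   # counts non-null values per group
    mean_sales=('sales', 'mean'),
)

print(summary)
```

Note that 'count' ignores missing values, so on data with NaNs it can differ from the group size returned by `.size()`.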
