# Topic 04 - Problem 02: Multi-Level Grouping & Aggregation

---

## 1. About the Problem

In real-world datasets, we often need to **group by more than one column** to extract deeper insights.

Multi-level grouping helps answer questions like:
- Average salary by **department AND gender**
- Total sales by **region AND product**
- Performance metrics by **year AND category**

In this problem, I will perform **grouping using multiple columns** and apply multiple aggregation functions.

---


## 2. Solution Code

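    import pandas as pd

    # Sample employee data: two categorical columns and two numeric columns
    data = {
        "department": ["IT", "IT", "HR", "HR", "Finance", "Finance", "IT"],
        "gender": ["Male", "Female", "Female", "Male", "Male", "Female", "Male"],
        "salary": [60000, 65000, 48000, 52000, 78000, 75000, 70000],
        "experience": [3, 4, 2, 5, 8, 7, 6],
    }

    df = pd.DataFrame(data)

    # Group by both columns: one group per (department, gender) pair
    grouped_data = df.groupby(["department", "gender"])

    # Apply several functions to salary and a single function to experience
    grouped_data.agg({
        "salary": ["mean", "max", "min"],
        "experience": "mean",
    })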


| department | gender | salary (mean) | salary (max) | salary (min) | experience (mean) |
| --- | --- | --- | --- | --- | --- |
| Finance | Female | 75000.0 | 75000 | 75000 | 7.0 |
| Finance | Male | 78000.0 | 78000 | 78000 | 8.0 |
| HR | Female | 48000.0 | 48000 | 48000 | 2.0 |
| HR | Male | 52000.0 | 52000 | 52000 | 5.0 |
| IT | Female | 65000.0 | 65000 | 65000 | 4.0 |
| IT | Male | 65000.0 | 70000 | 60000 | 4.5 |


---

## 3. Explanation (What is happening)

- **groupby(["department", "gender"])**  
  → Groups data using **two categorical columns**

- **mean, max & min salary**  
  → Shows the average, highest, and lowest salary per group

- **mean experience**  
  → Average experience level for each department-gender combination

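This produces a **hierarchical (multi-index) output**: the rows are indexed by (department, gender) pairs, and the columns are (column, function) pairs such as (salary, mean). This shape is very common in analytics.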
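To make the hierarchical structure concrete, here is a minimal sketch of how values could be read back out of the result. It rebuilds the same DataFrame so it runs on its own; the variable names (`result`, `it_male_mean`, `finance_rows`) are my own and only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "department": ["IT", "IT", "HR", "HR", "Finance", "Finance", "IT"],
    "gender": ["Male", "Female", "Female", "Male", "Male", "Female", "Male"],
    "salary": [60000, 65000, 48000, 52000, 78000, 75000, 70000],
    "experience": [3, 4, 2, 5, 8, 7, 6],
})

result = df.groupby(["department", "gender"]).agg(
    {"salary": ["mean", "max", "min"], "experience": "mean"}
)

# The row index has two levels, named after the grouping columns
row_levels = list(result.index.names)

# Each column label is a (column, function) tuple
column_labels = list(result.columns)

# A single cell: the row key and the column key are both tuples
it_male_mean = result.loc[("IT", "Male"), ("salary", "mean")]

# Selecting on the outer level only returns every gender in that department
finance_rows = result.loc["Finance"]
```

Passing just the outer key (`"Finance"`) drops that level from the index, so `finance_rows` is indexed by gender alone.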

---

## 4. Summary / Takeaways

By solving this problem, I learned:

1. How to group data using **multiple columns**
2. How multi-level grouping reveals deeper patterns
3. How to apply different aggregation functions simultaneously
4. Why hierarchical indexing is common in real datasets
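As a possible alternative to the dictionary form, pandas also supports named aggregation, which gives each output column a flat name. The sketch below is one way this might look on the same data; the output column names (`avg_salary`, `max_salary`, `min_salary`, `avg_experience`) are hypothetical names chosen for illustration, not part of the original solution.

```python
import pandas as pd

df = pd.DataFrame({
    "department": ["IT", "IT", "HR", "HR", "Finance", "Finance", "IT"],
    "gender": ["Male", "Female", "Female", "Male", "Male", "Female", "Male"],
    "salary": [60000, 65000, 48000, 52000, 78000, 75000, 70000],
    "experience": [3, 4, 2, 5, 8, 7, 6],
})

flat = (
    df.groupby(["department", "gender"])
    .agg(
        # new_column_name=(source_column, aggregation_function)
        avg_salary=("salary", "mean"),
        max_salary=("salary", "max"),
        min_salary=("salary", "min"),
        avg_experience=("experience", "mean"),
    )
    # Move department and gender from the index back into ordinary columns
    .reset_index()
)

print(flat)
```

The result has single-level column names and a default integer index, which tends to be easier to export or merge than the two-level version.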

Multi-level aggregation is heavily used in:
- Business intelligence dashboards
- Customer segmentation
- Feature creation for ML models

Multi-level grouping is a core **pandas** skill that comes up often in Data Science work.

Next, I will work with **custom aggregations and resetting multi-index outputs**.
