# Topic 04 - Problem 07: Ranking Values Within Groups

---

## 1. About the Problem

In many datasets, I need to **rank values within each group**, not across the entire dataset.

Examples:
- Rank employees by salary **inside each department**
- Rank students by score **inside each class**
- Rank products by sales **inside each category**

In this problem, I will rank employees **based on salary within their department**.

Group-wise ranking is useful for:
- Feature engineering
- Performance comparison
- Recommendation systems

---


## 2. Solution Code

```python
import pandas as pd

data = {
    "employee": ["A", "B", "C", "D", "E", "F", "G", "H"],
    "department": ["IT", "IT", "HR", "HR", "Finance", "Finance", "IT", "IT"],
    "salary": [60000, 65000, 48000, 52000, 78000, 75000, 70000, 70000],
}

df = pd.DataFrame(data)

# Rank salaries within each department (1 = highest salary, ties share a rank)
df["salary_rank"] = df.groupby("department")["salary"].rank(ascending=False, method="dense")

print(df)
```

Output:

```text
  employee department  salary  salary_rank
0        A         IT   60000          3.0
1        B         IT   65000          2.0
2        C         HR   48000          2.0
3        D         HR   52000          1.0
4        E    Finance   78000          1.0
5        F    Finance   75000          2.0
6        G         IT   70000          1.0
7        H         IT   70000          1.0
```


---

## 3. Explanation (What is happening)

- **groupby("department")["salary"]**  
  → Groups salary values by department

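- **rank(ascending=False)**  
  → Ranks in descending order, so the highest salary in each department gets rank 1

- **method="dense"**  
  → Tied values share the same rank, and the next rank follows with no gap  
    Example: 1, 1, 2 (not 1, 1, 3, which `method="min"` would give)

- **transform-style behavior**  
  → `rank()` returns a Series with the same index as the original DataFrame, so the result can be assigned directly as a new column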
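To make the effect of `method` concrete, here is a small sketch that reuses the same data and ranks it with `dense`, `min`, and `first`. The column names `rank_dense`, `rank_min`, and `rank_first` are my own labels for illustration, not part of the original solution.

```python
import pandas as pd

df = pd.DataFrame({
    "employee": ["A", "B", "C", "D", "E", "F", "G", "H"],
    "department": ["IT", "IT", "HR", "HR", "Finance", "Finance", "IT", "IT"],
    "salary": [60000, 65000, 48000, 52000, 78000, 75000, 70000, 70000],
})

# Rank within each department once per tie-handling method
# (column names below are illustrative, not from the original solution)
for method in ["dense", "min", "first"]:
    df[f"rank_{method}"] = df.groupby("department")["salary"].rank(
        ascending=False, method=method
    )

# Show only the IT rows, where G and H tie at 70000
print(df[df["department"] == "IT"])
```

In the IT department, G and H tie at 70000. With `dense` the ranks are 1, 1, 2, 3. With `min` they are 1, 1, 3, 4, because the tie uses up rank 2. With `first` every row gets a distinct rank, and ties are broken by order of appearance.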

---

## 4. Summary / Takeaways

By solving this problem, I learned:

1. How to rank values inside each group
2. Why group-wise ranking can be more informative than global ranking when groups differ in scale
3. How ranking can become a powerful ML feature
4. The difference between ranking methods (`dense`, `min`, `first`)
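The second takeaway can be checked with a short sketch that puts a global rank next to the department rank on the same data. The column names `global_rank` and `dept_rank` are assumptions made for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "employee": ["A", "B", "C", "D", "E", "F", "G", "H"],
    "department": ["IT", "IT", "HR", "HR", "Finance", "Finance", "IT", "IT"],
    "salary": [60000, 65000, 48000, 52000, 78000, 75000, 70000, 70000],
})

# Global rank: every employee is compared against the whole dataset
df["global_rank"] = df["salary"].rank(ascending=False, method="dense")

# Group-wise rank: each employee is compared only within their own department
df["dept_rank"] = df.groupby("department")["salary"].rank(
    ascending=False, method="dense"
)

print(df.sort_values("global_rank"))
```

Employee D is a good example: 52000 is only the sixth-highest distinct salary overall, but it is the top salary in HR. The global rank hides that, while the department rank shows it.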

This problem combines **analytical thinking with practical pandas skills**, so it is worth putting on GitHub.

Next, I will move toward:
- Percentile ranking
- Group-wise cumulative calculations

