# Pandas Notebook 2: GroupBy & Sorting
"Playing with LEGO blocks made of data!"

## Today's Fun Activities  
1. **GroupBy**: Like sorting LEGO by color 🟡🔴🔵  
2. **Sorting**: Ordering toys by size 📏  
3. **Real Play**: Analyzing a toy store's sales   

## 🟡🔴🔵 GroupBy Basics  
Imagine you dumped out a LEGO box:  
- **Step 1**: Gather same-color bricks together  
- **Step 2**: Count/measure each pile  
- **Step 3**: Draw conclusions ("I have more blue blocks!")  

In [1]:
import pandas as pd

toys = pd.DataFrame({
    "Color": ["Red", "Blue", "Red", "Green", "Blue"],
    "Size": [2, 3, 1, 4, 2],
    "Price": [5, 8, 3, 7, 6]
})

# Group by color (make LEGO piles)
color_groups = toys.groupby("Color")

# Count each pile
print("Number of toys per color:\n", color_groups.size())

Number of toys per color:
 Color
Blue     2
Green    1
Red      2
dtype: int64
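
To peek inside the piles themselves, here is a minimal sketch that loops over the groups and counts each one by hand. The names `piles` and `manual_counts` are made up for this illustration; the idea is just that iterating over a GroupBy hands you one (color, mini-DataFrame) pair per pile.

```python
import pandas as pd

toys = pd.DataFrame({
    "Color": ["Red", "Blue", "Red", "Green", "Blue"],
    "Size": [2, 3, 1, 4, 2],
    "Price": [5, 8, 3, 7, 6],
})

# Step 1: gather same-color bricks together.
# Iterating over a GroupBy yields (key, sub-DataFrame) pairs, one per pile.
piles = {color: pile for color, pile in toys.groupby("Color")}

# Step 2: count each pile by hand (hypothetical helper dict, for illustration).
manual_counts = {color: len(pile) for color, pile in piles.items()}

# Step 3: look at what ended up in each pile.
for color, pile in piles.items():
    print(f"{color}: {len(pile)} toys, row labels {list(pile.index)}")
```

The hand-made counts should agree with `color_groups.size()`, which does the same counting in a single call.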


## 📏 Sorting from Big to Small  
Just like lining up action figures by height!  

In [2]:
# Sort by size (descending)
sorted_toys = toys.sort_values("Size", ascending=False)
print("\nToys from biggest to smallest:\n", sorted_toys)


Toys from biggest to smallest:
    Color  Size  Price
3  Green     4      7
1   Blue     3      8
0    Red     2      5
4   Blue     2      6
2    Red     1      3
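
Two toys share Size 2, so sorting by size alone does not say which of them should come first. A minimal sketch, assuming you want the pricier toy first whenever sizes tie: pass a list of columns and a matching list of `ascending` flags. The name `sorted_with_tiebreak` is only for illustration.

```python
import pandas as pd

toys = pd.DataFrame({
    "Color": ["Red", "Blue", "Red", "Green", "Blue"],
    "Size": [2, 3, 1, 4, 2],
    "Price": [5, 8, 3, 7, 6],
})

# Sort by Size (biggest first); among equal sizes, sort by Price (priciest first).
sorted_with_tiebreak = toys.sort_values(
    ["Size", "Price"], ascending=[False, False]
)
print(sorted_with_tiebreak)

# Optional: renumber the rows 0..n-1 once the new order is final.
renumbered = sorted_with_tiebreak.reset_index(drop=True)
```

With this tie-breaker the Blue toy (price 6) lands ahead of the Red toy (price 5) in the Size 2 pair.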


## Analyzing Store Sales  
"Which color sells best? What's the average price?"  

In [3]:
# Group by color, then summarize each group
color_stats = toys.groupby("Color").agg({
    "Price": ["mean", "max"],  # Average and highest price per color
    "Size": "count"            # Number of toys per color (non-missing Size values)
})
print("\nColor stats:\n", color_stats)


Color stats:
       Price      Size
       mean max count
Color                
Blue    7.0   8     2
Green   7.0   7     1
Red     4.0   5     2
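
The table above has two header rows because the dictionary form of `agg` builds two-level column names. If flat, self-describing names are easier to play with, one option is pandas' named aggregation. This is a sketch, and the column names `avg_price`, `max_price`, and `n_toys` are invented here.

```python
import pandas as pd

toys = pd.DataFrame({
    "Color": ["Red", "Blue", "Red", "Green", "Blue"],
    "Size": [2, 3, 1, 4, 2],
    "Price": [5, 8, 3, 7, 6],
})

# Named aggregation: new_column=(source_column, function)
flat_stats = toys.groupby("Color").agg(
    avg_price=("Price", "mean"),  # average price per color
    max_price=("Price", "max"),   # highest price per color
    n_toys=("Size", "count"),     # number of toys per color
)
print(flat_stats)

# Flat names make single-value lookups straightforward
blue_average = flat_stats.loc["Blue", "avg_price"]
```

The numbers are the same as in `color_stats`; only the column labels change.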


## Playtime!  
1. Group `toys` by **size** and find average price per size  
2. Sort toys by **price** (cheapest first)  
3. **Challenge**: Find the most expensive toy in each color group  

*(Solutions below - peek if stuck!)*  

In [5]:
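# 1. Average price for each size
print(toys.groupby("Size")["Price"].mean())

# 2. Cheapest toys first (ascending order is the default)
print(toys.sort_values("Price"))

# 3. idxmax() finds the row label of the priciest toy per color; .loc fetches those rows
print(toys.loc[toys.groupby("Color")["Price"].idxmax()])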

Size
1    3.0
2    5.5
3    8.0
4    7.0
Name: Price, dtype: float64
   Color  Size  Price
2    Red     1      3
0    Red     2      5
4   Blue     2      6
3  Green     4      7
1   Blue     3      8
   Color  Size  Price
1   Blue     3      8
3  Green     4      7
0    Red     2      5
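
One thing to know about the challenge solution: if two toys in a color tie for the top price, `idxmax()` reports only the first one. A sketch of an alternative that keeps every tied toy, assuming that is what you want, compares each price with its group's maximum using `transform`. The names `group_max` and `top_with_ties` are illustrative.

```python
import pandas as pd

toys = pd.DataFrame({
    "Color": ["Red", "Blue", "Red", "Green", "Blue"],
    "Size": [2, 3, 1, 4, 2],
    "Price": [5, 8, 3, 7, 6],
})

# transform("max") returns one value per original row:
# the highest price found in that row's color group.
group_max = toys.groupby("Color")["Price"].transform("max")

# Keep every row whose price equals its group's maximum (ties included).
top_with_ties = toys[toys["Price"] == group_max]
print(top_with_ties)
```

In this toy box no prices tie, so the result holds the same three toys as solution 3, just listed in their original row order rather than alphabetically by color.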
