# Lesson 3.4: Grouping & Aggregation

## Much Like Laravel's groupBy()

```php
// Laravel
$orders->groupBy('region')->map(fn($group) => $group->sum('amount'));
```
```python
# Pandas
df.groupby('region')['amount'].sum()
```
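To see what the pandas one-liner returns, here is a minimal sketch on a small, made-up orders table (the `orders` data and its values are assumptions for illustration, not part of the lesson's dataset). One difference from Laravel: `groupby()` on its own does not build the groups. It returns a lazy GroupBy object, and the work happens only when an aggregation such as `.sum()` is applied.

```python
import pandas as pd

# Hypothetical orders data, invented for this illustration
orders = pd.DataFrame({
    'region': ['North', 'South', 'North', 'South'],
    'amount': [100, 50, 25, 80],
})

grouped = orders.groupby('region')   # lazy GroupBy object, nothing computed yet
totals = grouped['amount'].sum()     # Series indexed by region

print(type(grouped).__name__)
print(totals)
```

The result is a Series whose index holds the group keys, so `totals['North']` plays the role of `$result['North']` in the Laravel version.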

In [None]:
import pandas as pd
import numpy as np

# Water filter data across regions
df = pd.DataFrame({
    'filter_id': [f'F{i:03d}' for i in range(1, 13)],
    'region': ['North', 'South', 'North', 'East', 'West', 'South',
               'East', 'North', 'West', 'South', 'East', 'North'],
    'tds_output': [42, 78, 55, 120, 35, 95, 110, 48, 65, 88, 130, 52],
    'flow_rate': [2.1, 1.5, 1.9, 0.8, 2.3, 1.2, 0.9, 2.0, 1.7, 1.3, 0.7, 1.8],
    'age_days': [60, 180, 90, 320, 15, 240, 300, 45, 150, 210, 340, 80]
})
df

In [None]:
# Group by region, get average TDS
print("Average TDS by region:")
df.groupby('region')['tds_output'].mean()
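The result above is a Series with the region as its index. If a regular table with `region` as an ordinary column is more convenient (for merging or exporting, say), one option is `as_index=False`. A small sketch on a subset of the lesson's data; the variable name `avg_age` is only illustrative.

```python
import pandas as pd

# A few rows from the lesson's dataset, trimmed to the columns needed here
df = pd.DataFrame({
    'region': ['North', 'South', 'North', 'East'],
    'age_days': [60, 180, 90, 320],
})

# as_index=False keeps 'region' as a column instead of moving it to the index
avg_age = df.groupby('region', as_index=False)['age_days'].mean()
print(avg_age)
```

Calling `.reset_index()` on the Series result should give an equivalent table; which one to use is mostly a matter of taste.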

In [None]:
# Multiple aggregations at once with .agg()
print("Multiple stats per region:")
df.groupby('region')['tds_output'].agg(['mean', 'min', 'max', 'count'])

In [None]:
# Aggregate multiple columns
summary = df.groupby('region').agg({
    'tds_output': 'mean',
    'flow_rate': 'mean',
    'age_days': 'max',
    'filter_id': 'count'  # Count filters per region
}).rename(columns={'filter_id': 'num_filters'})
summary
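The `.agg()` plus `.rename()` pattern above can also be written with named aggregation, where each output column is given as `new_name=(source_column, function)`. This is a sketch on a four-row subset of the lesson's data; the output names `avg_tds` and `max_age` are assumptions chosen for readability.

```python
import pandas as pd

# Four rows from the lesson's dataset
df = pd.DataFrame({
    'filter_id': ['F001', 'F002', 'F003', 'F006'],
    'region': ['North', 'South', 'North', 'South'],
    'tds_output': [42, 78, 55, 95],
    'age_days': [60, 180, 90, 240],
})

# Named aggregation: output column name = (input column, aggregation)
summary = df.groupby('region').agg(
    avg_tds=('tds_output', 'mean'),
    max_age=('age_days', 'max'),
    num_filters=('filter_id', 'count'),
)
print(summary)
```

Naming the columns up front avoids the separate rename step and makes it possible to aggregate the same source column in more than one way.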

In [None]:
# value_counts() - quick frequency count, most frequent first (like Laravel's ->countBy())
print("Filters per region:")
print(df['region'].value_counts())
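`value_counts()` gives the same numbers as grouping and calling `.size()`; the main visible difference is ordering. A minimal sketch, using a shortened region list so the counts differ between groups:

```python
import pandas as pd

# Shortened region column so each region has a different count
df = pd.DataFrame({
    'region': ['North', 'South', 'North', 'East', 'North', 'South'],
})

by_count = df['region'].value_counts()    # ordered by count, largest first
by_key = df.groupby('region').size()      # ordered by region name

print(by_count)
print(by_key)
```

Use `value_counts()` for a quick ranking, and `groupby(...).size()` when the counts need to line up with other per-group results.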

## Exercise

1. Find the region with the highest average `tds_output`
2. Count how many filters have `tds_output` > 80 per region
3. Get the min and max `flow_rate` per region

In [None]:
# YOUR CODE HERE