# Fun with SQL: Grouping and Aggregation 📊

Welcome to your fifth SQL adventure! Today we'll learn how to:
- Group data using GROUP BY
- Use aggregate functions like COUNT, AVG, MIN, and MAX
- Filter groups using HAVING

Let's start with our database connection:

In [None]:
import sqlite3
import pandas as pd

conn = sqlite3.connect('../data/movies.db')
pd.set_option('display.max_columns', None)

## 🎯 Challenge 1: Counting Movies by Genre

Let's find out how many movies we have in each genre using GROUP BY and COUNT:
```sql
SELECT genre, COUNT(*) as movie_count
FROM movies
GROUP BY genre;
```

💡 The GROUP BY clause collects rows that share the same value in the listed columns into one group, and each aggregate function then returns a single result per group!
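
To see the grouping happen on data small enough to check by hand, here is a minimal sketch. It assumes a throwaway in-memory database with three made-up rows; the `demo` connection, the titles, and the ratings are all hypothetical and separate from the real `movies.db`.

```python
import sqlite3

# Hypothetical sample data in an in-memory database (not the real movies.db)
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE movies (title TEXT, genre TEXT, rating REAL)")
demo.executemany(
    "INSERT INTO movies VALUES (?, ?, ?)",
    [
        ("Movie A", "Drama", 8.5),
        ("Movie B", "Drama", 7.5),
        ("Movie C", "Comedy", 6.0),
    ],
)

# One output row per distinct genre; COUNT(*) counts the rows in each group
genre_counts = demo.execute(
    """
    SELECT genre, COUNT(*) AS movie_count
    FROM movies
    GROUP BY genre
    ORDER BY genre
    """
).fetchall()

print(genre_counts)  # [('Comedy', 1), ('Drama', 2)]
```

Three input rows collapse into two output rows, one for each genre.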

In [None]:
query = """
-- Write your GROUP BY query here
"""

pd.read_sql_query(query, conn)

## 🎯 Challenge 2: Average Ratings by Genre

We can use other aggregate functions like AVG to find the average rating for each genre:
```sql
SELECT genre, 
       COUNT(*) as movie_count,
       AVG(rating) as avg_rating
FROM movies
GROUP BY genre;
```

Try finding the highest and lowest ratings for each genre using MIN and MAX!

In [None]:
query = """
-- Write your query using MIN and MAX
"""

pd.read_sql_query(query, conn)

## 🎯 Challenge 3: Filtering Groups with HAVING

Sometimes we want to filter groups based on aggregate values. That's where HAVING comes in:
```sql
SELECT genre, AVG(rating) as avg_rating
FROM movies
GROUP BY genre
HAVING avg_rating > 8.0;
```

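💡 HAVING is like WHERE, but for grouped data: WHERE filters individual rows before they are grouped, while HAVING filters whole groups after the aggregates are computed. SQLite lets HAVING reuse the `avg_rating` alias; some other databases require repeating `AVG(rating)` instead.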
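
The difference between the two filters is easiest to see side by side. The sketch below again assumes a hypothetical in-memory table with made-up rows (the `demo` connection and all values are illustrative, not taken from `movies.db`) and applies the same `> 8.0` threshold once with WHERE and once with HAVING.

```python
import sqlite3

# Hypothetical sample data in an in-memory database (not the real movies.db)
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE movies (title TEXT, genre TEXT, rating REAL)")
demo.executemany(
    "INSERT INTO movies VALUES (?, ?, ?)",
    [
        ("Movie A", "Drama", 9.0),
        ("Movie B", "Drama", 6.0),
        ("Movie C", "Comedy", 8.5),
    ],
)

# WHERE removes low-rated rows first, then groups whatever is left
where_rows = demo.execute(
    """
    SELECT genre, AVG(rating) AS avg_rating
    FROM movies
    WHERE rating > 8.0
    GROUP BY genre
    ORDER BY genre
    """
).fetchall()

# HAVING groups every row first, then removes groups with a low average
having_rows = demo.execute(
    """
    SELECT genre, AVG(rating) AS avg_rating
    FROM movies
    GROUP BY genre
    HAVING AVG(rating) > 8.0
    ORDER BY genre
    """
).fetchall()

print(where_rows)   # [('Comedy', 8.5), ('Drama', 9.0)]
print(having_rows)  # [('Comedy', 8.5)]
```

With WHERE, Drama survives because its one high-rated movie passes the row filter. With HAVING, Drama disappears because its average over both movies is 7.5.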

In [None]:
query = """
-- Write a query to find genres with more than 3 movies
"""

pd.read_sql_query(query, conn)

## 🌟 Bonus Challenge: Complex Grouping

Can you write a query that:
1. Groups movies by release year and genre
2. Shows the count and average rating for each group
3. Only includes groups with at least 2 movies
4. Orders results by year (newest first)?

Hint: You can GROUP BY multiple columns!
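
So the bonus stays unspoiled, here is a small sketch of multi-column grouping on a different, purely hypothetical `screenings` table (the table, its columns, and its rows are invented for illustration). Each distinct combination of the listed columns becomes its own group.

```python
import sqlite3

# Hypothetical table, unrelated to movies.db, used only to show the syntax
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE screenings (city TEXT, day TEXT, tickets INTEGER)")
demo.executemany(
    "INSERT INTO screenings VALUES (?, ?, ?)",
    [
        ("Oslo", "Sat", 120),
        ("Oslo", "Sat", 80),
        ("Oslo", "Sun", 50),
        ("Rome", "Sat", 200),
    ],
)

# Each distinct (city, day) pair forms one group; SUM adds up its tickets
pair_totals = demo.execute(
    """
    SELECT city, day, SUM(tickets) AS total_tickets
    FROM screenings
    GROUP BY city, day
    ORDER BY city, day
    """
).fetchall()

print(pair_totals)
# [('Oslo', 'Sat', 200), ('Oslo', 'Sun', 50), ('Rome', 'Sat', 200)]
```

The two Oslo/Sat rows merge into one group, while Oslo/Sun and Rome/Sat each stay separate.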

In [None]:
query = """
-- Write your complex grouping query here!
"""

pd.read_sql_query(query, conn)

## 🎉 Well done!

You've learned how to:
- Group data using GROUP BY
- Use aggregate functions (COUNT, AVG, MIN, MAX)
- Filter groups using HAVING
- Combine grouping with ordering

Common Aggregate Functions:
- COUNT(): Counts rows with COUNT(*), or non-NULL values with COUNT(column)
- SUM(): Adds up non-NULL values
- AVG(): Calculates the average of non-NULL values
- MIN(): Finds the minimum value
- MAX(): Finds the maximum value
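
As a quick reference, the sketch below runs all five functions in one query. It assumes a hypothetical in-memory table (made-up rows, separate from `movies.db`) in which one movie has no rating, which shows how the aggregates treat NULL.

```python
import sqlite3

# Hypothetical sample data; Movie C has no rating (NULL)
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE movies (title TEXT, rating REAL)")
demo.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [
        ("Movie A", 8.0),
        ("Movie B", 6.0),
        ("Movie C", None),
    ],
)

# Without GROUP BY, the whole table is treated as a single group
summary = demo.execute(
    """
    SELECT COUNT(*),       -- all rows, including the unrated one
           COUNT(rating),  -- only rows where rating is not NULL
           SUM(rating),
           AVG(rating),
           MIN(rating),
           MAX(rating)
    FROM movies
    """
).fetchone()

print(summary)  # (3, 2, 14.0, 7.0, 6.0, 8.0)
```

The average is 7.0 rather than 14 / 3 because the unrated movie is skipped, not counted as zero.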

In the next notebook, we'll explore subqueries and more advanced SQL concepts!