# Task 05 - Part 1 of 2: SQL Aggregations & GROUP BY

**Course:** Database Applications Development  
**Lesson:** 05 - SQL Aggregations, Grouping, and Excel Export (Excel export is covered in Part 2)

---

## Instructions

Complete all exercises in this notebook. You will:
1. Write SQL queries using aggregate functions
2. Use GROUP BY to analyze categories
3. Export 4 query results to Excel files (in Part 2)
4. Answer analysis questions

**Resources:**
- Lesson materials (dbApps05_AggregationsGrouping.md)
- Walkthrough notebook (dbApps05_Walkthrough.ipynb)
- SQL Reference Guide (updated with aggregations)

**Submission:**
1. Complete all TODO sections
2. Verify all cells run without errors
3. Check that Excel files were created (in Part 2)
4. Push to GitHub: `databaseApplications/dbApps05TasksPart1.ipynb`

Let's practice aggregations!

---

## Setup

In [None]:
import pandas as pd
import sqlite3

# Connect to database
conn = sqlite3.connect('nba_5seasons.db')
print("✅ Connected to database")
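
Before writing queries, it can help to confirm which tables and columns a database actually contains. The following is a minimal sketch, not part of the graded task: `describe_tables` is a hypothetical helper name, and the demo runs against a throwaway in-memory database with a made-up `demo_teams` table rather than `nba_5seasons.db`. It relies only on SQLite's `sqlite_master` catalog and `PRAGMA table_info`.

```python
import sqlite3


def describe_tables(connection):
    """Return {table_name: [column names]} for every table in a SQLite database."""
    # sqlite_master is SQLite's built-in catalog of schema objects.
    table_names = [
        row[0]
        for row in connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    schema = {}
    for name in table_names:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = connection.execute(f'PRAGMA table_info("{name}")').fetchall()
        schema[name] = [column[1] for column in columns]
    return schema


# Hypothetical in-memory database used only to show the helper running.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE demo_teams (team_id INTEGER, full_name TEXT)")
schema = describe_tables(demo)
print(schema)
```

Passing the notebook's own `conn` instead of `demo` should list the tables named in the hints, such as `teams` and `team_game_stats`, assuming they exist in your copy of the database.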

---

## Part 1: Basic Aggregate Functions (10 queries)

Practice using COUNT, SUM, AVG, MIN, and MAX without GROUP BY.

**Hints:**
- Remember to use AS to name your result columns
- Aggregate functions work on ALL rows that match your WHERE clause
- Use ROUND(AVG(column), 1) to round decimals
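
The hints above can be sketched on a small made-up table. This is an illustration under stated assumptions, not a solution to any exercise: the `demo_scores` table and its values are hypothetical, and the example uses plain `sqlite3` on an in-memory database so it runs without the course data.

```python
import sqlite3

# Hypothetical in-memory table; names and values are invented for illustration.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE demo_scores (team TEXT, season TEXT, pts INTEGER)")
demo.executemany(
    "INSERT INTO demo_scores VALUES (?, ?, ?)",
    [
        ("A", "2020-21", 100),  # excluded by the WHERE clause below
        ("A", "2021-22", 101),
        ("B", "2021-22", 110),
        ("C", "2021-22", 96),
    ],
)

# Without GROUP BY, each aggregate collapses all matching rows into one value.
# AS gives every result column a readable name.
query = """
SELECT
    COUNT(*)           AS games,
    SUM(pts)           AS total_pts,
    ROUND(AVG(pts), 1) AS avg_pts,
    MIN(pts)           AS low,
    MAX(pts)           AS high
FROM demo_scores
WHERE season = '2021-22'
"""
summary = demo.execute(query).fetchone()
print(summary)
```

Only the three `2021-22` rows contribute to the result, which is the sense in which aggregates "work on all rows that match your WHERE clause".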

### Query 1: Count All Teams

**Task:** How many teams are in the database?

**Hint:** Use COUNT(*) on the `teams` table.

In [None]:
# TODO: Write your query
query_1 = """

"""

result_1 = pd.read_sql(query_1, conn)
display(result_1)

### Query 2: Count Player Season Records

**Task:** How many player-season records exist for 2021-22?

**Hint:** COUNT(*) from `player_season_stats` WHERE season = '2021-22'

In [None]:
# TODO: Write your query
query_2 = """

"""

result_2 = pd.read_sql(query_2, conn)
display(result_2)

### Query 3: Total Points (All Teams)

**Task:** What were the total combined points scored by all teams in all 2021-22 games?

**Hint:** SUM(pts) from `team_game_stats` WHERE season = '2021-22'

In [None]:
# TODO: Write your query
query_3 = """

"""

result_3 = pd.read_sql(query_3, conn)
display(result_3)

### Query 4: Average Points Per Game (League-Wide)

**Task:** What was the league-wide average points per game in 2021-22?

**Hint:** Use AVG(pts) on the `team_game_stats` table, and remember to round to 1 decimal place

In [None]:
# TODO: Write your query
query_4 = """

"""

result_4 = pd.read_sql(query_4, conn)
display(result_4)

### Query 5: Highest and Lowest Scores

**Task:** Find the highest and lowest points scored in any single game during 2021-22.

**Hint:** Use both MAX(pts) and MIN(pts) in one query  (`team_game_stats` table)

In [None]:
# TODO: Write your query
query_5 = """

"""

result_5 = pd.read_sql(query_5, conn)
display(result_5)

### Query 6: Lakers Total Points

**Task:** How many total points did the Lakers (team_id = 1610612747) score in 2021-22?

**Hint:** SUM(pts) with WHERE for team_id AND season  (`team_game_stats` table)

In [None]:
# TODO: Write your query
query_6 = """

"""

result_6 = pd.read_sql(query_6, conn)
display(result_6)

### Query 7: Warriors Average Points

**Task:** What was the Warriors' (team_id = 1610612744) average points per game in 2021-22?

**Hint:** AVG(pts), round to 1 decimal (`team_game_stats` table)

In [None]:
# TODO: Write your query
query_7 = """

"""

result_7 = pd.read_sql(query_7, conn)
display(result_7)

### Query 8: Complete Summary Statistics

**Task:** Create a summary with COUNT, SUM, AVG, MIN, and MAX for 'pts' for all games in 2021-22.

**Hint:** Use all 5 aggregate functions in one SELECT statement from the `team_game_stats` table

In [None]:
# TODO: Write your query
query_8 = """

"""

result_8 = pd.read_sql(query_8, conn)
display(result_8)

### Query 9: Count Teams by State

**Task:** How many teams are located in California?

**Hint:** COUNT(*) from `teams` WHERE state = 'California'

In [None]:
# TODO: Write your query
query_9 = """

"""

result_9 = pd.read_sql(query_9, conn)
display(result_9)

### Query 10: Oldest Team

**Task:** What is the earliest year_founded in the `teams` table?

**Hint:** MIN(year_founded)

In [None]:
# TODO: Write your query
query_10 = """

"""

result_10 = pd.read_sql(query_10, conn)
display(result_10)

---

## Part 2: GROUP BY Queries (8 queries)

Practice grouping data and aggregating by category.

**Remember:**
- Every non-aggregated column in SELECT must be in GROUP BY
- GROUP BY creates separate groups for aggregation
- Use ORDER BY to sort your results
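
The three rules above can be seen in a short sketch on made-up data. The `demo_games` table and its values are hypothetical and chosen only for illustration. The example uses an in-memory SQLite database so it runs on its own.

```python
import sqlite3

# Hypothetical in-memory table; names and values are invented for illustration.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE demo_games (team TEXT, pts INTEGER)")
demo.executemany(
    "INSERT INTO demo_games VALUES (?, ?)",
    [("A", 100), ("B", 120), ("A", 110), ("B", 95), ("C", 130)],
)

# `team` is the only non-aggregated column in SELECT, so it appears in GROUP BY.
# Each distinct team becomes one group, and the aggregates run once per group.
query = """
SELECT
    team,
    COUNT(*) AS games,
    SUM(pts) AS total_pts
FROM demo_games
GROUP BY team
ORDER BY total_pts DESC
"""
rows = demo.execute(query).fetchall()
for row in rows:
    print(row)
```

One caveat worth knowing: SQLite tolerates a non-aggregated column that is missing from GROUP BY and fills it with a value from an arbitrary row of the group, so the first rule is what keeps results predictable and portable to other databases.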

### Query 11: Games Per Team

**Task:** How many games did each team play in 2021-22?

**Hint:** SELECT team_id, COUNT(*) ... GROUP BY team_id

In [None]:
# TODO: Write your query
query_11 = """

"""

result_11 = pd.read_sql(query_11, conn)
display(result_11.head(10))  # Show first 10 teams

### Query 12: Average Points By Team

**Task:** Calculate average points per game for each team in 2021-22. Sort by highest average first.

**Hint:** GROUP BY team_id, ORDER BY avg_points DESC

In [None]:
# TODO: Write your query
query_12 = """

"""

result_12 = pd.read_sql(query_12, conn)
display(result_12.head(10))  # Show first 10 teams

### Query 13: Team Performance with Names

**Task:** Show team name, games played, and average points for each team in 2021-22.

**Hint:** JOIN teams with team_game_stats, then GROUP BY

Recall from the walkthrough how JOIN operations work. Here is an example:

SELECT <br>
    t.full_name as team,<br>
    t.city,<br>
    COUNT(tgs.game_id) as games_played<br>
FROM teams t<br>
JOIN team_game_stats tgs ON t.team_id = tgs.team_id<br>
WHERE tgs.season = '2021-22'<br>
GROUP BY t.team_id, t.full_name, t.city<br>
ORDER BY games_played DESC<br>
LIMIT 10

In [None]:
# TODO: Write your query
query_13 = """

"""

result_13 = pd.read_sql(query_13, conn)
display(result_13.head(10))

### Query 14: Total Points By Team

**Task:** Calculate total points scored by each team in 2021-22. Include team name.

**Hint:** SUM(pts), JOIN `team_game_stats` with the `teams` table, GROUP BY team

In [None]:
# TODO: Write your query
query_14 = """

"""

result_14 = pd.read_sql(query_14, conn)
display(result_14.head(30))

### Query 15: Season High by Team

**Task:** Find each team's highest-scoring game in 2021-22. Include team name and sort by highest game.

**Hint:** MAX(pts), JOIN, GROUP BY, ORDER BY DESC

In [None]:
# TODO: Write your query
query_15 = """

"""

result_15 = pd.read_sql(query_15, conn)
display(result_15.head(10))

### Query 16: Win Count by Team

**Task:** Count how many wins each team had in 2021-22.

**Hint:** Use SUM(CASE WHEN wl = 'W' THEN 1 ELSE 0 END) to count wins
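
The conditional-count pattern in the hint works because `CASE` turns each row into 1 or 0, and `SUM` then adds those up per group. Here is a minimal sketch on made-up data. The `demo_results` table and its rows are hypothetical, and the `wl` column name is borrowed from the hint only to keep the example familiar.

```python
import sqlite3

# Hypothetical in-memory table; names and values are invented for illustration.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE demo_results (team TEXT, wl TEXT)")
demo.executemany(
    "INSERT INTO demo_results VALUES (?, ?)",
    [
        ("A", "W"), ("A", "L"), ("A", "W"),
        ("B", "L"), ("B", "W"),
        ("C", "L"), ("C", "L"),
    ],
)

# CASE yields 1 for a win and 0 otherwise; SUM totals those flags per team.
# COUNT(*) is included to contrast "wins" with "all games".
query = """
SELECT
    team,
    SUM(CASE WHEN wl = 'W' THEN 1 ELSE 0 END) AS wins,
    COUNT(*) AS games
FROM demo_results
GROUP BY team
ORDER BY wins DESC, team
"""
rows = demo.execute(query).fetchall()
for row in rows:
    print(row)
```

The `ELSE 0` branch matters for teams with no wins: team `C` still appears in the output with a win count of 0, whereas filtering with `WHERE wl = 'W'` would drop it from the result entirely.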

In [None]:
# TODO: Write your query
query_16 = """

"""

result_16 = pd.read_sql(query_16, conn)
display(result_16.head(10))

### Query 17: Teams by State

**Task:** Count how many teams are in each state.

**Hint:** GROUP BY state from teams table

In [None]:
# TODO: Write your query
query_17 = """

"""

result_17 = pd.read_sql(query_17, conn)
display(result_17.head(10))

### Query 18: Players Per Team

**Task:** Count how many player-season records each team has for 2021-22.

**Hint:** COUNT(*) from player_season_stats, GROUP BY team_id

In [None]:
# TODO: Write your query
query_18 = """

"""

result_18 = pd.read_sql(query_18, conn)
display(result_18.head(10))

## You've completed Part 1 - Nice Work!

## Now move on to Part 2!