# Notebook 11: Window Functions

## Learning Objectives
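- Understand how window functions differ from aggregate functions
- Use ROW_NUMBER, RANK, and DENSE_RANK to number and rank rows
- Use PARTITION BY to apply a window function within groups
- Use NTILE to split rows into roughly equal buckets (e.g., quartiles)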
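
The first objective can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` and a made-up in-memory `staff` table rather than DuckDB and the practice database, so the table, column names, and values are assumptions chosen only for illustration. The point is that `GROUP BY` collapses each group to one row, while a window function keeps every row and attaches the group-level value to it.

```python
import sqlite3

# Hypothetical toy data; not part of the practice database.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE staff (name TEXT, dept TEXT, salary INTEGER)")
con.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("Ana", "eng", 100), ("Ben", "eng", 80), ("Cy", "ops", 60)],
)

# Aggregate: GROUP BY collapses each department to a single row.
agg_rows = con.execute(
    "SELECT dept, AVG(salary) FROM staff GROUP BY dept ORDER BY dept"
).fetchall()

# Window: every input row is kept, with the department average attached.
win_rows = con.execute(
    "SELECT name, dept, salary, "
    "       AVG(salary) OVER (PARTITION BY dept) AS dept_avg "
    "FROM staff ORDER BY name"
).fetchall()

print(agg_rows)  # one row per department
print(win_rows)  # one row per employee
con.close()
```

The same `OVER (PARTITION BY ...)` clause should behave the same way in DuckDB, since this is standard SQL.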

In [None]:
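import os
import sys
from pathlib import Path

# Resolve the project root whether the notebook runs from notebooks/ or from the root.
project_root = Path.cwd().parent if Path.cwd().name == "notebooks" else Path.cwd()
# Make src/ importable so the local sql_exercises helper can be found.
sys.path.insert(0, str(project_root / "src"))
import duckdb
from sql_exercises import check

# Record this notebook's name in the environment.
os.environ["SQL_NOTEBOOK_NAME"] = "11_window_functions"
# Open the practice database read-only so exercises cannot modify it.
conn = duckdb.connect(
    str(project_root / "data" / "databases" / "practice.duckdb"), read_only=True
)
print("Setup complete!")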

## Quick Reference
```sql
-- ROW_NUMBER: Unique sequential number
ROW_NUMBER() OVER (ORDER BY col)
ROW_NUMBER() OVER (PARTITION BY grp ORDER BY col)

-- RANK: Same rank for ties, gaps after
RANK() OVER (ORDER BY col)

-- DENSE_RANK: Same rank for ties, no gaps
DENSE_RANK() OVER (ORDER BY col)

-- NTILE: Divide into N buckets
NTILE(4) OVER (ORDER BY col)  -- Quartiles
```
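
To see how the three ranking functions above treat ties, here is a small sketch. It runs on Python's built-in `sqlite3` with a hypothetical `scores` table (not the practice database), where two players share the same points. The extra `player` tiebreaker in the `ROW_NUMBER` ordering is an addition that makes the numbering deterministic.

```python
import sqlite3

# Hypothetical toy data with a tie at 40 points.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE scores (player TEXT, points INTEGER)")
con.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("a", 50), ("b", 40), ("c", 40), ("d", 30)],
)

rows = con.execute("""
    SELECT player, points,
           ROW_NUMBER() OVER (ORDER BY points DESC, player) AS row_num,
           RANK()       OVER (ORDER BY points DESC)         AS rnk,
           DENSE_RANK() OVER (ORDER BY points DESC)         AS dense_rnk
    FROM scores
    ORDER BY points DESC, player
""").fetchall()

for row in rows:
    print(row)
# ('a', 50, 1, 1, 1)
# ('b', 40, 2, 2, 2)
# ('c', 40, 3, 2, 2)   <- tie: same RANK and DENSE_RANK as 'b'
# ('d', 30, 4, 4, 3)   <- RANK skips 3, DENSE_RANK does not
con.close()
```

Without a tiebreaker, `ROW_NUMBER` still assigns distinct numbers to tied rows, but which tied row gets which number is not guaranteed.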

---
## Exercise 1: Basic ROW_NUMBER (Easy)
**Problem:** Assign a row number to each employee ordered by salary descending.

Return columns: row_num, employee_id, first_name, salary

In [None]:
ex_01 = """

"""
conn.execute(ex_01).fetchdf()

In [None]:
check("ex_01", ex_01)

---
## Exercise 2: ROW_NUMBER with PARTITION BY (Easy)
**Problem:** Assign a row number to employees within each department, ordered by salary (highest first).

Return columns: department_id, employee_id, first_name, salary, dept_rank

In [None]:
ex_02 = """

"""
conn.execute(ex_02).fetchdf()

In [None]:
check("ex_02", ex_02)

---
## Exercise 3: RANK vs DENSE_RANK (Easy)
**Problem:** Show RANK and DENSE_RANK for employees by salary (highest salary = rank 1).

Return columns: employee_id, salary, salary_rank, salary_dense_rank

In [None]:
ex_03 = """

"""
conn.execute(ex_03).fetchdf()

In [None]:
check("ex_03", ex_03)

---
## Exercise 4: Top N Per Group (Medium)
**Problem:** Find the top 3 highest-paid employees in each department.

Return columns: department_id, employee_id, first_name, salary, dept_rank

**Hint:** A window function cannot be used directly in WHERE, so compute ROW_NUMBER in a subquery or CTE, then filter on it in the outer query.
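
The general shape of the "top N per group" pattern can be sketched on unrelated data. This example uses Python's built-in `sqlite3` and a hypothetical `sales` table (the table, columns, and values are assumptions and do not come from the practice database) and keeps the top 2 rows per region.

```python
import sqlite3

# Hypothetical toy data; unrelated to the employees table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, rep TEXT, amount INTEGER)")
con.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("east", "r1", 500), ("east", "r2", 300), ("east", "r3", 100),
        ("west", "r4", 400), ("west", "r5", 200),
    ],
)

top_two = con.execute("""
    WITH ranked AS (
        -- Step 1: number the rows inside each region, largest amount first.
        SELECT region, rep, amount,
               ROW_NUMBER() OVER (
                   PARTITION BY region ORDER BY amount DESC
               ) AS rn
        FROM sales
    )
    -- Step 2: filter on the computed number in the outer query.
    SELECT region, rep, amount, rn
    FROM ranked
    WHERE rn <= 2
    ORDER BY region, rn
""").fetchall()

for row in top_two:
    print(row)
con.close()
```

Swapping `ROW_NUMBER` for `RANK` or `DENSE_RANK` changes how ties are handled: tied rows would then share a rank, so a group could return more than N rows.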

In [None]:
ex_04 = """

"""
conn.execute(ex_04).fetchdf()

In [None]:
check("ex_04", ex_04)

---
## Exercise 5: NTILE for Quartiles (Medium)
**Problem:** Divide employees into 4 salary quartiles (1=lowest, 4=highest).

Return columns: employee_id, first_name, salary, salary_quartile

In [None]:
ex_05 = """

"""
conn.execute(ex_05).fetchdf()

In [None]:
check("ex_05", ex_05)

---
## Exercise 6: Product Rankings (Medium)
**Problem:** Rank products by unit_price within each category (highest price = rank 1). Use RANK().

Return columns: category_id, product_id, product_name, unit_price, price_rank

In [None]:
ex_06 = """

"""
conn.execute(ex_06).fetchdf()

In [None]:
check("ex_06", ex_06)

---
## Exercise 7: Customer Order Ranking (Hard)
**Problem:** Rank orders for each customer by total_amount (highest = rank 1). Use RANK().

Return columns: customer_id, order_id, total_amount, order_rank

In [None]:
ex_07 = """

"""
conn.execute(ex_07).fetchdf()

In [None]:
check("ex_07", ex_07)

---
## Exercise 8: Highest Order Per Customer (Hard)
**Problem:** Find each customer's largest order by total_amount.

Return columns: customer_id, order_id, total_amount

In [None]:
ex_08 = """

"""
conn.execute(ex_08).fetchdf()

In [None]:
check("ex_08", ex_08)

---
## Summary
- **ROW_NUMBER** - Unique sequence; tied rows still get distinct numbers
- **RANK** - Same rank for ties, gaps after
- **DENSE_RANK** - Same rank for ties, no gaps
- **NTILE** - Divide rows into N roughly equal buckets
- **PARTITION BY** - Apply within groups
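
One detail worth checking about NTILE is what happens when the row count is not divisible by N. The sketch below uses Python's built-in `sqlite3` and a hypothetical `vals` table of 10 rows (an assumption for illustration, not the practice database) split into 4 buckets.

```python
import sqlite3
from collections import Counter

# Hypothetical toy data: the integers 1 through 10.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE vals (v INTEGER)")
con.executemany("INSERT INTO vals VALUES (?)", [(i,) for i in range(1, 11)])

rows = con.execute(
    "SELECT v, NTILE(4) OVER (ORDER BY v) AS bucket FROM vals ORDER BY v"
).fetchall()

# Count how many rows landed in each bucket.
bucket_sizes = Counter(bucket for _, bucket in rows)
print(dict(bucket_sizes))  # {1: 3, 2: 3, 3: 2, 4: 2}
con.close()
```

The extra rows go to the lowest-numbered buckets, so bucket sizes differ by at most one.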

### Next: Notebook 12 - Advanced Window Functions

In [None]:
conn.close()