# 08 – Window Functions  
Core SQL concept: window functions compute a value for each row from a set of rows related to that row, without collapsing those rows into one.

---

*Part of the [Foundations: Python, R & SQL](../README.md) repository.*

In [1]:
import duckdb

In [2]:
# Create a sample sales table (OR REPLACE keeps the cell safe to re-run)
duckdb.sql("""
CREATE OR REPLACE TABLE sales (
  employee TEXT,
  department TEXT,
  sale_amount INTEGER,
  sale_date DATE
);

INSERT INTO sales VALUES
('Alice', 'IT', 5000, '2023-01-10'),
('Bob', 'IT', 3000, '2023-01-15'),
('Alice', 'IT', 7000, '2023-01-20'),
('Clara', 'HR', 4000, '2023-01-12'),
('David', 'HR', 6000, '2023-01-18'),
('Eva', 'Finance', 8000, '2023-01-25');
""")

In [3]:
duckdb.sql("SELECT * FROM sales")

┌──────────┬────────────┬─────────────┬────────────┐
│ employee │ department │ sale_amount │ sale_date  │
│ varchar  │  varchar   │    int32    │    date    │
├──────────┼────────────┼─────────────┼────────────┤
│ Alice    │ IT         │        5000 │ 2023-01-10 │
│ Bob      │ IT         │        3000 │ 2023-01-15 │
│ Alice    │ IT         │        7000 │ 2023-01-20 │
│ Clara    │ HR         │        4000 │ 2023-01-12 │
│ David    │ HR         │        6000 │ 2023-01-18 │
│ Eva      │ Finance    │        8000 │ 2023-01-25 │
└──────────┴────────────┴─────────────┴────────────┘

## 1. Running Total by Employee

In [4]:
duckdb.sql("""
SELECT
  employee,
  sale_date,
  sale_amount,
  SUM(sale_amount) OVER (PARTITION BY employee ORDER BY sale_date) AS running_total
FROM sales
""")

┌──────────┬────────────┬─────────────┬───────────────┐
│ employee │ sale_date  │ sale_amount │ running_total │
│ varchar  │    date    │    int32    │    int128     │
├──────────┼────────────┼─────────────┼───────────────┤
│ David    │ 2023-01-18 │        6000 │          6000 │
│ Eva      │ 2023-01-25 │        8000 │          8000 │
│ Bob      │ 2023-01-15 │        3000 │          3000 │
│ Alice    │ 2023-01-10 │        5000 │          5000 │
│ Alice    │ 2023-01-20 │        7000 │         12000 │
│ Clara    │ 2023-01-12 │        4000 │          4000 │
└──────────┴────────────┴─────────────┴───────────────┘
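The rows above come back in no particular order: the `ORDER BY` inside `OVER (...)` only controls how the sum accumulates, not how the final result is sorted. The following is a minimal sketch of the same running total with an outer `ORDER BY` and the window frame spelled out. It uses Python's built-in `sqlite3` module as a stand-in engine (an assumption, so the block runs without extra packages; it needs SQLite 3.25 or newer), and stores `sale_date` as ISO text. The `SELECT` itself is standard SQL and should also run through `duckdb.sql` against the `sales` table above.

```python
import sqlite3

# In-memory stand-in for the notebook's `sales` table (same six rows).
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE sales (employee TEXT, department TEXT, "
    "sale_amount INTEGER, sale_date TEXT)"
)
con.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Alice", "IT", 5000, "2023-01-10"),
        ("Bob", "IT", 3000, "2023-01-15"),
        ("Alice", "IT", 7000, "2023-01-20"),
        ("Clara", "HR", 4000, "2023-01-12"),
        ("David", "HR", 6000, "2023-01-18"),
        ("Eva", "Finance", 8000, "2023-01-25"),
    ],
)

running = con.execute("""
SELECT
  employee,
  sale_date,
  sale_amount,
  SUM(sale_amount) OVER (
    PARTITION BY employee
    ORDER BY sale_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW  -- explicit frame
  ) AS running_total
FROM sales
ORDER BY employee, sale_date  -- outer ORDER BY fixes the display order
""").fetchall()

for row in running:
    print(row)
# First two rows:
# ('Alice', '2023-01-10', 5000, 5000)
# ('Alice', '2023-01-20', 7000, 12000)
```

Only Alice has more than one sale, so hers is the only partition where the running total differs from the individual sale amount.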

## 2. Row Number by Department

In [5]:
duckdb.sql("""
SELECT
  employee,
  department,
  sale_amount,
  ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale_amount DESC) AS dept_rank
FROM sales
""")

┌──────────┬────────────┬─────────────┬───────────┐
│ employee │ department │ sale_amount │ dept_rank │
│ varchar  │  varchar   │    int32    │   int64   │
├──────────┼────────────┼─────────────┼───────────┤
│ David    │ HR         │        6000 │         1 │
│ Clara    │ HR         │        4000 │         2 │
│ Eva      │ Finance    │        8000 │         1 │
│ Alice    │ IT         │        7000 │         1 │
│ Alice    │ IT         │        5000 │         2 │
│ Bob      │ IT         │        3000 │         3 │
└──────────┴────────────┴─────────────┴───────────┘
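A common use of `ROW_NUMBER()` is to keep only the top row of each partition. Standard SQL does not allow a window function directly in `WHERE`, so the sketch below computes `dept_rank` in a subquery and filters on it in the outer query. As an assumption for portability, it runs on Python's built-in `sqlite3` module (SQLite 3.25 or newer) with a copy of the sample rows; the alias `ranked` is only an illustrative name.

```python
import sqlite3

# In-memory stand-in for the notebook's `sales` table (date column omitted).
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE sales (employee TEXT, department TEXT, sale_amount INTEGER)"
)
con.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Alice", "IT", 5000),
        ("Bob", "IT", 3000),
        ("Alice", "IT", 7000),
        ("Clara", "HR", 4000),
        ("David", "HR", 6000),
        ("Eva", "Finance", 8000),
    ],
)

top_sales = con.execute("""
SELECT employee, department, sale_amount
FROM (
  SELECT
    employee,
    department,
    sale_amount,
    ROW_NUMBER() OVER (
      PARTITION BY department ORDER BY sale_amount DESC
    ) AS dept_rank
  FROM sales
) AS ranked
WHERE dept_rank = 1        -- keep the largest sale in each department
ORDER BY department
""").fetchall()

print(top_sales)
# [('Eva', 'Finance', 8000), ('David', 'HR', 6000), ('Alice', 'IT', 7000)]
```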

## 3. RANK vs DENSE_RANK

In [6]:
duckdb.sql("""
SELECT
  employee,
  department,
  sale_amount,
  RANK() OVER (PARTITION BY department ORDER BY sale_amount DESC) AS rank,
  DENSE_RANK() OVER (PARTITION BY department ORDER BY sale_amount DESC) AS dense_rank
FROM sales
""")

┌──────────┬────────────┬─────────────┬───────┬────────────┐
│ employee │ department │ sale_amount │ rank  │ dense_rank │
│ varchar  │  varchar   │    int32    │ int64 │   int64    │
├──────────┼────────────┼─────────────┼───────┼────────────┤
│ Eva      │ Finance    │        8000 │     1 │          1 │
│ David    │ HR         │        6000 │     1 │          1 │
│ Clara    │ HR         │        4000 │     2 │          2 │
│ Alice    │ IT         │        7000 │     1 │          1 │
│ Alice    │ IT         │        5000 │     2 │          2 │
│ Bob      │ IT         │        3000 │     3 │          3 │
└──────────┴────────────┴─────────────┴───────┴────────────┘
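In the output above the two columns are identical, because no two sales in a department share the same `sale_amount`. `RANK` and `DENSE_RANK` only differ when there are ties: `RANK` leaves a gap after tied rows, `DENSE_RANK` does not. The sketch below shows this on a small hypothetical table, `tied_sales`, which is not part of the notebook's data. It uses Python's built-in `sqlite3` module (SQLite 3.25 or newer) as an assumed stand-in engine; the query is standard SQL.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical data with a tie: Alice and Bob both sold 7000.
con.execute("CREATE TABLE tied_sales (employee TEXT, sale_amount INTEGER)")
con.executemany(
    "INSERT INTO tied_sales VALUES (?, ?)",
    [("Alice", 7000), ("Bob", 7000), ("Clara", 5000), ("David", 3000)],
)

ranks = con.execute("""
SELECT
  employee,
  sale_amount,
  RANK()       OVER (ORDER BY sale_amount DESC) AS sales_rank,
  DENSE_RANK() OVER (ORDER BY sale_amount DESC) AS sales_dense_rank
FROM tied_sales
ORDER BY sale_amount DESC, employee
""").fetchall()

for row in ranks:
    print(row)
# ('Alice', 7000, 1, 1)
# ('Bob', 7000, 1, 1)
# ('Clara', 5000, 3, 2)   <- RANK skips 2, DENSE_RANK does not
# ('David', 3000, 4, 3)
```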

## Summary

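- Window functions are written with an `OVER (...)` clause: aggregates such as `SUM` and `AVG`, or ranking functions such as `ROW_NUMBER`, `RANK`, and `DENSE_RANK`.
- `PARTITION BY` splits the rows into groups, like `GROUP BY`, but every input row is kept in the output instead of being collapsed to one row per group.
- `ORDER BY` inside `OVER (...)` sets the row order within each partition. Ranking functions number rows in that order, and aggregates such as `SUM` accumulate from the start of the partition up to the current row (plus any rows tied with it), which is what produces a running total.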
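
To make the `GROUP BY` versus `PARTITION BY` contrast concrete, the sketch below computes the department total both ways on a copy of the sample rows. It assumes Python's built-in `sqlite3` module (SQLite 3.25 or newer) as a stand-in engine; the column alias `dept_total` is an illustrative name.

```python
import sqlite3

# In-memory stand-in for the notebook's `sales` table (date column omitted).
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE sales (employee TEXT, department TEXT, sale_amount INTEGER)"
)
con.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Alice", "IT", 5000),
        ("Bob", "IT", 3000),
        ("Alice", "IT", 7000),
        ("Clara", "HR", 4000),
        ("David", "HR", 6000),
        ("Eva", "Finance", 8000),
    ],
)

# GROUP BY: one output row per department.
grouped = con.execute("""
SELECT department, SUM(sale_amount) AS dept_total
FROM sales
GROUP BY department
ORDER BY department
""").fetchall()

# PARTITION BY: every sale is kept, with its department total attached.
windowed = con.execute("""
SELECT
  employee,
  department,
  sale_amount,
  SUM(sale_amount) OVER (PARTITION BY department) AS dept_total
FROM sales
ORDER BY department, employee, sale_amount
""").fetchall()

print(len(grouped), len(windowed))  # 3 6
print(grouped)
# [('Finance', 8000), ('HR', 10000), ('IT', 15000)]
```

Because the window version keeps each sale, it can be used to compute values such as each sale's share of its department total in a single query.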
