# Module 08: Advanced SQL Queries

**Estimated Time:** 75 minutes

## Learning Objectives

By the end of this module, you will be able to:
- Use window functions (ROW_NUMBER, RANK, DENSE_RANK)
- Apply PARTITION BY for grouped calculations
- Use CASE statements for conditional logic
- Combine result sets with UNION and UNION ALL
- Create complex multi-table queries
- Calculate running totals and moving averages

In [None]:
# Setup
import sqlite3
import pandas as pd
from pathlib import Path

%load_ext sql

DB_PATH = Path.cwd().parent / "data" / "databases" / "ecommerce.db"
SALES_DB = Path.cwd().parent / "data" / "databases" / "sales.db"
conn = sqlite3.connect(DB_PATH)
%sql sqlite:///$DB_PATH

print("✓ Connected to ecommerce.db")

## 1. CASE Statements: Conditional Logic

CASE adds conditional logic to a query: it evaluates each WHEN condition in order and returns the result of the first one that is true.

**Syntax:**
```sql
CASE
    WHEN condition1 THEN result1
    WHEN condition2 THEN result2
    ELSE default_result
END
```
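The evaluation order and the role of ELSE can be checked with a minimal sketch. It assumes a throwaway in-memory database; the `items` table and its rows are hypothetical and are not part of `ecommerce.db`.

```python
import sqlite3

# Hypothetical in-memory table; the names and prices are made up for illustration.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE items (name TEXT, price REAL)")
demo.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("pen", 5.0), ("lamp", 30.0), ("desk", 250.0)],
)

case_rows = demo.execute(
    """
    SELECT
        name,
        CASE
            WHEN price < 30 THEN 'Budget'
            WHEN price < 100 THEN 'Mid-Range'  -- reached only when price >= 30
        END AS category                        -- no ELSE: unmatched rows get NULL
    FROM items
    ORDER BY price
    """
).fetchall()
demo.close()

for row in case_rows:
    print(row)
```

Because conditions are tested top to bottom, the second WHEN does not need to repeat `price >= 30`. The `desk` row matches neither condition and comes back as `None` (SQL NULL), which is why the examples in this module end with an ELSE branch.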

In [None]:
%%sql
-- Simple CASE: categorize products by price
SELECT 
    product_name,
    price,
    CASE
        WHEN price < 30 THEN 'Budget'
        WHEN price BETWEEN 30 AND 100 THEN 'Mid-Range'
        ELSE 'Premium'
    END AS price_category
FROM products
ORDER BY price
LIMIT 15

In [None]:
%%sql
-- CASE for stock status
SELECT 
    product_name,
    stock_quantity,
    CASE
        WHEN stock_quantity = 0 THEN 'Out of Stock'
        WHEN stock_quantity < 30 THEN 'Low Stock'
        WHEN stock_quantity < 100 THEN 'In Stock'
        ELSE 'Well Stocked'
    END AS stock_status
FROM products
ORDER BY stock_quantity

In [None]:
%%sql
-- Use CASE inside an aggregate to count rows per price band
SELECT 
    COUNT(*) AS total_products,
    SUM(CASE WHEN price < 50 THEN 1 ELSE 0 END) AS budget_products,
    SUM(CASE WHEN price >= 50 AND price < 100 THEN 1 ELSE 0 END) AS midrange_products,
    SUM(CASE WHEN price >= 100 THEN 1 ELSE 0 END) AS premium_products
FROM products

## 2. Window Functions

Window functions perform calculations across a set of rows related to the current row. Unlike GROUP BY, they keep every input row in the output.

**Common Window Functions:**
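- ROW_NUMBER(): Assigns a unique sequential number to each row, even when rows tie
- RANK(): Gives tied rows the same rank and leaves a gap after them (1, 2, 2, 4)
- DENSE_RANK(): Gives tied rows the same rank without leaving gaps (1, 2, 2, 3)
- NTILE(n): Divides the ordered rows into n groups of nearly equal size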

**Note:** Window functions require SQLite 3.25.0 or later; earlier versions do not support them.
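The difference between the ranking functions shows up only when rows tie. The following sketch uses a hypothetical four-row `items` table in an in-memory database (not `ecommerce.db`) with two products at the same price, and assumes the Python build links against SQLite 3.25.0 or later.

```python
import sqlite3

# Window functions need SQLite 3.25.0+; print the linked library version to check.
print("SQLite library version:", sqlite3.sqlite_version)

# Hypothetical data: B and C tie on price.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE items (name TEXT, price INTEGER)")
demo.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("A", 100), ("B", 80), ("C", 80), ("D", 50)],
)

ranking_rows = demo.execute(
    """
    SELECT
        name,
        price,
        ROW_NUMBER() OVER (ORDER BY price DESC, name) AS row_num,
        RANK()       OVER (ORDER BY price DESC)       AS price_rank,
        DENSE_RANK() OVER (ORDER BY price DESC)       AS price_dense_rank,
        NTILE(2)     OVER (ORDER BY price DESC, name) AS half
    FROM items
    ORDER BY price DESC, name
    """
).fetchall()
demo.close()

for row in ranking_rows:
    print(row)
```

B and C both receive rank 2. RANK then skips to 4 for D, while DENSE_RANK continues with 3. ROW_NUMBER and NTILE order by `name` as a tie-breaker here so that their output is deterministic; without it, the order of tied rows is not guaranteed.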

In [None]:
%%sql
-- ROW_NUMBER: number products from most to least expensive
SELECT 
    product_name,
    category_id,
    price,
    ROW_NUMBER() OVER (ORDER BY price DESC) AS row_num
FROM products
LIMIT 10

In [None]:
%%sql
-- RANK: rank products by price (with gaps after ties)
SELECT 
    product_name,
    price,
    RANK() OVER (ORDER BY price DESC) AS price_rank
FROM products
LIMIT 15

In [None]:
%%sql
-- DENSE_RANK: rank products by price (no gaps after ties)
SELECT 
    product_name,
    price,
    DENSE_RANK() OVER (ORDER BY price DESC) AS dense_rank
FROM products
LIMIT 15

## 3. PARTITION BY: Grouping within Window Functions

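PARTITION BY divides rows into groups (partitions) for window function calculations. The function is computed separately within each partition, so rankings and totals restart at every new group.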
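To contrast PARTITION BY with GROUP BY, here is a small sketch on a hypothetical in-memory `items` table (the categories `x` and `y` and all values are invented for illustration). Both queries compute a per-category average, but only the grouped one collapses rows.

```python
import sqlite3

# Hypothetical data: two categories, three products.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE items (name TEXT, category TEXT, price INTEGER)")
demo.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [("A", "x", 10), ("B", "x", 30), ("C", "y", 50)],
)

# GROUP BY: one output row per category.
grouped_rows = demo.execute(
    "SELECT category, AVG(price) FROM items GROUP BY category ORDER BY category"
).fetchall()

# PARTITION BY: every product row is kept, with per-category values attached.
windowed_rows = demo.execute(
    """
    SELECT
        name,
        category,
        price,
        AVG(price) OVER (PARTITION BY category) AS category_avg,
        RANK() OVER (PARTITION BY category ORDER BY price DESC) AS category_rank
    FROM items
    ORDER BY category, category_rank
    """
).fetchall()
demo.close()

print(grouped_rows)
for row in windowed_rows:
    print(row)
```

The grouped query returns two rows and the windowed query returns three. The rank restarts at 1 in category `y`, which is the behavior the top-N-per-category pattern relies on.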

In [None]:
%%sql
-- Rank products within each category
SELECT 
    product_name,
    category_id,
    price,
    RANK() OVER (PARTITION BY category_id ORDER BY price DESC) AS category_rank
FROM products
ORDER BY category_id, category_rank
LIMIT 20

In [None]:
%%sql
-- Top 3 products in each category (ROW_NUMBER breaks price ties arbitrarily)
WITH ranked_products AS (
    SELECT 
        product_name,
        category_id,
        price,
        ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY price DESC) AS rn
    FROM products
)
SELECT product_name, category_id, price
FROM ranked_products
WHERE rn <= 3
ORDER BY category_id, price DESC

## 4. Running Totals and Cumulative Calculations
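Running totals and moving averages depend on the window frame, the range of rows around the current row that the aggregate sees. The sketch below is an illustration on a hypothetical in-memory `orders` table (four invented rows, two of them on the same date). It compares an explicit `ROWS` frame with the default frame that applies when `ORDER BY` is given without a frame clause.

```python
import sqlite3

# Hypothetical data: orders 2 and 3 share the same date.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE orders (order_id INTEGER, order_date TEXT, amount INTEGER)")
demo.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2024-01-01", 10),
        (2, "2024-01-02", 20),
        (3, "2024-01-02", 30),
        (4, "2024-01-03", 40),
    ],
)

frame_rows = demo.execute(
    """
    SELECT
        order_id,
        -- Explicit ROWS frame: adds one row at a time
        SUM(amount) OVER (
            ORDER BY order_date, order_id
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS running_total_rows,
        -- No frame clause: the default RANGE frame includes all rows tied on order_date
        SUM(amount) OVER (ORDER BY order_date) AS running_total_default,
        -- Current row plus the two rows before it
        AVG(amount) OVER (
            ORDER BY order_date, order_id
            ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
        ) AS moving_avg_3
    FROM orders
    ORDER BY order_date, order_id
    """
).fetchall()
demo.close()

for row in frame_rows:
    print(row)
```

With the `ROWS` frame the totals grow as 10, 30, 60, 100. With the default frame both orders dated 2024-01-02 show 60, because rows with equal `order_date` values are treated as peers. The `order_id` tie-breaker in the `ROWS` versions is an addition for this sketch; it makes the row-by-row order deterministic when dates repeat.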

In [None]:
%%sql
-- Running total of order amounts by date
SELECT 
    order_id,
    order_date,
    total_amount,
    SUM(total_amount) OVER (
        ORDER BY order_date 
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total
FROM orders
ORDER BY order_date
LIMIT 20

In [None]:
%%sql
-- Moving average over a 3-order window (current order plus the 2 before it)
SELECT 
    order_id,
    order_date,
    total_amount,
    AVG(total_amount) OVER (
        ORDER BY order_date
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS moving_avg_3
FROM orders
ORDER BY order_date
LIMIT 20

## 5. UNION and UNION ALL

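- **UNION**: Combines results and removes duplicate rows
- **UNION ALL**: Combines results and keeps duplicates (faster, because no de-duplication step is needed)

Both SELECT statements must return the same number of columns.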
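A quick way to see the difference is to count rows. This sketch uses two hypothetical single-column tables, `a` and `b`, in an in-memory database; none of the names come from `ecommerce.db`.

```python
import sqlite3

# Hypothetical data: the value 2 appears twice in a and once in b.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE a (n INTEGER)")
demo.execute("CREATE TABLE b (n INTEGER)")
demo.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (2,)])
demo.executemany("INSERT INTO b VALUES (?)", [(2,), (3,)])

# UNION removes every duplicate row, including duplicates within one input.
union_rows = demo.execute(
    "SELECT n FROM a UNION SELECT n FROM b ORDER BY n"
).fetchall()

# UNION ALL simply concatenates the two inputs.
union_all_rows = demo.execute(
    "SELECT n FROM a UNION ALL SELECT n FROM b ORDER BY n"
).fetchall()

# A constant label column makes rows from a and b differ, so UNION keeps both.
labeled_rows = demo.execute(
    "SELECT 'a' AS src, n FROM a UNION SELECT 'b', n FROM b ORDER BY src, n"
).fetchall()
demo.close()

print(union_rows)
print(union_all_rows)
print(labeled_rows)
```

UNION returns three rows and UNION ALL returns five. In the labeled query the value 2 survives once per source, since duplicates are detected on the whole row, not on a single column.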

In [None]:
%%sql
-- UNION: combine two result sets
-- A product that is both high value and low stock appears twice, once per label
SELECT 'High Value' AS category, product_name, price
FROM products
WHERE price > 100

UNION

SELECT 'Low Stock' AS category, product_name, price
FROM products
WHERE stock_quantity < 50

ORDER BY price DESC
LIMIT 15

In [None]:
%%sql
-- UNION ALL: keep duplicates (faster)
SELECT customer_id, 'Customer' AS type FROM customers WHERE country = 'USA'
UNION ALL
SELECT customer_id, 'High Spender' AS type 
FROM orders 
WHERE total_amount > 300
LIMIT 20

## 6. Complex Multi-Table Queries

In [None]:
%%sql
-- Customer lifetime value with rankings
WITH customer_totals AS (
    SELECT 
        c.customer_id,
        c.first_name || ' ' || c.last_name AS customer_name,
        c.country,
        COUNT(o.order_id) AS order_count,
        SUM(o.total_amount) AS lifetime_value
    FROM customers c
    LEFT JOIN orders o ON c.customer_id = o.customer_id
    GROUP BY c.customer_id, customer_name, c.country
)
SELECT 
    customer_name,
    country,
    order_count,
    ROUND(lifetime_value, 2) AS lifetime_value,
    RANK() OVER (ORDER BY lifetime_value DESC) AS value_rank,
    CASE
        WHEN lifetime_value > 500 THEN 'VIP'
        WHEN lifetime_value > 200 THEN 'Gold'
        WHEN lifetime_value > 100 THEN 'Silver'
        ELSE 'Bronze'
    END AS customer_tier
FROM customer_totals
WHERE lifetime_value IS NOT NULL
ORDER BY lifetime_value DESC
LIMIT 20

In [None]:
%%sql
-- Product performance analysis
SELECT 
    p.product_name,
    c.category_name,
    p.price,
    p.stock_quantity,
    COUNT(oi.order_item_id) AS times_ordered,
    SUM(oi.quantity) AS total_quantity_sold,
    ROUND(SUM(oi.quantity * oi.price), 2) AS total_revenue,
    RANK() OVER (PARTITION BY c.category_name ORDER BY SUM(oi.quantity * oi.price) DESC) AS category_revenue_rank
FROM products p
JOIN categories c ON p.category_id = c.category_id
LEFT JOIN order_items oi ON p.product_id = oi.product_id
GROUP BY p.product_id, p.product_name, c.category_name, p.price, p.stock_quantity
ORDER BY total_revenue DESC
LIMIT 20

## 7. Pivot-Style Queries with CASE

In [None]:
%%sql
-- Pivot: count orders by status and month
SELECT 
    strftime('%Y-%m', order_date) AS month,
    COUNT(*) AS total_orders,
    SUM(CASE WHEN status = 'completed' THEN 1 ELSE 0 END) AS completed,
    SUM(CASE WHEN status = 'pending' THEN 1 ELSE 0 END) AS pending,
    SUM(CASE WHEN status = 'shipped' THEN 1 ELSE 0 END) AS shipped
FROM orders
GROUP BY month
ORDER BY month DESC
LIMIT 10

## 8. Exercises

### Exercise 1: Customer Segmentation
Create a query that categorizes customers based on their total spending:
- VIP: > $500
- Premium: $200-$500
- Regular: < $200

Show customer name, total spent, and segment.

In [None]:
%%sql
-- Your code here

### Exercise 2: Top 5 Products Per Category
Find the top 5 best-selling products (by quantity) in each category using window functions.

In [None]:
%%sql
-- Your code here

### Exercise 3: Running Total Revenue
Calculate the running total of revenue by date for all completed orders.

In [None]:
%%sql
-- Your code here

### Exercise 4: Monthly Sales Summary
Create a pivot-style report showing total orders and revenue by month and status.

In [None]:
%%sql
-- Your code here

## Summary

In this module, you learned:
- ✓ CASE statements for conditional logic
- ✓ Window functions (ROW_NUMBER, RANK, DENSE_RANK)
- ✓ PARTITION BY for grouped calculations
- ✓ Running totals and moving averages
- ✓ UNION and UNION ALL
- ✓ Complex multi-table analytical queries
- ✓ Pivot-style reporting with CASE

**Key Takeaways:**
- Window functions compute per-row analytics without collapsing rows, often replacing self-joins or correlated subqueries
- PARTITION BY allows group-wise calculations
- CASE adds conditional logic for categorization and pivoting
- Combine techniques for powerful business intelligence queries

**Next:** Module 09 - Performance & Optimization

In [None]:
conn.close()