A comprehensive SQL curriculum with 50+ hands-on problems covering everything from basic SELECT statements to real-world data engineering scenarios. Each problem includes a clear statement, hints, a complete solution, and a detailed explanation.
Built for PostgreSQL 15. Practice locally with Docker or deploy to AWS RDS.
- Setup
- How to Use
- Problem Sections
- 01 Basics (Beginner)
- 02 Intermediate (Intermediate)
- 03 Window Functions (Intermediate / Advanced)
- 04 CTEs and Recursion (Intermediate / Advanced)
- 05 Advanced (Advanced)
- 06 Optimization (Advanced)
- 07 Real-World Scenarios (Advanced)
- Database Schema
- Infrastructure
- Reference Docs
cd setup/
docker compose up -d
# Connect to the database
psql -h localhost -p 5432 -U student -d masterclass
# Password: sqlmaster2023
# Or use pgAdmin at http://localhost:8080
# Email: admin@sqlmasterclass.com / Password: admincd infra/terraform/
terraform init
terraform apply -var="db_password=YourSecurePassword"
# Connect using the output endpoint
psql -h <rds-endpoint> -p 5432 -U masterclass -d masterclasspsql -h your-host -U your-user -d your-db -f setup/create_tables.sql
psql -h your-host -U your-user -d your-db -f setup/seed_data.sqlEach .sql file follows this format:
-- PROBLEM: Clear statement with expected output
-- HINT: A nudge in the right direction
-- SOLUTION: Complete SQL with inline comments
-- EXPLANATION: Why this works, alternatives, complexityRecommended workflow:
- Read only the PROBLEM section
- Write your own query and test it
- If stuck, read the HINT
- Compare your solution to the provided SOLUTION
- Read the EXPLANATION to learn trade-offs and alternatives
- Move to the next problem
Problems are ordered by difficulty and build on previous concepts.
| # | Problem | Topics |
|---|---|---|
| 1 | SELECT and Filtering | WHERE, AND/OR, IN, BETWEEN, LIKE, IS NULL |
| 2 | Introduction to JOINs | INNER, LEFT, RIGHT, FULL OUTER JOIN |
| 3 | Aggregations | GROUP BY, HAVING, COUNT, SUM, AVG |
| 4 | Subqueries | Scalar, IN, EXISTS, derived tables |
| 5 | Set Operations | UNION, INTERSECT, EXCEPT |
| # | Problem | Topics |
|---|---|---|
| 6 | Complex Joins | Self-joins, cross joins, anti-joins |
| 7 | Correlated Subqueries | Correlated subqueries, LATERAL |
| 8 | CASE Expressions | Conditional logic, pivot with CASE |
| 9 | Date Functions | DATE_TRUNC, EXTRACT, AGE, INTERVAL |
| 10 | String Manipulation | SPLIT_PART, REGEXP, STRING_AGG |
| # | Problem | Topics |
|---|---|---|
| 11 | ROW_NUMBER and RANK | ROW_NUMBER, RANK, DENSE_RANK |
| 12 | LEAD and LAG | Consecutive row comparison, gap analysis |
| 13 | Running Totals | Cumulative SUM, frame specifications |
| 14 | Moving Averages | AVG OVER, trailing/centered windows |
| 15 | NTILE and Percentiles | NTILE, PERCENT_RANK, PERCENTILE_CONT |
| 16 | FIRST_VALUE / LAST_VALUE | FIRST_VALUE, LAST_VALUE, NTH_VALUE |
| 17 | PARTITION BY Patterns | Multiple partitions, gaps and islands |
| # | Problem | Topics |
|---|---|---|
| 18 | Basic CTEs | WITH clause, multi-step queries |
| 19 | Multiple CTEs | CTE chaining, dashboards, funnels |
| 20 | Recursive Hierarchy | Org charts, category trees, cost rollup |
| 21 | Recursive Dates | Date generation, gap filling, calendars |
| 22 | CTE vs Subquery | MATERIALIZED, performance trade-offs |
| # | Problem | Topics |
|---|---|---|
| 23 | Pivot and Unpivot | Conditional aggregation, CROSSTAB |
| 24 | LATERAL Joins | Top-N per group, correlated FROM |
| 25 | GROUPING SETS | GROUPING SETS, CUBE, ROLLUP |
| 26 | MERGE / UPSERT | ON CONFLICT, conditional upserts |
| 27 | JSON Functions | JSONB operators, json_agg, extraction |
| 28 | Array Operations | UNNEST, array_agg, @>, && operators |
| 29 | Materialized Views | CREATE, REFRESH, CONCURRENTLY |
| # | Problem | Topics |
|---|---|---|
| 30 | EXPLAIN Plans | EXPLAIN ANALYZE, scan types, costs |
| 31 | Index Strategies | B-tree, composite, partial, covering |
| 32 | Query Rewriting | Anti-patterns, sargability, refactoring |
| 33 | Partitioning | Range, list, hash partitioning |
| 34 | Statistics and Hints | ANALYZE, pg_stats, extended statistics |
| # | Problem | Topics |
|---|---|---|
| 35 | Revenue Reporting | MoM/YoY growth, Pareto analysis |
| 36 | User Retention | Cohort retention, retention triangle |
| 37 | Funnel Analysis | Conversion funnels, drop-off rates |
| 38 | Session Analysis | Bounce rate, sessionization |
| 39 | Inventory Management | ABC analysis, reorder points, turnover |
| 40 | Customer Segmentation | RFM analysis, behavioral segments |
| 41 | Time Series Gaps | Gap detection, missing data |
| 42 | Ranking Leaderboards | Dynamic rankings, rank changes |
| 43 | A/B Test Analysis | Statistical significance, z-scores |
| 44 | Fraud Detection | Anomaly detection, velocity checks |
| 45 | Supply Chain Metrics | Lead times, supplier reliability |
| 46 | Cohort Retention | LTV estimation, cohort curves |
| 47 | Marketplace Matching | Collaborative filtering, recommendations |
| 48 | Event Sequence | Path analysis, navigation patterns |
| 49 | Data Quality Checks | Validation framework, profiling |
| 50 | Interview Questions | 10 classic SQL interview problems |
The practice database models an e-commerce platform:
departments ──< employees (self-referencing: manager_id)
categories (self-referencing: parent_id) ──< products
customers ──< orders ──< order_items >── products
customers ──< sessions ──< page_events (JSONB event_data)
ab_tests ──< ab_test_assignments >── customers
suppliers ──< shipments >── orders
products ──< inventory_logs
Tables and approximate row counts:
| Table | Rows | Description |
|---|---|---|
| departments | 8 | Company departments |
| employees | 60 | Employees with org hierarchy |
| categories | 15 | Product categories (hierarchical) |
| products | 80 | Products with pricing |
| customers | 200+ | Customer profiles |
| orders | 500 | Orders with status tracking |
| order_items | 1200 | Line items per order |
| sessions | 400 | Web/mobile sessions |
| page_events | 2000 | Clickstream data (JSONB) |
| ab_tests | 5 | A/B test definitions |
| ab_test_assignments | 300 | Test variant assignments |
| suppliers | 10 | Supplier profiles |
| inventory_logs | 400 | Stock change events |
| shipments | 350 | Order shipments |
The infra/terraform/ directory provisions:
- RDS PostgreSQL 15 (db.t3.micro) with Performance Insights
- Athena Workgroup with S3 results bucket and cost controls
- VPC, subnets, and security groups
- IAM roles for Athena access
- SQL Cheatsheet -- Quick syntax reference for PostgreSQL
- Optimization Guide -- EXPLAIN plans, indexes, and anti-patterns
MIT