Skip to content

balasivapindra/sql-masterclass

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SQL Masterclass -- 50+ Problems from Basics to Advanced

PostgreSQL SQL AWS Athena Docker Problems

A comprehensive SQL curriculum with 50+ hands-on problems covering everything from basic SELECT statements to real-world data engineering scenarios. Each problem includes a clear statement, hints, a complete solution, and a detailed explanation.

Built for PostgreSQL 15. Practice locally with Docker or deploy to AWS RDS.


Table of Contents


Setup

Option 1: Docker (Recommended)

cd setup/
docker compose up -d

# Connect to the database
psql -h localhost -p 5432 -U student -d masterclass
# Password: sqlmaster2023

# Or use pgAdmin at http://localhost:8080
# Email: admin@sqlmasterclass.com / Password: admin

Option 2: AWS RDS

cd infra/terraform/
terraform init
terraform apply -var="db_password=YourSecurePassword"

# Connect using the output endpoint
psql -h <rds-endpoint> -p 5432 -U masterclass -d masterclass

Option 3: Existing PostgreSQL

psql -h your-host -U your-user -d your-db -f setup/create_tables.sql
psql -h your-host -U your-user -d your-db -f setup/seed_data.sql

How to Use

Each .sql file follows this format:

-- PROBLEM:    Clear statement with expected output
-- HINT:       A nudge in the right direction
-- SOLUTION:   Complete SQL with inline comments
-- EXPLANATION: Why this works, alternatives, complexity

Recommended workflow:

  1. Read only the PROBLEM section
  2. Write your own query and test it
  3. If stuck, read the HINT
  4. Compare your solution to the provided SOLUTION
  5. Read the EXPLANATION to learn trade-offs and alternatives
  6. Move to the next problem

Problems are ordered by difficulty and build on previous concepts.


Problem Sections

01 Basics Beginner

# Problem Topics
1 SELECT and Filtering WHERE, AND/OR, IN, BETWEEN, LIKE, IS NULL
2 Introduction to JOINs INNER, LEFT, RIGHT, FULL OUTER JOIN
3 Aggregations GROUP BY, HAVING, COUNT, SUM, AVG
4 Subqueries Scalar, IN, EXISTS, derived tables
5 Set Operations UNION, INTERSECT, EXCEPT

02 Intermediate Intermediate

# Problem Topics
6 Complex Joins Self-joins, cross joins, anti-joins
7 Correlated Subqueries Correlated subqueries, LATERAL
8 CASE Expressions Conditional logic, pivot with CASE
9 Date Functions DATE_TRUNC, EXTRACT, AGE, INTERVAL
10 String Manipulation SPLIT_PART, REGEXP, STRING_AGG

03 Window Functions Intermediate / Advanced

# Problem Topics
11 ROW_NUMBER and RANK ROW_NUMBER, RANK, DENSE_RANK
12 LEAD and LAG Consecutive row comparison, gap analysis
13 Running Totals Cumulative SUM, frame specifications
14 Moving Averages AVG OVER, trailing/centered windows
15 NTILE and Percentiles NTILE, PERCENT_RANK, PERCENTILE_CONT
16 FIRST_VALUE / LAST_VALUE FIRST_VALUE, LAST_VALUE, NTH_VALUE
17 PARTITION BY Patterns Multiple partitions, gaps and islands

04 CTEs and Recursion Intermediate / Advanced

# Problem Topics
18 Basic CTEs WITH clause, multi-step queries
19 Multiple CTEs CTE chaining, dashboards, funnels
20 Recursive Hierarchy Org charts, category trees, cost rollup
21 Recursive Dates Date generation, gap filling, calendars
22 CTE vs Subquery MATERIALIZED, performance trade-offs

05 Advanced Advanced

# Problem Topics
23 Pivot and Unpivot Conditional aggregation, CROSSTAB
24 LATERAL Joins Top-N per group, correlated FROM
25 GROUPING SETS GROUPING SETS, CUBE, ROLLUP
26 MERGE / UPSERT ON CONFLICT, conditional upserts
27 JSON Functions JSONB operators, json_agg, extraction
28 Array Operations UNNEST, array_agg, @>, && operators
29 Materialized Views CREATE, REFRESH, CONCURRENTLY

06 Optimization Advanced

# Problem Topics
30 EXPLAIN Plans EXPLAIN ANALYZE, scan types, costs
31 Index Strategies B-tree, composite, partial, covering
32 Query Rewriting Anti-patterns, sargability, refactoring
33 Partitioning Range, list, hash partitioning
34 Statistics and Hints ANALYZE, pg_stats, extended statistics

07 Real-World Scenarios Advanced

# Problem Topics
35 Revenue Reporting MoM/YoY growth, Pareto analysis
36 User Retention Cohort retention, retention triangle
37 Funnel Analysis Conversion funnels, drop-off rates
38 Session Analysis Bounce rate, sessionization
39 Inventory Management ABC analysis, reorder points, turnover
40 Customer Segmentation RFM analysis, behavioral segments
41 Time Series Gaps Gap detection, missing data
42 Ranking Leaderboards Dynamic rankings, rank changes
43 A/B Test Analysis Statistical significance, z-scores
44 Fraud Detection Anomaly detection, velocity checks
45 Supply Chain Metrics Lead times, supplier reliability
46 Cohort Retention LTV estimation, cohort curves
47 Marketplace Matching Collaborative filtering, recommendations
48 Event Sequence Path analysis, navigation patterns
49 Data Quality Checks Validation framework, profiling
50 Interview Questions 10 classic SQL interview problems

Database Schema

The practice database models an e-commerce platform:

departments ──< employees (self-referencing: manager_id)
categories (self-referencing: parent_id) ──< products
customers ──< orders ──< order_items >── products
customers ──< sessions ──< page_events (JSONB event_data)
ab_tests ──< ab_test_assignments >── customers
suppliers ──< shipments >── orders
products ──< inventory_logs

Tables and approximate row counts:

Table Rows Description
departments 8 Company departments
employees 60 Employees with org hierarchy
categories 15 Product categories (hierarchical)
products 80 Products with pricing
customers 200+ Customer profiles
orders 500 Orders with status tracking
order_items 1200 Line items per order
sessions 400 Web/mobile sessions
page_events 2000 Clickstream data (JSONB)
ab_tests 5 A/B test definitions
ab_test_assignments 300 Test variant assignments
suppliers 10 Supplier profiles
inventory_logs 400 Stock change events
shipments 350 Order shipments

Infrastructure

The infra/terraform/ directory provisions:

  • RDS PostgreSQL 15 (db.t3.micro) with Performance Insights
  • Athena Workgroup with S3 results bucket and cost controls
  • VPC, subnets, and security groups
  • IAM roles for Athena access

Reference Docs


License

MIT

About

50+ SQL problems and solutions from basics to advanced — window functions, CTEs, optimization, and real-world scenarios

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors