A connected, production-style SQL learning repository built around one end-to-end business case:
Mingalar Mart, a fictional Myanmar retail chain with stores, delivery operations, social commerce, customer activity data, and branch inventory tracking.
This repo is designed to feel like a real Data Engineering project:
- one connected story,
- one shared schema,
- one growing set of business questions,
- and one clear path from SQL basics to pipeline thinking.
Most SQL repos teach syntax in isolation. This one teaches SQL the way data teams use it:
- understand the source system,
- inspect business grain,
- join messy real-world tables,
- build correct KPIs,
- transform raw data into marts,
- validate logic,
- and explain tradeoffs clearly.
The whole course stays inside the same Myanmar retail world, so each module builds on the last instead of starting over.
You are the Data Engineer supporting Mingalar Mart's analytics stack.
The business runs across multiple cities in Myanmar and needs help answering questions like:
- Which branches are strongest by completed sales?
- Which channels perform best:
in_store,delivery_app,facebook_live, orcall_center? - Which product categories drive revenue in Yangon vs Mandalay?
- Which customers are becoming repeat buyers?
- Which products are at stock risk?
- Which app campaigns generate interest but weak order conversion?
- read schemas and understand grain
- filter, sort, and inspect retail transactions
- work safely with keys and relationships
- join orders, customers, products, stores, and payments
- aggregate branch, category, and channel KPIs
- express business rules with conditional logic
- build staged logic with CTEs
- use windows for recency, ranking, and running metrics
- clean raw retail data for reporting
- design staging and mart layers
- write validation SQL
- reason about performance and modeling
- solve real-world retail analytics patterns
The repository includes a synthetic but realistic Myanmar retail dataset with:
| Entity | Rows |
|---|---|
| Stores | 12 |
| Suppliers | 6 |
| Products | 48 |
| Customers | 140 |
| Orders | 650 |
| Order Items | 1,594 |
| Payments | 650 |
| App Events | 1,311 |
| Delivery Events | 1,636 |
| Inventory Snapshots | 576 |
flowchart LR
A[Raw Retail Sources] --> B[Staging SQL]
B --> C[Intermediate Models]
C --> D[Retail Marts]
D --> E[Finance Reporting]
D --> F[Operations Reporting]
D --> G[Growth Analytics]
README.md -> public landing page
ROADMAP.md -> full curriculum roadmap
docs/storyline.md -> business context and metric definitions
datasets/schemas/ -> schema and seed SQL
datasets/raw/ -> source-style CSV data
modules/ -> module lessons from beginner to advanced
exercises/ -> practice prompts and answer keys
queries/ -> practice SQL, solutions, reusable patterns
projects/ -> mini-projects and capstones
| Stage | Modules | Outcome |
|---|---|---|
| 1 | 00-02 |
Understand the business, schema, keys, and raw tables |
| 2 | 03-05 |
Build joins, KPIs, and business logic correctly |
| 3 | 06-08 |
Use advanced SQL to transform and clean retail data |
| 4 | 09-12 |
Think like a Data Engineer: marts, validation, modeling, performance |
| 5 | 13-16 |
Build projects, capstones, and interview-ready portfolio work |
- Read
docs/storyline.md - Load
course_schema.sql - Load
course_seed.sql - Start with
modules/00_orientation/README.md - Practice with
queries/practice - Solve exercises before checking
exercises/solutions
WITH order_sales AS (
SELECT
oi.order_id,
SUM(oi.line_total_mmk) AS net_item_sales_mmk
FROM order_items AS oi
GROUP BY oi.order_id
)
SELECT
s.city,
o.sales_channel,
COUNT(DISTINCT o.order_id) AS completed_orders,
SUM(os.net_item_sales_mmk + o.delivery_fee_mmk) AS net_order_sales_mmk
FROM orders AS o
INNER JOIN order_sales AS os
ON o.order_id = os.order_id
INNER JOIN stores AS s
ON o.store_id = s.store_id
WHERE o.order_status = 'completed'
GROUP BY s.city, o.sales_channel
ORDER BY net_order_sales_mmk DESC;- Course roadmap:
ROADMAP.md - Storyline and KPIs:
docs/storyline.md - Setup:
docs/setup-guide.md - Best practices:
docs/sql-best-practices.md - Common mistakes:
docs/common-mistakes.md
By the end of this repo, you should be able to talk through:
- retail source-to-mart SQL design,
- branch and channel KPI logic,
- grain and duplication pitfalls,
- validation and idempotent transformation patterns,
- star schema basics,
- and practical SQL reasoning in interviews.
- rebuild one mart in dbt
- port the models to BigQuery or Snowflake
- add tests and freshness checks
- orchestrate refreshes with Airflow
See CONTRIBUTING.md. Contributions should preserve the connected Mingalar Mart storyline rather than introducing unrelated toy examples.