# Star vs Normalized (3NF) vs Snowflake — **Super Simple Walkthrough**

_Plain-English explanations with tiny pandas code examples._
_Generated on 2025-09-11._

## What you’ll learn
- What each schema style is for.
- How the tables are laid out in each.
- Roughly how many joins a typical query needs.
- The main pros and cons.

## Setup

In [1]:
import pandas as pd
print('pandas:', pd.__version__)

pandas: 2.3.2


## Normalized (3NF)
- Goal: **reduce redundancy** so each fact is stored once and stays consistent.
- Result: many small tables, and most queries need several joins to reassemble the data.


In [2]:
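# One row per order; order_id is unique, and customer details live elsewhere.
orders = pd.DataFrame({
 'order_id':[1001,1002,1003],
 'customer_id':[10,10,11],
 'order_date':['2024-01-01','2024-01-01','2024-01-02']
})
# One row per customer, so a name change touches a single row.
customers = pd.DataFrame({
 'customer_id':[10,11],
 'customer_name':['Acme','Bravo']
})
# Join on the foreign key to bring the customer name back in.
orders.merge(customers, on='customer_id')

| | order_id | customer_id | order_date | customer_name |
|---|---|---|---|---|
| 0 | 1001 | 10 | 2024-01-01 | Acme |
| 1 | 1002 | 10 | 2024-01-01 | Acme |
| 2 | 1003 | 11 | 2024-01-02 | Bravo |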
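
The "lots of joins" point is easier to see with one more level of detail. The following is a minimal sketch, not part of the original notebook: the `order_items` and `products` tables, their columns, and the prices are hypothetical, chosen only to show how a single report can need three joins in a normalized layout.

```python
import pandas as pd

# Hypothetical 3NF tables: each fact is stored exactly once.
customers = pd.DataFrame({
    'customer_id': [10, 11],
    'customer_name': ['Acme', 'Bravo'],
})
orders = pd.DataFrame({
    'order_id': [1001, 1002],
    'customer_id': [10, 11],
    'order_date': ['2024-01-01', '2024-01-02'],
})
products = pd.DataFrame({
    'product_id': [1, 2],
    'product_name': ['Widget A', 'Widget B'],
    'unit_price': [100, 120],
})
order_items = pd.DataFrame({
    'order_id': [1001, 1001, 1002],
    'product_id': [1, 2, 2],
    'quantity': [2, 1, 3],
})

# Three joins to answer "who bought what, and for how much?"
# validate='many_to_one' makes pandas raise if a lookup key is duplicated.
report = (
    order_items
    .merge(orders, on='order_id', validate='many_to_one')
    .merge(customers, on='customer_id', validate='many_to_one')
    .merge(products, on='product_id', validate='many_to_one')
)
report['line_total'] = report['quantity'] * report['unit_price']
print(report[['order_id', 'customer_name', 'product_name', 'line_total']])
```

Passing `validate='many_to_one'` is optional, but it is a cheap way to catch a lookup table that accidentally contains duplicate keys before the join silently multiplies rows.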


## Star Schema
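- A central **fact table** holds the measures (such as `sales_amount`) plus one key per dimension; **denormalized dimension** tables surround it.
- Easier for analytics: every dimension is a single join away from the fact table.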


In [3]:
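# Fact table: one row per sale, with dimension keys and a numeric measure.
fact_sales = pd.DataFrame({
 'date_key':[20240101,20240101,20240102],
 'product_key':[1,2,2],
 'customer_key':[10,10,11],
 'sales_amount':[200,120,360]
})
# Dimension table: descriptive attributes for each customer_key.
dim_customer = pd.DataFrame({
 'customer_key':[10,11],
 'customer_name':['Acme','Bravo']
})
# One join attaches the customer attributes to every sale.
fact_sales.merge(dim_customer, on='customer_key')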

| | date_key | product_key | customer_key | sales_amount | customer_name |
|---|---|---|---|---|---|
| 0 | 20240101 | 1 | 10 | 200 | Acme |
| 1 | 20240101 | 2 | 10 | 120 | Acme |
| 2 | 20240102 | 2 | 11 | 360 | Bravo |
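
To make "fewer joins" concrete, here is a small sketch of a typical analytics question, total sales per customer, answered with one join and one aggregation. It reuses the same toy tables as the cell above; the name `sales_by_customer` is introduced only for illustration.

```python
import pandas as pd

fact_sales = pd.DataFrame({
    'date_key': [20240101, 20240101, 20240102],
    'product_key': [1, 2, 2],
    'customer_key': [10, 10, 11],
    'sales_amount': [200, 120, 360],
})
dim_customer = pd.DataFrame({
    'customer_key': [10, 11],
    'customer_name': ['Acme', 'Bravo'],
})

# One join to the dimension, then aggregate the measure.
sales_by_customer = (
    fact_sales
    .merge(dim_customer, on='customer_key', validate='many_to_one')
    .groupby('customer_name', as_index=False)['sales_amount']
    .sum()
)
print(sales_by_customer)
```

With this data, Acme totals 320 and Bravo totals 360.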


## Snowflake Schema
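- Like a star schema, but dimensions are **normalized** into sub-dimensions (here, product and category).
- Stores each category name once instead of once per product, at the cost of an extra join per level.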


In [4]:
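# Product dimension keeps only a reference to its category.
dim_product = pd.DataFrame({
 'product_key':[1,2], 'category_id':[101,101]
})
# Sub-dimension: the category name is stored once.
dim_category = pd.DataFrame({
 'category_id':[101], 'category_name':['Widgets']
})
# Extra join needed to see the category name next to each product.
dim_product.merge(dim_category, on='category_id')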

| | product_key | category_id | category_name |
|---|---|---|---|
| 0 | 1 | 101 | Widgets |
| 1 | 2 | 101 | Widgets |

## Recap
- **3NF**: many joins and minimal redundancy; well suited to transactional workloads.
- **Star**: fewer joins; usually the simplest layout for analytics.
- **Snowflake**: a compromise; less redundancy than star, but more joins.
