
---

## 🔷 Product Dimension Table — Deep Explanation

---

### ✅ Loading `product_master` into the Product Dimension Table: A Real Scenario

**Scenario:**
Let’s say you work for **BigMart**, a retail chain with thousands of SKUs.

The **`product_master` table** from the OLTP system (used by inventory and billing teams) has:

```text
product_id | product_name         | category  | subcategory | brand    | size | uom | standard_price | status
-----------|----------------------|-----------|-------------|----------|------|-----|----------------|-------
1001       | Coca-Cola 500ml Pet  | Beverage  | Soft Drink  | Coca-Cola| 500  | ml  | 30             | active
```

This table is **volatile** (rows are updated in place, in real time) and is structured for transactions, **not for analysis**.

You extract-transform-load (ETL) this into the **`dim_product` table** in the data warehouse:

| product_key | product_id | product_name | category_code | brand_code | uom | size | is_active | valid_from | valid_to |
| ------------ | ----------- | --------------- | -------------- | ----------- | --- | ---- | ---------- | ----------- | ---------- |
| 101          | 1001        | Coca-Cola 500ml | BEV            | COCA        | ml  | 500  | Y          | 2024-01-01  | 9999-12-31 |

Here, `category_code` and `brand_code` act as **foreign keys** pointing to separate, normalized dimension tables such as `dim_category` and `dim_brand`.
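
The mapping from a source row to a dimension row can be sketched in a few lines of Python. This is a minimal illustration, not BigMart's actual ETL: the lookup dictionaries `CATEGORY_CODES` and `BRAND_CODES`, the function `to_dim_row`, and the way the surrogate key is passed in are all assumptions made for the example. Name cleanup (such as dropping "Pet" from the product name) is not shown.

```python
from datetime import date

# Hypothetical lookups from source names to warehouse codes (assumed for this sketch).
CATEGORY_CODES = {"Beverage": "BEV"}
BRAND_CODES = {"Coca-Cola": "COCA"}

OPEN_END = date(9999, 12, 31)  # sentinel meaning "this is the current version"


def to_dim_row(src: dict, product_key: int, load_date: date) -> dict:
    """Map one product_master record to one dim_product record."""
    return {
        "product_key": product_key,        # surrogate key assigned by the warehouse
        "product_id": src["product_id"],   # natural key kept for traceability
        "product_name": src["product_name"],
        "category_code": CATEGORY_CODES[src["category"]],
        "brand_code": BRAND_CODES[src["brand"]],
        "uom": src["uom"],
        "size": src["size"],
        "is_active": "Y" if src["status"] == "active" else "N",
        "valid_from": load_date,
        "valid_to": OPEN_END,
    }


source_row = {
    "product_id": 1001, "product_name": "Coca-Cola 500ml Pet",
    "category": "Beverage", "subcategory": "Soft Drink", "brand": "Coca-Cola",
    "size": 500, "uom": "ml", "standard_price": 30, "status": "active",
}
dim_row = to_dim_row(source_row, product_key=101, load_date=date(2024, 1, 1))
print(dim_row)
```

A real pipeline would normally generate `product_key` from a sequence and handle unknown categories or brands instead of raising `KeyError`.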

---

### ✅ Important Attributes in Product Dimension Table

| Attribute                | Purpose                                        |
| ------------------------ | ---------------------------------------------- |
| `product_key`            | Surrogate key (DW-specific ID, not from OLTP)  |
| `product_id`             | Natural key from source system                 |
| `product_name`           | Human-readable name                            |
| `brand_code`             | Foreign key to brand dimension                 |
| `category_code`          | Foreign key to category dimension              |
| `size`, `uom`            | Physical characteristics                       |
| `standard_price`         | Semi-static list price, sometimes kept for analytics |
| `is_active`              | Soft-deletion / filtering inactive products    |
| `valid_from`, `valid_to` | For slowly changing dimensions (track history) |
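
To show how `valid_from` and `valid_to` track history, here is a rough sketch of a slowly changing dimension (Type 2) update on an in-memory list of rows. The helper `apply_scd2`, the price change to 35, and the convention that `valid_to` is exclusive are assumptions for illustration only.

```python
from datetime import date

OPEN_END = date(9999, 12, 31)  # marks the current version of a product


def apply_scd2(rows: list, product_id: int, changes: dict, change_date: date) -> None:
    """Close the current version of a product and append a new version."""
    current = next(
        r for r in rows
        if r["product_id"] == product_id and r["valid_to"] == OPEN_END
    )
    current["valid_to"] = change_date      # assumed convention: valid_to is exclusive
    new_row = {**current, **changes}       # copy old attributes, overwrite changed ones
    new_row["product_key"] = max(r["product_key"] for r in rows) + 1  # new surrogate key
    new_row["valid_from"] = change_date
    new_row["valid_to"] = OPEN_END
    rows.append(new_row)


rows = [{
    "product_key": 101, "product_id": 1001, "standard_price": 30,
    "valid_from": date(2024, 1, 1), "valid_to": OPEN_END,
}]
apply_scd2(rows, 1001, {"standard_price": 35}, date(2024, 7, 1))  # hypothetical change
for r in rows:
    print(r)
```

After the call, the natural key `1001` has two rows with different surrogate keys, so facts loaded before the change still point at the old version.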

---

### ⚠️ Common Issue: Redundancy and Duplication

If we keep repeating all columns like `brand_name`, `category_name`, `subcategory_name` in the product table, we:

* Bloat the table
* Introduce **inconsistencies** (e.g., spelling variations, casing issues)
* Make updates **expensive** (renaming a brand means rewriting every product row that carries it) and **reporting messy**

---

### ✅ Solution: Normalize Repeating Info into Separate Dimension Tables

**Example:**

```text
dim_brand
----------
brand_code | brand_name     | brand_origin
COCA       | Coca-Cola      | USA

dim_category
------------
category_code | category_name | subcategory_name
BEV           | Beverage      | Soft Drink
```

Then in `dim_product`, we store:

```text
brand_code = COCA
category_code = BEV
```

This reduces columns, improves clarity, and allows **independent management** of attributes like brand origin.
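
The lookup through the codes can be tried out with Python's built-in `sqlite3` module. This is only a sketch using the sample rows above; the in-memory database and the column types are assumptions, and a real warehouse would use its own engine and DDL.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_brand (brand_code TEXT PRIMARY KEY, brand_name TEXT, brand_origin TEXT);
CREATE TABLE dim_category (category_code TEXT PRIMARY KEY, category_name TEXT, subcategory_name TEXT);
-- REFERENCES documents the relationship; SQLite enforces it only if PRAGMA foreign_keys is on.
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    product_name TEXT,
    brand_code TEXT REFERENCES dim_brand(brand_code),
    category_code TEXT REFERENCES dim_category(category_code)
);
INSERT INTO dim_brand VALUES ('COCA', 'Coca-Cola', 'USA');
INSERT INTO dim_category VALUES ('BEV', 'Beverage', 'Soft Drink');
INSERT INTO dim_product VALUES (101, 'Coca-Cola 500ml', 'COCA', 'BEV');
""")

# Resolve the codes back to readable names with two joins.
row = con.execute("""
    SELECT p.product_name, b.brand_name, b.brand_origin, c.category_name
    FROM dim_product AS p
    JOIN dim_brand AS b ON b.brand_code = p.brand_code
    JOIN dim_category AS c ON c.category_code = p.category_code
    WHERE p.product_key = 101
""").fetchone()
print(row)
```

One caveat about the sample layout: with `category_code` as the key of `dim_category`, each category can carry only one subcategory. A fuller model would probably key that table on a subcategory code instead.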

---

### 🌟 Star Schema vs Snowflake Schema

| Feature            | Star Schema                        | Snowflake Schema                        |
| ------------------ | ---------------------------------- | --------------------------------------- |
| Structure          | Flat denormalized dimension tables | Normalized dimensions (with sub-tables) |
| Query Performance  | Faster (fewer joins)               | Slightly slower (more joins)            |
| Storage Efficiency | More space used                    | Less space due to reduced redundancy    |
| Maintenance        | Easier                             | More complex (more tables and load dependencies) |
| Use Case           | Dashboarding, Ad-hoc BI            | Large, shared dimensions where redundancy is costly |

---

#### 🧑‍💼 Real Case Scenario

* **Star Schema Example:**

  * `fact_sales` → `dim_product`, `dim_store`, `dim_date`
  * `dim_product` has all columns: brand name, category name, subcategory

* **Snowflake Schema Example:**

  * `dim_product` → has only codes
  * `dim_brand`, `dim_category` → hold detailed info

Use **Star Schema** when BI tools or analysts prefer simplicity.
Use **Snowflake Schema** when dimension tables grow large and redundancy hurts performance or manageability.
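
The difference in join count is easy to see side by side. The sketch below builds both variants in an in-memory SQLite database and runs the same "quantity by brand" question against each. The table names `dim_product_star` and `dim_product_snow` and the sales quantities are invented for the comparison.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Star: brand_name lives directly on the product dimension.
CREATE TABLE dim_product_star (product_key INTEGER PRIMARY KEY, product_name TEXT, brand_name TEXT);
-- Snowflake: the product dimension holds only a code.
CREATE TABLE dim_brand (brand_code TEXT PRIMARY KEY, brand_name TEXT);
CREATE TABLE dim_product_snow (product_key INTEGER PRIMARY KEY, product_name TEXT, brand_code TEXT);
CREATE TABLE fact_sales (product_key INTEGER, quantity INTEGER);

INSERT INTO dim_product_star VALUES (101, 'Coca-Cola 500ml', 'Coca-Cola');
INSERT INTO dim_brand VALUES ('COCA', 'Coca-Cola');
INSERT INTO dim_product_snow VALUES (101, 'Coca-Cola 500ml', 'COCA');
INSERT INTO fact_sales VALUES (101, 3), (101, 5);  -- made-up quantities
""")

# Star schema: one join from the fact table to the dimension.
star = con.execute("""
    SELECT d.brand_name, SUM(f.quantity)
    FROM fact_sales AS f
    JOIN dim_product_star AS d ON d.product_key = f.product_key
    GROUP BY d.brand_name
""").fetchall()

# Snowflake schema: a second join is needed to reach the brand name.
snowflake = con.execute("""
    SELECT b.brand_name, SUM(f.quantity)
    FROM fact_sales AS f
    JOIN dim_product_snow AS d ON d.product_key = f.product_key
    JOIN dim_brand AS b ON b.brand_code = d.brand_code
    GROUP BY b.brand_name
""").fetchall()

print(star, snowflake)
```

Both queries return the same totals; the snowflake version simply pays for one extra join per normalized level.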

---

### 🧮 Why Include Measures in Product Dimension? (e.g., Standard Price)

**Because:**

* Some **semi-static measures** like `standard_price` change rarely and help calculate **derived measures** like:

  * Discount = Standard Price - Actual Price
* With **SCD Type 2 tracking**, each product version keeps the standard price that applied when the product was sold, so historical discounts stay correct.

🔸 But avoid frequently changing measures (like `daily_price`) in dimension tables; those belong in fact tables.
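
As a small worked example, the sketch below picks the product version that was valid on the sale date and computes the discount from it, taking discount as standard price minus actual price so that a sale below list price gives a positive number. The two versions, the price rise to 35, the sale prices, and the helper names are all hypothetical.

```python
from datetime import date

OPEN_END = date(9999, 12, 31)

# Two hypothetical versions of product 1001; the price rise to 35 is invented.
dim_product = [
    {"product_key": 101, "product_id": 1001, "standard_price": 30,
     "valid_from": date(2024, 1, 1), "valid_to": date(2024, 7, 1)},
    {"product_key": 102, "product_id": 1001, "standard_price": 35,
     "valid_from": date(2024, 7, 1), "valid_to": OPEN_END},
]


def standard_price_on(product_id: int, sale_date: date) -> int:
    """Return the standard price in effect on sale_date (valid_to treated as exclusive)."""
    for row in dim_product:
        if row["product_id"] == product_id and row["valid_from"] <= sale_date < row["valid_to"]:
            return row["standard_price"]
    raise LookupError(f"no version of product {product_id} on {sale_date}")


def discount(product_id: int, sale_date: date, actual_price: int) -> int:
    """Discount = standard price at the time of sale minus the price actually charged."""
    return standard_price_on(product_id, sale_date) - actual_price


print(discount(1001, date(2024, 3, 15), 27))  # uses the version priced at 30
print(discount(1001, date(2024, 8, 1), 27))   # uses the version priced at 35
```

In a warehouse the fact row would usually store the surrogate `product_key` of the right version at load time, so the date-range lookup happens once in ETL rather than in every query.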

---

### 🔄 Roll-up / Drill-up vs Drill-down / Roll-down

| Operation      | Meaning     | Example in Product Hierarchy            |
| -------------- | ----------- | --------------------------------------- |
| **Drill Down** | More detail | From `category → subcategory → product` |
| **Roll Up**    | Less detail | From `product → subcategory → category` |

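✅ Both operations depend on the hierarchy being present in the dimension: as linked **hierarchical dimension tables** in a Snowflake schema, or as plain columns on a flat dimension in a Star schema.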

Example:

```text
"Show me sales by subcategory" ← Roll up from product level
"Show me all products under Beverages" ← Drill down from category
```
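
The two operations can be sketched over a tiny flattened data set. Everything here is illustrative: the helper names `roll_up` and `drill_down`, the extra products, and all sales amounts are invented.

```python
from collections import defaultdict

# Hypothetical product-level sales; only the Coca-Cola row comes from the example above.
sales = [
    {"category": "Beverage", "subcategory": "Soft Drink", "product": "Coca-Cola 500ml", "amount": 90},
    {"category": "Beverage", "subcategory": "Soft Drink", "product": "Sample Cola 1L", "amount": 60},
    {"category": "Beverage", "subcategory": "Juice", "product": "Sample Juice 1L", "amount": 50},
]


def roll_up(rows: list, level: str) -> dict:
    """Aggregate amounts to a coarser level of the hierarchy."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[level]] += r["amount"]
    return dict(totals)


def drill_down(rows: list, level: str, value: str) -> list:
    """Return the detailed rows that sit under one member of a level."""
    return [r for r in rows if r[level] == value]


by_subcategory = roll_up(sales, "subcategory")          # product -> subcategory
by_category = roll_up(sales, "category")                # subcategory -> category
beverages = drill_down(sales, "category", "Beverage")   # category -> products

print(by_subcategory)
print(by_category)
print([r["product"] for r in beverages])
```

In SQL terms, rolling up corresponds to grouping by a higher-level column, and drilling down to filtering on it while selecting a lower-level one.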

---
