# Data Warehousing - Part 12: Slowly Changing Dimensions (SCD) Type 3

## 1. What is SCD Type 3?
**SCD Type 3** is a technique used to track **limited history**. It is useful when you only need to know the *current* value and the *immediately previous* value of an attribute.

*   **Mechanism:** Instead of adding a new row per change (like Type 2), we add a new **column** to the table, so each entity still occupies a single row.
*   **Structure:** The table will have columns like `Current_Address` and `Previous_Address`.
*   **Behavior:**
    *   **Current Data:** Lives in the main column.
    *   **History:** When a change happens, the old value moves to the "Previous" column, and the new value overwrites the "Current" column.
    *   **Limitation:** Only the *last* change is kept. If a customer moves twice, the original address is lost forever.

---

## 2. Python Simulation: SCD Type 3 Logic
We continue with our Employee example: `E001` lives in `Kolkata`.

### Initial State
```python
import pandas as pd

# --- 1. Current Dimension State ---
# Notice the extra column: Previous_Address
dim_data = {
    'Employee_Key': [1],
    'Employee_ID': ['E001'],
    'Current_Address': ['Kolkata'],
    'Previous_Address': [None],  # No history yet
    'Effective_Date': ['2023-01-01'] # When current became active
}

df_dim = pd.DataFrame(dim_data)

print("--- Dimension Table (Initial) ---")
display(df_dim)
```

### The Change
Employee `E001` moves to `Bengaluru`.

```python
# --- 2. Incoming Change ---
source_data = {
    'Employee_ID': ['E001'],
    'Address': ['Bengaluru']
}
df_source = pd.DataFrame(source_data)

# --- 3. Implementing SCD Type 3 Logic ---
# We simulate the update process

for _, row in df_source.iterrows():
    # Find the matching dimension record
    mask = df_dim['Employee_ID'] == row['Employee_ID']

    # Skip IDs missing from the dimension: a new member is an insert, not an SCD update
    if not mask.any():
        continue

    if df_dim.loc[mask, 'Current_Address'].values[0] != row['Address']:
        # 1. Move Current to Previous
        df_dim.loc[mask, 'Previous_Address'] = df_dim.loc[mask, 'Current_Address']

        # 2. Overwrite Current with New
        df_dim.loc[mask, 'Current_Address'] = row['Address']

        # 3. Update Date (hard-coded for the demo; a real load would use the change date)
        df_dim.loc[mask, 'Effective_Date'] = '2023-01-10'

print("\n--- Dimension Table (After SCD 3 Update) ---")
display(df_dim)
```
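
The row-by-row loop above is easy to follow but slow on large tables. As a rough sketch of the same logic applied to a whole batch at once, the hypothetical helper `apply_scd3` below uses a lookup instead of `iterrows()`. It assumes the source holds at most one row per `Employee_ID`, and it ignores IDs that do not yet exist in the dimension. The second employee (`E002`, `Pune`) and the unknown `E003` are made-up rows added only to exercise the unchanged and unmatched cases.

```python
import pandas as pd


def apply_scd3(dim, source, effective_date):
    """Return a copy of `dim` with SCD Type 3 changes from `source` applied."""
    out = dim.copy()
    # Look up each dimension row's incoming address by Employee_ID (NaN if absent)
    incoming = out['Employee_ID'].map(source.set_index('Employee_ID')['Address'])
    # A row changes only when a source value exists and differs from the current one
    changed = incoming.notna() & (incoming != out['Current_Address'])
    # Shift Current -> Previous first, then overwrite Current
    out.loc[changed, 'Previous_Address'] = out.loc[changed, 'Current_Address']
    out.loc[changed, 'Current_Address'] = incoming[changed]
    out.loc[changed, 'Effective_Date'] = effective_date
    return out


# Illustrative data: E002 and E003 are hypothetical additions
dim = pd.DataFrame({
    'Employee_Key': [1, 2],
    'Employee_ID': ['E001', 'E002'],
    'Current_Address': ['Kolkata', 'Pune'],
    'Previous_Address': [None, None],
    'Effective_Date': ['2023-01-01', '2023-01-01'],
})
source = pd.DataFrame({
    'Employee_ID': ['E001', 'E002', 'E003'],
    'Address': ['Bengaluru', 'Pune', 'Delhi'],
})

result = apply_scd3(dim, source, '2023-01-10')
print(result)
```

Only `E001` is shifted here. `E002` reports the same address, so its row and date stay untouched, and `E003` would need a separate insert step.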

### Another Change (History Loss)
Now, `E001` moves to `Indore`. Watch what happens to `Kolkata`.

```python
# New move to Indore
new_address = 'Indore'

# Move Current (Bengaluru) to Previous, then overwrite Current with Indore
mask = df_dim['Employee_ID'] == 'E001'
df_dim.loc[mask, 'Previous_Address'] = df_dim.loc[mask, 'Current_Address']  # Previous becomes Bengaluru
df_dim.loc[mask, 'Current_Address'] = new_address
df_dim.loc[mask, 'Effective_Date'] = '2023-02-01'  # Illustrative date; keeps the date in step with Current

print("\n--- Dimension Table (After 2nd Move) ---")
print("Notice: 'Kolkata' is completely gone.")
display(df_dim)
```
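
What Type 3 gives in exchange for the lost history is a cheap "now vs before" comparison: both values sit on the same row, so no join or date filter is needed. The snippet below is a small sketch of two such queries. The `E002` row and the `Address_Before` column name are hypothetical, added only for illustration.

```python
import pandas as pd

# Hypothetical dimension rows after some SCD Type 3 updates
df = pd.DataFrame({
    'Employee_ID': ['E001', 'E002'],
    'Current_Address': ['Indore', 'Pune'],
    'Previous_Address': ['Bengaluru', None],
})

# Employees with a recorded move: Previous_Address is populated
moved = df[df['Previous_Address'].notna()]

# "Before the last change" view: fall back to Current where nothing ever changed
df['Address_Before'] = df['Previous_Address'].fillna(df['Current_Address'])

print(moved)
print(df)
```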

---

## 3. Comprehensive Summary: SCD 1 vs 2 vs 3

This comparison is a common interview question.

| Feature | SCD Type 1 | SCD Type 2 | SCD Type 3 |
| :--- | :--- | :--- | :--- |
| **Name** | Overwrite | Add Row | Add Column |
| **History Preserved?** | No (0%) | Yes (100%) | Limited (Current + Previous) |
| **Table Size** | Constant (one row per entity) | Growing (one row per entity, plus one per change) | Constant rows (one per entity), one extra column per tracked attribute |
| **Columns Added** | None | Start_Date, End_Date, Active_Flag | Previous_Value_Column |
| **Complexity** | Low | Medium | Low |
| **Use Case** | Correction of errors | Tracking history for reporting | Comparing "Now vs Before" |
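
To make the table concrete, here is a minimal sketch that applies the same two moves (`Kolkata` to `Bengaluru` to `Indore`) under each type. The helper names `scd1`, `scd2`, and `scd3` are hypothetical, the Type 2 columns follow the table above, and the second change date is an assumed value. Surrogate keys are left out to keep the comparison short.

```python
import pandas as pd


def scd1(dim, emp_id, new_addr):
    """Type 1: overwrite in place, no history."""
    out = dim.copy()
    out.loc[out['Employee_ID'] == emp_id, 'Address'] = new_addr
    return out


def scd2(dim, emp_id, new_addr, change_date):
    """Type 2: close the active row, then append a new active row."""
    out = dim.copy()
    active = (out['Employee_ID'] == emp_id) & out['Active_Flag']
    out.loc[active, 'End_Date'] = change_date
    out.loc[active, 'Active_Flag'] = False
    new_row = pd.DataFrame([{
        'Employee_ID': emp_id, 'Address': new_addr,
        'Start_Date': change_date, 'End_Date': None, 'Active_Flag': True,
    }])
    return pd.concat([out, new_row], ignore_index=True)


def scd3(dim, emp_id, new_addr):
    """Type 3: shift Current to Previous, then overwrite Current."""
    out = dim.copy()
    mask = out['Employee_ID'] == emp_id
    out.loc[mask, 'Previous_Address'] = out.loc[mask, 'Current_Address']
    out.loc[mask, 'Current_Address'] = new_addr
    return out


t1 = pd.DataFrame({'Employee_ID': ['E001'], 'Address': ['Kolkata']})
t2 = pd.DataFrame({'Employee_ID': ['E001'], 'Address': ['Kolkata'],
                   'Start_Date': ['2023-01-01'], 'End_Date': [None],
                   'Active_Flag': [True]})
t3 = pd.DataFrame({'Employee_ID': ['E001'], 'Current_Address': ['Kolkata'],
                   'Previous_Address': [None]})

# Same two moves under each strategy; '2023-02-01' is an assumed date
moves = [('Bengaluru', '2023-01-10'), ('Indore', '2023-02-01')]
for addr, date in moves:
    t1 = scd1(t1, 'E001', addr)
    t2 = scd2(t2, 'E001', addr, date)
    t3 = scd3(t3, 'E001', addr)

print(t1)  # one row, only the latest address
print(t2)  # three rows, full history
print(t3)  # one row, latest and immediately previous address
```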

---

## 4. Course Wrap-Up
Congratulations! You have completed the foundational modules of Data Warehousing.

**What we covered:**
1.  **Architecture:** OLTP vs OLAP, Data Lakes, Data Warehouses, Data Marts.
2.  **Data Modeling:** Measures, Attributes, Fact Tables, Dimension Tables, Star Schemas.
3.  **Design Patterns:** Conformed Dimensions, Grain, Bus Matrix.
4.  **ETL Mechanics:** Loading Strategies (Full vs Incremental), SCD Types (1, 2, 3).

**Next Steps:**
*   Practice writing SQL queries for Star Schemas.
*   Explore specific cloud data warehouse technologies (Snowflake, BigQuery, Redshift).
*   Learn about transformation and modeling tools like dbt (data build tool).

---