## What is a Slowly Changing Dimension (SCD)?

A **Slowly Changing Dimension** is a dimension table whose attribute values change **infrequently and unpredictably over time**, rather than on a regular schedule.

Unlike fact tables, which record measurable business events, dimension tables hold descriptive information such as:
- Customer profile
- Product details
- Employee attributes

Handling changes in dimension attributes is a **core data modeling decision** in data warehousing.


![ETL Concepts Overview](../images/scd.png)



## Types of Slowly Changing Dimensions

Slowly Changing Dimensions define strategies for handling attribute changes in dimension tables.
Each type addresses a specific business requirement related to historical tracking and data accuracy.



## SCD Type 0 – Fixed Dimension

### Description
Type 0 dimensions do not allow any changes after data is initially stored.
The values are considered permanent and remain unchanged throughout the lifecycle of the record.

### Use Cases
This type is suitable for attributes that are inherently static, such as:
- Date of birth
- State or country codes
- Zip codes
- Social security or national identification numbers

### Implementation
Records in Type 0 dimensions are immutable.
Any incoming changes from source systems are ignored and not applied to the table.
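
The rule above can be sketched in a few lines. This is a minimal illustration, assuming the dimension is held as an in-memory dictionary keyed by a natural key; the function name `apply_type0` and the sample data are hypothetical.

```python
def apply_type0(dimension: dict, incoming: dict) -> dict:
    """Insert unseen keys; ignore changes to keys that already exist."""
    for key, attrs in incoming.items():
        # setdefault stores attrs only when the key is absent,
        # so existing records are never modified.
        dimension.setdefault(key, dict(attrs))
    return dimension


# Hypothetical sample data.
dim = {"C001": {"date_of_birth": "1990-04-12"}}
feed = {
    "C001": {"date_of_birth": "1991-04-12"},  # attempted change, ignored
    "C002": {"date_of_birth": "1985-11-30"},  # new member, inserted
}
apply_type0(dim, feed)
```

After the call, `C001` still carries its original date of birth, while `C002` has been added.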



## SCD Type 1 – Overwrite

### Description
In Type 1 dimensions, changes are handled by overwriting existing values with new ones.
No historical information is preserved, and only the most recent value is retained.

### Use Cases
This approach is appropriate when historical data is not required, such as:
- Correcting data errors
- Updating customer addresses
- Modifying contact details

### Implementation
When an update occurs, the old value is replaced directly in the dimension table.
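
A minimal sketch of the overwrite, assuming an in-memory dictionary stands in for the dimension table; `apply_type1` and the sample values are hypothetical names chosen for illustration.

```python
def apply_type1(dimension: dict, incoming: dict) -> dict:
    """Overwrite changed attributes in place; no history is kept."""
    for key, attrs in incoming.items():
        # Create the record if it is new, then overwrite the supplied attributes.
        dimension.setdefault(key, {}).update(attrs)
    return dimension


# Hypothetical sample data.
customers = {"C001": {"email": "ana@old.example", "city": "Lisbon"}}
apply_type1(customers, {"C001": {"email": "ana@new.example"}})
```

The old email address is gone after the call, which is exactly the trade-off Type 1 makes: simplicity in exchange for history.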



## SCD Type 2 – Add New Row

### Description
Type 2 dimensions preserve full history by inserting a new record whenever an attribute changes.
Each version of the record is stored separately.

### Key Characteristics
- A new surrogate key is generated for each change
- Natural keys are used to relate records across versions
- Both current and historical records coexist

### Implementation
Common techniques include:
- Status flags indicating active or inactive records
- Effective start and end date columns

This approach is widely used for attributes such as product configurations or role changes.
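
The sketch below combines both techniques (a current flag and effective dates) on a list of row dictionaries. It is an illustration under stated assumptions, not a reference implementation: the function `apply_type2`, the row layout, and the counter-based surrogate key generator are all hypothetical.

```python
from datetime import date
from itertools import count

_next_key = count(1)  # hypothetical surrogate key generator


def apply_type2(rows: list, natural_key: str, attrs: dict, as_of: date) -> list:
    """Close the current version of natural_key if it changed, then append a new one."""
    current = next(
        (r for r in rows if r["natural_key"] == natural_key and r["is_current"]),
        None,
    )
    if current is not None:
        if current["attrs"] == attrs:
            return rows  # nothing changed, keep the current version
        # Expire the old version instead of overwriting it.
        current["end_date"] = as_of
        current["is_current"] = False
    rows.append({
        "surrogate_key": next(_next_key),  # new key for every version
        "natural_key": natural_key,        # links versions together
        "attrs": dict(attrs),
        "start_date": as_of,
        "end_date": None,                  # open-ended while current
        "is_current": True,
    })
    return rows


# Hypothetical sample data: one employee changes role.
employees = []
apply_type2(employees, "E42", {"role": "Analyst"}, date(2023, 1, 1))
apply_type2(employees, "E42", {"role": "Manager"}, date(2024, 6, 1))
```

Both versions of `E42` remain in `employees`: the first is closed with an end date, the second is flagged as current.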



## SCD Type 3 – Add New Attribute

### Description
Type 3 dimensions track limited history by adding extra columns to store previous values.
Typically, only the most recent prior value is retained.

### Use Cases
This method is useful when:
- Only one level of history is required
- Changes occur periodically

### Implementation
A new column is added to store the previous value, alongside the column holding the current value, for example `previous_address` and `current_address`.
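
A minimal sketch of the column shift, assuming a single row is represented as a dictionary; the function name `apply_type3` and the sample addresses are hypothetical.

```python
def apply_type3(row: dict, new_address: str) -> dict:
    """Shift the current value into the 'previous' column, then store the new value."""
    if new_address != row["current_address"]:
        row["previous_address"] = row["current_address"]
        row["current_address"] = new_address
    return row


# Hypothetical sample data: the customer moves twice.
customer = {
    "customer_id": "C001",
    "current_address": "1 Elm St",
    "previous_address": None,
}
apply_type3(customer, "9 Oak Ave")
apply_type3(customer, "5 Pine Rd")
```

After the second move the original address, `1 Elm St`, is no longer stored anywhere, which illustrates why Type 3 only fits cases where one level of history is enough.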



## SCD Type 4 – History Table

### Description
Type 4 dimensions separate current data and historical data into different tables.
One table holds the latest state, while another stores all historical versions.

### Use Cases
This approach is suitable when:
- Data changes frequently
- Detailed historical auditing is required

### Implementation
The current table is updated with new values, while older records are moved to a dedicated history table.
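
The split can be sketched with two in-memory structures, one per table. This is an illustration only: `apply_type4`, the `archived_at` field, and the sample prices are assumptions, and some Type 4 variants also copy the current version into the history table.

```python
from datetime import datetime


def apply_type4(current: dict, history: list, key: str,
                attrs: dict, changed_at: datetime) -> None:
    """Keep the latest state in `current`; move superseded versions to `history`."""
    old = current.get(key)
    if old == attrs:
        return  # no change, nothing to archive
    if old is not None:
        history.append({"key": key, "attrs": old, "archived_at": changed_at})
    current[key] = dict(attrs)


# Hypothetical sample data: a product price changes twice.
current_prices = {}
price_history = []
apply_type4(current_prices, price_history, "P7", {"price": 10}, datetime(2024, 1, 1))
apply_type4(current_prices, price_history, "P7", {"price": 12}, datetime(2024, 2, 1))
apply_type4(current_prices, price_history, "P7", {"price": 11}, datetime(2024, 3, 1))
```

Queries that only need the latest state read `current_prices`, which stays small, while audits read `price_history`.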



## How to Implement Slowly Changing Dimensions in a Data Warehouse

#### Initial Assessment
Evaluate the existing dimension tables and identify attributes that may require historical tracking.
Understand which attributes are currently static and which may evolve over time.

#### Decide on SCD Types
Select the appropriate SCD type based on business requirements.
This decision should involve collaboration between data engineers, analytics engineers, and analysts.

#### Handling Pre-existing Data
Determine how to manage historical gaps:
- Start tracking changes from the current point forward
- Attempt to reconstruct historical data where feasible, with caution

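#### Implementing SCDs
Implementation has two parts: the schema that stores each version and the ETL logic that maintains it.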

#### Schema Design
- Type 1: No extra columns are needed; records are updated in place
- Type 2: Add a surrogate key and columns such as `Start_Date`, `End_Date`, and `Is_Current`
- Type 3: Add columns to store previous values
- Type 4: Maintain separate current and history tables
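
As an illustration of the Type 2 layout, the sketch below creates such a table in an in-memory SQLite database using Python's standard `sqlite3` module. The table name `dim_customer` and the columns `customer_sk`, `customer_id`, and `city` are hypothetical; only the three tracking columns come from the list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE dim_customer (
        customer_sk INTEGER PRIMARY KEY AUTOINCREMENT, -- surrogate key, one per version
        customer_id TEXT NOT NULL,                     -- natural key, shared by versions
        city        TEXT,                              -- tracked attribute
        Start_Date  TEXT NOT NULL,                     -- start of validity
        End_Date    TEXT,                              -- NULL while the row is current
        Is_Current  INTEGER NOT NULL DEFAULT 1         -- 1 = current, 0 = historical
    )
""")

# Read the column names back to confirm the layout.
columns = [row[1] for row in conn.execute("PRAGMA table_info(dim_customer)")]
```

SQLite has no dedicated date type, so the dates are stored as ISO-8601 text here; a warehouse engine would normally use a native date or timestamp type.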

#### ETL Processes
ETL logic must detect changes in source data and apply the correct SCD handling strategy.
This includes managing surrogate keys and maintaining referential integrity.
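
One common way to detect changes is to compare a hash of the tracked attributes for each natural key. The sketch below assumes both sides fit in memory as dictionaries; `row_hash`, `classify`, and the sample data are hypothetical names used for illustration.

```python
import hashlib
import json


def row_hash(attrs: dict) -> str:
    """Stable fingerprint of a record's tracked attributes."""
    payload = json.dumps(attrs, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def classify(existing: dict, incoming: dict) -> dict:
    """Split incoming natural keys into new, changed, and unchanged."""
    result = {"new": [], "changed": [], "unchanged": []}
    for key, attrs in incoming.items():
        if key not in existing:
            result["new"].append(key)
        elif row_hash(existing[key]) != row_hash(attrs):
            result["changed"].append(key)
        else:
            result["unchanged"].append(key)
    return result


# Hypothetical sample data.
existing = {"C1": {"city": "Oslo"}, "C2": {"city": "Rome"}}
incoming = {"C1": {"city": "Oslo"}, "C2": {"city": "Milan"}, "C3": {"city": "Kyiv"}}
changes = classify(existing, incoming)
```

The `changed` keys are then routed to whichever SCD strategy applies to the dimension, and the `new` keys are inserted as first versions.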



#### Testing and Validation
Validate that:
- Changes are captured accurately
- Historical records are preserved correctly
- Data integrity is maintained
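
For a Type 2 dimension, two of these checks can be automated: each natural key should have exactly one current row, and validity periods should not overlap. The sketch below assumes rows are dictionaries with `natural_key`, `start_date`, `end_date`, and `is_current` fields and that periods are half-open (the end date is exclusive); these names and conventions are assumptions, and gaps between versions are not checked.

```python
from collections import Counter
from datetime import date


def validate_type2(rows: list) -> list:
    """Return a list of human-readable integrity errors (empty when valid)."""
    errors = []
    current_counts = Counter(r["natural_key"] for r in rows if r["is_current"])
    for key in sorted({r["natural_key"] for r in rows}):
        if current_counts[key] != 1:
            errors.append(f"{key}: expected 1 current row, found {current_counts[key]}")
        versions = sorted(
            (r for r in rows if r["natural_key"] == key),
            key=lambda r: r["start_date"],
        )
        for earlier, later in zip(versions, versions[1:]):
            # An open-ended or late-ending version overlaps its successor.
            if earlier["end_date"] is None or earlier["end_date"] > later["start_date"]:
                errors.append(f"{key}: version starting {earlier['start_date']} overlaps the next")
    return errors


# Hypothetical sample data: one valid history and one broken history.
good = [
    {"natural_key": "E1", "start_date": date(2023, 1, 1), "end_date": date(2024, 6, 1), "is_current": False},
    {"natural_key": "E1", "start_date": date(2024, 6, 1), "end_date": None, "is_current": True},
]
bad = [
    {"natural_key": "E2", "start_date": date(2023, 1, 1), "end_date": None, "is_current": True},
    {"natural_key": "E2", "start_date": date(2024, 1, 1), "end_date": None, "is_current": True},
]
good_errors = validate_type2(good)
bad_errors = validate_type2(bad)
```

Here `good_errors` is empty, while `bad_errors` reports both the duplicate current row and the overlapping period for `E2`.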

#### Documentation and Training
Document:
- Data models
- Change-handling logic
- ETL workflows

Train data consumers to correctly interpret historical and current data.

#### Ongoing Maintenance
Continuously review and refine SCD implementations to align with evolving business needs.
Monitor performance and storage growth.



## Techniques for Maintaining Slowly Changing Dimensions

### ETL-Based Maintenance
ETL pipelines compare incoming data with existing records to identify changes.
This approach is commonly used in batch processing and supports audit trail creation.

### Change Data Capture (CDC)
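CDC captures row-level changes from the source system in near real time, as they occur.
It is particularly important for Type 2 dimensions, where every change must be recorded without data loss: a periodic batch comparison sees only the latest state and can miss intermediate changes made between runs.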
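
To illustrate why a change stream suits Type 2, the sketch below turns an ordered list of change events into one version per event. The event format (`op`, `key`, `city`, `ts`) is hypothetical and does not correspond to any particular CDC tool; deletes are not handled.

```python
def versions_from_events(events: list) -> list:
    """Build one Type 2 version per change event, closing the previous version."""
    versions = []
    open_version = {}  # natural key -> version that is currently open
    for event in sorted(events, key=lambda e: e["ts"]):
        previous = open_version.get(event["key"])
        if previous is not None:
            previous["end_ts"] = event["ts"]  # close the superseded version
        version = {
            "key": event["key"],
            "city": event["city"],
            "start_ts": event["ts"],
            "end_ts": None,  # open until the next event arrives
        }
        versions.append(version)
        open_version[event["key"]] = version
    return versions


# Hypothetical sample stream: three changes to the same customer.
events = [
    {"op": "insert", "key": "C1", "city": "Oslo", "ts": 1},
    {"op": "update", "key": "C1", "city": "Bergen", "ts": 2},
    {"op": "update", "key": "C1", "city": "Tromso", "ts": 3},
]
versions = versions_from_events(events)
```

All three states appear in `versions`. A snapshot comparison taken only before `ts` 1 and after `ts` 3 would see a single row with city `Tromso` and would never record `Oslo` or `Bergen`.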

### History Tracking with Effective Dates
Using consistent timestamps and effective date columns ensures accurate validity periods for records.
Database-generated timestamps are often preferred for consistency, because they avoid clock differences between the application servers that write the data.
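
Effective dates make point-in-time lookups straightforward. The sketch below assumes half-open validity periods (the start date is inclusive, the end date is exclusive, and `None` means still current); `version_as_of` and the row layout are hypothetical.

```python
from datetime import date


def version_as_of(rows: list, natural_key: str, as_of: date):
    """Return the version of natural_key that was valid on as_of, or None."""
    for row in rows:
        if row["natural_key"] != natural_key:
            continue
        started = row["start_date"] <= as_of
        not_ended = row["end_date"] is None or as_of < row["end_date"]
        if started and not_ended:
            return row
    return None


# Hypothetical sample data: two versions of one employee.
rows = [
    {"natural_key": "E42", "role": "Analyst",
     "start_date": date(2023, 1, 1), "end_date": date(2024, 6, 1)},
    {"natural_key": "E42", "role": "Manager",
     "start_date": date(2024, 6, 1), "end_date": None},
]
mid_2023 = version_as_of(rows, "E42", date(2023, 8, 15))
on_change_day = version_as_of(rows, "E42", date(2024, 6, 1))
before_hire = version_as_of(rows, "E42", date(2022, 1, 1))
```

Because the end date is exclusive, the lookup on the change day itself returns the new version, and no date can match two versions at once.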

### Data Update Strategy
Regular updates, careful SCD type selection, and performance monitoring are critical to sustaining reliable historical tracking.



## Final Notes

Slowly Changing Dimensions are a foundational concept in dimensional modeling.
When implemented correctly, they enable accurate historical analysis, reliable reporting, and informed decision-making.
Poorly designed SCD strategies can lead to misleading insights and data inconsistencies.
