# 🔄 Change Data Capture (CDC) in Delta Live Tables

## 📌 Definition of CDC

**Change Data Capture (CDC)** is a method for detecting and tracking changes (inserts, updates, and deletions) in a source system and applying them to a target system. It keeps the two systems in sync without reprocessing data that has not changed.

---

## 🚀 Why CDC Matters

CDC plays a critical role in modern data pipelines by:

- Keeping target systems in sync with source data.
- Transferring only changed data, reducing overhead.
- Boosting performance by eliminating full table scans and reloads.

---

## 🔍 Capturing Changes

### Types of Changes:
- **Inserts** – New records added.
- **Updates** – Modifications to existing records.
- **Deletes** – Records removed from the source, which must also be removed from the target.

Each CDC event typically includes:
- The changed data.
- Metadata specifying the type of operation (insert/update/delete).
- A timestamp or version to track the order of changes.
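
As a rough illustration of the three parts listed above, a CDC event could be modeled as follows. The class name `ChangeEvent` and its field names are hypothetical, chosen only for this sketch; real feeds define their own schema.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ChangeEvent:
    """One CDC event (hypothetical shape, for illustration only)."""
    key: int               # primary key of the affected record
    operation: str         # metadata: "INSERT", "UPDATE", or "DELETE"
    sequence: int          # version or timestamp used to order changes
    data: dict[str, Any]   # the changed data (empty for a delete)


events = [
    ChangeEvent(key=1, operation="INSERT", sequence=1, data={"name": "Ada", "city": "London"}),
    ChangeEvent(key=1, operation="UPDATE", sequence=2, data={"name": "Ada", "city": "Paris"}),
    ChangeEvent(key=1, operation="DELETE", sequence=3, data={}),
]
```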

---

## 📬 Receiving CDC Feeds

CDC feeds may arrive in the form of:
- **Data streams**
- **JSON files**
- **Events from messaging systems**

Properly handling these feeds ensures accurate and up-to-date replication in the target system.
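
For the JSON file case, a minimal sketch of reading a feed in JSON Lines form might look like this. The column names (`user_id`, `operation`, `sequence_num`) and the helper `parse_feed` are assumptions made for the example, not a required format.

```python
import json

# A hypothetical CDC feed with one JSON object per line.
raw_feed = """\
{"user_id": 1, "name": "Ada", "operation": "INSERT", "sequence_num": 1}
{"user_id": 1, "name": "Ada L.", "operation": "UPDATE", "sequence_num": 2}
{"user_id": 2, "name": null, "operation": "DELETE", "sequence_num": 3}
"""


def parse_feed(text):
    """Parse a JSON Lines CDC feed into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


events = parse_feed(raw_feed)
```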

---

## 🧠 Implementing CDC in Delta Live Tables

### 🔧 `APPLY CHANGES INTO` Command

A specialized DLT command that simplifies the application of CDC feeds.

- Defines a **target table** and **source table** (CDC feed).
- Uses **primary keys** to identify unique records.
- Applies the correct operation:
  - Inserts new records.
  - Updates existing ones.
  - Deletes records when a delete condition matches, specified with `APPLY AS DELETE WHEN` in SQL or `apply_as_deletes` in Python.
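
The Python counterpart of `APPLY CHANGES INTO` is `dlt.apply_changes`. The sketch below shows roughly how the pieces above fit together. The table name `cdc_data.users` and the column names `user_id`, `operation`, and `sequence_num` are hypothetical. The definitions are wrapped in a function only so the file can be loaded outside Databricks; inside a pipeline, the function would be called once at the top level of the source file.

```python
def define_users_cdc_flow():
    """Register a CDC flow; call this at the top level of a DLT pipeline source file."""
    # Imported lazily: the `dlt` module exists only inside a Databricks DLT pipeline.
    import dlt
    from pyspark.sql.functions import col, expr

    @dlt.view
    def users_cdc_feed():
        # `spark` is provided by the Databricks runtime; the table name is hypothetical.
        return spark.readStream.table("cdc_data.users")

    # Target table that receives the applied changes.
    dlt.create_streaming_table("users")

    dlt.apply_changes(
        target="users",                                    # target table
        source="users_cdc_feed",                           # source CDC feed
        keys=["user_id"],                                  # primary key column(s)
        sequence_by=col("sequence_num"),                   # ordering of changes
        apply_as_deletes=expr("operation = 'DELETE'"),     # delete condition
        except_column_list=["operation", "sequence_num"],  # columns kept out of the target
        stored_as_scd_type=1,                              # SCD Type 1 (the default)
    )
```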

---

## ⏳ Importance of Sequencing

- Ensures **late-arriving records** are handled correctly.
- Maintains **data integrity** by applying changes in proper order.
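
To see why a sequencing column matters, consider this plain-Python sketch (not the DLT implementation). It keeps, for each key, only the row with the highest sequence number, so a record that arrives late does not overwrite newer data. The function name and the tuple layout are assumptions for the example.

```python
def apply_in_sequence(events):
    """Upsert (key, sequence, row) events, keeping the highest sequence per key."""
    target = {}  # key -> (sequence, row)
    for key, sequence, row in events:
        current = target.get(key)
        # Apply the change only if it is newer than what the target already holds.
        if current is None or sequence > current[0]:
            target[key] = (sequence, row)
    return {key: row for key, (_, row) in target.items()}


# Arrival order differs from logical order: sequence 2 arrives after sequence 3.
events = [
    (1, 1, {"city": "London"}),
    (1, 3, {"city": "Berlin"}),
    (1, 2, {"city": "Paris"}),  # late-arriving record, ignored
]
result = apply_in_sequence(events)
print(result)  # {1: {'city': 'Berlin'}}
```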

---

## 🛠️ Features of `APPLY CHANGES INTO`

**Pros:**
- **Automatic ordering** of records by version or timestamp.
- Seamless **upsert operations** (insert/update).
- Optional handling of **delete operations**.
- Specify one or many fields as the **primary key** for a table.
- Exclude columns from the target with the **EXCEPT** keyword (`COLUMNS * EXCEPT (...)`).
- Supports applying changes as SCD Type 1 (default) or SCD Type 2.
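
The following plain-Python sketch mimics three of these features on an in-memory dict: a composite primary key, optional delete handling, and excluded columns. It is an illustration of the behavior, not how DLT is implemented, and it assumes the events are already in sequence order. All names (`apply_changes`, `tenant`, `op`, `seq`) are hypothetical.

```python
def apply_changes(target, events, keys, except_columns, is_delete):
    """Apply ordered change events to `target`, a dict keyed by primary-key tuples."""
    for event in events:
        key = tuple(event[k] for k in keys)  # one or many key fields
        if is_delete(event):
            target.pop(key, None)            # optional delete handling
        else:
            # Upsert: insert a new row or overwrite the existing one,
            # dropping the columns that should not reach the target.
            target[key] = {c: v for c, v in event.items() if c not in except_columns}
    return target


events = [
    {"tenant": "a", "user_id": 1, "city": "London", "op": "UPSERT", "seq": 1},
    {"tenant": "a", "user_id": 2, "city": "Oslo", "op": "UPSERT", "seq": 2},
    {"tenant": "a", "user_id": 1, "city": "Paris", "op": "UPSERT", "seq": 3},
    {"tenant": "a", "user_id": 2, "city": None, "op": "DELETE", "seq": 4},
]
table = apply_changes(
    {},
    events,
    keys=["tenant", "user_id"],
    except_columns={"op", "seq"},
    is_delete=lambda e: e["op"] == "DELETE",
)
```

After the four events, only user 1 remains, with the updated city and without the `op` and `seq` columns.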

**Cons:**
- Breaks the append-only requirement for streaming table sources.
  - The target table cannot be used as a source for streaming queries.

---

## 🗂️ Slowly Changing Dimensions (SCD)

These notes focus on **Type 1 SCD**, where updates overwrite existing records and no history of previous values is kept.
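
The difference between the two SCD types can be sketched in plain Python. This is a simplified model, not the DLT implementation; the function names and the `start_at` / `end_at` fields are assumptions for the example.

```python
def apply_scd1(target, key, row):
    """Type 1: overwrite the current row; no history is kept."""
    target[key] = row


def apply_scd2(history, key, row, sequence):
    """Type 2: close the open version for the key, then append a new one."""
    for version in history:
        if version["key"] == key and version["end_at"] is None:
            version["end_at"] = sequence
    history.append({"key": key, **row, "start_at": sequence, "end_at": None})


scd1, scd2 = {}, []
for sequence, city in [(1, "London"), (2, "Paris")]:
    apply_scd1(scd1, 1, {"city": city})
    apply_scd2(scd2, 1, {"city": city}, sequence)
```

Here `scd1` ends up holding only the latest value (`Paris`), while `scd2` holds two versions: `London`, closed at sequence 2, and `Paris`, still open.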

> Note: A table populated by `APPLY CHANGES INTO` cannot serve as a **streaming source** for downstream layers, because its rows are updated and deleted in place, which violates the append-only requirement for streaming sources.

---

## ✅ Conclusion

CDC enables efficient, incremental updates in Delta Live Tables.  
By leveraging `APPLY CHANGES INTO`, users can build reliable pipelines that track and apply source system changes with minimal effort and maximum consistency.
