
## 📚 What is Delta Live Tables (DLT)?

**Delta Live Tables lets you:** \
✅ Define ETL pipelines declaratively (SQL or Python) \
✅ Automate data quality checks, retries, and schema inference (Delta Lake + DLT Expectations API) \
✅ Get built-in data lineage, monitoring, and versioning \
✅ Support both batch and streaming data \
✅ Handle orchestration, state management, and checkpoints automatically

> _In short: it simplifies and productionizes pipelines on Databricks with less code + more guarantees_
> _(No need to hand-write checkpoints, upserts, etc., as you would with plain Structured Streaming)_

>> Delta Live Tables workflows are only available in the Premium tier. When configuring a pipeline, we can set the pipeline mode to Triggered or Continuous.





| **Aspect** | **LIVE Table (Materialized View)** | **Streaming LIVE Table** |
| --- | --- | --- |
| **Definition** | Table whose contents are the full result of its query, recomputed as needed on each update. | Table that processes only new data and avoids recomputing old data. |
| **Data Processing** | Batch processing; may be refreshed incrementally as an optimization. | Incremental processing, real-time or near real-time. |
| **Update Mechanism** | Updates via scheduled or manual pipeline runs. | Continuous updates for new data via pipeline runs. |
| **Use Case** | Batch processing, data warehousing, historical analysis. | Real-time analytics, IoT, log processing. |
| **Performance** | Resource-intensive if fully recomputed. | More efficient, processes only new data. |
| **Stateful** | Not stateful, recomputes based on current data. | Stateful, maintains state across updates. |
| **Flow Type** | Batch flows with batch semantics. | Streaming flows with append or apply changes. |
| **Definition Method** | Implicitly defined via batch query. | Explicitly or implicitly defined in streaming pipeline. |
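
To make the comparison concrete, here is a minimal sketch of the two definitions side by side. It assumes a streaming table `orders_raw` with a `customer_id` column exists in the same pipeline (it is defined in the Bronze layer of these notes); the table names `orders_per_customer` and `orders_passthrough` are hypothetical and only serve the illustration.

```sql
-- Sketch only: both target table names below are hypothetical.

-- LIVE table (materialized view): defined by a batch query.
-- Its contents always reflect the current state of the source.
CREATE OR REFRESH LIVE TABLE orders_per_customer
COMMENT "Order counts per customer, kept in sync with the current source data"
AS SELECT customer_id, count(*) AS order_count
   FROM LIVE.orders_raw
   GROUP BY customer_id;

-- Streaming LIVE table: STREAM() reads the source incrementally,
-- so each update processes only rows that arrived since the last one.
CREATE OR REFRESH STREAMING LIVE TABLE orders_passthrough
COMMENT "Append-only copy of newly arrived orders"
AS SELECT * FROM STREAM(LIVE.orders_raw);
```

The only syntactic differences are the `STREAMING` keyword and the `STREAM()` wrapper around the source; the processing semantics follow from those two choices.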


### Bronze layer

In [0]:
%sql
-- The cloud_files() function enables Auto Loader natively in SQL.

CREATE OR REFRESH STREAMING LIVE TABLE orders_raw
COMMENT "The raw books orders, ingested from orders-raw"
AS SELECT * FROM cloud_files("${datasets.path}/orders-json-raw", "json",
                             map("cloudFiles.inferColumnTypes", "true"))

In [0]:
%sql

CREATE OR REFRESH LIVE TABLE customers
COMMENT "The customers lookup table, ingested from customers-json"
AS SELECT * FROM json.`${datasets.path}/customers-json`


### Silver layer


**Manage data quality with Delta Lake and DLT Expectations API:**

> Constraint violation

| **`ON VIOLATION`** | Behavior |
| --- | --- |
| **`DROP ROW`** | Discard records that violate constraints |
| **`FAIL UPDATE`** | A violated constraint causes the pipeline update to fail |
| Omitted | Records violating constraints will be kept, and reported in metrics |

_Note: Reference other tables in the pipeline with the `LIVE.` prefix, and wrap a streaming table in `STREAM()` to read it as a stream, e.g. `STREAM(LIVE.orders_raw)`._
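
The three behaviors in the table above can be combined in a single table definition. The following is a rough sketch, not taken from the original pipeline: the table name `orders_validated` and the constraint names `has_customer` and `positive_quantity` are hypothetical, and it assumes `orders_raw` exposes `order_id`, `customer_id`, and `quantity` columns.

```sql
-- Sketch only: table and constraint names (except valid_order_number) are hypothetical.
CREATE OR REFRESH STREAMING LIVE TABLE orders_validated (
  -- ON VIOLATION omitted: violating rows are kept and counted in the pipeline metrics.
  CONSTRAINT has_customer EXPECT (customer_id IS NOT NULL),
  -- DROP ROW: violating rows are discarded before they are written to the table.
  CONSTRAINT valid_order_number EXPECT (order_id IS NOT NULL) ON VIOLATION DROP ROW,
  -- FAIL UPDATE: a single violating row fails the whole pipeline update.
  CONSTRAINT positive_quantity EXPECT (quantity > 0) ON VIOLATION FAIL UPDATE
)
COMMENT "Orders checked with one expectation per ON VIOLATION mode"
AS SELECT * FROM STREAM(LIVE.orders_raw)
```

A common pattern is to start with the omitted (warn-only) form to observe how often a rule is broken, then tighten it to `DROP ROW` or `FAIL UPDATE` once the impact is understood.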

In [0]:
%sql

/*
  This Delta Live Table (DLT) script creates or refreshes a streaming table called `orders_cleaned`.
  - It incrementally processes new data from the `orders_raw` streaming table.
  - A data quality constraint is applied to ensure `order_id` is not null; rows violating this rule are dropped.
  - The table enriches raw order data by left-joining it with customer data from the `customers` table.
  - It selects and transforms key fields:
      * Extracts customer first and last names from the nested JSON `profile` field.
      * Converts `order_timestamp` from Unix epoch seconds to a timestamp value.
      * Retrieves customer country information.
  - This results in a clean, enriched dataset of book orders with valid order IDs, ready for downstream analytics.
*/

CREATE OR REFRESH STREAMING LIVE TABLE orders_cleaned (
  CONSTRAINT valid_order_number EXPECT (order_id IS NOT NULL) ON VIOLATION DROP ROW
)
COMMENT "The cleaned books orders with valid order_id"
AS
  SELECT order_id,
         quantity,
         o.customer_id,
         c.profile:first_name AS f_name,
         c.profile:last_name AS l_name,
         cast(from_unixtime(order_timestamp, 'yyyy-MM-dd HH:mm:ss') AS timestamp) AS order_timestamp,
         o.books,
         c.profile:address:country AS country
  FROM STREAM(LIVE.orders_raw) o
  LEFT JOIN LIVE.customers c
    ON o.customer_id = c.customer_id
