# INTRODUCTION TO LAKEFLOW DECLARATIVE PIPELINES (DATABRICKS)
> Welcome to the world of simplified data engineering on Databricks! Declarative Pipelines, an integral part of Databricks Lakeflow (formerly known as Delta Live Tables, or DLT, and still powered by that technology), change how we build data pipelines.

Instead of writing complex code to manage orchestration, error handling, and infrastructure, you simply declare how data should flow and be transformed. Lakeflow handles the rest.

# KEY CONCEPTS
To understand declarative pipelines, you need to know a few basic concepts:

- Pipeline: It's the complete graph of your data transformations. You configure the source, the notebooks (with the logic), and the destination.
- Table: The basic unit of a pipeline. In DLT, there are two main types of tables (datasets):
  - Live Table (Materialized View): The transformation results are physically stored, and Lakeflow keeps them up to date each time the pipeline runs. Current Databricks documentation calls this dataset type a materialized view.
  - Live View (View): Similar to a traditional "View". The transformation is calculated on the fly whenever it's queried. It doesn't store data physically, but it is useful for breaking down complex logic into smaller steps.
- Source: Where your raw data comes from. It can be cloud storage (S3, ADLS, GCS), Kafka, or other Delta tables.
- Expectations: Data quality rules that you can apply directly to your tables. DLT collects metrics on how many records fail these rules and lets you decide what happens to failing records: keep them and record the violation, drop them, or stop the pipeline.
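
To make the three expectation outcomes concrete, here is a minimal plain-Python sketch. It is a toy model, not the Databricks API: `apply_expectation`, `ExpectationResult`, and `ExpectationFailed` are hypothetical names invented for this illustration, and the action labels "warn", "drop", and "fail" are assumptions standing in for keep, drop, and stop the pipeline.

```python
from dataclasses import dataclass


class ExpectationFailed(Exception):
    """Raised when a rule with action 'fail' is violated (hypothetical name)."""


@dataclass
class ExpectationResult:
    kept: list    # records passed on to the next step
    passed: int   # number of records that satisfied the rule
    failed: int   # number of records that violated the rule


def apply_expectation(records, name, rule, action="warn"):
    """Apply one data quality rule to a list of dict records.

    action "warn" keeps violating records, "drop" removes them,
    and "fail" aborts on the first violation.
    """
    if action not in ("warn", "drop", "fail"):
        raise ValueError(f"unknown action: {action}")
    kept, passed, failed = [], 0, 0
    for record in records:
        if rule(record):
            passed += 1
            kept.append(record)
            continue
        failed += 1
        if action == "fail":
            raise ExpectationFailed(f"{name} violated by {record!r}")
        if action == "warn":
            kept.append(record)  # keep the record, only count the violation
    return ExpectationResult(kept, passed, failed)


# Toy data: one order has a negative amount.
orders = [
    {"id": 1, "amount": 20.0},
    {"id": 2, "amount": -5.0},
    {"id": 3, "amount": 7.5},
]


def positive_amount(record):
    return record["amount"] > 0


warned = apply_expectation(orders, "positive_amount", positive_amount, action="warn")
dropped = apply_expectation(orders, "positive_amount", positive_amount, action="drop")
```

In both runs the metrics report one failing record. The only difference is whether that record is still present in `kept`.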

# HOW DOES IT WORK?
The creation process is quite straightforward:

- Write the Code: You define your transformations in SQL or Python within one or more Databricks Notebooks.
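- Use Declarative Syntax: Instead of complex INSERT or MERGE commands, you use simple constructs:
  - In Python, you use decorators like @dlt.table and @dlt.view.
  - In SQL, you use syntax like CREATE OR REFRESH LIVE TABLE. This is the legacy DLT form; newer releases use CREATE OR REFRESH MATERIALIZED VIEW and CREATE OR REFRESH STREAMING TABLE.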

- Configure the Pipeline: In the Databricks UI, you create a new Pipeline, point it to your notebooks, define the storage location, and configure the execution mode (triggered, which can run on a schedule, or continuous).
- Run: Lakeflow analyzes your code, builds the dependency graph (lineage), and manages all the processing.
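
The "Run" step is the heart of the declarative idea: you only declare datasets, and the engine works out the order. The sketch below imitates that in plain Python. It is a toy, not how Lakeflow is implemented: `ToyPipeline` is a hypothetical class, and inferring dependencies from function parameter names is an assumption made purely for illustration. Ordering relies on the standard library's `graphlib.TopologicalSorter` (Python 3.9+).

```python
import inspect
from graphlib import TopologicalSorter


class ToyPipeline:
    """Toy registry that orders declared datasets by their dependencies."""

    def __init__(self):
        self._datasets = {}

    def table(self, func):
        # Register the function under its own name, like a decorator-based declaration.
        self._datasets[func.__name__] = func
        return func

    def dependency_graph(self):
        # Assumption for this toy: each parameter name is an upstream dataset.
        return {
            name: set(inspect.signature(func).parameters)
            for name, func in self._datasets.items()
        }

    def run(self):
        # static_order() yields every dataset after the datasets it depends on.
        order = list(TopologicalSorter(self.dependency_graph()).static_order())
        results = {}
        for name in order:
            func = self._datasets[name]
            inputs = [results[p] for p in inspect.signature(func).parameters]
            results[name] = func(*inputs)
        return order, results


pipeline = ToyPipeline()


# Datasets are deliberately declared "out of order".
@pipeline.table
def daily_totals(clean_orders):
    totals = {}
    for row in clean_orders:
        totals[row["day"]] = totals.get(row["day"], 0) + row["amount"]
    return totals


@pipeline.table
def clean_orders(raw_orders):
    return [row for row in raw_orders if row["amount"] > 0]


@pipeline.table
def raw_orders():
    return [
        {"day": "mon", "amount": 10},
        {"day": "mon", "amount": -1},
        {"day": "tue", "amount": 5},
    ]


order, results = pipeline.run()
```

Even though `daily_totals` is declared first, it runs last, because the graph says it needs `clean_orders`, which in turn needs `raw_orders`.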

# ADVANTAGES OF USING DECLARATIVE PIPELINES

- Simplicity: You focus on the transformation logic (SQL/Python), not on managing clusters, complex schedules, or dependencies.
- Integrated Data Quality: "Expectations" allow you to clean and validate data directly in the table definition.
- Incremental Processing: Lakeflow automatically manages incremental (streaming) processing, so each run handles only the new data instead of reprocessing everything, which is much more efficient.
- Automatic Recovery: If an execution fails, the pipeline knows exactly where it left off (checkpoints) and resumes from that point.
- Visibility (Lineage): The DLT interface shows you a visual graph of how your data flows between tables, making debugging and understanding easier.
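
Incremental processing and automatic recovery both rest on the same idea: remember how far you got. The following plain-Python sketch shows that idea under simplifying assumptions. `ToyIncrementalTable` and `double` are hypothetical names, the "checkpoint" is just a count of source records already processed, and each update is all-or-nothing. Real pipelines rely on streaming checkpoints managed by the platform, so treat this as an analogy only.

```python
class ToyIncrementalTable:
    """Toy table that processes only records past its checkpoint."""

    def __init__(self, transform):
        self.transform = transform
        self.rows = []        # stored (materialized) results
        self.checkpoint = 0   # number of source records already processed

    def update(self, source):
        new_records = source[self.checkpoint:]
        # Transform everything first, so a failure here leaves both
        # rows and checkpoint exactly as they were before the run.
        new_rows = [self.transform(record) for record in new_records]
        self.rows.extend(new_rows)
        self.checkpoint += len(new_records)
        return len(new_records)


def double(value):
    if value is None:
        raise ValueError("bad record")
    return value * 2


source = [1, 2, 3]
table = ToyIncrementalTable(double)
first_run = table.update(source)       # processes all three records

source.extend([4, None, 6])            # new data arrives, one record is bad
try:
    table.update(source)
except ValueError:
    pass                               # failed run: nothing was committed
checkpoint_after_failure = table.checkpoint

source[4] = 5                          # the bad record is repaired upstream
retry_run = table.update(source)       # resumes after the first three records
```

The retry touches only the three records that arrived after the first run. The first three are never reprocessed, and the failed run leaves no partial results behind.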