


# Pipeline Results


While DLT abstracts away many of the complexities of running production ETL on Databricks, you may wonder what is actually happening under the hood.

In this notebook, we'll avoid getting too far into the weeds, but will explore how data and metadata are persisted by DLT.

## REQUIRED - SELECT CLASSIC COMPUTE

Before executing cells in this notebook, please select your classic compute cluster in the lab. Be aware that **Serverless** is enabled by default.

Follow these steps to select the classic compute cluster:

1. Navigate to the top-right of this notebook and click the drop-down menu to select your cluster. By default, the notebook will use **Serverless**.

1. If your cluster is available, select it and continue to the next cell. If the cluster is not shown:

   - In the drop-down, select **More**.

   - In the **Attach to an existing compute resource** pop-up, select the first drop-down. You will see a unique cluster name in that drop-down. Please select that cluster.

**NOTE:** If your cluster has terminated, you might need to restart it in order to select it. To do this:

1. Right-click on **Compute** in the left navigation pane and select *Open in new tab*.

1. Find the triangle icon to the right of your compute cluster name and click it.

1. Wait a few minutes for the cluster to start.

1. Once the cluster is running, complete the steps above to select your cluster.

## A. Classroom Setup

Run the following cell to configure your working environment for this course. It will also set your default catalog to **dbacademy** and the schema to your specific schema name shown below using the `USE` statements.


```
USE CATALOG dbacademy;
USE SCHEMA dbacademy.<your unique schema name>;
```

**NOTE:** The `DA` object is only used in Databricks Academy courses and is not available outside of these courses. It will dynamically reference the information needed to run the course.
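The same two `USE` statements can also be issued from a Python cell. The following is a minimal sketch, assuming the `spark` session that Databricks notebooks predefine; the helper name `use_course_schema` and the schema name in the usage comment are hypothetical placeholders and are not part of the course setup.

```python
def use_course_schema(spark, schema_name, catalog="dbacademy"):
    """Run the two USE statements shown above and return them for inspection."""
    statements = [
        f"USE CATALOG {catalog}",
        f"USE SCHEMA {catalog}.{schema_name}",
    ]
    for statement in statements:
        spark.sql(statement)  # spark.sql executes one SQL statement per call
    return statements


# In a Databricks notebook, where `spark` is predefined:
# use_course_schema(spark, "my_schema_name")  # placeholder schema name
```

Returning the statements makes it easy to print or log exactly what was executed.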

In [0]:
%run ./Includes/Classroom-Setup-4

Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages.


Loading batch 1 of 31...3 seconds


True

**NOTES:**
- If you have not completed the DLT pipeline from the previous steps (**1a, 1b, and 1c**), run the following cell (uncommenting it first if needed) to create the pipeline from the solution SQL notebooks so that you can complete this demonstration. Wait a few minutes for the DLT pipeline to finish running.
- If you have not completed demo **3 - Delta Live Tables Running Modes**, your numbers might not match, but you can still continue with the demonstration.

In [0]:
# Create the pipeline from the solution SQL notebooks (DA is the course-only helper object).
DA.generate_pipeline(
    pipeline_name=DA.generate_pipeline_name(),
    use_schema=DA.schema_name,
    notebooks_folder='2A - SQL Pipelines/(Solutions) 2A - SQL Pipelines',
    pipeline_notebooks=[
        '1 - Orders Pipeline',
        '2 - Customers Pipeline',
        '3L - Status Pipeline Lab',
    ],
    use_configuration={'source': f'{DA.paths.stream_source}'},
)

# Start the pipeline; allow a few minutes for it to finish.
DA.start_pipeline()

## B. Querying Tables in the Target Database

As long as a target database (schema) is specified during DLT pipeline configuration, tables should be available to users throughout your Databricks environment. Let's explore them now.

Run the cell below to see the tables registered to the database used so far. The tables were created in the **dbacademy** catalog, within your unique **schema** name.

In [0]:
%sql
SHOW TABLES;

database,tableName,isTemporary
labuser9104086_1738770250,customer_counts_state,False
labuser9104086_1738770250,customers_bronze,False
labuser9104086_1738770250,customers_bronze_clean,False
labuser9104086_1738770250,customers_silver,False
labuser9104086_1738770250,email_updates,False
labuser9104086_1738770250,orders_bronze,False
labuser9104086_1738770250,orders_by_date,False
labuser9104086_1738770250,orders_silver,False
labuser9104086_1738770250,status_bronze,False
labuser9104086_1738770250,status_silver,False


Note that the view we defined in our pipeline is absent from our tables list. Views in DLT are scoped to the pipeline and are not published to the target schema.
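To check programmatically whether a given name appears in the listing, something like the following could be used. It is a sketch only, assuming the notebook's predefined `spark` session and the `tableName` column shown in the `SHOW TABLES` output above; the helper names `table_names` and `is_registered` are hypothetical.

```python
def table_names(spark):
    """Return the set of names that SHOW TABLES reports for the current schema."""
    rows = spark.sql("SHOW TABLES").collect()
    return {row["tableName"] for row in rows}


def is_registered(spark, name):
    """True if `name` appears in the SHOW TABLES listing."""
    return name in table_names(spark)


# In a Databricks notebook, where `spark` is predefined:
# is_registered(spark, "orders_bronze")
```

Given the listing above, a table such as `orders_bronze` would be found, while the name of a pipeline view would not.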

Query results from the **`orders_bronze`** table.

In [0]:
%sql
SELECT * 
FROM orders_bronze

customer_id,notifications,order_id,order_timestamp,_rescued_data,processing_time,source_file
23936,Y,75542,1641859251,,2025-02-05T18:22:16.737Z,11.json
23959,N,75543,1641861581,,2025-02-05T18:22:16.737Z,11.json
23164,N,75544,1641865949,,2025-02-05T18:22:16.737Z,11.json
23251,Y,75545,1641866794,,2025-02-05T18:22:16.737Z,11.json
23233,Y,75546,1641868404,,2025-02-05T18:22:16.737Z,11.json
22600,Y,75547,1641873198,,2025-02-05T18:22:16.737Z,11.json
23523,Y,75548,1641874102,,2025-02-05T18:22:16.737Z,11.json
23382,Y,75549,1641876494,,2025-02-05T18:22:16.737Z,11.json
24000,N,75550,1641880718,,2025-02-05T18:22:16.737Z,11.json
22765,N,75551,1641883307,,2025-02-05T18:22:16.737Z,11.json


Recall that **`orders_bronze`** was defined as a streaming table in DLT, but our results here are static.

Because DLT stores all tables in Delta Lake, each query returns the most recent version of the table at the time it runs. However, queries issued outside of DLT return snapshot results from DLT tables, regardless of how those tables were defined.
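The snapshot behavior can be pictured with a small toy model. This is only a sketch of the idea of versioned snapshot reads, not Delta Lake's implementation; the `VersionedTable` class, its methods, and the sample rows are hypothetical names and values invented for illustration.

```python
class VersionedTable:
    """Toy append-only table that keeps every committed version."""

    def __init__(self):
        self._versions = [()]  # version 0 is an empty table

    def commit(self, new_rows):
        """Append rows by writing a new version; older versions are untouched."""
        self._versions.append(self._versions[-1] + tuple(new_rows))

    def query(self):
        """Return the rows of the latest version at the moment of the call."""
        return self._versions[-1]


table = VersionedTable()
table.commit(["order 1", "order 2"])  # stands in for one pipeline update

first_result = table.query()   # static snapshot of version 1
table.commit(["order 3"])      # a later update adds a row
second_result = table.query()  # a new query sees version 2

print(len(first_result), len(second_result))  # prints: 2 3
```

The first result does not change after the later commit, which mirrors why the query output above is static even though the table is fed by a stream.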

## C. Examine Results of `APPLY CHANGES INTO`

Recall that the **customers_silver** table was implemented with changes from a CDC feed applied as Type 1 SCD.

Let's query this table below.

In [0]:
%sql
SELECT * 
FROM customers_silver

processing_time,address,city,customer_id,email,name,state,timestamp,zip_code
2025-02-05T18:09:05.515Z,732 Trujillo Rue,Santa Monica,23058,jmccullough@example.net,Jennifer Christensen,CA,1632356384,89020
2025-02-05T18:09:05.515Z,567 Mora Grove Apt. 795,Fort Lauderdale,23097,rebecca60@example.net,Adrienne Williams,FL,1632740764,58050
2025-02-05T18:09:05.515Z,59530 Ashley Landing,Rayne,22990,michellebeard@example.net,Beverly Poole,LA,1633029371,21249
2025-02-05T18:09:05.515Z,780 Mcbride Valleys,Richmond,23128,huberkrystal@example.net,Ashley Joseph,VA,1633107527,77133
2025-02-05T18:09:05.515Z,8013 Houston Knoll Apt. 402,Hudson Oaks,23179,tasha43@example.org,Juan Waters,TX,1633554533,65998
2025-02-05T18:09:05.515Z,617 Richard Key,Robbinsdale,22905,darlenemoon@example.org,Michael Dawson,MN,1633865490,41894
2025-02-05T18:09:05.515Z,67692 Walker Islands Suite 731,Bayonne,23218,martinezjustin@example.org,Kristy Simpson,NJ,1633935174,52423
2025-02-05T18:09:05.515Z,6567 Charles Squares,New York,22843,michelle02@example.org,William Hawkins,NY,1634500306,86286
2025-02-05T18:09:05.515Z,671 John Drive Apt. 264,Plano,23284,crawfordsusan@example.org,Mrs. Whitney Franklin MD,TX,1634538426,79508
2025-02-05T18:09:05.515Z,87058 Mcguire Mall Apt. 094,San Diego,23303,kaitlincunningham@example.com,Tammy Brock,CA,1634705045,31177
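As a rough illustration of what applying a CDC feed as Type 1 SCD means, the sketch below collapses a change feed to one current row per key, with no history kept. It is plain Python, not the DLT implementation. The choice of `customer_id` as key and `timestamp` as sequencing column is an assumption based on the columns shown above, and the `operation` column, its `"DELETE"` value, and the sample feed are hypothetical.

```python
def apply_scd_type1(events, key="customer_id", sequence="timestamp"):
    """Collapse a change feed to the latest surviving row per key (SCD Type 1)."""
    latest = {}
    for event in events:
        current = latest.get(event[key])
        # Keep an event only if it is at least as recent as the one we hold,
        # so late-arriving older changes never overwrite newer ones.
        if current is None or event[sequence] >= current[sequence]:
            latest[event[key]] = event
    # Drop keys whose most recent event is a delete, and hide the
    # hypothetical bookkeeping column from the result.
    return {
        k: {col: val for col, val in row.items() if col != "operation"}
        for k, row in latest.items()
        if row["operation"] != "DELETE"
    }


# Made-up sample feed, not data from the course.
sample_feed = [
    {"customer_id": 1, "timestamp": 10, "operation": "NEW", "city": "A"},
    {"customer_id": 1, "timestamp": 20, "operation": "UPDATE", "city": "B"},
    {"customer_id": 2, "timestamp": 15, "operation": "NEW", "city": "C"},
    {"customer_id": 2, "timestamp": 30, "operation": "DELETE", "city": None},
    {"customer_id": 1, "timestamp": 5, "operation": "UPDATE", "city": "Z"},  # arrives late
]

current_rows = apply_scd_type1(sample_feed)
print(current_rows)
```

In this sample only customer 1 remains, with city `"B"`: the update overwrote the original row, the late-arriving older change was ignored, and customer 2 was removed by its delete event.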



&copy; 2025 Databricks, Inc. All rights reserved.<br/>
Apache, Apache Spark, Spark and the Spark logo are trademarks of the 
<a href="https://www.apache.org/">Apache Software Foundation</a>.<br/>
<br/><a href="https://databricks.com/privacy-policy">Privacy Policy</a> | 
<a href="https://databricks.com/terms-of-use">Terms of Use</a> | 
<a href="https://help.databricks.com/">Support</a>