### Processing Incremental Updates with Structured Streaming and Delta Lake
In this lab, you'll apply your knowledge of Structured Streaming and Auto Loader to implement a simple multi-hop architecture.

#### 1.0. Import Shared Utilities and Data Files
Run the following cell to set up the necessary variables and clear out past runs of this notebook. Re-executing this cell lets you start the lab over.

In [0]:
%run ./Includes/5.1-Lab-setup


#### 2.0. Bronze Table: Ingest data
This lab uses a collection of customer-related CSV files stored in DBFS at *`dbfs:/FileStore/lab_data/retail-org/customers/`*.
- Read this data with Auto Loader and let it infer the schema (use **`DA.paths.checkpoints`** with a dedicated folder for **`customers`** to store the schema information).
- Stream the raw data to a Delta table called **`bronze`** using the **`append`** output mode.
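
One possible way to approach this step is sketched below. It assumes the notebook environment provides `spark` and that `DA.paths.checkpoints` is a string path; the helper names `customers_checkpoint` and `start_bronze_stream` are illustrative and not part of the lab. The same folder is used here for both the inferred schema and the stream checkpoint.

```python
SOURCE_DIR = "dbfs:/FileStore/lab_data/retail-org/customers/"


def customers_checkpoint(checkpoints_root):
    """Build the dedicated `customers` folder under the checkpoints root."""
    return f"{checkpoints_root.rstrip('/')}/customers"


def start_bronze_stream(spark, checkpoints_root):
    """Ingest the CSV files with Auto Loader and append them to `bronze`."""
    checkpoint = customers_checkpoint(checkpoints_root)
    return (spark.readStream
            .format("cloudFiles")                              # Auto Loader source
            .option("cloudFiles.format", "csv")                # raw file format
            .option("cloudFiles.schemaLocation", checkpoint)   # inferred schema is stored here
            .load(SOURCE_DIR)
            .writeStream
            .format("delta")
            .option("checkpointLocation", checkpoint)          # stream progress is tracked here
            .outputMode("append")
            .toTable("bronze"))
```

In the notebook this could be called as `query = start_bronze_stream(spark, DA.paths.checkpoints)`, so that the next cell can wait on `query`.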

In [0]:
# TODO:

In [0]:
DA.block_until_stream_is_ready(query)

##### 2.1. Create a Streaming Temporary View
Create a streaming temporary view named **`bronze_temp`** on top of the **`bronze`** table so we can perform transformations using SQL.

In [0]:
(spark
  .readStream
  .table("bronze")
  .createOrReplaceTempView("bronze_temp"))

##### 2.2. Clean and Enhance the Data
Use CTAS-style syntax (a **`CREATE ... AS SELECT`** statement) to define a new streaming view called **`bronze_enhanced_temp`** that does the following:
* Skips records with a null **`postcode`** (null postcodes have been set to zero)
* Inserts a column called **`receipt_time`** containing a current timestamp
* Inserts a column called **`source_file`** containing the input filename
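
The three requirements above could be expressed roughly as follows. This is a sketch, not the lab's reference answer: it assumes that missing postcodes appear as zero (so `postcode > 0` filters them out) and uses the built-in `current_timestamp()` and `input_file_name()` functions. The names `BRONZE_ENHANCED_SQL` and `create_bronze_enhanced_view` are illustrative; the SQL text could equally be pasted into a `%sql` cell.

```python
BRONZE_ENHANCED_SQL = """
CREATE OR REPLACE TEMPORARY VIEW bronze_enhanced_temp AS
SELECT
  *,
  current_timestamp() AS receipt_time,  -- time the record was processed
  input_file_name()   AS source_file    -- file the record was read from
FROM bronze_temp
WHERE postcode > 0                      -- assumption: null postcodes are stored as zero
"""


def create_bronze_enhanced_view(spark):
    """Register the streaming view on top of `bronze_temp`."""
    spark.sql(BRONZE_ENHANCED_SQL)
```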

In [0]:
%sql
-- TODO:

#### 3.0. Silver Table
Stream the data from **`bronze_enhanced_temp`** to a Delta table named **`silver`** using the **`append`** output mode. Use **`DA.paths.checkpoints`** and a dedicated folder for **`silver`** as the checkpoint path.
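
A minimal sketch of this write is shown below, assuming `spark` comes from the notebook environment and `DA.paths.checkpoints` is a string path. The function name `start_silver_stream` is illustrative only.

```python
def start_silver_stream(spark, checkpoints_root):
    """Append rows from the `bronze_enhanced_temp` streaming view to `silver`."""
    checkpoint = f"{checkpoints_root.rstrip('/')}/silver"  # dedicated folder for this stream
    return (spark.table("bronze_enhanced_temp")            # reading a streaming view yields a streaming DataFrame
            .writeStream
            .format("delta")
            .option("checkpointLocation", checkpoint)
            .outputMode("append")
            .toTable("silver"))
```

In the notebook this could be called as `query = start_silver_stream(spark, DA.paths.checkpoints)`.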

In [0]:
# TODO:

In [0]:
DA.block_until_stream_is_ready(query)

##### 3.1. Create a Streaming Temporary View
Create another streaming temporary view named **`silver_temp`** for the **`silver`** table so we can perform business-level queries using SQL.

In [0]:
(spark
  .readStream
  .table("silver")
  .createOrReplaceTempView("silver_temp"))


#### 4.0. Gold Table
Use CTAS-style syntax (a **`CREATE ... AS SELECT`** statement) to define a new streaming view called **`customer_count_by_state_temp`** that counts customers per state.
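
One way such a view might look is sketched here. It assumes the customer data has a column named `state`; the alias `customer_count` and the Python names are illustrative choices, not names given by the lab. The SQL text could also be run directly in a `%sql` cell.

```python
CUSTOMER_COUNT_SQL = """
CREATE OR REPLACE TEMPORARY VIEW customer_count_by_state_temp AS
SELECT
  state,
  count(state) AS customer_count  -- hypothetical alias for the per-state total
FROM silver_temp
GROUP BY state
"""


def create_customer_count_view(spark):
    """Register the aggregating streaming view on top of `silver_temp`."""
    spark.sql(CUSTOMER_COUNT_SQL)
```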

In [0]:
%sql
-- TODO:

Finally, stream the data from the **`customer_count_by_state_temp`** view to a Delta table called **`gold_customer_count_by_state`**. Remember to use the **`complete`** output mode: the view contains an aggregation (**`count()`**), so the counts for existing states change as new data arrives, and the full result table has to be rewritten on each trigger rather than appended to. Also, use **`DA.paths.checkpoints`** and a dedicated folder for **`customer_counts`** as the checkpoint path.
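
The write described above might be sketched as follows, again assuming `spark` is provided by the notebook and `DA.paths.checkpoints` is a string path. The function name `start_gold_stream` is illustrative.

```python
def start_gold_stream(spark, checkpoints_root):
    """Rewrite the per-state counts into the gold table on every trigger."""
    checkpoint = f"{checkpoints_root.rstrip('/')}/customer_counts"  # dedicated folder for this stream
    return (spark.table("customer_count_by_state_temp")
            .writeStream
            .format("delta")
            .option("checkpointLocation", checkpoint)
            .outputMode("complete")   # aggregated results replace the previous table contents
            .toTable("gold_customer_count_by_state"))
```

In the notebook this could be called as `query = start_gold_stream(spark, DA.paths.checkpoints)`.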

In [0]:
# TODO:

In [0]:
DA.block_until_stream_is_ready(query)

#### 5.0. Query the Results
Query the **`gold_customer_count_by_state`** table (this will not be a streaming query).

In [0]:
%sql
SELECT * FROM gold_customer_count_by_state

#### 6.0. Clean Up
Run the following cell to remove the database and all data associated with this lab.

In [0]:
DA.cleanup()

By completing this lab, you should now feel comfortable:
* Using PySpark to configure Auto Loader for incremental data ingestion
* Using Spark SQL to aggregate streaming data
* Streaming data to a Delta table