### Learning Objectives
- This lab demonstrates setting up a simple ETL pipeline, before the use of Lakeflow Spark Declarative Pipelines.
  1. We will use the UI to create a folder where our pipeline code is stored.
  2. We will run a pipeline that runs a notebook named `orders_pipeline.sql`.
  3. We will demonstrate the use of config settings to parameterize SQL code.
  4. The pipeline will ingest from `orders/00.json` to create a bronze streaming table -> silver streaming table -> gold materialized view.


### Set Up
- Run `%run ../01_Data_Engineer_Learning_Plan/Lab-Setup/lab-setup-06`
- This should create 4 datasets:
  1. `orders/00.json` -> 174 rows
  2. `status/00.json` -> 5000 rows
  3. `customers/00.json` -> 1000 rows
  4. `customers_new01.json` -> 23 rows
- In this lab, we focus only on the `orders` dataset. The other datasets are used in the next 2 labs.


In [0]:
%run ../01_Data_Engineer_Learning_Plan/Lab-Setup/lab-setup-06

#### Steps: 
1. Select the folder where you want to store your pipeline.
2. Select `Create ETL Pipeline`.
  - This opens a UI for defining the source folder that the pipeline will run from.
  - We can also create notebooks here to hold the pipeline code (create `orders_pipeline.sql` in this step; code below).
  - In the UI, click settings to change common settings.
    - Under config, add the key `source` with the value `/Volumes/workspace/data_engineering_labs_00/v01` to parameterize the volume location.
3. Click dry run when ready. This checks for errors without creating any actual tables.
4. Run the pipeline. Running it with a full table refresh can be dangerous, because existing table data is cleared and recomputed from the source.
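
To make the config step concrete, here is a minimal Python sketch of what the `source` key is meant to achieve. It is only an analogy: the pipeline performs its own substitution when it runs the SQL, and `pipeline_config` is a hypothetical stand-in for the key/value pair entered in the settings UI. Python's `string.Template` happens to use the same `${name}` placeholder syntax as the `"${source}/orders"` path in the pipeline code.

```python
from string import Template

# Hypothetical stand-in for the key/value pair entered under config in step 2.
pipeline_config = {"source": "/Volumes/workspace/data_engineering_labs_00/v01"}

# The path expression as it is written in orders_pipeline.sql.
sql_path = "${source}/orders"

# Replace the ${source} placeholder with the configured value.
resolved_path = Template(sql_path).substitute(pipeline_config)
print(resolved_path)
```

Keeping the volume location in the config means the same SQL can point at a different volume by changing one setting rather than editing the code.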


### Example 
 - Sample `orders_pipeline.sql` code 
 - Do not run it here; the code belongs in the pipeline's `orders_pipeline.sql`.

In [0]:
-- 1. Create a bronze streaming table from our volume.
CREATE OR REFRESH STREAMING TABLE workspace.data_engineering_labs_00.bronze_demo
AS
SELECT
  *,
  current_timestamp() AS processing_time,  -- time the row was processed
  _metadata.file_name AS source_file       -- file the row was read from
FROM
  STREAM read_files(
    "${source}/orders", -- source config variable set in pipeline settings
    format => 'JSON'
  );


-- 2. Create a silver streaming table from our bronze table, converting order_timestamp to a timestamp
CREATE OR REFRESH STREAMING TABLE workspace.data_engineering_labs_00.silver_demo
AS
SELECT
  order_id,
  timestamp(order_timestamp) AS order_timestamp,
  customer_id,
  notifications
FROM
  STREAM bronze_demo;

-- 3. Create a materialized view from our silver table, counting orders per day
CREATE OR REFRESH MATERIALIZED VIEW workspace.data_engineering_labs_00.gold_orders_by_date_demo
AS
SELECT
  date(order_timestamp) AS order_date,
  count(*) AS total_daily_orders
FROM
  silver_demo
GROUP BY date(order_timestamp);
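
To see what the silver and gold steps compute, the following plain-Python sketch mirrors the same logic on a few made-up rows. The row values and the timestamp format are assumptions for illustration only; the real data comes from `orders/00.json`, and the pipeline itself runs the SQL above, not this code.

```python
from collections import Counter
from datetime import datetime

# Hypothetical bronze rows (illustrative values, not taken from the lab data).
bronze_rows = [
    {"order_id": "A1", "order_timestamp": "2024-01-01 09:15:00", "customer_id": "C1", "notifications": "Y"},
    {"order_id": "A2", "order_timestamp": "2024-01-01 17:40:00", "customer_id": "C2", "notifications": "N"},
    {"order_id": "A3", "order_timestamp": "2024-01-02 08:05:00", "customer_id": "C1", "notifications": "Y"},
]

# Silver step: keep the four selected columns and parse the timestamp string.
silver_rows = [
    {
        "order_id": row["order_id"],
        "order_timestamp": datetime.strptime(row["order_timestamp"], "%Y-%m-%d %H:%M:%S"),
        "customer_id": row["customer_id"],
        "notifications": row["notifications"],
    }
    for row in bronze_rows
]

# Gold step: group by calendar date and count the orders on each date.
daily_counts = Counter(row["order_timestamp"].date() for row in silver_rows)
gold_rows = [
    {"order_date": order_date, "total_daily_orders": total}
    for order_date, total in sorted(daily_counts.items())
]

for row in gold_rows:
    print(row)
```

The gold result has one row per distinct order date, which is why the materialized view is much smaller than the silver table it reads from.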