# Overview

This notebook guides you through the setup and execution of the SILVER transformation pipeline using Snowflake and Iceberg tables. It covers the following steps:

- Importing required Python packages and establishing a Snowflake session.
- Setting up user-specific variables, roles, databases, and schemas for the SILVER layer.
- Creating dynamic Iceberg tables for curated and enriched datasets.
- Transforming and joining raw data from the BRONZE layer to produce analytics-ready tables.
- Providing SQL code examples for data transformation and enrichment.

Follow the instructions and code cells to complete the SILVER pipeline and prepare your data for advanced analytics and reporting.

In [None]:
# Import Python packages
import streamlit as st
import pandas as pd

# We can also use Snowpark for our analyses!
from snowflake.snowpark.context import get_active_session
session = get_active_session()


# Set User Number

Specify your unique user number for this lab. This ensures that all resources you create are isolated and do not conflict with those of other users.
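
The SQL cells in this notebook build a user name, role, database, and warehouse name from this number. As a rough illustration of that naming scheme in plain Python (the helper `derive_names` is hypothetical and not part of the lab; the patterns are copied from the SQL cells in this notebook):

```python
def derive_names(usernum: str) -> dict:
    """Build the per-user object names that the SQL cells derive from usernum."""
    username = f"HOL_USER_{usernum}"
    return {
        "username": username,                    # SET USERNAME
        "role": f"{username}_FULL_ROLE",         # SET HOLROLE
        "database": f"{username}_DB",            # SET DB_NAME
        "schema": "SILVER",                      # SET SCHEMANAME
        "warehouse": f"hol_user_{usernum}_wh",   # WAREHOUSE parameter of each table
    }


names = derive_names("151")
print(names["role"])
print(names["warehouse"])
```

Because every name depends on the same number, a wrong value here will most likely surface as a missing role, database, or warehouse in the later cells.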

In [None]:
usernum = str('<INSERT USER NUMBER>')

In [None]:
-- Build the lab user name from the user number set above. Use the same number that appears in your login user name, e.g. USER_1, USER_151.


SET USERNAME = 'HOL_USER_' || '{{usernum}}';
SELECT $USERNAME;


In [None]:
SET HOLROLE = $USERNAME || '_FULL_ROLE';
SET DB_NAME = $USERNAME || '_DB';
SET SCHEMANAME = 'SILVER';

In [None]:
USE ROLE IDENTIFIER($HOLROLE);
USE DATABASE IDENTIFIER($DB_NAME);
USE SCHEMA IDENTIFIER($SCHEMANAME);

# Creating the TRAIN_ACTIVATIONS_00 Table

This step creates a dynamic Iceberg table named `TRAIN_ACTIVATIONS_00` in the SILVER schema. The table is populated by transforming and extracting relevant fields from the raw train activations data in the BRONZE layer. The transformation includes parsing JSON, converting timestamps, and generating a unique schedule key for each record.
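
To make the per-record logic easier to follow, here is a minimal Python sketch of the same three operations on a single message. It is not how the dynamic table runs. The sample payload is invented, the helper names are hypothetical, and it assumes the timestamps arrive as epoch milliseconds. Returning `None` when a key part is missing mirrors how `CONCAT_WS` is understood to behave with NULL arguments.

```python
import json
from datetime import datetime, timezone

# Same swap as the SQL CASE expression: 'O' becomes 'P' and 'P' becomes 'O'.
_SWAP = {"O": "P", "P": "O"}


def epoch_ms_to_naive_utc(value):
    """Convert an epoch-milliseconds string to a naive UTC datetime (assumed unit)."""
    seconds = int(value) / 1000
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(tzinfo=None)


def schedule_key(body):
    """Join train_uid, schedule_start_date, and the swapped schedule_type with '/'."""
    schedule_type = body.get("schedule_type")
    parts = [
        body.get("train_uid"),
        body.get("schedule_start_date"),
        _SWAP.get(schedule_type, schedule_type),
    ]
    if any(part is None for part in parts):
        return None  # assumption: any missing part yields no key
    return "/".join(str(part) for part in parts)


# Invented sample message, shaped like the fields the SQL reads.
raw = json.dumps({
    "header": {"msg_type": "0001"},
    "body": {
        "train_uid": "C12345",
        "schedule_start_date": "2024-01-15",
        "schedule_type": "O",
        "creation_timestamp": "1700000000000",
    },
})

body = json.loads(raw)["body"]
print(schedule_key(body))
print(epoch_ms_to_naive_utc(body["creation_timestamp"]))
```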

In [None]:


CREATE OR REPLACE DYNAMIC ICEBERG TABLE TRAIN_ACTIVATIONS_00
  TARGET_LAG = '1 minute'
  WAREHOUSE = hol_user_{{usernum}}_wh
  EXTERNAL_VOLUME= 'iceberg_hol_silver_vol'
  CATALOG = 'SNOWFLAKE'
  BASE_LOCATION = 'hol_user_{{usernum}}/train_activations_00/'
  catalog_sync = iceberg_hol_oc_int
  AS
  WITH CTE_MVT_001 as (
  select parse_json(value) as VALUE from BRONZE.MOVEMENTS_RAW_0001)
SELECT VALUE:header::STRING HEADER, 
        VALUE:body:schedule_source::STRING AS schedule_source,
        to_timestamp(VALUE:body:tp_origin_timestamp::STRING)::TIMESTAMP_NTZ(6) AS tp_origin_timestamp,
        CASE WHEN VALUE:body:schedule_type = 'O' THEN 'P'
             WHEN VALUE:body:schedule_type = 'P' THEN 'O'
             ELSE VALUE:body:schedule_type 
          END AS schedule_type,
        to_timestamp(VALUE:body:creation_timestamp::STRING)::TIMESTAMP_NTZ(6) AS creation_timestamp,
        to_timestamp(VALUE:body:origin_dep_timestamp::STRING)::TIMESTAMP_NTZ(6) AS origin_dep_timestamp,
        VALUE:body:toc_id::INT AS toc_id,
        VALUE:body:d1266_record_number::STRING AS d1266_record_number,
        VALUE:body:train_service_code::STRING AS train_service_code,
        VALUE:body:sched_origin_stanox::STRING AS sched_origin_stanox,
        VALUE:body:train_uid::STRING AS train_uid,
        VALUE:body:train_call_mode::STRING AS train_call_mode,
        to_date(VALUE:body:schedule_start_date::STRING) AS schedule_start_date,
        VALUE:body:tp_origin_stanox::STRING AS tp_origin_stanox,
        VALUE:body:schedule_wtt_id::STRING AS schedule_wtt_id,
        VALUE:body:train_call_type::STRING AS train_call_type,
        to_date(VALUE:body:schedule_end_date::STRING) AS schedule_end_date,
        VALUE:body:train_id::STRING AS train_id,
        CONCAT_WS('/',
                  VALUE:body:train_uid ,
                  VALUE:body:schedule_start_date ,
                  CASE WHEN VALUE:body:schedule_type = 'O' THEN 'P'
                    WHEN VALUE:body:schedule_type = 'P' THEN 'O'
                    ELSE VALUE:body:schedule_type 
                  END
                 ) AS SCHEDULE_KEY
    FROM CTE_MVT_001;

# Creating the TRAIN_CANCELLATIONS_00 Table

This cell creates a dynamic Iceberg table named `TRAIN_CANCELLATIONS_00` by transforming raw train cancellation events from the BRONZE layer. The transformation includes extracting fields, mapping TOC codes to operator names, and generating a schedule key for each cancellation event.
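
The TOC mapping is a fixed lookup with a fallback label. A small Python sketch of that rule, using the four codes from the SQL below (the function name `toc_name` is hypothetical, and returning `None` for a missing code is an assumption based on SQL string concatenation with NULL):

```python
# Operator names keyed by TOC code, copied from the CASE expression in this notebook.
TOC_NAMES = {
    "20": "TransPennine Express",
    "23": "Arriva Trains Northern",
    "28": "East Midlands Trains",
    "61": "London North Eastern Railway",
}


def toc_name(toc_id):
    """Return the operator name, or a labelled placeholder for unmapped codes."""
    if toc_id is None:
        return None  # assumption: a missing code produces no label at all
    code = str(toc_id)
    return TOC_NAMES.get(code, f"<unknown TOC code: {code}>")


print(toc_name("61"))
print(toc_name("99"))
```

Keeping the unmapped code inside the placeholder makes gaps in the mapping easy to spot when browsing the table.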

In [None]:
CREATE OR REPLACE DYNAMIC ICEBERG TABLE TRAIN_CANCELLATIONS_00
  TARGET_LAG = '1 minute'
  WAREHOUSE = hol_user_{{usernum}}_wh
  EXTERNAL_VOLUME= 'iceberg_hol_silver_vol'
  CATALOG = 'SNOWFLAKE'
  BASE_LOCATION = 'hol_user_{{usernum}}/train_cancellations_00/'
  catalog_sync = iceberg_hol_oc_int
AS
WITH CTE_MVT_002 AS (
  SELECT parse_json(value) AS MVT FROM BRONZE.MOVEMENTS_RAW_0002
)
SELECT    MVT:header::VARCHAR                                  AS MSG_HEADER, 
          MVT:body:train_file_address::VARCHAR             AS TRAIN_FILE_ADDRESS, 
          MVT:body:train_service_code::VARCHAR             AS TRAIN_SERVICE_CODE, 
          MVT:body:orig_loc_stanox::VARCHAR                AS ORIG_LOC_STANOX, 
          MVT:body:toc_id::VARCHAR                         AS TOC_ID, 
          CASE 
                    WHEN MVT:body:toc_id = '20' THEN 'TransPennine Express' 
                    WHEN MVT:body:toc_id = '23' THEN 'Arriva Trains Northern'
                    WHEN MVT:body:toc_id = '28' THEN 'East Midlands Trains' 
                    WHEN MVT:body:toc_id = '61' THEN 'London North Eastern Railway'
                    ELSE '<unknown TOC code: ' || MVT:body:toc_id || '>' 
          END                                                           AS TOC, 
          MVT:body:dep_timestamp::INT  AS DEP_TIMESTAMP, 
          MVT:body:division_code::INT                  AS DIVISION_CODE, 
          MVT:body:loc_stanox::VARCHAR                     AS LOC_STANOX, 
          MVT:body:canx_timestamp::INT AS CANX_TIMESTAMP, 
          MVT:body:canx_reason_code::VARCHAR               AS CANX_REASON_CODE, 
          MVT:body:train_id::VARCHAR                       AS TRAIN_ID, 
          to_timestamp(MVT:body:orig_loc_timestamp::VARCHAR)::timestamp_ntz(6)             AS ORIG_LOC_TIMESTAMP, 
          MVT:body:canx_type::VARCHAR                      AS CANX_TYPE,
           CONCAT_WS('/',
                  MVT:body:train_uid ,
                  MVT:body:schedule_start_date ,
                  CASE WHEN MVT:body:schedule_type = 'O' THEN 'P'
                    WHEN MVT:body:schedule_type = 'P' THEN 'O'
                    ELSE MVT:body:schedule_type 
                  END
                 ) AS SCHEDULE_KEY
FROM     CTE_MVT_002;

# Creating the TRAIN_MOVEMENTS_00 Table

This step creates a dynamic Iceberg table named `TRAIN_MOVEMENTS_00` by transforming raw train movement events from the BRONZE layer. The transformation extracts event details, computes delay indicators, and generates a unique message key for each movement event.
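
The delay columns and the message key follow simple rules, sketched here in Python for one message body. The function names and the sample values are hypothetical. Statuses other than the three handled in the SQL return `None`, matching a CASE expression without an ELSE branch.

```python
def variation_label(status, minutes):
    """Human-readable variation, e.g. '3 MINS LATE'."""
    if status == "ON TIME":
        return "ON TIME"
    if status == "LATE":
        return f"{minutes} MINS LATE"
    if status == "EARLY":
        return f"{minutes} MINS EARLY"
    return None  # any other status is left empty


def late_ind(status):
    """1 only for late trains; on-time and early trains both count as 0."""
    return {"ON TIME": 0, "LATE": 1, "EARLY": 0}.get(status)


def msg_key(body):
    """Join train_id, planned_event_type, and loc_stanox with '/'."""
    parts = [body.get("train_id"), body.get("planned_event_type"), body.get("loc_stanox")]
    if any(part is None for part in parts):
        return None  # assumption: a missing part yields no key
    return "/".join(str(part) for part in parts)


# Invented sample values for illustration only.
body = {
    "train_id": "121A23MB15",
    "planned_event_type": "ARRIVAL",
    "loc_stanox": "17132",
    "variation_status": "LATE",
    "timetable_variation": "3",
}
print(variation_label(body["variation_status"], body["timetable_variation"]))
print(late_ind(body["variation_status"]))
print(msg_key(body))
```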

In [None]:
CREATE OR REPLACE DYNAMIC ICEBERG TABLE TRAIN_MOVEMENTS_00
  TARGET_LAG = '1 minute'
  WAREHOUSE = hol_user_{{usernum}}_wh
  EXTERNAL_VOLUME= 'iceberg_hol_silver_vol'
  CATALOG = 'SNOWFLAKE'
  BASE_LOCATION = 'hol_user_{{usernum}}/train_movements_00/'
  catalog_sync = iceberg_hol_oc_int
  AS
WITH CTE_MVT_0003 AS (
      SELECT parse_json(value) AS MVT FROM BRONZE.MOVEMENTS_RAW_0003
      ) 
SELECT MVT:header::STRING AS MSG_HEADER, 
       MVT:body:event_type::STRING as event_type,
       to_timestamp(MVT:body:gbtt_timestamp::STRING)::TIMESTAMP_NTZ(6)  as gbtt_timestamp,
       MVT:body:original_loc_stanox::STRING as original_loc_stanox,
       to_timestamp(MVT:body:planned_timestamp::STRING)::TIMESTAMP_NTZ(6)  as planned_timestamp,
       MVT:body:timetable_variation::INT AS TIMETABLE_VARIATION,
       to_timestamp(MVT:body:original_loc_timestamp::STRING)::TIMESTAMP_NTZ(6)  as original_loc_timestamp,
       MVT:body:current_train_id::STRING as current_train_id,
       MVT:body:delay_monitoring_point::BOOLEAN as delay_monitoring_point,
       MVT:body:next_report_run_time::INT as next_report_run_time,
       MVT:body:reporting_stanox::STRING as reporting_stanox,
       MVT:body:actual_timestamp::INT as actual_timestamp,
       MVT:body:correction_ind::BOOLEAN as correction_ind,
       MVT:body:event_source::STRING as event_source,
       MVT:body:train_file_address::STRING as train_file_address,
       CASE WHEN LEN(MVT:body:platform)> 0 THEN 'Platform' || MVT:body:platform
             ELSE '' 
          END AS PLATFORM,
       MVT:body:division_code::STRING as division_code,
       MVT:body:train_terminated::BOOLEAN as train_terminated,
       MVT:body:train_id::STRING as train_id,
       MVT:body:offroute_ind::BOOLEAN as offroute_ind,
       CASE WHEN MVT:body:variation_status = 'ON TIME' THEN 'ON TIME' 
             WHEN MVT:body:variation_status = 'LATE' THEN MVT:body:timetable_variation || ' MINS LATE' 
             WHEN MVT:body:variation_status ='EARLY' THEN MVT:body:timetable_variation || ' MINS EARLY' 
        END AS VARIATION,
       CASE WHEN MVT:body:variation_status = 'ON TIME' THEN 0
             WHEN MVT:body:variation_status = 'LATE' THEN 1
             WHEN MVT:body:variation_status='EARLY' THEN 0
        END AS LATE_IND,
       MVT:body:variation_status::STRING as variation_status,
       MVT:body:train_service_code::STRING as train_service_code,
       MVT:body:toc_id::STRING as toc_id,
        CASE WHEN MVT:body:toc_id = '20' THEN 'TransPennine Express'
               WHEN MVT:body:toc_id = '23' THEN 'Arriva Trains Northern'
               WHEN MVT:body:toc_id = '28' THEN 'East Midlands Trains'
               WHEN MVT:body:toc_id = '61' THEN 'London North Eastern Railway'
              ELSE '<unknown TOC code: ' || MVT:body:toc_id || '>'
        END AS TOC,
       MVT:body:loc_stanox::STRING as loc_stanox,
       MVT:body:auto_expected::BOOLEAN as auto_expected,
       MVT:body:direction_ind::STRING as direction_ind,
       MVT:body:route::STRING as route,
       MVT:body:planned_event_type::STRING as planned_event_type,
       MVT:body:next_report_stanox::STRING as next_report_stanox,
       MVT:body:line_ind::STRING as line_ind,
       CONCAT_WS('/',
                 MVT:body:train_id,
                 MVT:body:planned_event_type,
                 MVT:body:loc_stanox) AS MSG_KEY
  FROM CTE_MVT_0003;

# Creating the SCHEDULE_00 Table

This cell creates a dynamic Iceberg table named `SCHEDULE_00` by transforming and enriching raw schedule data from the BRONZE layer. The transformation flattens nested columns, maps codes to descriptive values, and joins with location data to provide enriched schedule information, including origin and destination details.
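
The origin and destination columns come from the first and last entries of the `schedule_location` array. A minimal Python sketch of that extraction, with invented sample data and a hypothetical helper name. As a simplification, a missing array is treated here as an empty list, which may differ from how the SQL handles NULL.

```python
def origin_and_destination(schedule_segment):
    """Pick the first and last stops of a schedule and count all stops."""
    stops = schedule_segment.get("schedule_location") or []
    first = stops[0] if stops else {}
    last = stops[-1] if stops else {}
    return {
        "origin_tiploc_code": first.get("tiploc_code"),
        "origin_public_departure_time": first.get("public_departure"),
        "origin_platform": first.get("platform"),
        "destination_tiploc_code": last.get("tiploc_code"),
        "destination_public_arrival_time": last.get("public_arrival"),
        "destination_platform": last.get("platform"),
        "num_stops": len(stops),
    }


# Invented three-stop schedule for illustration only.
segment = {
    "schedule_location": [
        {"tiploc_code": "LEEDS", "public_departure": "0915", "platform": "8"},
        {"tiploc_code": "GARFRTH", "public_departure": "0926"},
        {"tiploc_code": "YORK", "public_arrival": "0941", "platform": "3"},
    ]
}
print(origin_and_destination(segment))
```

Note that `num_stops` counts every entry in the array, including the origin and destination.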

In [None]:
CREATE OR REPLACE DYNAMIC ICEBERG TABLE SCHEDULE_00
  TARGET_LAG = '1 minute'
  WAREHOUSE = hol_user_{{usernum}}_wh
  EXTERNAL_VOLUME= 'iceberg_hol_silver_vol'
  CATALOG = 'SNOWFLAKE'
  BASE_LOCATION = 'hol_user_{{usernum}}/schedule_00/'
  catalog_sync = iceberg_hol_oc_int
  AS
SELECT 
    CIF_train_uid,
    schedule_start_date::DATE schedule_start_date,
    CIF_stp_indicator,
     CONCAT_WS('/',
                 CIF_train_uid,
                 schedule_start_date,
                 CIF_stp_indicator) AS SCHEDULE_KEY,
    atoc_code,
     CASE
            WHEN train_status ='B' THEN 'Bus (Permanent)'
            WHEN train_status ='F' THEN 'Freight (Permanent - WTT)'
            WHEN train_status ='P' THEN 'Passenger & Parcels (Permanent - WTT)'
            WHEN train_status ='S' THEN 'Ship (Permanent)'
            WHEN train_status ='T' THEN 'Trip (Permanent)'
            WHEN train_status ='1' THEN 'STP Passenger & Parcels'
            WHEN train_status ='2' THEN 'STP Freight'
            WHEN train_status ='3' THEN 'STP Trip'
            WHEN train_status ='4' THEN 'STP Ship'
            WHEN train_status ='5' THEN 'STP Bus'
    END AS TRAIN_STATUS,
    CASE 
        WHEN schedule_segment:CIF_power_type = 'D' THEN 'Diesel'
        WHEN schedule_segment:CIF_power_type = 'DEM' THEN 'Diesel Electric Multiple Unit'
        WHEN schedule_segment:CIF_power_type = 'DMU' THEN 'Diesel Mechanical Multiple Unit'
        WHEN schedule_segment:CIF_power_type = 'E' THEN 'Electric'
        WHEN schedule_segment:CIF_power_type = 'ED' THEN 'Electro-Diesel'
        WHEN schedule_segment:CIF_power_type = 'EML' THEN 'EMU plus D, E, ED locomotive'
        WHEN schedule_segment:CIF_power_type = 'EMU' THEN 'Electric Multiple Unit'
        WHEN schedule_segment:CIF_power_type = 'HST' THEN 'High Speed Train'
      END AS POWER_TYPE,
   CASE 
        WHEN schedule_segment:CIF_train_class = 'B' OR schedule_segment:CIF_train_class = '' THEN 'First and standard' 
        WHEN schedule_segment:CIF_train_class = 'S'  THEN 'Standard only' 
      END AS SEATING_CLASSES,
   CASE 
      WHEN schedule_segment:CIF_reservations =  'A' THEN 'Reservations compulsory'
      WHEN schedule_segment:CIF_reservations =  'E' THEN 'Reservations for bicycles essential'
      WHEN schedule_segment:CIF_reservations =  'R' THEN 'Reservations recommended'
      WHEN schedule_segment:CIF_reservations =  'S' THEN 'Reservations possible from any station'
    END AS RESERVATIONS,
   CASE 
      WHEN schedule_segment:CIF_sleepers =   'B' THEN 'First and standard class'
      WHEN schedule_segment:CIF_sleepers =   'F' THEN 'First Class only'
      WHEN schedule_segment:CIF_sleepers =   'S' THEN 'Standard class only'
    END AS SLEEPING_ACCOMODATION,
    CASE 
          WHEN schedule_segment:CIF_train_category =  'OL' THEN 'Ordinary Passenger Trains: London Underground/Metro Service'
          WHEN schedule_segment:CIF_train_category =  'OU' THEN 'Ordinary Passenger Trains: Unadvertised Ordinary Passenger'
          WHEN schedule_segment:CIF_train_category =  'OO' THEN 'Ordinary Passenger Trains: Ordinary Passenger'
          WHEN schedule_segment:CIF_train_category =  'OS' THEN 'Ordinary Passenger Trains: Staff Train'
          WHEN schedule_segment:CIF_train_category =  'OW' THEN 'Ordinary Passenger Trains: Mixed'
          WHEN schedule_segment:CIF_train_category =  'XC' THEN 'Express Passenger Trains: Channel Tunnel'
          WHEN schedule_segment:CIF_train_category =  'XD' THEN 'Express Passenger Trains: Sleeper (Europe Night Services)'
          WHEN schedule_segment:CIF_train_category =  'XI' THEN 'Express Passenger Trains: International'
          WHEN schedule_segment:CIF_train_category =  'XR' THEN 'Express Passenger Trains: Motorail'
          WHEN schedule_segment:CIF_train_category =  'XU' THEN 'Express Passenger Trains: Unadvertised Express'
          WHEN schedule_segment:CIF_train_category =  'XX' THEN 'Express Passenger Trains: Express Passenger'
          WHEN schedule_segment:CIF_train_category =  'XZ' THEN 'Express Passenger Trains: Sleeper (Domestic)'
          WHEN schedule_segment:CIF_train_category =  'BR' THEN 'Buses & Ships: Bus – Replacement due to engineering work'
          WHEN schedule_segment:CIF_train_category =  'BS' THEN 'Buses & Ships: Bus – WTT Service'
          WHEN schedule_segment:CIF_train_category =  'SS' THEN 'Buses & Ships: Ship'
          WHEN schedule_segment:CIF_train_category =  'EE' THEN 'Empty Coaching Stock Trains: Empty Coaching Stock (ECS)'
          WHEN schedule_segment:CIF_train_category =  'EL' THEN 'Empty Coaching Stock Trains: ECS, London Underground/Metro Service'
          WHEN schedule_segment:CIF_train_category =  'ES' THEN 'Empty Coaching Stock Trains: ECS & Staff'
          WHEN schedule_segment:CIF_train_category =  'JJ' THEN 'Parcels and Postal Trains: Postal'
          WHEN schedule_segment:CIF_train_category =  'PM' THEN 'Parcels and Postal Trains: Post Office Controlled Parcels'
          WHEN schedule_segment:CIF_train_category =  'PP' THEN 'Parcels and Postal Trains: Parcels'
          WHEN schedule_segment:CIF_train_category =  'PV' THEN 'Parcels and Postal Trains: Empty NPCCS'
          WHEN schedule_segment:CIF_train_category =  'DD' THEN 'Departmental Trains: Departmental'
          WHEN schedule_segment:CIF_train_category =  'DH' THEN 'Departmental Trains: Civil Engineer'
          WHEN schedule_segment:CIF_train_category =  'DI' THEN 'Departmental Trains: Mechanical & Electrical Engineer'
          WHEN schedule_segment:CIF_train_category =  'DQ' THEN 'Departmental Trains: Stores'
          WHEN schedule_segment:CIF_train_category =  'DT' THEN 'Departmental Trains: Test'
          WHEN schedule_segment:CIF_train_category =  'DY' THEN 'Departmental Trains: Signal & Telecommunications Engineer'
          WHEN schedule_segment:CIF_train_category =  'ZB' THEN 'Light Locomotives: Locomotive & Brake Van'
          WHEN schedule_segment:CIF_train_category =  'ZZ' THEN 'Light Locomotives: Light Locomotive'
          WHEN schedule_segment:CIF_train_category =  'J2' THEN 'Railfreight Distribution: RfD Automotive (Components)'
          WHEN schedule_segment:CIF_train_category =  'H2' THEN 'Railfreight Distribution: RfD Automotive (Vehicles)'
          WHEN schedule_segment:CIF_train_category =  'J3' THEN 'Railfreight Distribution: RfD Edible Products (UK Contracts)'
          WHEN schedule_segment:CIF_train_category =  'J4' THEN 'Railfreight Distribution: RfD Industrial Minerals (UK Contracts)'
          WHEN schedule_segment:CIF_train_category =  'J5' THEN 'Railfreight Distribution: RfD Chemicals (UK Contracts)'
          WHEN schedule_segment:CIF_train_category =  'J6' THEN 'Railfreight Distribution: RfD Building Materials (UK Contracts)'
          WHEN schedule_segment:CIF_train_category =  'J8' THEN 'Railfreight Distribution: RfD General Merchandise (UK Contracts)'
          WHEN schedule_segment:CIF_train_category =  'H8' THEN 'Railfreight Distribution: RfD European'
          WHEN schedule_segment:CIF_train_category =  'J9' THEN 'Railfreight Distribution: RfD Freightliner (Contracts)'
          WHEN schedule_segment:CIF_train_category =  'H9' THEN 'Railfreight Distribution: RfD Freightliner (Other)'
          WHEN schedule_segment:CIF_train_category =  'A0' THEN 'Trainload Freight: Coal (Distributive)'
          WHEN schedule_segment:CIF_train_category =  'E0' THEN 'Trainload Freight: Coal (Electricity) MGR'
          WHEN schedule_segment:CIF_train_category =  'B0' THEN 'Trainload Freight: Coal (Other) and Nuclear'
          WHEN schedule_segment:CIF_train_category =  'B1' THEN 'Trainload Freight: Metals'
          WHEN schedule_segment:CIF_train_category =  'B4' THEN 'Trainload Freight: Aggregates'
          WHEN schedule_segment:CIF_train_category =  'B5' THEN 'Trainload Freight: Domestic and Industrial Waste'
          WHEN schedule_segment:CIF_train_category =  'B6' THEN 'Trainload Freight: Building Materials (TLF)'
          WHEN schedule_segment:CIF_train_category =  'B7' THEN 'Trainload Freight: Petroleum Products'
          WHEN schedule_segment:CIF_train_category =  'H0' THEN 'Railfreight Distribution (Channel Tunnel): RfD European Channel Tunnel (Mixed Business)'
          WHEN schedule_segment:CIF_train_category =  'H1' THEN 'Railfreight Distribution (Channel Tunnel): RfD European Channel Tunnel Intermodal'
          WHEN schedule_segment:CIF_train_category =  'H3' THEN 'Railfreight Distribution (Channel Tunnel): RfD European Channel Tunnel Automotive'
          WHEN schedule_segment:CIF_train_category =  'H4' THEN 'Railfreight Distribution (Channel Tunnel): RfD European Channel Tunnel Contract Services'
          WHEN schedule_segment:CIF_train_category =  'H5' THEN 'Railfreight Distribution (Channel Tunnel): RfD European Channel Tunnel Haulmark'
          WHEN schedule_segment:CIF_train_category =  'H6' THEN 'Railfreight Distribution (Channel Tunnel): RfD European Channel Tunnel Joint Venture'
        END AS TRAIN_CATEGORY,
        schedule_segment:schedule_location[0].tiploc_code      as ORIGIN_TIPLOC_CODE,
        T_SRC.DESCRIPTION AS ORIGIN_DESCRIPTION,
        T_SRC.LATITUDE::DOUBLE AS ORIGIN_LAT_LON,
        schedule_segment:schedule_location[0].public_departure as ORIGIN_PUBLIC_DEPARTURE_TIME,
        schedule_segment:schedule_location[0].platform         as ORIGIN_PLATFORM,
        schedule_segment:schedule_location[array_size(schedule_segment:schedule_location)-1].tiploc_code      as DESTINATION_TIPLOC_CODE,
        T_DST.DESCRIPTION AS DESTINATION_DESCRIPTION,
        T_DST.LONGITUDE::DOUBLE AS DESTINATION_LAT_LON,
        schedule_segment:schedule_location[array_size(schedule_segment:schedule_location)-1].public_arrival as DESTINATION_PUBLIC_ARRIVAL_TIME,
        schedule_segment:schedule_location[array_size(schedule_segment:schedule_location)-1].platform        as DESTINATION_PLATFORM,
        ARRAY_SIZE(schedule_segment:schedule_location) AS NUM_STOPS
FROM    BRONZE.SCHEDULE_RAW C
  LEFT JOIN BRONZE.LOCATIONS_RAW T_SRC 
    ON ORIGIN_TIPLOC_CODE
     = T_SRC.TIPLOC
  LEFT JOIN BRONZE.LOCATIONS_RAW T_DST
    ON DESTINATION_TIPLOC_CODE
     = T_DST.TIPLOC;


# Creating the TRAIN_ACTIVATIONS_01 Table

This step creates a dynamic Iceberg table named `TRAIN_ACTIVATIONS_01` by joining the previously created `TRAIN_ACTIVATIONS_00` table with location reference data. This enriches each activation event with descriptive information and geolocation for the origin station.
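
A left join keeps every activation, whether or not a location matches. The Python sketch below illustrates that behaviour on invented rows (the function and field names are hypothetical). It also shows a side effect worth keeping in mind: if the location data holds more than one row for the same STANOX, the activation appears once per match.

```python
from collections import defaultdict


def left_join_locations(activations, locations):
    """Attach a location description to each activation, keeping unmatched rows."""
    by_stanox = defaultdict(list)
    for loc in locations:
        by_stanox[loc["stanox"]].append(loc)

    rows = []
    for act in activations:
        # No match still yields one output row, with an empty description.
        matches = by_stanox.get(act["sched_origin_stanox"]) or [None]
        for loc in matches:
            rows.append({**act, "sched_origin_desc": loc["description"] if loc else None})
    return rows


# Invented sample rows for illustration only.
activations = [
    {"train_id": "A", "sched_origin_stanox": "10001"},
    {"train_id": "B", "sched_origin_stanox": "99999"},
]
locations = [
    {"stanox": "10001", "description": "Example Station"},
    {"stanox": "10001", "description": "Example Station Sidings"},
]
for row in left_join_locations(activations, locations):
    print(row)
```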

In [None]:
CREATE OR REPLACE DYNAMIC ICEBERG TABLE TRAIN_ACTIVATIONS_01
  TARGET_LAG = '1 minute'
  WAREHOUSE = hol_user_{{usernum}}_wh
  EXTERNAL_VOLUME= 'iceberg_hol_silver_vol'
  CATALOG = 'SNOWFLAKE'
  BASE_LOCATION = 'hol_user_{{usernum}}/train_activations_01/'
  catalog_sync = iceberg_hol_oc_int
  AS
SELECT TA.SCHED_ORIGIN_STANOX AS SCHED_ORIGIN_STANOX,
       TA.HEADER AS HEADER,
       TA.SCHEDULE_SOURCE AS SCHEDULE_SOURCE,
       TA.TP_ORIGIN_TIMESTAMP AS TP_ORIGIN_TIMESTAMP,
       TA.SCHEDULE_TYPE AS SCHEDULE_TYPE,
       TA.CREATION_TIMESTAMP AS CREATION_TIMESTAMP,
       TA.ORIGIN_DEP_TIMESTAMP AS ORIGIN_DEP_TIMESTAMP,
       TA.TOC_ID AS TOC_ID,
       TA.D1266_RECORD_NUMBER AS D1266_RECORD_NUMBER,
       TA.TRAIN_SERVICE_CODE AS TRAIN_SERVICE_CODE,
       TA.TRAIN_UID AS TRAIN_UID,
       TA.TRAIN_CALL_MODE AS TRAIN_CALL_MODE,
       TA.SCHEDULE_START_DATE AS SCHEDULE_START_DATE,
       TA.TP_ORIGIN_STANOX AS TP_ORIGIN_STANOX,
       TA.SCHEDULE_WTT_ID AS SCHEDULE_WTT_ID,
       TA.TRAIN_CALL_TYPE AS TRAIN_CALL_TYPE,
       TA.SCHEDULE_END_DATE AS SCHEDULE_END_DATE,
       TA.TRAIN_ID AS TRAIN_ID,
       TA.SCHEDULE_KEY AS SCHEDULE_KEY,
        L.DESCRIPTION AS SCHED_ORIGIN_DESC ,
        L.LATITUDE::DOUBLE AS SCHED_ORIGIN_LAT,
        L.LONGITUDE::DOUBLE AS SCHED_ORIGIN_LON
FROM TRAIN_ACTIVATIONS_00 TA
         LEFT JOIN BRONZE.LOCATIONS_RAW L
            ON TA.sched_origin_stanox = L.STANOX;

# Creating the CANCEL_CODE_REFERENCES_00 Table

This cell creates an Iceberg table named `CANCEL_CODE_REFERENCES_00` by copying reference data from the BRONZE layer. This table provides lookup information for cancellation codes used in the event data.

In [None]:
CREATE OR REPLACE ICEBERG TABLE CANCEL_CODE_REFERENCES_00 
  EXTERNAL_VOLUME= 'iceberg_hol_silver_vol'
  CATALOG = 'SNOWFLAKE'
  BASE_LOCATION = 'hol_user_{{usernum}}/cancel_code_references_00/'
  catalog_sync = iceberg_hol_oc_int
  AS 
  SELECT * from bronze.cancel_code_references_raw;

# Next Steps

You have now created and populated the SILVER layer tables. These standardized datasets have had lightweight transformations applied and are ready for further transformation in the GOLD layer.

- Follow the guidelines in `gold/HOL_GOLD_PIPELINE.ipynb` to run the next pipeline.