# Snowflake Notebook Data Engineering

* Author: Jeremiah Hansen
* Last Updated: 2/12/2026

Welcome to the beginning of the Quickstart! Please refer to [the official Snowflake Notebook Data Engineering Quickstart](https://www.snowflake.com/en/developers/guides/data-engineering-with-notebooks/) for full details, including setup steps.

## Step 01 Setup Snowflake

During this step we will create our demo environment.

In [None]:
-- ----------------------------------------------------------------------------
-- Create the account level objects (ACCOUNTADMIN part)
-- ----------------------------------------------------------------------------
SET MY_USER = CURRENT_USER();
USE ROLE ACCOUNTADMIN;

-- Roles
CREATE OR REPLACE ROLE DEMO_ROLE;
GRANT ROLE DEMO_ROLE TO ROLE SYSADMIN;
GRANT ROLE DEMO_ROLE TO USER IDENTIFIER($MY_USER);

GRANT CREATE INTEGRATION ON ACCOUNT TO ROLE DEMO_ROLE;
GRANT EXECUTE TASK ON ACCOUNT TO ROLE DEMO_ROLE;
GRANT EXECUTE MANAGED TASK ON ACCOUNT TO ROLE DEMO_ROLE;
GRANT MONITOR EXECUTION ON ACCOUNT TO ROLE DEMO_ROLE;
GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO ROLE DEMO_ROLE;

-- Databases
CREATE OR REPLACE DATABASE DEMO_DB;
GRANT OWNERSHIP ON DATABASE DEMO_DB TO ROLE DEMO_ROLE;

-- Warehouses
CREATE OR REPLACE WAREHOUSE DEMO_WH WAREHOUSE_SIZE = XSMALL, AUTO_SUSPEND = 300, AUTO_RESUME = TRUE;
GRANT OWNERSHIP ON WAREHOUSE DEMO_WH TO ROLE DEMO_ROLE;


-- ----------------------------------------------------------------------------
-- Create the database level objects
-- ----------------------------------------------------------------------------
USE ROLE DEMO_ROLE;
USE WAREHOUSE DEMO_WH;
USE DATABASE DEMO_DB;

-- Schemas
CREATE OR REPLACE SCHEMA INTEGRATIONS;
CREATE OR REPLACE SCHEMA DEV_SCHEMA;
CREATE OR REPLACE SCHEMA PROD_SCHEMA;

USE SCHEMA INTEGRATIONS;

-- External Frostbyte objects
CREATE OR REPLACE STAGE FROSTBYTE_RAW_STAGE
    URL = 's3://sfquickstarts/data-engineering-with-snowpark-python/'
;


-- ----------------------------------------------------------------------------
-- Create the event table
-- ----------------------------------------------------------------------------
USE ROLE ACCOUNTADMIN;

CREATE EVENT TABLE DEMO_DB.INTEGRATIONS.DEMO_EVENTS;
GRANT SELECT ON EVENT TABLE DEMO_DB.INTEGRATIONS.DEMO_EVENTS TO ROLE DEMO_ROLE;
GRANT INSERT ON EVENT TABLE DEMO_DB.INTEGRATIONS.DEMO_EVENTS TO ROLE DEMO_ROLE;

ALTER ACCOUNT SET EVENT_TABLE = DEMO_DB.INTEGRATIONS.DEMO_EVENTS;
ALTER DATABASE DEMO_DB SET LOG_LEVEL = INFO;


-- ----------------------------------------------------------------------------
-- Set our new context
-- ----------------------------------------------------------------------------
USE ROLE DEMO_ROLE;
USE WAREHOUSE DEMO_WH;
USE SCHEMA DEMO_DB.INTEGRATIONS;

In [None]:
%%sql -r dataframe_9
USE ROLE DEMO_ROLE;
USE WAREHOUSE DEMO_WH;
USE SCHEMA DEMO_DB.INTEGRATIONS;

## Step 02 Load Weather

But what about data that needs constant updating - like the WEATHER data? We would need to build a pipeline process to constantly update that data to keep it fresh.

Perhaps a better way to get this external data would be to source it from a trusted data supplier. Let them manage the data, keeping it accurate and up to date.

Enter the Snowflake Data Cloud...

Weather Source is a leading provider of global weather and climate data. Its OnPoint Product Suite gives businesses the weather and climate data they need to quickly generate meaningful, actionable insights for a wide range of use cases across industries. Let's connect to the "Weather Source LLC: frostbyte" feed from Weather Source in the Snowflake Marketplace by following these steps in Snowsight:

* In the left navigation bar click on "Marketplace" and then "Snowflake Marketplace"
* Search for "Weather Source LLC: frostbyte" and click on its tile in the results
* Click the blue "Get" button
* Under "Options", adjust the Database name to read "FROSTBYTE_WEATHERSOURCE" (all capital letters)
* Grant to "DEMO_ROLE"

That's it... we don't have to do anything from here to keep this data updated. The provider will do that for us, and data sharing means we are always seeing whatever they have published.

In [None]:
%%sql -r dataframe_5
/*---
-- You can also do it via code if you know the account/share details...
SET WEATHERSOURCE_ACCT_NAME = '*** PUT ACCOUNT NAME HERE AS PART OF DEMO SETUP ***';
SET WEATHERSOURCE_SHARE_NAME = '*** PUT ACCOUNT SHARE HERE AS PART OF DEMO SETUP ***';
SET WEATHERSOURCE_SHARE = $WEATHERSOURCE_ACCT_NAME || '.' || $WEATHERSOURCE_SHARE_NAME;

CREATE OR REPLACE DATABASE FROSTBYTE_WEATHERSOURCE
  FROM SHARE IDENTIFIER($WEATHERSOURCE_SHARE);

GRANT IMPORTED PRIVILEGES ON DATABASE FROSTBYTE_WEATHERSOURCE TO ROLE DEMO_ROLE;
---*/

In [None]:
%%sql -r dataframe_6
-- Let's look at the data - same 3-part naming convention as any other table
SELECT * FROM FROSTBYTE_WEATHERSOURCE.ONPOINT_ID.POSTAL_CODES LIMIT 100;

## Step 03 Load Location and Order Detail

Please follow the instructions in the [Load Location and Order Details section of the Quickstart](https://www.snowflake.com/en/developers/guides/data-engineering-with-notebooks/#load-location-and-order-detail) to open and run the `01_load_excel_files` Notebook. That Notebook will define the pipeline used to load data into the `LOCATION` and `ORDER_DETAIL` tables from the staged Excel files.

## Step 04 Load Daily City Metrics

Please follow the instructions in the [Load Daily City Metrics section of the Quickstart](https://www.snowflake.com/en/developers/guides/data-engineering-with-notebooks/#load-daily-city-metrics) to open and run the `02_load_daily_city_metrics` Notebook. That Notebook will define the pipeline used to create the `DAILY_CITY_METRICS` table.

After running the two notebooks above, we can inspect the logs that were written to [the Snowflake Event table](https://docs.snowflake.com/en/developer-guide/logging-tracing/event-table-setting-up). The query below will help get us started.

In [None]:
%%sql -r dataframe_7
USE ROLE DEMO_ROLE;
USE WAREHOUSE DEMO_WH;
USE SCHEMA DEMO_DB.INTEGRATIONS;

SELECT 
    TIMESTAMP,
    VALUE AS LOG_MESSAGE,
    RESOURCE_ATTRIBUTES:"snow.service.name"::string AS SERVICE_NAME,
    RECORD_ATTRIBUTES:"severity_text"::string AS SEVERITY
FROM DEMO_DB.INTEGRATIONS.DEMO_EVENTS
WHERE RECORD_TYPE = 'LOG'
--  AND RESOURCE_ATTRIBUTES:"snow.service.name" = 'NOTEBOOK_SERVICE'
  AND RESOURCE_ATTRIBUTES:"snow.service.name" NOT IN ('OPENFLOW', 'MYPOSTGRESCDC')
  AND TIMESTAMP > DATEADD(hour, -1, CURRENT_TIMESTAMP())
ORDER BY TIMESTAMP DESC
LIMIT 100;
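
As a hedged sketch, the same query could be generated from Python so the lookback window and row limit are easy to vary. The helper name `build_log_query` and its parameters are assumptions made for illustration; only the table and column names come from the query above, and the service-name filter is left out for brevity. The function only builds SQL text. Running it (for example through a Snowpark session) is left to the reader.

```python
import re

# Three dot-separated unquoted identifiers, e.g. DB.SCHEMA.TABLE
_TABLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*(\.[A-Za-z_][A-Za-z0-9_$]*){2}")


def build_log_query(event_table="DEMO_DB.INTEGRATIONS.DEMO_EVENTS",
                    lookback_hours=1, limit=100):
    """Build the log-inspection SQL as a string (hypothetical helper)."""
    # Reject anything that is not a plain 3-part name, since it is interpolated into SQL.
    if not _TABLE_NAME.fullmatch(event_table):
        raise ValueError(f"unexpected table name: {event_table!r}")
    # Coerce to int so only whole numbers end up in the SQL text.
    lookback_hours, limit = int(lookback_hours), int(limit)
    if lookback_hours <= 0 or limit <= 0:
        raise ValueError("lookback_hours and limit must be positive")
    return (
        "SELECT\n"
        "    TIMESTAMP,\n"
        "    VALUE AS LOG_MESSAGE,\n"
        '    RESOURCE_ATTRIBUTES:"snow.service.name"::string AS SERVICE_NAME,\n'
        '    RECORD_ATTRIBUTES:"severity_text"::string AS SEVERITY\n'
        f"FROM {event_table}\n"
        "WHERE RECORD_TYPE = 'LOG'\n"
        f"  AND TIMESTAMP > DATEADD(hour, -{lookback_hours}, CURRENT_TIMESTAMP())\n"
        "ORDER BY TIMESTAMP DESC\n"
        f"LIMIT {limit};"
    )


# Example: look back six hours and return at most 20 rows.
print(build_log_query(lookback_hours=6, limit=20))
```

Widening the lookback window is useful when the notebooks were run more than an hour ago and the default query returns no rows.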

## Step 05 Deploy Dev Notebook Project

During this step we will be deploying the dev versions of our two data engineering Notebooks: `01_load_excel_files` and `02_load_daily_city_metrics`. To deploy notebooks to Snowflake we will create a `NOTEBOOK PROJECT` object in our `DEV_SCHEMA`, along with the other resources in our development environment.

Notebook Project Objects (NPOs) are the key to deploying and scheduling notebooks in production with Workspace Notebooks. An NPO is a schema-level object that encapsulates your notebook files and their dependencies, making them ready for scheduled execution.

This code is contained in the `scripts/deploy_notebooks.py` script, which is used below and again later in the CI/CD deployment pipeline. Please open and review the contents of the script.

In [None]:
%run scripts/deploy_notebooks.py DEMO_DB DEV_SCHEMA DEMO_PIPELINES_NP ./notebooks
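
To make the four positional arguments in the `%run` call easier to read, here is a minimal sketch of how a script like this might parse them. This is not the contents of `scripts/deploy_notebooks.py`: the argument names (`database`, `schema`, `project_name`, `notebooks_dir`) and both helper functions are assumptions inferred from the values passed above, so please check the real script for the actual behavior.

```python
import argparse


def parse_deploy_args(argv):
    """Parse the four positional arguments (names are hypothetical)."""
    parser = argparse.ArgumentParser(
        description="Illustrative CLI mirroring the %run call above"
    )
    parser.add_argument("database")       # e.g. DEMO_DB
    parser.add_argument("schema")         # e.g. DEV_SCHEMA
    parser.add_argument("project_name")   # e.g. DEMO_PIPELINES_NP
    parser.add_argument("notebooks_dir")  # e.g. ./notebooks
    return parser.parse_args(argv)


def qualified_project_name(args):
    """Join database, schema and project into a 3-part object name."""
    return f"{args.database}.{args.schema}.{args.project_name}"


args = parse_deploy_args(["DEMO_DB", "DEV_SCHEMA", "DEMO_PIPELINES_NP", "./notebooks"])
print(qualified_project_name(args))  # prints DEMO_DB.DEV_SCHEMA.DEMO_PIPELINES_NP
```

Keeping the database and schema as arguments is what would let the same script target `DEV_SCHEMA` here and `PROD_SCHEMA` later.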

## Step 06 Orchestrate Pipelines

In this step we will create a DAG (or Directed Acyclic Graph) of Tasks using the new [Snowflake Python Management API](https://docs.snowflake.com/en/developer-guide/snowflake-python-api/snowflake-python-overview). The Task DAG API builds upon the Python Management API to provide advanced Task management capabilities. For more details see [Managing Snowflake tasks and task graphs with Python](https://docs.snowflake.com/en/developer-guide/snowflake-python-api/snowflake-python-managing-tasks).

This code is contained in the `scripts/deploy_task_dag.py` script, which is used below and again later in the CI/CD deployment pipeline. Please open and review the contents of the script.


In [None]:
%run scripts/deploy_task_dag.py DEMO_DB DEV_SCHEMA DEMO_PIPELINES_NP

In a new tab, switch to the Horizon Catalog, then open and review the new Task DAG that was created. Next, manually execute the tasks: open the root `DEMO_DAG` task, click on the "Graph" tab, and then click on the "Run Task Graph" play button above the diagram. If a warehouse is not already set, you may have to pick one to do this.
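
To illustrate what the "directed acyclic" part of a task graph guarantees, here is a small sketch in plain Python using the standard library's `graphlib` module (Python 3.9+). It does not use the Snowflake Python API and is not what `scripts/deploy_task_dag.py` does. The task names and the single dependency between them are assumptions for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each task maps to the set of tasks it depends on.
dependencies = {
    "LOAD_EXCEL_FILES": set(),
    "LOAD_DAILY_CITY_METRICS": {"LOAD_EXCEL_FILES"},
}


def execution_order(deps):
    """Return one valid run order; raises graphlib.CycleError if deps contain a cycle."""
    return list(TopologicalSorter(deps).static_order())


print(execution_order(dependencies))
# prints ['LOAD_EXCEL_FILES', 'LOAD_DAILY_CITY_METRICS']
```

Because the graph has no cycles, a valid order always exists in which every task runs only after the tasks it depends on have finished.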

## Step 07 Deploy to Production

Steps
1. Make a small change to a notebook and commit it to the `dev` branch
1. In GitHub, create a PR and merge it into the `main` branch
1. Review the GitHub Actions workflow definition and run results
1. See the new production versions of the Notebook Project object and Task DAG in the `PROD_SCHEMA`
1. Run the production version of the Task DAG and see the new tables created!
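
As a rough sketch of the idea behind these steps, a pipeline could choose the target schema from the branch and then invoke the same two deployment scripts used earlier. Everything here is an assumption for illustration: the branch-to-schema mapping, the `deployment_commands` helper, and the use of `python` to launch the scripts. The GitHub Actions workflow in the repository is the source of truth for what actually runs.

```python
# Hypothetical mapping from Git branch to target schema.
BRANCH_TO_SCHEMA = {"dev": "DEV_SCHEMA", "main": "PROD_SCHEMA"}


def deployment_commands(branch, database="DEMO_DB",
                        project="DEMO_PIPELINES_NP",
                        notebooks_dir="./notebooks"):
    """Return the two script invocations for a branch as argument lists."""
    try:
        schema = BRANCH_TO_SCHEMA[branch]
    except KeyError:
        raise ValueError(f"no deployment target configured for branch {branch!r}") from None
    return [
        ["python", "scripts/deploy_notebooks.py", database, schema, project, notebooks_dir],
        ["python", "scripts/deploy_task_dag.py", database, schema, project],
    ]


# Show what a merge to main would run under these assumptions.
for cmd in deployment_commands("main"):
    print(" ".join(cmd))
```

Only the schema argument changes between environments in this sketch, which is why the dev and production deployments can share the same scripts.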

## Step 08 Teardown

Finally, we will tear down our demo environment.

In [None]:
%%sql -r dataframe_8
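USE ROLE ACCOUNTADMIN;

-- Stop pointing the account at the demo event table before its database is dropped
ALTER ACCOUNT UNSET EVENT_TABLE;

DROP DATABASE IF EXISTS DEMO_DB;
DROP WAREHOUSE IF EXISTS DEMO_WH;
DROP ROLE IF EXISTS DEMO_ROLE;

-- Drop the database created from the weather share
DROP DATABASE IF EXISTS FROSTBYTE_WEATHERSOURCE;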

-- Remove the "dev" branch in your repo