Example of implementing data pipelines for ingesting third-party data into a Snowflake database
The code corresponds to my article *Creating Data Pipelines in Airflow*. The best way to understand the code and its purpose is to go through the article.
- Access to the NBA_ELO statistics file
- Access to Snowflake (there is a free trial option available)
- Clone the repository
- Download the NBA file and save it as `nba_elo.csv` in the `files\staging` subfolder
- Connect to Snowflake and run `create_database_objects.sql` (a sketch of running the script from Python is shown after this list)
- From the `data_pipelines` folder execute `docker compose up`
- Log in to https://localhost:8080 (username: airflow, password: airflow)
- Create a Snowflake connection named `sf1`
- Enable the DAG
- Move or copy the `nba_elo.csv` file from the `staging` subfolder into the `files` folder
- Observe the data pipeline running and loading the data into the database (a minimal DAG sketch is shown below)
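
The database objects can be created in a Snowflake worksheet, but the DDL script can also be run from a terminal. Below is a minimal sketch, assuming `snowflake-connector-python` is installed; the account, user, and password values are placeholders, and the contents of `create_database_objects.sql` come from this repository.

```python
# Minimal sketch: run create_database_objects.sql with the Snowflake Python connector.
# Account/user/password are placeholders -- use your own Snowflake (trial) credentials.
import snowflake.connector

with open("create_database_objects.sql") as f:
    ddl = f.read()

conn = snowflake.connector.connect(
    account="<account_identifier>",
    user="<user>",
    password="<password>",
)
try:
    # execute_string runs each statement in the script in order
    conn.execute_string(ddl)
finally:
    conn.close()
```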
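
For orientation, here is a minimal sketch of what such a file-triggered load could look like. It is not the repository's actual DAG (see the article for that): the stage name, table name, and container paths are illustrative assumptions, and it requires the `apache-airflow-providers-snowflake` package plus the `sf1` connection created above.

```python
from datetime import datetime

from airflow import DAG
from airflow.sensors.filesystem import FileSensor
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

with DAG(
    dag_id="nba_elo_pipeline_sketch",
    start_date=datetime(2023, 1, 1),
    schedule=None,   # triggered manually; the real DAG may use a schedule instead
    catchup=False,
) as dag:
    # Wait for nba_elo.csv to be moved from the staging subfolder into the files folder
    # (assumes the files folder is mounted at /opt/airflow/files inside the container)
    wait_for_file = FileSensor(
        task_id="wait_for_nba_elo_csv",
        fs_conn_id="fs_default",
        filepath="/opt/airflow/files/nba_elo.csv",
        poke_interval=30,
    )

    # Stage the file and copy it into a table, using the sf1 connection;
    # nba_stage and nba_elo are assumed names, not taken from create_database_objects.sql
    load_to_snowflake = SnowflakeOperator(
        task_id="load_nba_elo",
        snowflake_conn_id="sf1",
        sql=[
            "PUT file:///opt/airflow/files/nba_elo.csv @nba_stage OVERWRITE = TRUE",
            "COPY INTO nba_elo FROM @nba_stage FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)",
        ],
    )

    wait_for_file >> load_to_snowflake
```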