
projecttwo

DE Bootcamp Project Two

Exchange Rates ETL with Airbyte, DBT & Snowflake

Project plan

Objective

What would you like people to do with the data you have produced? Are you supporting BI or ML use-cases?

The goal is to provide exchange rate data for BI purposes.

Consumers

What users would find your data useful?

Investment bankers.

Questions

What questions are you trying to solve with your data?

  1. 7-day moving average of different currencies, for example AUD, USD, GBP and INR.
  2. Daily percentage change in exchange rates.
  3. General price trends (e.g. around economic events).

Source datasets

What datasets are you sourcing from?

Exchange Rates API

Architecture

(architecture diagram)

Project structure

├── AirflowDAGs
├── Data Integration
├── docker
├── ExchangeRates
│   ├── analyses
│   ├── dbt_packages
│   │   └── dbt_utils
│   │       ├── docs
│   │       │   └── decisions
│   │       ├── etc
│   │       ├── integration_tests
│   │       │   ├── ci
│   │       │   ├── data
│   │       │   │   ├── cross_db
│   │       │   │   ├── datetime
│   │       │   │   ├── etc
│   │       │   │   ├── geo
│   │       │   │   ├── materializations
│   │       │   │   ├── schema_tests
│   │       │   │   ├── sql
│   │       │   │   └── web
│   │       │   ├── macros
│   │       │   ├── models
│   │       │   │   ├── cross_db_utils
│   │       │   │   ├── datetime
│   │       │   │   ├── generic_tests
│   │       │   │   ├── geo
│   │       │   │   ├── materializations
│   │       │   │   ├── sql
│   │       │   │   └── web
│   │       │   └── tests
│   │       │       ├── generic
│   │       │       ├── jinja_helpers
│   │       │       └── sql
│   │       ├── macros
│   │       │   ├── cross_db_utils
│   │       │   │   └── deprecated
│   │       │   ├── generic_tests
│   │       │   ├── jinja_helpers
│   │       │   ├── materializations
│   │       │   ├── sql
│   │       │   └── web
│   │       └── tests
│   │           └── functional
│   │               ├── cross_db_utils
│   │               └── data_type
│   ├── logs
│   ├── macros
│   ├── models
│   │   ├── Gold
│   │   └── Silver
│   ├── seeds
│   ├── snapshots
│   └── tests
└── logs

Extract and Load

The Exchange Rates API connector is available in Airbyte. The data is extracted using API credentials and loaded into the Bronze layer in Snowflake. There are 3 raw tables containing similar data. These are:

  1. Exchange_Rates
  2. Exchange_Rates_Rates
  3. Exchange_Rates_SCD

The Exchange_Rates_SCD table contains data such as base currency, date and rates at a daily level.
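
For reference, a minimal sketch of what the raw payload looks like, assuming the exchangeratesapi.io endpoint that Airbyte's connector wraps (the endpoint, parameters and key here are placeholders, not the project's actual configuration):

import requests

# Illustrative only: Airbyte handles the actual extraction. This just shows the
# shape of the raw data that lands in Bronze. access_key is a placeholder for
# whatever credentials the Airbyte connector is configured with.
resp = requests.get(
    "http://api.exchangeratesapi.io/v1/latest",
    params={"access_key": "<API_KEY>", "symbols": "AUD,USD,GBP,INR"},
)
payload = resp.json()
# e.g. {"base": "EUR", "date": "2022-11-21",
#       "rates": {"AUD": 1.54, "USD": 1.03, "GBP": 0.87, "INR": 84.6}}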

(screenshot: Bronze tables)

Transformation

Silver

The Exchange_Rates_SCD table's data is incrementally ingested into the Silver layer at a daily interval. Here the prices are unpacked from their JSON format and stored in individual columns for the required currencies. This table contains columns such as Base, Date, AUD_Daily_Price and BTC_Daily_Price.
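
The actual unpacking lives in the dbt models under ExchangeRates/models/Silver; the sketch below illustrates the equivalent logic in plain Python, with field names assumed from the description above:

import json

# One Bronze record: the day's rates arrive as a JSON document keyed by currency.
bronze_row = {"base": "EUR", "date": "2022-11-21",
              "rates": '{"AUD": 1.54, "USD": 1.03, "GBP": 0.87, "BTC": 0.000063}'}

rates = json.loads(bronze_row["rates"])

# Unpack the required currencies into individual Silver columns.
silver_row = {
    "base": bronze_row["base"],
    "date": bronze_row["date"],
    "aud_daily_price": rates.get("AUD"),
    "btc_daily_price": rates.get("BTC"),
}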

(screenshot: Silver table)

Gold

The Gold layer table is a full-refresh table. The transformations done at this stage, sketched below, are:

  1. All currencies other than AUD were dropped.
  2. A new column was created for the 7-day rolling average.
  3. Another new column was created to calculate daily price changes.
  4. Only active records are used for the calculations (where row_active_ind = 1).
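
For illustration, the same calculations sketched in Python with pandas (the actual model is dbt SQL on Snowflake; the column names are assumptions based on the Silver description):

import pandas as pd

# Assume a Silver frame with one active row per day for AUD.
silver = pd.DataFrame({
    "date": pd.date_range("2022-11-14", periods=8, freq="D"),
    "aud_daily_price": [1.54, 1.55, 1.53, 1.56, 1.57, 1.55, 1.54, 1.58],
    "row_active_ind": [1, 1, 1, 1, 1, 1, 1, 1],
})

gold = silver[silver["row_active_ind"] == 1].copy()  # active records only

# 7-day rolling average (NaN until 7 observations are available).
gold["aud_7d_moving_avg"] = gold["aud_daily_price"].rolling(7).mean()

# Day-over-day price change, expressed as a percentage.
gold["aud_daily_pct_change"] = gold["aud_daily_price"].pct_change() * 100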

(screenshot: Gold table)

DBT Docs

(screenshot: dbt lineage graph)

Technical Details

Extraction Source: Exchange Rates API
Destination: Snowflake
Data integration tool: Airbyte
Transformations: Snowflake
Creation of dependencies: DBT
Scheduling of tasks: Airflow

DAG

The Airflow DAG orchestrates the ingestion: it queries the API and loads the data into Snowflake (via a connection to Airbyte), then executes the ExchangeRates DBT project on the Snowflake data to promote it to Silver & Gold.

DAG Instructions

Fill in the blanks in the code as indicated in the file; see the example below for guidance. Add dag_exchangerates.py to your Airflow DAG folder.

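A minimal sketch of what the filled-in dag_exchangerates.py could look like, assuming Airflow's Airbyte provider and a BashOperator for dbt; the connection ID, dbt project path and schedule are placeholders:

from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.providers.airbyte.operators.airbyte import AirbyteTriggerSyncOperator

with DAG(
    dag_id="dag_exchangerates",
    start_date=datetime(2022, 11, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Trigger the Airbyte sync that lands the Exchange Rates API data in Bronze.
    airbyte_sync = AirbyteTriggerSyncOperator(
        task_id="airbyte_exchange_rates_sync",
        airbyte_conn_id="airbyte_default",        # Airflow connection pointing at the Airbyte server
        connection_id="<AIRBYTE_CONNECTION_ID>",  # fill in from the Airbyte UI
    )

    # Run the ExchangeRates DBT project to promote the data to Silver & Gold.
    dbt_run = BashOperator(
        task_id="dbt_run_exchangerates",
        bash_command="cd /path/to/ExchangeRates && dbt run",
    )

    airbyte_sync >> dbt_run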

DAG Assumptions

  • A working Airflow environment with DBT set up and working as a sub-environment.
  • Airbyte running locally, or dockerized and local, on port 8000. If dockerized locally, set server=host.docker.internal; if hosted locally, use server=localhost; if cloud-hosted, use the appropriate IP address for your configuration.

Airflow DAG Results

(screenshot: Airflow DAG run results)

Hosting

Snowflake.

Lessons Learnt

  1. Make sure your Docker images are not cooked.
