Spotify API, Airflow, Docker, AWS S3, Snowflake, dbt, localstack, Looker Studio

salimt/Spotify-API-Pipeline

Spotify ELT (extract, load, and transform) Pipeline

A data pipeline that extracts Spotify data from a playlist created by students.

The output is a Google Looker Studio (formerly Google Data Studio) report that provides insight into track features and preferences.

Motivation

This project was a good opportunity to develop skills and gain experience with a range of tools. As such, it is more complex than strictly required: it uses dbt, Airflow, Docker, and cloud-based storage, with localstack for local testing.

Architecture

  1. Extract data using Spotify API
  2. Simulate AWS S3 locally for testing with localstack
  3. Load into AWS S3
  4. Copy into Snowflake
  5. Transform using dbt
  6. Create Google Looker Studio Dashboard
  7. Orchestrate with Airflow in Docker
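
The extract and load steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names, the chosen columns, and the CSV file format are all assumptions. It flattens the JSON returned by Spotify's playlist-tracks endpoint into rows, serialises them to CSV, and uploads the result with boto3. Passing `endpoint_url="http://localhost:4566"` (localstack's default edge port) points the same upload at a local S3 emulation instead of AWS.

```python
import csv
import io

# Columns kept from each track; this selection is an assumption for illustration.
FIELDS = ["track_id", "track_name", "artists", "album", "popularity", "duration_ms"]


def flatten_playlist_items(payload):
    """Turn a Spotify playlist-tracks response (a paging object) into flat rows."""
    rows = []
    for item in payload.get("items", []):
        track = item.get("track")
        if not track:  # unavailable tracks can come back as null
            continue
        rows.append({
            "track_id": track["id"],
            "track_name": track["name"],
            "artists": "; ".join(a["name"] for a in track.get("artists", [])),
            "album": track.get("album", {}).get("name", ""),
            "popularity": track.get("popularity"),
            "duration_ms": track.get("duration_ms"),
        })
    return rows


def rows_to_csv(rows):
    """Serialise rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def upload_csv(csv_text, bucket, key, endpoint_url=None):
    """Upload CSV text to S3, or to localstack when endpoint_url is set.

    The bucket and key are hypothetical; credentials come from the environment.
    """
    import boto3  # imported here so the pure helpers above work without boto3

    s3 = boto3.client("s3", endpoint_url=endpoint_url)
    s3.put_object(Bucket=bucket, Key=key, Body=csv_text.encode("utf-8"))
```

Keeping the flattening and CSV steps free of network calls means they can be checked locally without Spotify or AWS credentials; only `upload_csv` needs a running localstack container or a real AWS account.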

Output

  • Final output from Google Looker Studio. Link here. Note that the dashboard reads from a static CSV exported from Snowflake.

Clone using the web URL

NOTE: This was developed on Windows 10. On Mac or Linux, you may need to amend certain components if you run into issues.

git clone https://github.com/salimt/Spotify-API-Pipeline.git
cd Spotify-API-Pipeline
