ETL Music Streaming app

This repository contains examples of ETL pipelines and database designs built with PostgreSQL, Cassandra, AWS Redshift, and Spark. Each folder contains the code and data necessary to create and populate the corresponding database.

Files/Folders in this repository

PostgreSQL: Folder containing the SQL-oriented ETL pipeline.
Cassandra: Folder containing the NoSQL ETL pipeline.
AWS_Redshift: Folder containing the AWS Redshift-based ETL pipeline.
Spark: Folder containing the Spark-based ETL pipeline.

Dataset used

The dataset used is a subset of real data from the Million Song Dataset. Each file is in JSON/CSV format and contains metadata about a song and the artist of that song. The files are partitioned by the first three letters of each song's track ID.
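Because the files are spread across nested partition directories, a loader has to walk the tree rather than read a single folder. The following is a minimal sketch of that step, not the code used in this repository. It assumes each `.json` file holds exactly one JSON object; the function name `load_song_files` and the directory passed to it are hypothetical.

```python
import json
from pathlib import Path


def load_song_files(root):
    """Recursively read every .json file under root, one record per file.

    Assumption: each file contains a single JSON object describing one song
    and its artist. The CSV files are not handled by this sketch.
    """
    records = []
    # sorted() keeps the order deterministic regardless of the filesystem.
    for path in sorted(Path(root).rglob("*.json")):
        with path.open(encoding="utf-8") as fh:
            records.append(json.load(fh))
    return records
```

Calling `load_song_files("data/song_data")` (a placeholder path) would return a list of dictionaries, one per song file, ready to be inserted into whichever database a given pipeline targets.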


mcamarad/ETL_music_streaming_app
