Hi!

Welcome to my data modeling and PostgreSQL ETL mini-project!

In this project we create a relational star-schema database for Sparkify.

What is Sparkify?

"Sparkify is a start-up that wants to analyze the data they've been collecting on songs and user activity on their new music streaming app. The analytics team is particularly interested in understanding what songs users are listening to. Currently, they don't have an easy way to query their data, which resides in a directory of JSON logs on user activity on the app, as well as a directory with JSON metadata on the songs in their app."

So you will ask...

OK Antonis, how do I create the database?

Well, it's easy!

0) Create a Redshift cluster and add to dwh.cfg the fields needed to connect it to our S3 bucket

1) Create the tables by running create_tables.py

2) Load the data and build the star schema database by running etl.py
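
As a rough illustration of step 0, here is a minimal sketch of how the scripts might read dwh.cfg and build a connection string. The section name `CLUSTER`, the key names, and all sample values are assumptions made for this example; they are not taken from the actual dwh.cfg in this repository.

```python
import configparser

# Hypothetical dwh.cfg contents. The section name, key names, and values
# are placeholders for illustration only.
SAMPLE_CFG = """
[CLUSTER]
HOST = example-cluster.abc123.us-west-2.redshift.amazonaws.com
DB_NAME = dev
DB_USER = awsuser
DB_PASSWORD = change-me
DB_PORT = 5439
"""


def build_dsn(cfg_text: str) -> str:
    """Parse config text and return a libpq-style connection string."""
    config = configparser.ConfigParser()
    config.read_string(cfg_text)
    cluster = config["CLUSTER"]
    return "host={} dbname={} user={} password={} port={}".format(
        cluster["HOST"],
        cluster["DB_NAME"],
        cluster["DB_USER"],
        cluster["DB_PASSWORD"],
        cluster["DB_PORT"],
    )


dsn = build_dsn(SAMPLE_CFG)
print(dsn)
```

In the real scripts you would read the file from disk with `config.read("dwh.cfg")` and hand the resulting string to a PostgreSQL-compatible driver (for example `psycopg2.connect(dsn)`, assuming psycopg2 is the driver in use).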

Code Description:

create_tables.py drops and recreates the tables. Run it to reset the tables before each run of the ETL script.
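
The drop-then-create pattern can be sketched as follows. This is not the actual create_tables.py: the table names and columns are hypothetical, and an in-memory SQLite database stands in for Redshift so the snippet runs anywhere.

```python
import sqlite3

# Hypothetical query lists; in the project they would come from sql_queries.py.
drop_table_queries = [
    "DROP TABLE IF EXISTS staging_events",
    "DROP TABLE IF EXISTS staging_songs",
]
create_table_queries = [
    "CREATE TABLE IF NOT EXISTS staging_events (artist TEXT, song TEXT)",
    "CREATE TABLE IF NOT EXISTS staging_songs (song_id TEXT, title TEXT)",
]


def reset_tables(conn):
    """Drop every table, then recreate it empty."""
    cur = conn.cursor()
    for query in drop_table_queries + create_table_queries:
        cur.execute(query)
    conn.commit()


# SQLite is only a stand-in here; the real target is the Redshift cluster.
conn = sqlite3.connect(":memory:")
reset_tables(conn)
conn.execute("INSERT INTO staging_songs VALUES ('S1', 'Demo song')")
reset_tables(conn)  # a second run wipes the row inserted above
remaining = conn.execute("SELECT COUNT(*) FROM staging_songs").fetchone()[0]
print(remaining)
```

Because every run starts by dropping the tables, the script is safe to repeat: each run of the ETL starts from an empty schema.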

etl.py loads the JSON data from the S3 bucket, stages it in two tables (an events table and a songs table), then populates the star schema tables from the staged data.
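
The two phases (stage, then transform) might look like the sketch below. It only builds the SQL text; the table names, the S3 paths, the IAM role ARN, and the `songs` insert are placeholders assumed for illustration, not the queries in sql_queries.py.

```python
def build_copy_sql(table: str, s3_path: str, iam_role_arn: str,
                   json_option: str = "auto") -> str:
    """Build a Redshift COPY statement that loads JSON files from S3."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_path}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS JSON '{json_option}';"
    )


# Placeholder values; substitute the bucket and role configured in dwh.cfg.
ROLE = "arn:aws:iam::123456789012:role/example-redshift-role"

# Phase 1: copy the raw JSON into the two staging tables.
copy_table_queries = [
    build_copy_sql("staging_events", "s3://example-bucket/log_data", ROLE),
    build_copy_sql("staging_songs", "s3://example-bucket/song_data", ROLE),
]

# Phase 2: fill the star schema from the staged rows (one hypothetical example).
insert_table_queries = [
    "INSERT INTO songs (song_id, title) "
    "SELECT DISTINCT song_id, title FROM staging_songs;",
]

# The ETL script would execute the statements in this order.
etl_steps = copy_table_queries + insert_table_queries
print(etl_steps[0])
```

Staging first keeps the COPY step a plain bulk load, and leaves all deduplication and reshaping to ordinary SQL that runs inside the warehouse.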

sql_queries.py contains all the SQL queries and is imported into the two scripts above. README.md (this file) describes the project.

Check out the ETL sketch of the project (ETL.png)

Check out the star schema of our loaded database, built after staging as part of the DWH (Schema_sql.png)

AntonisCSt/Aws_ETL_Sparkify
