GitHub - erdemah/NanoDegree_DataEngineering_DataModelling_Postgres: Udacity Nano Degree Data Engineering Program

This project includes data modelling and an ETL (Extract, Transform, Load) job for song data and log data. The purpose of the project is to analyze data easily coming from the new music streaming app called Sparkify. Any insight from such analyses is important to businesses which can help them understand their customers better, provide better services. To make data analysis and reporting easier, the data are pipelined through a Postgres database. During this job, the Postgres database model has been created using a star schema with a total of 5 tables - 1 Fact, 4 dimensional tables.

sql_queries.py: PostgreSQL drop, create, insert tables. Data modelling occurs here.
create_tables.py: A wrapper built on top sql_queries.py to make the database console commands easier.
test.ipynb: Every time you manipulate the database, run this file to check if the code is doing what is supposed to be doing.Restart Jupyter Kernel for cleanup in every run.
etl.ipynb: A prototype of etl.py
etl.py: Data is extracted, transformed and loaded into PostgreSQL database.

Build Environment: Python 3.7
PostgreSQL 12.2

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

About

Uh oh!

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
data		data
README.md		README.md
create_tables.py		create_tables.py
etl.ipynb		etl.ipynb
etl.py		etl.py
sql_queries.py		sql_queries.py
test.ipynb		test.ipynb

erdemah/NanoDegree_DataEngineering_DataModelling_Postgres

Folders and files

Latest commit

History

Repository files navigation

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages