Projects done as part of the Udacity Data Engineering Nanodegree program.

ramapinnimty/Udacity-DataEngineering-Nanodegree

Udacity Data Engineering Nanodegree

Projects done as part of the Data Engineering Nanodegree program offered by Udacity.


Developed a SQL database using PostgreSQL to model user activity data for a music streaming app.

  • Created a relational database using PostgreSQL locally.
  • Designed a star schema with fact and dimension tables and normalized the tables.
  • Built an ETL pipeline that loads the data into these tables, so that queries about which songs users are listening to run efficiently.
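
As a rough illustration of the star schema and ETL steps listed above, here is a minimal sketch. It uses Python's built-in `sqlite3` as a stand-in for PostgreSQL so that it runs without a server. The table names (`users`, `songs`, `songplays`), the columns, and the sample events are all hypothetical and are not taken from the project itself.

```python
import sqlite3

# Hypothetical activity events; the real project reads its own log files.
EVENTS = [
    {"user_id": 1, "first_name": "Ada", "song_id": "S1", "title": "Song A", "ts": 1000},
    {"user_id": 1, "first_name": "Ada", "song_id": "S2", "title": "Song B", "ts": 1060},
    {"user_id": 2, "first_name": "Lin", "song_id": "S1", "title": "Song A", "ts": 1120},
]

# Two dimension tables (users, songs) and one fact table (songplays).
SCHEMA = """
CREATE TABLE users (user_id INTEGER PRIMARY KEY, first_name TEXT NOT NULL);
CREATE TABLE songs (song_id TEXT PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE songplays (
    songplay_id INTEGER PRIMARY KEY,
    ts INTEGER NOT NULL,
    user_id INTEGER NOT NULL REFERENCES users (user_id),
    song_id TEXT NOT NULL REFERENCES songs (song_id)
);
"""


def load(conn, events):
    """Split each raw event into dimension rows and one fact row."""
    for e in events:
        # INSERT OR IGNORE is SQLite syntax; in PostgreSQL the equivalent
        # would be INSERT ... ON CONFLICT DO NOTHING.
        conn.execute("INSERT OR IGNORE INTO users VALUES (?, ?)",
                     (e["user_id"], e["first_name"]))
        conn.execute("INSERT OR IGNORE INTO songs VALUES (?, ?)",
                     (e["song_id"], e["title"]))
        conn.execute("INSERT INTO songplays (ts, user_id, song_id) VALUES (?, ?, ?)",
                     (e["ts"], e["user_id"], e["song_id"]))
    conn.commit()


def top_songs(conn):
    """Answer 'what are users listening to' by joining fact and dimension."""
    return conn.execute(
        """
        SELECT s.title, COUNT(*) AS plays
        FROM songplays sp
        JOIN songs s ON s.song_id = sp.song_id
        GROUP BY s.title
        ORDER BY plays DESC, s.title
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
load(conn, EVENTS)
print(top_songs(conn))
```

Each user and song is stored once in its dimension table, and the fact table holds only keys and the timestamp, which is the normalization the bullets refer to.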

Tech stack: Python, PostgreSQL, Star Schema, ETL pipelines, Normalization

Designed a NoSQL database using Apache Cassandra based on the original schema outlined in Project 1.

  • Created a NoSQL database using Apache Cassandra locally.
  • Developed denormalized tables optimized for a specific set of queries and business needs.
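
The query-first, denormalized modeling described above can be sketched as follows. This is a pure-Python illustration, not code that talks to Cassandra: dictionaries keyed by a partition key stand in for tables, and the CQL string is shown only for reference. All table names, columns, and sample rows are hypothetical.

```python
from collections import defaultdict

# CQL that the first "table" below would correspond to (hypothetical names).
# session_id is the partition key; item_in_session is the clustering column.
CREATE_SONGS_BY_SESSION = """
CREATE TABLE IF NOT EXISTS songs_by_session (
    session_id int,
    item_in_session int,
    artist text,
    song text,
    PRIMARY KEY ((session_id), item_in_session)
)
"""

# Hypothetical sample events.
EVENTS = [
    {"session_id": 10, "item_in_session": 1, "artist": "Artist X", "song": "Song A", "user": "Ada"},
    {"session_id": 10, "item_in_session": 0, "artist": "Artist Y", "song": "Song B", "user": "Ada"},
    {"session_id": 11, "item_in_session": 0, "artist": "Artist X", "song": "Song A", "user": "Lin"},
]


def build_songs_by_session(events):
    """One table per query: 'which songs were played in session N?'"""
    table = defaultdict(dict)
    for e in events:
        table[e["session_id"]][e["item_in_session"]] = (e["artist"], e["song"])
    return table


def build_users_by_song(events):
    """A second copy of the same data, keyed for 'who listened to song S?'"""
    table = defaultdict(set)
    for e in events:
        table[e["song"]].add(e["user"])
    return table


def songs_in_session(table, session_id):
    """Read a single partition, ordered by the clustering column."""
    rows = table.get(session_id, {})
    return [(item, artist, song) for item, (artist, song) in sorted(rows.items())]


def users_for_song(table, song):
    """Read a single partition of the second table."""
    return sorted(table.get(song, set()))


songs_by_session = build_songs_by_session(EVENTS)
users_by_song = build_users_by_song(EVENTS)
print(songs_in_session(songs_by_session, 10))
print(users_for_song(users_by_song, "Song A"))
```

The same events are written twice, once per query. That duplication is the trade-off of denormalization: each read touches exactly one partition and needs no join.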

Tech stack: Python, Apache Cassandra, Denormalization

Created a data warehouse using Amazon Redshift.

  • Created a Redshift cluster along with the appropriate IAM role and Security group.
  • Developed an ETL pipeline that loads data from S3 buckets into staging tables on Redshift and then transforms it into a star schema.
  • Optimized queries to enable faster loads as required by the Data Analytics team.
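
The S3-to-staging load described above is typically done with Redshift's `COPY` command. The sketch below only builds the SQL text; it does not connect to a cluster. The helper name `build_copy_sql`, the table name, the bucket path, the role ARN, and the region are hypothetical placeholders, and the sketch assumes the source files are JSON.

```python
import re


def build_copy_sql(table, s3_path, iam_role_arn, region, json_option="auto"):
    """Return a Redshift COPY statement that loads JSON files from S3.

    The table name is checked against a simple identifier pattern because
    identifiers cannot be passed as bound query parameters.
    """
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"unexpected table name: {table!r}")
    return (
        f"COPY {table}\n"
        f"FROM '{s3_path}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS JSON '{json_option}'\n"
        f"REGION '{region}';"
    )


# Hypothetical values; substitute the real bucket, role, and region.
copy_sql = build_copy_sql(
    table="staging_events",
    s3_path="s3://example-bucket/log_data",
    iam_role_arn="arn:aws:iam::123456789012:role/example-redshift-role",
    region="us-west-2",
)
print(copy_sql)
```

The resulting string could then be executed through any PostgreSQL-compatible driver connected to the cluster, followed by `INSERT INTO ... SELECT` statements that move rows from the staging tables into the star schema tables.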

Tech stack: Python, AWS CLI, Amazon SDK, PostgreSQL, Amazon S3, Amazon Redshift