krishns18/Data_Projects

Data Projects

A repository containing data engineering and data analysis projects.

1. Data Modeling PostgreSQL

  • Created a PostgreSQL database schema and an ETL pipeline to support analysis of song plays.
  • Optimized the schema for song play analysis queries.
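
One common way to optimize a schema for this kind of analysis is a star schema: a central fact table of song plays joined to small dimension tables. The sketch below is an illustration under stated assumptions, not the project's actual schema. The table names, column names, and sample rows are hypothetical, and Python's built-in `sqlite3` stands in for PostgreSQL so the example runs without a database server.

```python
import sqlite3

# Hypothetical star schema: one fact table (songplays) plus two dimensions.
# sqlite3 is a stand-in for PostgreSQL; names and columns are assumptions.
SCHEMA = """
CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT, level TEXT);
CREATE TABLE songs (song_id TEXT PRIMARY KEY, title TEXT, artist TEXT);
CREATE TABLE songplays (
    songplay_id INTEGER PRIMARY KEY,
    start_time  TEXT NOT NULL,
    user_id     INTEGER NOT NULL REFERENCES users(user_id),
    song_id     TEXT REFERENCES songs(song_id)
);
"""


def build_demo_db():
    """Create an in-memory database and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO users VALUES (?, ?, ?)",
        [(1, "Ana", "free"), (2, "Ben", "paid")],
    )
    conn.executemany(
        "INSERT INTO songs VALUES (?, ?, ?)",
        [("S1", "Song One", "Artist A"), ("S2", "Song Two", "Artist B")],
    )
    conn.executemany(
        "INSERT INTO songplays (start_time, user_id, song_id) VALUES (?, ?, ?)",
        [
            ("2018-11-01T10:00:00", 1, "S1"),
            ("2018-11-01T10:05:00", 2, "S1"),
            ("2018-11-01T10:09:00", 2, "S2"),
        ],
    )
    return conn


def plays_per_song(conn):
    """Count plays per song title by joining the fact table to a dimension."""
    rows = conn.execute(
        """
        SELECT s.title, COUNT(*) AS plays
        FROM songplays sp
        JOIN songs s ON s.song_id = sp.song_id
        GROUP BY s.title
        ORDER BY plays DESC, s.title
        """
    )
    return rows.fetchall()
```

Keeping the fact table narrow and joining on keys means a typical analysis query needs only one join per dimension it touches.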

2. TV, Halftime Shows and Big Game Analysis

  • Analyzed Super Bowl data to extract insights such as point distribution, viewership, and ad distribution.

3. Data Modeling Apache Cassandra

  • Created an Apache Cassandra database schema as part of an ETL pipeline.

4. US Immigration Trends

  • The purpose of this project is to understand US immigration trends.
  • Built a data pipeline using Spark.

5. Data Lakes Using Spark

  • Built an ETL pipeline that extracts data from S3, processes it using Spark, and loads it back into S3 as a set of dimensional tables.

6. Data Pipeline with Airflow

  • Built data warehouse ETL pipelines using Apache Airflow.
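
Airflow models a pipeline as a directed acyclic graph (DAG) of tasks, where each task runs only after the tasks it depends on. The sketch below illustrates that ordering idea with the standard library's `graphlib` (Python 3.9+) rather than the Airflow API. The task names and dependencies are hypothetical and are not taken from this project.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each key maps to the set of tasks it depends on.
# This mimics how a DAG orders work; it does not use Airflow itself.
TASKS = {
    "stage_events": set(),
    "stage_songs": set(),
    "load_fact_table": {"stage_events", "stage_songs"},
    "load_dimension_tables": {"load_fact_table"},
    "run_quality_checks": {"load_dimension_tables"},
}


def execution_order(tasks):
    """Return one valid run order in which every task follows its dependencies."""
    return list(TopologicalSorter(tasks).static_order())


if __name__ == "__main__":
    for name in execution_order(TASKS):
        print(name)
```

The two staging tasks have no dependencies on each other, so a scheduler is free to run them in parallel before the fact table load.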

7. Data Warehousing using Redshift

  • Built an ETL pipeline that extracts data from S3, stages it in Redshift, and transforms it into a set of dimensional tables for finding insights into which songs users are listening to.
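
The staging step in a pipeline like this is typically done with Redshift's `COPY` command, which bulk-loads files from S3 into a table. The helper below is a minimal sketch that only builds the SQL text, assuming the source files are JSON. The function name, table name, bucket path, and IAM role ARN are hypothetical placeholders, not values from this project.

```python
def build_copy_statement(table, s3_path, iam_role_arn, json_option="auto", region=None):
    """Build a Redshift COPY statement that loads JSON files from S3 into a table.

    All arguments are interpolated directly into the SQL text, so this sketch
    is only suitable for trusted, hard-coded configuration values.
    """
    parts = [
        f"COPY {table}",
        f"FROM '{s3_path}'",
        f"IAM_ROLE '{iam_role_arn}'",
        f"FORMAT AS JSON '{json_option}'",
    ]
    if region:
        # REGION is needed when the bucket is in a different region than the cluster.
        parts.append(f"REGION '{region}'")
    return "\n".join(parts) + ";"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(
        build_copy_statement(
            table="staging_events",
            s3_path="s3://example-bucket/log_data",
            iam_role_arn="arn:aws:iam::123456789012:role/example-role",
            region="us-west-2",
        )
    )
```

Once the staging tables are loaded, the dimensional tables can be filled with ordinary `INSERT INTO ... SELECT` statements that read from them.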

Updates in progress...
