GitHub - krishns18/Data_Projects: A repository containing all the data engineering and data analysis projects

Data Projects

A repository containing data engineering and data analysis projects.

1. Data Modeling PostgreSQL

Created a PostgreSQL database schema and ETL pipeline for analyzing song plays.
The schema is optimized for the song play analysis queries.
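
As a rough illustration of what a song play schema and load step could look like, here is a minimal sketch of a star schema with one fact table and two dimension tables. It uses Python's built-in sqlite3 as a stand-in for PostgreSQL so that it runs without a server. The table names, columns, and sample events are assumptions made for illustration and are not taken from the project.

```python
import sqlite3

# Hypothetical star schema: one fact table (songplays) and two dimensions.
# sqlite3 stands in for PostgreSQL; all names and columns are assumptions.
SCHEMA = """
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    level   TEXT NOT NULL
);
CREATE TABLE songs (
    song_id TEXT PRIMARY KEY,
    title   TEXT NOT NULL,
    artist  TEXT NOT NULL
);
CREATE TABLE songplays (
    songplay_id INTEGER PRIMARY KEY AUTOINCREMENT,
    start_time  TEXT NOT NULL,
    user_id     INTEGER NOT NULL REFERENCES users (user_id),
    song_id     TEXT REFERENCES songs (song_id)
);
"""


def load_event(conn, event):
    """Insert one raw play event into the dimension and fact tables."""
    # Upsert the user so a changed subscription level replaces the old one.
    conn.execute(
        "INSERT INTO users (user_id, name, level) VALUES (?, ?, ?) "
        "ON CONFLICT (user_id) DO UPDATE SET level = excluded.level",
        (event["user_id"], event["name"], event["level"]),
    )
    # Songs are immutable here, so a repeated song is simply skipped.
    conn.execute(
        "INSERT INTO songs (song_id, title, artist) VALUES (?, ?, ?) "
        "ON CONFLICT (song_id) DO NOTHING",
        (event["song_id"], event["title"], event["artist"]),
    )
    # Every event becomes one row in the fact table.
    conn.execute(
        "INSERT INTO songplays (start_time, user_id, song_id) VALUES (?, ?, ?)",
        (event["start_time"], event["user_id"], event["song_id"]),
    )


def plays_per_user(conn):
    """Example analysis query: number of song plays per user."""
    return conn.execute(
        "SELECT u.name, COUNT(*) FROM songplays sp "
        "JOIN users u ON u.user_id = sp.user_id "
        "GROUP BY u.user_id, u.name ORDER BY u.name"
    ).fetchall()


def build_demo():
    """Create an in-memory database and load three made-up events."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    events = [
        {"user_id": 1, "name": "Ada", "level": "free", "song_id": "S1",
         "title": "Song A", "artist": "Artist X",
         "start_time": "2018-11-01T10:00:00"},
        {"user_id": 1, "name": "Ada", "level": "paid", "song_id": "S2",
         "title": "Song B", "artist": "Artist Y",
         "start_time": "2018-11-01T10:04:00"},
        {"user_id": 2, "name": "Lin", "level": "free", "song_id": "S1",
         "title": "Song A", "artist": "Artist X",
         "start_time": "2018-11-01T10:05:00"},
    ]
    for event in events:
        load_event(conn, event)
    conn.commit()
    return conn
```

Calling `plays_per_user(build_demo())` joins the fact table to the user dimension, which is the kind of query a star schema keeps simple.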

2. TV, Halftime Shows and Big Game Analysis

Analyzed Super Bowl data to extract insights such as point distribution, viewership, and ad distribution.

3. Data Modeling Apache Cassandra

Created an Apache Cassandra database schema as part of an ETL pipeline.

4. US Immigration Trends

The purpose of this project is to understand US immigration trends.
Built a data pipeline using Spark.

5. Data Lakes Using Spark

Built an ETL pipeline that extracts data from S3, processes it using Spark, and loads it back into S3 as a set of dimensional tables.

6. Data Pipeline with Airflow

Built data warehouse ETL pipelines using Apache Airflow.
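
Airflow expresses a pipeline as a directed acyclic graph (DAG) of tasks and runs a task only after its upstream tasks have finished. The sketch below illustrates that ordering idea with the standard-library graphlib module (Python 3.9+) rather than Airflow itself, so it is not Airflow's API. The task names are hypothetical and not taken from the project.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks, each mapped to the set of upstream tasks it waits on.
PIPELINE = {
    "stage_events": {"begin"},
    "stage_songs": {"begin"},
    "load_songplays_fact": {"stage_events", "stage_songs"},
    "load_user_dim": {"load_songplays_fact"},
    "load_song_dim": {"load_songplays_fact"},
    "run_quality_checks": {"load_user_dim", "load_song_dim"},
}


def execution_waves(graph):
    """Group tasks into waves; a wave holds tasks whose upstreams are done."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks that could run in parallel
        waves.append(ready)
        sorter.done(*ready)  # mark them finished to unlock downstream tasks
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(execution_waves(PIPELINE), start=1):
        print(number, wave)
```

Tasks in the same wave have no dependency on each other, which is why a scheduler is free to run the two staging steps concurrently before the fact table load.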

7. Data Warehousing using Redshift

Built an ETL pipeline that extracts data from S3, stages it in Redshift, and transforms it into a set of dimensional tables to find insights about what songs users are listening to.
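
The stage-then-transform pattern can be sketched as follows: raw records are loaded into a staging table unchanged, and the dimensional tables are then filled with INSERT ... SELECT statements. The sketch uses Python's built-in sqlite3 as a stand-in for Redshift, where the staging load would typically be a COPY from S3. All table names, columns, and rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical tables; sqlite3 stands in for Redshift in this sketch.
DDL = """
CREATE TABLE staging_events (
    user_id INTEGER, name TEXT, title TEXT, artist TEXT, ts TEXT
);
CREATE TABLE dim_users (user_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_songs (title TEXT, artist TEXT, PRIMARY KEY (title, artist));
CREATE TABLE fact_songplays (ts TEXT, user_id INTEGER, title TEXT, artist TEXT);
"""


def stage_and_transform(conn, raw_rows):
    """Load raw rows into staging, then derive the dimensional tables."""
    conn.executescript(DDL)
    # Step 1: stage the raw data as-is (in Redshift, usually COPY from S3).
    conn.executemany(
        "INSERT INTO staging_events VALUES (?, ?, ?, ?, ?)", raw_rows
    )
    # Step 2: build dimensions from the distinct values in staging.
    conn.execute(
        "INSERT INTO dim_users SELECT DISTINCT user_id, name FROM staging_events"
    )
    conn.execute(
        "INSERT INTO dim_songs SELECT DISTINCT title, artist FROM staging_events"
    )
    # Step 3: build the fact table, one row per staged play event.
    conn.execute(
        "INSERT INTO fact_songplays "
        "SELECT ts, user_id, title, artist FROM staging_events"
    )
    conn.commit()


def top_songs(conn):
    """Example insight query: songs ranked by number of plays."""
    return conn.execute(
        "SELECT title, COUNT(*) AS plays FROM fact_songplays "
        "GROUP BY title ORDER BY plays DESC, title"
    ).fetchall()


# Made-up sample rows: (user_id, name, title, artist, timestamp).
SAMPLE_ROWS = [
    (1, "Ada", "Song A", "Artist X", "2018-11-01T10:00:00"),
    (2, "Lin", "Song A", "Artist X", "2018-11-01T10:05:00"),
    (1, "Ada", "Song B", "Artist Y", "2018-11-01T10:09:00"),
]
```

Keeping the staging table free of constraints makes the bulk load simple, and the cleanup (deduplication, key selection) happens in SQL inside the warehouse.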

Updates in progress...

Folders and files (42 commits):

DataLakes_Using_Spark
Data_Modeling_Apache_Cassandra
Data_Modeling_PostgreSQL
Data_Pipeline_Using_Airflow
Kaggle_Flight_Data_Analysis
RedShift_Datawarehouse_Implementation
SuperBowl_Data_Analysis_Project
US_Immigration_Trends
.DS_Store
README.md
