Popular repositories
-
DataPipeLine-S3-to-Redshift-Using-Airflow (Public)
Implements an ETL data pipeline that reads data from an S3 bucket and loads it into AWS Redshift using Airflow.
Python 1
-
Data_Modelling_with_PostgreSQL (Public)
A hands-on data modelling project using PostgreSQL that covers creating a database, inserting records, running an ETL process, and writing PostgreSQL aggregation queries.
-
Data_Modelling_with_Cassandra (Public)
Using song history event files, we created an ETL pipeline in Python, built a data model in Apache Cassandra, and wrote CQL queries to answer use-case questions.
Jupyter Notebook
-
ETL-Spark-EMR-AWS-MusicData (Public)
Implements a data lake using S3 and Spark on an EMR cluster in an AWS Cloud9 environment, and develops an ETL pipeline for the data lake that extracts data from S3, processes the data using Spark, an…
Python 1
-
Data_Analysis_with_Pyspark (Public)
Data analysis using PySpark SQL and aggregate functions.
Jupyter Notebook
-
Data-Science-and-Analytics-Projects (Public)
The projects in this repository were done as part of my academic coursework during my master's in data science and analytics at the University of Leeds (2021-2022).
Jupyter Notebook