GitHub - Codingaditya17/SpotifyAnalysis-using-AWS-and-Kafka

This Project was made a year ago , now the upstash is now discontinuing the Kafka for the developers!!!

Spotify Analysis using AWS and Kafka This project is an end-to-end data engineering pipeline for analyzing Spotify data. The pipeline is built using a combination of Apache Kafka for real-time data streaming and Amazon Web Services (AWS) for data processing and storage.

Project Architecture The data pipeline consists of the following steps:

Data Ingestion: A producer application running on an AWS EC2 instance sends data to a specific topic in Apache Kafka.

Data Consumption: A consumer application reads the data from the Kafka topic.

Staging Layer: The consumer application writes the raw data from Kafka to a staging layer, which is an AWS S3 bucket.

Data Processing: An AWS Glue ETL job is scheduled to process and transform the data from the staging layer. The transformed data is then stored in a data lake, which is another AWS S3 bucket.

Data Cataloging: An AWS Glue Crawler scans the data in the data lake and creates a data catalog, defining the databases and tables.

Data Querying: The data in the data lake can be queried interactively using Amazon Athena with standard SQL, utilizing the tables created by the Glue Crawler.

Monitoring: The entire pipeline is monitored using AWS CloudWatch.

Technologies Used Apache Kafka: A distributed streaming platform used for building the real-time data pipeline.

AWS (Amazon Web Services):

EC2 (Elastic Compute Cloud): Used to host the data producer application.

S3 (Simple Storage Service): Used as both the staging layer and the data lake.

Glue: An ETL service used for data transformation and a crawler for creating the data catalog.

Athena: An interactive query service for data analysis.

CloudWatch: Used for monitoring the pipeline.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
albums		albums
artist		artist
data		data
tracks		tracks
.DS_Store		.DS_Store
PROJECT SERIES.pptx		PROJECT SERIES.pptx
README.md		README.md
connect-kafka.ipynb		connect-kafka.ipynb
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

About

Uh oh!

Releases

Packages

Languages

Codingaditya17/SpotifyAnalysis-using-AWS-and-Kafka

Folders and files

Latest commit

History

Repository files navigation

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages