Codingaditya17/SpotifyAnalysis-using-AWS-and-Kafka


This project was built a year ago. Upstash is now discontinuing its Kafka service for developers.

Spotify Analysis using AWS and Kafka

This project is an end-to-end data engineering pipeline for analyzing Spotify data. The pipeline combines Apache Kafka for real-time data streaming with Amazon Web Services (AWS) for data processing and storage.

Project Architecture

The data pipeline consists of the following steps:

Data Ingestion: A producer application running on an AWS EC2 instance sends data to a specific topic in Apache Kafka.

Data Consumption: A consumer application reads the data from the Kafka topic.

Staging Layer: The consumer application writes the raw data from Kafka to a staging layer, which is an AWS S3 bucket.

Data Processing: An AWS Glue ETL job is scheduled to process and transform the data from the staging layer. The transformed data is then stored in a data lake, which is another AWS S3 bucket.

Data Cataloging: An AWS Glue Crawler scans the data in the data lake and creates a data catalog, defining the databases and tables.

Data Querying: The data in the data lake can be queried interactively using Amazon Athena with standard SQL, utilizing the tables created by the Glue Crawler.

Monitoring: The entire pipeline is monitored using AWS CloudWatch.
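The ingestion, consumption, and staging steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the `Message` shape, the topic name `spotify-tracks`, the `staging/` key layout, and the sample record fields are all assumptions made for the example, and an in-memory dict stands in for the S3 bucket so the sketch runs without AWS credentials.

```python
import json
from datetime import datetime, timezone
from typing import Callable, Iterable, NamedTuple


class Message(NamedTuple):
    """Minimal stand-in for a Kafka record (hypothetical shape)."""
    topic: str
    partition: int
    offset: int
    timestamp_ms: int
    value: bytes


def encode_record(record: dict) -> bytes:
    """Producer side: serialize one record as UTF-8 JSON."""
    return json.dumps(record, sort_keys=True).encode("utf-8")


def staging_key(msg: Message, prefix: str = "staging") -> str:
    """Build a date-partitioned object key for one message (assumed layout)."""
    day = datetime.fromtimestamp(msg.timestamp_ms / 1000, tz=timezone.utc)
    return (
        f"{prefix}/{msg.topic}/dt={day:%Y-%m-%d}/"
        f"part-{msg.partition}-offset-{msg.offset}.json"
    )


def stage_messages(messages: Iterable[Message],
                   put_object: Callable[[str, bytes], None]) -> int:
    """Consumer side: write each raw message to the staging layer."""
    count = 0
    for msg in messages:
        put_object(staging_key(msg), msg.value)
        count += 1
    return count


# Local dry run: a dict plays the role of the staging bucket.
fake_bucket: dict = {}
sample = Message(
    topic="spotify-tracks",          # hypothetical topic name
    partition=0,
    offset=42,
    timestamp_ms=1_700_000_000_000,
    value=encode_record({"track": "example", "popularity": 50}),
)
staged = stage_messages([sample], fake_bucket.__setitem__)
```

In a real deployment, `put_object` could wrap a boto3 S3 client call such as `s3.put_object(Bucket=..., Key=key, Body=body)`, and `messages` could come from a Kafka client's consumer loop. Keying objects by topic, partition, and offset makes re-delivered messages overwrite themselves instead of creating duplicates.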

Technologies Used

Apache Kafka: A distributed streaming platform used for building the real-time data pipeline.

AWS (Amazon Web Services):

EC2 (Elastic Compute Cloud): Used to host the data producer application.

S3 (Simple Storage Service): Used as both the staging layer and the data lake.

Glue: Provides the ETL job used for data transformation and the crawler that creates the data catalog.

Athena: An interactive query service for data analysis.

CloudWatch: Used for monitoring the pipeline.
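To illustrate how the cataloged tables might be queried through Athena, here is a small sketch that builds the request parameters for a query. The database name, table name, column names (`artist_name`, `popularity`), and the results location are hypothetical, since the README does not specify the schema. The dictionary keys match the parameters of the boto3 Athena client's `start_query_execution` call.

```python
def build_athena_request(database: str, table: str,
                         output_location: str, limit: int = 10) -> dict:
    """Assemble keyword arguments for an Athena query (assumed schema)."""
    query = (
        "SELECT artist_name, AVG(popularity) AS avg_popularity "
        f'FROM "{table}" '
        "GROUP BY artist_name "
        "ORDER BY avg_popularity DESC "
        f"LIMIT {int(limit)}"
    )
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


# Hypothetical names, for illustration only.
request = build_athena_request(
    database="spotify_db",
    table="tracks",
    output_location="s3://example-athena-results/",
    limit=5,
)
```

With credentials configured, the request could be submitted as `boto3.client("athena").start_query_execution(**request)`. Athena writes the query results to the S3 location given in `OutputLocation`.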
