This project implements a data pipeline using Apache Airflow, Apache Kafka, and MySQL. It extracts data from multiple sources in different formats (CSV, TSV) and consolidates it into a single file. I used Apache Kafka to stream the data in real time, then ingested it into a Python data pipeline for transformation. Finally, the transformed data was loaded into a MySQL database for further exploration and analysis. The project showcases my ability to work with different tools and technologies, as well as my skills in data extraction, transformation, and loading.
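The consolidation step described above can be sketched in plain Python. This is only an illustration, not the code from this repository: the `consolidate` function, the file names, and the assumption that every source file shares the same header row are all hypothetical.

```python
import csv


def consolidate(sources, output_path):
    """Merge delimited files into a single CSV file.

    `sources` is a list of (path, delimiter) pairs. This sketch assumes
    every file starts with the same header row (a hypothetical layout).
    Returns the number of data rows written.
    """
    rows_written = 0
    header = None
    with open(output_path, "w", newline="") as out:
        writer = csv.writer(out)
        for path, delimiter in sources:
            with open(path, newline="") as src:
                reader = csv.reader(src, delimiter=delimiter)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    # Write the header once, taken from the first file.
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"{path} has a different header: {file_header}")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

With hypothetical input files, a call might look like `consolidate([("vehicle_data.csv", ","), ("tollplaza_data.tsv", "\t")], "extracted_data.csv")`. In an Airflow setup, a function like this would typically be wrapped in a task so that it runs before the streaming and loading steps.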
BrendaChep/ETL-pipeline-for-Traffic-Toll-Data
About
The goal of this project is to extract and transform data using Apache Airflow, stream it in real time using Kafka, and load it into a MySQL database for further exploration.