
The goal of this project is to extract and transform data using Apache Airflow, stream it in real time using Kafka, and load it into a MySQL database for further exploration.


BrendaChep/ETL-pipeline-for-Traffic-Toll-Data


This project implements a data pipeline using Apache Airflow, Apache Kafka, and MySQL. It extracts data from multiple sources in different formats (CSV, TSV) and consolidates it into a single file. To stream the data in real time, I used Apache Kafka, then ingested the stream into a Python data pipeline for transformation. Finally, the transformed data was loaded into a MySQL database for further exploration and analysis. This project showcases my ability to work with different tools and technologies, as well as my skills in data extraction, transformation, and loading.
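The consolidation step described above can be sketched in a few lines of standard-library Python. This is only an illustration, not the repository's actual code: the function names `read_delimited` and `consolidate` are hypothetical, and it assumes every source file has the same columns in the same order, so rows can simply be appended to one CSV file.

```python
import csv


def read_delimited(path, delimiter):
    """Return all rows of a delimited text file as lists of strings."""
    with open(path, newline="") as handle:
        return list(csv.reader(handle, delimiter=delimiter))


def consolidate(sources, output_path):
    """Append the rows of every (path, delimiter) source to one CSV file.

    Assumption: all sources share the same column layout.
    Returns the number of rows written.
    """
    written = 0
    with open(output_path, "w", newline="") as out:
        writer = csv.writer(out)
        for path, delimiter in sources:
            for row in read_delimited(path, delimiter):
                writer.writerow(row)
                written += 1
    return written
```

A call such as `consolidate([("a.csv", ","), ("b.tsv", "\t")], "combined.csv")` (file names are placeholders) would write both inputs into one comma-separated file. In an Airflow setup, a function like this could plausibly be wrapped in a task, with the Kafka streaming and MySQL load as later stages.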
