Example end-to-end data engineering project.
Updated Dec 8, 2022 - Python
Replicate data from MySQL, Postgres and MongoDB to ClickHouse
Ecommerce Realtime Data Pipeline (Data Modeling, Workflow Orchestration, Change Data Capture, Analytical Database and Dashboarding)
Built a real-time streaming pipeline to extract stock data using Apache NiFi, Debezium, Kafka, and Spark Streaming. Loaded the transformed data into a Glue database and created real-time dashboards using Power BI and Tableau with Athena. The pipeline is orchestrated using Airflow.
Repo for the CDC with Debezium blog post
Guardian for your Kafka Connect connectors. It checks the status of connectors and tasks and restarts them if they have failed.
Data Pipeline for CDC data from MySQL DB to Amazon S3 through Amazon MSK using Amazon MSK Connect (Debezium).
Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming using Amazon MSK Serverless and MSK Connect (Debezium)
Data Pipeline for CDC data from MySQL DB to Amazon S3 through Amazon MSK Serverless using Amazon MSK Connect (Debezium).
Outbox pattern using Debezium and Protobuf serialization
Kafka Debezium
Example of setting up CDC with Debezium
Django with Kafka, Debezium, and Faust for Email Sending using Change Data Capture
Stream data between two databases. Supports both DDL and DML statements. Built on top of Kafka and Debezium in Python.
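
One entry above describes a guardian that checks the status of Kafka Connect connectors and tasks and restarts the failed ones. A minimal sketch of that idea follows. It is not taken from any of the listed repositories. It assumes the status document has the shape returned by the Kafka Connect REST endpoint `GET /connectors/{name}/status`; the function name `plan_restarts` and the sample connector name are hypothetical. The function only computes which restart endpoints would need a `POST`, so it runs without a live Connect cluster.

```python
def plan_restarts(status: dict) -> list:
    """Return the Kafka Connect REST paths to POST to for every failed part.

    `status` is assumed to be the parsed JSON body of
    GET /connectors/{name}/status.
    """
    name = status["name"]
    paths = []
    # The connector itself can fail independently of its tasks.
    if status.get("connector", {}).get("state") == "FAILED":
        paths.append(f"/connectors/{name}/restart")
    # Each task reports its own state and is restarted individually.
    for task in status.get("tasks", []):
        if task.get("state") == "FAILED":
            paths.append(f"/connectors/{name}/tasks/{task['id']}/restart")
    return paths


# Hypothetical status document, for illustration only.
sample_status = {
    "name": "inventory-connector",
    "connector": {"state": "RUNNING", "worker_id": "10.0.0.1:8083"},
    "tasks": [
        {"id": 0, "state": "RUNNING", "worker_id": "10.0.0.1:8083"},
        {"id": 1, "state": "FAILED", "worker_id": "10.0.0.2:8083"},
    ],
}

print(plan_restarts(sample_status))
# → ['/connectors/inventory-connector/tasks/1/restart']
```

A real guardian would fetch the status on a schedule and send a `POST` to each returned path. Keeping the decision logic in a pure function like this makes it straightforward to test separately from the HTTP calls.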