Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.
Updated Jun 21, 2024 - Java
Flink CDC is a streaming data integration tool.
A real-time cryptocurrency data streaming pipeline.
Toolkit for describing data transformation pipelines by compositing simple reusable components.
⚡ Data integration | DataLink is a lightweight data integration framework built on top of DataX, Spark and Flink.
Compiler for streaming data pipelines and data microservices with configurable engines.
SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).
Realtime metrics calculation pipeline using kafka, elasticsearch and kibana.
BitSail is a distributed, high-performance data integration engine that supports batch, streaming and incremental scenarios. BitSail is widely used to synchronize hundreds of trillions of records every day.
Efficiently captures real-time Wikimedia data, like a newsroom for Wikipedia changes. Uses microservices, Kafka, and Spring Boot for reliability and scalability. Ideal for research and analysis.
Kafka Streams made easy with a YAML file
A cron replacement for scheduling complex data workflows.
Cloud server data pipeline built with Apache Kafka and Java
An end-to-end data pipeline with Kafka and Spark Streaming integration.
Data pipeline using Apache Kafka, Apache Spark and HDFS
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Real-time data streaming pipeline.