data-pipeline

apache / seatunnel-datasource-sdk

SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).

real-time offline high-performance apache data-integration sql-engine data-pipeline etl-framework seatunnel

Updated Jun 14, 2023
Java

GetFeedback / kahpp-oss

Kafka Streams made easy with a YAML file

yaml automation kafka pipeline tool stream-processing kafka-streams data-processing data-pipeline stream-processor stream-processing-software

Updated Aug 4, 2023
Java

Ashfaqbs / Microservices-Based-Wikimedia-Data-Processing-with-Kafka

Efficiently captures real-time Wikimedia data, like a newsroom for Wikipedia changes. Uses microservices, Kafka, and Spring Boot for reliability and scalability. Ideal for research and analysis.

kafka spring-boot microservice jpa java-8 data-pipeline

Updated Oct 12, 2023
Java

bytedance / bitsail

BitSail is a distributed, high-performance data integration engine that supports batch, streaming, and incremental scenarios. BitSail is widely used to synchronize hundreds of trillions of data every day.

real-time big-data high-performance data-lake data-integration flink data-synchronization data-pipeline

Updated Jan 1, 2024
Java

ghowkay / realtime-metrics-calculation

Real-time metrics calculation pipeline using Kafka, Elasticsearch, and Kibana.

docker elasticsearch kibana docker-compose data-engineering data-pipeline kakfa

Updated Feb 16, 2024
Java

JinsYin / datalink

⚡ Data integration | DataLink is a lightweight data integration framework built on top of DataX, Spark, and Flink

data streaming framework big-data spark integration pipeline etl bigdata batch data-integration data-collection flink cdc data-exchange data-synchronization data-pipeline datalink flink-cdc

Updated Jun 19, 2024
Java

kwangjong / coinbase-real-time-data-pipeline

A real-time cryptocurrency data streaming pipeline.

java docker kubernetes scala apache-spark grafana hdfs k8s apache-kafka apache-cassandra data-pipeline

Updated Jun 25, 2024
Java

apache / seatunnel-web

SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).

real-time offline high-performance apache data-integration sql-engine data-pipeline etl-framework seatunnel

Updated Jul 10, 2024
Java

data-pipeline

Here are 24 public repositories matching this topic...

rashmishrm / serverhealth

iShiBin / CS502Capstone

colechristini / dataset-lib

mbrtargeting / camus

mujahidniaz / iot_device_streaming_pipeline_cloudera-kakfa-spark-hbase

ProsperChuks / airbyte

yosra270 / store-data-pipeline

sanogotech / spring-boot-with-kafkalighttest

sushovankarmakar / kafka-spark-streaming

BrahianVT / Data-Pipeline

cjannun / kafka-based-data-pipeline

cognitree / kronos

apache / seatunnel-datasource-sdk

GetFeedback / kahpp-oss

Ashfaqbs / Microservices-Based-Wikimedia-Data-Processing-with-Kafka

bytedance / bitsail

ghowkay / realtime-metrics-calculation

JinsYin / datalink

kwangjong / coinbase-real-time-data-pipeline

apache / seatunnel-web
