The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
A lightweight stream processing library for Go
A compute framework for building Search, RAG, Recommendations and Analytics over complex (structured+unstructured) data, with multi-modal vector embeddings.
Conduit streams data between data stores. Kafka Connect replacement. No JVM required.
[Advanced] - [Python] - Build a data processing pipeline using 100% open-source tools. The idea for this project comes from one sentence: "Turn Your Laptop Into A Personal Analytics Engine". A generic sketch of the pattern follows below.
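As a rough illustration of the extract-transform-load pattern projects like this implement, here is a minimal sketch using only the Python standard library. It is not code from any listed repository; the file name, column names, table name, and cleaning rule are all hypothetical.

```python
# Minimal ETL sketch: CSV in, cleaned rows out, stored in local SQLite.
# File/column/table names and the cleaning rule are hypothetical.
import csv
import sqlite3

def extract(path):
    """Read raw rows from a CSV file as dicts."""
    with open(path, newline="") as f:
        yield from csv.DictReader(f)

def transform(rows):
    """Drop rows with a missing amount and normalize types."""
    for row in rows:
        if row.get("amount"):
            yield (row["date"], float(row["amount"]))

def load(records, db_path="analytics.db"):
    """Store cleaned records in a local SQLite table."""
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS sales (date TEXT, amount REAL)")
    con.executemany("INSERT INTO sales VALUES (?, ?)", records)
    con.commit()
    con.close()

if __name__ == "__main__":
    load(transform(extract("raw_sales.csv")))
```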
Privacy- and security-focused Segment alternative, written in Golang and React
A Data Pipeline for Algo-Trading: Download -> Clean (ETL/ELT) -> Store Data. Supports Various Data Sources. Clean Once and Forget.
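The "clean once and forget" idea amounts to idempotent caching: a re-run skips any dataset whose cleaned output already exists. A minimal sketch of that behavior, with hypothetical paths and a placeholder cleaning step:

```python
# Idempotent "clean once and forget" sketch: skip work whose output exists.
# Directory layout and the clean() logic are hypothetical placeholders.
from pathlib import Path

RAW_DIR = Path("data/raw")
CLEAN_DIR = Path("data/clean")

def clean(text):
    """Placeholder cleaning step: strip blank lines."""
    return "\n".join(line for line in text.splitlines() if line.strip())

def run():
    RAW_DIR.mkdir(parents=True, exist_ok=True)
    CLEAN_DIR.mkdir(parents=True, exist_ok=True)
    for raw in RAW_DIR.glob("*.csv"):
        out = CLEAN_DIR / raw.name
        if out.exists():  # already cleaned once: forget about it
            continue
        out.write_text(clean(raw.read_text()))

if __name__ == "__main__":
    run()
```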
Empowering Data Intelligence with Distributed SQL for Sharding, Scalability, and Security Across All Databases.
Fast & easy way to replicate databases to lakehouses
Flink CDC is a streaming data integration tool
🔥🔥🔥 Open Source Alternative to Hightouch, Census, and RudderStack - Reverse ETL & Data Activation
Declarative, text-based tool for data analysts and engineers to extract, load, transform, and orchestrate their data pipelines.
Flexible development framework for building streaming data applications in SQL with Kafka, Flink, Postgres, GraphQL, and more.
Resilient data pipeline framework running on Apache Spark
Probabilistic Timeseries Forecasting Challenge
Processing data in a graph-like flow
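One way to picture the graph-like flow such frameworks describe: nodes are processing steps, edges carry data between them, and the graph is evaluated in dependency order. A hand-rolled sketch using the standard-library `graphlib`; the node names, functions, and wiring are hypothetical, not any listed project's API.

```python
# Tiny dataflow sketch: a DAG of processing nodes run in topological order.
# Node names and functions are hypothetical illustrations.
from graphlib import TopologicalSorter

def source():     return list(range(5))
def double(xs):   return [x * 2 for x in xs]
def only_big(xs): return [x for x in xs if x > 4]
def sink(xs):     print("result:", xs)

# Each node maps to (function, upstream node it consumes from).
nodes = {
    "source": (source, None),
    "double": (double, "source"),
    "filter": (only_big, "double"),
    "sink":   (sink, "filter"),
}

# Express edges as {node: {dependencies}} for the topological sorter.
graph = {name: ({dep} if dep else set()) for name, (_, dep) in nodes.items()}

results = {}
for name in TopologicalSorter(graph).static_order():
    fn, dep = nodes[name]
    results[name] = fn(results[dep]) if dep else fn()
```

Running it prints `result: [6, 8]`: each node sees only its upstream node's output, which is the essence of the graph-like flow.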