Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Updated Mar 7, 2025 - Python
Logstash - transport and process your logs, events, or other data
The developer-first cloud governance platform
Flow-based programming for JavaScript
Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
This repository is a getting started guide to Singer.
Making data lake work for time series
A scalable general-purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton
SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).
A simplified, lightweight ETL Framework based on Apache Spark
Extract, Transform, Load: any SQL database in 4 lines of code.
Knowledge Graph Toolkit
A tool for building feature stores.
A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.
Bender - Serverless ETL Framework
Configurable Extract, Transform, and Load
Data Processor for AI Agents. Search your documents or the web for specific data and get it back in JSON or Markdown in a single tool call.