A simplified, lightweight ETL Framework based on Apache Spark
Updated Jan 24, 2024 - Scala
EtlFlow is an ecosystem of functional libraries in Scala based on ZIO for running complex Auditable workflows which can interact with Google Cloud Platform, AWS, Kubernetes, Databases, SFTP servers, On-Prem Systems and more.
Write ETL using your favorite SQL dialects
Examples of developing SeaTunnel plugins.
Apache Spark-based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
📁 Extract, Transform, Load (ETL) 👷 refers to a process in database usage, especially in data warehousing. This repository contains a starter kit featuring ETL-related work.
Yet Another SPark Framework
Broadway is a distributed actor-based processing server optimized for high-speed data/file ingestion
Spark library for constructing ETL pipelines with monads
Data Tweak is a simplified, lightweight ETL framework based on Apache Spark.
Repository for playing with Spark
Proposes a project structure for implementing multi-layer ETL in a Spark context