This project provides a framework for writing ETL pipelines driven by a central config store. The code follows a few design patterns to keep it lean and easy to write. An accompanying CLI generates boilerplate code and constructs an option factory.
- A framework and bare-bones boilerplate for writing any ETL code.
- A neat way to compose ETL pipelines from a central config store using configuration alone.
- Accompanying this repository there is a CLI tool and an LSP tool, each with a demo video (CLI Demo, LSP Demo).
- Uses the Chain of Responsibility design pattern to execute each step of the ETL pipeline.
- The idea is to create a linked list of jobs and let the initiator of the list execute each step and traverse the jobs iteratively. This is particularly useful for paginating and streaming a large dataset.
- Uses the Factory pattern to compose ETL pipelines in different ways from the central config store.
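To make the two patterns concrete, here is a minimal sketch in Python. It is not this project's actual API: the names `Job`, `STEP_BUILDERS`, `build_pipeline`, `walk`, and `run_batch`, as well as the shape of the config entries, are assumptions chosen for illustration. A factory turns config entries into a linked list of jobs, and the caller walks that list one step at a time, once per page of data.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable, Iterator, Optional

Record = dict[str, Any]
Step = Callable[[list[Record]], list[Record]]


@dataclass
class Job:
    """One node in the linked list of pipeline steps (hypothetical name)."""
    name: str
    run: Step
    next: Optional[Job] = None


def make_filter(opts: dict[str, Any]) -> Step:
    # Keep only rows whose field is at least the configured minimum.
    field, minimum = opts["field"], opts["min"]
    return lambda rows: [r for r in rows if r[field] >= minimum]


def make_rename(opts: dict[str, Any]) -> Step:
    # Rename one key in every row, leaving the other keys untouched.
    old, new = opts["from"], opts["to"]

    def rename(rows: list[Record]) -> list[Record]:
        return [{(new if k == old else k): v for k, v in r.items()} for r in rows]

    return rename


# Hypothetical registry: maps a step "type" in the config to its builder.
STEP_BUILDERS: dict[str, Callable[[dict[str, Any]], Step]] = {
    "filter": make_filter,
    "rename": make_rename,
}


def build_pipeline(config: list[dict[str, Any]]) -> Optional[Job]:
    """Factory: turn a list of step configs into a linked list of jobs."""
    head: Optional[Job] = None
    tail: Optional[Job] = None
    for step in config:
        builder = STEP_BUILDERS[step["type"]]
        job = Job(name=step["name"], run=builder(step.get("options", {})))
        if tail is None:
            head = job
        else:
            tail.next = job
        tail = job
    return head


def walk(head: Optional[Job]) -> Iterator[Job]:
    """Traverse the linked list iteratively, yielding each job in order."""
    job = head
    while job is not None:
        yield job
        job = job.next


def run_batch(head: Optional[Job], rows: list[Record]) -> list[Record]:
    """Pass one batch (for example, one page) through every job in the chain."""
    for job in walk(head):
        rows = job.run(rows)
    return rows


# Assumed config shape; a real config store would supply these entries.
config = [
    {"name": "keep_large", "type": "filter", "options": {"field": "amount", "min": 10}},
    {"name": "rename_amount", "type": "rename", "options": {"from": "amount", "to": "total"}},
]
pipeline = build_pipeline(config)

# The initiator drives the traversal, so it can feed the chain page by page.
pages = [[{"id": 1, "amount": 5}, {"id": 2, "amount": 12}], [{"id": 3, "amount": 30}]]
results = [row for page in pages for row in run_batch(pipeline, page)]
print(results)
```

Because the caller owns the loop in `run_batch`, it can process each page as it arrives rather than loading the whole dataset first, and reordering or swapping steps only requires a change to the config entries.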
- Add video documentation to this repository for better presentation.
- Create a project ecosystem on GitHub and split the CLI tool and LSP tool into individual repositories.
- Add a utility to merge runtime CLI args with the original configs.