datacorner edited this page Dec 4, 2023 · 24 revisions

The pipelite Project

Empower your data workflows with pipelite, a lightweight Python program for creating and running data pipelines. Using a simple JSON configuration, users can build complex pipelines without writing code. What sets pipelite apart is its extensibility: anyone can create and integrate new connectors or transformations to extend the program's capabilities.

It is also possible to add new ways of managing the flow of transformations if needed. Released under an MIT license that fosters collaboration, this flexible tool suits users of all levels. Craft, execute, and extend your data pipelines with pipelite, your go-to solution for adaptable and scalable data processing.
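To make the idea of a JSON-driven, extensible pipeline more concrete, here is a minimal sketch in plain Python. It is not pipelite's actual API or configuration schema: the registry, the `transformer` decorator, the `steps`/`type`/`params` keys and the two sample transformers are all hypothetical names chosen for illustration. It only shows how a JSON document can select and chain pluggable transformation functions.

```python
import json

# Hypothetical registry mapping a name used in the JSON configuration
# to a Python function. This is an illustration, NOT pipelite's real API.
TRANSFORMERS = {}


def transformer(name):
    """Register a function under the name used in the configuration."""
    def register(func):
        TRANSFORMERS[name] = func
        return func
    return register


@transformer("rename")
def rename_column(rows, params):
    # Rename one column in every row, keeping the column order.
    old, new = params["from"], params["to"]
    return [{(new if k == old else k): v for k, v in row.items()} for row in rows]


@transformer("substring")
def substring(rows, params):
    # Keep only a slice of the (string) value of one column.
    col, start, length = params["column"], params["start"], params["length"]
    return [{**row, col: str(row[col])[start:start + length]} for row in rows]


def run_pipeline(config_text, rows):
    """Apply each configured step, in order, to a list of row dictionaries."""
    config = json.loads(config_text)
    for step in config["steps"]:
        func = TRANSFORMERS[step["type"]]
        rows = func(rows, step.get("params", {}))
    return rows


# Hypothetical configuration: the key names are assumptions for this sketch.
CONFIG = """
{
  "steps": [
    {"type": "rename", "params": {"from": "id", "to": "case_id"}},
    {"type": "substring", "params": {"column": "case_id", "start": 0, "length": 3}}
  ]
}
"""

data = [{"id": "ABC-001", "amount": 10}, {"id": "XYZ-002", "amount": 20}]
result = run_pipeline(CONFIG, data)
print(result)
```

In a design like this, adding a new transformation only requires registering one more function; the pipeline runner and existing configurations stay unchanged.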

Some characteristics:

  • Simple JSON configuration
  • Lightweight and code-free (MIT license for flexibility)
  • Python code (relies on basic libraries instead of adding many heavy and complex ones)
  • Effortless pipeline creation and easy integration thanks to the JSON configuration
  • Streamlined execution process
  • Total extensibility (connectivity, transformation, pipeline management)
  • Boost data processing efficiency
  • Quick learning curve
  • Empower your data workflows in a simple way

In short, pipelite is your extensible solution for dynamic data pipelines.

🚀 Currently, pipelite provides data access and loading for these data sources:

📄 External file (csv)
📑 External Excel Spreadsheet (xls, xlsx, xlsm, xlsb, odf, ods and odt) (read only)
📃 External XES File (read only)
📀 ODBC Data Sources (checked with SQL Server, SQLite) using a configurable SQL query (read only)
🏒 SAP Read Table via SAP RFC (read only)
🎒 ABBYY Timeline PI (write only in Repository)

🚀 It also provides these transformers:

🔀 Pass Through (e.g. just to change the data source names IN-OUT)
📢 Dataset Profiling
🔂 Concat 2 Data sources
🆖 SubString
🆒 Column Transformation
🔃 Join data sources
🔃 Lookup
🔀 Rename Column Name
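For readers unfamiliar with the difference between concatenating, joining and looking up, the sketch below shows how these operations are commonly understood, using plain Python lists of dictionaries. The function names, parameters and sample data are hypothetical, and pipelite's own transformers may behave differently in their details.

```python
def concat(left, right):
    """Stack two datasets row-wise (assumes both share the same columns)."""
    return left + right


def inner_join(left, right, key):
    """Keep only rows whose key value appears in both datasets."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def lookup(rows, table, key, value_column, default=None):
    """Add one column from a reference table; unmatched rows get a default."""
    mapping = {ref[key]: ref[value_column] for ref in table}
    return [{**row, value_column: mapping.get(row[key], default)} for row in rows]


# Sample data invented for this illustration.
orders = [{"cust": 1, "total": 50}, {"cust": 3, "total": 75}]
more_orders = [{"cust": 2, "total": 20}]
customers = [{"cust": 1, "name": "Ada"}, {"cust": 2, "name": "Bob"}]

stacked = concat(orders, more_orders)
joined = inner_join(orders, customers, "cust")
looked_up = lookup(orders, customers, "cust", "name", default="unknown")
```

Here the join drops the order for customer 3 because no matching customer exists, while the lookup keeps every order and fills the missing name with the default.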

This is just the beginning, and pipelite is designed to be extensible. If you have some good new ideas you'd like to add, just join the community ;-)

🏠 Home
πŸ”‘ Main concepts
πŸ’» Installation
πŸ”¨ Configuration
πŸš€ Running

Supported Data Sources
πŸ“„ CSV File
πŸ“‘ XES File
πŸ“ƒ Excel File
πŸ“€ ODBC
🏒 SAP
🎒 ABBYY Timeline

Supported Transformations
πŸ”€ Pass Through
πŸ“Ά Dataset Profiling
πŸ”‚ Concat 2 Data sources
πŸ†– SubString
πŸ†’ Column Transformation
πŸ”ƒ Join data sources
πŸ”ƒ Lookup
πŸ”€ Rename Column Name

Extending pipelite
βœ… how to
βœ… Adding new Data sources
βœ… Adding new Transformers
βœ… Adding new Pipelines

Clone this wiki locally