data-plumber

data-plumber is a lightweight but versatile python-framework for multi-stage information processing. It allows to construct processing pipelines from both atomic building blocks and via recombination of existing pipelines. Forks enable more complex (i.e. non-linear) orders of execution. Pipelines can also be collected into arrays that can be executed at once with the same input data.

Usage example

Consider a scenario where the contents of a dictionary have to be validated and a suitable error message has to be generated. Specifically, a valid input-dictionary is expected to have a key "data" with the respective value being a list of integer numbers. A suitable pipeline might look like this

>>> from data_plumber import Stage, Pipeline, Previous
>>> pipeline = Pipeline(
...   Stage(  # validate "data" is passed into run
...     primer=lambda **kwargs: "data" in kwargs,
...     status=lambda primer, **kwargs: 0 if primer else 1,
...     message=lambda primer, **kwargs: "" if primer else "missing argument"
...   ),
...   Stage(  # validate "data" is list
...     requires={Previous: 0},
...     primer=lambda data, **kwargs: isinstance(data, list),
...     status=lambda primer, **kwargs: 0 if primer else 1,
...     message=lambda primer, **kwargs: "" if primer else "bad type"
...   ),
...   Stage(  # validate "data" contains only int
...     requires={Previous: 0},
...     primer=lambda data, **kwargs: all(isinstance(i, int) for i in data),
...     status=lambda primer, **kwargs: 0 if primer else 1,
...     message=lambda primer, **kwargs: "validation success" if primer else "bad type in data"
...   ),
...   exit_on_status=1
... )
>>> pipeline.run().last_message
'missing argument'
>>> pipeline.run(data=1).last_message
'bad type'
>>> pipeline.run(data=[1, "2", 3]).last_message
'bad type in data'
>>> pipeline.run(data=[1, 2, 3]).last_message
'validation success'

See section "Examples" in Documentation for more explanation.

Install

Install using pip with

pip install data-plumber

Consider installing in a virtual environment.

Name		Name	Last commit message	Last commit date
Latest commit History 118 Commits
.github/workflows		.github/workflows
data_plumber		data_plumber
docs		docs
test_data_plumber		test_data_plumber
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
README.md		README.md
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

.github/workflows

.github/workflows

data_plumber

data_plumber

docs

docs

test_data_plumber

test_data_plumber

.gitignore

.gitignore

CHANGELOG.md

CHANGELOG.md

LICENSE

LICENSE

MANIFEST.in

MANIFEST.in

README.md

README.md

requirements.txt

requirements.txt

setup.py

setup.py

Repository files navigation

data-plumber

Contents

Usage example

Install

Documentation

About

Releases 10

Languages

License

RichtersFinger/data-plumber

Folders and files

Latest commit

History

Repository files navigation

data-plumber

Contents

Usage example

Install

Documentation

About

Topics

Resources

License

Stars

Watchers

Forks

Languages