Pydra: Dataflow Engine
A simple dataflow engine with scalable semantics.
Pydra is a rewrite of the Nipype engine with mapping and joining as first-class operations. It forms the core of the Nipype 2.0 ecosystem.
The goal of pydra is to provide a lightweight Python dataflow engine for DAG construction, manipulation, and distributed execution.
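To illustrate what a dataflow engine does at its core, here is a minimal plain-Python sketch of DAG execution, in which nodes are functions and edges name the upstream results each node consumes. This is an illustration only, not pydra's implementation; all names here are hypothetical.

```python
# Minimal dataflow sketch: each node is a function, and edges list the
# upstream node names whose results feed into it. Not pydra's implementation.
def run_dag(nodes, edges):
    """nodes: name -> function; edges: name -> list of upstream names."""
    results = {}

    def run(name):
        # Compute a node at most once, recursing into its dependencies first
        if name not in results:
            upstream = [run(dep) for dep in edges.get(name, [])]
            results[name] = nodes[name](*upstream)
        return results[name]

    for name in nodes:
        run(name)
    return results

nodes = {"a": lambda: 2, "b": lambda: 3, "mul": lambda x, y: x * y}
edges = {"mul": ["a", "b"]}
print(run_dag(nodes, edges)["mul"])  # → 6
```

A real engine adds what this sketch omits: caching of node results, nested workflows as nodes, and dispatch of independent nodes to parallel workers.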
- Python 3.7+ using type annotations and attrs
- Composable dataflows with simple node semantics. A dataflow can be a node of another dataflow.
- splitter and combiner operations provide many ways of compressing complex loop semantics
- Cached execution with support for a global cache across dataflows and users
- Distributed execution, presently via ConcurrentFutures, SLURM, and Dask (Dask support is experimental, with limited testing)
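The splitter/combiner semantics can be sketched in plain Python: splitting a task over two input fields runs it once per combination, and combining over one field groups the results back by the remaining field. This is a conceptual sketch only; pydra's actual API attaches `.split()` and `.combine()` to tasks, and the names below are illustrative.

```python
# Plain-Python sketch of splitter/combiner semantics (illustrative only;
# pydra expresses this as task.split(["a", "b"]).combine("b")).
import itertools

def add(a, b):
    return a + b

a_vals, b_vals = [1, 2], [10, 20]

# "Splitting" over ("a", "b") runs the task once per combination of inputs
results = {(a, b): add(a, b) for a, b in itertools.product(a_vals, b_vals)}

# "Combining" over "b" collects the results back, grouped by the field
# that remains split ("a")
by_a = {a: [results[(a, b)] for b in b_vals] for a in a_vals}
print(by_a)  # → {1: [11, 21], 2: [12, 22]}
```

The point of making this a first-class operation is that the loop never appears in the workflow definition: the engine expands the combinations, can run them in parallel, and regroups the results.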
Learn more about Pydra
- SciPy 2020 Proceedings
- PyCon 2020 Poster
- Explore Pydra interactively (the tutorial can also be run using the Binder service)
Please note that mybinder times out after an hour.
Installation
pip install pydra
Note that installation fails with older versions of pip on Windows. Upgrade pip before installing:
pip install --upgrade pip
pip install pydra
Pydra requires Python 3.7+. To install in developer mode:
git clone git@github.com:nipype/pydra.git
cd pydra
pip install -e ".[dev]"
To run Pydra's tests locally:
pytest -vs pydra
If you want to test execution with Dask:
git clone git@github.com:nipype/pydra.git
cd pydra
pip install -e ".[dask]"
It is also useful to install pre-commit:
pip install pre-commit
pre-commit