Unified Distributed Execution

What is unidist?

unidist is a framework that provides a unified API for distributed execution on top of several performant execution backends. At the moment the following backends are supported under the hood: Ray, MPI, Dask and Python Multiprocessing.

unidist is designed to work in a task-based parallel model.

The framework also provides a sequential Python backend that can be used for debugging.
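To illustrate what a task-based parallel model looks like in general, here is a minimal sketch that uses only the standard library's `concurrent.futures`, not unidist itself. Each call to `submit` creates an independent task and returns a future, and the results are collected afterwards. The names `square` and `run_tasks` are hypothetical and exist only for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    """The unit of work: one task computes one result."""
    return x * x


def run_tasks(values, max_workers=4):
    """Submit one task per value and collect the results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each submit() schedules a task and returns a future immediately.
        futures = [pool.submit(square, v) for v in values]
        # result() blocks until the corresponding task has finished.
        return [f.result() for f in futures]


results = run_tasks(range(5))
print(results)
```

The same shape (submit tasks, hold references, fetch results later) is what a task-based framework generalizes across processes and machines.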

Installation

From PyPI

unidist can be installed with pip on Linux, Windows and macOS:

pip install unidist # Install unidist with dependencies for Multiprocessing and sequential Python backends

unidist can also be used with the Dask, MPI or Ray execution backends. If you don't have Dask, MPI or Ray installed, install unidist with one of the following targets:

pip install unidist[all] # Install unidist with dependencies for all the backends
pip install unidist[dask] # Install unidist with dependencies for Dask backend
pip install unidist[mpi] # Install unidist with dependencies for MPI backend
pip install unidist[ray] # Install unidist with dependencies for Ray backend

unidist automatically detects which execution backends are installed and uses one of them for scheduling computation.
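One way such detection could work is to check which backend packages are importable. The sketch below is an assumption for illustration only and is not unidist's actual detection logic. The helper `installed_backends` and the `BACKEND_PACKAGES` mapping are hypothetical, and `mpi4py` is assumed to be the Python package behind the MPI backend.

```python
from importlib.util import find_spec

# Hypothetical mapping from backend name to the top-level package it needs.
BACKEND_PACKAGES = {"ray": "ray", "dask": "dask", "mpi": "mpi4py"}


def installed_backends(packages=BACKEND_PACKAGES):
    """Return the backend names whose package can be found, without importing it."""
    found = []
    for backend, module_name in packages.items():
        try:
            if find_spec(module_name) is not None:
                found.append(backend)
        except (ImportError, ValueError):
            # Treat a broken or half-installed package as not available.
            pass
    return found


print(installed_backends())
```

The printed list depends on the environment, so no expected output is shown here.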

From conda-forge

To install unidist with dependencies for the Dask and MPI execution backends into a conda environment, use the following command:

conda install unidist-dask unidist-mpi -c conda-forge

The full set of backends can be installed into a conda environment with:

conda install unidist-all -c conda-forge

or explicitly:

conda install unidist-dask unidist-mpi unidist-ray -c conda-forge

For more information, refer to the Installation section.

Choosing an execution backend

There are several ways to choose the execution backend for distributed computation. First, the recommended way is to use the unidist CLI options:

# Running the script with unidist on Ray backend
$ unidist script.py --backend ray
# Running the script with unidist on Dask backend
$ unidist script.py --backend dask

Second, set the UNIDIST_BACKEND environment variable:

# unidist will use Ray backend to distribute computations
export UNIDIST_BACKEND=ray
# unidist will use Dask backend to distribute computations
export UNIDIST_BACKEND=dask

Third, use the config API directly in your script:

import unidist.config as cfg

# Choose one of the following:
cfg.Backend.put("ray")  # unidist will use Ray backend to distribute computations
# cfg.Backend.put("dask")  # unidist will use Dask backend to distribute computations

The default execution backend for unidist is Ray.

Usage

unidist provides a CLI for running Python programs such as the following script:

import unidist
unidist.init() # Ray backend is used by default

@unidist.remote
def foo(x):
    return x * x

# This will run `foo` on a pool of workers in parallel;
# `refs` will contain object references to actual data
refs = [foo.remote(i) for i in range(5)]
# To get the data call `unidist.get(...)`
print(unidist.get(refs))  # prints [0, 1, 4, 9, 16]

For more examples, refer to the Getting Started section in our documentation.

Full Documentation

Visit the complete documentation on readthedocs: https://unidist.readthedocs.io.
