Skip to content

Commit

Permalink
DOCS-modin-project#88: Update main/getting_started pages of rst docs (m…
Browse files Browse the repository at this point in the history
…odin-project#93)

Signed-off-by: Alexey Prutskov <alexey.prutskov@intel.com>
  • Loading branch information
prutskov committed Jan 27, 2022
1 parent f435c9a commit 3cf5720
Show file tree
Hide file tree
Showing 2 changed files with 113 additions and 39 deletions.
86 changes: 61 additions & 25 deletions docs/getting_started.rst
Original file line number Diff line number Diff line change
Expand Up @@ -6,59 +6,95 @@
Getting Started
"""""""""""""""

Using Unidist
=============
unidist provides :doc:`the high-level API</flow/unidist/api>` to make distributed applications. To tune
unidist's behavior the user has several methods described in :doc:`unidist configuration settings</flow/unidist/config>`
section.

Using unidist API
=================

The example below shows how to use unidist API to make parallel execution for
functions (tasks) and classes (actors).

.. code-block:: python
# script.py
if __name__ == "__main__":
import unidist.config as cfg
import unidist
# Initialize unidist's backend. The Ray backend is used by default.
unidist.init()
# Apply decorator to make `f` remote function.
# Apply decorator to make `square` a remote function.
@unidist.remote
def f(x):
def square(x):
return x * x
# Asynchronously execute remote function.
refs = [f.remote(i) for i in range(4)]
# Get materialized data.
print(unidist.get(refs)) # [0, 1, 4, 9]
square_refs = [square.remote(i) for i in range(4)]
# Apply decorator to make `Counter` actor class.
@unidist.remote
class Counter:
class Cube:
def __init__(self):
self.n = 0
self.volume = None
def increment(self):
self.n += 1
def compute_volume(self, square):
self.volume = square ** 1.5
def read(self):
return self.n
return self.volume
# Create instances of the actor class.
counters = [Counter.remote() for i in range(4)]
cubes = [Cube.remote() for _ in range(len(square_refs))]
# Asynchronously execute methods of the actor class.
[c.increment.remote() for c in counters]
refs = [c.read.remote() for c in counters]
[cube.compute_volume.remote(square) for cube, square in zip(cubes, square_refs)]
cube_refs = [cube.read.remote() for cube in cubes]
# Get materialized data.
print(unidist.get(refs)) # [1, 1, 1, 1]
# Get materialized results.
print(unidist.get(square_refs)) # [0, 1, 4, 9]
print(unidist.get(cube_refs)) # [0.0, 1.0, 8.0, 27.0]
Choosing unidist's backend
===========================

If you want to choose a specific unidist's backend to run on, you can set the environment variable
``UNIDIST_BACKEND``. unidist will perform execution with that backend:
There are several ways to choose an execution backend for distributed computations.
First, the recommended way is to use :doc:`unidist CLI </using_cli>` options:

.. code-block:: bash
# Running the script with unidist on Ray backend
$ unidist script.py --backend ray
# Running the script with unidist on Dask backend
$ unidist script.py --backend dask
Second, setting the environment variable:

.. code-block:: bash
# unidist will use Ray backend to distribute computations
export UNIDIST_BACKEND=ray
# unidist will use Dask backend to distribute computations
export UNIDIST_BACKEND=dask
Third, using :doc:`config API </flow/unidist/config>` directly in your script:

.. code-block:: python
import unidist.config as cfg
cfg.Backend.put("ray") # unidist will use Ray backend to distribute computations
import unidist.config as cfg
cfg.Backend.put("dask") # unidist will use Dask backend to distribute computations
Running unidist application
===========================

To run the script mentioned above the unidist CLI should be used:

.. code-block:: bash
export UNIDIST_BACKEND=Ray # unidist will use Ray
export UNIDIST_BACKEND=Dask # unidist will use Dask
export UNIDIST_BACKEND=MPI # unidist will use MPI
export UNIDIST_BACKEND=MultiProcessing # unidist will use MultiProcessing
export UNIDIST_BACKEND=Python # unidist will use Python
# Running the script in a single node with `Ray` backend on `4` workers:
$ unidist script.py -num_cpus 4
To find more options for running refer to :doc:`unidist CLI </using_cli>` documentation page.
66 changes: 52 additions & 14 deletions docs/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -3,22 +3,64 @@
SPDX-License-Identifier: Apache-2.0

Welcome to Unidist's documentation!
===================================
What is unidist?
""""""""""""""""

The framework is intended to provide the unified API for distributed execution
by supporting various distributed task schedulers. At the moment the following schedulers
are supported under the hood:
unidist (`Unified Distributed Execution`) is a framework that is intended to provide the unified API for distributed
execution by supporting various performant execution backends. At the moment the following backends are supported under the hood:

* `Ray`_
* `MPI`_
* `Dask Distributed`_
* `Python Multiprocessing`_
* `MPI`_

Also, the framework provides a sequential :doc:`Python backend <flow/unidist/core/backends/python/backend>`,
that can be used for debug purposes.
that can be used for debugging.

unidist is designed to work in a `task-based parallel model`_. The framework mimics `Ray`_ API and expands the existing frameworks
(`Ray`_ and `Dask Distributed`_) with additional features.

Installation
============

unidist can be installed from sources using ``pip``:

.. code-block:: bash
# Dependencies for all the execution backends will be installed
$ pip install git+https://github.com/modin-project/unidist#egg=unidist[all]
Usage
=====

The example below describes squaring the numbers from a list using unidist:

.. code-block:: python
# script.py
if __name__ == "__main__":
import unidist
unidist.init() # Initialize unidist's backend.
@unidist.remote # Apply a decorator to make `foo` a remote function.
def foo(x):
return x * x
# This will run `foo` on a pool of workers in parallel;
# `refs` will contain object references to actual data.
refs = [foo.remote(i) for i in range(4)]
# Get materialized data.
print(unidist.get(refs)) # [0, 1, 4, 9]
To run the `script.py` use :doc:`unidist CLI </using_cli>`:

.. code-block:: bash
# Running the script in a single node with `mpi` backend on `4` workers:
$ unidist script.py --backend mpi --num_cpus 4
The unidist is designed to work in a `task-based parallel model`_.
.. toctree::
:hidden:
Expand All @@ -28,13 +70,9 @@ The unidist is designed to work in a `task-based parallel model`_.
developer/architecture
developer/contributing

To get started with unidist refer to the getting started page.

* :doc:`Getting Started <getting_started>`

To dipe dive into unidist internals refer to the framework architecture.
To get started with unidist refer to the :doc:`getting started <getting_started>` page.

* :doc:`unidist Architecture </developer/architecture>`
To deep dive into unidist internals refer to :doc:`the framework architecture </developer/architecture>`.

.. _`Ray`: https://docs.ray.io/en/master/index.html
.. _`Dask Distributed`: https://distributed.dask.org/en/latest/
Expand Down

0 comments on commit 3cf5720

Please sign in to comment.