DOCS-modin-project#88: Update main/getting_started pages of rst docs (m…

…odin-project#93) Signed-off-by: Alexey Prutskov <alexey.prutskov@intel.com>
no-ponomarev · Jan 27, 2022 · 3cf5720 · 3cf5720
1 parent f435c9a
commit 3cf5720
Show file tree

Hide file tree

Showing 2 changed files with 113 additions and 39 deletions.
diff --git a/docs/getting_started.rst b/docs/getting_started.rst
@@ -6,59 +6,95 @@
 Getting Started
 """""""""""""""
 
-Using Unidist
-=============
+unidist provides :doc:`the high-level API</flow/unidist/api>` to make distributed applications. To tune
+unidist's behavior the user has several methods described in :doc:`unidist configuration settings</flow/unidist/config>`
+section.
+
+Using unidist API
+=================
+
+The example below shows how to use unidist API to make parallel execution for
+functions (tasks) and classes (actors).
 
 .. code-block:: python
 
+   # script.py
    if __name__ == "__main__":
+      import unidist.config as cfg
       import unidist
 
       # Initialize unidist's backend. The Ray backend is used by default.
       unidist.init()
 
-      # Apply decorator to make `f` remote function.
+      # Apply decorator to make `square` a remote function.
       @unidist.remote
-      def f(x):
+      def square(x):
          return x * x
 
       # Asynchronously execute remote function.
-      refs = [f.remote(i) for i in range(4)]
-
-      # Get materialized data.
-      print(unidist.get(refs)) # [0, 1, 4, 9]
+      square_refs = [square.remote(i) for i in range(4)]
 
       # Apply decorator to make `Counter` actor class.
       @unidist.remote
-      class Counter:
+      class Cube:
          def __init__(self):
-               self.n = 0
+               self.volume = None
 
-         def increment(self):
-               self.n += 1
+         def compute_volume(self, square):
+               self.volume = square ** 1.5
 
          def read(self):
-               return self.n
+               return self.volume
 
       # Create instances of the actor class.
-      counters = [Counter.remote() for i in range(4)]
+      cubes = [Cube.remote() for _ in range(len(square_refs))]
       # Asynchronously execute methods of the actor class.
-      [c.increment.remote() for c in counters]
-      refs = [c.read.remote() for c in counters]
+      [cube.compute_volume.remote(square) for cube, square in zip(cubes, square_refs)]
+      cube_refs = [cube.read.remote() for cube in cubes]
 
-      # Get materialized data.
-      print(unidist.get(refs)) # [1, 1, 1, 1]
+      # Get materialized results.
+      print(unidist.get(square_refs)) # [0, 1, 4, 9]
+      print(unidist.get(cube_refs)) # [0.0, 1.0, 8.0, 27.0]
 
 Choosing unidist's backend
 ===========================
 
-If you want to choose a specific unidist's backend to run on, you can set the environment variable
-``UNIDIST_BACKEND``. unidist will perform execution with that backend:
+There are several ways to choose an execution backend for distributed computations.
+First, the recommended way is to use :doc:`unidist CLI </using_cli>` options:
+
+.. code-block:: bash
+
+    # Running the script with unidist on Ray backend
+    $ unidist script.py --backend ray
+    # Running the script with unidist on Dask backend
+    $ unidist script.py --backend dask
+
+Second, setting the environment variable:
+
+.. code-block:: bash
+
+    # unidist will use Ray backend to distribute computations
+    export UNIDIST_BACKEND=ray
+    # unidist will use Dask backend to distribute computations
+    export UNIDIST_BACKEND=dask
+
+Third, using :doc:`config API </flow/unidist/config>` directly in your script:
+
+.. code-block:: python
+
+    import unidist.config as cfg
+    cfg.Backend.put("ray") # unidist will use Ray backend to distribute computations
+    import unidist.config as cfg
+    cfg.Backend.put("dask") # unidist will use Dask backend to distribute computations
+
+Running unidist application
+===========================
+
+To run the script mentioned above the unidist CLI should be used:
 
 .. code-block:: bash
 
-   export UNIDIST_BACKEND=Ray  # unidist will use Ray
-   export UNIDIST_BACKEND=Dask  # unidist will use Dask
-   export UNIDIST_BACKEND=MPI  # unidist will use MPI
-   export UNIDIST_BACKEND=MultiProcessing  # unidist will use MultiProcessing
-   export UNIDIST_BACKEND=Python  # unidist will use Python
+    # Running the script in a single node with `Ray` backend on `4` workers:
+    $ unidist script.py -num_cpus 4
+
+To find more options for running refer to :doc:`unidist CLI </using_cli>` documentation page.
diff --git a/docs/index.rst b/docs/index.rst
@@ -3,22 +3,64 @@
 
       SPDX-License-Identifier: Apache-2.0
 
-Welcome to Unidist's documentation!
-===================================
+What is unidist?
+""""""""""""""""
 
-The framework is intended to provide the unified API for distributed execution
-by supporting various distributed task schedulers. At the moment the following schedulers
-are supported under the hood:
+unidist (`Unified Distributed Execution`) is a framework that is intended to provide the unified API for distributed
+execution by supporting various performant execution backends. At the moment the following backends are supported under the hood:
 
 * `Ray`_
+* `MPI`_
 * `Dask Distributed`_
 * `Python Multiprocessing`_
-* `MPI`_
 
 Also, the framework provides a sequential :doc:`Python backend <flow/unidist/core/backends/python/backend>`,
-that can be used for debug purposes.
+that can be used for debugging.
+
+unidist is designed to work in a `task-based parallel model`_. The framework mimics `Ray`_ API and expands the existing frameworks
+(`Ray`_ and `Dask Distributed`_) with additional features.
+
+Installation
+============
+
+unidist can be installed from sources using ``pip``:
+
+.. code-block:: bash
+
+   # Dependencies for all the execution backends will be installed
+   $ pip install git+https://github.com/modin-project/unidist#egg=unidist[all]
+
+Usage
+=====
+
+The example below describes squaring the numbers from a list using unidist:
+
+.. code-block:: python
+
+   # script.py
+   if __name__ == "__main__":
+      import unidist
+
+      unidist.init() # Initialize unidist's backend.
+
+      @unidist.remote # Apply a decorator to make `foo` a remote function.
+      def foo(x):
+         return x * x
+
+      # This will run `foo` on a pool of workers in parallel;
+      # `refs` will contain object references to actual data.
+      refs = [foo.remote(i) for i in range(4)]
+
+      # Get materialized data.
+      print(unidist.get(refs)) # [0, 1, 4, 9]
+
+To run the `script.py` use :doc:`unidist CLI </using_cli>`:
+
+.. code-block:: bash
+
+    # Running the script in a single node with `mpi` backend on `4` workers:
+    $ unidist script.py --backend mpi --num_cpus 4
 
-The unidist is designed to work in a `task-based parallel model`_.
 
 .. toctree::
    :hidden:
@@ -28,13 +70,9 @@ The unidist is designed to work in a `task-based parallel model`_.
    developer/architecture
    developer/contributing
 
-To get started with unidist refer to the getting started page.
-
-* :doc:`Getting Started <getting_started>`
-
-To dipe dive into unidist internals refer to the framework architecture.
+To get started with unidist refer to the :doc:`getting started <getting_started>` page.
 
-* :doc:`unidist Architecture </developer/architecture>`
+To deep dive into unidist internals refer to :doc:`the framework architecture </developer/architecture>`.
 
 .. _`Ray`: https://docs.ray.io/en/master/index.html
 .. _`Dask Distributed`: https://distributed.dask.org/en/latest/