Merge pull request #1006 from Libensemble/docs/fix_executor_docs
Docs/fix executor docs
shuds13 committed May 24, 2023
2 parents a770bb8 + 4403199 commit 7b2a74e
Showing 24 changed files with 189 additions and 143 deletions.
3 changes: 3 additions & 0 deletions docs/_static/my_theme.css
@@ -0,0 +1,3 @@
.wy-nav-content {
max-width: 860px !important;
}
8 changes: 4 additions & 4 deletions docs/advanced_installation.rst
Expand Up @@ -51,7 +51,7 @@ the following line::

will use the ``mpicc`` compiler wrapper on your PATH to identify the MPI library.
To specify a different compiler wrapper, add the ``MPICC`` option.
You also may wish to avoid existing binary builds e.g.::
You may also wish to avoid existing binary builds with::

MPICC=mpiicc pip install mpi4py --no-binary mpi4py

Expand Down Expand Up @@ -115,7 +115,7 @@ The above command will install the latest release of libEnsemble with
the required dependencies only. There are other optional
dependencies that can be specified through variants. The following
line installs libEnsemble version 0.7.2 with some common variants
(e.g.~ using :doc:`APOSMM<../examples/aposmm>`):
(e.g., using :doc:`APOSMM<../examples/aposmm>`):

.. code-block:: bash
Expand All @@ -127,7 +127,7 @@ The list of variants can be found by running::

On some platforms you may wish to run libEnsemble without ``mpi4py``,
using a serial PETSc build. This is often preferable if running on
the launch nodes of a three-tier system (e.g. Theta/Summit)::
the launch nodes of a three-tier system (e.g., Theta/Summit)::

spack install py-libensemble +scipy +mpmath +petsc4py ^py-petsc4py~mpi ^petsc~mpi~hdf5~hypre~superlu-dist

Expand Down Expand Up @@ -174,7 +174,7 @@ for specific systems, see the spack_libe_ repository. In particular, this
includes some example ``packages.yaml`` files (which go in ``~/.spack/``).
These files are used to specify dependencies that Spack must obtain from
the given system (rather than building from scratch). This may include
``Python`` and the packages distributed with it (e.g. ``numpy``), and will
``Python`` and the packages distributed with it (e.g., ``numpy``), and will
often include the system MPI library.

.. _GitHub: https://github.com/Libensemble/libensemble
Expand Down
2 changes: 1 addition & 1 deletion docs/conf.py
Expand Up @@ -200,7 +200,7 @@ def __getattr__(cls, name):
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
# html_static_path = ["_static"]
html_static_path = ["_static"]
# html_static_path = []


Expand Down
3 changes: 0 additions & 3 deletions docs/data_structures/libE_specs.rst
Expand Up @@ -50,7 +50,6 @@ the ``LibeSpecs`` class. When provided as a Python class, options are validated
"disable_log_files" [bool] = ``False``:
Disable the creation of ``"ensemble.log"`` and ``"libE_stats.txt"``.
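
For example, a minimal sketch of setting this option through the ``LibeSpecs``
class described above (the import path is an assumption)::

    from libensemble.specs import LibeSpecs

    # Validated options object; suppresses ensemble.log and libE_stats.txt
    libE_specs = LibeSpecs(disable_log_files=True)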


.. tab-item:: Directories

.. tab-set::
Expand Down Expand Up @@ -148,7 +147,6 @@ the ``LibeSpecs`` class. When provided as a Python class, options are validated
A dictionary of options for formatting ``"libE_stats.txt"``.
See "Formatting Options for libE_stats File" for more options.


.. tab-item:: TCP

"workers" [list]:
Expand Down Expand Up @@ -265,7 +263,6 @@ Platform Fields
:member-order:
:model-show-field-summary: False


Scheduler Options
-----------------

Expand Down
Expand Up @@ -16,7 +16,7 @@ Automatic PR
Note that once libEnsemble has been released on PyPI, a conda-forge bot will
usually detect the new release and automatically create a pull request with the
changes below. It may take a few hours for this to happen. If no other changes
are required (e.g. new dependencies), then you can simply wait for the tests to
are required (e.g., new dependencies), then you can simply wait for the tests to
pass and merge.

Manual PR
Expand Down
22 changes: 11 additions & 11 deletions docs/executor/executor.rst
Expand Up @@ -8,23 +8,23 @@ See this :doc:`example<overview>` for usage.

See the Executor APIs for optional arguments.

.. toctree::
:maxdepth: 1
:caption: Alternative Executors:

mpi_executor
balsam_2_executor
.. Commented out as it creates a duplicate menu on the index page.
.. .. toctree::
.. :maxdepth: 1
.. :caption: Alternative Executors:
..
.. mpi_executor
.. balsam_2_executor
Executor Class
---------------

Only create an object of this class for running local serial-launched applications.
To run MPI applications and use detected resources, use an alternative Executor
class, as shown above.
To run MPI applications and use detected resources, use the :doc:`MPIExecutor<../executor/mpi_executor>`.
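
As a hedged sketch only (argument names mirror those in the :doc:`example<overview>`
and should be treated as assumptions), a local serial-launched application might
be run as follows::

    from libensemble.executors.executor import Executor

    # In the calling script: register the serial application
    exctr = Executor()
    exctr.register_app(full_path="/path/to/serial_app", app_name="serial1")

    # In the user sim function: launch and wait for completion
    task = exctr.submit(app_name="serial1", app_args="input.txt")
    task.wait()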

.. autoclass:: libensemble.executors.executor.Executor
:members:
:exclude-members: serial_setup, sim_default_app, gen_default_app, get_app, default_app, set_resources, get_task, set_workerID, set_worker_info, new_tasks_timing
:exclude-members: serial_setup, sim_default_app, gen_default_app, get_app, default_app, set_resources, get_task, set_workerID, set_worker_info, new_tasks_timing, add_platform_info, set_gen_procs_gpus, kill, poll

.. automethod:: __init__

Expand All @@ -34,8 +34,8 @@ Task Class
----------

Tasks are created and returned through the Executor ``submit()`` function. Tasks
can be polled and killed with the respective poll and kill functions. Task
information can be queried through the task attributes below and the query
can be polled, killed, or waited on with the respective poll, kill, and wait functions.
Task information can be queried through the task attributes below and the query
functions.
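
As an illustrative sketch (attribute and method names are those documented
below; the exact usage shown is an assumption)::

    task = exctr.submit(app_name="sim1", app_args="input.txt")

    task.poll()                # update the task's current state
    if task.finished:
        print(task.state)      # e.g., FINISHED, FAILED, or USER_KILLED
        print(task.errcode)    # application's return code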

.. autoclass:: libensemble.executors.executor.Task
Expand Down
8 changes: 5 additions & 3 deletions docs/executor/mpi_executor.rst
Expand Up @@ -4,12 +4,14 @@ MPI Executor - MPI apps
.. automodule:: mpi_executor
:no-undoc-members:

See this :doc:`example<overview>` for usage.

.. autoclass:: libensemble.executors.mpi_executor.MPIExecutor
:show-inheritance:
:inherited-members:
:exclude-members: serial_setup, sim_default_app, gen_default_app, get_app, default_app, set_resources, get_task, set_workerID, set_worker_info, new_tasks_timing
:exclude-members: serial_setup, sim_default_app, gen_default_app, get_app, default_app, set_resources, get_task, set_workerID, set_worker_info, new_tasks_timing, add_platform_info, set_gen_procs_gpus, kill, poll

.. automethod:: __init__
.. .. automethod:: __init__
.. :member-order: bysource
.. :members: __init__, register_app, submit, manager_poll
Expand All @@ -26,7 +28,7 @@ be implemented in other executors.
:fail_time: (int or float) *Only if wait_on_start is set.* Maximum run time to failure in
seconds that results in relaunch. *Default: 2*.
:retry_delay_incr: (int or float) Delay increment between launch attempts in seconds.
*Default: 5*. (E.g. First retry after 5 seconds, then 10 seconds, then 15, etc...)
*Default: 5*. (i.e., first retry after 5 seconds, then 10 seconds, then 15, etc.)

For example, to increase resilience against submission failures::

Expand Down
132 changes: 76 additions & 56 deletions docs/executor/overview.rst
Expand Up @@ -5,53 +5,74 @@ Most computationally expensive libEnsemble workflows involve launching applicati
from a :ref:`sim_f<api_sim_f>` or :ref:`gen_f<api_gen_f>` running on a worker to the
compute nodes of a supercomputer, cluster, or other compute resource.

An **Executor** interface is provided by libEnsemble to remove the burden of
system interaction from the user and improve workflow portability. Users first register
their applications to Executor instances, which then return corresponding ``Task``
objects upon submission within user functions.

**Task** attributes and retrieval functions can be queried to determine
the status of running application instances. Functions are also provided to access
and interrogate files in the task's working directory.

libEnsemble's Executors and Tasks contain many familiar features and methods to
Python's native `concurrent futures`_ interface. Executors feature the ``submit()``
function for launching apps (detailed below), but currently do not support
``map()`` or ``shutdown()``. Tasks are much like ``futures``, except they correspond
to an application instance instead of a callable. They feature the ``cancel()``,
``cancelled()``, ``running()``, ``done()``, ``result()``, and ``exception()`` functions
from the standard.

The main ``Executor`` class is an abstract class, inherited by the ``MPIExecutor``
for direct running of MPI applications, and the ``BalsamExecutor``
for submitting MPI run requests from a worker running on a compute node to the
Balsam service. This second approach is suitable for
systems that don't allow submitting MPI applications from compute nodes.

Typically, users choose and parameterize their ``Executor`` objects in their
calling scripts, where each executable generator or simulation application is
registered to it. If an alternative Executor like Balsam is used, then the applications can be
registered as in the example below. Once in the user-side worker code (sim/gen func),
the Executor can be retrieved without any need to specify the type.

Once the Executor is retrieved, tasks can be submitted by specifying the ``app_name``
from registration in the calling script alongside other optional parameters
described in the API.

**Example usage:**
The **Executor** provides a portable interface for running applications on any system.

.. dropdown:: Detailed description

An **Executor** interface is provided by libEnsemble to remove the burden
of system interaction from the user and improve workflow portability. Users
first register their applications to Executor instances, which then return
corresponding ``Task`` objects upon submission within user functions.

**Task** attributes and retrieval functions can be queried to determine
the status of running application instances. Functions are also provided
to access and interrogate files in the task's working directory.

libEnsemble's Executors and Tasks contain many familiar features and methods
to Python's native `concurrent futures`_ interface. Executors feature the
``submit()`` function for launching apps (detailed below), but currently do
not support ``map()`` or ``shutdown()``. Tasks are much like ``futures``.
They feature the ``cancel()``, ``cancelled()``, ``running()``, ``done()``,
``result()``, and ``exception()`` functions from the standard.
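
A brief hedged sketch using the futures-style calls named above (the
submission arguments are assumptions)::

    task = exctr.submit(app_name="sim1", num_procs=8)

    if not task.done():    # futures-style status check
        task.cancel()      # futures-style request to kill the application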

The main ``Executor`` class launches serial applications in place as subprocesses,
while the ``MPIExecutor`` is used for running MPI applications, and the
``BalsamExecutor`` for submitting MPI run requests from a worker running on
a compute node to the Balsam service. This second approach is suitable for
systems that don't allow submitting MPI applications from compute nodes.

Typically, users choose and parameterize their ``Executor`` objects in their
calling scripts, where each executable generator or simulation application is
registered to it. If an alternative Executor like Balsam is used, then the
applications can be registered as in the example below. Once in the user-side
worker code (sim/gen func), the Executor can be retrieved without any need to
specify the type.

Once the Executor is retrieved, tasks can be submitted by specifying the
``app_name`` from registration in the calling script alongside other optional
parameters described in the API.

Basic usage
-----------

In calling script::

sim_app = "/path/to/my/exe"

from libensemble.executors.mpi_executor import MPIExecutor

exctr = MPIExecutor()
exctr.register_app(full_path="/path/to/my/exe", app_name="sim1")

Note that the Executor in the calling script does **not** have to be passed to
``libE()``. It can be extracted via *Executor.executor* in the sim function
(regardless of type).

In user simulation function::

exctr.register_app(full_path=sim_app, app_name="sim1")
from libensemble.executors import Executor

Note that Executor instances in the calling script are also stored as class attributes, and
do **not** have to be passed to ``libE()``. They can be extracted via *Executor.executor*
in the sim function (regardless of type).
# Will return Executor (whether MPI or inherited such as Balsam).
exctr = Executor.executor

task = exctr.submit(app_name="sim1", num_procs=8, app_args="input.txt",
stdout="out.txt", stderr="err.txt")

# Wait for task to complete
task.wait()
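
Once ``wait()`` returns, a hedged sketch of inspecting the result
(``read_stdout`` is one of the file-access helpers; the state string is an
assumption)::

    if task.state == "FINISHED":
        print(task.read_stdout())    # contents of out.txt
    else:
        print(task.state)            # e.g., FAILED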

Advanced Features
-----------------

**Example of polling output and killing an application:**

In user simulation function::

Expand Down Expand Up @@ -89,21 +110,20 @@ In user simulation function::

print(task.state) # state may be finished/failed/killed

Executor instances can also be retrieved using Python's ``with`` context switching statement,
although this is effectively syntactical sugar to above::

from libensemble.executors import Executor

with Executor.executor as exctr:
task = exctr.submit(app_name="sim1", num_procs=8, app_args="input.txt",
stdout="out.txt", stderr="err.txt")

...

Users primarily concerned with running their tasks to completion without intermediate
evaluation don't necessarily need to construct a polling loop like above, but can
instead use an ``Executor`` instance's ``polling_loop()`` method. An alternative
to the above simulation function may resemble::
.. The Executor can also be retrieved using Python's ``with`` context manager statement,
.. although this is effectively syntactic sugar for the above::
..
.. from libensemble.executors import Executor
..
.. with Executor.executor as exctr:
.. task = exctr.submit(app_name="sim1", num_procs=8, app_args="input.txt",
.. stdout="out.txt", stderr="err.txt")
.. ...
Users who wish to poll only for manager kill signals and timeouts don't necessarily
need to construct a polling loop like above, but can instead use the ``Executor``'s
built-in ``polling_loop()`` method. An alternative to the above simulation function
may resemble::

from libensemble.executors import Executor

Expand Down Expand Up @@ -134,7 +154,7 @@ Or put *yet another way*::

See the :doc:`executor<executor>` interface for the complete API.

For a more realistic example see
For a complete example use case, see
the :doc:`Electrostatic Forces example <../tutorials/executor_forces_tutorial>`,
which launches the ``forces.x`` application as an MPI task.

Expand Down
2 changes: 1 addition & 1 deletion docs/function_guides/allocator.rst
Expand Up @@ -110,7 +110,7 @@ allocation function and detect impending timeouts, then pack up cleanup work req
or mark points for cancellation.

The remaining values above are useful for efficient filtering of H values
(e.g. ``sim_ended_count``), saves a filtering an entire column of H.
(e.g., ``sim_ended_count``), which saves filtering an entire column of H.
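
For illustration, a hedged sketch of the column scan such counts avoid
(``sim_ended`` is a boolean field of H; the rest is an assumption)::

    import numpy as np

    # Without the supplied count, completed simulations must be found by
    # scanning an entire column of H:
    num_ended = np.count_nonzero(H["sim_ended"])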

.. note:: An error occurs when the ``alloc_f`` returns nothing while
all workers are idle.
Expand Down
20 changes: 12 additions & 8 deletions docs/platforms/srun.rst
Expand Up @@ -9,7 +9,7 @@ default this is done by :ref:`reading an environment variable<resource_detection

Example SLURM submission scripts for various systems are given in the
:doc:`examples<example_scripts>`. Further examples are given in some of the specific
platform guides (e.g. :doc:`Perlmutter guide<perlmutter>`)
platform guides (e.g., the :doc:`Perlmutter guide<perlmutter>`).

By default, the :doc:`MPIExecutor<../executor/mpi_executor>` uses ``mpirun``
in preference to ``srun``, as it works better in some cases. If ``mpirun`` does
Expand All @@ -35,8 +35,8 @@ It is recommended to add these to submission scripts to prevent resource conflic
export SLURM_MEM_PER_NODE=0

Alternatively, the ``--exact`` `option to srun`_, along with other relevant options
can be given on any ``srun`` lines (including the ``MPIExecutor`` submission lines
via the ``extra_args`` option).
can be given on any ``srun`` lines, including the ``MPIExecutor`` submission lines
via the ``extra_args`` option (from version 0.10.0, these are added automatically).
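
For earlier versions, a hedged sketch of passing such options explicitly
(the option string is an assumption to match to the system)::

    task = exctr.submit(app_name="sim1", num_procs=4, extra_args="--exact")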

Secondly, while many configurations are possible, it is recommended to **avoid** using
``#SBATCH`` commands that may limit resources to srun job steps such as::
Expand All @@ -50,21 +50,25 @@ Instead provide these to sub-tasks via the ``extra_args`` option to the
**GTL_DEBUG: [0] cudaHostRegister: no CUDA-capable device is detected**

If using the environment variable ``MPICH_GPU_SUPPORT_ENABLED``, then ``srun`` commands may
expect an option for allocating GPUs (e.g.~ ``--gpus-per-task=1`` would
expect an option for allocating GPUs (e.g., ``--gpus-per-task=1`` would
allocate one GPU to each MPI task of the MPI run). It is recommended that tasks submitted
via the :doc:`MPIExecutor<../executor/mpi_executor>` specify this in the ``extra_args``
option to the ``submit`` function (rather than using an ``#SBATCH`` command). This is needed
even when using setting ``CUDA_VISIBLE_DEVICES`` or other options.
option to the ``submit`` function (rather than using an ``#SBATCH`` command).
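
For instance, a hedged sketch of such a submission (the arguments are
assumptions to adapt to the application)::

    task = exctr.submit(app_name="sim1", num_procs=4,
                        extra_args="--gpus-per-task=1")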

If running the libEnsemble user calling script with ``srun``, then it is recommended that
``MPICH_GPU_SUPPORT_ENABLED`` is set in the user ``sim_f`` or ``gen_f`` function where
GPU runs will be submitted, instead of in the batch script. E.g::
GPU runs will be submitted, instead of in the batch script. For example::

os.environ["MPICH_GPU_SUPPORT_ENABLED"] = "1"

Note on Resource Binding
------------------------

.. note::
Update: From version 0.10.0, it is recommended that GPUs are assigned
automatically by libEnsemble. See the :doc:`forces_gpu<../tutorials/forces_gpu_tutorial>`
tutorial for an example.
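
A hedged sketch of what this looks like in a user function (the ``num_gpus``
argument name is an assumption based on the automatic-assignment feature
described above)::

    # GPU settings for the run are derived from the requested resources
    # rather than passed explicitly as srun options.
    task = exctr.submit(app_name="sim1", num_procs=4, num_gpus=4)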

Note that the use of ``CUDA_VISIBLE_DEVICES`` and other environment variables is often
a highly portable way of assigning specific GPUs to workers, and has been known to work
on some systems when other methods do not. See the libEnsemble test `test_persistent_sampling_CUDA_variable_resources.py`_ for an example of setting
Expand Down Expand Up @@ -98,4 +102,4 @@ Find SLURM partition configuration for a partition called "gpu"::
scontrol show partition gpu

.. _option to srun: https://docs.nersc.gov/systems/perlmutter/running-jobs/#single-gpu-tasks-in-parallel
.. _test_persistent_sampling_CUDA_variable_resources.py: https://github.com/Libensemble/libensemble/blob/develop/libensemble/tests/regression_tests/test_persistent_sampling_CUDA_variable_resources.py
.. _test_persistent_sampling_CUDA_variable_resources.py: https://github.com/Libensemble/libensemble/blob/develop/libensemble/tests/functionality_tests/test_persistent_sampling_CUDA_variable_resources.py
