
Commit 6adbf5f

New examples for the updated documentation (#495)
* new examples
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* Build notebooks as tests
* add executor bit
* extend notebook environment
* Update 3-hpc-allocation.ipynb
* Add key features
* update key arguments
* Work in progress for the readme
* Update readme
* add new lines
* Change Backend Names
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* Update __init__.py
* update readme
* Update installation
* Fix init
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* update local notebook
* update local example notebook
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* Explain jupyter kernel installation
* copy existing kernel
* Add HPC submission
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* execute HPC notebook once
* hpc allocation
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* Update documentation
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* replace HPC submission notebook

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent: d11a2d3 · commit: 6adbf5f


14 files changed: +2262 additions, -996 deletions


.ci_support/build_notebooks.sh

Lines changed: 11 additions & 0 deletions

```diff
@@ -0,0 +1,11 @@
+#!/bin/bash
+# execute notebooks
+i=0;
+for notebook in $(ls notebooks/*.ipynb); do
+    papermill ${notebook} ${notebook%.*}-out.${notebook##*.} -k python3 || i=$((i+1));
+done;
+
+# push error to next level
+if [ $i -gt 0 ]; then
+    exit 1;
+fi;
```
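As a side note, the output file name in the papermill call above is built with POSIX parameter expansion. A short illustration (not part of the commit), using `notebooks/1-local.ipynb` as a plausible input name based on the notebooks referenced in `docs/_toc.yml` below:

```bash
# illustration of the parameter expansion used in build_notebooks.sh
notebook="notebooks/1-local.ipynb"
echo "${notebook%.*}"   # notebooks/1-local -- the path with its extension stripped
echo "${notebook##*.}"  # ipynb             -- the extension alone
# papermill therefore writes the executed copy to notebooks/1-local-out.ipynb
```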

.github/workflows/notebooks.yml

Lines changed: 1 addition & 1 deletion

```diff
@@ -34,4 +34,4 @@ jobs:
       timeout-minutes: 5
       run: >
         flux start
-        papermill notebooks/examples.ipynb examples-out.ipynb -k "python3"
+        .ci_support/build_notebooks.sh
```
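With this change the CI step executes every notebook in `notebooks/` instead of the single `examples.ipynb`. Assuming flux-core and papermill are installed in the local environment, the same step can presumably be reproduced from the repository root with:

```bash
# run the same notebook build the CI performs, inside a flux session
flux start .ci_support/build_notebooks.sh
```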

README.md

Lines changed: 104 additions & 93 deletions
````diff
@@ -3,111 +3,122 @@
 [![Coverage Status](https://coveralls.io/repos/github/pyiron/executorlib/badge.svg?branch=main)](https://coveralls.io/github/pyiron/executorlib?branch=main)
 [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/pyiron/executorlib/HEAD?labpath=notebooks%2Fexamples.ipynb)
 
-## Challenges
-In high performance computing (HPC) the Python programming language is commonly used as high-level language to
-orchestrate the coupling of scientific applications. Still the efficient usage of highly parallel HPC clusters remains
-challenging, in primarily three aspects:
-
-* **Communication**: Distributing python function calls over hundreds of compute node and gathering the results on a
-  shared file system is technically possible, but highly inefficient. A socket-based communication approach is
-  preferable.
-* **Resource Management**: Assigning Python functions to GPUs or executing Python functions on multiple CPUs using the
-  message passing interface (MPI) requires major modifications to the python workflow.
-* **Integration**: Existing workflow libraries implement a secondary the job management on the Python level rather than
-  leveraging the existing infrastructure provided by the job scheduler of the HPC.
-
-### executorlib is ...
-In a given HPC allocation the `executorlib` library addresses these challenges by extending the Executor interface
-of the standard Python library to support the resource assignment in the HPC context. Computing resources can either be
-assigned on a per function call basis or as a block allocation on a per Executor basis. The `executorlib` library
-is built on top of the [flux-framework](https://flux-framework.org) to enable fine-grained resource assignment. In
-addition, [Simple Linux Utility for Resource Management (SLURM)](https://slurm.schedmd.com) is supported as alternative
-queuing system and for workstation installations `executorlib` can be installed without a job scheduler.
-
-### executorlib is not ...
-The executorlib library is not designed to request an allocation from the job scheduler of an HPC. Instead within a given
-allocation from the job scheduler the `executorlib` library can be employed to distribute a series of python
-function calls over the available computing resources to achieve maximum computing resource utilization.
-
-## Example
-The following examples illustrates how `executorlib` can be used to distribute a series of MPI parallel function calls
-within a queuing system allocation. `example.py`:
+Up-scale Python functions for high performance computing (HPC) with executorlib.
+
+## Key Features
+* **Up-scale your Python functions beyond a single computer.** - executorlib extends the [Executor interface](https://docs.python.org/3/library/concurrent.futures.html#executor-objects)
+  from the Python standard library and combines it with job schedulers for high performance computing (HPC) including
+  the [Simple Linux Utility for Resource Management (SLURM)](https://slurm.schedmd.com) and [flux](http://flux-framework.org).
+  With this combination executorlib allows users to distribute their Python functions over multiple compute nodes.
+* **Parallelize your Python program one function at a time** - executorlib allows users to assign dedicated computing
+  resources like CPU cores, threads or GPUs to one Python function call at a time. So you can accelerate your Python
+  code function by function.
+* **Permanent caching of intermediate results to accelerate rapid prototyping** - To accelerate the development of
+  machine learning pipelines and simulation workflows executorlib provides optional caching of intermediate results for
+  iterative development in interactive environments like Jupyter notebooks.
+
+## Examples
+The Python standard library provides the [Executor interface](https://docs.python.org/3/library/concurrent.futures.html#executor-objects)
+with the [ProcessPoolExecutor](https://docs.python.org/3/library/concurrent.futures.html#processpoolexecutor) and the
+[ThreadPoolExecutor](https://docs.python.org/3/library/concurrent.futures.html#threadpoolexecutor) for parallel
+execution of Python functions on a single computer. executorlib extends this functionality to distribute Python
+functions over multiple computers within a high performance computing (HPC) cluster. This can either be achieved by
+submitting each function as an individual job to the HPC job scheduler - [HPC Submission Mode]() - or by requesting a
+compute allocation of multiple nodes and then distributing the Python functions within this allocation - [HPC Allocation Mode]().
+Finally, to accelerate the development process executorlib also provides a [Local Mode]() to use the executorlib
+functionality on a single workstation for testing. Start with the [Local Mode]() by setting the backend parameter to
+local - `backend="local"`:
 ```python
-import flux.job
 from executorlib import Executor
 
+
+with Executor(backend="local") as exe:
+    future_lst = [exe.submit(sum, [i, i]) for i in range(1, 5)]
+    print([f.result() for f in future_lst])
+```
+In the same way executorlib can also execute Python functions which use additional computing resources, like multiple
+CPU cores, CPU threads or GPUs. For example, if the Python function internally uses the Message Passing Interface (MPI)
+via the [mpi4py](https://mpi4py.readthedocs.io) Python library:
+```python
+from executorlib import Executor
+
+
 def calc(i):
     from mpi4py import MPI
+
     size = MPI.COMM_WORLD.Get_size()
     rank = MPI.COMM_WORLD.Get_rank()
     return i, size, rank
 
-with flux.job.FluxExecutor() as flux_exe:
-    with Executor(max_cores=2, executor=flux_exe, resource_dict={"cores": 2}) as exe:
-        fs = exe.submit(calc, 3)
-        print(fs.result())
-```
-This example can be executed using:
-```
-python example.py
-```
-Which returns:
-```
->>> [(0, 2, 0), (0, 2, 1)], [(1, 2, 0), (1, 2, 1)]
-```
-The important part in this example is that [mpi4py](https://mpi4py.readthedocs.io) is only used in the `calc()`
-function, not in the python script, consequently it is not necessary to call the script with `mpiexec` but instead
-a call with the regular python interpreter is sufficient. This highlights how `executorlib` allows the users to
-parallelize one function at a time and not having to convert their whole workflow to use [mpi4py](https://mpi4py.readthedocs.io).
-The same code can also be executed inside a jupyter notebook directly which enables an interactive development process.
-
-The interface of the standard [concurrent.futures.Executor](https://docs.python.org/3/library/concurrent.futures.html#module-concurrent.futures)
-is extended by adding the option `cores_per_worker=2` to assign multiple MPI ranks to each function call. To create two
-workers the maximum number of cores can be increased to `max_cores=4`. In this case each worker receives two cores
-resulting in a total of four CPU cores being utilized.
-
-After submitting the function `calc()` with the corresponding parameter to the executor `exe.submit(calc, 0)`
-a python [`concurrent.futures.Future`](https://docs.python.org/3/library/concurrent.futures.html#future-objects) is
-returned. Consequently, the `executorlib.Executor` can be used as a drop-in replacement for the
-[`concurrent.futures.Executor`](https://docs.python.org/3/library/concurrent.futures.html#module-concurrent.futures)
-which allows the user to add parallelism to their workflow one function at a time.
-
-## Disclaimer
-While we try to develop a stable and reliable software library, the development remains a opensource project under the
-BSD 3-Clause License without any warranties::
+
+with Executor(backend="local") as exe:
+    fs = exe.submit(calc, 3, resource_dict={"cores": 2})
+    print(fs.result())
 ```
-BSD 3-Clause License
-
-Copyright (c) 2022, Jan Janssen
-All rights reserved.
-
-Redistribution and use in source and binary forms, with or without
-modification, are permitted provided that the following conditions are met:
-
-* Redistributions of source code must retain the above copyright notice, this
-  list of conditions and the following disclaimer.
-
-* Redistributions in binary form must reproduce the above copyright notice,
-  this list of conditions and the following disclaimer in the documentation
-  and/or other materials provided with the distribution.
-
-* Neither the name of the copyright holder nor the names of its
-  contributors may be used to endorse or promote products derived from
-  this software without specific prior written permission.
-
-THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
-AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
-IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
-DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
-FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
-DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
-SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
-CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
-OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+The additional `resource_dict` parameter defines the computing resources allocated to the execution of the submitted
+Python function. In addition to the compute cores `cores`, the resource dictionary can also define the threads per core
+as `threads_per_core`, the GPUs per core as `gpus_per_core`, the working directory with `cwd`, the option to use the
+OpenMPI oversubscribe feature with `openmpi_oversubscribe` and finally, for the [Simple Linux Utility for Resource
+Management (SLURM)](https://slurm.schedmd.com) queuing system, the option to provide additional command line arguments
+with the `slurm_cmd_args` parameter - [resource dictionary]().
+
````
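To see these keys side by side, here is a minimal sketch (not part of the commit) that combines the `resource_dict` options named in the paragraph above; the values are illustrative assumptions, and `slurm_cmd_args` is omitted because it only applies to the SLURM backends:

```python
from executorlib import Executor


def calc(i):
    from mpi4py import MPI

    size = MPI.COMM_WORLD.Get_size()
    rank = MPI.COMM_WORLD.Get_rank()
    return i, size, rank


with Executor(backend="local") as exe:
    fs = exe.submit(
        calc,
        3,
        resource_dict={
            "cores": 2,                      # compute cores (MPI ranks) for this call
            "threads_per_core": 1,           # threads per core
            "gpus_per_core": 0,              # GPUs per core
            "cwd": "/tmp/executorlib_demo",  # hypothetical working directory
            "openmpi_oversubscribe": False,  # OpenMPI oversubscribe feature
        },
    )
    print(fs.result())  # e.g. [(3, 2, 0), (3, 2, 1)]
```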
````diff
+This flexibility to assign computing resources on a per-function-call basis simplifies the up-scaling of Python programs.
+Only the parts of the Python program which benefit from parallel execution are implemented as MPI parallel Python
+functions, while the rest of the program remains serial.
+
+The same function can be submitted to the [SLURM](https://slurm.schedmd.com) queuing system by changing the `backend`
+parameter to `slurm_submission`. The rest of the example remains the same, which highlights how executorlib accelerates
+the rapid prototyping and up-scaling of HPC Python programs.
+```python
+from executorlib import Executor
+
+
+def calc(i):
+    from mpi4py import MPI
+
+    size = MPI.COMM_WORLD.Get_size()
+    rank = MPI.COMM_WORLD.Get_rank()
+    return i, size, rank
+
+
+with Executor(backend="slurm_submission") as exe:
+    fs = exe.submit(calc, 3, resource_dict={"cores": 2})
+    print(fs.result())
 ```
+In this case the [Python simple queuing system adapter (pysqa)](https://pysqa.readthedocs.io) is used to submit the
+`calc()` function to the [SLURM](https://slurm.schedmd.com) job scheduler and request an allocation with two CPU cores
+for the execution of the function - [HPC Submission Mode](). In the background the [sbatch](https://slurm.schedmd.com/sbatch.html)
+command is used to request the allocation to execute the Python function.
+
+Within a given [SLURM](https://slurm.schedmd.com) allocation executorlib can also be used to assign a subset of the
+available computing resources to execute a given Python function. In terms of the [SLURM](https://slurm.schedmd.com)
+commands, this functionality internally uses the [srun](https://slurm.schedmd.com/srun.html) command to receive a subset
+of the resources of a given queuing system allocation.
+```python
+from executorlib import Executor
+
 
-# Documentation
+def calc(i):
+    from mpi4py import MPI
+
+    size = MPI.COMM_WORLD.Get_size()
+    rank = MPI.COMM_WORLD.Get_rank()
+    return i, size, rank
+
+
+with Executor(backend="slurm_allocation") as exe:
+    fs = exe.submit(calc, 3, resource_dict={"cores": 2})
+    print(fs.result())
+```
+In addition to [SLURM](https://slurm.schedmd.com), executorlib also provides support for the hierarchical
+[flux](http://flux-framework.org) job scheduler. The [flux](http://flux-framework.org) job scheduler is developed at
+[Lawrence Livermore National Laboratory](https://computing.llnl.gov/projects/flux-building-framework-resource-management)
+to address the needs of the upcoming generation of exascale computers. Still, even on traditional HPC clusters the
+hierarchical approach of [flux](http://flux-framework.org) is beneficial to distribute hundreds of tasks within a
+given allocation. Even when [SLURM](https://slurm.schedmd.com) is used as the primary job scheduler of your HPC, it is
+recommended to use [SLURM with flux]() as hierarchical job scheduler within the allocations.
````
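By analogy with the `slurm_allocation` example above, running inside a flux allocation would presumably look as follows; the backend name `flux_allocation` is an assumption inferred from the commit's backend naming pattern, not something this diff states:

```python
from executorlib import Executor


def calc(i):
    from mpi4py import MPI

    size = MPI.COMM_WORLD.Get_size()
    rank = MPI.COMM_WORLD.Get_rank()
    return i, size, rank


# assumption: the flux backend mirrors the "slurm_allocation" naming pattern
with Executor(backend="flux_allocation") as exe:
    fs = exe.submit(calc, 3, resource_dict={"cores": 2})
    print(fs.result())
```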
````diff
+
+## Documentation
 * [Installation](https://executorlib.readthedocs.io/en/latest/installation.html)
 * [Compatible Job Schedulers](https://executorlib.readthedocs.io/en/latest/installation.html#compatible-job-schedulers)
 * [executorlib with Flux Framework](https://executorlib.readthedocs.io/en/latest/installation.html#executorlib-with-flux-framework)
````

binder/environment.yml

Lines changed: 5 additions & 0 deletions

```diff
@@ -11,3 +11,8 @@ dependencies:
 - flux-pmix =0.5.0
 - versioneer =0.28
 - h5py =3.12.1
+- matplotlib =3.9.2
+- networkx =3.4.2
+- pygraphviz =1.14
+- pysqa =0.2.2
+- ipython =8.29.0
```

docs/_toc.yml

Lines changed: 4 additions & 2 deletions

```diff
@@ -2,7 +2,9 @@ format: jb-book
 root: README
 chapters:
 - file: installation.md
-- file: examples.ipynb
-- file: development.md
+- file: 1-local.ipynb
+- file: 2-hpc-submission.ipynb
+- file: 3-hpc-allocation.ipynb
 - file: trouble_shooting.md
+- file: 4-developer.ipynb
 - file: api.rst
```
