diff --git a/.readthedocs.yml b/.readthedocs.yml index ea70d656..9d652641 100644 --- a/.readthedocs.yml +++ b/.readthedocs.yml @@ -16,6 +16,7 @@ build: - pip install . --no-deps --no-build-isolation - "cp README.md docs" - "cp notebooks/*.ipynb docs" + - "cp -r notebooks/images docs" - "jupyter-book config sphinx docs/" # Build documentation in the docs/ directory with Sphinx diff --git a/binder/postBuild b/binder/postBuild index 4f77c7d9..d5c12e3d 100644 --- a/binder/postBuild +++ b/binder/postBuild @@ -7,6 +7,7 @@ pip install . --no-deps --no-build-isolation # copy notebooks mv notebooks/*.ipynb . +mv notebooks/images . # clean up -rm -rf .ci_support .github binder docs notebooks executorlib executorlib.egg-info tests .coveralls.yml .gitignore .readthedocs.yml LICENSE MANIFEST.in README.md pyproject.toml setup.py build \ No newline at end of file +rm -rf .ci_support .github binder docs notebooks executorlib executorlib.egg-info tests .coveralls.yml .gitignore .readthedocs.yml LICENSE MANIFEST.in README.md pyproject.toml setup.py build diff --git a/notebooks/4-developer.ipynb b/notebooks/4-developer.ipynb index 7747a941..e1a5f10a 100644 --- a/notebooks/4-developer.ipynb +++ b/notebooks/4-developer.ipynb @@ -1,29 +1,9 @@ { - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "name": "python", - "version": "3.12.10", - "mimetype": "text/x-python", - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "pygments_lexer": "ipython3", - "nbconvert_exporter": "python", - "file_extension": ".py" - } - }, - "nbformat_minor": 5, - "nbformat": 4, "cells": [ { - "id": "511b34e0-12af-4437-8915-79f033fe7cda", "cell_type": "markdown", + "id": "511b34e0-12af-4437-8915-79f033fe7cda", + "metadata": {}, "source": [ "# Support & Contribution\n", "The executorlib open-source software package is developed by scientists for scientists. 
We are open for any contribution, from feedback about spelling mistakes in the documentation, to [raising issues](https://github.com/pyiron/executorlib/issues) about functionality which is insufficiently explained in the documentation or simply requesting support to suggesting new features or opening [pull requests](https://github.com/pyiron/executorlib/pulls). Our [Github repository](https://github.com/pyiron/executorlib) is the easiest way to get in contact with the developers. \n", @@ -96,19 +76,30 @@ "| `FluxClusterExecutor(plot_dependency_graph=True)` | | with `FluxPythonSpawner` | | |\n", "| `FluxJobExecutor(disable_dependencies=False)` | | with `FluxPythonSpawner` | | |\n", "| `FluxJobExecutor(disable_dependencies=True, block_allocation=False)` | | | | with `FluxPythonSpawner` |\n", - "| `FluxJobExecutor(disable_dependencies=True, block_allocation=True)` | with `FluxPythonSpawner` | | | |" - ], - "metadata": {} + "| `FluxJobExecutor(disable_dependencies=True, block_allocation=True)` | with `FluxPythonSpawner` | | | |\n", + "\n", + "In addition, the following UML diagrams give an overview of the class hierarchy of executorlib:\n", + "![uml_executor](images/uml_executor.png)\n", + "\n", + "![uml_spawner](images/uml_spawner.png)" + ] }, { - "id": "c9df5ba2-9036-422c-b9af-a5d05944aa1f", "cell_type": "markdown", - "source": "## Test Environment\nThe test environment of the executorlib library consists of three components - they are all available in the executorlib [Github repository](https://github.com/pyiron/executorlib):\n* The [Jupyter Notebooks](https://github.com/pyiron/executorlib/tree/main/notebooks) in the executorlib Github repository demonstrate the usage of executorlib. 
These notebooks are used as examples for new users, as documentation available on [readthedocs.org](https://executorlib.readthedocs.io) and as integration tests.\n* The [likelihood benchmark](https://github.com/pyiron/executorlib/blob/main/tests/benchmark/llh.py) to compare the performance on a single compute node to the built-in interfaces in the standard library. The benchmark can be run with the following parameters `python llh.py static`. Here `static` refers to single process execution, `process` refers to the `ProcessPoolExecutor` from the standard library, `thread` refers to the `ThreadPoolExecutor` from the standard library, `executorlib` refers to the `SingleNodeExecutor` in executorlib and `block_allocation` to the `SingleNodeExecutor` in `executorlib` with block allocation enabled. Finally, for comparison to `mpi4py` the test can be executed with `mpiexec -n 4 python -m mpi4py.futures llh.py mpi4py`.\n* The [unit tests](https://github.com/pyiron/executorlib/tree/main/tests) these can be executed with `python -m unittest discover .` in the `tests` directory. The tests are structured based on the internal structure of executorlib. Tests for the `SingleNodeExecutor` are named `test_singlenodeexecutor_*.py` and correspondingly for the other modules. ", - "metadata": {} + "id": "c9df5ba2-9036-422c-b9af-a5d05944aa1f", + "metadata": {}, + "source": [ + "## Test Environment\n", + "The test environment of the executorlib library consists of three components - they are all available in the executorlib [Github repository](https://github.com/pyiron/executorlib):\n", + "* The [Jupyter Notebooks](https://github.com/pyiron/executorlib/tree/main/notebooks) in the executorlib Github repository demonstrate the usage of executorlib. 
These notebooks are used as examples for new users, as documentation available on [readthedocs.org](https://executorlib.readthedocs.io) and as integration tests.\n", + "* The [likelihood benchmark](https://github.com/pyiron/executorlib/blob/main/tests/benchmark/llh.py) to compare the performance on a single compute node to the built-in interfaces in the standard library. The benchmark can be run with the following parameters `python llh.py static`. Here `static` refers to single process execution, `process` refers to the `ProcessPoolExecutor` from the standard library, `thread` refers to the `ThreadPoolExecutor` from the standard library, `executorlib` refers to the `SingleNodeExecutor` in executorlib and `block_allocation` to the `SingleNodeExecutor` in `executorlib` with block allocation enabled. Finally, for comparison to `mpi4py` the test can be executed with `mpiexec -n 4 python -m mpi4py.futures llh.py mpi4py`.\n", + "* The [unit tests](https://github.com/pyiron/executorlib/tree/main/tests) these can be executed with `python -m unittest discover .` in the `tests` directory. The tests are structured based on the internal structure of executorlib. Tests for the `SingleNodeExecutor` are named `test_singlenodeexecutor_*.py` and correspondingly for the other modules. " + ] }, { - "id": "7bc073aa-6036-48e7-9696-37af050d438a", "cell_type": "markdown", + "id": "7bc073aa-6036-48e7-9696-37af050d438a", + "metadata": {}, "source": [ "## Communication\n", "The key functionality of the executorlib package is the up-scaling of python functions with thread based parallelism, MPI based parallelism or by assigning GPUs to individual python functions. 
In the background this is realized using a combination of the [zero message queue](https://zeromq.org) and [cloudpickle](https://github.com/cloudpipe/cloudpickle)\n", @@ -139,142 +130,259 @@ "* `SubprocessSpawner` - Subprocess interface to start serial Python process.\n", "\n", "It is not recommended to import components from other parts of executorlib in other libraries, only the interfaces in `executorlib` and `executorlib.standalone` are designed to be stable. All other classes and functions are considered for internal use only." - ], - "metadata": {} + ] }, { - "id": "8754df33-fa95-4ca6-ae02-6669967cf4e7", "cell_type": "markdown", - "source": "## External Executables\nOn extension beyond the submission of Python functions is the communication with an external executable. This could be any kind of program written in any programming language which does not provide Python bindings so it cannot be represented in Python functions. ", - "metadata": {} + "id": "8754df33-fa95-4ca6-ae02-6669967cf4e7", + "metadata": {}, + "source": [ + "## External Executables\n", + "One extension beyond the submission of Python functions is the communication with an external executable. This could be any kind of program written in any programming language which does not provide Python bindings so it cannot be represented in Python functions. " + ] }, { - "id": "75af1f8a-7ad7-441f-80a2-5c337484097f", "cell_type": "markdown", - "source": "### Subprocess\nIf the external executable is called only once, then the call to the external executable can be represented in a Python function with the [subprocess](https://docs.python.org/3/library/subprocess.html) module of the Python standard library. 
In the example below the shell command `echo test` is submitted to the `execute_shell_command()` function, which itself is submitted to the `Executor` class.", - "metadata": {} + "id": "75af1f8a-7ad7-441f-80a2-5c337484097f", + "metadata": {}, + "source": [ + "### Subprocess\n", + "If the external executable is called only once, then the call to the external executable can be represented in a Python function with the [subprocess](https://docs.python.org/3/library/subprocess.html) module of the Python standard library. In the example below the shell command `echo test` is submitted to the `execute_shell_command()` function, which itself is submitted to the `Executor` class." + ] }, { - "id": "83515b16-c4d5-4b02-acd7-9e1eb57fd335", "cell_type": "code", - "source": "from executorlib import SingleNodeExecutor", - "metadata": { - "trusted": false - }, + "execution_count": 1, + "id": "83515b16-c4d5-4b02-acd7-9e1eb57fd335", + "metadata": {}, "outputs": [], - "execution_count": 1 + "source": [ + "from executorlib import SingleNodeExecutor" + ] }, { - "id": "f1ecee94-24a6-4bf9-8a3d-d50eba994367", "cell_type": "code", - "source": "def execute_shell_command(\n command: list, universal_newlines: bool = True, shell: bool = False\n):\n import subprocess\n\n return subprocess.check_output(\n command, universal_newlines=universal_newlines, shell=shell\n )", - "metadata": { - "trusted": false - }, + "execution_count": 2, + "id": "f1ecee94-24a6-4bf9-8a3d-d50eba994367", + "metadata": {}, "outputs": [], - "execution_count": 2 + "source": [ + "def execute_shell_command(\n", + " command: list, universal_newlines: bool = True, shell: bool = False\n", + "):\n", + " import subprocess\n", + "\n", + " return subprocess.check_output(\n", + " command, universal_newlines=universal_newlines, shell=shell\n", + " )" + ] }, { - "id": "32ef5b63-3245-4336-ac0e-b4a6673ee362", "cell_type": "code", - "source": "with SingleNodeExecutor() as exe:\n future = exe.submit(\n execute_shell_command,\n [\"echo\", 
\"test\"],\n universal_newlines=True,\n shell=False,\n )\n print(future.result())", - "metadata": { - "trusted": false - }, + "execution_count": 3, + "id": "32ef5b63-3245-4336-ac0e-b4a6673ee362", + "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", - "text": "test\n\n" + "text": [ + "test\n", + "\n" + ] } ], - "execution_count": 3 + "source": [ + "with SingleNodeExecutor() as exe:\n", + " future = exe.submit(\n", + " execute_shell_command,\n", + " [\"echo\", \"test\"],\n", + " universal_newlines=True,\n", + " shell=False,\n", + " )\n", + " print(future.result())" + ] }, { - "id": "54837938-01e0-4dd3-b989-1133d3318929", "cell_type": "markdown", - "source": "### Interactive\nThe more complex case is the interaction with an external executable during the run time of the executable. This can be implemented with executorlib using the block allocation `block_allocation=True` feature. The external executable is started as part of the initialization function `init_function` and then the indivdual functions submitted to the `Executor` class interact with the process which is connected to the external executable. \n\nStarting with the definition of the executable, in this example it is a simple script which just increases a counter. The script is written in the file `count.py` so it behaves like an external executable, which could also use any other progamming language. ", - "metadata": {} + "id": "54837938-01e0-4dd3-b989-1133d3318929", + "metadata": {}, + "source": [ + "### Interactive\n", + "The more complex case is the interaction with an external executable during the run time of the executable. This can be implemented with executorlib using the block allocation `block_allocation=True` feature. The external executable is started as part of the initialization function `init_function` and then the indivdual functions submitted to the `Executor` class interact with the process which is connected to the external executable. 
\n", + "\n", + "Starting with the definition of the executable, in this example it is a simple script which just increases a counter. The script is written in the file `count.py` so it behaves like an external executable, which could also use any other programming language. " + ] }, { - "id": "dedf138f-3003-4a91-9f92-03983ac7de08", "cell_type": "code", - "source": "count_script = \"\"\"\\\ndef count(iterations):\n for i in range(int(iterations)):\n print(i)\n print(\"done\")\n\n\nif __name__ == \"__main__\":\n while True:\n user_input = input()\n if \"shutdown\" in user_input:\n break\n else:\n count(iterations=int(user_input))\n\"\"\"\n\nwith open(\"count.py\", \"w\") as f:\n f.writelines(count_script)", - "metadata": { - "trusted": false - }, + "execution_count": 4, + "id": "dedf138f-3003-4a91-9f92-03983ac7de08", + "metadata": {}, "outputs": [], - "execution_count": 4 + "source": [ + "count_script = \"\"\"\\\n", + "def count(iterations):\n", + " for i in range(int(iterations)):\n", + " print(i)\n", + " print(\"done\")\n", + "\n", + "\n", + "if __name__ == \"__main__\":\n", + " while True:\n", + " user_input = input()\n", + " if \"shutdown\" in user_input:\n", + " break\n", + " else:\n", + " count(iterations=int(user_input))\n", + "\"\"\"\n", + "\n", + "with open(\"count.py\", \"w\") as f:\n", + " f.writelines(count_script)" + ] }, { - "id": "771b5b84-48f0-4989-a2c8-c8dcb4462781", "cell_type": "markdown", - "source": "The connection to the external executable is established in the initialization function `init_function` of the `Executor` class. By using the [subprocess](https://docs.python.org/3/library/subprocess.html) module from the standard library two process pipes are created to communicate with the external executable. One process pipe is connected to the standard input `stdin` and the other is connected to the standard output `stdout`. 
", - "metadata": {} + "id": "771b5b84-48f0-4989-a2c8-c8dcb4462781", + "metadata": {}, + "source": [ + "The connection to the external executable is established in the initialization function `init_function` of the `Executor` class. By using the [subprocess](https://docs.python.org/3/library/subprocess.html) module from the standard library two process pipes are created to communicate with the external executable. One process pipe is connected to the standard input `stdin` and the other is connected to the standard output `stdout`. " + ] }, { - "id": "8fe76668-0f18-40b7-9719-de47dacb0911", "cell_type": "code", - "source": "def init_process():\n import subprocess\n\n return {\n \"process\": subprocess.Popen(\n [\"python\", \"count.py\"],\n stdin=subprocess.PIPE,\n stdout=subprocess.PIPE,\n universal_newlines=True,\n shell=False,\n )\n }", - "metadata": { - "trusted": false - }, + "execution_count": 5, + "id": "8fe76668-0f18-40b7-9719-de47dacb0911", + "metadata": {}, "outputs": [], - "execution_count": 5 + "source": [ + "def init_process():\n", + " import subprocess\n", + "\n", + " return {\n", + " \"process\": subprocess.Popen(\n", + " [\"python\", \"count.py\"],\n", + " stdin=subprocess.PIPE,\n", + " stdout=subprocess.PIPE,\n", + " universal_newlines=True,\n", + " shell=False,\n", + " )\n", + " }" + ] }, { - "id": "09dde7a1-2b43-4be7-ba36-38200b9fddf0", "cell_type": "markdown", - "source": "The interaction function handles the data conversion from the Python datatypes to the strings which can be communicated to the external executable. It is important to always add a new line `\\n` to each command send via the standard input `stdin` to the external executable and afterwards flush the pipe by calling `flush()` on the standard input pipe `stdin`. 
", - "metadata": {} + "id": "09dde7a1-2b43-4be7-ba36-38200b9fddf0", + "metadata": {}, + "source": [ + "The interaction function handles the data conversion from the Python datatypes to the strings which can be communicated to the external executable. It is important to always add a new line `\\n` to each command send via the standard input `stdin` to the external executable and afterwards flush the pipe by calling `flush()` on the standard input pipe `stdin`. " + ] }, { - "id": "7556f2bd-176f-4275-a87d-b5c940267888", "cell_type": "code", - "source": "def interact(shell_input, process, lines_to_read=None, stop_read_pattern=None):\n process.stdin.write(shell_input)\n process.stdin.flush()\n lines_count = 0\n output = \"\"\n while True:\n output_current = process.stdout.readline()\n output += output_current\n lines_count += 1\n if stop_read_pattern is not None and stop_read_pattern in output_current:\n break\n elif lines_to_read is not None and lines_to_read == lines_count:\n break\n return output", - "metadata": { - "trusted": false - }, + "execution_count": 6, + "id": "7556f2bd-176f-4275-a87d-b5c940267888", + "metadata": {}, "outputs": [], - "execution_count": 6 + "source": [ + "def interact(shell_input, process, lines_to_read=None, stop_read_pattern=None):\n", + " process.stdin.write(shell_input)\n", + " process.stdin.flush()\n", + " lines_count = 0\n", + " output = \"\"\n", + " while True:\n", + " output_current = process.stdout.readline()\n", + " output += output_current\n", + " lines_count += 1\n", + " if stop_read_pattern is not None and stop_read_pattern in output_current:\n", + " break\n", + " elif lines_to_read is not None and lines_to_read == lines_count:\n", + " break\n", + " return output" + ] }, { - "id": "5484b98b-546f-4f2c-8db1-919ce215e228", "cell_type": "markdown", - "source": "Finally, to close the process after the external executable is no longer required it is recommended to define a shutdown function, which communicates to the external 
executable that it should shutdown. In the case of the `count.py` script defined above this is achieved by sending the keyword `shutdown`. ", - "metadata": {} + "id": "5484b98b-546f-4f2c-8db1-919ce215e228", + "metadata": {}, + "source": [ + "Finally, to close the process after the external executable is no longer required it is recommended to define a shutdown function, which communicates to the external executable that it should shut down. In the case of the `count.py` script defined above this is achieved by sending the keyword `shutdown`. " + ] }, { - "id": "d5344d2b-cb53-4d38-8cae-621e3b98bb56", "cell_type": "code", - "source": "def shutdown(process):\n process.stdin.write(\"shutdown\\n\")\n process.stdin.flush()", - "metadata": { - "trusted": false - }, + "execution_count": 7, + "id": "d5344d2b-cb53-4d38-8cae-621e3b98bb56", + "metadata": {}, "outputs": [], - "execution_count": 7 + "source": [ + "def shutdown(process):\n", + " process.stdin.write(\"shutdown\\n\")\n", + " process.stdin.flush()" + ] }, { - "id": "3899467c-dc54-41cb-b05e-b60f5cf97e46", "cell_type": "markdown", - "source": "With these utility functions is to possible to communicate with any kind of external executable. Still for the specific implementation of the external executable it might be necessary to adjust the corresponding Python functions. Therefore this functionality is currently limited to developers and not considered a general feature of executorlib. ", - "metadata": {} + "id": "3899467c-dc54-41cb-b05e-b60f5cf97e46", + "metadata": {}, + "source": [ + "With these utility functions it is possible to communicate with any kind of external executable. Still for the specific implementation of the external executable it might be necessary to adjust the corresponding Python functions. Therefore this functionality is currently limited to developers and not considered a general feature of executorlib. 
" + ] }, { - "id": "747c1b78-4804-467b-9ac8-8144d8031da3", "cell_type": "code", - "source": "with SingleNodeExecutor(\n max_workers=1,\n init_function=init_process,\n block_allocation=True,\n) as exe:\n future = exe.submit(\n interact, shell_input=\"4\\n\", lines_to_read=5, stop_read_pattern=None\n )\n print(future.result())\n future_shutdown = exe.submit(shutdown)\n print(future_shutdown.result())", - "metadata": { - "trusted": false - }, + "execution_count": 8, + "id": "747c1b78-4804-467b-9ac8-8144d8031da3", + "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", - "text": "0\n1\n2\n3\ndone\n\nNone\n" + "text": [ + "0\n", + "1\n", + "2\n", + "3\n", + "done\n", + "\n", + "None\n" + ] } ], - "execution_count": 8 + "source": [ + "with SingleNodeExecutor(\n", + " max_workers=1,\n", + " init_function=init_process,\n", + " block_allocation=True,\n", + ") as exe:\n", + " future = exe.submit(\n", + " interact, shell_input=\"4\\n\", lines_to_read=5, stop_read_pattern=None\n", + " )\n", + " print(future.result())\n", + " future_shutdown = exe.submit(shutdown)\n", + " print(future_shutdown.result())" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.10" } - ] + }, + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/notebooks/images/uml_executor.png b/notebooks/images/uml_executor.png new file mode 100644 index 00000000..2ca2e18e Binary files /dev/null and b/notebooks/images/uml_executor.png differ diff --git a/notebooks/images/uml_spawner.png b/notebooks/images/uml_spawner.png new file mode 100644 index 00000000..6b140458 Binary files /dev/null and b/notebooks/images/uml_spawner.png differ