[Documentation] Add technical concepts and Mermaid diagrams to documentation #970
Conversation
- Add `docs/concepts.md` explaining internal architecture, class hierarchy, and execution flow.
- Include Mermaid class and sequence diagrams.
- Add `sphinxcontrib-mermaid` to documentation dependencies and configuration.
- Update `docs/_toc.yml` to include the new concepts page.

Co-authored-by: jan-janssen <3854739+jan-janssen@users.noreply.github.com>
📝 Walkthrough

The pull request adds Mermaid diagram support to the documentation build system across multiple configuration files, removes a notebook-images copy step from the build pipeline, updates the developer notebook to use embedded Mermaid diagrams instead of image references, and adds a new "Execution Flow" section to the README.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
Pull request overview
This PR extends the executorlib documentation with a new “Technical Concepts” page that explains the library’s internal architecture and execution flow, adding Mermaid diagrams to visualize key components and interactions.
Changes:
- Add `docs/concepts.md` with an architecture overview plus Mermaid class/sequence diagrams.
- Enable Mermaid rendering in the Jupyter Book/Sphinx build via `sphinxcontrib-mermaid`.
- Include the new page in the Jupyter Book table of contents.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| docs/concepts.md | New technical concepts page with Mermaid diagrams and architecture narrative. |
| docs/_toc.yml | Adds the new concepts page to the published documentation navigation. |
| docs/_config.yml | Enables the Sphinx Mermaid extension for diagram rendering. |
| .ci_support/environment-docs.yml | Adds sphinxcontrib-mermaid to the docs build environment dependencies. |
```
User->>Executor: submit(fn, args, resource_dict)
Executor->>TaskScheduler: submit(fn, args, resource_dict)
```

In the sequence diagram, `submit(fn, args, resource_dict)` reads as if `resource_dict` were passed positionally. In executorlib the `resource_dict` parameter is keyword-only (e.g., `submit(fn, *args, resource_dict=..., **kwargs)`), so the diagram should be updated to avoid documenting an invalid calling convention.

Suggested change:

```diff
-    User->>Executor: submit(fn, args, resource_dict)
-    Executor->>TaskScheduler: submit(fn, args, resource_dict)
+    User->>Executor: submit(fn, *args, resource_dict=...)
+    Executor->>TaskScheduler: submit(fn, *args, resource_dict=...)
```
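The keyword-only convention behind this suggestion can be sketched in a few lines of plain Python. Class and method names follow the diagram; this is not executorlib's actual implementation:

```python
from concurrent.futures import Future
from typing import Callable, Optional


class SketchExecutor:
    """Minimal sketch of an executor whose submit() takes resource_dict keyword-only."""

    def submit(self, fn: Callable, /, *args,
               resource_dict: Optional[dict] = None, **kwargs) -> Future:
        # *args swallows every positional argument, so resource_dict can
        # only ever be supplied by keyword, matching the suggested diagram.
        fut = Future()
        fut.set_result(fn(*args, **kwargs))  # synchronous stand-in for real scheduling
        return fut


exe = SketchExecutor()
fut = exe.submit(sum, [1, 2, 3], resource_dict={"cores": 1})
print(fut.result())  # 6
```

Attempting `exe.submit(sum, [1, 2, 3], {"cores": 1})` would not bind the dict to `resource_dict`; it would simply become another positional argument, which is exactly why the diagram's positional notation is misleading.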
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff           @@
##             main     #970   +/-   ##
=======================================
  Coverage   94.15%   94.15%
=======================================
  Files          39       39
  Lines        2089     2089
=======================================
  Hits         1967     1967
  Misses        122      122
```

☔ View full report in Codecov by Sentry.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…hnical-concepts-12379807062268802605
Actionable comments posted: 1
🧹 Nitpick comments (2)
.ci_support/environment-docs.yml (1)
Line 12: Pin `sphinxcontrib-mermaid` for reproducible docs builds. Line 12 introduces a floating dependency; pinning it to a tested version would reduce CI/RTD drift.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `.ci_support/environment-docs.yml` at line 12: the docs environment file currently lists an unpinned dependency `sphinxcontrib-mermaid`; update the entry to pin it to a specific tested version (e.g., replace `sphinxcontrib-mermaid` with `sphinxcontrib-mermaid==<tested-version>`) to ensure reproducible builds and reduce CI/RTD drift.

docs/_config.yml (1)
Lines 19-22: Remove the duplicate `sphinx.ext.autodoc` entry. Line 22 duplicates line 19; keeping one entry makes the extension list cleaner.
♻️ Proposed cleanup

```diff
   - 'sphinxcontrib.mermaid'
   - 'sphinx.ext.autodoc'
   - 'sphinx.ext.napoleon'
   - 'sphinx.ext.viewcode'
-  - 'sphinx.ext.autodoc'
   - 'sphinx.ext.autosummary'
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `docs/_config.yml` around lines 19-22: remove the duplicated extension entry 'sphinx.ext.autodoc' from the Sphinx extensions list; locate the two identical strings and delete one so each extension appears only once (keep a single 'sphinx.ext.autodoc' entry alongside 'sphinx.ext.napoleon' and 'sphinx.ext.viewcode').
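The duplicate removal the review asks for can also be done generically. A small order-preserving dedupe helper (illustrative only, not part of the repository):

```python
def dedupe_preserving_order(items: list[str]) -> list[str]:
    """Drop later duplicates while keeping each first occurrence in place."""
    seen: set[str] = set()
    out: list[str] = []
    for item in items:
        if item not in seen:
            seen.add(item)
            out.append(item)
    return out


extensions = [
    "sphinxcontrib.mermaid",
    "sphinx.ext.autodoc",
    "sphinx.ext.napoleon",
    "sphinx.ext.viewcode",
    "sphinx.ext.autodoc",  # the duplicate flagged by the review
    "sphinx.ext.autosummary",
]
print(dedupe_preserving_order(extensions))
```

Keeping the first occurrence matters here because Sphinx loads extensions in list order, so the cleaned list behaves identically to the original.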
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b3d8f017-b8db-4f6e-8dcb-582e01168a2b
⛔ Files ignored due to path filters (2)
- `notebooks/images/uml_executor.png` is excluded by `!**/*.png`
- `notebooks/images/uml_spawner.png` is excluded by `!**/*.png`
📒 Files selected for processing (5)
- `.ci_support/environment-docs.yml`
- `.readthedocs.yml`
- `README.md`
- `docs/_config.yml`
- `notebooks/5-developer.ipynb`
💤 Files with no reviewable changes (1)
- .readthedocs.yml
```
  "nbformat": 4,
  "nbformat_minor": 5
}
```
| {"metadata":{"kernelspec":{"display_name":"Python 3 (ipykernel)","language":"python","name":"python3"},"language_info":{"name":"python","version":"3.13.13","mimetype":"text/x-python","codemirror_mode":{"name":"ipython","version":3},"pygments_lexer":"ipython3","nbconvert_exporter":"python","file_extension":".py"}},"nbformat_minor":5,"nbformat":4,"cells":[{"id":"511b34e0-12af-4437-8915-79f033fe7cda","cell_type":"markdown","source":"# Support & Contribution\nThe executorlib open-source software package is developed by scientists for scientists. We are open for any contribution, from feedback about spelling mistakes in the documentation, to [raising issues](https://github.com/pyiron/executorlib/issues) about functionality which is insufficiently explained in the documentation or simply requesting support to suggesting new features or opening [pull requests](https://github.com/pyiron/executorlib/pulls). Our [Github repository](https://github.com/pyiron/executorlib) is the easiest way to get in contact with the developers. \n\n## Issues\nThe easiest way for us as developers to help in solving an issue is to provide us with sufficient information about how to reproduce the issue. The simpler the test case which causes the issue the easier it is to identify the part of the code which is causing the issue. As a general rule of thumb, everything that works with the [ProcessPoolExecutor](https://docs.python.org/3/library/concurrent.futures.html#processpoolexecutor) \nor the [ThreadPoolExecutor](https://docs.python.org/3/library/concurrent.futures.html#threadpoolexecutor) should also work with the different Executor classes provided by executorlib. If this is not the case, then it is most likely a bug and worth reporting. \n\n## Pull Requests\nReviewing a pull request is easier when the changes are clearly lined out, covered by tests and following the automated formatting using black. 
Still when you decide to work on a new feature it can also be helpful to open a pull request early on and mark it as draft, this gives other developers the opportunity to see what you are working on. \n\n## License\n```\nBSD 3-Clause License\n\nCopyright (c) 2022, Jan Janssen\nAll rights reserved.\n\nRedistribution and use in source and binary forms, with or without\nmodification, are permitted provided that the following conditions are met:\n\n* Redistributions of source code must retain the above copyright notice, this\n list of conditions and the following disclaimer.\n\n* Redistributions in binary form must reproduce the above copyright notice,\n this list of conditions and the following disclaimer in the documentation\n and/or other materials provided with the distribution.\n\n* Neither the name of the copyright holder nor the names of its\n contributors may be used to endorse or promote products derived from\n this software without specific prior written permission.\n\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\"\nAND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE\nIMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE\nDISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE\nFOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL\nDAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR\nSERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER\nCAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,\nOR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\nOF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n```\n\n## Modules\nWhile it is not recommended to link to specific internal components of executorlib in external Python packages but rather only the `Executor` classes should be used as central interfaces to executorlib, the internal architecture is briefly outlined below. \n* `backend` - the backend module contains the functionality for the Python processes created by executorlib to execute the submitted Python functions.\n* `executor` - the executor module defines the different `Executor` classes, namely `SingleNodeExecutor`, `SlurmClusterExecutor`, `SlurmJobExecutor`, `FluxClusterExecutor` and `FluxJobExecutor`. These are the interfaces the user interacts with.\n* `standalone` - the standalone module contains a number of utility functions which only depend on external libraries and do not have any internal dependency to other parts of `executorlib`. 
This includes the functionality to generate executable commands, the [h5py](https://www.h5py.org) based interface for caching, a number of input checks, routines to plot the dependencies of a number of future objects, functionality to interact with the [queues defined in the Python standard library](https://docs.python.org/3/library/queue.html), the interface for serialization based on [cloudpickle](https://github.com/cloudpipe/cloudpickle) and finally an extension to the [threading](https://docs.python.org/3/library/threading.html) of the Python standard library.\n* `task_scheduler` - the internal task scheduler module defines the task schedulers, namely `BlockAllocationTaskScheduler`, `DependencyTaskScheduler`, `FileTaskScheduler` and `OneProcessTaskScheduler`. They are divided into two sub modules:\n * `file` - the file based task scheduler module defines the file based communication for the [HPC Cluster Executor](https://executorlib.readthedocs.io/en/latest/2-hpc-cluster.html).\n * `interactive` - the interactive task scheduler module defines the [zero message queue](https://zeromq.org) based communication for the [Single Node Executor](https://executorlib.readthedocs.io/en/latest/1-single-node.html) and the [HPC Job Executor](https://executorlib.readthedocs.io/en/latest/3-hpc-job.html).\n\nGiven the level of separation the integration of submodules from the standalone module in external software packages should be the easiest way to benefit from the developments in executorlib beyond just using the `Executor` class. \n\n## Interface Class Hierarchy\nexecutorlib provides five different interfaces, namely `SingleNodeExecutor`, `SlurmClusterExecutor`, `SlurmJobExecutor`, `FluxClusterExecutor` and `FluxJobExecutor`, internally these are mapped to four types of task schedulers, namely `BlockAllocationTaskScheduler`, `DependencyTaskScheduler`, `FileTaskScheduler` and `OneProcessTaskScheduler` depending on which options are selected. 
Finally, the task schedulers are connected to spawners to start new processes, namely the `MpiExecSpawner`, `SrunSpawner` and `FluxPythonSpawner`. The dependence is illustrated in the following table:\n\n| | `BlockAllocationTaskScheduler` | `DependencyTaskScheduler` | `FileTaskScheduler` | `OneProcessTaskScheduler` |\n|-------------------------------------------------------------------------|--------------------------------|---------------------------|---------------------|---------------------------|\n| `SingleNodeExecutor(disable_dependencies=False)` | | with `MpiExecSpawner` | | |\n| `SingleNodeExecutor(disable_dependencies=True, block_allocation=False)` | | | | with `MpiExecSpawner` |\n| `SingleNodeExecutor(disable_dependencies=True, block_allocation=True)` | with `MpiExecSpawner` | | | |\n| `SlurmClusterExecutor(plot_dependency_graph=False)` | | | with `pysqa` | |\n| `SlurmClusterExecutor(plot_dependency_graph=True)` | | with `SrunSpawner` | | |\n| `SlurmJobExecutor(disable_dependencies=False)` | | with `SrunSpawner` | | |\n| `SlurmJobExecutor(disable_dependencies=True, block_allocation=False)` | | | | with `SrunSpawner` |\n| `SlurmJobExecutor(disable_dependencies=True, block_allocation=True)` | with `SrunSpawner` | | | |\n| `FluxClusterExecutor(plot_dependency_graph=False)` | | | with `pysqa` | |\n| `FluxClusterExecutor(plot_dependency_graph=True)` | | with `FluxPythonSpawner` | | |\n| `FluxJobExecutor(disable_dependencies=False)` | | with `FluxPythonSpawner` | | |\n| `FluxJobExecutor(disable_dependencies=True, block_allocation=False)` | | | | with `FluxPythonSpawner` |\n| `FluxJobExecutor(disable_dependencies=True, block_allocation=True)` | with `FluxPythonSpawner` | | | |\n\nThe following diagram illustrates the relationship between the main classes in `executorlib`.\n\n```{mermaid}\nclassDiagram\n class FutureExecutor {\n <<interface>>\n }\n class BaseExecutor {\n -_task_scheduler: TaskSchedulerBase\n +submit(fn, *args, **kwargs) Future\n +shutdown(wait)\n 
}\n class TaskSchedulerBase {\n -_future_queue: Queue\n -_process: Thread\n +submit(fn, *args, **kwargs) Future\n }\n class BaseSpawner {\n <<interface>>\n +bootup(command_lst)\n +shutdown(wait)\n }\n class SocketInterface {\n +send_dict(input_dict)\n +receive_dict() dict\n }\n\n FutureExecutor <|-- BaseExecutor\n BaseExecutor o-- TaskSchedulerBase\n TaskSchedulerBase <|-- OneProcessTaskScheduler\n TaskSchedulerBase <|-- BlockAllocationTaskScheduler\n TaskSchedulerBase <|-- DependencyTaskScheduler\n TaskSchedulerBase <|-- FileTaskScheduler\n\n OneProcessTaskScheduler o-- BaseSpawner\n BaseSpawner <|-- MpiExecSpawner\n BaseSpawner <|-- SrunSpawner\n BaseSpawner <|-- FluxPythonSpawner\n\n OneProcessTaskScheduler ..> SocketInterface : uses\n```\n\n## Execution Flow\n\nWhen a user submits a function to an executor, several steps occur in the background to ensure the task is executed with the requested resources and the result is returned.\n\n```{mermaid}\nsequenceDiagram\n participant User\n participant Executor\n participant TaskScheduler\n participant Spawner\n participant Backend\n\n User->>Executor: submit(fn, args, resource_dict)\n Executor->>TaskScheduler: submit(fn, args, resource_dict)\n TaskScheduler->>TaskScheduler: Add to _future_queue\n TaskScheduler-->>User: Return Future object\n\n Note over TaskScheduler, Spawner: Task loop in background thread\n\n TaskScheduler->>Spawner: bootup(command)\n Spawner->>Backend: Start worker process\n TaskScheduler->>Backend: Send function and arguments (ZMQ/File)\n Backend->>Backend: Execute function\n Backend->>TaskScheduler: Send result (ZMQ/File)\n TaskScheduler->>User: Update Future with result\n```","metadata":{}},{"id":"c9df5ba2-9036-422c-b9af-a5d05944aa1f","cell_type":"markdown","source":"## Test Environment\nThe test environment of the executorlib library consists of three components - they are all available in the executorlib [Github repository](https://github.com/pyiron/executorlib):\n* The [Jupyter 
Notebooks](https://github.com/pyiron/executorlib/tree/main/notebooks) in the executorlib Github repository demonstrate the usage of executorlib. These notebooks are used as examples for new users, as documentation available on [readthedocs.org](https://executorlib.readthedocs.io) and as integration tests.\n* The [likelihood benchmark](https://github.com/pyiron/executorlib/blob/main/tests/benchmark/llh.py) to compare the performance on a single compute node to the built-in interfaces in the standard library. The benchmark can be run with the following parameters `python llh.py static`. Here `static` refers to single process execution, `process` refers to the `ProcessPoolExecutor` from the standard library, `thread` refers to the `ThreadPoolExecutor` from the standard library, `executorlib` refers to the `SingleNodeExecutor` in executorlib and `block_allocation` to the `SingleNodeExecutor` in `executorlib` with block allocation enabled. Finally, for comparison to `mpi4py` the test can be executed with `mpiexec -n 4 python -m mpi4py.futures llh.py mpi4py`.\n* The [unit tests](https://github.com/pyiron/executorlib/tree/main/tests) these can be executed with `python -m unittest discover .` in the `tests` directory. The tests are structured based on the internal structure of executorlib. Tests for the `SingleNodeExecutor` are named `test_singlenodeexecutor_*.py` and correspondingly for the other modules. ","metadata":{}},{"id":"7bc073aa-6036-48e7-9696-37af050d438a","cell_type":"markdown","source":"## Communication\n### Interactive Communication\nThe key functionality of the executorlib package is the up-scaling of python functions with thread based parallelism, MPI based parallelism or by assigning GPUs to individual python functions. In the background this is realized using a combination of the [zero message queue](https://zeromq.org) and [cloudpickle](https://github.com/cloudpipe/cloudpickle)\nto communicate binary python objects. 
The `executorlib.standalone.interactive.communication.SocketInterface` is an abstraction of this \ninterface, which is used in the other classes inside `executorlib` and might also be helpful for other projects. It comes with a series of utility functions:\n\n* `executorlib.standalone.interactive.communication.interface_bootup()`: To initialize the interface\n* `executorlib.standalone.interactive.communication.interface_connect()`: To connect the interface to another instance\n* `executorlib.standalone.interactive.communication.interface_send()`: To send messages via this interface \n* `executorlib.standalone.interactive.communication.interface_receive()`: To receive messages via this interface \n* `executorlib.standalone.interactive.communication.interface_shutdown()`: To shutdown the interface\n\nWhile executorlib was initially designed for up-scaling python functions for HPC, the same functionality can be\nleveraged to up-scale any executable independent of the programming language it is developed in.\n\n### File-based Communication\nUsed by `FluxClusterExecutor`, `SlurmClusterExecutor` and `TestClusterExecutor`. It uses the filesystem to communicate between the main process and the individual HPC jobs. This mode is necessary when tasks are submitted as independent jobs to a scheduler like SLURM or Flux, where direct network communication between the login node and compute nodes might be restricted.\n\n## External Libraries\nFor external libraries executorlib provides a standardized interface for a subset of its internal functionality, which is designed to remain stable with minor version updates. Developers can import the following functionality from `executorlib.api`:\n* `cancel_items_in_queue()` - Cancel items which are still waiting in the Python standard library queue - `queue.queue`.\n* `cloudpickle_register()` - Cloudpickle can either pickle by value or pickle by reference. 
The functions which are communicated have to be pickled by value rather than by reference, so the module which calls the map function is pickled by value.\n* `get_command_path()` - Get path of the backend executable script `executorlib.backend`.\n* `interface_bootup()` - Start interface for ZMQ communication.\n* `interface_connect()` - Connect to an existing `SocketInterface` instance by providing the hostname and the port as strings.\n* `interface_receive()` - Receive instructions from a `SocketInterface` instance.\n* `interface_send()` - Send results to a `SocketInterface` instance.\n* `interface_shutdown()` - Close the connection to a `SocketInterface` instance.\n* `MpiExecSpawner` - Subprocess interface to start `mpi4py` parallel process.\n* `SocketInterface` - The `SocketInterface` is an abstraction layer on top of the zero message queue.\n* `SubprocessSpawner` - Subprocess interface to start serial Python process.\n\nIt is not recommended to import components from other parts of executorlib in other libraries, only the interfaces in `executorlib` and `executorlib.api` are designed to be stable. All other classes and functions are considered for internal use only.","metadata":{}},{"id":"8754df33-fa95-4ca6-ae02-6669967cf4e7","cell_type":"markdown","source":"## External Executables\nOn extension beyond the submission of Python functions is the communication with an external executable. This could be any kind of program written in any programming language which does not provide Python bindings so it cannot be represented in Python functions. ","metadata":{}},{"id":"75af1f8a-7ad7-441f-80a2-5c337484097f","cell_type":"markdown","source":"### Subprocess\nIf the external executable is called only once, then the call to the external executable can be represented in a Python function with the [subprocess](https://docs.python.org/3/library/subprocess.html) module of the Python standard library. 
In the example below the shell command `echo test` is submitted to the `execute_shell_command()` function, which itself is submitted to the `Executor` class.","metadata":{}},{"id":"83515b16-c4d5-4b02-acd7-9e1eb57fd335","cell_type":"code","source":"from executorlib import SingleNodeExecutor","metadata":{"trusted":true},"outputs":[],"execution_count":1},{"id":"f1ecee94-24a6-4bf9-8a3d-d50eba994367","cell_type":"code","source":"def execute_shell_command(\n command: list, universal_newlines: bool = True, shell: bool = False\n):\n import subprocess\n\n return subprocess.check_output(\n command, universal_newlines=universal_newlines, shell=shell\n )","metadata":{"trusted":true},"outputs":[],"execution_count":2},{"id":"32ef5b63-3245-4336-ac0e-b4a6673ee362","cell_type":"code","source":"with SingleNodeExecutor() as exe:\n future = exe.submit(\n execute_shell_command,\n [\"echo\", \"test\"],\n universal_newlines=True,\n shell=False,\n )\n print(future.result())","metadata":{"trusted":true},"outputs":[{"name":"stdout","output_type":"stream","text":"test\n\n"}],"execution_count":3},{"id":"54837938-01e0-4dd3-b989-1133d3318929","cell_type":"markdown","source":"### Interactive\nThe more complex case is the interaction with an external executable during the run time of the executable. This can be implemented with executorlib using the block allocation `block_allocation=True` feature. The external executable is started as part of the initialization function `init_function` and then the indivdual functions submitted to the `Executor` class interact with the process which is connected to the external executable. \n\nStarting with the definition of the executable, in this example it is a simple script which just increases a counter. The script is written in the file `count.py` so it behaves like an external executable, which could also use any other progamming language. 
","metadata":{}},{"id":"dedf138f-3003-4a91-9f92-03983ac7de08","cell_type":"code","source":"count_script = \"\"\"\\\ndef count(iterations):\n for i in range(int(iterations)):\n print(i)\n print(\"done\")\n\n\nif __name__ == \"__main__\":\n while True:\n user_input = input()\n if \"shutdown\" in user_input:\n break\n else:\n count(iterations=int(user_input))\n\"\"\"\n\nwith open(\"count.py\", \"w\") as f:\n f.writelines(count_script)","metadata":{"trusted":true},"outputs":[],"execution_count":4},{"id":"771b5b84-48f0-4989-a2c8-c8dcb4462781","cell_type":"markdown","source":"The connection to the external executable is established in the initialization function `init_function` of the `Executor` class. By using the [subprocess](https://docs.python.org/3/library/subprocess.html) module from the standard library two process pipes are created to communicate with the external executable. One process pipe is connected to the standard input `stdin` and the other is connected to the standard output `stdout`. ","metadata":{}},{"id":"8fe76668-0f18-40b7-9719-de47dacb0911","cell_type":"code","source":"def init_process():\n import subprocess\n\n return {\n \"process\": subprocess.Popen(\n [\"python\", \"count.py\"],\n stdin=subprocess.PIPE,\n stdout=subprocess.PIPE,\n universal_newlines=True,\n shell=False,\n )\n }","metadata":{"trusted":true},"outputs":[],"execution_count":5},{"id":"09dde7a1-2b43-4be7-ba36-38200b9fddf0","cell_type":"markdown","source":"The interaction function handles the data conversion from the Python datatypes to the strings which can be communicated to the external executable. It is important to always add a new line `\\n` to each command send via the standard input `stdin` to the external executable and afterwards flush the pipe by calling `flush()` on the standard input pipe `stdin`. 
","metadata":{}},{"id":"7556f2bd-176f-4275-a87d-b5c940267888","cell_type":"code","source":"def interact(shell_input, process, lines_to_read=None, stop_read_pattern=None):\n process.stdin.write(shell_input)\n process.stdin.flush()\n lines_count = 0\n output = \"\"\n while True:\n output_current = process.stdout.readline()\n output += output_current\n lines_count += 1\n if stop_read_pattern is not None and stop_read_pattern in output_current:\n break\n elif lines_to_read is not None and lines_to_read == lines_count:\n break\n return output","metadata":{"trusted":true},"outputs":[],"execution_count":6},{"id":"5484b98b-546f-4f2c-8db1-919ce215e228","cell_type":"markdown","source":"Finally, to close the process after the external executable is no longer required it is recommended to define a shutdown function, which communicates to the external executable that it should shutdown. In the case of the `count.py` script defined above this is achieved by sending the keyword `shutdown`. ","metadata":{}},{"id":"d5344d2b-cb53-4d38-8cae-621e3b98bb56","cell_type":"code","source":"def shutdown(process):\n process.stdin.write(\"shutdown\\n\")\n process.stdin.flush()","metadata":{"trusted":true},"outputs":[],"execution_count":7},{"id":"3899467c-dc54-41cb-b05e-b60f5cf97e46","cell_type":"markdown","source":"With these utility functions is to possible to communicate with any kind of external executable. Still for the specific implementation of the external executable it might be necessary to adjust the corresponding Python functions. Therefore this functionality is currently limited to developers and not considered a general feature of executorlib. 
","metadata":{}},{"id":"747c1b78-4804-467b-9ac8-8144d8031da3","cell_type":"code","source":"with SingleNodeExecutor(\n max_workers=1,\n init_function=init_process,\n block_allocation=True,\n) as exe:\n future = exe.submit(\n interact, shell_input=\"4\\n\", lines_to_read=5, stop_read_pattern=None\n )\n print(future.result())\n future_shutdown = exe.submit(shutdown)\n print(future_shutdown.result())","metadata":{"trusted":true},"outputs":[{"name":"stdout","output_type":"stream","text":"0\n1\n2\n3\ndone\n\nNone\n"}],"execution_count":8}]} No newline at end of file |
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Verify Python version consistency between docs env and notebook metadata.
python - <<'PY'
import json, glob, re

env_text = open(".ci_support/environment-docs.yml", encoding="utf-8").read()
m = re.search(r'^\s*-\s*python\s*=\s*([0-9]+\.[0-9]+)', env_text, re.M)
env_ver = m.group(1) if m else "NOT_FOUND"
print(f"docs env python: {env_ver}")

for nb in sorted(glob.glob("notebooks/*.ipynb")):
    with open(nb, encoding="utf-8") as f:
        data = json.load(f)
    nb_ver = data.get("metadata", {}).get("language_info", {}).get("version", "NOT_SET")
    print(f"{nb}: {nb_ver}")
PY
```

Repository: pyiron/executorlib
Length of output: 311
Update notebook metadata Python version to match the docs environment.
Line 1 sets the notebook metadata to Python 3.13.13, but `.ci_support/environment-docs.yml` pins `python =3.12`. This mismatch will cause reproducibility drift when notebooks are regenerated. The verification also revealed that other notebooks have similarly inconsistent versions (e.g., `notebooks/1-single-node.ipynb` also uses 3.13.13), suggesting a broader alignment issue across the notebook suite.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@notebooks/5-developer.ipynb` at line 1, Notebook metadata declares Python
3.13.13 which mismatches the docs environment pinned to Python 3.12; update the
notebook kernelspec and language_info to the documented Python 3.12 version so
regenerated notebooks are reproducible. Edit the metadata keys
kernelspec.display_name, kernelspec.name and language_info.version in this
notebook (and the other inconsistent notebooks such as the one referenced) to
use the 3.12 Python identifier (e.g., "Python 3 (ipykernel)" with version "3.12"
or the exact 3.12.x used in .ci_support/environment-docs.yml) so the kernel
metadata aligns with the docs environment.
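The fix this prompt describes can be sketched as a small helper that rewrites a notebook's `language_info.version` in place. The target version "3.12" and the file layout are assumptions taken from the review comment, not from the repository itself:

```python
import json
from pathlib import Path


def align_notebook_python_version(path: Path, version: str = "3.12") -> bool:
    """Set language_info.version in a notebook's metadata; return True if the file changed."""
    data = json.loads(path.read_text(encoding="utf-8"))
    info = data.setdefault("metadata", {}).setdefault("language_info", {})
    if info.get("version") == version:
        return False  # already aligned, leave the file untouched
    info["version"] = version
    path.write_text(json.dumps(data, indent=1) + "\n", encoding="utf-8")
    return True
```

Running it over `notebooks/*.ipynb` would align every notebook flagged by the verification script; the `kernelspec` fields the prompt mentions could be rewritten the same way.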
Closes #816
Acknowledged. I will include the reference to the issue in the PR description.
4a88beb to 836bc98
This PR extends the documentation of executorlib with a new "Technical Concepts" page. It provides a detailed explanation of the package's internal architecture, the roles of its primary modules, and its execution flow.
Key additions:
- New `docs/concepts.md` file containing technical explanations.
- Updated build configuration (`docs/_config.yml` and `.ci_support/environment-docs.yml`) to support Mermaid diagram rendering in the documentation build.
- New concepts page added to `docs/_toc.yml`.

PR created automatically by Jules for task 12379807062268802605 started by @jan-janssen
Summary by CodeRabbit

- Documentation
- Chores