Enable multi-threaded execution of MonteCarloSimulation launched from Python #17363

Open

RussTedrake opened this issue Jun 9, 2022 · 6 comments

Labels: component: pydrake (Python API and its supporting Starlark macros), priority: medium, type: feature request

@RussTedrake
Contributor

This is a follow-up to the conversation started in this Slack thread: https://drakedevelopers.slack.com/archives/C2CHRT98E/p1653658311307059

The monte-carlo simulation tools have become a useful model for multi-process executions in Drake. We currently disallow multi-threaded execution for MonteCarloSimulation at the binding layer:

// Note: parallel simulation must be disabled in the binding via
// num_parallel_executions=kNoConcurrency, since parallel execution of
// Python systems in multiple threads is not supported.

According to @calderpg-tri ... running the current Python tests for Monte Carlo with parallel execution enabled will cause the system to hang. My current understanding of the problem is that the C++ code needs to deal with the Python GIL (see, for instance, pybind/pybind11#1087). The big question is whether we can handle this at the level of the bindings, or whether every potential C++ caller must deal with managing the lock.

Enabling this for the Monte Carlo tools will pave the way for many other relevant use cases (such as trajectory optimization and DrakeGym) where users would benefit substantially from parallel execution while still being able to pass Python callbacks into the C++ code.

+@EricCousineau-TRI as pydrake owner.

@EricCousineau-TRI
Contributor

I liked the original intent of the thread - to explicitly fail if attempting to mix C++ multithreading with Python.

For concurrent execution in Python involving CPython API extensions, I think multiprocessing is perhaps the simplest / best way forward. Fixing the pydrake bindings to manage the GIL is feasible, but would require some design thought, especially when interacting with other CPython API extensions such as OpenCV.

I will probably be unable to dedicate any real time to this in the near future myself. Am not entirely sure how to delegate.

@jwnimmer-tri
Collaborator

For concurrent execution in Python involving CPython API extensions, I think multiprocessing is perhaps the simplest / best way forward.

Yes, I tend to agree with this.

Am not entirely sure how to delegate.

For my own part, I don't see any particular urgency here. I believe the resolution would be to create an illustration / example / tutorial showing how to set up concurrent simulations using multiprocessing.

@RussTedrake
Contributor Author

Two thoughts:

  1. Python multiprocessing normally gets stuck because our Diagram and Context objects aren't picklable. In DrakeGym I had to work around that by deferring the construction of the diagrams/contexts to the subprocess (which is not ideal).

  2. An example showing multiprocessing would be good, but I'm pretty sure it is not the resolution I seek. I want to run concurrent C++ code (like Monte Carlo simulations) in workflows that can be invoked from Python. A better resolution from my perspective would be to enable parallelism in MonteCarloSimulation, but have it throw if it encounters a Python callback in the parallel execution. This is where my questions started in the Slack thread linked above.

@jwnimmer-tri
Collaborator

jwnimmer-tri commented Jun 28, 2022

(2) Sure. If we add a new function bool System::IsThreadSafe() const then we can have Monte Carlo fail-fast in case there is a PyLeafSystem or PyVectorSystem or similar in play when num_threads > 1.

This should catch >90% of the user errors. (I'm not confident that we can plug all of the callback holes. The Monte Carlo documentation should directly explain to Python users that they cannot have any Python code in their Diagram in any way.)
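A rough Python-level sketch of that fail-fast idea, purely for illustration: the proposed `System::IsThreadSafe()` does not exist yet, and the heuristic below for spotting Python-defined systems is my own assumption, not a Drake API guarantee.

```python
from pydrake.systems.framework import Diagram

def assert_no_python_systems(diagram: Diagram) -> None:
    # Heuristic stand-in for the proposed System::IsThreadSafe(): treat any
    # subsystem whose class is not defined in a compiled pydrake module as a
    # Python system, and refuse to run it in parallel.
    # Assumes Diagram.GetSystems() is bound; it only lists direct children,
    # so nested diagrams would need a recursive walk.
    for system in diagram.GetSystems():
        if not type(system).__module__.startswith("pydrake"):
            raise RuntimeError(
                f"Subsystem '{system.get_name()}' looks like a Python system; "
                "parallel Monte Carlo execution would not be safe.")
```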

The main challenge will be with the ScalarSystemFunction output callback. It is called on multiple threads, but is a Python functor. We could add a lock there in the pydrake bindings so that output calculations happen one at a time, blocking the worker threads in the meantime; the same lock would also need to interpose itself on the simulator factory callback -- the Python code can either be making a new simulator or summarizing an output, never both concurrently.
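A minimal sketch of that serialization idea, written here at the Python level only for illustration (the actual proposal is to interpose the lock inside the pydrake bindings); the user functions are hypothetical:

```python
import threading

# One shared lock so the simulator-factory callback and the output callback
# never run concurrently, even if worker threads request them at
# overlapping times.
_callback_lock = threading.Lock()

def serialized(fn):
    def wrapper(*args, **kwargs):
        with _callback_lock:
            return fn(*args, **kwargs)
    return wrapper

@serialized
def make_simulator(generator):
    # Hypothetical user-provided SimulatorFactory; only runs while holding
    # the lock.
    ...

@serialized
def output(system, context):
    # Hypothetical user-provided ScalarSystemFunction; likewise serialized.
    return 0.0
```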

Another approach would be to add a "bool use_serial_callbacks" flag to Monte Carlo itself, so that it internally governs the make vs output callbacks -- it might be able to keep the thread schedule more full if we give it direct control. It might also end up being useful for C++ code that can't easily make a thread safe output calculator.

(1) I think pickling the functor that creates the diagram, instead of the diagram itself, should be a fairly easy and plausible work-around for many cases. It also makes it possible to scale the concurrency beyond a single machine. The functor can have bound arguments (functools.partial) for any extra configuration data; it doesn't always need to be a zero-argument function. It will still scale in case the user needs to add a Python System (e.g., looping in their controller). It is a minor hurdle to carve out the factory functor so it can be pickled, but it has the upside that you don't hit a brick wall anytime you need to go beyond pydrake's C++-bound code.
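A sketch of that work-around under multiprocessing; the pydrake module names are real, but the factory, its bound `mass` argument, and the worker function are illustrative rather than an established recipe:

```python
import functools
import multiprocessing

from pydrake.systems.analysis import Simulator
from pydrake.systems.framework import DiagramBuilder
from pydrake.systems.primitives import ConstantVectorSource

def make_diagram(mass):
    # Module-level factory so it pickles cleanly; configuration data such as
    # `mass` is bound with functools.partial rather than baked into a Diagram.
    builder = DiagramBuilder()
    builder.AddSystem(ConstantVectorSource([mass]))
    return builder.Build()

def run_one_sample(make_diagram_fn):
    # Runs in the worker process: the Diagram and Context are constructed
    # locally, so no unpicklable Drake object crosses the process boundary.
    simulator = Simulator(make_diagram_fn())
    simulator.AdvanceTo(1.0)
    return simulator.get_context().get_time()

if __name__ == "__main__":
    factory = functools.partial(make_diagram, mass=2.0)
    with multiprocessing.Pool(processes=4) as pool:
        results = pool.map(run_one_sample, [factory] * 8)
```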

@calderpg-tri
Contributor

(2) For the case of the ScalarSystemFunction output, another approach would be to restructure the dispatch loop so that output is only called serially on completed simulations, with no locks necessary. It doesn't solve potential callbacks inside the system, but it does ensure that the user-provided methods are never called in parallel.
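A Python-flavored sketch of that dispatch structure, purely to illustrate the idea (the real change would be in the C++ Monte Carlo loop; the names below are placeholders):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def monte_carlo_with_serial_output(run_one_simulation, output, num_samples,
                                   num_threads):
    # `run_one_simulation` stands in for the simulator-factory + AdvanceTo
    # step, and `output` for the ScalarSystemFunction. Simulations run on
    # worker threads, but `output` is invoked only here, on the dispatching
    # thread, as each simulation completes -- so it never runs in parallel
    # and needs no lock.
    results = []
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [pool.submit(run_one_simulation, i)
                   for i in range(num_samples)]
        for future in as_completed(futures):
            system, context = future.result()
            results.append(output(system, context))
    return results
```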

@jwnimmer-tri
Collaborator

FYI we have this comment now:

// Note: This hard-codes `parallelism` to be off, since parallel execution
// of Python systems on multiple threads was thought to be unsupported. It's
// possible that with `py::call_guard<py::gil_scoped_release>` it would
// actually be fine, so we could revisit that decision at some point.

Adding the parallelism=... arg to the bindings is probably easy.
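For reference, a hypothetical usage sketch if that argument were exposed; the keyword names below assume the binding mirrors the C++ signature, and `parallelism=2` is a placeholder for whatever type the binding ends up accepting:

```python
from pydrake.common import RandomGenerator
from pydrake.systems.analysis import MonteCarloSimulation, Simulator
from pydrake.systems.framework import DiagramBuilder
from pydrake.systems.primitives import ConstantVectorSource

def make_simulator(generator):
    # All-C++ diagram, so the parallel path never calls back into Python
    # during simulation; only this factory and `output` are Python functors.
    builder = DiagramBuilder()
    builder.AddSystem(ConstantVectorSource([1.0]))
    return Simulator(builder.Build())

def output(system, context):
    # Scalar summary of a completed simulation.
    return context.get_time()

results = MonteCarloSimulation(
    make_simulator=make_simulator, output=output, final_time=1.0,
    num_samples=10, generator=RandomGenerator(),
    parallelism=2)  # hypothetical argument; not exposed at time of writing
```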
