Use "spawn" instead of "fork" multiprocessing #12201
Comments
I would agree that 'fork' is the fast (and sometimes unreliable) way to do it. However, 'spawn' may not be a full alternative, because we would need to know how module-level stateful objects are handled. I remember an issue in a separate project where 'spawn' (or 'forkserver', as a tradeoff between reliable and fast) did not work entirely. But I may be wrong about that.
AFAIK you would just lose it. Spawn starts a new, clean Python process and re-imports the modules in it, so the state of mutable globals is "reset" in the new process (I need to confirm that). If there is a registry of such state, it should be possible to restore it, assuming the state can be pickled.
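A minimal sketch of that "registry plus pickle" idea, using only the standard library. `REGISTRY`, `snapshot_registry`, `restore_registry`, `lookup` and `make_spawn_pool` are hypothetical names made up for illustration, not anything Sphinx provides; the sketch assumes every value in the registry can be pickled.

```python
import multiprocessing as mp
import pickle

# Hypothetical module-level registry standing in for any mutable global state.
REGISTRY = {}


def snapshot_registry():
    """Serialize the parent's registry; raises if a value cannot be pickled."""
    return pickle.dumps(REGISTRY)


def restore_registry(payload):
    """Pool initializer: runs once in each worker, after modules are re-imported."""
    REGISTRY.clear()
    REGISTRY.update(pickle.loads(payload))


def lookup(key):
    """Task run in a worker; it only sees what restore_registry put back."""
    return REGISTRY.get(key)


def make_spawn_pool(processes=2):
    """Create a 'spawn' pool whose workers start from the parent's snapshot."""
    ctx = mp.get_context("spawn")
    return ctx.Pool(
        processes,
        initializer=restore_registry,
        initargs=(snapshot_registry(),),
    )
```

With "spawn", the code that calls `make_spawn_pool()` has to sit behind an `if __name__ == "__main__":` guard (or inside a function that is not run at import time), since each worker re-imports the main module. State that cannot be pickled would still be lost, which is the concern raised above.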
Yes, that's what I remembered, and I think this would be needed, but I don't know about all stateful objects... so it might be impossible for us to do it :(
For now I worked around the issue by pre-generating the data in conf.py, so that the hook only needs to look up the pre-computed information in a dict. That also had the advantage of allowing the use of ThreadPoolExecutor (otherwise I would only get the page-based parallelism of Sphinx, which was not helping as all of the computation happened for a single page).
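Roughly, that workaround could look like the sketch below. `expensive_compute`, `precompute`, `PRECOMPUTED` and `hook_lookup` are hypothetical placeholders (the real computation and hook are not shown in this thread); the point is only that the threaded work runs once in conf.py, before Sphinx starts any worker process, and the hook is reduced to a dict lookup.

```python
from concurrent.futures import ThreadPoolExecutor


def expensive_compute(name):
    """Hypothetical stand-in for the real per-item computation."""
    return name.upper()


def precompute(names, max_workers=4):
    """Compute all items up front with threads; map() keeps the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(names, pool.map(expensive_compute, names)))


# In conf.py: runs at import time, before any parallel build worker exists.
PRECOMPUTED = precompute(["alpha", "beta"])


def hook_lookup(name):
    """What the hook is reduced to: a plain lookup, no threads involved."""
    return PRECOMPUTED[name]
```

Because the thread pool is shut down by the `with` block before the build forks, the forked workers only inherit the finished dict.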
Describe the bug
Parallel builds hang with some libraries, as the multiprocessing "fork" start method is unreliable and breaks multithreading:
https://docs.pola.rs/user-guide/misc/multiprocessing/
pola-rs/polars#7535
This becomes an issue when custom sphinx hooks using such libraries are used.
Fundamentally, the "fork" method is somewhat broken, as it cannot generally be expected to work. I'm not sure what the best way to handle that is, considering that "spawn" is a lot slower (it requires re-importing modules), but maybe it's ok with a long-lived pool.
How to Reproduce
Unfortunately, I haven't managed to narrow this down to a minimal reproducer. The issue happens when calling
polars.LazyFrame.collect()
in our case, as this triggers multithreaded work in an extension module (the Rust code of polars, which probably does not hold the GIL for all of its computations, but allows Python callbacks, so it must hold it at least some of the time).
Environment Information
Sphinx extensions
No response
Additional context
No response