Use "spawn" instead of "fork" multiprocessing #12201

Open
douglas-raillard-arm opened this issue Mar 25, 2024 · 5 comments

Comments

@douglas-raillard-arm

Describe the bug

Parallel builds hang with some libraries, because the multiprocessing "fork" start method is unreliable and breaks multithreading:
https://docs.pola.rs/user-guide/misc/multiprocessing/
pola-rs/polars#7535
This becomes an issue when custom sphinx hooks using such libraries are used.

Fundamentally, the "fork" method is somewhat broken, as it cannot generally be expected to work. I'm not sure what the best way to handle that is, considering that "spawn" is a lot slower (it requires re-importing modules), but maybe that's OK with a long-lived pool.
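
To illustrate the "long-lived pool" idea, here is a minimal sketch using only the standard library. It is not how Sphinx currently starts its workers; `make_spawn_executor` is a hypothetical helper name, and `math.factorial` merely stands in for real per-document work. The point is that the re-import cost of "spawn" is paid once per worker, not once per task, when the same pool is reused across batches.

```python
import math
import multiprocessing as mp
from concurrent.futures import ProcessPoolExecutor


def make_spawn_executor(max_workers=2):
    """Create a process pool whose workers are started with "spawn".

    Hypothetical helper: each worker is a fresh interpreter, so it does
    not inherit threads or locks from the parent the way "fork" does.
    """
    ctx = mp.get_context("spawn")
    return ProcessPoolExecutor(max_workers=max_workers, mp_context=ctx)


if __name__ == "__main__":
    # The guard matters with "spawn": workers re-import the main module,
    # and unguarded code would run again in every worker.
    with make_spawn_executor() as executor:
        # The same workers serve both batches, so the start-up cost is
        # amortized over the lifetime of the pool.
        for batch in (range(0, 5), range(5, 10)):
            print(list(executor.map(math.factorial, batch)))
```

Whether this amortization is enough for a real Sphinx build would have to be measured.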

How to Reproduce

Unfortunately, I haven't managed to narrow this down to a minimal reproducer. In our case, the issue happens when calling polars.LazyFrame.collect(), as this triggers multithreaded work in an extension module (the Rust code of polars, which probably does not hold the GIL for all of its computations but allows Python callbacks, so it must hold it at least some of the time).

Environment Information

Platform:              linux; (Linux-5.15.0-92-generic-x86_64-with-glibc2.31)
Python version:        3.11.7 (main, Dec  8 2023, 18:56:57) [GCC 9.4.0])
Python implementation: CPython
Sphinx version:        7.2.6
Docutils version:      0.20.1
Jinja2 version:        3.1.3
Pygments version:      2.17.2

Sphinx extensions

No response

Additional context

No response

@picnixz
Member

picnixz commented Mar 25, 2024

I would agree that 'fork' is the fast (and sometimes unreliable) way to do it. However, 'spawn' may not be a drop-in alternative, because we would need to know how module-level stateful objects are handled. I remember having an issue in a separate project where 'spawn' (or 'forkserver', as a trade-off between reliability and speed) did not entirely work. But I may be wrong about that.

@douglas-raillard-arm
Author

because we should know how module-level stateful objects are handled.

AFAIK you would just lose it. Spawn creates a new, clean Python process and re-imports the modules in it, so the state of mutable globals is "reset" in the new process (I need to confirm that). If there is a registry of such state, then it should be possible to restore it, assuming that state can be pickled.
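
A small sketch of what that could look like, assuming the state lives in a single picklable dict. `REGISTRY`, `snapshot_registry`, `restore_registry` and `lookup` are hypothetical names made up for this illustration, not anything Sphinx provides. The snapshot is pickled in the parent and handed to each spawned worker through the pool's `initializer`.

```python
import multiprocessing as mp
import pickle

# Hypothetical module-level registry standing in for "mutable globals".
REGISTRY = {}


def snapshot_registry():
    """Serialize the registry in the parent so it can be sent to workers."""
    return pickle.dumps(REGISTRY)


def restore_registry(blob):
    """Pool initializer: rebuild the registry inside a spawned worker."""
    REGISTRY.clear()
    REGISTRY.update(pickle.loads(blob))


def lookup(key):
    """Runs in the worker; returns None if the key is not in its registry."""
    return REGISTRY.get(key)


if __name__ == "__main__":
    REGISTRY["answer"] = 42  # state set after import, in the parent only
    ctx = mp.get_context("spawn")

    # No initializer: the worker re-imports this module and starts from
    # the import-time value of REGISTRY, so the parent's change is lost.
    with ctx.Pool(1) as pool:
        print(pool.apply(lookup, ("answer",)))

    # With an initializer, the pickled snapshot is restored in the worker.
    with ctx.Pool(
        1, initializer=restore_registry, initargs=(snapshot_registry(),)
    ) as pool:
        print(pool.apply(lookup, ("answer",)))
```

This only works for state that is both picklable and known to the registry, which is exactly the limitation discussed in this thread.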

@picnixz
Member

picnixz commented Mar 25, 2024

If there is a registry of such state, then it should be possible to restore it assuming that state can be pickled.

Yes, that's what I remembered, and I think it would be needed, but I don't know about all the stateful objects... so it might be impossible for us to do it :(

@chrisjsewell
Member

chrisjsewell commented Mar 25, 2024

FYI @ubmarco has already given this quite a bit of thought, as part of his work on #11746

@douglas-raillard-arm
Author

For now, I worked around the issue by pre-generating the data in conf.py so that the hook only needs to look up the pre-computed information in a dict. That also had the advantage of allowing the use of ThreadPoolExecutor (otherwise I would only get the page-based parallelism of Sphinx, which was not helping, as all of the computation happened for a single page).
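
Roughly, the workaround has the following shape. This is a simplified sketch, not the actual conf.py: `expensive_compute`, `precompute`, `PRECOMPUTED` and `hook_lookup` are placeholder names, and the string transformation stands in for the real polars work.

```python
from concurrent.futures import ThreadPoolExecutor


def expensive_compute(name):
    """Placeholder for the heavy call (a polars collect() in the real case)."""
    return name.upper()


def precompute(names, max_workers=4):
    """Compute every result up front, in threads, and return them as a dict."""
    names = list(names)
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return dict(zip(names, executor.map(expensive_compute, names)))


# Evaluated once when conf.py is imported, before any worker is forked.
PRECOMPUTED = precompute(["alpha", "beta"])


def hook_lookup(name):
    """What the hook does at build time: a plain dict lookup, no threads."""
    return PRECOMPUTED[name]
```

Since the hook no longer starts any multithreaded work inside a forked worker, the hang is avoided without changing the start method.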
