Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PARI is not thread-safe from above #28800

Open
embray opened this issue Nov 25, 2019 · 6 comments
Open

PARI is not thread-safe from above #28800

embray opened this issue Nov 25, 2019 · 6 comments

Comments

@embray
Copy link
Contributor

embray commented Nov 25, 2019

This is a follow-up to #26608. That ticket specifically discussed the issue of multi-threaded PARI causing Sage's docbuild to break. That problem is worked around in #28356, so I decided to close #26608.

However, the general problem remains, which is that PARI is not thread-safe from above, meaning that while threads created and managed by the PARI library itself work fine, threads created in a multi-system environment (like Sage) which happen to use PARI (specifically PARI built with multi-threading support) it will segfault.

This has been discussed in #26608 as well as this discussion on the OpenDreamKit project as well as related in-person conversations for which I unfortunately lack notes.

With #26608 resolved this is fortunately not an immediate problem for Sage, though it would be very easy for someone thinking they can just carelessly use threads (e.g. from the Python level) in their own code and experience similar crashes.

This is also not a problem just of PARI; getting this kind of multi-system multi-level parallelism right is hard, and should require cooperation, and specific guidelines to follow. Although I have not yet hit any other specific examples I have no doubt that the exist; for example I would not be surprised if HPC-GAP has similar problems.

Upstream: Reported upstream. Developers deny it's a bug.

CC: @antonio-rojas @jdemeyer @kiwifb @dimpase @saraedum @embray @timokau

Component: interfaces

Keywords: pari threading parallelism gap

Issue created by migration from https://trac.sagemath.org/ticket/28800

@embray

This comment has been minimized.

@dimpase
Copy link
Member

dimpase commented Nov 25, 2019

comment:3

Macaulay2 does have these problems with Pari, AFAIK. They are thinking of removing Pari from their dependencies all together due to this.

@embray
Copy link
Contributor Author

embray commented Dec 13, 2019

comment:4

As I think I mentioned in the original discussion, the issue could be mitigated in PARI somewhat, with a few strategically-placed checks to ensure that important thread-local variables have been initialized, and re-initialize them as needed (using more-or-less existing code for re-initializing PARI in its own, self-managed threads).

@embray
Copy link
Contributor Author

embray commented Dec 13, 2019

comment:5

An alternative approach (although one that would still be made easier with some internal refactoring of PARI*) would be like Python's PyGILState_Ensure(). This places some onus on users of PARI in multi-threaded code to make sure PARI's interpreter state is properly initialized before using it in a new thread, which is not an unfair thing to ask users to do.

* I should clarify what I mean by this. When multi-threading was added to PARI, a number of global variables were simply converted directly to thread-local variables (sometimes it's not clear to me if all of them need to be thread-local; I don't know). By contrast, to use CPython again as an example (since I know it well), all variables that need to be thread specific (e.g. in PARI this would include things like the main stack pointer) are collected into a single PyThreadState struct, which makes it much easier to manage. Each thread has its own threadstate stored in a thread-local variable, so just one variable instead of a whole bunch (meaning only one call to the TSS APIs to get/set it). A similar reorganization of PARI's thread-specific variables would be helpful.

@dimpase
Copy link
Member

dimpase commented Aug 23, 2020

comment:6

one way around this difficulty is to use spawn, not fork, in multiprocessing, something that is available in Python 3.7 and later.
One just calls

multiprocessing.set_start_method('spawn')

somewhere early enough.

This alone is not enough, one needs to rework various things due to spawn pickling the environment.

@embray
Copy link
Contributor Author

embray commented Aug 31, 2020

comment:7

Yeah, setting set_start_method('spawn') globally would wreak havoc, though might be useful in some careful cases. I think this specific issue would be better addressed with improvements to how PARI manages its thread-local state.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants