New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
C unpickling bypasses import thread safety #78753
Comments
Retrieving and using a module directly from sys.modules (from C in this case) leads to a race condition where the module may be importing on another thread but has not yet been initialised. For slow filesystems or large modules (e.g. numpy) this seems to lead to easily reproducible errors (the attached code fails 100% of the time on my work machine - CentOS 7). I believe they have to be in sys.modules during this phase due to the possibility of circular references. importlib handles this carefully with locking, but _pickle.c bypasses all that, leading to issues with threaded codes that use pickling, e.g. dask/distributed. |
Hi! Just wondering if there is anything I can do to move this along? |
Interesting I could not reproduce here, even by throwing Pandas into the mix and spawning 1024 workers... |
Perhaps PyImport_GetModule() should aquire-release the module's lock before the lookup? This would effectively be a call to _lock_unlock_module() in importlib._bootstrap. The alternative is to encourage using PyImport_Import() instead, like the PR has done. In the case the docs for PyImport_GetModule() should make it clear that it is guaranteed that the module is fully imported yet (and recommend using PyImport_Import() for the guarantee). Either way there should be a new issue for the more general change (and it should reference this issue). |
I agree that more generally PyImport_GetModule() should be fixed. But it should be done carefully so as to not to lose the performance benefit of doing it. I think we should open a separate issue about that. PS: one possibility is to reuse the optimization already done in PyImport_ImportModuleLevelObject(): /* Optimization: only call _bootstrap._lock_unlock_module() if
__spec__._initializing is true.
NOTE: because of this, initializing must be set *before*
stuffing the new module in sys.modules.
*/
spec = _PyObject_GetAttrId(mod, &PyId___spec__);
if (_PyModuleSpec_IsInitializing(spec)) {
PyObject *value = _PyObject_CallMethodIdObjArgs(interp->importlib,
&PyId__lock_unlock_module, abs_name,
NULL);
if (value == NULL) {
Py_DECREF(spec);
goto error;
}
Py_DECREF(value);
}
Py_XDECREF(spec); |
Opened bpo-35943 for PyImport_GetModule(). |
Thanks, Antoine. |
This is pushed to 3.7 and master now. Thank you Tim for the report and the fix! |
For this issue only 3.7 and 3.8 are listed as affected versions, but it appears to be also reproducible on the latest 3.5 and 3.6 releases. I've attached a script that can be used to reproduce the issue on those earlier releases (it consistently fails for me with values of 50 or higher as command line argument). |
3.6 and 3.5 are in security mode, so unless there's a security risk due to this bug the fix will not be backported to those older versions (https://devguide.python.org/#status-of-python-branches). |
While investigating[1], I observe that certain unpickling operations, for example, Unpickler.find_class, remain not thread-safe in Python 3.7.5 and earlier versions that I tried. I have not tried 3.8, but cannot reproduce this error on Python 2. For example, attached find_class_deadlock.py which consistently fails with a deadlock on Python 3.7.5. I opened https://bugs.python.org/issue38884, which may be causing these errors, since the failure mode is similar, and in at least some codepaths, find_class seems to be calling __import__ [2]. [1] https://issues.apache.org/jira/browse/BEAM-8651 Line 1426 in 4ffc569
|
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: