-
-
Notifications
You must be signed in to change notification settings - Fork 31.7k
Make struct module PEP-384 compatible #82257
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Make struct module PEP-384 compatible |
Executing this simple code after this commit segfaults: from multiprocessing.pool import Pool
class A(object):
def __init__(self):
self.pool = Pool()
def __del__(self):
self.pool.close()
self.pool.join()
a = A() [1] 28019 segmentation fault ./python.exe ../lel.py The reason is that there is a call to PyModule_GetState with the module being NULL:
I am marking this as a release blocked |
I can reproduce the crash on Linux. I interrupt the problem with CTRL+c (why is it blocked? I don't know). Then it does crash. First, _PyImport_Cleanup() does clear all modules including _struct. Then, _PyImport_Cleanup() calls gc.collect() which finalize_garbage() which calls A.__del__(). Problem: at this point, the _struct became unusable. -- Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x0000000000473f30 in PyModule_GetState (m=0x0) at Objects/moduleobject.c:565
565 if (!PyModule_Check(m)) {
(gdb) py-bt
Traceback (most recent call first):
<built-in method pack of module object at remote 0x7fffea99acb0>
File "/home/vstinner/python/master/Lib/multiprocessing/connection.py", line 400, in _send_bytes
header = struct.pack("!i", n)
File "/home/vstinner/python/master/Lib/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/home/vstinner/python/master/Lib/multiprocessing/queues.py", line 368, in put
self._writer.send_bytes(obj)
File "/home/vstinner/python/master/Lib/multiprocessing/pool.py", line 649, in close
self._change_notifier.put(None)
File "/home/vstinner/python/master/x.py", line 7, in __del__
self.pool.close()
Garbage-collecting In debug mode, the crash occurs in s_pack() at:
-- #define _structmodulestate(o) ((_structmodulestate *)PyModule_GetState(o))
static struct PyModuleDef _structmodule;
#define _structmodulestate_global _structmodulestate(PyState_FindModule(&_structmodule))
#define PyStruct_Check(op) PyObject_TypeCheck(op, (PyTypeObject *)_structmodulestate_global->PyStructType) The problem is "_structmodulestate_global->PyStructType": PyState_FindModule(&_structmodule) returns NULL, _structmodulestate() calls PyModule_GetState(NULL) which does crash. -- The question is why the _struct module was cleared whereas there was still a reference to it. Is it part of a reference cycle? |
It seems problematic that_PyInterpreterState_ClearModules runs before all instances from a module have been cleared. If PyState_FindModule is no longer able to return module state then there's no way for a module to reliably work at shutdown other than having all instances hold onto the module (or module state) directly from all of their insatances. Certainly that would mimic more closely what happens w/ pure Python instances and modules - the type will hold onto the functions which will hold onto the module global state. |
This is a relatively simple repro of the underlying problem: import _struct
s = _struct.Struct('i')
class C:
def __del__(self):
s.pack(42, 100)
_struct.x = C() It's a little bit different in that it is actually causing the module to attempt to throw an exception instead of do a type check. |
And here's a variation which doesn't involve any instances from the module: import _struct
class C:
def __init__(self):
self.pack = _struct.pack
def __del__(self):
self.pack('I', -42)
_struct.x = C() |
#18038 is a partial fix for this. I think it fixes the crash at shutdown, although I'm still seeing a hang on master on Linux which is different then earlier versions of Python. I seem to have a really bogus stack trace when I attach to it so I'm not quite certain what's going on there. |
Hey all, I've got a fix for this bug and the CI is green: #18039 TL;DR: The module state have to be cleared at a later time. I explain in detail in the PR. Also, I didn't add a new test since there was a test that was already checking for module states in |
Is it enough to reduce the issue priority from release blocker to normal? |
The error can still happened in other modules and under similar conditions, no? |
The question was if the next 3.9 *alpha* release must be blocked by this issue. I don't think so. I reduce the priority to normal (not set). If someone disagrees, feel free to raise it again to release blocker ;-) |
The PR that I sent out already fixes the issue. @vstinner, @pablogsal, please take a look again #18039 That should close this issue, no need to work around the bug priority. |
I'm concerned by release blocker because 3.9.0a3 version is supposed to be released soon, and usually release blocker do block a release :-) |
Ah! That makes sense! In any case, feel free to ping me if you need help on my side to get this PR through (or to remove the release blocker). |
With either fix, or with both, on Linux I still see this hang at shutdown. Victor mentioned the fact that he had to hit Ctrl-C on Linux to see this, and I have to do the same thing. Then with the fixes in place the original test case still hangs on shutdown. On Python 3.7 (I don't readily have 3.8 available) at least this just runs and completes with no ctrl-C and no crashes. So while either of the fixes may be good to prevent the crashes, there's still probably some underlying issue in multiprocessing. I haven't tested on Mac OS/X. It looks like the clearing was originally introduced here: https://bugs.python.org/issue10241 Interestingly there was a similar issue w/ _tkinter, which also used PyType_FromSpec (although it sounds like it was just a ref count issue on the type). Unfortunately there's no associated test cases added to verify the behavior. Antoine and Neil are both now on the PR which removes the collection behavior so hopefully they can chime in on the safety of that fix. |
One more data point: Backporting this change to Python 3.6 (I just happened to have it applied there already, so I haven't tried it on 3.7 or 3.8) has no crash and no hangs in multiprocessing on Linux. So something definitely changed in multiproessing which is causing the hang on shutdown, and forces us into this code path where we crash as well. |
Whoa, I've never heard that before! <wink> |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: