segfault during shutdown attempting to log ResourceWarning #67087
I am running a unittest suite for Motor, my MongoDB driver, which uses PyMongo, greenlet, and Tornado. The suite commonly segfaults during interpreter shutdown. I've reproduced this crash with Python 3.3.5, 3.4.1, and 3.4.2. Python 2.6 and 2.7 do *not* crash.
The Python interpreters are all built like:
    ./configure --prefix=/mnt/jenkins/languages/python/rX.Y.Z --enable-shared && make && make install
This is Amazon Linux AMI release 2014.09. The unittest suite's final output is:
    ----------------------------------------------------------------------
    OK
A backtrace from a Python 3.4.2 coredump is attached.
The crash occurs whether or not I specify "-Wignore" on the python command line. I was hoping to avoid the crash by short-circuiting the ResourceWarning code path, since the following line appears in the backtrace:
    #5 PyErr_WarnFormat (category=<optimized out>, stack_level=stack_level@entry=1, format=format@entry=0x7f5f1ca8b377 "unclosed file %R") at Python/_warnings.c:813
But "-Wignore" has no effect.
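For context, the "unclosed file %R" format string in frame #5 belongs to the ResourceWarning that CPython issues when a file object is deallocated without having been closed. The following is a minimal sketch of that behaviour during normal execution (not at shutdown), assuming CPython's reference counting; the helper name unclosed_file_warnings is made up for illustration and is not part of the report.

```python
import gc
import os
import warnings


def unclosed_file_warnings():
    """Return the ResourceWarning messages produced by dropping an open file."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure ResourceWarning is not filtered out
        f = open(os.devnull, "rb", buffering=0)
        del f          # last reference gone: CPython deallocates the file here
        gc.collect()   # only matters on implementations without reference counting
    return [str(w.message) for w in caught
            if issubclass(w.category, ResourceWarning)]


messages = unclosed_file_warnings()
print(messages)
```

In the reported crash the same kind of warning is attempted while the interpreter is being finalized, which is presumably why the backtrace passes through PyErr_WarnFormat.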
It looks as if globals in setup_context() is NULL. I suppose it is PyThreadState_Get()->interp->sysdict. interp->sysdict is cleared in PyInterpreterState_Clear(). Other code is executed after interp->sysdict is set to NULL (clearing the contents of interp->sysdict, interp->builtins, interp->builtins_copy and interp->importlib), and this can potentially emit warnings.
It looks like the problem is that raising the PyExc_RecursionErrorInst singleton creates a traceback object which contains frames. The singleton keeps the frames alive longer than expected. I tried to write a script to raise this singleton, but it looks like the local variables of the frames are not deleted, even if frames are deleted (by _PyExc_Fini). You may try to finish my attached runtimerror_singleton.py script.
Oh, I also wrote a draft patch fixing the issue, but I was unable to reproduce the issue. See attached warn.patch (not tested).
The attached script raises the PyExc_RecursionErrorInst singleton and reproduces the issue.
+ /* during Python finalization, warnings may be emited after interp->sysdict
I would prefer to see this comment in the else block.
Indeed.
Why is the recursion limit restored? Couldn't the test be simpler without it? %a should be used instead of '%s' (the file path can contain backslashes). And it would be more robust to open the file in binary mode (maybe even unbuffered), since it can contain non-ASCII characters. Maybe the test should be marked as CPython only. To check that the script is executed at all, we should print something from it and then test the output. Otherwise a syntax error in the script will make the whole test pass.
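The point about %a versus '%s' can be illustrated with a small sketch. The path below is a hypothetical Windows-style path chosen so that its backslashes form valid escape sequences; it is not taken from the patch under review.

```python
path = "C:\\new\\test.py"  # hypothetical path containing backslashes

source_s = "p = '%s'" % path   # backslashes are pasted raw into the literal
source_a = "p = %a" % path     # ascii() output is a properly escaped literal

ns_s, ns_a = {}, {}
exec(source_s, ns_s)
exec(source_a, ns_a)

print(ns_s["p"] == path)  # False: "\n" and "\t" were read as escapes
print(ns_a["p"] == path)  # True
```

With '%s' the generated script silently opens a different file name, whereas %a round-trips the path unchanged.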
For the sake of explicitness, so that the interpreter will not raise a RuntimeError during finalization when checking for the recursion limit after g.throw(MyException) has raised PyExc_RecursionErrorInst.
Thanks. The reason why the PyExc_RecursionErrorInst singleton keeps the frames alive longer than expected is that the reference count of the PyExc_RecursionErrorInst static variable never reaches zero until _PyExc_Fini(). So decrementing the reference count of this exception after the traceback has been printed in PyErr_PrintEx() does not decrement the reference count of its traceback attribute (as is the case with the other exceptions) and the traceback is not freed. The following patch to PyErr_PrintEx() does that. With this new patch and without the changes made by warn_4.patch, the interpreter does not crash with the runtimerror_singleton_2.py reproducer and the ResourceWarning is now printed instead of being ignored as with warn_4.patch:
diff --git a/Python/pythonrun.c b/Python/pythonrun.c
--- a/Python/pythonrun.c
+++ b/Python/pythonrun.c
@@ -1876,6 +1876,8 @@
         PyErr_Display(exception, v, tb);
     }
     Py_XDECREF(exception);
+    if (v == PyExc_RecursionErrorInst)
+        Py_CLEAR(((PyBaseExceptionObject *)v)->traceback);
     Py_XDECREF(v);
     Py_XDECREF(tb);
 }
If both patches were to be included, the test case in warn_4.patch would test the above patch and not the changes made in Python/_warnings.c.
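The keep-alive effect described in this comment can be reproduced at the Python level. The sketch below is only an analogy, assuming CPython: SINGLETON is an ordinary module-level RuntimeError standing in for the preallocated PyExc_RecursionErrorInst, and Resource and raise_singleton are illustrative names that do not appear in the patches.

```python
import gc
import weakref


class Resource:
    """Stand-in for a local object such as the unclosed file."""


# A long-lived exception instance, analogous to a static singleton.
SINGLETON = RuntimeError("stand-in for a preallocated exception instance")


def raise_singleton():
    local = Resource()
    ref = weakref.ref(local)
    try:
        raise SINGLETON  # attaches a traceback referencing this frame
    except RuntimeError:
        pass
    return ref


ref = raise_singleton()
gc.collect()
alive_before = ref() is not None   # frame locals are still reachable

SINGLETON.__traceback__ = None     # Python-level analogue of Py_CLEAR(...->traceback)
gc.collect()
alive_after = ref() is not None    # the local has now been released

print(alive_before, alive_after)
```

As long as the exception instance survives, its traceback keeps the frame and the frame's locals alive; clearing the traceback releases them, which is what the proposed Py_CLEAR() call does for the singleton.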
out can be b'Done.\r\n'. Use self.assertIn.
You can test err for the warning message. The traceback should be cleared before decrementing the reference count. And only if Py_REFCNT(v) is 2.
With warn_4.patch applied I can no longer reproduce my original segfault; it looks like the fix works.
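The review points about checking the script's output could be handled along these lines. This is a sketch only: run_script is a hypothetical helper, not the helper used by the CPython test suite, and the one-line script stands in for the real reproducer.

```python
import os
import subprocess
import sys
import tempfile


def run_script(source):
    """Write source to a temporary file, run it, return (rc, stdout, stderr)."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "script.py")
        with open(path, "w", encoding="utf-8") as f:
            f.write(source)
        proc = subprocess.run([sys.executable, path],
                              stdout=subprocess.PIPE,
                              stderr=subprocess.PIPE)
    return proc.returncode, proc.stdout, proc.stderr


rc, out, err = run_script("print('Done.')\n")

# A containment check tolerates both b'Done.\n' and b'Done.\r\n'.
print(rc, b"Done." in out)
```

In a unittest.TestCase the last check would be written as self.assertIn(b'Done.', out), and err could be inspected for the warning text in the same way.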
Ok, new patch attached.
In the case where PyExc_RecursionErrorInst would not leak frames, the code path followed by the test case would not run any of the changes made in _warnings.c.
I believe that attempting to fix the frames leak by clearing the traceback implies the following changes:
Note [1]: Not sure if this is worth the trouble.
Out of curiosity I have tried to figure out how to build another test case using the model provided by runtimerror_singleton.py. This cannot be done, and for the following reasons: The infinite recursion of PyErr_NormalizeException() is supposed to occur as follows: when a RuntimeError caused by recursion is normalized, PyErr_NormalizeException() calls the RuntimeError class to instantiate the exception, the recursion limit is reached again, triggering a new RuntimeError that needs also to be normalized causing PyErr_NormalizeException() to recurse infinitely. But the low/high water mark level heuristic of the anti-recursion protection mechanism described in a comment of ceval.h prevents this. Let's assume the infinite recursion is possible:
This explains the paradox that, if you entirely remove the check against infinite recursion in PyErr_NormalizeException(), then the runtimerror_singleton_2.py reproducer does not crash and the ResourceWarning is printed even though the recursion limit has been reached. The attached patch implements this fix, includes the previous changes in _warnings.c, and moves the test case to test_exceptions.
History (for reference): [1] http://svn.python.org/view?view=revision&revision=58032
When tstate->overflowed is already set to 1 before entering PyErr_NormalizeException() to normalize an exception, the following cases may occur:
Cases 3) and 4) can be tested with runtimerror_singleton_3.py (install mymodule with setup.py for all three test cases in 4). remove_singleton.patch introduces a regression in case c), but IMHO the abort in case c) is consistent with the abort in case 3), they
I tried the following script on Python 3.5 and Python 3.6 and I failed to reproduce the bug:
    import sys, traceback

    class MyException(Exception):
        def __init__(self, *args):
            1/0

    def gen():
        f = open(__file__, mode='rb', buffering=0)
        yield

    g = gen()
    next(g)
    recursionlimit = sys.getrecursionlimit()
    sys.setrecursionlimit(len(traceback.extract_stack())+3)
    try:
        g.throw(MyException)
    finally:
        sys.setrecursionlimit(recursionlimit)
        print('Done.')
Note: I had to add "+3" to the sys.setrecursionlimit() call, otherwise the limit is too low and you get a RecursionError (it's a recent bugfix, issue bpo-25274). Can someone else please confirm that the bug is fixed?
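For readers unfamiliar with why MyException.__init__ matters here: throwing an exception class (rather than an instance) into a generator makes the interpreter instantiate the class while normalizing the exception, so a failure in __init__ replaces the exception being thrown. A minimal sketch of that step, without the recursion-limit manipulation and assuming current CPython behaviour:

```python
class MyException(Exception):
    def __init__(self, *args):
        1 / 0  # fails while the exception is being instantiated


def gen():
    yield


g = gen()
next(g)
try:
    g.throw(MyException)  # passing the class, not an instance
except ZeroDivisionError as exc:
    caught = type(exc).__name__
else:
    caught = None

print(caught)
```

In the script above, the lowered recursion limit is meant to make this instantiation step hit the recursion check instead of simply raising ZeroDivisionError.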
When tested with runtimerror_singleton_3.py (see msg 231933 above), the latest Python 3.6.0a0 (default:3eec7bcc14a4, Mar 24 2016, 20:16:19) still crashes:
    $ python runtimerror_singleton_3.py
    Importing mymodule.
    Traceback (most recent call last):
      File "runtimerror_singleton_3.py", line 26, in <module>
        foo()
      File "runtimerror_singleton_3.py", line 23, in foo
        g.throw(MyException) # Entering PyErr_NormalizeException()
      File "runtimerror_singleton_3.py", line 14, in gen
        yield
    RecursionError: maximum recursion depth exceeded
    Segmentation fault (core dumped)
warn_5.patch: The patch cannot be reviewed on Rietveld :-( You must not use the git format for the diff.
warn_5.patch: "if (globals == NULL) { (...) return 0; }"
It looks like filename is not initialized. I suggest using:
    *filename = f->f_code->co_filename;
It looks like you have to add:
    if (PyUnicode_Check(*filename)) *filename = NULL;
to mimic the code below.
Victor, with warn_5.patch *filename is not set when globals is NULL: setup_context() returns 0, and so do_warn() returns NULL without calling warn_explicit(). This is different from your initial warn.patch, where setup_context() returns 1 in that case and an attempt is made to issue the warning.
The issue bpo-17852 is still alive and has a reference to this issue. It would be nice to rebase the latest patch on master and create a PR ;-)
Antoine asked in PR 1981:
Just checked that the crasher infinite_rec_2.py (removed by 1e534b5) does not crash with PR 1981. The other crashers listed at 1e534b5 are not valid Python 3.7 code. Does anyone know how to translate them into Python 3.7?
With PR 1981 infinite recursion does not occur in PyErr_NormalizeException() when the tstate->overflowed flag is false upon entering this function and:
Removing the PyExc_RecursionErrorInst singleton decreases the cases covered by the recursion checks because the test made upon using PyExc_RecursionErrorInst (in the 'finally' label of PyErr_NormalizeException()) has the side effect of adding another recursion check to the normal recursion machinery of _Py_CheckRecursiveCall(). Those are corner cases though, such as, for example, the following case that will now abort with PR 1981 [1]:
[1] But with PR 1981, a RecursionError is raised when replacing MyException in test_recursion_normalizing_exception() at Lib/test/test_exceptions.py with:
    class MyException(Exception):
        def __init__(self):
            raise MyException
The fact that the traceback of PyExc_RecursionErrorInst causes an issue means that PyExc_RecursionErrorInst is used. We can't just remove PyExc_RecursionErrorInst since this can cause a stack overflow or, with merged PR 2035, an infinite loop. Perhaps the solution to this issue is to clear the __traceback__, __cause__ and __context__ attributes of PyExc_RecursionErrorInst as early as possible.
The simplest solution -- make BaseException_set_tb(), BaseException_set_context() and BaseException_set_cause() no-ops for PyExc_RecursionErrorInst.
Please give an example where a stack overflow occurs when PyExc_RecursionErrorInst has been removed.
Fixed by bpo-30697.