-
-
Notifications
You must be signed in to change notification settings - Fork 31.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test_slice: gc_decref: Assertion "gc_get_refs(g) > 0" failed: refcount is too small #80142
Comments
I am seeing some failures in travis and some buildbots: 0:02:24 load avg: 3.30 [147/420/1] test_slice crashed (Exit code -6) -- running: test_multiprocessing_spawn (32 sec 523 ms), test_asyncio (45 sec 433 ms), test_multiprocessing_forkserver (47 sec 869 ms) Usually, this happens with test_slice but when the test runner runs test_slice in isolation, it succeeds. I am afraid that this will be a weird combination of tests. |
I too saw this a week back but couldn't reproduce it. There is one another test that fails like this and passes on verbose mode bpo-35809 . Travis doesn't report tests that fail in the complete run and pass during verbose run like buildbots do. |
bpo-35977 has been marked as a duplicate, copy of the first message: s390x Debian 3.x: 0:06:26 load avg: 0.92 [291/420/1] test_slice crashed (Exit code -6) object : <test.test_slice.SliceTest.test_cmp.<locals>.BadCmp object at 0x3ff93967e90> Current thread 0x000003ff95272700 (most recent call first): |
Do you have an url to an example of crash on Travis? Which buildbots are affected? I saw the crash on s390x Debian 3.x. |
I'm unable to reproduce the issue on my Fedora 29 (x86-64) laptop. I used two terminals:
|
Extract of the crashing test: def test_cmp(self):
...
class Exc(Exception):
pass
class BadCmp(object):
def __eq__(self, other):
raise Exc
s1 = slice(BadCmp())
s2 = slice(BadCmp())
self.assertEqual(s1, s1)
self.assertRaises(Exc, lambda: s1 == s2) # <==== CRASH HERE
The latest change in test_slice.py has been made in 2016: 3 years ago. Latest changes in Objects/sliceobject.c were refactoring related to header files (november 2018). The latest "significant" change was done in April 2018: "bpo-33012: Fix invalid function cast warnings with gcc 8 for METH_NOARGS". I bet more on a recent change which is not directly related to slices... like the implementation of the PEP-572? It would help to know when the crash started to occur exactly (date + time). |
Lukasz: I'm not sure what is happening here, but such random bugs are always scrary. I raise the priority to "release blocker" until we know more about the issue. |
On s390x Debian 3.x, it seems like the bug started to show up near: This build contains 2 changes, one doc change and this namedtuple change which mentions reference count: "Complete and neaten-up namedtuple's replacement of builtin function lookups with derefs (GH-11794)" Ok, but I don't see the relationship between namedtupled and this specific test!? |
Travis crash that occurred few days back : https://travis-ci.org/python/cpython/jobs/491701299 . I first noticed it on an IDLE lib PR where this failure was unrelated but I can remember occurring at least few weeks back. I will try get the older one. Sorry I forgot to file an issue during that time. |
Ah! I found a way to reproduce the crash: diff --git a/Lib/test/test_slice.py b/Lib/test/test_slice.py
index 4ae4142c60..b18f8f1c8c 100644
--- a/Lib/test/test_slice.py
+++ b/Lib/test/test_slice.py
@@ -1,4 +1,8 @@
# tests for slice objects; in particular the indices method.
+import gc, random
+n = random.randint(1, 700)
+print("set_threshold(%s)" % n)
+gc.set_threshold(n)
import itertools
import operator $ ./python -m test -F -j0 test_slice test_slice test_slice test_slice test_slice test_slice test_slice test_slice I went up to January 1st, 2018 (commit e8ed965): the test already crashed with my patch, so it's not a regression. I reset the priority. |
See also this idea of changing gc.set_threshold() when using -X dev: |
It has been crashing for almost all the last 5 PRs merged to master : https://travis-ci.org/python/cpython/jobs/492241988 - Feb 12, 21:51 It's easy to overlook this since PR passes on verbose mode . There is another test these days that fails very frequently but passes on verbose mode : bpo-35809 . It's little hard to track their first failure since Travis provides no search capability for logs. |
I removed Lukasz from the nosy list since it's not a regression. No need to spam our 3.8 release manager ;-) |
Extract of slice_richcompare(): t1 = PyTuple_New(3);
if (t1 == NULL)
return NULL;
t2 = PyTuple_New(3);
if (t2 == NULL) {
Py_DECREF(t1);
return NULL;
}
PyTuple_SET_ITEM(t1, 0, ((PySliceObject *)v)->start);
PyTuple_SET_ITEM(t1, 1, ((PySliceObject *)v)->stop);
PyTuple_SET_ITEM(t1, 2, ((PySliceObject *)v)->step);
PyTuple_SET_ITEM(t2, 0, ((PySliceObject *)w)->start);
PyTuple_SET_ITEM(t2, 1, ((PySliceObject *)w)->stop);
PyTuple_SET_ITEM(t2, 2, ((PySliceObject *)w)->step); res = PyObject_RichCompare(t1, t2, op); PyTuple_SET_ITEM(t1, 0, NULL);
PyTuple_SET_ITEM(t1, 1, NULL);
PyTuple_SET_ITEM(t1, 2, NULL);
PyTuple_SET_ITEM(t2, 0, NULL);
PyTuple_SET_ITEM(t2, 1, NULL);
PyTuple_SET_ITEM(t2, 2, NULL);
Py_DECREF(t1);
Py_DECREF(t2);
t1 and t2 tuples are tracked by the GC, but t1 and t2 items are *borrowed* references. gc.collect() calls subtract_refs() which decreases the reference counter of all objects tracked by the GC... but v, w, t1 and t2 contain the same references which causes the assertion error. The code is quite old: commit 47b9ff6
+static PyObject * I see two options:
I wrote PR 11828 which implements the first solution. The second solution has a minor impact on performance. |
It's impressive *and* scary that such 13 years old bug only show up today... I modified the GC threshold in the site module with this patch: diff --git a/Lib/site.py b/Lib/site.py
index ad1146332b..68cdf105b1 100644
--- a/Lib/site.py
+++ b/Lib/site.py
@@ -557,6 +557,10 @@ def main():
This function is called automatically when this module is imported,
unless the python interpreter was started with the -S flag.
"""
+ import gc, random
+ n = random.randint(5, 100)
+ gc.set_threshold(n)
+
global ENABLE_USER_SITE
orig_path = sys.path[:] Running the test suite with this patch doesn't catch the bug :-( $ ./python -m test -j0 -r -u all,-gui |
Victor found the same bug I found while I was composing this, posting only to incorporate proposed solution: I *think* I have a cause for this, but someone else with greater understanding of the cycle collector should check me if the suggested fix has non-trivial performance implications (I suspect the answer is no, performance is unaffected). slice_richcompare borrows its behavior from tuple by creating a temporary tuple for each slice, the delegating to the tuple comparison ( https://github.com/python/cpython/blob/master/Objects/sliceobject.c#L591 ). Problem is, it uses borrowed references when creating said tuples, not owned references. Because test_slice's BadCmp.__eq__ is implemented in Python, the comparison can be interrupted by cycle collection during the __eq__ call. When then happens, there are precisely two references to the BadCmp object:
When a cycle collection occurs during the comparison, and subtract_refs ( https://github.com/python/cpython/blob/master/Modules/gcmodule.c#L398 ) is called, the BadCmp object in question is visited via both the slice and the tuple, and since it has no non-container objects referencing it, it ends up with the initial reference count of 1 attempting to drop to -1, and the assertion is violated. While the code of gcmodule.c appears to have been refactored since 3.7 so the assert occurs in a different function, with a slightly different message, it would break the same way in both 3.7 and master, and whether or not it triggers the bug, the broken behavior of slice_richcompare hasn't changed for a *long* time. Underlying problem would seem to be slice's richcompare believing it's okay to make a tuple from borrowed references, then make a call on it that can trigger calls into Python level code (and therefore into the cycle collector); everything else is behaving correctly here. I'm guessing the only reason it's not seen in the wild is that slices based on Python defined types are almost never compared at all, let alone compared on debug builds that would be checking the assert and with an accelerated cycle collection cycle that would make a hit likely. Solution would be to stop trying to microoptimize slice_richcompare to avoid reference count manipulation and just build a proper tuple. It would even simplify the code since we could just use PyTuple_Pack, reducing custom code by replacing: t1 = PyTuple_New(3);
if (t1 == NULL)
return NULL;
t2 = PyTuple_New(3);
if (t2 == NULL) {
Py_DECREF(t1);
return NULL;
}
PyTuple_SET_ITEM(t1, 0, ((PySliceObject *)v)->start);
PyTuple_SET_ITEM(t1, 1, ((PySliceObject *)v)->stop);
PyTuple_SET_ITEM(t1, 2, ((PySliceObject *)v)->step);
PyTuple_SET_ITEM(t2, 0, ((PySliceObject *)w)->start);
PyTuple_SET_ITEM(t2, 1, ((PySliceObject *)w)->stop);
PyTuple_SET_ITEM(t2, 2, ((PySliceObject *)w)->step); with: t1 = PyTuple_Pack(3, ((PySliceObject *)v)->start, ((PySliceObject *)v)->stop, ((PySliceObject *)v)->step);
if (t1 == NULL)
return NULL;
t2 = PyTuple_Pack(3, ((PySliceObject *)w)->start, ((PySliceObject *)w)->stop, ((PySliceObject *)w)->step);
if (t2 == NULL) {
Py_DECREF(t1);
return NULL;
} and makes cleanup simpler, since you can just delete: PyTuple_SET_ITEM(t1, 0, NULL);
PyTuple_SET_ITEM(t1, 1, NULL);
PyTuple_SET_ITEM(t1, 2, NULL);
PyTuple_SET_ITEM(t2, 0, NULL);
PyTuple_SET_ITEM(t2, 1, NULL);
PyTuple_SET_ITEM(t2, 2, NULL); and let the DECREFs for t1/t2 do their work normally. If for some reason the reference count manipulation is unacceptable, this *could* switch between two behaviors depending on whether or not start/stop/step are of known types (e.g. if all are NoneType/int, this could use the borrowed refs code path safely) where a call back into Python level code is impossible; given that slices are usually made of None and/or ints, this would remove most of the cost for the common case, at the expense of more complicated code. Wouldn't help numpy types though, and I suspect the cost of pre-checking the types for all six values involved would eliminate most of the savings. Sorry for not submitting a proper PR; the work machine I use during the day is not suitable for development (doesn't even have Python installed). |
Then again, there's probably no other code in the world that compares slice objects ;-) |
Ah, I see Victor posted an alternative PR that avoids the reference counting overhead by explicitly removing the temporary tuples from GC tracking. I'm mildly worried by that approach, only because the only documented use case for PyObject_GC_UnTrack is in tp_dealloc (that said, the code explicitly allows it to be called twice due to the Py_TRASHCAN mechanism, so it's probably safe so long as the GC design never changes dramatically). If slice comparison really is performance sensitive enough to justify this, so be it, but I'd personally prefer to reduce the custom code involved in a rarely used code path (we're not even caching constant slices yet, so no comparisons are likely to occur for 99.99% of slices, right?). |
I wrote PR 11830: the other solution that I propose, use strong references rather than borrowed references. |
Josh, I'd say the speed of this code doesn't matter one whit to anything. Like you, I'd _prefer_ that the issue be fixed by building "real tuples" that own their own references, which would also (as you showed) allow for briefer, simpler, clearer code. But I'm OK with playing obscure GC tricks too. |
Oh, Tim Peters succeded somehow to remove Josh Rosenberg from the nosy list: I add him again ;-) Thanks Josh for your analysis, it seems like we agree :-) I wrote PR 11830 to use strong references, as you proposed. |
Certainly wasn't my intent! That happens too often on the tracker. Thanks for noticing! :-( |
Python 2.7 is not affected: it doesn't have the slice_richcompare() function, but the old slice_compare() implementation. Python 3.6 is affected but doesn't accept bugfixes anymore: https://devguide.python.org/#status-of-python-branches This bug is 13 years old, I don't think that it's really useful to fix it. It mostly impact test_slice and only test_slice. |
To be clear, I compiled Python 3.6 in release mode and applied test_slice patch of msg335333: no crash after 1500 runs. The bug is an assertion error, but assertions are only compiled in debug mode. |
Thanks everyone for helping on the analysis and for reviews obviously. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: