New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Some repr implementations don't check for self-referential structures #69641
Comments
Implementations of repr for some of the types in the standard library doesn't check for self-referential structures. As a result, when calling repr() on such objects, Python crashes due to infinite recursion. Example:
>>> import functools
>>> x = functools.partial(min)
>>> x.__setstate__((x, (), {}, {}))
>>> repr(x)
Segmentation fault
Another example:
>>> import xml.etree.ElementTree
>>> x = xml.etree.ElementTree.Element('')
>>> x.tag = x
>>> repr(x)
Segmentation fault
One more example:
>>> import io
>>> class X(io.TextIOWrapper): __slots__ = 'name'
>>> x = X(open('/dev/null'))
>>> x.name = x
>>> repr(x)
Segmentation fault |
The general solution is to make PyObject_Repr to detect recursive calls (as reprlib.recursive_repr does). The straightforward way is to use thread local identity set. It can be implemented as a dict that maps id(obj) -> obj (creates an int object for key for every call, requires about 40-80 bytes for recurse level), or as specialized hash table (see Modules/hashtable.c) (faster, requires about 12-24 bytes for recurse level). The fastest solution would be to set special flag inside proceeded object. For now general object has no place for such flag, but we can add it to GC head. On 64-bit this would increase the size of GC head from 24 to 32 bytes, on 32-bit there is a free place in 16-bytes GC head. However the performance can be not critical here, because in any case repr() creates new object (resulting string). Using thread local hash table can be enough. In any case the patch will be enough complex to target it 3.6 only. |
Changing PyObject_Repr is too course; it affects a broad class of objects other than containers, and it risks unknown impacts to larges swaths of third-party code use this venerable API. It is also a break with the long established history of recursion detection being a responsibility of the individual types (i.e. the code in sets, lists, dicts, etc.) The three cases listed here should be fixed individually. |
Yet one example: >>> import io
>>> class BR(io.BufferedReader):
... @property
... def name(self):
... return self
...
>>> repr(BR(io.BytesIO()))
Segmentation fault The same is for other file-like objects. |
Recursive partial objects are legitimate. Here is a patch that makes partial's repr to support recursive partial objects. Also added a test for pickling. Cases for Element and file-like objects are questionable. Recursive Element.tag and TextIOWrapper.name don't make a sense, and I don't think we should special support (and encourage) these cases. To avoid stack overflow we can add a restriction for tag to be str or None, but file's name attribute can be dynamic. We can omit name from repr if it is not None, str, bytes or int. |
Added ElementTree and io modules experts to the nosy list. |
Here are patches for io classes and for ElementTree. |
There is also a crash with Python implementation of TextIOWrapper. >>> import _pyio as io
>>> t = io.TextIOWrapper(io.BytesIO())
>>> t.mode = t
>>> t
Fatal Python error: Cannot recover from stack overflow. Current thread 0xb74a9700 (most recent call first): |
Raymond, could you please make a review of the first patch? |
Also, we should ask Antoine Pitrou to look at the TextIO patch and ask Stephan Krah to look at the ElementTree patch. |
I think you may have meant Eli Bendersky -- I'm not an elementree |
As I've mentioned elsewhere, I'll have to temporarily take myself off these issues as I don't have the time to work on them (even review patches). I think Raymond may have gotten his Stefans mixed up and meant Stefan Behnel, who's also been looking at etree patches. |
Ping. |
Etree patch looks straight forward to me, feel free to apply it. |
New changeset e44bd1259bda by Serhiy Storchaka in branch '3.5': New changeset e3671a684ea0 by Serhiy Storchaka in branch 'default': |
New changeset c071da010053 by Serhiy Storchaka in branch '2.7': New changeset 17e78918f608 by Serhiy Storchaka in branch '3.5': New changeset 86959c696ab7 by Serhiy Storchaka in branch 'default': |
I'm not sure if 17e78918f608 is relevant but test_gc is failing on several buildbots:
|
The patch for io classes needed an update. |
New changeset 7859742826b2 by Serhiy Storchaka in branch '2.7': |
Indeed, tests for recursive partial objects create reference loops and don't clean them. Thank you Berker. I'll fix this. |
New changeset 0323b33894f2 by Serhiy Storchaka in branch '3.5': New changeset 688edc946ab9 by Serhiy Storchaka in branch '2.7': New changeset 818a10534e44 by Serhiy Storchaka in branch 'default': |
Antoine, are you fine with io_recursive_repr2.patch? |
New changeset ea1edf1bf362 by Benjamin Peterson in branch '2.7': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: