Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segfault on __getattr__ infinite recursion on certain attribute accesses #86075

Closed
bluetech mannequin opened this issue Oct 2, 2020 · 8 comments
Closed

Segfault on __getattr__ infinite recursion on certain attribute accesses #86075

bluetech mannequin opened this issue Oct 2, 2020 · 8 comments
Labels
3.8 (EOL) end of life 3.9 only security fixes 3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@bluetech
Copy link
Mannequin

bluetech mannequin commented Oct 2, 2020

BPO 41909
Nosy @gvanrossum, @loewis, @nascheme, @pitrou, @serhiy-storchaka, @The-Compiler, @bluetech, @sparkingdark
PRs
  • bpo-41909: Enable previously disabled recursion checks. #22536
  • [3.9] bpo-41909: Enable previously disabled recursion checks. (GH-22536) #22550
  • [3.8] bpo-41909: Enable previously disabled recursion checks. (GH-22536) #22551
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-10-04.22:44:32.041>
    created_at = <Date 2020-10-02.10:49:51.390>
    labels = ['interpreter-core', '3.10', '3.8', '3.9', 'type-crash']
    title = 'Segfault on __getattr__ infinite recursion on certain attribute accesses'
    updated_at = <Date 2020-10-04.22:44:32.041>
    user = 'https://github.com/bluetech'

    bugs.python.org fields:

    activity = <Date 2020-10-04.22:44:32.041>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-10-04.22:44:32.041>
    closer = 'serhiy.storchaka'
    components = ['Interpreter Core']
    creation = <Date 2020-10-02.10:49:51.390>
    creator = 'bluetech'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 41909
    keywords = ['patch']
    message_count = 8.0
    messages = ['377807', '377860', '377905', '377920', '377922', '377977', '377979', '377980']
    nosy_count = 8.0
    nosy_names = ['gvanrossum', 'loewis', 'nascheme', 'pitrou', 'serhiy.storchaka', 'The Compiler', 'bluetech', 'sparkingdark']
    pr_nums = ['22536', '22550', '22551']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue41909'
    versions = ['Python 3.8', 'Python 3.9', 'Python 3.10']

    @bluetech
    Copy link
    Mannequin Author

    bluetech mannequin commented Oct 2, 2020

    The following program crashes with a segfault:

    class Segfault:
        def __getattr__(self, name):
            self.unknown_attribute
    
    instance = Segfault()
    issubclass(instance, int)  # int doesn't matter

    Tested with Python 3.7, 3.8, 3.9rc2, and master in debug mode (commit 497126f).

    The program is odd, but this affects pytest (see pytest-dev/pytest#2330).
    The class is buggy user code, pytest does the issubclass check.
    In pytest there are other code paths which trigger it besides issubclass, but that's the one I am able to quickly reproduce standalone.

    Here is the backtrace with python master:

    ran@ran:~/src/cpython$ gdb --args ./python segfault.py
    Reading symbols from ./python...
    (gdb) r
    Starting program: /home/ran/src/cpython/python segfault.py

    Program received signal SIGSEGV, Segmentation fault.
    0x00005555556bc55e in PyGILState_Check () at Python/pystate.c:1353
    1353 if (!PyThread_tss_is_created(&gilstate->autoTSSkey)) {

    (gdb) bt 29
    #0 0x00005555556bc55e in PyGILState_Check () at Python/pystate.c:1353
    #1 0x00005555555fc117 in _PyMem_DebugCheckGIL (func=0x55555581b330 <func.10> "_PyMem_DebugMalloc") at Objects/obmalloc.c:2329
    #2 _PyMem_DebugMalloc (ctx=0x55555593b140 <PyMem_Debug+96>, nbytes=185) at Objects/obmalloc.c:2329
    #3 0x00005555555fcee8 in PyObject_Malloc (size=<optimized out>) at Objects/obmalloc.c:685
    #4 0x000055555562d6f3 in PyUnicode_New (size=136, maxchar=<optimized out>) at Objects/unicodeobject.c:1455
    #5 0x00005555556556e7 in _PyUnicodeWriter_PrepareInternal (writer=writer@entry=0x7fffff7ff0c0, length=length@entry=1, maxchar=<optimized out>, maxchar@entry=127)
    at Objects/unicodeobject.c:14004
    #6 0x0000555555658439 in _PyUnicodeWriter_WriteASCIIString (writer=writer@entry=0x7fffff7ff0c0, ascii=ascii@entry=0x5555558196c8 "'%.50s' object has no attribute '%U'",
    len=1) at Objects/unicodeobject.c:14174
    #7 0x000055555565a18d in PyUnicode_FromFormatV (format=format@entry=0x5555558196c8 "'%.50s' object has no attribute '%U'", vargs=vargs@entry=0x7fffff7ff170)
    at Objects/unicodeobject.c:3114
    #8 0x000055555569646b in _PyErr_FormatV (tstate=0x5555559aa300, exception=<type at remote 0x555555932040>,
    format=format@entry=0x5555558196c8 "'%.50s' object has no attribute '%U'", vargs=vargs@entry=0x7fffff7ff170) at Python/errors.c:1029
    #9 0x0000555555696ff3 in PyErr_Format (exception=<optimized out>, format=format@entry=0x5555558196c8 "'%.50s' object has no attribute '%U'") at Python/errors.c:1071
    #10 0x00005555555fa4a5 in _PyObject_GenericGetAttrWithDict (obj=obj@entry=<Segfault at remote 0x7ffff772dd70>, name=name@entry='unknown_attribute', dict=<optimized out>,
    dict@entry=0x0, suppress=suppress@entry=0) at Objects/object.c:1268
    #11 0x00005555555fa96f in PyObject_GenericGetAttr (obj=obj@entry=<Segfault at remote 0x7ffff772dd70>, name=name@entry='unknown_attribute') at Objects/object.c:1281
    #12 0x000055555561c3f8 in slot_tp_getattr_hook (self=<Segfault at remote 0x7ffff772dd70>, name='unknown_attribute') at Objects/typeobject.c:6805
    #13 0x00005555555f68c2 in PyObject_GetAttr (v=v@entry=<Segfault at remote 0x7ffff772dd70>, name=<optimized out>) at Objects/object.c:891
    #14 0x000055555567c254 in _PyEval_EvalFrameDefault (tstate=0x5555559aa300,
    f=Frame 0x7ffff71cc3b0, for file /home/ran/src/cpython/segfault.py, line 3, in __getattr
    _ (self=<Segfault at remote 0x7ffff772dd70>, name='unknown_attribute'),
    throwflag=<optimized out>) at Python/ceval.c:3036
    #15 0x00005555555b9d1d in _PyEval_EvalFrame (throwflag=0,
    f=Frame 0x7ffff71cc3b0, for file /home/ran/src/cpython/segfault.py, line 3, in __getattr__ (self=<Segfault at remote 0x7ffff772dd70>, name='unknown_attribute'),
    tstate=0x5555559aa300) at ./Include/internal/pycore_ceval.h:40
    #16 function_code_fastcall (tstate=0x5555559aa300, co=<optimized out>, args=0x7fffff7ff590, nargs=2, globals=<optimized out>) at Objects/call.c:329
    #17 0x00005555555ba67b in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:366
    #18 0x000055555576c2d4 in _PyObject_VectorcallTstate (tstate=0x5555559aa300, callable=<function at remote 0x7ffff773c4b0>, args=0x7fffff7ff580, nargsf=2, kwnames=0x0)
    at ./Include/cpython/abstract.h:114
    #19 0x000055555576cb87 in method_vectorcall (method=<optimized out>, args=0x7fffff7ff588, nargsf=<optimized out>, kwnames=0x0) at Objects/classobject.c:53
    #20 0x00005555556109ac in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=9223372036854775809, args=0x7fffff7ff588, callable=<method at remote 0x7ffff71c8c50>,
    tstate=0x5555559aa300) at ./Include/cpython/abstract.h:114
    #21 PyObject_CallOneArg (arg='unknown_attribute', func=<method at remote 0x7ffff71c8c50>) at ./Include/cpython/abstract.h:184
    #22 call_attribute (self=self@entry=<Segfault at remote 0x7ffff772dd70>, attr=<method at remote 0x7ffff71c8c50>, attr@entry=<function at remote 0x7ffff773c4b0>,
    name=name@entry='unknown_attribute') at Objects/typeobject.c:6771
    #23 0x000055555561c44a in slot_tp_getattr_hook (self=<Segfault at remote 0x7ffff772dd70>, name='unknown_attribute') at Objects/typeobject.c:6813
    #24 0x00005555555f68c2 in PyObject_GetAttr (v=v@entry=<Segfault at remote 0x7ffff772dd70>, name=<optimized out>) at Objects/object.c:891
    #25 0x000055555567c254 in _PyEval_EvalFrameDefault (tstate=0x5555559aa300,
    f=Frame 0x7ffff71cc200, for file /home/ran/src/cpython/segfault.py, line 3, in __getattr__ (self=<Segfault at remote 0x7ffff772dd70>, name='unknown_attribute'),
    throwflag=<optimized out>) at Python/ceval.c:3036
    #26 0x00005555555b9d1d in _PyEval_EvalFrame (throwflag=0,
    f=Frame 0x7ffff71cc200, for file /home/ran/src/cpython/segfault.py, line 3, in __getattr__ (self=<Segfault at remote 0x7ffff772dd70>, name='unknown_attribute'),
    tstate=0x5555559aa300) at ./Include/internal/pycore_ceval.h:40
    #27 function_code_fastcall (tstate=0x5555559aa300, co=<optimized out>, args=0x7fffff7ff8b0, nargs=2, globals=<optimized out>) at Objects/call.c:329
    #28 0x00005555555ba67b in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:366
    (More stack frames follow...)

    (gdb) py-bt
    Traceback (most recent call first):
      File "/home/ran/src/cpython/segfault.py", line 3, in __getattr__
        self.unknown_attribute
      File "/home/ran/src/cpython/segfault.py", line 3, in __getattr__
        self.unknown_attribute
      File "/home/ran/src/cpython/segfault.py", line 3, in __getattr__
        self.unknown_attribute
      File "/home/ran/src/cpython/segfault.py", line 3, in __getattr__
        self.unknown_attribute
      File "/home/ran/src/cpython/segfault.py", line 3, in __getattr__
        self.unknown_attribute
      File "/home/ran/src/cpython/segfault.py", line 3, in __getattr__
        self.unknown_attribute
      File "/home/ran/src/cpython/segfault.py", line 3, in __getattr__
        self.unknown_attribute
      File "/home/ran/src/cpython/segfault.py", line 3, in __getattr__
        self.unknown_attribute
    --Type <RET> for more, q to quit, c to continue without paging--q

    @bluetech bluetech mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump labels Oct 2, 2020
    @serhiy-storchaka
    Copy link
    Member

    Interesting, when lookup any attribute you will get a RecursionError, but this is one of only two places where the recursion check is disabled (the other one is in interning strings).

    You get a crash also when call isinstance(1, instance) or issubclass(int, instance). This is the same bug.

    The implementation of issubclass() calls PyObject_IsSubclass() which calls object_issubclass() which calls recursive_issubclass() which calls check_class() which calls abstract_get_bases() which looks up attribute "__bases__" which leads to infinite recursion in __getattr__().

    bpo-125278 0x000055555576116d in abstract_get_bases (cls=cls@entry=0x7fffeabcee10) at Objects/abstract.c:2340
    bpo-125279 0x00005555557611f1 in check_class (cls=cls@entry=0x7fffeabcee10, error=error@entry=0x55555585c828 "issubclass() arg 1 must be a class") at Objects/abstract.c:2396
    bpo-125280 0x00005555557619bf in recursive_issubclass (derived=derived@entry=0x7fffeabcee10, cls=cls@entry=0x555555946220 <PyLong_Type>) at Objects/abstract.c:2524
    bpo-125281 0x0000555555761daf in object_issubclass (tstate=<optimized out>, derived=0x7fffeabcee10, cls=0x555555946220 <PyLong_Type>) at Objects/abstract.c:2550
    bpo-125282 0x0000555555765ea1 in PyObject_IsSubclass (derived=<optimized out>, cls=<optimized out>) at Objects/abstract.c:2600
    bpo-125283 0x00005555557b78f3 in builtin_issubclass_impl (module=module@entry=0x7fffeae21d70, cls=<optimized out>, class_or_tuple=<optimized out>) at Python/bltinmodule.c:2511
    bpo-125284 0x00005555557b794d in builtin_issubclass (module=0x7fffeae21d70, args=0x7fffead0e388, nargs=<optimized out>) at Python/clinic/bltinmodule.c.h:828

    The problem is that in abstract_get_bases() the recursion check is disabled by using macros Py_ALLOW_RECURSION/Py_END_ALLOW_RECURSION (added in 5b22213). I do not know why it was necessary. Currently tests are passed if enable recursion check, and this fixes this issue.

    It is worth to mention that attribute __bases__ is looked up to support non-types in issubclass() and isinstance(). Originally it was added in bpo-464992 to support Zope extension ExtensionClass. I tested that the current code of ExtensionClass does not need it. So we could simplify the code and avoid recursion by just using tp_bases. But this needs wider discussion.

    @serhiy-storchaka serhiy-storchaka added 3.8 (EOL) end of life 3.9 only security fixes 3.10 only security fixes labels Oct 3, 2020
    @gvanrossum
    Copy link
    Member

    I don't think I remember what's so special about abstract_get_bases() to disable the recursion check. Maybe this macro is always just a micro-optimization? Then again it's hard to be sure, the behavior may depend on the platform.

    Pulling in Antoine, who showed up in the email https://mail.python.org/pipermail/python-dev/2008-August/082106.html is mentioned in a long comment in ceval.h about the recursion protection.

    Possibly the Py_ALLOW_RECURSION macro is intended to ensure that some handler code for first-order recursion errors doesn't get interrupted by second-order recursion errors? There's a limit of 50 stack frames which feels a bit dodgy.

    @sparkingdark
    Copy link
    Mannequin

    sparkingdark mannequin commented Oct 4, 2020

    Is it open Then give the process to reproduce the issue.

    @serhiy-storchaka
    Copy link
    Member

    Antoine give me a hint. The problem seems related to using PyDict_GetItem() when look up an attribute in the instance dict. It silences all exceptions (including RecursionError) and this results in silent "no such key in a dict".

    But since 3.8 it was replaced with PyDict_GetItemWithError() which preserves RecursionError (see bpo-35459). I think we can remove Py_ALLOW_RECURSION/Py_END_ALLOW_RECURSION now.

    As for other case in interning string, Py_ALLOW_RECURSION/Py_END_ALLOW_RECURSION was used to guard PyDict_GetItem(). But the current code uses PyDict_SetDefault() (added in 3.4) which does not silence errors. So Py_ALLOW_RECURSION/Py_END_ALLOW_RECURSION can be removed in that place too.

    @serhiy-storchaka
    Copy link
    Member

    New changeset 9ece9cd by Serhiy Storchaka in branch 'master':
    bpo-41909: Enable previously disabled recursion checks. (GH-22536)
    9ece9cd

    @serhiy-storchaka
    Copy link
    Member

    New changeset 7aa22ba by Serhiy Storchaka in branch '3.9':
    [3.9] bpo-41909: Enable previously disabled recursion checks. (GH-22536) (GH-22550)
    7aa22ba

    @serhiy-storchaka
    Copy link
    Member

    New changeset 09a7b3b by Serhiy Storchaka in branch '3.8':
    [3.8] bpo-41909: Enable previously disabled recursion checks. (GH-22536) (GH-22551)
    09a7b3b

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 (EOL) end of life 3.9 only security fixes 3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants