Skip to content

[BUG] test_methods_and_attributes.py crash in cpython 3.9 debug build #2682

@albanD

Description

@albanD

Issue description

The test test_methods_and_attributes crashes with the following stack trace:

../../tests/test_methods_and_attributes.py ..............python: Objects/typeobject.c:3228: _PyType_Lookup: Assertion `!PyErr_Occurred()' failed.
--Type <RET> for more, q to quit, c to continue without paging--

Thread 1 "python" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff7c6b859 in __GI_abort () at abort.c:79
#2  0x00007ffff7c6b729 in __assert_fail_base (fmt=0x7ffff7e01588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=0x55555582e2a3 "!PyErr_Occurred()", file=0x55555583c5c3 "Objects/typeobject.c", line=3228, function=<optimized out>)
    at assert.c:92
#3  0x00007ffff7c7cf36 in __GI___assert_fail (assertion=assertion@entry=0x55555582e2a3 "!PyErr_Occurred()", 
    file=file@entry=0x55555583c5c3 "Objects/typeobject.c", line=line@entry=3228, 
    function=function@entry=0x55555583fd58 <__PRETTY_FUNCTION__.16944> "_PyType_Lookup") at assert.c:101
#4  0x000055555561e30a in _PyType_Lookup (type=type@entry=0x5555559565c0 <PyModule_Type>, name=name@entry='should_fail')
    at Objects/typeobject.c:3222
#5  0x00005555556005ea in _PyObject_GenericGetAttrWithDict (obj=obj@entry=<module at remote 0x7ffff5967dd0>, 
    name=name@entry='should_fail', dict=dict@entry=0x0, suppress=suppress@entry=0) at Objects/object.c:1194
#6  0x0000555555600e2d in PyObject_GenericGetAttr (obj=obj@entry=<module at remote 0x7ffff5967dd0>, name=name@entry='should_fail')
    at Objects/object.c:1280
#7  0x00005555555fc595 in module_getattro (m=0x7ffff5967dd0, name='should_fail') at Objects/moduleobject.c:717
#8  0x00005555555fccc5 in PyObject_GetAttr (v=v@entry=<module at remote 0x7ffff5967dd0>, name=name@entry='should_fail')
    at Objects/object.c:890
#9  0x00005555555fed76 in PyObject_GetAttrString (v=<module at remote 0x7ffff5967dd0>, name=<optimized out>) at Objects/object.c:795
#10 0x00007ffff4e9e3ec in pybind11::getattr (obj=..., name=0x7ffff5567984 "should_fail", default_=...)
    at /home/alban/local/pytorch/3.9_debug_source/test/pybind11/include/pybind11/pytypes.h:441
#11 0x00007ffff519bd51 in pybind11::module_::def<test_submodule_methods_and_attributes(pybind11::module_&)::<lambda()>::<lambda(int, UnregisteredType)>, pybind11::arg, pybind11::arg_v>(const char *, <lambda()>::<lambda(int, UnregisteredType)> &&) (this=0x7fffffff6318, 
    name_=0x7ffff5567984 "should_fail", f=...)
    at /home/alban/local/pytorch/3.9_debug_source/test/pybind11/include/pybind11/pybind11.h:913
#12 0x00007ffff519809e in <lambda()>::operator()(void) const (__closure=0x555555de5f78)
    at /home/alban/local/pytorch/3.9_debug_source/test/pybind11/tests/test_methods_and_attributes.cpp:318
#13 0x00007ffff51a49a0 in pybind11::detail::argument_loader<>::call_impl<void, test_submodule_methods_and_attributes(pybind11::module_&)::<lambda()>&, pybind11::detail::void_type>(<lambda()> &, std::index_sequence, pybind11::detail::void_type &&) (this=0x7fffffff643e, 
    f=...) at /home/alban/local/pytorch/3.9_debug_source/test/pybind11/include/pybind11/cast.h:2010
#14 0x00007ffff51a32d4 in pybind11::detail::argument_loader<>::call<void, pybind11::detail::void_type, test_submodule_methods_and_attributes(pybind11::module_&)::<lambda()>&>(<lambda()> &) (this=0x7fffffff643e, f=...)
    at /home/alban/local/pytorch/3.9_debug_source/test/pybind11/include/pybind11/cast.h:1987
#15 0x00007ffff51a16ed in pybind11::cpp_function::<lambda(pybind11::detail::function_call&)>::operator()(pybind11::detail::function_call &) const (this=0x0, call=...) at /home/alban/local/pytorch/3.9_debug_source/test/pybind11/include/pybind11/pybind11.h:184
#16 0x00007ffff51a1755 in pybind11::cpp_function::<lambda(pybind11::detail::function_call&)>::_FUN(pybind11::detail::function_call &)
    () at /home/alban/local/pytorch/3.9_debug_source/test/pybind11/include/pybind11/pybind11.h:162
#17 0x00007ffff4eab2f1 in pybind11::cpp_function::dispatcher (self=<PyCapsule at remote 0x7ffff4b0fbd0>, args_in=(), kwargs_in=0x0)
    at /home/alban/local/pytorch/3.9_debug_source/test/pybind11/include/pybind11/pybind11.h:718
#18 0x00005555557963a0 in cfunction_call (func=<built-in method bad_arg_def_named of PyCapsule object at remote 0x7ffff4b0fbd0>, 
    args=<optimized out>, kwargs=<optimized out>) at Objects/methodobject.c:539
#19 0x00005555555c116e in _PyObject_MakeTpCall (tstate=tstate@entry=0x5555559aa450, 
    callable=callable@entry=<built-in method bad_arg_def_named of PyCapsule object at remote 0x7ffff4b0fbd0>, 
    args=args@entry=0x55555629fb18, nargs=<optimized out>, keywords=keywords@entry=0x0) at Objects/call.c:191

Reproducible example code

Run the test suite with cpython 3.9 debug build (with a temporary fix for #2681 otherwise that other issue make it fail before).

Potential root cause

I think the issue is as follows:

  • In the = step of py::arg("a") = UnregisteredType();
  • We end up here:
    PyErr_SetString(PyExc_TypeError, msg.c_str());
    where the type is not registered and so we set the python error flag
  • But the bad return state (this is a bad return state according to the functions comment) gets eaten up here:
    if (src == nullptr)
    return none().release();
    and we just set the value of the arg to none and nicely return from the = call. (But with the python error set)
  • We keep calling into python APIs in the def call that follows leading to this error

Note that the empty value field is already properly handled by the def code and ends up raising the proper error here (and erase the previous one) when you use a non-debug build:

if (!a.value) {

Proposed fix

Even though the comment for "src_and_type" explicitly states: "If the type is unknown, sets the error and returns a pair with .second = nullptr. (p.first = nullptr is not an error: it becomes None)." I am not sure how true this is. In particular, from what I can see, this result is always passed to the "type_caster_generic::cast" function which will return None when the first is nullptr without ever checking the second.

From what I have seen in the codebase, I would suggest removing the code that sets the error, just return a pair of nullptr and update the comment to mention that they are treated in a special way later.

Let me know if there are other part of the code I should look into that access these functions as well!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions