Skip to content

fix: detect virtual inheritance in add_base to prevent pointer offset crash#6017

Merged
rwgk merged 1 commit intopybind:masterfrom
rwgk:fix_virtual_inheritance_test_smart_ptr_todo
Mar 30, 2026
Merged

fix: detect virtual inheritance in add_base to prevent pointer offset crash#6017
rwgk merged 1 commit intopybind:masterfrom
rwgk:fix_virtual_inheritance_test_smart_ptr_todo

Conversation

@rwgk
Copy link
Copy Markdown
Collaborator

@rwgk rwgk commented Mar 28, 2026

Description

The bug fixed in this PR was discovered by chance while working on tests for PR #6014.

Note that this bug existed "forever".

Summary

  • Fix segfault when dispatching inherited methods through virtual bases (e.g. SftVirtDerived2::name() via SftVirtBase::name())
  • Add is_static_downcastable SFINAE trait to detect virtual inheritance at compile time
  • Automatically set multiple_inheritance = true for virtual bases, forcing load_impl to use implicit_casts instead of reinterpret_cast
  • Remove the .def("name", &SftVirtDerived2::name) workaround from test_smart_ptr.cpp

Background

When a class uses virtual inheritance (e.g. struct D : virtual Base), the base subobject is at a dynamic offset determined at runtime via the vtable. However, load_impl Case 2a uses reinterpret_cast to convert between base and derived pointers, which assumes a fixed offset. This produces a corrupted pointer, leading to segfaults in shared_ptr control block operations or method dispatch.

The existing test (test_shared_from_this_virt_shared_ptr_arg) had a TODO workaround: an explicit .def("name", ...) re-binding on SftVirtDerived2 to avoid the inherited method dispatch path. This PR fixes the root cause so the workaround is no longer needed.

Approach

static_cast<Derived*>(Base*) is ill-formed when Base is a virtual base of Derived. We use this as a SFINAE probe:

template <typename Base, typename Derived, typename = void>
struct is_static_downcastable : std::false_type {};
template <typename Base, typename Derived>
struct is_static_downcastable<Base, Derived,
    void_t<decltype(static_cast<Derived *>(std::declval<Base *>()))>>
    : std::true_type {};

In class_::add_base, when is_static_downcastable<Base, type> is false, we set rec.multiple_inheritance = true. This forces load_impl to use the implicit_casts path, which walks the registered base chain and applies static_cast in the upcast direction (derived-to-base), which is always valid even for virtual inheritance.

Suggested changelog entry:

  • Fixed segfault when dispatching inherited methods through virtual bases. load_impl Case 2a used reinterpret_cast to adjust pointers, which is invalid for virtual inheritance where the base subobject is at a dynamic offset. This bug existed since the initial commit but was never triggered until virtual inheritance tests exercised inherited method dispatch.

… crash

Virtual inheritance places the base subobject at a dynamic offset, but
load_impl Case 2a uses reinterpret_cast which assumes a fixed offset.
This caused segfaults when dispatching inherited methods through virtual
bases (e.g. SftVirtDerived2::name()).

Add an is_static_downcastable SFINAE trait that detects whether
static_cast<Derived*>(Base*) is valid. When it is not (virtual
inheritance), set multiple_inheritance = true in add_base to force the
implicit_casts path, which correctly adjusts pointers at runtime.

Remove the workaround .def("name", &SftVirtDerived2::name) from
test_smart_ptr.cpp that was papering over the issue.

Made-with: Cursor
@rwgk
Copy link
Copy Markdown
Collaborator Author

rwgk commented Mar 28, 2026

How long has the virtual inheritance pointer offset bug existed?

The bug has been latent in pybind11 since its very first commit 38bd711 (July 5, 2015).

Root cause

When pybind11 needs to convert a Python object to a C++ base type, the load
logic takes the stored void * pointer (which points to the most-derived C++
object) and reinterprets it directly as the target base type pointer.

Initial commit (38bd711, July 5, 2015):

if (PyType_IsSubtype(Py_TYPE(src), typeinfo->type)) {
    value = ((instance_type *) src)->value;  // stored pointer used as-is
    return true;
}

This was later refactored into load_impl Case 2a (e45c211, Feb 22, 2017,
"Support multiple inheritance from python"):

// Case 2a: ... we can use reinterpret_cast.
if (bases.size() == 1 && (no_cpp_mi || bases.front()->type == typeinfo->type)) {
    this_.load_value(reinterpret_cast<instance *>(src.ptr())->get_value_and_holder());
    return true;
}

For non-virtual inheritance, the base subobject is at a fixed compile-time
offset (often 0 for single inheritance), so reinterpret_cast works. For
virtual inheritance, the base subobject is at a dynamic offset stored in the
vtable. reinterpret_cast gives a corrupted pointer, leading to segfaults.

Why it was never triggered

  1. No virtual inheritance tests existed until August 2025 (da65334,
    "add virtual inheritance test"), focused on the smart_holder path.

  2. The existing virtual inheritance tests in test_multiple_inheritance.cpp
    (Propagate py::multiple_inheritance to all children #3650, ec81e8e, Jan 2022) only test direct methods on each class
    (e.g. MVC.get_c_c()). They never exercise inherited method dispatch
    through a virtual base, which is what triggers the incorrect pointer offset
    in load_impl.

  3. The SftVirtDerived2 test (added March 26, 2026, e446296) was the first
    to exercise inherited method dispatch through virtual bases. It immediately
    needed a .def("name", &SftVirtDerived2::name) workaround to avoid the
    segfault.

  4. The test_class_sh_mi_thunks.cpp tests (Left : virtual VBase,
    Cat : virtual public Animal, etc.) also use virtual inheritance, but
    MVE declares two bases to pybind11 (py::class_<MVE, MVD0, MVD1>),
    which triggers rec.bases.size() > 1, setting the MI flag and avoiding
    the Case 2a shortcut. The single-base virtual inheritance classes in that
    file (Left, Right, Cat, Tiger) define their methods directly,
    so they don't go through the problematic inherited-method-dispatch path.

Timeline

Date Commit Event
2015-07-05 38bd711 Initial commit — bug introduced (load uses raw pointer cast)
2016-09-11 8e5dceb MI support added — add_base with reinterpret_cast<type *>(src)
2017-02-22 e45c211 load_impl Case 2a optimization — same assumption formalized
2022-01-26 ec81e8e Virtual inheritance tests added (#3650) — but only direct methods
2025-08-15 da65334 First virtual inheritance test for smart_holder
2025-10-01 4dc33d6 Fix smart_holder virtual inheritance bugs (#5836)
2026-03-26 e446296 SftVirtDerived2 test added — workaround needed
2026-03-27 8885f76 Fix: is_static_downcastable SFINAE trait in add_base

The fix

Add a compile-time check in add_base: if static_cast<Derived*>(Base*) is
ill-formed (SFINAE), the base is virtual, and we set
rec.multiple_inheritance = true. This forces load_impl to skip Case 2a and
use Case 2c (try_implicit_casts), which walks the registered base chain using
static_cast in the upcast direction (derived → base), which is always valid
even for virtual inheritance.

Copy link
Copy Markdown
Contributor

@itamaro itamaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@rwgk
Copy link
Copy Markdown
Collaborator Author

rwgk commented Mar 30, 2026

Thanks @itamaro!

@rwgk rwgk merged commit 83f71d8 into pybind:master Mar 30, 2026
89 checks passed
@github-actions github-actions bot added the needs changelog Possibly needs a changelog entry label Mar 30, 2026
@rwgk rwgk deleted the fix_virtual_inheritance_test_smart_ptr_todo branch March 30, 2026 03:49
rwgk added a commit to rwgk/pybind11 that referenced this pull request Mar 30, 2026
… crash (pybind#6017)

Virtual inheritance places the base subobject at a dynamic offset, but
load_impl Case 2a uses reinterpret_cast which assumes a fixed offset.
This caused segfaults when dispatching inherited methods through virtual
bases (e.g. SftVirtDerived2::name()).

Add an is_static_downcastable SFINAE trait that detects whether
static_cast<Derived*>(Base*) is valid. When it is not (virtual
inheritance), set multiple_inheritance = true in add_base to force the
implicit_casts path, which correctly adjusts pointers at runtime.

Remove the workaround .def("name", &SftVirtDerived2::name) from
test_smart_ptr.cpp that was papering over the issue.

Made-with: Cursor
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs changelog Possibly needs a changelog entry

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants