Add basic support for tag-based static polymorphism #1326

Merged
merged 7 commits into from Apr 14, 2018
Changes from 5 commits
91 changes: 91 additions & 0 deletions docs/advanced/classes.rst
@@ -999,3 +999,94 @@ described trampoline:
requires a more explicit function binding in the form of
``.def("foo", static_cast<int (A::*)() const>(&Publicist::foo));``
where ``int (A::*)() const`` is the type of ``A::foo``.

Custom automatic downcasters
============================

As explained in :ref:`inheritance`, pybind11 comes with built-in
understanding of the dynamic type of polymorphic objects in C++; that
is, returning a Pet to Python produces a Python object that knows it's
wrapping a Dog, if Pet has virtual methods and pybind11 knows about
Dog and this Pet is in fact a Dog. Sometimes, you might want to
provide this automatic downcasting behavior when creating bindings for
a class hierarchy that does not use standard C++ polymorphism, such as
LLVM [#f4]_. As long as there's some way to determine at runtime
whether a downcast is safe, you can proceed by specializing the
``pybind11::detail::polymorphic_type_hook`` template:

.. code-block:: cpp

enum class PetKind { Cat, Dog, Zebra };
struct Pet {
Member

I think it would be helpful here to add a comment to the effect of:

struct Pet { // Not polymorphic: has no virtual methods

It has come up before that people coming to pybind11 from the Python side didn't realize this distinction, which is definitely understandable as it's a fairly subtle rule!

const PetKind kind;
int age = 0;
protected:
Pet(PetKind _kind) : kind(_kind) {}
};
struct Dog : Pet {
Dog() : Pet(PetKind::Dog) {}
std::string sound = "woof!";
std::string bark() const { return sound; }
};

namespace pybind11 { namespace detail {
Member

Let's get this out of the detail namespace since it's explicitly provided to be user-facing. (Custom type casters are only in detail by accident, with a long-term goal of moving them out--probably to pybind11::caster, but definitely out of the detail namespace). I think defining it in just pybind11 is fine.

template<> struct polymorphic_type_hook<Pet> {
static const void *get(const Pet *src, const std::type_info*& type) {
// note that src may be nullptr
if (src && src->kind == PetKind::Dog) {
type = &typeid(Dog);
return static_cast<const Dog*>(src);
}
return src;
}
};
}} // namespace pybind11::detail

When pybind11 wants to convert a C++ pointer of type ``Base*`` to a
Python object, it calls ``polymorphic_type_hook<Base>::get()`` to
determine if a downcast is possible. The ``get()`` function should use
whatever runtime information is available to determine if its ``src``
parameter is in fact an instance of some class ``Derived`` that
inherits from ``Base``. If it finds such a ``Derived``, it sets ``type
= &typeid(Derived)`` and returns ``static_cast<const Derived*>(src)``.
Member

The static_cast seems unnecessary: I'd think in most cases simply returning src is going to be fine. I think this text could be simplified to "and returns a pointer to the derived instance." If your downcasting apparatus requires a pointer adjustment, that gives you enough information to implement it. But in cases like the example above a simple return src; is enough.

Otherwise, it just returns ``src``, leaving ``type`` at its default
value of nullptr. It's OK to return a type that pybind11 doesn't know
about; in that case, no downcasting will occur, and the original
``src`` pointer will be used with its static type ``Base*``.
Member

I can see some edge cases here that aren't really okay: for example, an A : B : C inheritance chain (B derives from A, and C from B) with A and B registered but C not. You'd probably want type = typeid(B) rather than type = typeid(C), so that you end up with the right type in Python (i.e. B rather than A).

So perhaps keep it in, but starting off with "If you set type to a type that pybind11 doesn't know about no downcasting will occur, and ..."

Contributor Author

Makes sense. FWIW, the same issue exists with ordinary polymorphic class bindings. IIRC, Boost.Python has a crazy registry-aware reimplementation of dynamic_cast so that they correctly downcast to B in that scenario, which I don't think is worth the complexity, but I agree that people already customizing the downcasting behavior might be able to recognize "I should only downcast to B here" and implement that for their specific class hierarchy.
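
The "only downcast to B here" idea can be sketched without pybind11. The following is a minimal standalone illustration, assuming a hypothetical tag-based hierarchy (`Base`, `Middle`, `Leaf`) and a hypothetical `Registry` set that stands in for the binding layer's list of registered types; none of these names come from the PR. The helper reports the most-derived type that is actually registered, so an unregistered `Leaf` is presented as a `Middle` rather than falling all the way back to `Base`.

```cpp
#include <cassert>
#include <typeindex>
#include <typeinfo>
#include <unordered_set>

// Hypothetical tag-based hierarchy: Base <- Middle <- Leaf (no virtual methods).
enum class Kind { Base, Middle, Leaf };
struct Base { Kind kind; explicit Base(Kind k = Kind::Base) : kind(k) {} };
struct Middle : Base { explicit Middle(Kind k = Kind::Middle) : Base(k) {} };
struct Leaf : Middle { Leaf() : Middle(Kind::Leaf) {} };

// Stand-in for the set of types the binding layer knows about. This is an
// assumption made for the sketch, not pybind11's real registry.
using Registry = std::unordered_set<std::type_index>;

// Report the most-derived *registered* type of `src`, walking from its actual
// kind up towards Base. Returns nullptr when no downcast applies.
const std::type_info *nearest_registered(const Base *src, const Registry &known) {
    if (!src) return nullptr;
    const Kind k = src->kind;
    if (k == Kind::Leaf && known.count(typeid(Leaf)))
        return &typeid(Leaf);
    // A Leaf is also a Middle, so an unregistered Leaf falls back to Middle.
    if ((k == Kind::Leaf || k == Kind::Middle) && known.count(typeid(Middle)))
        return &typeid(Middle);
    return nullptr;
}
```

A hook built on such a helper would assign the result to `type` and return the pointer unchanged, since this hypothetical hierarchy uses only single inheritance.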


It is critical that the return value and ``type`` argument of
Member

return value -> returned pointer

``get()`` agree with each other: if ``type`` is set to something
non-null, the returned pointer must point to the start of an object
whose type is ``type``. If the hierarchy being exposed uses only
single inheritance, a simple ``return src;`` will achieve this just
fine, but in the general case, you must cast ``src`` to the
appropriate derived-class pointer before allowing it to be cast to
Member

Since I suggested taking out static_cast above, you can put it back in here:

"in the general case, you must cast src to the appropriate derived-class pointer (e.g. using static_cast<const Derived *>(src)) before allowing it to be returned as a void *."

(NB: I also tweaked the end of that sentence re: the implicit void * cast).

``void*``.
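
To see why the cast matters once multiple inheritance is involved, here is a minimal standalone sketch. The types `Padding`, `Tagged` and `Derived` and the free function `get_tagged` are hypothetical; the function only mirrors the shape of a `polymorphic_type_hook<Tagged>::get()` specialization and is not the pybind11 template itself.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical non-polymorphic hierarchy with multiple inheritance.
struct Padding { long pad = 0; };  // first base, laid out at the front on common ABIs
struct Tagged { int tag = 0; };    // second base: the type we hold pointers to
struct Derived : Padding, Tagged {
    Derived() { tag = 1; }         // tag value 1 marks "this is really a Derived"
};

// Same signature shape as polymorphic_type_hook<Tagged>::get().
const void *get_tagged(const Tagged *src, const std::type_info *&type) {
    if (src && src->tag == 1) {
        type = &typeid(Derived);
        // Cast before the implicit conversion to void*, so the returned
        // address is the start of the Derived object rather than the start
        // of its Tagged subobject.
        return static_cast<const Derived *>(src);
    }
    return src;  // not a Derived (or nullptr): leave `type` untouched
}
```

With a plain `return src;` in the first branch, `type` would claim `Derived` while the pointer still addressed the `Tagged` subobject, which is exactly the mismatch the paragraph above warns about.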

pybind11's standard support for downcasting objects whose types
have virtual methods is implemented using ``polymorphic_type_hook`` too:

.. code-block:: cpp

template <typename itype>
struct polymorphic_type_hook<itype, enable_if_t<std::is_polymorphic<itype>::value>>
{
static const void *get(const itype *src, const std::type_info*& type) {
type = src ? &typeid(*src) : nullptr;
return dynamic_cast<const void*>(src);
}
};
Member

Ditch the implementation code. Just a description that the default implementation of polymorphic_type_hook uses a dynamic_cast<void *> to downcast to the most-derived type is enough: curious readers can always read the code for the actual implementation.


This uses the standard C++ ability to determine the most-derived type
of a polymorphic object using ``typeid()`` and to cast a base pointer
to that most-derived type (even if you don't know what it is) using
``dynamic_cast<void*>``.
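
The two standard facilities mentioned here can be tried in isolation. This is a minimal sketch using hypothetical types (`First`, `Second`, `Both`) and a free function `most_derived` whose body follows the same two steps as the default hook for polymorphic types; it does not depend on pybind11.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical polymorphic hierarchy with two bases, so that a Second*
// generally does not point at the start of the complete object.
struct First { virtual ~First() = default; long a = 0; };
struct Second { virtual ~Second() = default; long b = 0; };
struct Both : First, Second {};

// Reports the dynamic type of *src and returns the address of the
// most-derived object, given only a pointer to one polymorphic base.
const void *most_derived(const Second *src, const std::type_info *&type) {
    // typeid on a dereferenced null pointer would throw std::bad_typeid,
    // hence the explicit null check.
    type = src ? &typeid(*src) : nullptr;
    // dynamic_cast to (const) void* yields the most-derived object's address,
    // or nullptr if src is nullptr.
    return dynamic_cast<const void *>(src);
}
```

Note that the caller never names `Both`: the type and the adjusted address are both recovered from the `Second` pointer alone.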

.. [#f4] https://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html

.. seealso::

The file :file:`tests/test_tagbased_polymorphic.cpp` contains a
more complete example, including a demonstration of how to provide
automatic downcasting for an entire class hierarchy without
writing one get() function for each class.
10 changes: 5 additions & 5 deletions docs/classes.rst
@@ -228,8 +228,8 @@ just brings them on par.

.. _inheritance:

Inheritance and automatic upcasting
===================================
Inheritance and automatic downcasting
=====================================

Suppose now that the example consists of two data structures with an
inheritance relationship:
@@ -298,7 +298,7 @@ inheritance relationship. This is reflected in Python:

>>> p = example.pet_store()
>>> type(p) # `Dog` instance behind `Pet` pointer
Pet # no pointer upcasting for regular non-polymorphic types
Pet # no pointer downcasting for regular non-polymorphic types
Member

Good catch. (I thought I might have been responsible for that -- for some reason I've always had trouble remembering which way is "up" and "down" in an inheritance tree. But nope, it looks like @dean0x7d wrote that, so I guess I'm not alone in confusing the directions :) ).

Contributor Author

It's definitely confusing to me too - I looked it up on Wikipedia before making this change, just to be sure :-)

>>> p.bark()
AttributeError: 'Pet' object has no attribute 'bark'

@@ -330,11 +330,11 @@ will automatically recognize this:

>>> p = example.pet_store2()
>>> type(p)
PolymorphicDog # automatically upcast
PolymorphicDog # automatically downcast
>>> p.bark()
u'woof!'

Given a pointer to a polymorphic base, pybind11 performs automatic upcasting
Given a pointer to a polymorphic base, pybind11 performs automatic downcasting
to the actual derived type. Note that this goes beyond the usual situation in
C++: we don't just get access to the virtual functions of the base, we get the
concrete derived type including functions and attributes that the base type may
63 changes: 45 additions & 18 deletions include/pybind11/cast.h
@@ -774,9 +774,41 @@ template <typename T1, typename T2> struct is_copy_constructible<std::pair<T1, T
: all_of<is_copy_constructible<T1>, is_copy_constructible<T2>> {};
#endif

// polymorphic_type_hook<itype>::get(src, tinfo) determines whether the object pointed
// to by `src` actually is an instance of some class derived from `itype`.
// If so, it sets `tinfo` to point to the std::type_info representing that derived
// type, and returns a pointer to the start of the most-derived object of that type
// (in which `src` is a subobject; this will be the same address as `src` in most
// single inheritance cases). If not, or if `src` is nullptr, it simply returns `src`
// and leaves `tinfo` at its default value of nullptr.
//
// The default polymorphic_type_hook just returns src. A specialization for polymorphic
// types determines the runtime type of the passed object and adjusts the this-pointer
// appropriately via dynamic_cast<void*>. This is what enables a C++ Animal* to appear
// to Python as a Dog (if Dog inherits from Animal, Animal is polymorphic, Dog is
// registered with pybind11, and this Animal is in fact a Dog).
//
// You may specialize polymorphic_type_hook yourself for types that want to appear
// polymorphic to Python but do not use C++ RTTI. (This is a not uncommon pattern
// in performance-sensitive applications, used most notably in LLVM.)
template <typename itype, typename SFINAE = void>
struct polymorphic_type_hook
{
static const void *get(const itype *src, const std::type_info*&) { return src; }
};
template <typename itype>
struct polymorphic_type_hook<itype, enable_if_t<std::is_polymorphic<itype>::value>>
{
static const void *get(const itype *src, const std::type_info*& type) {
type = src ? &typeid(*src) : nullptr;
return dynamic_cast<const void*>(src);
}
};

/// Generic type caster for objects stored on the heap
template <typename type> class type_caster_base : public type_caster_generic {
using itype = intrinsic_t<type>;

public:
static constexpr auto name = _<type>();

@@ -795,30 +827,25 @@ template <typename type> class type_caster_base : public type_caster_generic {

// Returns a (pointer, type_info) pair taking care of necessary RTTI type lookup for a
// polymorphic type. If the instance isn't derived, returns the non-RTTI base version.
Member

The comment here needs updating. Perhaps:

    // Returns a (pointer, type_info) pair taking care of necessary type lookup for a polymorphic
    // type (using RTTI by default, but can be overridden by specializing polymorphic_type_hook).
    // If the instance isn't derived, returns the base version.

template <typename T = itype, enable_if_t<std::is_polymorphic<T>::value, int> = 0>
static std::pair<const void *, const type_info *> src_and_type(const itype *src) {
const void *vsrc = src;
auto &cast_type = typeid(itype);
const std::type_info *instance_type = nullptr;
if (vsrc) {
instance_type = &typeid(*src);
if (!same_type(cast_type, *instance_type)) {
// This is a base pointer to a derived type; if it is a pybind11-registered type, we
// can get the correct derived pointer (which may be != base pointer) by a
// dynamic_cast to most derived type:
if (auto *tpi = get_type_info(*instance_type))
return {dynamic_cast<const void *>(src), const_cast<const type_info *>(tpi)};
}
const void *vsrc = polymorphic_type_hook<itype>::get(src, instance_type);
if (instance_type && !same_type(cast_type, *instance_type)) {
// This is a base pointer to a derived type. If the derived type is registered
// with pybind11, we want to make the full derived object available.
// In the typical case where itype is polymorphic, we get the correct
// derived pointer (which may be != base pointer) by a dynamic_cast to
// most derived type. If itype is not polymorphic, we won't get here
// except via a user-provided specialization of polymorphic_type_hook,
// and the user has promised that no this-pointer adjustment is
// required in that case, so it's OK to use static_cast.
if (const auto *tpi = get_type_info(*instance_type))
return {vsrc, tpi};
}
// Otherwise we have either a nullptr, an `itype` pointer, or an unknown derived pointer, so
// don't do a cast
return type_caster_generic::src_and_type(vsrc, cast_type, instance_type);
}

// Non-polymorphic type, so no dynamic casting; just call the generic version directly
template <typename T = itype, enable_if_t<!std::is_polymorphic<T>::value, int> = 0>
static std::pair<const void *, const type_info *> src_and_type(const itype *src) {
return type_caster_generic::src_and_type(src, typeid(itype));
return type_caster_generic::src_and_type(src, cast_type, instance_type);
}

static handle cast(const itype *src, return_value_policy policy, handle parent) {
1 change: 1 addition & 0 deletions tests/CMakeLists.txt
@@ -57,6 +57,7 @@ set(PYBIND11_TEST_FILES
test_smart_ptr.cpp
test_stl.cpp
test_stl_binders.cpp
test_tagbased_polymorphic.cpp
test_virtual_functions.cpp
)

138 changes: 138 additions & 0 deletions tests/test_tagbased_polymorphic.cpp
@@ -0,0 +1,138 @@
/*
tests/test_tagbased_polymorphic.cpp -- test of detail::polymorphic_type_hook

Copyright (c) 2018 Hudson River Trading LLC <opensource@hudson-trading.com>

All rights reserved. Use of this source code is governed by a
BSD-style license that can be found in the LICENSE file.
*/

#include "pybind11_tests.h"
#include <pybind11/stl.h>

struct Animal
{
enum class Kind {
Unknown = 0,
Dog = 100, Labrador, Chihuahua, LastDog = 199,
Cat = 200, Panther, LastCat = 299
};
static const std::type_info* type_of_kind(Kind kind);
static std::string name_of_kind(Kind kind);

const Kind kind;
const std::string name;

protected:
Animal(const std::string& _name, Kind _kind)
: kind(_kind), name(_name)
{}
};

struct Dog : Animal
{
Dog(const std::string& _name, Kind _kind = Kind::Dog) : Animal(_name, _kind) {}
std::string bark() const { return name_of_kind(kind) + " " + name + " goes " + sound; }
std::string sound = "WOOF!";
};

struct Labrador : Dog
{
Labrador(const std::string& _name, int _excitement = 9001)
: Dog(_name, Kind::Labrador), excitement(_excitement) {}
int excitement;
};

struct Chihuahua : Dog
{
Chihuahua(const std::string& _name) : Dog(_name, Kind::Chihuahua) { sound = "iyiyiyiyiyi"; }
std::string bark() const { return Dog::bark() + " and runs in circles"; }
};

struct Cat : Animal
{
Cat(const std::string& _name, Kind _kind = Kind::Cat) : Animal(_name, _kind) {}
std::string purr() const { return "mrowr"; }
};

struct Panther : Cat
{
Panther(const std::string& _name) : Cat(_name, Kind::Panther) {}
std::string purr() const { return "mrrrRRRRRR"; }
};

std::vector<std::unique_ptr<Animal>> create_zoo()
{
std::vector<std::unique_ptr<Animal>> ret;
ret.emplace_back(new Labrador("Fido", 15000));

// simulate some new type of Dog that the Python bindings
// haven't been updated for; it should still be considered
// a Dog, not just an Animal.
ret.emplace_back(new Dog("Ginger", Dog::Kind(150)));

ret.emplace_back(new Chihuahua("Hertzl"));
ret.emplace_back(new Cat("Tiger", Cat::Kind::Cat));
ret.emplace_back(new Panther("Leo"));
return ret;
}

const std::type_info* Animal::type_of_kind(Kind kind)
{
switch (kind) {
case Kind::Unknown: break;

case Kind::Dog: break;
case Kind::Labrador: return &typeid(Labrador);
case Kind::Chihuahua: return &typeid(Chihuahua);
case Kind::LastDog: break;

case Kind::Cat: break;
case Kind::Panther: return &typeid(Panther);
case Kind::LastCat: break;
}

if (kind >= Kind::Dog && kind <= Kind::LastDog) return &typeid(Dog);
if (kind >= Kind::Cat && kind <= Kind::LastCat) return &typeid(Cat);
return nullptr;
}

std::string Animal::name_of_kind(Kind kind)
{
std::string raw_name = type_of_kind(kind)->name();
py::detail::clean_type_id(raw_name);
return raw_name;
}

namespace pybind11 {
namespace detail {
template <typename itype>
struct polymorphic_type_hook<itype, enable_if_t<std::is_base_of<Animal, itype>::value>>
{
static const void *get(const itype *src, const std::type_info*& type)
{ type = src ? Animal::type_of_kind(src->kind) : nullptr; return src; }
};
}
}

TEST_SUBMODULE(tagbased_polymorphic, m) {
py::class_<Animal>(m, "Animal")
.def_readonly("name", &Animal::name);
py::class_<Dog, Animal>(m, "Dog")
.def(py::init<std::string>())
.def_readwrite("sound", &Dog::sound)
.def("bark", &Dog::bark);
py::class_<Labrador, Dog>(m, "Labrador")
.def(py::init<std::string, int>(), "name"_a, "excitement"_a = 9001)
.def_readwrite("excitement", &Labrador::excitement);
py::class_<Chihuahua, Dog>(m, "Chihuahua")
.def(py::init<std::string>())
.def("bark", &Chihuahua::bark);
py::class_<Cat, Animal>(m, "Cat")
.def(py::init<std::string>())
.def("purr", &Cat::purr);
py::class_<Panther, Cat>(m, "Panther")
.def(py::init<std::string>())
.def("purr", &Panther::purr);
m.def("create_zoo", &create_zoo);
};
20 changes: 20 additions & 0 deletions tests/test_tagbased_polymorphic.py
@@ -0,0 +1,20 @@
from pybind11_tests import tagbased_polymorphic as m


def test_downcast():
zoo = m.create_zoo()
assert [type(animal) for animal in zoo] == [
m.Labrador, m.Dog, m.Chihuahua, m.Cat, m.Panther
]
assert [animal.name for animal in zoo] == [
"Fido", "Ginger", "Hertzl", "Tiger", "Leo"
]
zoo[1].sound = "woooooo"
assert [dog.bark() for dog in zoo[:3]] == [
"Labrador Fido goes WOOF!",
"Dog Ginger goes woooooo",
"Chihuahua Hertzl goes iyiyiyiyiyi and runs in circles"
]
assert [cat.purr() for cat in zoo[3:]] == ["mrowr", "mrrrRRRRRR"]
zoo[0].excitement -= 1000
assert zoo[0].excitement == 14000