Add basic support for tag-based static polymorphism #1326
Sometimes it is possible to look at a C++ object and know what its dynamic type is, even if it doesn't use C++ polymorphism, because instances of the object and its subclasses conform to some other mechanism for being self-describing; for example, perhaps there's an enumerated "tag" or "kind" member in the base class that's always set to an indication of the correct type. This might be done for performance reasons, or to permit most-derived types to be trivially copyable. One of the most widely-known examples is in LLVM: https://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html

This PR permits pybind11 to be informed of such conventions via a new specializable `detail::polymorphic_type_hook<>` template, which generalizes the previous logic for determining the runtime type of an object based on C++ RTTI. Implementors provide a way to map from a base class object to a `const std::type_info*` for the dynamic type; pybind11 then uses this to ensure that casting a `Base*` to Python creates a Python object that knows it's wrapping the appropriate sort of `Derived`.

There are a number of restrictions with this tag-based static polymorphism support compared to pybind11's existing support for built-in C++ polymorphism:

- there is no support for this-pointer adjustment, so only single inheritance is permitted
- there is no way to make C++ code call new Python-provided subclasses
- when binding C++ classes that redefine a method in a subclass, the `.def()` must be repeated in the binding for Python to know about the update

But these are not much of an issue in practice in many cases, the impact on the complexity of pybind11's innards is minimal and localized, and the support for automatic downcasting improves usability a great deal.
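To make the mechanism described above concrete, here is a minimal standalone sketch of a tag-based hierarchy and a hook that maps a base pointer to a `std::type_info`. It deliberately does not include pybind11: `type_hook_sketch` is a hypothetical stand-in whose `get(src, type)` shape is inferred from the documentation fragments quoted later in this review, and the `Pet`/`Dog` classes are modelled on the example in the docs diff.

```cpp
#include <cassert>
#include <string>
#include <typeinfo>

// Hypothetical hierarchy, modelled on the Pet example in the docs diff.
enum class PetKind { Cat, Dog, Zebra };

struct Pet {  // Not polymorphic: has no virtual methods
    Pet(const std::string &name, PetKind kind) : name(name), kind(kind) {}
    const std::string name;
    const PetKind kind;  // the "tag": set once by whoever constructs the object
};

struct Dog : Pet {
    explicit Dog(const std::string &name) : Pet(name, PetKind::Dog) {}
    std::string sound = "woof";
    std::string bark() const { return sound; }
};

// Stand-in for the hook (an assumption, not pybind11's real template).
// Primary template: no extra knowledge, so leave `type` untouched.
template <typename itype>
struct type_hook_sketch {
    static const void *get(const itype *src, const std::type_info *&) {
        return src;
    }
};

// Specialization for Pet: consult the tag to recover the dynamic type.
template <>
struct type_hook_sketch<Pet> {
    static const void *get(const Pet *src, const std::type_info *&type) {
        if (src && src->kind == PetKind::Dog) {
            type = &typeid(Dog);
            return static_cast<const Dog *>(src);
        }
        return src;  // unknown kind: `type` keeps the caller's default
    }
};
```

A caller would initialise `type` to `nullptr`, call `type_hook_sketch<Pet>::get(ptr, type)`, and treat a non-null `type` as "this is really a `Dog`".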
Cool, that's quite nifty (and does not seem to introduce any runtime costs AFAIK).
The AppVeyor error appears to be unrelated to my change; can you confirm my read of that is correct?
This looks like a nice addition to me. Some comments (in order from most to least significant):
Furthering that line of thought, perhaps we could make the default
Yeah. I was kind of hoping it would go away on its own with an AppVeyor image update, but it seems that hasn't happened (yet?). But yes, that one is safe to ignore.
Thanks for the review!
@jagerman, any updates on this? I believe I've responded to all of your concerns, but please let me know if there's more that you'd like me to take into account.
Some small edits for the documentation, but the implementation looks good to me aside from the `detail` namespace issue (see comment).
docs/advanced/classes.rst
.. code-block:: cpp

    enum class PetKind { Cat, Dog, Zebra };
    struct Pet {
I think it would be helpful here to add a comment to the effect of:
struct Pet { // Not polymorphic: has no virtual methods
It has come up before that people coming to pybind11 from the Python side didn't realize this distinction, which is definitely understandable as it's a fairly subtle rule!
docs/advanced/classes.rst
        std::string bark() const { return sound; }
    };

    namespace pybind11 { namespace detail {
Let's get this out of the `detail` namespace since it's explicitly provided to be user-facing. (Custom type casters are only in `detail` by accident, with a long-term goal of moving them out--probably to `pybind11::caster`, but definitely out of the `detail` namespace). I think defining it in just `pybind11` is fine.
docs/advanced/classes.rst
whatever runtime information is available to determine if its ``src``
parameter is in fact an instance of some class ``Derived`` that
inherits from ``Base``. If it finds such a ``Derived``, it sets ``type
= &typeid(Derived)`` and returns ``static_cast<const Derived*>(src)``.
The `static_cast` seems unnecessary: I'd think in most cases simply returning `src` is going to be fine. I think this text could be simplified to "and returns a pointer to the derived instance." If your downcasting apparatus requires a pointer adjustment, that gives you enough information to implement it. But in cases like the example above a simple `return src;` is enough.
docs/advanced/classes.rst
Otherwise, it just returns ``src``, leaving ``type`` at its default
value of nullptr. It's OK to return a type that pybind11 doesn't know
about; in that case, no downcasting will occur, and the original
``src`` pointer will be used with its static type ``Base*``.
I can see some edge cases here that aren't really okay: for example, where we have an `A : B : C` inheritance chain with `A` and `B` registered but `C` not. You'd probably want to end up with `type = typeid(B)` rather than `type = typeid(C)` to end up with the right type in Python (i.e. `B` rather than `A`).
So perhaps keep it in, but starting off with "If you set `type` to a type that pybind11 doesn't know about, no downcasting will occur, and ..."
Makes sense. FWIW, the same issue exists with ordinary polymorphic class bindings. IIRC, Boost.Python has a crazy registry-aware reimplementation of dynamic_cast so that they correctly downcast to B in that scenario, which I don't think is worth the complexity, but I agree that people already customizing the downcasting behavior might be able to recognize "I should only downcast to B here" and implement that for their specific class hierarchy.
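The "only downcast to `B` here" idea from this exchange can be sketched as follows. This is a standalone illustration under stated assumptions, not pybind11 code: `registered()` is a hypothetical stand-in for "the set of classes that have Python bindings", the `Kind` tag is invented for the example, and `A` is taken to be the base of the chain with `C` the most derived class.

```cpp
#include <cassert>
#include <set>
#include <typeindex>
#include <typeinfo>

// Hypothetical tag-based chain: A is the base, B derives from A, C from B.
enum class Kind { A, B, C };
struct A { Kind kind = Kind::A; };
struct B : A { B() { kind = Kind::B; } };
struct C : B { C() { kind = Kind::C; } };

// Hypothetical stand-in for "types that have bindings": A and B, but not C.
const std::set<std::type_index> &registered() {
    static const std::set<std::type_index> types{typeid(A), typeid(B)};
    return types;
}

// Hook-shaped function: report the most-derived *registered* type, so a C
// instance is presented as a B rather than falling all the way back to A.
const void *resolve(const A *src, const std::type_info *&type) {
    if (!src) return src;
    if (src->kind == Kind::C && registered().count(typeid(C))) {
        type = &typeid(C);
        return static_cast<const C *>(src);
    }
    if ((src->kind == Kind::C || src->kind == Kind::B) &&
        registered().count(typeid(B))) {
        type = &typeid(B);
        return static_cast<const B *>(src);
    }
    return src;  // plain A: leave `type` at the caller's default
}
```

In a real binding the author usually knows statically which classes are bound, so the registry lookup could be replaced by hard-coded branches; it is kept here only to make the fallback rule visible.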
docs/advanced/classes.rst
about; in that case, no downcasting will occur, and the original
``src`` pointer will be used with its static type ``Base*``.

It is critical that the return value and ``type`` argument of
return value -> returned pointer
docs/advanced/classes.rst
whose type is ``type``. If the hierarchy being exposed uses only
single inheritance, a simple ``return src;`` will achieve this just
fine, but in the general case, you must cast ``src`` to the
appropriate derived-class pointer before allowing it to be cast to
Since I suggested taking out `static_cast` above, you can put it back in here:
"in the general case, you must cast `src` to the appropriate derived-class pointer (e.g. using `static_cast<const Derived *>(src)`) before allowing it to be returned as a `void *`."
(NB: I also tweaked the end of that sentence re: the implicit `void *` cast).
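Why the cast matters in the general case can be seen in a small standalone sketch. The classes and both `resolve_*` functions are hypothetical; the point is only that, when the tagged base is not the first base of the derived class, the base subobject and the derived object start at different addresses, so a bare `return src;` would hand back the wrong pointer for the reported type.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical layout: Base is the *second* base of Derived, so on common
// ABIs the Base subobject sits at a non-zero offset inside Derived.
struct Other { int pad = 0; };
struct Base { int tag = 0; };
struct Derived : Other, Base {
    Derived() { tag = 1; }
    int extra = 0;
};

// Reports Derived but returns the Base subobject address: a mismatch.
const void *resolve_naive(const Base *src, const std::type_info *&type) {
    if (src && src->tag == 1) type = &typeid(Derived);
    return src;
}

// Casts to the derived pointer first, so the conversion to void* yields
// the address of the object whose type is `type`.
const void *resolve_adjusted(const Base *src, const std::type_info *&type) {
    if (src && src->tag == 1) {
        type = &typeid(Derived);
        return static_cast<const Derived *>(src);
    }
    return src;
}
```

With single inheritance and no virtual functions the two versions typically return the same address, which is why `return src;` is enough for the `Pet` example.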
docs/advanced/classes.rst
            type = src ? &typeid(*src) : nullptr;
            return dynamic_cast<const void*>(src);
        }
    };
Ditch the implementation code. Just a description that the default implementation of `polymorphic_type_hook` uses a `dynamic_cast<void *>` to downcast to the most-derived type is enough: curious readers can always read the code for the actual implementation.
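For readers unfamiliar with that idiom, the following standalone sketch shows what `typeid(*src)` plus `dynamic_cast<const void *>` give for a genuinely polymorphic hierarchy. The function body mirrors the two lines shown in the diff above; the `Left`/`Right`/`Both` classes and the name `most_derived` are made up for illustration.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical polymorphic hierarchy with two bases, so that the Right
// subobject does not start at the same address as the full object.
struct Left { virtual ~Left() = default; int l = 0; };
struct Right { virtual ~Right() = default; int r = 0; };
struct Both : Left, Right {};

// RTTI-based lookup: typeid(*src) names the dynamic type, and
// dynamic_cast<const void *> returns the address of the most-derived object.
const void *most_derived(const Right *src, const std::type_info *&type) {
    type = src ? &typeid(*src) : nullptr;
    return dynamic_cast<const void *>(src);
}
```

Given a `const Right *` that points into a `Both`, `type` ends up as `&typeid(Both)` and the returned pointer is the start of the `Both` object, which on common ABIs differs from the `Right *` that was passed in.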
@@ -298,7 +298,7 @@ inheritance relationship. This is reflected in Python:

     >>> p = example.pet_store()
     >>> type(p)  # `Dog` instance behind `Pet` pointer
    -Pet          # no pointer upcasting for regular non-polymorphic types
    +Pet          # no pointer downcasting for regular non-polymorphic types
Good catch. (I thought I might have been responsible for that -- for some reason I've always had trouble remembering which way is "up" and "down" in an inheritance tree. But nope, it looks like @dean0x7d wrote that, so I guess I'm not alone in confusing the directions :) ).
It's definitely confusing to me too - I looked it up on Wikipedia before making this change, just to be sure :-)
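The reason no downcasting happens for regular non-polymorphic types can be shown in plain C++: without a virtual function, `typeid` applied through a base reference reports the static type. A small sketch, with class names borrowed from the pybind11 docs and a hypothetical `seen_type` helper:

```cpp
#include <cassert>
#include <typeinfo>

struct Pet {};          // non-polymorphic: no virtual functions
struct Dog : Pet {};

struct PolymorphicPet { virtual ~PolymorphicPet() = default; };
struct PolymorphicDog : PolymorphicPet {};

// For a non-polymorphic class, typeid uses the static type of the expression.
const std::type_info &seen_type(const Pet &p) { return typeid(p); }

// For a polymorphic class, typeid consults RTTI and finds the dynamic type.
const std::type_info &seen_type(const PolymorphicPet &p) { return typeid(p); }
```

Passing a `Dog` to the first overload yields `typeid(Pet)`, while passing a `PolymorphicDog` to the second yields `typeid(PolymorphicDog)`; the tag-based hook in this PR exists to recover the `Dog` answer in the first situation.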
include/pybind11/cast.h
@@ -795,30 +827,25 @@ template <typename type> class type_caster_base : public type_caster_generic {

    // Returns a (pointer, type_info) pair taking care of necessary RTTI type lookup for a
    // polymorphic type.  If the instance isn't derived, returns the non-RTTI base version.
The comment here needs updating. Perhaps:
// Returns a (pointer, type_info) pair taking care of necessary type lookup for a polymorphic
// type (using RTTI by default, but can be overridden by specializing polymorphic_type_hook).
// If the instance isn't derived, returns the base version.
@jagerman Thanks for the feedback! I've updated per your requests; please advise if there are any other changes you'd like to see.
This looks good to me now. Cc @wjakob for any other comments before merging.
This looks great! -- Merging now.