Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add basic support for tag-based static polymorphism #1326

Merged
merged 7 commits into from Apr 14, 2018

Conversation

Projects
None yet
3 participants
@oremanj
Copy link
Contributor

commented Mar 19, 2018

Sometimes it is possible to look at a C++ object and know what its dynamic type is,
even if it doesn't use C++ polymorphism, because instances of the object and its
subclasses conform to some other mechanism for being self-describing; for example,
perhaps there's an enumerated "tag" or "kind" member in the base class that's always
set to an indication of the correct type. This might be done for performance reasons,
or to permit most-derived types to be trivially copyable. One of the most widely-known
examples is in LLVM: https://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html

This PR permits pybind11 to be informed of such conventions via a new specializable
detail::polymorphic_type_hook<> template, which generalizes the previous logic for
determining the runtime type of an object based on C++ RTTI. Implementors provide
a way to map from a base class object to a const std::type_info* for the dynamic
type; pybind11 then uses this to ensure that casting a Base* to Python creates a
Python object that knows it's wrapping the appropriate sort of Derived.

There are a number of restrictions with this tag-based static polymorphism support
compared to pybind11's existing support for built-in C++ polymorphism:

  • there is no support for this-pointer adjustment, so only single inheritance is permitted
  • there is no way to make C++ code call new Python-provided subclasses
  • when binding C++ classes that redefine a method in a subclass, the .def() must be
    repeated in the binding for Python to know about the update

But these are not much of an issue in practice in many cases, the impact on the
complexity of pybind11's innards is minimal and localized, and the support for
automatic downcasting improves usability a great deal.

Joshua Oreman
Add basic support for tag-based static polymorphism
Sometimes it is possible to look at a C++ object and know what its dynamic type is,
even if it doesn't use C++ polymorphism, because instances of the object and its
subclasses conform to some other mechanism for being self-describing; for example,
perhaps there's an enumerated "tag" or "kind" member in the base class that's always
set to an indication of the correct type. This might be done for performance reasons,
or to permit most-derived types to be trivially copyable. One of the most widely-known
examples is in LLVM: https://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html

This PR permits pybind11 to be informed of such conventions via a new specializable
detail::polymorphic_type_hook<> template, which generalizes the previous logic for
determining the runtime type of an object based on C++ RTTI. Implementors provide
a way to map from a base class object to a const std::type_info* for the dynamic
type; pybind11 then uses this to ensure that casting a Base* to Python creates a
Python object that knows it's wrapping the appropriate sort of Derived.

There are a number of restrictions with this tag-based static polymorphism support
compared to pybind11's existing support for built-in C++ polymorphism:

- there is no support for this-pointer adjustment, so only single inheritance is permitted
- there is no way to make C++ code call new Python-provided subclasses
- when binding C++ classes that redefine a method in a subclass, the .def() must be
  repeated in the binding for Python to know about the update

But these are not much of an issue in practice in many cases, the impact on the
complexity of pybind11's innards is minimal and localized, and the support for
automatic downcasting improves usability a great deal.
@wjakob

This comment has been minimized.

Copy link
Member

commented Mar 19, 2018

Cool, that's quite nifty (and does not seem to introduce any runtime costs AFAIK).

Joshua Oreman
@oremanj

This comment has been minimized.

Copy link
Contributor Author

commented Mar 19, 2018

The AppVeyor error appears to be unrelated to my change; can you confirm my read of that is correct?

@jagerman

This comment has been minimized.

Copy link
Member

commented Mar 26, 2018

This looks like a nice addition to me. Some comments (in order from most to least significant):

  1. Regarding the limitations (only single inheritance, etc.) aren't these implementation details? In theory, it seems to me that someone could write a polymorphic_type_hook<> that did work with polymorphic types. It would simply require, I think, providing a get that takes a mutable pointer, i.e. get(const itype *&src). Simple cases that don't need to update the pointer don't need to take it by reference.

Furthering that line of thought, perhaps we could make the default polymorphic_type_hook<> implementation actually do the polymorphic conversion (by allowing it to mutate the src pointer).

  1. This is lacking an addition in the documentation. You actually wrote good documentation for it as a comment; most of that content can be moved into a documentation section as-is.

  2. I don't think the two versions of static const void *cast_to_derived(const itype *src) { ... } for polymorphic and non-polymorphic types are needed: when itype isn't polymorphic, dynamic_cast should do exactly the same thing as static_cast, so it should be perfectly fine to just use dynamic_cast whether or not polymorphic.

  3. (@wjakob) I think there is a small performance penalty here: non-polymorphic types now have to go through the more complicated polymorphic code path (while previously they simply returned a pair of values). It's not going to be a huge overhead (worst case is a couple of pointer checks and a string comparison--the latter only under non-libstdc++). That seems like a fairly small cost, but it's something we should be aware of.

@jagerman

This comment has been minimized.

Copy link
Member

commented Mar 26, 2018

The AppVeyor error appears to be unrelated to my change; can you confirm my read of that is correct?

Yeah. I was kind of hoping it would go away on its own with an appveyor image update, but it seems that hasn't happened (yet?). But yes, that one is safe to ignore.

@oremanj

This comment has been minimized.

Copy link
Contributor Author

commented Mar 26, 2018

Thanks for the review!

  1. You're absolutely right; I like the idea of doing the pointer adjustment in polymorphic_type_hook, so the dynamic_cast<void*>-when-polymorphic is a special case of the general facility rather than being bolted on afterward.
  2. I'll write something up.
  3. dynamic_cast<void*>(pointer-to-non-polymorphic) is ill-formed (I just tested it on godbolt.org)
  4. My read is that non-polymorphic types will select the default implementation of polymorphic_type_hook, which can be determined at compile-time to have a get() that always returns nullptr, which makes the if (instance_type && ...) { ... } block dead code, which makes the whole function reduce to return type_caster_generic::src_and_type(src, typeid(itype), nullptr); which is equivalent to the old non-polymorphic version.
@oremanj

This comment has been minimized.

Copy link
Contributor Author

commented Apr 9, 2018

@jagerman, any updates on this? I believe I've responded to all of your concerns, but please let me know if there's more that you'd like me to take into account.

@jagerman
Copy link
Member

left a comment

Some small edits for the documentation, but the implementation looks good to me aside from the detail namespace issue (see comment).

.. code-block:: cpp

enum class PetKind { Cat, Dog, Zebra };
struct Pet {

This comment has been minimized.

Copy link
@jagerman

jagerman Apr 9, 2018

Member

I think it would be helpful here to add a comment to the effect of:

struct Pet { // Not polymorphic: has no virtual methods

It has come up before that people coming to pybind from the python side didn't realize this distinction—which is definitely understandable as it's a fairly subtle rule!

std::string bark() const { return sound; }
};

namespace pybind11 { namespace detail {

This comment has been minimized.

Copy link
@jagerman

jagerman Apr 9, 2018

Member

Let's get this out of the detail namespace since it's explicitly provided to be user-facing. (Custom type casters are only in detail by accident, with a long-term goal of moving them out--probably to pybind11::caster, but definitely out of the detail namespace). I think defining it in just pybind11 is fine.

whatever runtime information is available to determine if its ``src``
parameter is in fact an instance of some class ``Derived`` that
inherits from ``Base``. If it finds such a ``Derived``, it sets ``type
= &typeid(Derived)`` and returns ``static_cast<const Derived*>(src)``.

This comment has been minimized.

Copy link
@jagerman

jagerman Apr 9, 2018

Member

The static_cast seems unnecessary: I'd think in most cases simply returning src is going to be fine. I think this text could be simplified to "and returns a pointer to the derived instance." If your downcasting apparatus requires a pointer adjustment, that gives you enough information to implement it. But in cases like the example above a simple return src; is enough.

Otherwise, it just returns ``src``, leaving ``type`` at its default
value of nullptr. It's OK to return a type that pybind11 doesn't know
about; in that case, no downcasting will occur, and the original
``src`` pointer will be used with its static type ``Base*``.

This comment has been minimized.

Copy link
@jagerman

jagerman Apr 9, 2018

Member

I can see some edge cases here that aren't really okay: for example where we have a A : B : C inheritance chain with A and B registered but C not. You'd probably want to end up with type = typeid(B) rather than type = typeid(C) to end up with the right type in Python (i.e. B rather than A).

So perhaps keep it in, but starting off with "If you set type to a type that pybind11 doesn't know about no downcasting will occur, and ..."

This comment has been minimized.

Copy link
@oremanj

oremanj Apr 9, 2018

Author Contributor

Makes sense. FWIW, the same issue exists with ordinary polymorphic class bindings. IIRC, Boost.Python has a crazy registry-aware reimplementation of dynamic_cast so that they correctly downcast to B in that scenario, which I don't think is worth the complexity, but I agree that people already customizing the downcasting behavior might be able to recognize "I should only downcast to B here" and implement that for their specific class hierarchy.

about; in that case, no downcasting will occur, and the original
``src`` pointer will be used with its static type ``Base*``.

It is critical that the return value and ``type`` argument of

This comment has been minimized.

Copy link
@jagerman

jagerman Apr 9, 2018

Member

return value -> returned pointer

whose type is ``type``. If the hierarchy being exposed uses only
single inheritance, a simple ``return src;`` will achieve this just
fine, but in the general case, you must cast ``src`` to the
appropriate derived-class pointer before allowing it to be cast to

This comment has been minimized.

Copy link
@jagerman

jagerman Apr 9, 2018

Member

Since I suggested taking out static_cast above, you can put it back in here:

"in the general case, you must cast src to the appropriate derived-class pointer (e.g. using static_cast<Derived>(src)) before allowing it to be returned as a void *."

(NB: I also tweaked the end of that sentence re: the implicit void * cast).

type = src ? &typeid(*src) : nullptr;
return dynamic_cast<const void*>(src);
}
};

This comment has been minimized.

Copy link
@jagerman

jagerman Apr 9, 2018

Member

Ditch the implementation code. Just a description that the default implementation of polymorphic_type_hook uses a dynamic_cast<void *> to downcast to the most-derived type is enough: curious readers can always read the code for the actual implementation.

@@ -298,7 +298,7 @@ inheritance relationship. This is reflected in Python:

>>> p = example.pet_store()
>>> type(p) # `Dog` instance behind `Pet` pointer
Pet # no pointer upcasting for regular non-polymorphic types
Pet # no pointer downcasting for regular non-polymorphic types

This comment has been minimized.

Copy link
@jagerman

jagerman Apr 9, 2018

Member

Good catch. (I though I might have been responsible for that -- for some reason I've always has trouble remembering which way is "up" and "down" in an inheritance tree. But nope, it looks like @dean0x7d wrote that, so I guess I'm not alone in confusing the directions :) ).

This comment has been minimized.

Copy link
@oremanj

oremanj Apr 9, 2018

Author Contributor

It's definitely confusing to me too - I looked it up on Wikipedia before making this change, just to be sure :-)

@@ -795,30 +827,25 @@ template <typename type> class type_caster_base : public type_caster_generic {

// Returns a (pointer, type_info) pair taking care of necessary RTTI type lookup for a
// polymorphic type. If the instance isn't derived, returns the non-RTTI base version.

This comment has been minimized.

Copy link
@jagerman

jagerman Apr 9, 2018

Member

The comment here needs updating. Perhaps:

    // Returns a (pointer, type_info) pair taking care of necessary type lookup for a polymorphic
    // type (using RTTI by default, but can be overridden by specializing polymorphic_type_hook).
    // If the instance isn't derived, returns the base version.
@oremanj

This comment has been minimized.

Copy link
Contributor Author

commented Apr 9, 2018

@jagerman Thanks for the feedback! I've updated per your requests; please advise if there are any other changes you'd like to see.

@jagerman

This comment has been minimized.

Copy link
Member

commented Apr 11, 2018

This looks good to me now. Cc @wjakob for any other comments before merging.

@wjakob

This comment has been minimized.

Copy link
Member

commented Apr 14, 2018

This looks great! -- Merging now.

@wjakob wjakob merged commit fd9bc8f into pybind:master Apr 14, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.