Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add basic support for tag-based static polymorphism #1326

Merged
merged 7 commits into from
Apr 14, 2018

Conversation

oremanj
Copy link
Contributor

@oremanj oremanj commented Mar 19, 2018

Sometimes it is possible to look at a C++ object and know what its dynamic type is,
even if it doesn't use C++ polymorphism, because instances of the object and its
subclasses conform to some other mechanism for being self-describing; for example,
perhaps there's an enumerated "tag" or "kind" member in the base class that's always
set to an indication of the correct type. This might be done for performance reasons,
or to permit most-derived types to be trivially copyable. One of the most widely-known
examples is in LLVM: https://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html

This PR permits pybind11 to be informed of such conventions via a new specializable
detail::polymorphic_type_hook<> template, which generalizes the previous logic for
determining the runtime type of an object based on C++ RTTI. Implementors provide
a way to map from a base class object to a const std::type_info* for the dynamic
type; pybind11 then uses this to ensure that casting a Base* to Python creates a
Python object that knows it's wrapping the appropriate sort of Derived.

There are a number of restrictions with this tag-based static polymorphism support
compared to pybind11's existing support for built-in C++ polymorphism:

  • there is no support for this-pointer adjustment, so only single inheritance is permitted
  • there is no way to make C++ code call new Python-provided subclasses
  • when binding C++ classes that redefine a method in a subclass, the .def() must be
    repeated in the binding for Python to know about the update

But these are not much of an issue in practice in many cases, the impact on the
complexity of pybind11's innards is minimal and localized, and the support for
automatic downcasting improves usability a great deal.

Sometimes it is possible to look at a C++ object and know what its dynamic type is,
even if it doesn't use C++ polymorphism, because instances of the object and its
subclasses conform to some other mechanism for being self-describing; for example,
perhaps there's an enumerated "tag" or "kind" member in the base class that's always
set to an indication of the correct type. This might be done for performance reasons,
or to permit most-derived types to be trivially copyable. One of the most widely-known
examples is in LLVM: https://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html

This PR permits pybind11 to be informed of such conventions via a new specializable
detail::polymorphic_type_hook<> template, which generalizes the previous logic for
determining the runtime type of an object based on C++ RTTI. Implementors provide
a way to map from a base class object to a const std::type_info* for the dynamic
type; pybind11 then uses this to ensure that casting a Base* to Python creates a
Python object that knows it's wrapping the appropriate sort of Derived.

There are a number of restrictions with this tag-based static polymorphism support
compared to pybind11's existing support for built-in C++ polymorphism:

- there is no support for this-pointer adjustment, so only single inheritance is permitted
- there is no way to make C++ code call new Python-provided subclasses
- when binding C++ classes that redefine a method in a subclass, the .def() must be
  repeated in the binding for Python to know about the update

But these are not much of an issue in practice in many cases, the impact on the
complexity of pybind11's innards is minimal and localized, and the support for
automatic downcasting improves usability a great deal.
@wjakob
Copy link
Member

wjakob commented Mar 19, 2018

Cool, that's quite nifty (and does not seem to introduce any runtime costs AFAIK).

@oremanj
Copy link
Contributor Author

oremanj commented Mar 19, 2018

The AppVeyor error appears to be unrelated to my change; can you confirm my read of that is correct?

@jagerman
Copy link
Member

This looks like a nice addition to me. Some comments (in order from most to least significant):

  1. Regarding the limitations (only single inheritance, etc.) aren't these implementation details? In theory, it seems to me that someone could write a polymorphic_type_hook<> that did work with polymorphic types. It would simply require, I think, providing a get that takes a mutable pointer, i.e. get(const itype *&src). Simple cases that don't need to update the pointer don't need to take it by reference.

Furthering that line of thought, perhaps we could make the default polymorphic_type_hook<> implementation actually do the polymorphic conversion (by allowing it to mutate the src pointer).

  1. This is lacking an addition in the documentation. You actually wrote good documentation for it as a comment; most of that content can be moved into a documentation section as-is.

  2. I don't think the two versions of static const void *cast_to_derived(const itype *src) { ... } for polymorphic and non-polymorphic types are needed: when itype isn't polymorphic, dynamic_cast should do exactly the same thing as static_cast, so it should be perfectly fine to just use dynamic_cast whether or not polymorphic.

  3. (@wjakob) I think there is a small performance penalty here: non-polymorphic types now have to go through the more complicated polymorphic code path (while previously they simply returned a pair of values). It's not going to be a huge overhead (worst case is a couple of pointer checks and a string comparison--the latter only under non-libstdc++). That seems like a fairly small cost, but it's something we should be aware of.

@jagerman
Copy link
Member

jagerman commented Mar 26, 2018

The AppVeyor error appears to be unrelated to my change; can you confirm my read of that is correct?

Yeah. I was kind of hoping it would go away on its own with an appveyor image update, but it seems that hasn't happened (yet?). But yes, that one is safe to ignore.

@oremanj
Copy link
Contributor Author

oremanj commented Mar 26, 2018

Thanks for the review!

  1. You're absolutely right; I like the idea of doing the pointer adjustment in polymorphic_type_hook, so the dynamic_cast<void*>-when-polymorphic is a special case of the general facility rather than being bolted on afterward.
  2. I'll write something up.
  3. dynamic_cast<void*>(pointer-to-non-polymorphic) is ill-formed (I just tested it on godbolt.org)
  4. My read is that non-polymorphic types will select the default implementation of polymorphic_type_hook, which can be determined at compile-time to have a get() that always returns nullptr, which makes the if (instance_type && ...) { ... } block dead code, which makes the whole function reduce to return type_caster_generic::src_and_type(src, typeid(itype), nullptr); which is equivalent to the old non-polymorphic version.

@oremanj
Copy link
Contributor Author

oremanj commented Apr 9, 2018

@jagerman, any updates on this? I believe I've responded to all of your concerns, but please let me know if there's more that you'd like me to take into account.

Copy link
Member

@jagerman jagerman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small edits for the documentation, but the implementation looks good to me aside from the detail namespace issue (see comment).

.. code-block:: cpp

enum class PetKind { Cat, Dog, Zebra };
struct Pet {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be helpful here to add a comment to the effect of:

struct Pet { // Not polymorphic: has no virtual methods

It has come up before that people coming to pybind from the python side didn't realize this distinction—which is definitely understandable as it's a fairly subtle rule!

std::string bark() const { return sound; }
};

namespace pybind11 { namespace detail {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this out of the detail namespace since it's explicitly provided to be user-facing. (Custom type casters are only in detail by accident, with a long-term goal of moving them out--probably to pybind11::caster, but definitely out of the detail namespace). I think defining it in just pybind11 is fine.

whatever runtime information is available to determine if its ``src``
parameter is in fact an instance of some class ``Derived`` that
inherits from ``Base``. If it finds such a ``Derived``, it sets ``type
= &typeid(Derived)`` and returns ``static_cast<const Derived*>(src)``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The static_cast seems unnecessary: I'd think in most cases simply returning src is going to be fine. I think this text could be simplified to "and returns a pointer to the derived instance." If your downcasting apparatus requires a pointer adjustment, that gives you enough information to implement it. But in cases like the example above a simple return src; is enough.

Otherwise, it just returns ``src``, leaving ``type`` at its default
value of nullptr. It's OK to return a type that pybind11 doesn't know
about; in that case, no downcasting will occur, and the original
``src`` pointer will be used with its static type ``Base*``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see some edge cases here that aren't really okay: for example where we have a A : B : C inheritance chain with A and B registered but C not. You'd probably want to end up with type = typeid(B) rather than type = typeid(C) to end up with the right type in Python (i.e. B rather than A).

So perhaps keep it in, but starting off with "If you set type to a type that pybind11 doesn't know about no downcasting will occur, and ..."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. FWIW, the same issue exists with ordinary polymorphic class bindings. IIRC, Boost.Python has a crazy registry-aware reimplementation of dynamic_cast so that they correctly downcast to B in that scenario, which I don't think is worth the complexity, but I agree that people already customizing the downcasting behavior might be able to recognize "I should only downcast to B here" and implement that for their specific class hierarchy.

about; in that case, no downcasting will occur, and the original
``src`` pointer will be used with its static type ``Base*``.

It is critical that the return value and ``type`` argument of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return value -> returned pointer

whose type is ``type``. If the hierarchy being exposed uses only
single inheritance, a simple ``return src;`` will achieve this just
fine, but in the general case, you must cast ``src`` to the
appropriate derived-class pointer before allowing it to be cast to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I suggested taking out static_cast above, you can put it back in here:

"in the general case, you must cast src to the appropriate derived-class pointer (e.g. using static_cast<Derived>(src)) before allowing it to be returned as a void *."

(NB: I also tweaked the end of that sentence re: the implicit void * cast).

type = src ? &typeid(*src) : nullptr;
return dynamic_cast<const void*>(src);
}
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditch the implementation code. Just a description that the default implementation of polymorphic_type_hook uses a dynamic_cast<void *> to downcast to the most-derived type is enough: curious readers can always read the code for the actual implementation.

@@ -298,7 +298,7 @@ inheritance relationship. This is reflected in Python:

>>> p = example.pet_store()
>>> type(p) # `Dog` instance behind `Pet` pointer
Pet # no pointer upcasting for regular non-polymorphic types
Pet # no pointer downcasting for regular non-polymorphic types
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. (I though I might have been responsible for that -- for some reason I've always has trouble remembering which way is "up" and "down" in an inheritance tree. But nope, it looks like @dean0x7d wrote that, so I guess I'm not alone in confusing the directions :) ).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's definitely confusing to me too - I looked it up on Wikipedia before making this change, just to be sure :-)

@@ -795,30 +827,25 @@ template <typename type> class type_caster_base : public type_caster_generic {

// Returns a (pointer, type_info) pair taking care of necessary RTTI type lookup for a
// polymorphic type. If the instance isn't derived, returns the non-RTTI base version.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment here needs updating. Perhaps:

    // Returns a (pointer, type_info) pair taking care of necessary type lookup for a polymorphic
    // type (using RTTI by default, but can be overridden by specializing polymorphic_type_hook).
    // If the instance isn't derived, returns the base version.

@oremanj
Copy link
Contributor Author

oremanj commented Apr 9, 2018

@jagerman Thanks for the feedback! I've updated per your requests; please advise if there are any other changes you'd like to see.

@jagerman
Copy link
Member

This looks good to me now. Cc @wjakob for any other comments before merging.

@wjakob
Copy link
Member

wjakob commented Apr 14, 2018

This looks great! -- Merging now.

@wjakob wjakob merged commit fd9bc8f into pybind:master Apr 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants