Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

smart_holder trampoline shared_from_this support #3023

Merged
merged 67 commits into from
Jun 30, 2021

Conversation

rwgk
Copy link
Collaborator

@rwgk rwgk commented May 27, 2021

Before this PR the smart_holder type_caster implementation did not take std::enable_shared_from_this into account at all. As long as no trampolines are involved, that was fine, but the "inheritance slicing" issue (#1333) resurfaces

  • if shared_from_this uses its supporting weak_ptr as a backdoor to the shared_ptr inside smart_holder,
  • which does not, and cannot, keep the trampoline Python object alive,
  • because that would be a reference cycle, undetectable by gc.

Unfortunately the approach pioneered under PR #2839, and adopted in the smart_holder code, does not play nicely with shared_from_this. Here it gets complicated:

  • Before C++17, the shared_from_this behavior was not even fully defined for a situation in which multiple shared_ptr objects exist simultaneously (which is valid if custom deleters are involved, and exactly what the smart_holder code does).
  • C++17 made the definition more rigorous: shared_from_this will retrieve the first shared_ptr object created (until it expires):

How can this be made to play nicely with pybind11 trampolines?

Omitting a very long story of failed experiments in the commit history of this PR, the key trick is:

  • Make the shared_ptr inside smart_holder invisible to the shared_from_this mechanism, by casting the raw pointer to void * before passing it to the shared_ptr constructor (or shared_ptr::reset, to be precise wrt to the smart_holder implementation).

With that, the PR #2839 approach works again, although it also introduces a subtle but important requirement:

  • shared_from_this will only succeed (vs throwing a bad_weak_ptr exception) while C++ code, outside of the smart_holder, owns a shared_ptr object.

That means, a situation that boils down to the following can (if no other non-smart_holder shared_ptr exists at call time) trigger a bad_weak_ptr RuntimeError:

long foo(const Bar &obj) { return obj.shared_from_this().use_count(); }
...
m.def("foo", foo);

An obvious way to solve the problem is to redefine foo():

long foo(const std::shared_ptr<const Bar> &obj) { return obj.use_count(); }

But if that's not an option (e.g. because it is an external dependency that cannot easily be modified), there is an easy alternative trick:

m.def("foo", [](const std::shared_ptr<const Bar>& obj) { return foo(*obj); });

The lambda function keeps the shared_ptr alive for the duration of the foo() function call. Problem solved.

This PR goes one step further, although that's mostly a nicety. The smart_holder now also keeps a weak_ptr to the shared_ptr that keeps the trampoline Python object alive, and uses the weak_ptr to generate shared_ptr objects in repeated calls. Therefore the use_counts of the returned shared_ptr objects are in line with naive expectations.

It is possible that this PR does not solve all possible corner cases of shared_from_this interacting with the smart_holder implementation, but at least it goes from "clearly incorrect behavior" to "mostly correct". Systematic test coverage for all corner cases, with fixes if any, is very involved and will be done separately.

@rwgk
Copy link
Collaborator Author

rwgk commented Jun 10, 2021

Closing/Opening to work around flake (see #2995).

@rwgk rwgk closed this Jun 10, 2021
@rwgk rwgk reopened this Jun 10, 2021
@rwgk rwgk force-pushed the sh_shared_from_this branch 3 times, most recently from 37dbb55 to 36299d1 Compare June 23, 2021 00:50
rwgk added a commit to rwgk/rwgk_tbx that referenced this pull request Jun 24, 2021
@rwgk rwgk changed the title shared_from_this experiments smart_holder trampoline shared_from_this support Jun 24, 2021
@rwgk rwgk marked this pull request as ready for review June 24, 2021 19:12
@rwgk rwgk requested a review from henryiii as a code owner June 24, 2021 19:12
@rwgk rwgk force-pushed the sh_shared_from_this branch 3 times, most recently from 5918b2e to 066edb0 Compare June 29, 2021 23:02
rwgk added 18 commits June 29, 2021 16:16
…f `pointee_depends_on_holder_owner`, but currently only for `from_raw_ptr_take_ownership`.
…t to exercise the path through `cast`. The path through init.h is missing a test that would fail if the flag is incorrect. The plan is to systematically cover all situations (there are many that are not exercised for shared_from_this).
…HA download issue affecting all MSVC builds."

This reverts commit f4e1112.

It didn't help.
https://github.com/deepmind/open_spiel/tree/bbfb0259b5d26c1e3f05000f1f4a05b7ad60393e/open_spiel/python/mfg/algorithms
    best_response_value_test
    distribution_test
    fictitious_play_test
    greedy_policy_test
    nash_conv_test
    policy_value_test

CAVEAT: The fix is NOT covered by pybind11 unit tests.
Completes unit test coverage for the changed code in smart_holder_type_casters.h.
…le changes in smart_holder_type_caster_load, trying to work on weak_ptr for shared_ptr_trampoline_self_life_support, but that didn't work out. Manually fully leak-checked again.
@rwgk
Copy link
Collaborator Author

rwgk commented Jun 30, 2021

Merging comment:

  • This PR was run through the Google internal global testing.
  • Ignoring the valgrind errors observed in the last iteration here, assuming they are false positives. Google-internally this PR was tested with ASAN.
  • This PR was reviewed by two Googlers and will be reviewed again when it is merged internally.

@rwgk rwgk merged commit 93169cc into pybind:smart_holder Jun 30, 2021
@rwgk rwgk deleted the sh_shared_from_this branch June 30, 2021 14:04
@github-actions github-actions bot added the needs changelog Possibly needs a changelog entry label Jun 30, 2021
@rwgk rwgk removed the needs changelog Possibly needs a changelog entry label Jun 30, 2021
@rwgk rwgk mentioned this pull request Jul 1, 2021
@virtuald
Copy link
Contributor

virtuald commented Jan 4, 2022

It would be good if the description of the various subtle pieces involved with using enable_shared_from_this were put into the README.

OpenSpiel pushed a commit to google-deepmind/open_spiel that referenced this pull request Aug 21, 2022
…ython games.

The pybind smart_holder logic will create a shared_ptr for Python-created objects only when required to do so. This means that if a Python-implemented game is passed from Python to C++ as Game& and then a C++ function calls shared_from_this() on it, this will fail unless there's already a C++ shared_ptr for some other reason.

The fix is either:
a - Amend the C++ interface to take shared_ptr instead of refs
b - Introduce a lambda function in the pybind interface, taking a shared_ptr and dereferencing it to call the ref-based C++ implementation

Either option will result in pybind creating a shared_ptr for us before calling our C++ code.

To minimize disruption to existing code, and forestall future failures, I've applied change (b) everywhere I could see, even though not every case was failing (because not every case called shared_from_this in the C++ implementation).

For further details of the relevant pybind internals, see pybind/pybind11#3023

fixes: #905
PiperOrigin-RevId: 469016236
Change-Id: I9467eeb992f3463a432cc7060c46404d2bbd4638
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants