Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rclcpp::Context::shutdown docs imply on_shutdown callbacks are called in the order added, but implementation iterates an unordered_set #2096

Closed
mmattb opened this issue Feb 1, 2023 · 9 comments

Comments

@mmattb
Copy link

mmattb commented Feb 1, 2023

Bug report

Required Info:

  • Operating System:
    • All
  • Installation type:
    • All
  • Version or commit hash:
  • DDS implementation:
    • All
  • Client library (if applicable):
    • rclcpp

Steps to reproduce issue

This is a rough sketch of what we are trying in nav2. I doubt anything this complicated is needed. Probably just making two arbitrary callbacks and registering them in different orders will repro the issue. You just need to beat the hash algorithm basically.

class MyChild : public rclcpp::LifecycleNode {
   MyChild() {
       register_my_rcl_preshutdown_callback();
   }

  ...
};

class MyDescendant : public MyChild
{
  MyDescendant() : MyChild(blah) {

    // Super constructor already ran by this point, so the MyDescendant
    // instance should have its preshutdown callback called before my_member's
    // preshutdown callback, but we are sometimes seeing them in reverse order!
    auto my_member = std::make_shared<MyChild>(blah);

  }

  ...
};

Actual behavior

In nav2 we are working on a fix which uses the preshutdown callback mechanism. A pair of callbacks are sometimes firing in the reverse order from which they were added.

It looks like both preshutdown callbacks and on shutdown callbacks use an unordered_set:

std::unordered_set<std::shared_ptr<OnShutdownCallback>> on_shutdown_callbacks_;

And they are executed via an iteration of those sets, which I believe means they will execute in some arbitrary hash order?

for (const auto & callback : pre_shutdown_callbacks_) {

Expected behavior

But the docs state pretty clearly that callbacks will be executed in the order they are added:

* These callbacks will be called in the order they are added as the second

* - each on_shutdown callback is called, in the order that they were added

Seems like the docs are mismatched to the implementation. Unfortunately it may be annoying to reproduce since this involves a hash function... Might need to make a custom hash function to reproduce it.

Additional information

Can we please also add another line to the Context::shutdown docs regarding the preshutdown callbacks? Thanks to the people who added that callback; super handy!

@SteveMacenski
Copy link
Collaborator

Maybe a dumb question, but why not just use something ordered std::set or std::vector?

@clalancette
Copy link
Contributor

Maybe a dumb question, but why not just use something ordered std::set or std::vector?

It used to be a std::vector, but was changed to a std::unordered_set in #1639 (based on a comment of mine). The basic reason is that when removing callbacks, we have to iterate through the entire list looking for the specific one to remove, which could be a performance problem.

That said, you could argue that removal isn't a common operation, so we should prioritize being consistent here. I could buy that argument. @Barry-Xu-2018 what do you think?

@SteveMacenski
Copy link
Collaborator

SteveMacenski commented Feb 1, 2023

Unfortunately we have to remove it due to some oddities with the shutdown procedures: ros-navigation/navigation2#3377 (comment)

This is to prevent a use after free in the case that the callback is called after the destructor fires. At that point the instance has evaporated, so there is no 'this' to bind the call to -> seg fault. I found this out the hard way :)

Basically, the preshutdown callbacks don't consider if the object registered even still exists (which for components, they can be unloaded or removed at any time) so the system seg faults. The preshutdown callback needs to be removed when we destroy our nodes to deal with that issue. I suppose the bigger issue is that we seg fault in the first place (ticket to follow) but either way, the shutdown callbacks should be executed deterministically when I add potentially many of them in a single node at different levels of object composition.

Though, this is only really a problem we functionally run into because other libraries are having issue on clean shutdown. The docs say that it should be ordered, however. So if that's not the case and not intended to be fixed, the docs should be updated to reflect it. I think though that calling any registered callbacks in the ROS 2 system should always be ordered, as a matter of philosophy of expected behavior.

I believe an std::set is what you'd want then. https://en.cppreference.com/w/cpp/container/set/find Find is log and contents are ordered.

@alsora
Copy link
Collaborator

alsora commented Feb 1, 2023

IMO, we shouldn't be worried about the performance of adding/removing callbacks there, as it's seems something quite far from the fast-path that we want as optimized as possible.

@fujitatomoya
Copy link
Collaborator

I would also add test case for this.

@Barry-Xu-2018
Copy link
Collaborator

Barry-Xu-2018 commented Feb 2, 2023

If we only use std::set, I think it cannot make sure the order same as registered order.

Reference to ParameterEventCallback, it uses std::unordered_map and std::list to implement. std::list is used to make sure read order is the same as register order.

// Map container for registered parameters
std::unordered_map<
std::pair<std::string, std::string>,
CallbacksContainerType,
StringPairHash
> parameter_callbacks_;
std::list<ParameterEventCallbackHandle::WeakPtr> event_callbacks_;

Considering the number of shutdown callback isn't large, std::vector is enough. The performance overhand for remove may be ignored (While deleting, scan container to find deleted item.)

@clalancette agree with your thought.

If no more comments, I will use vector to fix this bug.

@fujitatomoya, I will add corresponding test case.

@mmattb mmattb changed the title rclcpp::Context::shutdown docs imply on_shutdown callbacks are called in the order added, but implementation iterates and unordered_set rclcpp::Context::shutdown docs imply on_shutdown callbacks are called in the order added, but implementation iterates an unordered_set Feb 2, 2023
@fujitatomoya
Copy link
Collaborator

@mmattb this has been addressed by #2097, can we close this issue?

@mmattb
Copy link
Author

mmattb commented Feb 7, 2023

@fujitatomoya I haven't tested it, but it looks directionally correct. And it comes complete with unit tests! I'm fine with closing this.

@SteveMacenski
Copy link
Collaborator

Thanks for the quick turn around!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants