Thread-safe access to SubscriptionData #288

Merged
merged 6 commits into from
Oct 7, 2024

Member

@Yadunund Yadunund commented Oct 2, 2024

This PR introduces a SubscriptionData class that manages the lifetime of Zenoh artifacts. It makes access to class members thread-safe where possible. Some z_closure callbacks still work with type-erased raw pointers, but this will be addressed when we migrate to zenoh-cpp.
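
For illustration, a minimal sketch of the general pattern described above: a class that owns its state, guards every member access with a mutex, and shuts itself down in its destructor. `GuardedSubscription` and all of its members are hypothetical names chosen for this example; it is not the SubscriptionData class from this PR and it does not touch any Zenoh API.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for illustration; not the class added by this PR.
class GuardedSubscription
{
public:
  explicit GuardedSubscription(std::string topic)
  : topic_(std::move(topic)) {}

  // RAII: the destructor marks the object as shut down.
  ~GuardedSubscription() {shutdown();}

  // topic_ is const after construction, so reading it needs no lock.
  const std::string & topic() const {return topic_;}

  // Returns false once the subscription has been shut down.
  bool add_message()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    if (is_shutdown_) {
      return false;
    }
    ++pending_;
    return true;
  }

  std::size_t pending() const
  {
    std::lock_guard<std::mutex> lock(mutex_);
    return pending_;
  }

  void shutdown()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    is_shutdown_ = true;
  }

private:
  mutable std::mutex mutex_;  // guards the two mutable members below
  const std::string topic_;
  std::size_t pending_{0};
  bool is_shutdown_{false};
};
```

In this sketch, callers never see the members directly, so every read and write goes through a locked method.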

It seems that it is not feasible to store rmw_node_t * as a type-erased void * in rmw_subscription_t->data like we did with rmw_publisher_t->data.

The problem is that the rmw_subscriptions_t passed to rmw_wait contains only an array of the type-erased subscription handles (i.e., rmw_subscription_t->data), so there is no way to get the SubscriptionData::SharedPtr here:

auto sub_data =
  static_cast<rmw_zenoh_cpp::rmw_subscription_data_t *>(subscriptions->subscribers[i]);

That is, we cannot traverse the hierarchy from rmw_subscription->data->rmw_node_t->context->impl->get_node_data()->get_sub_data().

The only option we have is to store a raw pointer to SubscriptionData in rmw_subscription_t->data, but this risks it becoming a dangling pointer once the last SubscriptionData::SharedPtr is destroyed.
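
A minimal sketch of the limitation, under stated assumptions: `SubData`, `Handle`, and `count_ready` are hypothetical stand-ins for SubscriptionData, rmw_subscription_t, and the loop in rmw_wait. Only the `void *` values reach the waiting code, so a cast yields a raw pointer and nothing more.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins; the real types live in rmw and rmw_zenoh_cpp.
struct SubData
{
  int queued{0};
};

struct Handle
{
  void * data{nullptr};  // type-erased, in the spirit of rmw_subscription_t::data
};

// What a wait-style function sees: only the erased void * values.
int count_ready(const std::vector<void *> & erased)
{
  int ready = 0;
  for (void * ptr : erased) {
    // The cast recovers a raw pointer only. The owning shared_ptr
    // cannot be reached from here, so lifetime is not extended.
    auto * sub = static_cast<SubData *>(ptr);
    assert(sub != nullptr);
    if (sub->queued > 0) {
      ++ready;
    }
  }
  return ready;
}
```

If the owner creates the object with `std::make_shared<SubData>()`, stores `owner.get()` in `Handle::data`, and later resets the shared pointer, `Handle::data` still holds the old address. Passing it to `count_ready` at that point would be undefined behavior, which is the dangling-pointer risk described above.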

Collaborator

@clalancette clalancette left a comment

This generally looks good. I've left some things to be improved, and it needs to be rebased. Once those are done, I'll run some more extensive tests on it.

@@ -69,12 +69,14 @@ rmw_publisher_event_init(
rmw_event->event_type = event_type;

// Register the event with graph cache.
std::weak_ptr<rmw_zenoh_cpp::PublisherData> data_wp = pub_data;
Collaborator

Is this a separate bugfix?

Member Author

Capturing the weak_ptr ensures that if the event callback executes for some reason after the last reference to PublisherData::SharedPtr goes out of scope, we exit the callback safely. This is what I want to do with some of the Zenoh callbacks as well once we switch to zenoh-cpp.
I would have liked to do the same in rmw_subscription_event_init, but because we store a raw pointer to SubscriptionData in rmw_subscription_t->data and therefore cannot retrieve the corresponding shared_ptr, we are left with the potential to dereference a dangling pointer in the scenario above. That scenario should never occur given the implementation of rcl*.
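
A minimal sketch of the weak_ptr capture idea, assuming a hypothetical `PubData` struct and `make_event_callback` helper; neither is taken from rmw_zenoh_cpp. The callback holds only a weak reference and returns early if the data is already gone.

```cpp
#include <functional>
#include <memory>

// Hypothetical stand-in for PublisherData.
struct PubData
{
  int events_seen{0};
};

// Builds a callback that holds only a weak reference to the data.
// The callback returns true if it ran against live data.
std::function<bool()> make_event_callback(const std::shared_ptr<PubData> & data)
{
  std::weak_ptr<PubData> data_wp = data;
  return [data_wp]() {
      // lock() yields an empty shared_ptr once the last owner is gone.
      auto locked = data_wp.lock();
      if (locked == nullptr) {
        return false;  // data already destroyed: exit safely
      }
      ++locked->events_seen;
      return true;
    };
}
```

While a `std::shared_ptr<PubData>` is alive, invoking the callback increments `events_seen`; after the last owner is reset, the same callback returns `false` without touching freed memory. A raw pointer capture would offer no such check.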

Signed-off-by: Yadunund <yadunund@intrinsic.ai>
…imitation

Signed-off-by: Yadunund <yadunund@intrinsic.ai>
Signed-off-by: Yadunund <yadunund@intrinsic.ai>
Signed-off-by: Yadunund <yadunund@intrinsic.ai>
Contributor

ahcorde commented Oct 7, 2024

ros2.repos

rmw_zenoh/yadu/raii-sub

| ROS Package | Failing test name | Failing test output |
| --- | --- | --- |
| rcl | None | None |
| rcl_action | test_graph__rmw_zenoh_cpp | segfault |
| rcl_lifecycle | None | None |
| rcl_yaml_param | None | None |
| rclcpp | test_intra_process_manager | Failed (Check logs) |
| rclcpp | test_node_interfaces__node_graph | Failed (Check logs) |
| rclcpp | test_publisher | Failed (Check logs) |
| rclcpp | test_events_executor | Timeout (Check logs) |
| rclcpp | test_wait_set | Failed (Check logs) |
| rclcpp_action | test_server | Failed (Check logs) |
| rclcpp_components | None | None |
| rclcpp_lifecycle | None | - |
| rmw_zenoh_cpp | None | None |
| test_cli | None | None |
| test_cli_remap | test_cli_remapping | TBD |

There are some new segfaults:

  • rcl_action -> test_graph__rmw_zenoh_cpp
  • rclcpp_action -> test_server

@clalancette
Collaborator

This generally looks good to me, and passes my tests locally. We should fix the failing rcl_action and rclcpp_action tests before merging, though.

Signed-off-by: Yadunund <yadunund@intrinsic.ai>
Member Author

Yadunund commented Oct 7, 2024

@ahcorde thanks a lot for running those tests ahead of time and reporting the issues. I've fixed them in 2bd7e13.

Co-authored-by: Alejandro Hernández Cordero <ahcorde@gmail.com>
Signed-off-by: yadunund <yadunund@gmail.com>
@Yadunund Yadunund merged commit 439d6dc into rolling Oct 7, 2024
8 checks passed
@Yadunund Yadunund deleted the yadu/raii-sub branch October 7, 2024 19:42