Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not clear entities callbacks on destruction #2002

Merged

Conversation

mauropasse
Copy link
Collaborator

Removing these not-necessary clearings of listener callbacks, since the objects are being destroyed anyway.

This also fixes a bug (use-after-free) happening during services destruction:

ServiceBase::~ServiceBase()
{
clear_on_new_request_callback();
}

On detail:

ServiceBase::~ServiceBase()
{
  // Here, implicitly, the deleter of service_handle_ was being called (*),
  // freeing the service listener (**):
  // ~service_handle_
  
  // Then, the following was called, which tries to access the freed mem (thus seg-fault) (***)
  clear_on_new_request_callback();
}

(*)

service_handle_ = std::shared_ptr<rcl_service_t>(
new rcl_service_t, [handle = node_handle_, service_name](rcl_service_t * service)
{
if (rcl_service_fini(service, handle.get()) != RCL_RET_OK) {
RCLCPP_ERROR(
rclcpp::get_node_logger(handle.get()).get_child("rclcpp"),
"Error in destruction of rcl service handle: %s",
rcl_get_error_string().str);
rcl_reset_error();
}
delete service;
});

(**) https://github.com/ros2/rmw_fastrtps/blob/dbee45ec67104db51b96da9693466f7a14e2f000/rmw_fastrtps_shared_cpp/src/rmw_service.cpp#L132

(***) https://github.com/ros2/rmw_fastrtps/blob/dbee45ec67104db51b96da9693466f7a14e2f000/rmw_fastrtps_shared_cpp/src/rmw_service.cpp#L170

FYI @alsora

Signed-off-by: Mauro Passerino mpasserino@irobot.com

Mauro Passerino added 2 commits August 24, 2022 15:08
Removing these clearings since they were not necessary,
since the objects are being destroyed anyway.

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>
Signed-off-by: Mauro Passerino <mpasserino@irobot.com>
@@ -32,10 +32,6 @@ ServiceBase::ServiceBase(std::shared_ptr<rcl_node_t> node_handle)
node_logger_(rclcpp::get_node_logger(node_handle_.get()))
{}

ServiceBase::~ServiceBase()
{
clear_on_new_request_callback();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::shared_ptr<rcl_service_t> service_handle_ will be deleted after clear_on_new_request_callback(), right? All member variables are alive in the body of the destructor. So this is something like optimization?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, usually that's the order. But I found a strange case with lifecycle nodes destruction, where it happens in the opposite order!
Anyway, even if the order were correct, IMO there is no real point in clearing those callbacks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think that this should not be called here, because it might generate the exception in dtor.

see,

https://github.com/ros2/rmw/blob/rolling/rmw/include/rmw/rmw.h#L3118-L3131

if the callback is NULL, as specification, it is supposed to return RMW_RET_INVALID_ARGUMENT.
and eventually rclcpp will generate rclcpp::exceptions::throw_from_rcl_error.

test is pass cz rmw_fastrtps always return RCL_RET_OK. (this is not correct behavior based on the doc.)

if we use RTI Coneext DDS, the test fails as following, because it returns RMW_RET_UNSUPPORTED.

[ RUN      ] TestService.on_new_request_callback
RTI Connext DDS Non-commercial license is for academic, research, evaluation and personal use only. USE FOR COMMERCIAL PURPOSES IS PROHIBITED. See RTI_LICENSE.TXT for terms. Download free tools at rti.com/ncl. License issued to Non-Commercial User license@rti.com For non-production use only.
Expires on 00-jan-00 See www.rti.com for more information.
[ERROR] [1661535719.598830456] [rmw_connextdds]: rmw_service_set_on_new_request_callback not implemented
unknown file: Failure
C++ exception with description "failed to set the on new request callback for service: rmw_service_set_on_new_request_callback not implemented, at /root/ros2_ws/colcon_ws/src/ros2/rmw_connextdds/rmw_connextdds_common/src/common/rmw_listener.cpp:45" thrown in the test body.
[  FAILED  ] TestService.on_new_request_callback (315 ms)

after all, i think after this PR is merged, we need to re-check the listener rmw implementation and extend test for all rmw tier 1 implementation. what do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good, you found a better reason for removing these calls! And yes I agree, tests should be extended for these calls to return the other possible values too. When I found some time I'll add them. You have any other ideas for testing?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can manage the following with different PR,

  • test all tier 1 rmw implementations with listener APIs for rclcpp (rclpy does not support?)
  • bugfix rmw implementations if necessary, i think rmw_connextdds is the correct implementation.
  • test would be required to adjust accordingly.

@@ -32,10 +32,6 @@ ServiceBase::ServiceBase(std::shared_ptr<rcl_node_t> node_handle)
node_logger_(rclcpp::get_node_logger(node_handle_.get()))
{}

ServiceBase::~ServiceBase()
{
clear_on_new_request_callback();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think that this should not be called here, because it might generate the exception in dtor.

see,

https://github.com/ros2/rmw/blob/rolling/rmw/include/rmw/rmw.h#L3118-L3131

if the callback is NULL, as specification, it is supposed to return RMW_RET_INVALID_ARGUMENT.
and eventually rclcpp will generate rclcpp::exceptions::throw_from_rcl_error.

test is pass cz rmw_fastrtps always return RCL_RET_OK. (this is not correct behavior based on the doc.)

if we use RTI Coneext DDS, the test fails as following, because it returns RMW_RET_UNSUPPORTED.

[ RUN      ] TestService.on_new_request_callback
RTI Connext DDS Non-commercial license is for academic, research, evaluation and personal use only. USE FOR COMMERCIAL PURPOSES IS PROHIBITED. See RTI_LICENSE.TXT for terms. Download free tools at rti.com/ncl. License issued to Non-Commercial User license@rti.com For non-production use only.
Expires on 00-jan-00 See www.rti.com for more information.
[ERROR] [1661535719.598830456] [rmw_connextdds]: rmw_service_set_on_new_request_callback not implemented
unknown file: Failure
C++ exception with description "failed to set the on new request callback for service: rmw_service_set_on_new_request_callback not implemented, at /root/ros2_ws/colcon_ws/src/ros2/rmw_connextdds/rmw_connextdds_common/src/common/rmw_listener.cpp:45" thrown in the test body.
[  FAILED  ] TestService.on_new_request_callback (315 ms)

after all, i think after this PR is merged, we need to re-check the listener rmw implementation and extend test for all rmw tier 1 implementation. what do you think?

@alsora
Copy link
Collaborator

alsora commented Aug 30, 2022

CI

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@fujitatomoya
Copy link
Collaborator

@mauropasse can you check the CI failure? i think those two are unrelated, but need to double-check.

Needed since QOSEventHandlerBase does not own
the pub/sub listeners. So the QOSEventHandler
can be destroyed while the corresponding listeners
are still alive, so we need to clear these callbacks.

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>
@mauropasse
Copy link
Collaborator Author

@fujitatomoya I reverted the changes for clearing callbacks on ~QOSEventHandlerBase.
This object is created when creating a Publisher/Subscriber, so it's destructor will be called before the rmw publisher/subscribers/etc are destroyed.
So the events callbacks need to be cleared on QOSEventHandlerBase destructor. No risk of rmw entities destroyed before, since they are created prior of this object.

@alsora
Copy link
Collaborator

alsora commented Sep 20, 2022

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@mauropasse
Copy link
Collaborator Author

I'll add a comment about why this clearing is needed here, and not in the other entities, but I'll wait for all green CI.

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>
@fujitatomoya
Copy link
Collaborator

@alsora can you review this? i am good to go with it.

@alsora
Copy link
Collaborator

alsora commented Sep 22, 2022

Looks good to me.
We have been testing this PR for the last couple of days and it seems to fix the problem.
I'm going to go ahead and merge it.

@alsora alsora merged commit 145933b into ros2:rolling Sep 22, 2022
@fujitatomoya
Copy link
Collaborator

@alsora thanks!

@mauropasse mauropasse deleted the mauro/not-clear-callbacks-on-destruction branch September 23, 2022 09:35
alsora pushed a commit to irobot-ros/rclcpp that referenced this pull request Apr 29, 2023
* Do not clear entities callbacks on destruction

Removing these clearings since they were not necessary,
since the objects are being destroyed anyway.

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

* Fix CI

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

* Restore clear_on_ready_callback on ~QOSEventHandlerBase

Needed since QOSEventHandlerBase does not own
the pub/sub listeners. So the QOSEventHandler
can be destroyed while the corresponding listeners
are still alive, so we need to clear these callbacks.

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

* Add coment on clearing callback for QoS event

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>
Co-authored-by: Mauro Passerino <mpasserino@irobot.com>
alsora pushed a commit to irobot-ros/rclcpp that referenced this pull request Jan 30, 2024
* Do not clear entities callbacks on destruction

Removing these clearings since they were not necessary,
since the objects are being destroyed anyway.

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

* Fix CI

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

* Restore clear_on_ready_callback on ~QOSEventHandlerBase

Needed since QOSEventHandlerBase does not own
the pub/sub listeners. So the QOSEventHandler
can be destroyed while the corresponding listeners
are still alive, so we need to clear these callbacks.

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

* Add coment on clearing callback for QoS event

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>
Co-authored-by: Mauro Passerino <mpasserino@irobot.com>
apojomovsky pushed a commit to apojomovsky/rclcpp that referenced this pull request Jun 6, 2024
* Do not clear entities callbacks on destruction

Removing these clearings since they were not necessary,
since the objects are being destroyed anyway.

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

* Fix CI

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

* Restore clear_on_ready_callback on ~QOSEventHandlerBase

Needed since QOSEventHandlerBase does not own
the pub/sub listeners. So the QOSEventHandler
can be destroyed while the corresponding listeners
are still alive, so we need to clear these callbacks.

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

* Add coment on clearing callback for QoS event

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>
Co-authored-by: Mauro Passerino <mpasserino@irobot.com>
apojomovsky pushed a commit to apojomovsky/rclcpp that referenced this pull request Jun 6, 2024
* Do not clear entities callbacks on destruction

Removing these clearings since they were not necessary,
since the objects are being destroyed anyway.

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

* Fix CI

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

* Restore clear_on_ready_callback on ~QOSEventHandlerBase

Needed since QOSEventHandlerBase does not own
the pub/sub listeners. So the QOSEventHandler
can be destroyed while the corresponding listeners
are still alive, so we need to clear these callbacks.

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

* Add coment on clearing callback for QoS event

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>
Co-authored-by: Mauro Passerino <mpasserino@irobot.com>
apojomovsky pushed a commit to apojomovsky/rclcpp that referenced this pull request Jun 21, 2024
* Do not clear entities callbacks on destruction

Removing these clearings since they were not necessary,
since the objects are being destroyed anyway.

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

* Fix CI

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

* Restore clear_on_ready_callback on ~QOSEventHandlerBase

Needed since QOSEventHandlerBase does not own
the pub/sub listeners. So the QOSEventHandler
can be destroyed while the corresponding listeners
are still alive, so we need to clear these callbacks.

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

* Add coment on clearing callback for QoS event

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>
Co-authored-by: Mauro Passerino <mpasserino@irobot.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants