Add resources_lock_ lock_guards to avoid race condition when loading robot_description through topic #1451

Open
wants to merge 3 commits into
base: master

saikishor
Member

As reported in #1442, loading the robot_description through the topic can cause a segmentation fault or other undefined behavior. The real-time read and write methods are executed continuously, and when the robot description is received and the resource_manager is about to be initialized, there is no lock_guard on the recursive mutex resources_lock_. That lock should prevent the RM from executing the components while they are changing state or being loaded and initialized.

Fixes #1442
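
A minimal sketch of the locking pattern described above, not the actual diff. MiniResourceManager, Component, and simulate_topic_loading are hypothetical names for illustration only; resources_lock_ is the only name taken from this PR. The sketch assumes the cyclic read path takes the same recursive mutex as loading and state changes.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a hardware component (not the ros2_control class).
struct Component
{
  bool configured = false;
  int reads = 0;
  void read() { ++reads; }
};

// Hypothetical, simplified resource manager showing only the locking pattern.
class MiniResourceManager
{
public:
  // Non-RT side: called once the robot description has been received.
  void load_components(std::size_t count)
  {
    std::lock_guard<std::recursive_mutex> guard(resources_lock_);
    components_.clear();
    for (std::size_t i = 0; i < count; ++i) {
      components_.push_back(std::make_unique<Component>());
      // Re-acquires the same mutex on the same thread; allowed because it is recursive.
      set_component_state(i, true);
    }
  }

  void set_component_state(std::size_t index, bool configured)
  {
    std::lock_guard<std::recursive_mutex> guard(resources_lock_);
    components_.at(index)->configured = configured;
  }

  // RT side: called cyclically; returns how many components were read.
  std::size_t read()
  {
    std::lock_guard<std::recursive_mutex> guard(resources_lock_);
    std::size_t n = 0;
    for (auto & component : components_) {
      if (component->configured) {
        component->read();
        ++n;
      }
    }
    return n;
  }

private:
  std::recursive_mutex resources_lock_;
  std::vector<std::unique_ptr<Component>> components_;
};

// Runs a read loop in one thread while another thread loads the components,
// mimicking the "description arrives over the topic" case.
std::size_t simulate_topic_loading(std::size_t count)
{
  MiniResourceManager rm;
  std::atomic<bool> running{true};
  std::thread rt_loop([&rm, &running] {
    while (running.load()) {
      rm.read();
    }
  });
  rm.load_components(count);  // happens while rt_loop is already cycling
  running.store(false);
  rt_loop.join();
  const std::size_t seen = rm.read();
  assert(seen == count);
  return seen;
}
```

Without the guards in load_components and set_component_state, the read loop could iterate over components_ while it is being cleared and refilled, which is the kind of data race that can end in a segfault.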

@saikishor added the backport-humble and backport-iron labels Mar 19, 2024
Contributor

@christophfroehlich left a comment

LGTM

@firesurfer
Contributor

@saikishor I just tested this PR and it seems like the threading/lock issue is gone.
I couldn't test it further, as I am running iron on my system and something about the default controllers seems to have changed between iron and rolling.

It cannot load the JointStateBroadcaster (what(): Failed to load library /opt/ros/iron/lib/libjoint_state_broadcaster.so), which is installed from the package repository.
But I guess that is to be expected.

@saikishor
Member Author

> It cannot load the JointStateBroadcaster (what(): Failed to load library /opt/ros/iron/lib/libjoint_state_broadcaster.so), which is installed from the package repository.
> But I guess that is to be expected.

I believe this is related to the ABI/API breaking changes. If you are using this branch directly in rolling, then you would also need to use the master branch of ros2_controllers, control_msgs, and control_toolbox packages

@bmagyar
Member

bmagyar commented Jun 25, 2024

There's a conflict to resolve sir!


codecov bot commented Jun 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 87.79%. Comparing base (86dd7d2) to head (9360c10).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1451   +/-   ##
=======================================
  Coverage   87.79%   87.79%           
=======================================
  Files         102      102           
  Lines        8764     8766    +2     
  Branches      787      787           
=======================================
+ Hits         7694     7696    +2     
  Misses        792      792           
  Partials      278      278           
Flag        Coverage Δ
unittests   87.79% <100.00%> (+<0.01%) ⬆️

Files                                         Coverage Δ
hardware_interface/src/resource_manager.cpp   73.63% <100.00%> (+0.07%) ⬆️

@saikishor
Member Author

> There's a conflict to resolve sir!

@bmagyar Fixed!

@destogl
Member

destogl commented Jun 25, 2024

You are here protecting loading and set_state; from your explanation, you also want to protect read/write access. So I am not sure how this fix is related, but I am probably missing some detail.

Besides that, something for the future. We should probably have a similar approach with two lists for the controller and also for hardware to enable more dynamics everywhere.

@saikishor
Member Author

> You are here protecting loading and set_state; from your explanation, you also want to protect read/write access. So I am not sure how this fix is related, but I am probably missing some detail.
>
> Besides that, something for the future. We should probably have a similar approach with two lists for the controller and also for hardware to enable more dynamics everywhere.

@destogl Yes. The problem is that the resource manager was trying to execute the components while their state was being changed, so the set_state calls need to be protected. This doesn't happen when the description comes from the parameter, because the components are initialized at construction time and only then do the read and write cycles start. With the topic, however, the components are loaded in a non-RT thread while the RT thread is already running read and write, and this causes the segfault. @firesurfer confirmed that the fix worked for him, so I think it makes sense to have these lock_guards.

I agree with you regarding the 2 lists. We can try to implement it in the near future.

Thank you!

Development

Successfully merging this pull request may close these issues.

Segfault in hardware interface when reading robot_description from topic
5 participants