Fix multi-threaded race condition in client.call_async #871

Merged · 5 commits merged into ros2:master on Jan 14, 2022

Conversation

@augustelalande (Contributor) commented Jan 4, 2022

Proposal to fix the multi-threaded race condition in client.call_async, as described in #870.
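
For context, the race is between sending the request and registering its Future for the executor to complete: with a multi-threaded executor, the response can arrive and be handled before the future has been stored under its sequence number. A minimal sketch of the lock-based fix being proposed, with simplified names (`ClientSketch`, `_pending_requests`, `handle_response`, etc. are illustrative, not the exact merged code):

```python
import threading
from concurrent.futures import Future  # stand-in for rclpy.task.Future


class ClientSketch:
    """Illustration only; loosely mirrors rclpy.client.Client."""

    def __init__(self, rcl_client):
        self._rcl_client = rcl_client        # wraps the underlying rcl client
        self._pending_requests = {}          # sequence_number -> Future
        self._lock = threading.Lock()        # guards send + registration

    def call_async(self, request):
        # Holding the lock across send_request() and the dict insertion closes
        # the window in which another thread could see the response arrive
        # before the future exists in _pending_requests.
        with self._lock:
            sequence_number = self._rcl_client.send_request(request)
            future = Future()
            self._pending_requests[sequence_number] = future
        return future

    def handle_response(self, sequence_number, response):
        # Executor-side path takes the same lock, so it can never look up a
        # sequence number that call_async() has not finished registering.
        with self._lock:
            future = self._pending_requests.pop(sequence_number, None)
        if future is not None:
            future.set_result(response)
```

Any other place that reads or mutates the pending-request map would need to take the same lock for the fix to hold.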

@augustelalande augustelalande changed the title Fix multi-threaded race condition in client.call Fix multi-threaded race condition in client.call_async Jan 4, 2022
@fujitatomoya (Collaborator) left a comment

I think this makes sense to avoid the possible race condition in theory.

Review comment on rclpy/rclpy/client.py (outdated, resolved)
@fujitatomoya (Collaborator)

@ivanpauno @jacobperron could you take a look at this when you have time?

@fujitatomoya (Collaborator)

CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@jacobperron (Member)

Tricky..

I suppose there are similar threading issues with the action client, wherever we are making service requests, for instance

self._pending_goal_requests[sequence_number] = future

has a race with

for future in waitable._futures:

It's possible the executor tries to process a request before the future is available in the list.

It's too bad we have to introduce a lock at the scope of sending the request, but I can't think of another way. Hopefully it won't have a significant impact on performance.
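
Purely for illustration, the same lock-around-send pattern would apply to the action client as well; this hypothetical sketch only loosely follows the two lines quoted above and is not rclpy's actual ActionClient code:

```python
import threading
from concurrent.futures import Future  # stand-in for rclpy.task.Future


class ActionClientSketch:
    """Hypothetical illustration of the analogous fix, not this PR's diff."""

    def __init__(self, rcl_action_client):
        self._rcl_action_client = rcl_action_client
        self._pending_goal_requests = {}   # sequence_number -> Future
        self._futures = []                 # what the executor iterates
        self._lock = threading.Lock()

    def send_goal_async(self, goal_msg):
        # Caller thread: registration happens atomically with the send, so the
        # executor can never see a goal response whose future is missing.
        with self._lock:
            sequence_number = self._rcl_action_client.send_goal_request(goal_msg)
            future = Future()
            self._pending_goal_requests[sequence_number] = future
            self._futures.append(future)
        return future

    def process_goal_response(self, sequence_number, response):
        # Executor thread: reads of the pending-request map take the same lock.
        with self._lock:
            future = self._pending_goal_requests.pop(sequence_number, None)
        if future is not None:
            future.set_result(response)
```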

@sloretz Thoughts?

@augustelalande (Contributor, Author)

> I suppose there are similar threading issues with the action client [...] It's too bad we have to introduce a lock at the scope of sending the request, but I can't think of another way.

Yes, this race condition definitely exists in the ActionClient. Actually, even #564 still exists there. The question is: why is the Client functionality being reimplemented in ActionClient? Why not reuse the existing Client functionality by creating sub-clients to handle goal requests, cancel requests, etc.?

I guess this should be a new issue.
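
As a purely hypothetical sketch of that suggestion (not how rclpy's ActionClient is implemented, and the service names below are only illustrative), the idea would be for the action client to delegate its request/response bookkeeping to ordinary service clients, so that a fix like this PR's applies in one place:

```python
class ComposedActionClient:
    """Hypothetical sketch: reuse plain service clients instead of
    reimplementing pending-request tracking inside the action client."""

    def __init__(self, node, action_type, action_name):
        # Each sub-client reuses Client.call_async (and whatever locking it
        # has) for its own request/response bookkeeping.  Service names are
        # illustrative approximations of the hidden action services.
        self._goal_client = node.create_client(
            action_type.Impl.SendGoalService, f'{action_name}/_action/send_goal')
        self._cancel_client = node.create_client(
            action_type.Impl.CancelGoalService, f'{action_name}/_action/cancel_goal')
        self._result_client = node.create_client(
            action_type.Impl.GetResultService, f'{action_name}/_action/get_result')

    def send_goal_async(self, goal_request):
        return self._goal_client.call_async(goal_request)

    def cancel_goal_async(self, cancel_request):
        return self._cancel_client.call_async(cancel_request)

    def get_result_async(self, result_request):
        return self._result_client.call_async(result_request)
```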

@ivanpauno (Member)

> Why not reuse the existing Client functionality by creating sub-clients to handle goal requests, cancel requests etc.?

👍, I'm not sure how easy it is to do that, but I think it would be good to reuse the existing client code.

> I guess this should be a new issue.

Yes, I don't think we need to address that here.

@fujitatomoya (Collaborator)

CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@jacobperron (Member) commented Jan 12, 2022

I think the Action Client doesn't share the same logic as the Client because it is using the rcl_action C-API under the hood. Perhaps it's possible to refactor the code in rclpy such that they share the same high-level logic; I don't recall if this was considered when actions were implemented.

Review comment on rclpy/rclpy/client.py (outdated, resolved)
@jacobperron (Member)

Testing rclpy and packages above:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

All 5 commits: Signed-off-by: augustelalande <auguste.lalande@gmail.com>

@jacobperron (Member) commented Jan 13, 2022

I've rebased on master to get the recent changes related to pybind11; hopefully this resolves the CI failures:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@jacobperron merged commit 301953d into ros2:master on Jan 14, 2022
jmarsik added a commit to km-robotics/rclpy that referenced this pull request May 12, 2022
* fix multi-thread race condition in client.call

(cherry picked from commit 301953d)