Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug that a callback not reached #1640

Conversation

iuhilnehc-ynos
Copy link
Collaborator

a subscriber on a new executor with a callback group triggered to receive a message

related to #1611

Signed-off-by: Chen Lihui lihui.chen@sony.com

@iuhilnehc-ynos iuhilnehc-ynos force-pushed the topic-subscriber-triggered-to-receive-message branch from d783577 to 52eba9a Compare April 25, 2021 08:04
@iuhilnehc-ynos
Copy link
Collaborator Author

this test case expects test result as

The following tests FAILED:
	 26 - test_add_callback_groups_to_executor__rmw_fastrtps_cpp (Failed)
	 27 - test_add_callback_groups_to_executor__rmw_fastrtps_dynamic_cpp (Failed)

after using ros2/rmw_fastrtps#527, there is no failure.

@asorbini
Copy link
Contributor

Before merging this test, can we discuss whether attaching a node to two different executors should be allowed by the API? This isn't currently a requirement AFAIK, so making this change is not trivial at all.

Copy link
Member

@wjwwood wjwwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reexamined the test to check what @asorbini was asking about, and I think I found some flaws, but the general idea (i.e. using more than one executor per node via add_callback_group() and having a callback group wake an executor when something is added to it in order to consider new items) is a supported use case based on the API.

It might be that this is a bug in rclcpp and not the middlewares however. I need to look more deeply at the root cause of the issue. But I don't have time at the moment.

@iuhilnehc-ynos
Copy link
Collaborator Author

This kind of test source code seems not good (adding it into rclcpp might make other users wrongly use it in the future), close it.

@fujitatomoya
Copy link
Collaborator

i do not know why we need to close this. after rclcpp is fixed, we can add this kind of test?

@wjwwood
Copy link
Member

wjwwood commented Apr 30, 2021

i do not know why we need to close this. after rclcpp is fixed, we can add this kind of test?

I don't think we needed to close it, but I do think the things I pointed out needed to be addressed.

I do not think, however, that we should have tests that copy user submitted code by including anti-patterns or otherwise not-recommended practices. Instead, the test should (ideally) test one potential issue at a time. It can be distilled from a user reported issue, but again it should be cleaned up and written as properly as possible.

@fujitatomoya
Copy link
Collaborator

yeah okay that makes sense to me.

@iuhilnehc-ynos
Copy link
Collaborator Author

reopen it after not using some anti-pattern source code.

@iuhilnehc-ynos iuhilnehc-ynos reopened this May 8, 2021
@iuhilnehc-ynos iuhilnehc-ynos changed the title Add a test case Fix bug that a callback not reached May 19, 2021
@iuhilnehc-ynos
Copy link
Collaborator Author

@wjwwood

I have updated rclcpp source code by following #1611 (comment) if I understood correctly.
Could you help to review it again?

Copy link
Collaborator

@fujitatomoya fujitatomoya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall, i think it looks good.

rclcpp/src/rclcpp/executor.cpp Outdated Show resolved Hide resolved
@fujitatomoya
Copy link
Collaborator

@wjwwood friendly ping.

@SteveMacenski
Copy link
Collaborator

SteveMacenski commented May 20, 2022

Noting that this is causing a bug in the Humble migration and this fixes it for an issue in Nav2 linked above @clalancette

Obviously too late in the game for this to get into Humble out of the box, but hope that it will get in for the LTS and backported. The issues we ran into were very subtle.

@clalancette
Copy link
Contributor

It looks like this needs a rebase, and another review for the latest changes. Once that is done, we can hopefully merge it in for Rolling.

Backporting it to Humble is going to be tricky, as it changes ABI. We'll have to consider another way to do that.

@iuhilnehc-ynos iuhilnehc-ynos force-pushed the topic-subscriber-triggered-to-receive-message branch from d7b1a74 to 6f8bd5e Compare June 1, 2022 02:58
@iuhilnehc-ynos
Copy link
Collaborator Author

@wjwwood friendly ping.

@audrow audrow changed the base branch from master to rolling June 28, 2022 14:21
Chen Lihui and others added 6 commits October 18, 2022 17:08
Signed-off-by: Chen Lihui <lihui.chen@sony.com>
Signed-off-by: Chen Lihui <lihui.chen@sony.com>
Signed-off-by: Chen Lihui <lihui.chen@sony.com>
Signed-off-by: Chen Lihui <lihui.chen@sony.com>
… after shutdown

Signed-off-by: Chen Lihui <lihui.chen@sony.com>
Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>
@clalancette clalancette force-pushed the topic-subscriber-triggered-to-receive-message branch from c50390c to 8dd0ec8 Compare October 18, 2022 18:07
Copy link
Contributor

@clalancette clalancette left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rebased this PR and made a few minor changes (just to the include files).

Overall, I think the idea behind this PR is good, but I will point out that it is not thread-safe. In particular, the lock around get_notify_guard_condition is useless; it is only held while manipulating the shared_ptr, but is then dropped when we return the object. Anything we do with that object afterwards is not thread-safe.

What I think we should do instead is to make sure we do any operations on the callback-group guard condition with the lock held. That would require us to add some additional APIs (like a new trigger method), modify some of the existing ones (change destroy_notify_guard_condition to only destroy the guard condition if it was created and not error out), and otherwise clean some things up. With those things done, we should be able to make this thread-safe and move forward with this.

@iuhilnehc-ynos
Copy link
Collaborator Author

One callback group can only have one notify_guard_condition_, the notify_guard_condition_mutex_ is jsut for these three operations, and the callback group can be used in only one executor.

so it seems unnecessary to add extra mutex to make the callback-group guard condition thread-safe.

Chen Lihui added 2 commits October 19, 2022 16:20
make a callback group can only create one guard condition

Signed-off-by: Chen Lihui <lihui.chen@sony.com>
…ack group

Signed-off-by: Chen Lihui <lihui.chen@sony.com>
@iuhilnehc-ynos
Copy link
Collaborator Author

iuhilnehc-ynos commented Oct 25, 2022

CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@iuhilnehc-ynos
Copy link
Collaborator Author

retry CI:

  • Windows Build Status

@fujitatomoya
Copy link
Collaborator

@iuhilnehc-ynos thanks for the effort on this, i will take a look at this today. I think this is very helpful for user application.

Copy link
Collaborator

@fujitatomoya fujitatomoya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@fujitatomoya
Copy link
Collaborator

dependent and above CI all green, i will go ahead to merge this.

@fujitatomoya fujitatomoya merged commit d119157 into ros2:rolling Oct 26, 2022
@fujitatomoya
Copy link
Collaborator

@iuhilnehc-ynos i think we can backport this to humble, right? it would be really nice to backport since this affects user application behavior.

@fujitatomoya
Copy link
Collaborator

@mergify backport this to humble

Just create backport PR so that we will not forget, let's give it a couple of weeks to make sure there is no degression, then we can merge the backport PR to humble.

@mergify
Copy link

mergify bot commented Oct 27, 2022

backport this to humble

❌ No backport have been created

mergify bot pushed a commit that referenced this pull request Oct 27, 2022
* Add a test case

a subscriber on a new executor with a callback group triggered to receive a message

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* fix flaky and not to use spin_some

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* update comment

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* update for not using anti-pattern source code

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* add a notify guard condition for callback group

Co-authored-by: William Woodall <william@osrfoundation.org>
Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* Notify guard condition of Node not to be used in Executor

it is only for the waitset of GraphListener

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* put code in the try catch

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* defer to create guard condition

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* use context directly for the create function

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* cpplint

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* fix that some case might call add_node after shutdown

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* nitpick and fix some potential bug

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* add sanity check as some case might not create notify guard condition after shutdown

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* Cleanup includes.

Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>

* remove destroy method

make a callback group can only create one guard condition

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* remove limitation that guard condition can not be re-created in callback group

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

Signed-off-by: Chen Lihui <lihui.chen@sony.com>
Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>
Co-authored-by: William Woodall <william@osrfoundation.org>
Co-authored-by: Chris Lalancette <clalancette@openrobotics.org>
(cherry picked from commit d119157)
fujitatomoya pushed a commit that referenced this pull request Nov 10, 2022
* Add a test case

a subscriber on a new executor with a callback group triggered to receive a message

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* fix flaky and not to use spin_some

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* update comment

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* update for not using anti-pattern source code

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* add a notify guard condition for callback group

Co-authored-by: William Woodall <william@osrfoundation.org>
Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* Notify guard condition of Node not to be used in Executor

it is only for the waitset of GraphListener

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* put code in the try catch

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* defer to create guard condition

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* use context directly for the create function

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* cpplint

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* fix that some case might call add_node after shutdown

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* nitpick and fix some potential bug

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* add sanity check as some case might not create notify guard condition after shutdown

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* Cleanup includes.

Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>

* remove destroy method

make a callback group can only create one guard condition

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* remove limitation that guard condition can not be re-created in callback group

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

Signed-off-by: Chen Lihui <lihui.chen@sony.com>
Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>
Co-authored-by: William Woodall <william@osrfoundation.org>
Co-authored-by: Chris Lalancette <clalancette@openrobotics.org>
(cherry picked from commit d119157)

Co-authored-by: Chen Lihui <lihui.chen@sony.com>
alsora added a commit to irobot-ros/rclcpp that referenced this pull request Jan 23, 2023
Signed-off-by: Alberto Soragna <alberto.soragna@gmail.com>
alsora added a commit to irobot-ros/events-executor that referenced this pull request Jan 24, 2023
Signed-off-by: Alberto Soragna <alberto.soragna@gmail.com>
alsora added a commit to irobot-ros/events-executor that referenced this pull request Jan 24, 2023
Signed-off-by: Alberto Soragna <alberto.soragna@gmail.com>
alsora pushed a commit to irobot-ros/rclcpp that referenced this pull request Feb 10, 2023
* Add a test case

a subscriber on a new executor with a callback group triggered to receive a message

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* fix flaky and not to use spin_some

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* update comment

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* update for not using anti-pattern source code

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* add a notify guard condition for callback group

Co-authored-by: William Woodall <william@osrfoundation.org>
Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* Notify guard condition of Node not to be used in Executor

it is only for the waitset of GraphListener

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* put code in the try catch

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* defer to create guard condition

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* use context directly for the create function

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* cpplint

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* fix that some case might call add_node after shutdown

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* nitpick and fix some potential bug

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* add sanity check as some case might not create notify guard condition after shutdown

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* Cleanup includes.

Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>

* remove destroy method

make a callback group can only create one guard condition

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* remove limitation that guard condition can not be re-created in callback group

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

Signed-off-by: Chen Lihui <lihui.chen@sony.com>
Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>
Co-authored-by: William Woodall <william@osrfoundation.org>
Co-authored-by: Chris Lalancette <clalancette@openrobotics.org>
alsora pushed a commit to irobot-ros/rclcpp that referenced this pull request Apr 29, 2023
* Add a test case

a subscriber on a new executor with a callback group triggered to receive a message

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* fix flaky and not to use spin_some

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* update comment

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* update for not using anti-pattern source code

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* add a notify guard condition for callback group

Co-authored-by: William Woodall <william@osrfoundation.org>
Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* Notify guard condition of Node not to be used in Executor

it is only for the waitset of GraphListener

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* put code in the try catch

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* defer to create guard condition

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* use context directly for the create function

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* cpplint

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* fix that some case might call add_node after shutdown

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* nitpick and fix some potential bug

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* add sanity check as some case might not create notify guard condition after shutdown

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* Cleanup includes.

Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>

* remove destroy method

make a callback group can only create one guard condition

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* remove limitation that guard condition can not be re-created in callback group

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

Signed-off-by: Chen Lihui <lihui.chen@sony.com>
Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>
Co-authored-by: William Woodall <william@osrfoundation.org>
Co-authored-by: Chris Lalancette <clalancette@openrobotics.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants