Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce result_timeout to 10 seconds. #1012

Merged
merged 1 commit into from Nov 30, 2022

Conversation

clalancette
Copy link
Contributor

Signed-off-by: Chris Lalancette clalancette@openrobotics.org

I spent a bit of time investigating ros2/ros2#1074 . One problem I found is that if an action client calls an action server repeatedly, the time it takes to complete that action gets slower and slower, eventually ending up taking a good portion of a second.

After doing some debugging, I found out that at least part of the problem is that the number of goal handles (as stored by the rcl_action_server_t structure) continually goes up. Since rclcpp_action goes and copies all of the goal handles during publish_status, this causes the amount of time necessary to call publish_status to go up and up.

But looking at it further, it seems like the main problem is that goal_handles are never being cleared out when they are done being used. Instead, it looks like they are meant to "expire", but this only happens by default every 15 minutes. Quite a lot of goal handles can be accumulated in that time.

This PR is a not-serious attempt to remedy this by reducing the default expiration time to 10 seconds. With this in place, things still end up going slower for 10 seconds when first starting up, but then the expiration logic kicks in and keeps the number of goal handles to a reasonable number. I don't really intend this to go in as-is, but I would like to have a discussion about whether we should remove goals automatically once goal_handle->succeed is called, whether publish_status should really publish status on all goals, and whether we should reduce the default time expiration period for action goal handles.

Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>
@clalancette
Copy link
Contributor Author

After some discussion, this may be a reasonable thing to do. So I'm going to go ahead and open this as a PR for real feedback.

@clalancette clalancette marked this pull request as ready for review November 3, 2022 17:23
@clalancette
Copy link
Contributor Author

@ros-pull-request-builder retest this please

Copy link
Collaborator

@fujitatomoya fujitatomoya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good to go.

@clalancette
Copy link
Contributor Author

@ros-pull-request-builder retest this please

@clalancette
Copy link
Contributor Author

clalancette commented Nov 17, 2022

CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@clalancette clalancette self-assigned this Nov 17, 2022
@fujitatomoya fujitatomoya merged commit 3eb0f75 into rolling Nov 30, 2022
@delete-merged-branch delete-merged-branch bot deleted the clalancette/reduce-goal-timeout branch November 30, 2022 01:47
@fujitatomoya
Copy link
Collaborator

@clalancette do we want to backport this to humble?

@clalancette
Copy link
Contributor Author

@clalancette do we want to backport this to humble?

I'm worried that this is a big semantic change, and it is really hard to debug. I'd rather let it soak in Rolling for a while before we even consider that.

@SteveMacenski
Copy link

By the way, this caused alot of our tests to become flaky that we're not just realizing: ros-planning/navigation2#3765

I believe this should be set to max and/or disabled by default since its very common for actions to be very long running. I think the method for expiring of goals so they don't pile up needs to be done another way -- its throwing out currently running goals now and returning statuses of UNKNOWN back to the application when we have things running > 10 s.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants