Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix JTC from immediately returning success #565

Merged
merged 2 commits into from
May 2, 2023

Conversation

MarqRazz
Copy link
Contributor

@MarqRazz MarqRazz commented Apr 7, 2023

When using the JTC action server we have found a race condition where it can return SUCCESS as soon as the trajectory goal is accepted. I have confirmed that after the first goal is given to the controller the update() method maintains access to the last traj_point_active_ptr_ causing it continuously command the hardware to the end of the last goal's trajectory while also setting:

auto res = std::make_shared<FollowJTrajAction::Result>();
res->set__error_code(FollowJTrajAction::Result::SUCCESSFUL);
active_goal->setSucceeded(res);

If a new goal is received while processing update() there is a chance for it to immediately return SUCCESS but continue to execute the new goal.

This PR removes the ability for the update() method to command the arm when it does not have an active goal by setting traj_point_active_ptr_ to a nullptr after it has completed/aborted. When a new goal is received it is copied over to the traj_point_active_ptr_ allowing it to execute again.

Background on our problem:

This causes problems with one of our pipeline where additional motion planning is staged up after each trajectory is completed. When the goal returns and the arm is still in motion the next plan grabs the start_state from the robot while it's moving and then generates and sends the new plan for execution. The controller happly accepts the new goal which will cause the arm to quickly jump back to where it grabbed the start_state and then execute the new trajectory. This jump in the joint commands frequently causes our hardware to throw a fault. Below is a timeline of the happening on our hardware along with a graph of the joint_commands.

image
joint_command_jump2

@mechwiz
Copy link
Contributor

mechwiz commented Apr 10, 2023

I think I noticed a similar if not the same issue a while back and made a PR that handles the issue here - #410. However your implementation may be better. Will review

@MarqRazz
Copy link
Contributor Author

I think I noticed a similar if not the same issue a while back and made a PR that handles the issue here - #410. However your implementation may be better. Will review

I'm not sure if your PR will fix the whole issue. I think it might keep the second trajectory from being executed but will still immediately return SUCCESS for the action result if this section of code is being executed when a new goal is received.

@mechwiz
Copy link
Contributor

mechwiz commented Apr 10, 2023

I would agree with you if the active-goal is queried within that section of code. However, my PR specifically removes that query (auto active_goal = *rt_active_goal_.readFromRT();) from occurring in that section of code for the reason you are describing and instead does it earlier when checking when a new external message has been received which, in my opinion, is where that query should exist.

@MarqRazz
Copy link
Contributor Author

I'm open to either fix. If you have changes you would like made here or we can continue to work on your PR just let me know.

I do think that this section of code should not be executed unless we have an active goal though which means it should be checked in your if statement...

if (active_goal && traj_point_active_ptr_ && (*traj_point_active_ptr_)->has_trajectory_msg() && !mismatch )

@mechwiz
Copy link
Contributor

mechwiz commented Apr 14, 2023

I worked through the logic between the 2 PRs and I think I prefer your implementation since yours will definitely ensure that that section of code won't execute after the previous goal finished until a new goal has successfully been obtained externally. (I think mine solves the issue in a less elegant way). I would +1 but I don't have any approval rights

DatSpace added a commit to vtikha/ros2_controllers that referenced this pull request Apr 26, 2023
@MarqRazz
Copy link
Contributor Author

MarqRazz commented May 1, 2023

Any feedback from the ros2_control maintainer team?

Copy link
Member

@bmagyar bmagyar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the insightful fix! The double accounting we do with the pointer Vs the buffer is the root of the issue which we may be able to refactor so we can get rid of this... Or maybe not! For now I'm happy with the change

@moriarty
Copy link
Contributor

moriarty commented May 2, 2023

I believe this PR is failing on Rolling because some of the github ci jobs do pull in current dependencies to the upstream_workspace and a change to control_msgs has renamed some msg fields ros-controls/control_msgs#86

@bmagyar
Copy link
Member

bmagyar commented May 2, 2023

Aye I'm aware of those, that's why we have all these variants for the CI ;) eventually pending PRs on the releases will be merged and synced

@bmagyar bmagyar merged commit 634e6fe into ros-controls:master May 2, 2023
9 of 12 checks passed
@MarqRazz
Copy link
Contributor Author

MarqRazz commented May 2, 2023

Do we need to do anything special to get this back ported to humble?

@bmagyar
Copy link
Member

bmagyar commented May 2, 2023

Nothing hopefully, but to remind me which you've already done!

@bmagyar
Copy link
Member

bmagyar commented May 2, 2023

@Mergifyio backport humble

@mergify
Copy link
Contributor

mergify bot commented May 2, 2023

backport humble

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request May 2, 2023
Co-authored-by: Bence Magyar <bence.magyar.robotics@gmail.com>
(cherry picked from commit 634e6fe)
@MarqRazz MarqRazz deleted the pr-fix_immediate_jtc_return branch May 2, 2023 21:17
@egordon
Copy link
Contributor

egordon commented Jun 14, 2023

Unfortunately, this has introduced a bug on our end (realized with 2.20.0-1jammy.20230522.072811, but just noticed this week) with the use of a velocity hardware interface.

If we are within the goal tolerance (but not yet commanding identically 0 velocity because we aren't exactly at the goal), the controller stops commanding the hardware, and this means that the hardware interface is left with a non-0 velocity at the end of the trajectory, causing the robot to drift from the goal position over time.

I've confirmed that the bug is fixed by just commenting out L336 in the PR (L339 in the current humble build). I'll make a bug report.

mechwiz pushed a commit to mechwiz/ros2_controllers that referenced this pull request Jul 3, 2023
mechwiz pushed a commit to mechwiz/ros2_controllers that referenced this pull request Aug 3, 2023
mechwiz pushed a commit to mechwiz/ros2_controllers that referenced this pull request Aug 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants