Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Various Servo fixes, optional latency compensation parameter #886

Closed
wants to merge 11 commits into from

Conversation

AndyZe
Copy link
Member

@AndyZe AndyZe commented Dec 3, 2021

These are all relatively minor bug fixes or documentation improvement except "Add optional latency compensation parameter".

The latency comp parameter was very useful in improving a UR5e's path tracking accuracy. @MarqRazz can attest to that.

If the latency comp parameter is not specified, it doesn't change the behavior at all.

Closes #696

Should be merged with #886 so that the changes in motion speed make sense with the tutorial.

I lumped the latency parameter into one number because there are quite a few sources of latency that can be hard to quantify:

 - joint_states publication rate
 - ROS message transmission. I believe there are 2 message hops
 - network latency
 - a little bit of processing time
 - whatever happens in the ros_control interface
 - low-level controller execution

@MarqRazz here is a diagram to help understand (one case) where this helps. If there is a net latency of 30 ms (from when the Servo message is published to when the low-level controller executes), it can reduce the path-tracking error quite a bit. Thus the robot's low-level PID output is small --> the robot does not move very fast. In the perfect world where the latency is 0, the path-tracking accuracy will be much larger and the robot would move with the expected speed.

The first green circle is when Servo publishes. The dashed green line represents 30ms of latency. Red is when the low-level controller starts executing. The final purple dot is the end of the control period.

You can see that when latency is present, the difference between (purple-red) is much less.

IMG_20211203_161358960

*/
Eigen::VectorXd scaleCartesianCommand(const geometry_msgs::msg::TwistStamped& command);
Eigen::VectorXd scaleCartesianTwistToCartesianPositionDelta(const geometry_msgs::msg::TwistStamped& command);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed this function to be a bit more specific about what it does

# This is a fudge factor to account for any latency in the system, e.g. network latency or poor low-level
# controller performance. It essentially increases the timestep when calculating the target pose, to move the target
# pose farther away. [seconds]
system_latency_compensation: 0.05
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry, but I really dont like "fudge factors". Is there any way the software could compensate for this on its own? Cant we measure/know the publish rate of the arms joint_states?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a little more complicated than the publication rate of the arm's joint states, Marq.

Yes, I can probably make this adaptive.

Copy link
Member Author

@AndyZe AndyZe Dec 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sources of latency I can think of are:

  • joint_states publication rate
  • ROS message transmission. I believe there are 2 message hops
  • network latency
  • a little bit of processing time
  • whatever happens in the ros_control interface

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this comment to be more professional and the parameter was renamed. Now it says:

# This is an optional factor to account for any latency or low-level controller ramp up/ramp down in the system.
# It increases the timestep when calculating the target pose, to move the target pose farther away. [seconds]
target_pose_lookahead_time: 0.05

Re. naming, I looked around to see if there is any name for this technique but I did not find an exact match. A Smith Predictor is similar but not exactly the same. That is strange there's not a better name for it already; I know the UR robots do something like this internally.

I have a branch where I tried to calculate the system latency automatically but TBH it did not work very well yet, so I won't put it in this PR.

@@ -703,9 +693,6 @@ bool ServoCalcs::applyJointUpdate(const Eigen::ArrayXd& delta_theta, sensor_msgs
// Calculate joint velocity
joint_state.velocity[i] =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add some debug prints that I can use to prove the velocity calculation we have here is correct (i.e. print this calculation, the actual joint velocity, and the difference between the two)? This still looks wrong because the arm is not moving this entire distance commanded in the amount of time we are giving it. If it did then Servo would cause the arm to jerk from a start/stop state every update.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please see the diagram in the PR description.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll work on actually tracking the Cartesian position and using that feedback to suggest a latency parameter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started a branch called "latency monitor" here. It's not ready to merge yet, though.

https://github.com/AndyZe/moveit2/tree/andyz/latency_monitor

@tylerjw
Copy link
Member

tylerjw commented Dec 6, 2021

If you want to publish numbers simply for debugging would it make sense it add https://github.com/pal-robotics/pal_statistics to servo. It has a way of sampling and publishing internal state in a way that should not affect the performance of servo and wouldn't require a separate node. @facontidavide showed me it when I asked about how he would go about plotting something like this.

@MarqRazz
Copy link
Contributor

MarqRazz commented Dec 7, 2021

@MarqRazz here is a diagram to help understand why this works. If there is a net latency of 30 ms (from when the Servo message is published to when the low-level controller executes), it can reduce the path-tracking error quite a bit. Thus the robot's low-level PID output is small --> the robot does not move very fast. In the perfect world where the latency is 0, the path-tracking accuracy will be much larger and the robot would move with the expected speed.

The first green circle is when Servo publishes. The dashed green line represents 30ms of latency. Red is when the low-level controller starts executing. The final purple dot is the end of the control period.

You can see that when latency is present, the difference between (purple-red) is much less.

@AndyZe if the robot were to perform like you are showing with latency = 0 then the robot would start and stop at each point which would equate to very jerky motion. The goal of Servo is to constantly keep the command in front of the robot (i.e. it should have following error) so that motion is smooth and moves in the direction commanded.

I assume the latency in this system is constant and is included in the update rate. That is if we are running this loop at 500 Hz then it will send 500 commands per second to the robot, not 500 + some offset (i.e. timestep = parameters_->publish_period + parameters_->system_latency_compensation). That is if we keep the fudge factor then moving the carrot further ahead does not equate to the robot tracking the command better it just moves the goal further out causing the robot to move faster.

@MarqRazz
Copy link
Contributor

MarqRazz commented Dec 7, 2021

If you want to publish numbers simply for debugging would it make sense it add https://github.com/pal-robotics/pal_statistics to servo. It has a way of sampling and publishing internal state in a way that should not affect the performance of servo and wouldn't require a separate node. @facontidavide showed me it when I asked about how he would go about plotting something like this.

@tylerjw I like this suggestion but was thinking it would only be a debugging print statement, meaning you only take the performance hit when you enable it without adding dependencies.

@tylerjw
Copy link
Member

tylerjw commented Dec 7, 2021

meaning you only take the performance hit when you enable it without adding dependencies.

I think you'd only take a performance hit from pal_statistics if you subscribe to its topic. The real downside is adding yet another dependency.

@AndyZe
Copy link
Member Author

AndyZe commented Dec 7, 2021

We don't need to publish any extra data, though. Everything we need is already published.

@AndyZe AndyZe force-pushed the andyz/various_servo_fixes branch 3 times, most recently from 455e979 to 9378a93 Compare December 7, 2021 23:50
@AndyZe
Copy link
Member Author

AndyZe commented Dec 15, 2021

@MarqRazz & @tylerjw I do want to get this merged soon. There are some good fixes in here re how the velocity limits are applied. The new "target pose lookahead time" parameter is completely optional.

@tylerjw
Copy link
Member

tylerjw commented Dec 15, 2021

I'm mostly not convinced that adding some fixed value from the parameters is a good long-term solution. I would like to find some way to measure and quantify what this delay is so instead of having a number you can tweak it to try to get the result closer to the velocity you expect you would actually set it from a measured value and achieve the velocities you are requesting.

@AndyZe
Copy link
Member Author

AndyZe commented Dec 15, 2021

I would like to find some way to measure and quantify what this delay is so instead of having a number you can tweak it to try to get the result closer to the velocity you expect you would actually set it from a measured value and achieve the velocities you are requesting.

I attempted that here. It does not work very well yet and I don't have time/hardware to debug further at the moment. Let me know if you see any issues.

https://github.com/AndyZe/moveit2/tree/andyz/latency_monitor

At least it is only one parameter to adjust, right?

@gavanderhoorn
Copy link
Contributor

I'm mostly not convinced that adding some fixed value from the parameters is a good long-term solution.

Just an observation: many "tracking" systems provided by the big OEMs actually include exactly the kind of twiddle factor @AndyZe proposes here. Line-tracking systems or general IO interfacing with external "triggering devices" would be examples of this.

Sometimes even multiple (ie: one for "io delay", another for "processing delay" and yet another for "misc delay").

There can be many possible contributing factors to these kinds of delays, and their combination doesn't always make it possible to express them in a single unit, hence the "twiddle factor" naming.

Whether you'd want to implement this by scaling control outputs or in some other way is another discussion of course.

@tylerjw
Copy link
Member

tylerjw commented Dec 15, 2021

At least it is only one parameter to adjust, right?

It is, but it seems like the sort of thing that a user will have to tune and is only valid for their setup (assuming this value can change between computers/setups/network environment). It might even change while executing.

@tylerjw
Copy link
Member

tylerjw commented Dec 15, 2021

Have you considered creating feedback from observing changes in joint states from previous commands adjusting the delay to achieve velocities closer to what the user requested? Ideally, this doesn't change that quickly and you could observe the behavior of the robot to determine how much time offset to add in order to achieve the requested velocity (that is assuming it is actually modeled well enough by a single number).

@AndyZe
Copy link
Member Author

AndyZe commented Dec 16, 2021

Have you considered ...

Tyler, I already posted a link a couple message up where I tried something very similar to that on hardware. I only had about an hour to test on hardware but it was not working well at that time so I won't add it to this MR.

I think you have lost focus on the bigger picture here.

@AndyZe AndyZe requested a review from MarqRazz December 16, 2021 18:50
@tylerjw
Copy link
Member

tylerjw commented Dec 16, 2021

I think you have lost focus on the bigger picture here.

The bigger picture is that I'd like to move moveit in the direction of fewer nobs and dials, not more. It is already hard enough to use. User interfaces should express the output intent, not expose nobs like this that have to be tuned. When you expose a nob that has to be tuned your users end up depending on the specific behavior in your library and that makes it much harder to change the behavior for the better without breaking users who have optimized their setups around your nob. It also makes it much more likely someone will misunderstand how to use your nob and create poorer results than if it didn't exist.

@tylerjw
Copy link
Member

tylerjw commented Dec 16, 2021

I generally disagree with exposing a nob to tune in the parameters if at all possible. At the same time, this PR is crashing and generating a stack trace in a test on rolling.

@codecov
Copy link

codecov bot commented Dec 18, 2021

Codecov Report

Merging #886 (f285c02) into main (91fbe5a) will increase coverage by 0.03%.
The diff coverage is 91.49%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #886      +/-   ##
==========================================
+ Coverage   57.90%   57.93%   +0.03%     
==========================================
  Files         310      310              
  Lines       26067    26072       +5     
==========================================
+ Hits        15092    15102      +10     
+ Misses      10975    10970       -5     
Impacted Files Coverage Δ
...veit_servo/include/moveit_servo/servo_parameters.h 100.00% <ø> (ø)
moveit_ros/moveit_servo/src/servo_parameters.cpp 79.41% <72.73%> (-0.44%) ⬇️
moveit_ros/moveit_servo/src/servo_calcs.cpp 68.31% <95.84%> (+0.67%) ⬆️
moveit_ros/moveit_servo/src/enforce_limits.cpp 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 91fbe5a...f285c02. Read the comment docs.

@AndyZe
Copy link
Member Author

AndyZe commented Dec 20, 2021

I generally disagree with exposing a nob to tune in the parameters if at all possible. At the same time, this PR is crashing and generating a stack trace in a test on rolling.

Sorry about the crash. It was a silly mistake that's fixed now.

You want an auto-tuner that runs in the background and just works perfectly. It sounds nice but here are some reasons that may be more difficult than you realized:

  • 6 or 7-dof data has to be squashed down to a scalar.
  • The robot needs to move over a trajectory that's free of singularities, joint limits, and collisions. Good luck coming up with a trajectory that generalizes to all robot types.
  • There would be a delay at the beginning of the auto-tuning process.

I did set the parameter to zero by default and added validation, making it a bit harder for the user to screw up.

@mergify
Copy link

mergify bot commented Dec 21, 2021

This pull request is in conflict. Could you fix it @AndyZe?

@AndyZe
Copy link
Member Author

AndyZe commented Jan 4, 2022

Closing in favor of #956

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consider a "feedback delay" parameter for Servo
4 participants