-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bond support + fixing various action server issues #1894
Conversation
@naiveHobo please review. |
@naiveHobo review now would be good, added complete coverage of the feature set |
The I think it would make sense to make bond optional in |
Ah, it is optional, I'll update the test to set that up |
Oh whoops closed it |
Should be all good now. Funny enough, that actually covered one test I didnt cover (the case of no bond connection) so that's convenient :-) |
Great, actions are failing tests again. |
Actually, I think I uncovered another failure in the action server wrapper - one sec. there was a completely unsafe autostart feature that wasn't totally implemented I removed and I think the tests assumed that it was activated unfairly without telling it to do so (which no other server does other than the tests... not great) Edit: tested locally, that was it - fixed now |
Codecov Report
@@ Coverage Diff @@
## master #1894 +/- ##
==========================================
+ Coverage 71.12% 71.38% +0.26%
==========================================
Files 222 222
Lines 10676 10760 +84
==========================================
+ Hits 7593 7681 +88
+ Misses 3083 3079 -4
Continue to review full report at Codecov.
|
rosdep installing your bond_core branch failed for me:
(no other mention of bond or your lifecycle branch) Installing your branch manually works fine, after cleaning the workspace. |
See the repos file you should be building with master for dependencies. |
yes, I cloned the master branch of nav2, which is default. |
I don't know what you're asking. See our build instructions for master branch https://navigation.ros.org/build_instructions/index.html#build-ros-2-master it requires you work with the repos file. If you choose not to, then don't run master, run a specific distribution released branch |
Master assumes you're building the dependencies we set out in the repos file, because often we need to make updates to fix issues or add features so we need to be able to use non-released versions in master development. |
aaaah... I was assuming But I was just following the wrong docs (and skipping |
* prototype of lifecycle bond system * adding more structure to get around weak ptr issue * working prototype for manager * adding some ns -> s conversions * changing to service node * adding bond connections to all servers * update logs * fixing review comments * fix types * remove extraneous functions * make linters happy * simplifications * adding spinner to get working but now unstable * moving bond connections to activate state * adding defaults * working complete prototype for review * update dependencies * adding connection logging * remove accidental file * fix server side timeout for heartbeats * adding complete unit coverage of bond support * fixing lifecycle test * trying to activate since autostart was removed
@SteveMacenski, I have a question about this. As commented here, if one of them is down then bring them all down. How am I supposed to bring them up again? |
Using the lifecycle manager |
* prototype of lifecycle bond system * adding more structure to get around weak ptr issue * working prototype for manager * adding some ns -> s conversions * changing to service node * adding bond connections to all servers * update logs * fixing review comments * fix types * remove extraneous functions * make linters happy * simplifications * adding spinner to get working but now unstable * moving bond connections to activate state * adding defaults * working complete prototype for review * update dependencies * adding connection logging * remove accidental file * fix server side timeout for heartbeats * adding complete unit coverage of bond support * fixing lifecycle test * trying to activate since autostart was removed
Hi! From Eloquent to Foxy, I deduce this feature should be available in Foxy but I don't see the changes of this PR in the foxy-devel branch (e.g. no bond in bt_navigator). Does that mean the feature is not in Foxy? Am I missing something? Thanks in advance! |
Correct, it is in galactic and newer |
this->get_name(), | ||
shared_from_this()); | ||
|
||
bond_->setHeartbeatPeriod(0.10); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SteveMacenski Can I ask where this value for the period came from? The message is small but on our system running many parts of nav2, this equates to almost 100Hz on the bond topic for 5 nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be 10hz, not 100hz https://github.com/ros/bond_core/blob/ros2/bondcpp/src/bond.cpp#L291-L299. It was selected to offer a reasonable trade off between responsiveness to a server going down and how long before it goes down would it continue to be unsafe for a robot to operate. Since this needs to match from the lifecycle node -> lifecycle manager and vise versa, this couldn't be easily parameterized so that they would be guaranteed to match, so this is one of the very few "magic numbers" in the stack.
I'd be more than happy to discuss a proposal though to change to another value if you had a suggestion and rationale to share!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The stack up of several bond publishers on the same bond topic is giving us the 100 Hz as measured from ros2 topic hz /bond
.
Could this value not be kept as a default in a common header file with an associated parameter name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are not currently any common headers between all of these systems, that I am aware of at least.
You should see 10 * N messages on the topic per second, that is correct, where N is the number of lifecycle bond monitored servers. This should be very attainable for DDS given these messages are very small and the default bringup right now includes composition so its all intraprocess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there some precedent for the rate selected from other usage of the bond topic? It seems like there could be a distance metric derived from max speed to come up with a time interval and a timeout value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would probably break some encapsulation in the servers, since many don't know or care about the robot's speed. What are you trying to solve? 100hz is nothing, that shouldn't be causing any issues even on small platforms!
This pull request is in conflict. Could you fix it @SteveMacenski? |
@SteveMacenski I have a question regarding relaunching a custom node when it crashes.
What else do I need to configure? Is there any example or documentation? Thanks. |
You have the set respawn as true in launch so that the servers respawn, this is a relatively new feature. I know its in Rolling, but I'm not sure if its been (or will be) backported to Galactic or Foxy. Respawn is a launch feature |
Adding heartbeat support for servers so if they crash or fail, the lifecycle manager will kill the system.
#1869 replacement
Tested:
Completed:
Relies on ros/bond_core#67 being merged but updated the ros2_depedencies.repos file to use my fork + branch that this PR represents