Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tcp_no_delay to joint_states subscriber #80

Merged
merged 2 commits into from Feb 16, 2018

Conversation

v-lopez
Copy link

@v-lopez v-lopez commented Nov 27, 2017

Setting no delay to true prevents that some joint_states messages are bundled and sent in pairs, reducing the period to which we receive new data to half and causing some tf joints to go at 25Hz instead of 50Hz.

This can be observed by running rostopic hz /joint_states:

average rate: 49.988
min: 0.000s max: 0.043s std dev: 0.01929s window: 3552
The std dev is an indicator of how the messages arrive bundled.

@sloretz
Copy link
Contributor

sloretz commented Dec 18, 2017

Thanks for the pull request @v-lopez! Would you mind providing a short example I can run that shows the improvement?

I am a bit surprised setting TCP_NODELAY on the subscriber has an impact. I thought this setting only affects data being sent, and the subscriber should only be receiving data. I would have expected the standard deviation to change only when the setting is set on the publisher side.

@v-lopez
Copy link
Author

v-lopez commented Dec 19, 2017

Hi @sloretz I have pushed a branch that contains some logging to display the delays between each callback call.

To test, checkout that branch: https://github.com/pal-robotics/robot_state_publisher/tree/tcp-no-delay-debugging

Compile it and compile the tests.

Launch test robot_description and rosbag play of /joint_states at 100Hz:
roslaunch robot_state_publisher test_joint_states_bag.launch
rosrun robot_state_publisher robot_state_publisher

You will see that a WARN log is printed each callback, something along these lines:

[ WARN][1513676837.026900344, 292.413758850]: Elapsed time is 0.030253020
[ WARN][1513676837.036831916, 292.423859968]: Elapsed time is 0.010101118
[ WARN][1513676837.043819062, 292.433916643]: Elapsed time is 0.010056675
[ WARN][1513676837.074908838, 292.464133541]: Elapsed time is 0.030216898
[ WARN][1513676837.087898581, 292.474217422]: Elapsed time is 0.010083881
[ WARN][1513676837.094862135, 292.484314403]: Elapsed time is 0.010096981
[ WARN][1513676837.123897863, 292.514582436]: Elapsed time is 0.030268033

If you set the tcpNoDelay to false on https://github.com/pal-robotics/robot_state_publisher/blob/tcp-no-delay-debugging/src/joint_state_listener.cpp#L73
And run again:

[ WARN][1513676963.674487524, 304.483075370]: Elapsed time is 0.040345668 [/home/user/tiago_ws/src/robot_state_publisher/src/joint_state_listener.cpp:99]
[ WARN][1513676963.674767024, 304.483075370]: Elapsed time is 0.000000000
[ WARN][1513676963.714505245, 304.523400080]: Elapsed time is 0.040324710
[ WARN][1513676963.714607611, 304.523400080]: Elapsed time is 0.000000000
[ WARN][1513676963.715027617, 304.523400080]: Elapsed time is 0.000000000
[ WARN][1513676963.754641514, 304.563687284]: Elapsed time is 0.040287204
[ WARN][1513676963.794658579, 304.603997555]: Elapsed time is 0.040310271
[ WARN][1513676963.834603753, 304.644354054]: Elapsed time is 0.040356499
[ WARN][1513676963.835021608, 304.644354054]: Elapsed time is 0.000000000
[ WARN][1513676963.853764710, 304.664548838]: Elapsed time is 0.020194784
[ WARN][1513676963.863713069, 304.674653080]: Elapsed time is 0.010104242

I don't know the actual rate of the rosbag, but you can see from the logs that with tcpNoDelay set to false, the messages typically arrive with 0.0000 or 0.04 second delay, when it's set to true, the rate is not so extreme.

We are still working on indigo, so I can't provide you with real robot data from kinetic.

@sloretz
Copy link
Contributor

sloretz commented Feb 12, 2018

Hi @v-lopez, Thanks for the instructions. I'm having trouble reproducing bundled joint state messages. Would you mind checking for mistakes in my setup?

I'm testing using this launch file

<?xml version="1.0"?>
<launch>
  <param name="robot_description" textfile="$(find robot_state_publisher)/test/two_links_moving_joint.urdf" />

  <node pkg="robot_state_publisher" name="robot_state_publisher" type="robot_state_publisher" output="screen"/>
  <node pkg="joint_state_publisher" name="joint_state_publisher" type="joint_state_publisher">
    <param name="rate" value="100"/>
  </node>
</launch>

I expanded the debugging output to print a mean and standard deviation once every 10 seconds. Here's the diff with use-tcp-no-delay

$ git diff
diff --git a/src/joint_state_listener.cpp b/src/joint_state_listener.cpp
index 3576fc3..4032478 100644
--- a/src/joint_state_listener.cpp
+++ b/src/joint_state_listener.cpp
@@ -92,6 +92,45 @@ void JointStateListener::callbackFixedJoint(const ros::TimerEvent& e)
 
 void JointStateListener::callbackJointState(const JointStateConstPtr& state)
 {
+  static ros::Time last_callback = ros::Time();
+  static std::vector<double> samples;
+  if (last_callback != ros::Time())
+  {
+    ros::Duration elapsed = ros::Time::now() - last_callback;
+    double sample = elapsed.toSec();
+    samples.push_back(sample);
+  }
+  if (samples.size() == 1000)
+  {
+    double size = samples.size();
+    // Mean
+    double sum_elapsed = 0.0;
+    for (std::size_t i = 0; i < samples.size(); ++i)
+    {
+      sum_elapsed += samples[i];
+    }
+    double mean = sum_elapsed / size;
+
+    // std deviation
+    double sum_squared_errors = 0;
+    for (double sample : samples)
+    {
+      sum_squared_errors += (sample - mean) * (sample - mean);
+    }
+    double std_dev = sqrt(sum_squared_errors / (size - 1));
+
+    ROS_WARN_STREAM("Mean: " << mean << " std_dev: " << std_dev);
+
+    // Reset sample collection
+    last_callback = ros::Time();
+    samples.clear();
+    samples.reserve(1000);
+  }
+  else
+  {
+    last_callback = ros::Time::now();
+  }
+
   if (state->name.size() != state->position.size()){
     if (state->position.empty()){
       const int throttleSeconds = 300;

Then I ran both with and without TCP_NODELAY on the subscriber.

With tcpNoDelay(false)

[ WARN] [1518478227.984331117]: Mean: 0.00999911 std_dev: 7.16055e-05
[ WARN] [1518478237.994308600]: Mean: 0.00999914 std_dev: 6.82069e-05
[ WARN] [1518478248.004293398]: Mean: 0.00999921 std_dev: 5.29343e-05
[ WARN] [1518478258.014390825]: Mean: 0.00999916 std_dev: 7.38244e-05
[ WARN] [1518478268.024530604]: Mean: 0.00999942 std_dev: 8.27715e-05
[ WARN] [1518478278.034287945]: Mean: 0.00999895 std_dev: 9.5796e-05
[ WARN] [1518478288.044508078]: Mean: 0.00999928 std_dev: 9.49781e-05
[ WARN] [1518478298.054278808]: Mean: 0.00999902 std_dev: 7.89577e-05
[ WARN] [1518478308.064342439]: Mean: 0.00999917 std_dev: 8.50564e-05
[ WARN] [1518478318.074636489]: Mean: 0.00999945 std_dev: 9.12649e-05

With tcpNoDelay(true)

[ WARN] [1518478398.973933502]: Mean: 0.00999915 std_dev: 7.30417e-05
[ WARN] [1518478408.983748573]: Mean: 0.00999902 std_dev: 7.71818e-05
[ WARN] [1518478418.993866114]: Mean: 0.00999918 std_dev: 7.37418e-05
[ WARN] [1518478429.003923888]: Mean: 0.00999913 std_dev: 0.000102603
[ WARN] [1518478439.013906729]: Mean: 0.00999893 std_dev: 8.00194e-05
[ WARN] [1518478449.023939980]: Mean: 0.00999878 std_dev: 8.85488e-05
[ WARN] [1518478459.034390458]: Mean: 0.00999949 std_dev: 8.74943e-05
[ WARN] [1518478469.043908741]: Mean: 0.00999876 std_dev: 7.92344e-05
[ WARN] [1518478479.053686142]: Mean: 0.00999876 std_dev: 7.29482e-05
[ WARN] [1518478489.063752852]: Mean: 0.00999887 std_dev: 6.59634e-05

Neither run looks like joint_state messages are being bundled, so I can't say for sure if TCP_NODELAY on the subscriber has an effect.

@sloretz
Copy link
Contributor

sloretz commented Feb 15, 2018

Sorry it took me a while to understand how this could work. When a subscriber and publisher connect the subscriber header has a field tcp_nodelay. The publisher then sets TCP_NODELAY on their side based on that setting. http://wiki.ros.org/ROS/TCPROS .

@clalancette If this looks OK to you, I would like to merge it and cherry-pick onto melodic-devel.

@sloretz sloretz merged commit ef0bde4 into ros:kinetic-devel Feb 16, 2018
@sloretz
Copy link
Contributor

sloretz commented Feb 16, 2018

Thanks for the pull request @v-lopez !

sloretz pushed a commit that referenced this pull request Feb 16, 2018
* Add tcp_no_delay to joint_states subscriber

* Describe how subscriber NODELAY gets to publishers
@v-lopez
Copy link
Author

v-lopez commented Feb 16, 2018

Hi @sloretz sorry I could not answer your comment until today.

With the tests you mentioned, did you manage to see the std_dev variations in the end?

sloretz added a commit that referenced this pull request Feb 16, 2018
* Add tcp_no_delay to joint_states subscriber

* Describe how subscriber NODELAY gets to publishers
@sloretz
Copy link
Contributor

sloretz commented Feb 16, 2018

Hi @v-lopez, I wasn't able to reproduce joint_state messages being combined in the same packet. However I did check the code path that TransportHints on the subscriber takes and saw that setting it would help if the publisher was combining messages.

@v-lopez
Copy link
Author

v-lopez commented Feb 19, 2018

Awesome, would it be possible to have a release on kinetic including these changes?

@sloretz
Copy link
Contributor

sloretz commented May 14, 2018

@v-lopez FYI 1.13.6 includes this PR and was released to kinetic

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants