Consecutive odom estimates with exact same timestamp published? #336

Closed
skohlbr opened this issue Jan 17, 2017 · 12 comments

@skohlbr

skohlbr commented Jan 17, 2017

I noticed an issue in a controller where we use the difference between consecutive odom timestamps to compute some variables. It appears that, very rarely, robot_localization publishes an odometry estimate with exactly the same timestamp as the previously published odometry message. If users are not careful, this can lead to some interesting results (division by zero, etc.); a defensive check is sketched below, after the message dump.

Is this behavior intentional (or at least "tolerated"), or a bug? Given that there are two estimates for the exact same time, I'd certainly consider it a bug.

/edit: This is on ROS Indigo/64bit/.debs 2.3.1-0trusty-20161027-103827-0700

Here's an instance of the issue:

---
header: 
  seq: 17833
  stamp: 
    secs: 1484649904
    nsecs: 356093645
  frame_id: odom
child_frame_id: base_link
pose: 
  pose: 
    position: 
      x: 0.0
      y: 0.0
      z: 0.0
    orientation: 
      x: -0.000558444323988
      y: -0.00263958323178
      z: -0.659855487819
      w: 0.751387686841
  covariance: [17.788812315170336, -0.0021885944538905023, 1.150434721799784e-05, 0.0, 0.0, 0.0, -0.0021885944538904893, 17.79475984478557, -2.879316282316279e-05, 0.0, 0.0, 0.0, 1.150434721799786e-05, -2.8793162823163247e-05, 21.346028161957094, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.013309594332698326, 3.898956301909769e-08, 1.6479665855370166e-05, 0.0, 0.0, 0.0, 3.898956301909763e-08, 0.01330948183976714, -1.5314744923536395e-06, 0.0, 0.0, 0.0, 1.6479665855370186e-05, -1.5314744923536395e-06, 0.018818686915421896]
twist: 
  twist: 
    linear: 
      x: 0.0
      y: 0.0
      z: 0.0
    angular: 
      x: 0.00372952535455
      y: 0.00331567765125
      z: -0.000923871614868
  covariance: [0.0013216861377139634, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0009551949241778792, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0014724931725999622, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 8.156328152666064e-05, 8.308409942495703e-19, 1.569253771565988e-16, 0.0, 0.0, 0.0, 8.308409941728289e-19, 8.156328152666075e-05, 1.5007201403014085e-16, 0.0, 0.0, 0.0, 1.569253771566009e-16, 1.5007201403014341e-16, 8.906543220831504e-05]
---
header: 
  seq: 17834
  stamp: 
    secs: 1484649904
    nsecs: 356093645
  frame_id: odom
child_frame_id: base_link
pose: 
  pose: 
    position: 
      x: 0.0
      y: 0.0
      z: 0.0
    orientation: 
      x: -0.000558444050624
      y: -0.00263958279673
      z: -0.659855421042
      w: 0.751387745484
  covariance: [17.7888121515518, -0.0021885126985585015, 1.1504075430374946e-05, 0.0, 0.0, 0.0, -0.0021885126985584885, 17.794759066310935, -2.879037276871341e-05, 0.0, 0.0, 0.0, 1.1504075430374966e-05, -2.8790372768713867e-05, 21.34602797521369, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.013309594332697206, 3.898956452210344e-08, 1.647966619807549e-05, 0.0, 0.0, 0.0, 3.898956452210338e-08, 0.013309481839765124, -1.5314749518720583e-06, 0.0, 0.0, 0.0, 1.647966619807551e-05, -1.5314749518720583e-06, 0.018818686810645576]
twist: 
  twist: 
    linear: 
      x: 0.0
      y: 0.0
      z: 0.0
    angular: 
      x: 0.00372952535455
      y: 0.00331567765125
      z: -0.000894802025721
  covariance: [0.0004243384796694964, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 9.052307799168843e-05, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 9.364067191244717e-05, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 8.156328152666064e-05, 8.308409942412505e-19, 1.5198772546641445e-16, 0.0, 0.0, 0.0, 8.308409941645092e-19, 8.156328152666075e-05, 1.4535000317917548e-16, 0.0, 0.0, 0.0, 1.5198772546641647e-16, 1.4535000317917797e-16, 8.626299139314002e-05]
---
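For anyone else hitting this downstream, a guard on the consumer side avoids the division by zero. A minimal sketch (the callback, node, and topic names here are hypothetical, not our actual controller code):

#include <nav_msgs/Odometry.h>
#include <ros/ros.h>

// Hypothetical consumer-side guard: skip any odometry message whose stamp
// does not advance, so a later division by dt can never be a division by zero.
void odomCallback(const nav_msgs::Odometry::ConstPtr& msg)
{
  static ros::Time last_stamp;  // zero-initialized on first call

  const double dt = (msg->header.stamp - last_stamp).toSec();

  if (last_stamp.isZero() || dt <= 0.0)
  {
    // First message, duplicate stamp, or out-of-order stamp: remember it and bail out.
    last_stamp = msg->header.stamp;
    return;
  }

  // ... safe to divide by dt here (e.g. finite-difference velocities) ...

  last_stamp = msg->header.stamp;
}

int main(int argc, char** argv)
{
  ros::init(argc, argv, "odom_dt_guard_example");
  ros::NodeHandle nh;
  ros::Subscriber sub = nh.subscribe("odometry/filtered", 10, odomCallback);
  ros::spin();
  return 0;
}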

@ayrton04
Collaborator

If I had to guess, I'd say it's these three lines that are causing trouble in this case:

https://github.com/cra-ros-pkg/robot_localization/blob/kinetic-devel/src/filter_base.cpp#L213
https://github.com/cra-ros-pkg/robot_localization/blob/kinetic-devel/src/filter_base.cpp#L257
https://github.com/cra-ros-pkg/robot_localization/blob/kinetic-devel/src/ros_filter.cpp#L308

Again, I'm just guessing at this point, but is it possible that you occasionally get a measurement whose timestamp is out-of-sequence? If so, the filter will either (a) skip prediction and simply correct based on that value, or (b) if you have smooth_lagged_data set to true, it will walk back through the state and measurement history and "insert" that measurement. In either case, though, the filter's output stamp will be that of the most recent measurement, and you can get two messages with identical timestamps.

I could add a parameter that causes the filter to always do a prediction from the last measurement time to ros::Time::now(), if you think that would help.
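Roughly, the intent would be something like this (just a sketch of the idea, not actual code from the filter; predict() here is a stand-in for the filter's prediction step):

// Sketch of the proposed behaviour: before publishing, always project the
// state forward from the last measurement time to "now" and stamp the output
// with the current time, so two consecutive outputs can't share a timestamp.
const ros::Time current_time = ros::Time::now();
const double delta = current_time.toSec() - filter_.getLastMeasurementTime();

if (delta > 0.0)
{
  filter_.predict(delta);  // stand-in for the filter's prediction step
}

message.header.stamp = current_time;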

@skohlbr
Author

skohlbr commented Jan 18, 2017

Ah, OK, I see. I can see how this could happen, given that ROS communication makes no hard guarantees about the order in which messages from different sources arrive. The example is fusing data from an IMU and base odometry, so it certainly can't be ruled out that timestamps arrive out of order.

I guess there is no trivial "right way" of doing things, but adding that option might be good (if it doesn't clutter things too much).

@ayrton04
Collaborator

ayrton04 commented Feb 1, 2017

I don't think it'll clutter anything. I'll add it. 👍

@ayrton04 ayrton04 self-assigned this Feb 21, 2017
@ayrton04 ayrton04 added this to the 2.3.2 milestone Feb 21, 2017
@ayrton04
Collaborator

Addressed in #381. I realize this is quite old, but if you still have need of it and have a chance to try it out, can you let me know how it goes? Thanks!

@ayrton04
Collaborator

ayrton04 commented Dec 7, 2017

Closing. Please reopen if you find this isn't fixed.

@ayrton04 ayrton04 closed this as completed Dec 7, 2017
@jim-rothrock

This problem still exists. I have an ekf_localization_node that has wheel odometry coming in at 15 Hz and IMU data coming in at 100 Hz. The node's "frequency" parameter is set to 100. Almost immediately, I get messages published to odometry/filtered that have duplicate timestamps. Decreasing "frequency" to 50 increases the amount of time before a duplicate timestamp is detected.

cartographer_ros notices the duplicate timestamps, and exits with the following error message:

[FATAL] [1589675649.666559249, 22585.620000000]: F0516 17:34:09.000000 18205 map_by_time.h:43] Check failed: data.time > std::prev(trajectory.end())->first (621356193855500000 vs. 621356193855500000)

@ayrton04
Collaborator

And I assume you have predict_to_current_time set to true?

@jim-rothrock

I didn't. After setting predict_to_current_time to "true", duplicate timestamps occur far less frequently, but one showed up after driving the simulated robot for about a minute.

@ayrton04
Collaborator

Hrm, that makes me suspect something is up with the simulated time server, then. With predict_to_current_time set to true, every time we call integrateMeasurements, we capture the current time:

const double currentTimeSec = currentTime.toSec();

We then take the difference between that time and the last measurement time, and use that to project the state forward:

double lastUpdateDelta = currentTimeSec - filter_.getLastMeasurementTime();

That stamp then gets put into the message header:

message.header.stamp = ros::Time(filter_.getLastMeasurementTime());

So the only way I can see this happening (at least right now) is if integrateMeasurements gets called twice, and both times, this call to ros::Time::now() returns the exact same value. That's why I am wondering if this is a sim issue.

You said your node runs at 100 Hz, which is quite fast. What's the update rate on your sim set to? Gazebo publishes to /clock at the same rate that the sim updates, I believe:

http://gazebosim.org/tutorials/?tut=ros_comm

/use_sim_time is true if gazebo_ros is publishing to the ROS /clock topic in order to provide a ROS system with simulation-synchronized time. For more info on simulation time, see ROS C++ Time.

So if you ran a node at 1000 Hz that just checked ros::Time::now(), but Gazebo was only running at 100 Hz, you'd get 10 messages with the same time value, I believe. I'm wondering if both the sim and the EKF are running at 100 Hz, and small timing errors in the sim and EKF are causing the EKF to get the same timestamp once in a while.
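If you want to confirm that, a throwaway node like this (assuming use_sim_time is true) will log whenever ros::Time::now() repeats because it is polled faster than /clock is published:

#include <ros/ros.h>

// Throwaway check: with use_sim_time=true and /clock published at, say, 100 Hz,
// a 1000 Hz loop will read the same ros::Time::now() value several times in a row.
int main(int argc, char** argv)
{
  ros::init(argc, argv, "clock_granularity_check");
  ros::NodeHandle nh;

  ros::Time last = ros::Time::now();
  ros::Rate rate(1000.0);

  while (ros::ok())
  {
    const ros::Time now = ros::Time::now();
    if (now == last)
    {
      ROS_INFO_THROTTLE(1.0, "ros::Time::now() repeated: %f", now.toSec());
    }
    last = now;
    rate.sleep();
  }
  return 0;
}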

@jim-rothrock

I'm wondering if both the sim and the EKF are running at 100 Hz, and small timing errors in the sim and EKF are causing the EKF to get the same timestamp once in a while.

That sounds plausible. The Gazebo simulation is running at 100 Hz, and ekf_localization_node's frequency is also set to 100 Hz. I commented out both of ekf_localization_node's input sources (wheel odometry and the IMU), and it still produces duplicate timestamps. When predict_to_current_time is false, the duplicate timestamps happen very frequently. When predict_to_current_time is true, they happen rarely, but even one of them will cause cartographer_node to exit.

My current thinking is that I should set predict_to_current_time to true, and there is no bug in ekf_localization_node; it just sometimes gets identical times from Gazebo. I've written a node that gets messages from odometry/filtered, modifies the timestamps (if necessary) to prevent duplications, and passes the messages on to cartographer_node.
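Roughly like this (a stripped-down sketch of that relay; the output topic name and the nudge amount are arbitrary):

#include <nav_msgs/Odometry.h>
#include <ros/ros.h>

// Relay that republishes odometry/filtered, nudging any non-increasing stamp
// forward by a small epsilon so downstream consumers (e.g. cartographer_node)
// never see two messages with the same time.
class OdomStampFixer
{
public:
  OdomStampFixer()
  {
    pub_ = nh_.advertise<nav_msgs::Odometry>("odometry/filtered_fixed", 10);
    sub_ = nh_.subscribe("odometry/filtered", 10, &OdomStampFixer::callback, this);
  }

private:
  void callback(const nav_msgs::Odometry::ConstPtr& msg)
  {
    nav_msgs::Odometry out = *msg;
    if (!last_stamp_.isZero() && out.header.stamp <= last_stamp_)
    {
      out.header.stamp = last_stamp_ + ros::Duration(0.0001);  // nudge forward
    }
    last_stamp_ = out.header.stamp;
    pub_.publish(out);
  }

  ros::NodeHandle nh_;
  ros::Publisher pub_;
  ros::Subscriber sub_;
  ros::Time last_stamp_;
};

int main(int argc, char** argv)
{
  ros::init(argc, argv, "odom_stamp_fixer");
  OdomStampFixer fixer;
  ros::spin();
  return 0;
}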

@ayrton04
Collaborator

I commented out both of ekf_localization_node's input sources (wheel odometry and the IMU), and it still produces duplicate timestamps.

The node won't (or shouldn't, anyway) run with zero input sources. We set initialized to true inside the core filter when we process our first measurement:

initialized_ = true;

processMeasurement, in turn, is only called if we have values inside our measurement queue:

filter_.processMeasurement(*(measurement.get()));

Finally, when it comes time to publish, we try to build the output message:

if (getFilteredOdometryMessage(filteredPosition))

...but that method returns false if the core filter is not yet initialized:

return filter_.getInitializedStatus();

So I'm not seeing how that was possible (though that doesn't mean there isn't a bug!). After commenting out the input sources, did you restart your roscore? The parameter server will have retained those input settings. Commenting them out won't delete them from the parameter server.

@jim-rothrock

After commenting out the input sources, did you restart your roscore?

No, I didn't. I did not realize that the parameters are not deleted from the server. I ran roscore again with the commented-out input sources, and messages are no longer written to odometry/filtered.
