Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wall time should use CLOCK_MONOTONIC instead of CLOCK_REALTIME which is vulnerable to leapseconds #33

Closed
xlz opened this issue Apr 10, 2015 · 6 comments
Labels

Comments

@xlz
Copy link

xlz commented Apr 10, 2015

https://github.com/ros/roscpp_core/blob/indigo-devel/rostime/src/time.cpp#L104

clock_gettime(CLOCK_REALTIME, &start);

CLOCK_REALTIME is subject to discontinuity administratively or from NTP leap second. Last time a leap second elapsed, millions of servers exploded http://lwn.net/Articles/509207/ . Considering the real-time sensitivity of robot control and the schedule of the next leap second, CLOCK_REALTIME should be changed to CLOCK_MONOTONIC. Other usage of CLOCK_REALTIME in ROS code base should also be eliminated.

@dirk-thomas
Copy link
Member

The function ros_walltime which uses clock_gettime is widely exposed through Time::now as well as WallTime::now. A change to CLOCK_MONOTONIC would change the behavior of these functions significantly. Due to the side effects on existing code it is not very likely that it can be changed.

It would be easier to introduce a new API based on the monotonic clock. But that would require a significant effort to ensure that critical code paths use this new clock function.

Since this is not a feature the maintainers will likely work on in the near future I will mark the ticket with the milestone "untargeted".

@dirk-thomas dirk-thomas added this to the Untargeted milestone Apr 10, 2015
@xlz
Copy link
Author

xlz commented Apr 10, 2015

The only behavioral difference between CLOCK_REALTIME and CLOCK_MONOTONIC is that CLOCK_REALTIME is subject to discontinuous jumps in the underlying system while CLOCK_MONOTONIC is not.

Leaving the core timing function vulnerable to disruption caused by discontinuity probably should not be considered a worthy feature only for supporting administrative time adjustment. But the decision is yours.

@xlz xlz changed the title Wall time should use CLOCK_MONOTONIC instead of CLOCK_REALTIME Wall time should use CLOCK_MONOTONIC instead of CLOCK_REALTIME which is vulnerable to leapseconds Apr 10, 2015
@tfoote
Copy link
Member

tfoote commented Apr 10, 2015

From the documentation, CLOCK_MONOTONIC "represents monotonic time since some unspecified starting point" which is not the same as time since the epoc. This is not useful for inter-computer syncronization.

A discussion of the differences can be found on stack overflow

From a quick test from that thread confirms CLOCK_MONOTONIC and CLOCK_REALTIME are very different.

$ perl -w -MTime::HiRes=clock_gettime,CLOCK_MONOTONIC -E 'say clock_gettime(CLOCK_MONOTONIC)'
130701.623424962
$ perl -w -MTime::HiRes=clock_gettime -E 'say clock_gettime()'
1428686871.54929

The ability to correct for drift is actually important for ROS systems. We rely on accurate timestamps between disparate systems, the internal clocks inside computers have significant enough drift that there are significant performance issues without time syncronization.

If you want to make sure that your system maintains syncronization without jumping we recommend using chrony instead of ntpd. notes here It will slew the clock rate instead of jumping the clock to correct for offsets as well as estimating the drift rate. It also works much better over intermittent connections.

We also use seconds since the epoc which is not vulnerable to leap second adjustments. Even if there's a leap second that only effects a date representation of the time and not.

Also all our core code is written to be robust to time jumps, all timing functions check for jumps both forward and backward. This is because we have support for simulated time and people regularly use log files playing back data in a loop which regularly causes large time jumps backwards as part of standard practice.

And as noted on the Clock wiki page wiki page using ros::Time is not an API for realtime control systems. If you are doing a realtime control task you should internally use a different API such as CLOCK_MONOTONIC if you need guaranteed timing.

With the above reasons I would suggest that this be closed as won't fix instead of set as untargted, as CLOCK_MONOTONIC is not appropriate for our core design.

@wjwwood
Copy link
Member

wjwwood commented Apr 10, 2015

We also use seconds since the epoc which is not vulnerable to leap second adjustments. Even if there's a leap second that only effects a date representation of the time and not.

That's actually system dependent, this webpage has a good summary of the issues:

http://www.madore.org/~david/computers/unix-leap-seconds.html#the-spec

According to the spec, when a leap second occurs the seconds since epoch can have one repeated second. However, on most machines to avoid this confusion the seconds since epoch do not repeat on leap seconds and diverge from the date formatted number seconds. So the leap seconds logic is baked into the date formatting code instead.

Regardless, CLOCK_REALTIME is still subject to time travel due to administrative changing of the time or something like NTP syncing. However, I think that is appropriate to use it in ros::WallTime and using CLOCK_MONOTONIC is not appropriate, since it cannot be compared across machines or even processes on the same machine, depending on the implementation. CLOCK_MONOTONIC should be used for measuring intervals which are smaller than months and do not need to survive over system reboots.

CLOCK_REALTIME should be changed to CLOCK_MONOTONIC. Other usage of CLOCK_REALTIME in ROS code base should also be eliminated.

This is my opinion on the issue: All code (ros and user) which uses ros::Time and/or ros::WallTime should be aware that the time reported can time travel backwards, either from simulated time looping, ntp updates, or an administrator changing the time. At best we can improve the documentation to make this clear. Switching to CLOCK_MONOTONIC is not feasible since it is not relative to a common starting point.

@xlz Thank you for raising the issue, it is definitely worth us looking at it, but I do not agree that any change needs to be made.

@xlz
Copy link
Author

xlz commented Apr 10, 2015

OK I agree.

@flixr
Copy link
Contributor

flixr commented Apr 15, 2017

See new SteadyTime in #57

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants