Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[rostime] add SteadyTime #57

Merged
merged 3 commits into from May 15, 2017
Merged

Conversation

flixr
Copy link
Contributor

@flixr flixr commented Mar 3, 2017

Add a new SteadyTime wich uses a steady/monotonic clock to prevent issues with time jumps.

Replacement for #55 with MonotonicTime renamed to SteadyTime and added Windows implementation.

@flixr
Copy link
Contributor Author

flixr commented Mar 3, 2017

I didn't want to use the C++11 features here as I'm not sure if we can/should use them for roscpp yet.
And I'm hoping this will also applied to indigo (we are currently stuck on Ubuntu 14.04 on the Nvidia Jetson, as there are no cuda drivers for newer Ubuntu versions), where a C++11 compatible compiler is not yet required.

Copy link
Member

@dirk-thomas dirk-thomas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add unit tests covering the new functionality.

{
public:
SteadyTime()
: TimeBase<SteadyTime, WallDuration>()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use two space indentation to be consistent with the existing code.

@@ -93,8 +93,8 @@ namespace ros
* These have only internal linkage to this translation unit.
* (i.e. not exposed to users of the time classes)
*/
void ros_walltime(uint32_t& sec, uint32_t& nsec)
#ifndef WIN32
void ros_walltime(uint32_t& sec, uint32_t& nsec)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid any unrelated whitespace changes in your PR.

@flixr
Copy link
Contributor Author

flixr commented Mar 8, 2017

Fixed the whitespace stuff... will look into creating unit tests, although I have no idea on how to check if time is really monotonic, since I don't think we can reset the system time in the tests...

Add a new SteadyTime wich uses a steady/monotonic clock to prevent issues with time jumps.
@flixr
Copy link
Contributor Author

flixr commented Mar 20, 2017

Anything else to do here?

@flixr
Copy link
Contributor Author

flixr commented Mar 20, 2017

Is it OK to use WallDuration with SteadyTime or should we add a SteadyDuration as well? There would be no difference between WallDuration and SteadyDuration though...

@dirk-thomas
Copy link
Member

I think most importantly this needs to be tested in context of the full stack up to bond_core to make sure it satisfies the desired use case. @mikaelarguedas FYI

@mikaelarguedas
Copy link
Member

I tested the changes to use SteadyTimer all the way up to bond_core. It works as expected (bonds don't break on forward time jumps, but if you jump back in time the bond will stay alive until the deadline is reached). I did notice a few unexpected nodelet crashes when using this so I'll do some more testing on Monday before +1 ing this.

@mikaelarguedas
Copy link
Member

Update:
Looks like there are still situations where time jumps cause this to fail:

[DEBUG] [1490639010.653929481]: SteadyTimer deregistering callbacks.
[DEBUG] [1490643600.000426555]: Time jumped forward by [25.143854] for timer of period [1.000000], resetting timer (current=1524518.175054, next_expected=1524493.031200)
[DEBUG] [1490643600.000492923]: Time jumped forward by [25.042610] for timer of period [1.000000], resetting timer (current=1524518.175106, next_expected=1524493.132497)
[ INFO] [1490643600.087787003]: Bond broken, exiting

It shows up only with nodelets but not with simple processes tied together with a bond. Needs more investigation.

Can be reproduced by running:

roslaunch nodelet_tutorial_math plus.launch

and jumping system time forward:

sudo date --set="2017-03-28 14:00:00.000"

@flixr
Copy link
Contributor Author

flixr commented Mar 29, 2017

@mikaelarguedas thanks a lot for testing!

The SteadyTime class of this PR should be ok, but I added a fix for the SteadyTimer in ros/ros_comm#1014

@flixr
Copy link
Contributor Author

flixr commented Apr 4, 2017

@mikaelarguedas any updates here?
We have these 3 patchsets running here on all our devices and haven't encountered any problems so far.

@mikaelarguedas
Copy link
Member

@flixr Sorry didn't have time to do another round of testing on this. Did you perform tests with nodelets as well or only on simple bond connection ? (IIRC I faced problem only with nodelets and multiple forward time jumps)

I'll try to test this today or tomorrow and will report here

@flixr
Copy link
Contributor Author

flixr commented Apr 4, 2017

I tested the SteadyTimer callbacks directly and nodelets...

@mikaelarguedas
Copy link
Member

I tested the same here (cb, bonds only and nodelets) and couldn't make if fail. 👏 🎉 thanks for pounding on this that's a great improvement! 🚀
@dirk-thomas What is the porting strategy you consider here ? Do you consider back-porting this to all active distros ?

@@ -167,6 +167,40 @@ namespace ros
nsec = nsec_sum;
#endif
}

void ros_steadytime(uint32_t& sec, uint32_t& nsec)
#ifndef WIN32
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition seems to be inverted. The exception is thrown in the Windows block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just copied that whole part from the implemtation of ros_walltime...

sec = start.tv_sec;
nsec = start.tv_nsec;
#else
static LARGE_INTEGER cpu_frequency, performance_count;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should these variables be static? A local variable would avoid problems of the static variable being used by multiple concurrent callers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above: basically just copied that from ros_walltime...

// https://msdn.microsoft.com/en-us/library/windows/desktop/ms644904(v=vs.85).aspx:
// "On systems that run Windows XP or later, the function will always succeed and will
// thus never return zero."
QueryPerformanceFrequency(&cpu_frequency);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return value of the function should be checked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above: basically just copied that from ros_walltime...

if (cpu_frequency.QuadPart == 0) {
throw NoHighPerformanceTimersException();
}
QueryPerformanceCounter(&performance_count);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return value of the function should be checked.

@dirk-thomas
Copy link
Member

The change in this repo is easier since it only adds API. The PRs in other repos building on top of this are more challenging since they change behavior. Therefore I plan to roll them out distro by distro. So first Lunar and Kinetic, waiting for PRs on top to also be released, wait until all changes are in the public repo and let it soak there for some time. Then consider doing the backports for Indigo and Jade.

@flixr
Copy link
Contributor Author

flixr commented Apr 10, 2017

@dirk-thomas I would propose to merge this as is...
If you want further changes in the windows implementation, it should IMHO be done in a separate PR as the same should be applied to the already existing ros_walltime function and tested by someone on Windows...

@flixr
Copy link
Contributor Author

flixr commented Apr 22, 2017

@dirk-thomas any updates here?
I can't really test the windows code... but we have these patches running on our systems for a while now without any problems, so is there any reason to not merge this yet?

@flixr
Copy link
Contributor Author

flixr commented May 2, 2017

@dirk-thomas @mikaelarguedas ping...

@dirk-thomas
Copy link
Member

Sorry for the delay - very busy times 😟

I will merge this and release a new version into Lunar in order to test the ros_comm PR.

@dirk-thomas dirk-thomas merged commit 089d553 into ros:kinetic-devel May 15, 2017
@dirk-thomas
Copy link
Member

@flixr Thank you for working on this feature!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants