-
Notifications
You must be signed in to change notification settings - Fork 240
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite TimeControllerClockTest.unpaused_sleep_returns_true to be correct #1384
Conversation
…rect Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
Pulls: #1384 |
I ran above CI on this single test with a restest-until-fail times 20 for all three platforms. In the past we were seeing it fail within 3 tries pretty reliably so that should be a solid stress test. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@emersonknapp Good catch about the useless check for the rclcpp::ok()
and thanks for trying to improve this test. While the core solution looks good to me I have a couple concerns.
- The original purpose of this test was to verify that
sleep_until(timestamp)
works properly and returns true in the expected timeframe +- some tolerance after the explicit call for clock pause() and then resume().
Although in your current implementation, there is no explicit call forclock->pause()
and followingresume()
.
Please consider returning the implementation with an explicit call for pause-resume in the test.
rosbag2_cpp::TimeControllerClock clock(ros_start_time);
clock.resume();
EXPECT_TRUE(clock.sleep_until(clock.now() + 100));
clock.pause();
clock.resume();
auto ros_time_now = clock.now();
auto sleep_until_timestamp = ros_time_now + RCUTILS_S_TO_NS(1);
auto start = clock.ros_to_steady(ros_time_now);
- The tolerance for checking that
clock->sleep_until(timestamp)
was sleeping for the correct amount of time should be in some reasonable range about 10-20% i.e. 100-200 msec. In the current implementation, it is 100% for the upper bound i.e. timestamp * 2 = 2000 msec. With tolerance in 100% for upper bound we can't judge about validity of theclock->sleep_until(timestamp)
call.
OK - yeah I see that's how I originally wrote it. Added back a second loop
Sure, fair point |
Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
4e4fcdf
to
15b7421
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. With green CI.
i.e. need to address new warning on Windows build.
Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
@Mergifyio backport iron humble |
✅ Backports have been created
|
This pull request has been mentioned on ROS Discourse. There might be relevant details there: https://discourse.ros.org/t/ros-2-tsc-meeting-minutes-2023-06-15/32038/1 |
Closes #1282
Related to #1012, #1290, #1367
The original test was written such that it requested a 100ns sleep. With such a short time it was either impossible or extremely unlikely to experience a spurious wakeup, so the unlooped condition never failed. When #1012 attempted an improvement, it added a 1s sleep (10 million times longer), making the chance of spurious wakeup much more likely. However, its loop condition checked for
rclcpp::ok()
which is always false in this test, so the loop was never entered. The outcome was that test failure was made far more likely.This rewrites the test to have a correct spurious wakeup check loop with reasonable timeout. It should remove the flakiness from the CI builds. If so, it is purely a test change and can be easily backported to live distros.