Rewrite TimeControllerClockTest.unpaused_sleep_returns_true to be correct #1384

emersonknapp · 2023-06-10T01:56:51Z

Closes #1282
Related to #1012, #1290, #1367

The original test was written such that it requested a 100ns sleep. With such a short time it was either impossible or extremely unlikely to experience a spurious wakeup, so the unlooped condition never failed. When #1012 attempted an improvement, it added a 1s sleep (10 million times longer), making the chance of spurious wakeup much more likely. However, its loop condition checked for rclcpp::ok() which is always false in this test, so the loop was never entered. The outcome was that test failure was made far more likely.

This rewrites the test to have a correct spurious wakeup check loop with reasonable timeout. It should remove the flakiness from the CI builds. If so, it is purely a test change and can be easily backported to live distros.

…rect Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>

emersonknapp · 2023-06-10T02:03:18Z

Pulls: #1384
Gist: https://gist.githubusercontent.com/emersonknapp/3f3be53971c13a762207805eda247757/raw/7cb44417585d1b87f69ab233ef35d7c4fc87f778/ros2.repos
BUILD args: --packages-up-to rosbag2_cpp
TEST args: --packages-select rosbag2_cpp --retest-until-fail 20 --ctest-args -R test_time_controller_clock
ROS Distro: rolling
Job: ci_launcher
ci_launcher ran: https://ci.ros2.org/job/ci_launcher/12194

Linux
Linux-aarch64
Windows

emersonknapp · 2023-06-10T02:12:20Z

I ran above CI on this single test with a restest-until-fail times 20 for all three platforms. In the past we were seeing it fail within 3 tries pretty reliably so that should be a solid stress test.

MichaelOrlov

@emersonknapp Good catch about the useless check for the rclcpp::ok() and thanks for trying to improve this test. While the core solution looks good to me I have a couple concerns.

The original purpose of this test was to verify that sleep_until(timestamp) works properly and returns true in the expected timeframe +- some tolerance after the explicit call for clock pause() and then resume().
Although in your current implementation, there is no explicit call for clock->pause() and following resume().
Please consider returning the implementation with an explicit call for pause-resume in the test.

  rosbag2_cpp::TimeControllerClock clock(ros_start_time);
  clock.resume();
  EXPECT_TRUE(clock.sleep_until(clock.now() + 100));

  clock.pause();
  clock.resume();

  auto ros_time_now = clock.now();
  auto sleep_until_timestamp = ros_time_now + RCUTILS_S_TO_NS(1);
  auto start = clock.ros_to_steady(ros_time_now);

The tolerance for checking that clock->sleep_until(timestamp) was sleeping for the correct amount of time should be in some reasonable range about 10-20% i.e. 100-200 msec. In the current implementation, it is 100% for the upper bound i.e. timestamp * 2 = 2000 msec. With tolerance in 100% for upper bound we can't judge about validity of the clock->sleep_until(timestamp) call.

emersonknapp · 2023-06-10T08:24:46Z

after the explicit call for clock pause() and then resume()

OK - yeah I see that's how I originally wrote it. Added back a second loop

range about 10-20% i.e. 100-200 msec

Sure, fair point

Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>

emersonknapp · 2023-06-10T08:27:40Z

https://ci.ros2.org/job/ci_launcher/12195/

Linux
Linux-aarch64
Windows

MichaelOrlov

LGTM. With green CI.
i.e. need to address new warning on Windows build.

rosbag2_cpp/test/rosbag2_cpp/test_time_controller_clock.cpp

Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>

emersonknapp · 2023-06-12T00:19:09Z

https://ci.ros2.org/job/ci_launcher/12200

Linux
Linux-aarch64
Windows

emersonknapp · 2023-06-12T17:54:28Z

@Mergifyio backport iron humble

mergify · 2023-06-12T17:54:47Z

backport iron humble

✅ Backports have been created

#1389 Rewrite TimeControllerClockTest.unpaused_sleep_returns_true to be correct (backport #1384) has been created for branch iron
#1390 Rewrite TimeControllerClockTest.unpaused_sleep_returns_true to be correct (backport #1384) has been created for branch humble but encountered conflicts

…rect (#1384) Previous fix attempt's loop over spurious wakeups was never entered. Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com> (cherry picked from commit 38e9c96)

…rect (#1384) Previous fix attempt's loop over spurious wakeups was never entered. Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com> (cherry picked from commit 38e9c96) # Conflicts: # rosbag2_cpp/test/rosbag2_cpp/test_time_controller_clock.cpp

…rect (backport #1384) (#1390) * Rewrite TimeControllerClockTest.unpaused_sleep_returns_true to be correct (#1384) Previous fix attempt's loop over spurious wakeups was never entered. Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com> (cherry picked from commit 38e9c96)

…rect (#1384) (#1389) Previous fix attempt's loop over spurious wakeups was never entered. Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com> (cherry picked from commit 38e9c96)

ros-discourse · 2023-06-20T23:22:43Z

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/ros-2-tsc-meeting-minutes-2023-06-15/32038/1

Rewrite TimeControllerClockTest.unpaused_sleep_returns_true to be cor…

83cb4d2

…rect Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>

emersonknapp requested a review from a team as a code owner June 10, 2023 01:56

emersonknapp requested review from gbiggs, hidmic and MichaelOrlov and removed request for a team June 10, 2023 01:56

emersonknapp mentioned this pull request Jun 10, 2023

[iron] Add ROS_DISTRO metadata record to mcap file when opening for writing #1371

Merged

MichaelOrlov reviewed Jun 10, 2023

View reviewed changes

Pause and resume clock between two sleeping tests

15b7421

Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>

emersonknapp force-pushed the emersonknapp/fix-timecontroller-test branch from 4e4fcdf to 15b7421 Compare June 10, 2023 08:26

MichaelOrlov approved these changes Jun 10, 2023

View reviewed changes

rosbag2_cpp/test/rosbag2_cpp/test_time_controller_clock.cpp Outdated Show resolved Hide resolved

MichaelOrlov mentioned this pull request Jun 10, 2023

Refine root cause of the unpaused_sleep_returns_true test failure #1367

Closed

Fix windows warning

7560581

Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>

emersonknapp merged commit 38e9c96 into rolling Jun 12, 2023
14 checks passed

delete-merged-branch bot deleted the emersonknapp/fix-timecontroller-test branch June 12, 2023 17:54

mergify bot mentioned this pull request Jun 12, 2023

[iron] Rewrite TimeControllerClockTest.unpaused_sleep_returns_true to be correct (backport #1384) #1389

Merged

mergify bot mentioned this pull request Jun 12, 2023

[humble] Rewrite TimeControllerClockTest.unpaused_sleep_returns_true to be correct (backport #1384) #1390

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rewrite TimeControllerClockTest.unpaused_sleep_returns_true to be correct #1384

Rewrite TimeControllerClockTest.unpaused_sleep_returns_true to be correct #1384

emersonknapp commented Jun 10, 2023

emersonknapp commented Jun 10, 2023 •

edited

Loading

emersonknapp commented Jun 10, 2023

MichaelOrlov left a comment •

edited

Loading

emersonknapp commented Jun 10, 2023

emersonknapp commented Jun 10, 2023

MichaelOrlov left a comment

emersonknapp commented Jun 12, 2023

emersonknapp commented Jun 12, 2023

mergify bot commented Jun 12, 2023 •

edited

Loading

ros-discourse commented Jun 20, 2023

Rewrite TimeControllerClockTest.unpaused_sleep_returns_true to be correct #1384

Rewrite TimeControllerClockTest.unpaused_sleep_returns_true to be correct #1384

Conversation

emersonknapp commented Jun 10, 2023

emersonknapp commented Jun 10, 2023 • edited Loading

emersonknapp commented Jun 10, 2023

MichaelOrlov left a comment • edited Loading

Choose a reason for hiding this comment

emersonknapp commented Jun 10, 2023

emersonknapp commented Jun 10, 2023

MichaelOrlov left a comment

Choose a reason for hiding this comment

emersonknapp commented Jun 12, 2023

emersonknapp commented Jun 12, 2023

mergify bot commented Jun 12, 2023 • edited Loading

✅ Backports have been created

ros-discourse commented Jun 20, 2023

emersonknapp commented Jun 10, 2023 •

edited

Loading

MichaelOrlov left a comment •

edited

Loading

mergify bot commented Jun 12, 2023 •

edited

Loading