Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🧑‍🌾 Flaky test: test_time_source.parameter_activation #2088

Closed
Crola1702 opened this issue Jan 24, 2023 · 1 comment · Fixed by #2092
Closed

🧑‍🌾 Flaky test: test_time_source.parameter_activation #2088

Crola1702 opened this issue Jan 24, 2023 · 1 comment · Fixed by #2092

Comments

@Crola1702
Copy link
Contributor

Bug report

Required Info:

  • Operating System:
    • Windows, Ubuntu 22.04
  • Installation type:
    • Source install
  • Version or commit hash:
    • Rolling
  • Client library (if applicable):
    • rclcpp

Steps to reproduce issue

This test regression happens on all supported distros in nightly CI:

Can also be reproduced locally building up to rclcpp and testing test_time_source:

As this is a flaky test, sometimes it passes and sometimes it fails. Using --retest-until-fail increases the probability of having this problem

colcon build --packages-up-to rclcpp
colcon test --event-handlers console_direct+ --retest-until-fail 5 --ctest-args -R test_time_source --packages-select rclcpp

Expected behavior

Test should pass

Actual behavior

Log output:

-- run_test.py: extra environment variables to append:
 - LD_LIBRARY_PATH+=/home/jenkins-agent/workspace/nightly_linux_debug/ws/build/rclcpp/test/rclcpp
-- run_test.py: invoking following command in '/home/jenkins-agent/workspace/nightly_linux_debug/ws/build/rclcpp/test/rclcpp':
 - /home/jenkins-agent/workspace/nightly_linux_debug/ws/build/rclcpp/test/rclcpp/test_time_source --gtest_output=xml:/home/jenkins-agent/workspace/nightly_linux_debug/ws/build/rclcpp/test_results/rclcpp/test_time_source.gtest.xml
Running main() from /home/jenkins-agent/workspace/nightly_linux_debug/ws/install/gtest_vendor/src/gtest_vendor/src/gtest_main.cc
[==========] Running 17 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 14 tests from TestTimeSource
[ RUN      ] TestTimeSource.detachUnattached
[       OK ] TestTimeSource.detachUnattached (47 ms)
[ RUN      ] TestTimeSource.reattach
[       OK ] TestTimeSource.reattach (25 ms)
[ RUN      ] TestTimeSource.ROS_time_valid_attach_detach
[ERROR] [1673851682.820279192] [rclcpp]: failed to remove clock
[       OK ] TestTimeSource.ROS_time_valid_attach_detach (25 ms)
[ RUN      ] TestTimeSource.ROS_time_valid_wall_time
[       OK ] TestTimeSource.ROS_time_valid_wall_time (23 ms)
[ RUN      ] TestTimeSource.ROS_time_valid_sim_time
[       OK ] TestTimeSource.ROS_time_valid_sim_time (2055 ms)
[ RUN      ] TestTimeSource.ROS_invalid_sim_time
[       OK ] TestTimeSource.ROS_invalid_sim_time (23 ms)
[ RUN      ] TestTimeSource.clock
[       OK ] TestTimeSource.clock (51 ms)
[ RUN      ] TestTimeSource.callbacks
[       OK ] TestTimeSource.callbacks (58 ms)
[ RUN      ] TestTimeSource.callback_handler_erasure
[       OK ] TestTimeSource.callback_handler_erasure (49 ms)
[ RUN      ] TestTimeSource.parameter_activation
unknown file: Failure
C++ exception with description "cannot store a negative time point in rclcpp::Time" thrown in the test body.
[  FAILED  ] TestTimeSource.parameter_activation (90 ms)
[ RUN      ] TestTimeSource.no_pre_jump_callback
[       OK ] TestTimeSource.no_pre_jump_callback (42 ms)
[ RUN      ] TestTimeSource.check_use_clock_thread_value
[       OK ] TestTimeSource.check_use_clock_thread_value (89 ms)
[ RUN      ] TestTimeSource.check_clock_thread_status
[       OK ] TestTimeSource.check_clock_thread_status (72 ms)
[ RUN      ] TestTimeSource.clock_sleep_until_with_ros_time_basic
[       OK ] TestTimeSource.clock_sleep_until_with_ros_time_basic (48 ms)
[----------] 14 tests from TestTimeSource (2698 ms total)

[----------] 3 tests from TimeSource
[ RUN      ] TimeSource.invalid_sim_time_parameter_override
[ERROR] [1673851685.442106687] [my_node]: Invalid type 'string' for parameter 'use_sim_time', should be 'bool'
[       OK ] TimeSource.invalid_sim_time_parameter_override (21 ms)
[ RUN      ] TimeSource.valid_clock_type_for_sim_time
[       OK ] TimeSource.valid_clock_type_for_sim_time (22 ms)
[ RUN      ] TimeSource.invalid_clock_type_for_sim_time
[ERROR] [1673851685.486597161] [my_node]: use_sim_time parameter can't be true while clocks are not all of RCL_ROS_TIME type
[       OK ] TimeSource.invalid_clock_type_for_sim_time (23 ms)
[----------] 3 tests from TimeSource (66 ms total)

[----------] Global test environment tear-down
[==========] 17 tests from 2 test suites ran. (2777 ms total)
[  PASSED  ] 16 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] TestTimeSource.parameter_activation

 1 FAILED TEST
-- run_test.py: return code 1
-- run_test.py: inject classname prefix into gtest result file '/home/jenkins-agent/workspace/nightly_linux_debug/ws/build/rclcpp/test_results/rclcpp/test_time_source.gtest.xml'
-- run_test.py: verify result file '/home/jenkins-agent/workspace/nightly_linux_debug/ws/build/rclcpp/test_results/rclcpp/test_time_source.gtest.xml'

Additional information

Test flakiness:

Job: nightly_linux_repeated

  • Last 30 builds: 10 builds failed, 33.33% with error
  • Last 7 builds: 5 builds failed, 71.42% with error

Job: nightly_win_deb

  • Last 30 builds: 6 builds failed, 20% with error
  • Last 7 builds: 3 builds failed, 42.85% with error

Job: nightly_win_rep

  • Last 30 builds: 1 builds failed, 3.33% with error
  • Last 7 builds: 1 builds failed, 14.28% with error

Job: Rci__nightly-debug_ubuntu_jammy_amd64

  • Last 30 builds: 2 builds failed, 6.66% with error
  • Last 7 builds: 2 builds failed, 28.57% with error

Job: nightly_linux_release

  • Last 30 builds: 5 builds failed, 16.66% with error
  • Last 7 builds: 4 builds failed, 57.14% with error

Job: nightly_linux_debug

  • Last 30 builds: 6 builds failed, 20% with error
  • Last 7 builds: 4 builds failed, 57.14% with error

Job: nightly_linux_coverage

  • Last 30 builds: 3 builds failed, 10% with error
  • Last 7 builds: 0 builds failed, 0% with error

Job: nightly_linux-aarch64_repeated

  • Last 30 builds: 1 builds failed, 3.33% with error
  • Last 7 builds: 0 builds failed, 0% with error
@fujitatomoya
Copy link
Collaborator

@Crola1702 @clalancette i did not have much time to figure out the root cause what makes this flaky, but i think

time_msg = std::make_shared<builtin_interfaces::msg::Time>(last_msg_set_->clock);

goes negative in second field, and eventually

set_clock(time_msg, true, *it);

Time constructor throws that exception.

I think clocks_state_.enable_ros_time() is called from on_parameter_event that leads to this exception. but i do not really understand why last_msg_set_ goes negative, could be racy condition with clock thread?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants