[BUG] Flaky CI: test_resource_change_notifier timeout + test_hybrid_suppression SIGSEGV on Humble #344
Description
Bug report
Two flaky tests on main branch CI.
Test 1: test_resource_change_notifier - 60s timeout hang
Steps to reproduce:
- Run test_resource_change_notifier under CPU pressure (CI runners)
- Test MultipleSubscribersAllCalled hangs until CTest 60s timeout
- No result XML generated ("missing_result")
Expected behavior: All 16 tests pass in under 5 seconds.
Actual behavior: Binary hangs on MultipleSubscribersAllCalled, killed by CTest timeout.
Root cause: In every test, ResourceChangeNotifier notifier is declared first, so it is destroyed last, after the std::promise/std::atomic variables that the worker thread's callbacks reference. When future.wait_for(2s) times out under CI load, the test returns and destroys the promise while the worker thread may still be calling set_value() on it, which is undefined behavior (observed as a hang in corrupted promise internals). The notifier destructor then calls join(), which blocks forever.
Observed on: Rolling (run 23909922091, 2026-04-02)
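The ordering issue can be illustrated with a minimal sketch that uses only the standard library. ToyNotifier, CallCountProbe and calls_seen_before_state_teardown are hypothetical names invented for this illustration; ToyNotifier only assumes that the real notifier runs callbacks on a worker thread and joins it in its destructor. The sketch shows the safe order: locals are destroyed in reverse declaration order, so declaring the notifier last means its destructor joins the worker before the promise and counter go away, even when the wait times out.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <future>
#include <thread>
#include <utility>

// Hypothetical stand-in for ResourceChangeNotifier (not the real class):
// a worker thread invokes the callback after a delay, the destructor joins.
class ToyNotifier {
public:
  explicit ToyNotifier(std::function<void()> callback)
  : worker_([cb = std::move(callback)] {
      // Simulate a callback that is delivered late, e.g. under CPU pressure.
      std::this_thread::sleep_for(std::chrono::milliseconds(50));
      cb();
    }) {}

  ~ToyNotifier() {
    if (worker_.joinable()) {
      worker_.join();  // waits until the callback has finished
    }
  }

  ToyNotifier(const ToyNotifier &) = delete;
  ToyNotifier & operator=(const ToyNotifier &) = delete;

private:
  std::thread worker_;
};

// Helper for the illustration only: records the call count at the moment
// this object is destroyed.
struct CallCountProbe {
  std::atomic<int> & calls;
  int & observed;
  ~CallCountProbe() { observed = calls.load(); }
};

int calls_seen_before_state_teardown() {
  int observed = -1;
  {
    // Shared state first: destroyed last.
    std::promise<void> done;
    std::future<void> ready = done.get_future();
    std::atomic<int> calls{0};
    CallCountProbe probe{calls, observed};

    // Notifier last: destroyed first, so the worker is joined while the
    // promise and counter are still alive.
    ToyNotifier notifier([&] {
      calls.fetch_add(1);
      done.set_value();
    });

    // A short wait that will usually time out, like wait_for(2s) on a
    // loaded CI runner. The result is ignored on purpose.
    (void)ready.wait_for(std::chrono::milliseconds(1));
  }  // ~ToyNotifier joins, then probe, calls, ready and done are destroyed
  return observed;
}
```

With this order the function returns 1 whether or not the wait timed out, because the join happens before any shared state is torn down. Moving the notifier declaration above the promise would recreate the hazard described in the root cause.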
Test 2: test_hybrid_suppression - SIGSEGV on Humble
Steps to reproduce:
- Run test_hybrid_suppression integration test on Humble
- Demo nodes crash with SIGSEGV (exit code -11) during SIGINT shutdown
- test_exit_codes fails because -11 is not in ALLOWED_EXIT_CODES
Expected behavior: All demo nodes exit cleanly with 0, SIGINT, or SIGTERM.
Actual behavior: Random demo nodes (brake_actuator, brake_pressure_sensor) crash with SIGSEGV.
Root cause: Two combined issues: (A) BrakeActuator and LightController have subscriptions with this-capturing callbacks but destructors don't reset subscriptions before member destruction. (B) Humble-specific rclcpp::spin() teardown race - DDS callbacks fire on partially-destroyed nodes during SIGINT shutdown.
Observed on: Humble only (runs 23909922091, 23895290725, 23708608625)
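Issue (A) can be modelled without ROS as a toy sketch. ToyBus, ToySubscription, ToyActuator and deliveries_after_teardown are hypothetical names; none of this is rclcpp API, and the model is single-threaded, so it only illustrates the ordering argument rather than reproducing the crash. Members are destroyed in reverse declaration order, so a subscription declared before the state its callback writes to would outlive that state unless the destructor drops it explicitly first.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the middleware: stores callbacks and invokes
// them on publish().
class ToyBus {
public:
  using Callback = std::function<void(const std::string &)>;

  int add(Callback cb) {
    const int id = next_id_++;
    callbacks_[id] = std::move(cb);
    return id;
  }
  void remove(int id) { callbacks_.erase(id); }
  void publish(const std::string & msg) {
    for (auto & entry : callbacks_) {
      entry.second(msg);
    }
  }

private:
  int next_id_ = 0;
  std::map<int, Callback> callbacks_;
};

// RAII handle that unregisters its callback on destruction.
class ToySubscription {
public:
  ToySubscription(ToyBus & bus, ToyBus::Callback cb)
  : bus_(bus), id_(bus.add(std::move(cb))) {}
  ~ToySubscription() { bus_.remove(id_); }
  ToySubscription(const ToySubscription &) = delete;
  ToySubscription & operator=(const ToySubscription &) = delete;

private:
  ToyBus & bus_;
  int id_;
};

// Toy node with a this-capturing callback, loosely modelled on the
// description of BrakeActuator.
class ToyActuator {
public:
  ToyActuator(ToyBus & bus, int & delivered)
  : delivered_(delivered),
    sub_(std::make_unique<ToySubscription>(bus, [this](const std::string & msg) {
      last_command_ = msg;
      ++delivered_;
    })) {}

  ~ToyActuator() {
    // Drop the subscription before any member is destroyed, so the callback
    // can no longer touch last_command_ during teardown.
    sub_.reset();
  }

private:
  int & delivered_;
  std::unique_ptr<ToySubscription> sub_;  // declared before last_command_
  std::string last_command_;              // would be destroyed before sub_
};

int deliveries_after_teardown() {
  ToyBus bus;
  int delivered = 0;
  {
    ToyActuator node(bus, delivered);
    bus.publish("brake");  // reaches the live node
  }
  bus.publish("late");     // node is gone, nothing is invoked
  return delivered;
}
```

In this model deliveries_after_teardown() returns 1: only the message published while the node was alive is delivered. Whether resetting the subscription alone is sufficient on Humble is a separate question, since issue (B) concerns the executor teardown race.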
Environment
- ros2_medkit version: main (latest)
- ROS 2 distro: Rolling (test 1), Humble (test 2)
- OS: Ubuntu Noble / Jammy (GitHub Actions)
Fix plan
- Test 1: Reorder declarations in all tests so notifier is declared after shared state (destroyed first)
- Test 2: Fix demo node destructors (reset subscriptions/timers) + restructure main() to destroy node before rclcpp::shutdown()