Address flakiness in rosbag2_play_end_to_end tests #1297
Gist: https://gist.githubusercontent.com/MichaelOrlov/001cdcfa299a0cd40badfb30fae6aa8c/raw/c737f0f4f0068f00fccb96bf909caf35cb4fb722/ros2.repos |
- Change QoS depth in test databases to correspond to the number of messages
- Change QoS durability to transient local in test DB and mcap file
- Explicitly specify QoS depth = 10 for subscribers
- Explicitly specify QoS reliability to reliable for subscribers
- Explicitly specify QoS durability to transient local for subscribers
- Update metadata in test DB and mcap files to the latest version (7)
- Remove xfail test_rosbag2_play_end_to_end

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
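To illustrate why the depth and durability changes in the commit above matter for determinism, here is a toy model in plain C++. It is not rclcpp or DDS code; `ToyPublisher`, `Durability`, and `replay_for_late_joiner` are hypothetical names made up for this sketch. It only models the general idea that a transient local publisher keeps its last `depth` samples for subscribers that join late, so a depth smaller than the number of messages in the bag can lose early messages.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Toy model only: these types are hypothetical and do not exist in ROS 2.
enum class Durability { Volatile, TransientLocal };

class ToyPublisher
{
public:
  ToyPublisher(std::size_t depth, Durability durability)
  : depth_(depth), durability_(durability) {}

  // Keep at most `depth_` most recent messages, dropping the oldest first.
  void publish(std::string msg)
  {
    history_.push_back(std::move(msg));
    if (history_.size() > depth_) {
      history_.pop_front();
    }
  }

  // Messages a subscriber that joins after publishing would receive.
  // Both sides must be transient local; the subscriber depth caps the count.
  std::vector<std::string> replay_for_late_joiner(
    std::size_t subscriber_depth, Durability subscriber_durability) const
  {
    if (durability_ != Durability::TransientLocal ||
      subscriber_durability != Durability::TransientLocal)
    {
      return {};
    }
    const std::size_t count = std::min(history_.size(), subscriber_depth);
    return std::vector<std::string>(history_.end() - count, history_.end());
  }

private:
  std::size_t depth_;
  Durability durability_;
  std::deque<std::string> history_;
};
```

In this model, a publisher with depth 3 that has published 3 messages hands all 3 to a late subscriber with depth 10, while a publisher with depth 1 hands over only the last one.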
- Remove xfail for test_rosbag2_info_end_to_end Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
3cebc12 to a41482e
Re-run CI with |
@ros-pull-request-builder retest this please |
Re-run CI with --retest-until-fail 5 after fixing and enabling |
93b546b to 31c07f0
- Uncomment play_filters_by_topic test
- Use proper qos settings for subscribers in `play_filters_by_topic` and fix expectations about number of published messages.
- Log warning if SubscriptionManager::continue_spinning(..) finished by timeout.
- Enable `play_end_to_end_test` for windows.

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
- Start player in pause mode and wait on subscribers for matched publishers from player then send resume service call to unpause.
- Add spin_and_wait_for_matched(topic_names) for SubscriptionManager

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
31c07f0 to bc48555
I've left a few pieces of minor feedback.
rosbag2_test_common/include/rosbag2_test_common/subscription_manager.hpp
rosbag2_tests/test/rosbag2_tests/test_rosbag2_play_end_to_end.cpp
Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
…anager.hpp Co-authored-by: Chris Lalancette <clalancette@gmail.com> Signed-off-by: Michael Orlov <michael.orlov@apex.ai> Co-authored-by: Chris Lalancette <clalancette@gmail.com>
Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
Re-run CI with --retest-until-fail 5 after review. BUILD args: --packages-above-and-dependencies rosbag2_test_common rosbag2_tests |
@clalancette Fair warning about our CI.
- Added wait_until_completion(process_id, timeout) helper function Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
Re-run CI with BUILD args: --packages-above-and-dependencies rosbag2_test_common rosbag2_tests |
@clalancette FYI I have fixed race conditions in the process termination routines. They were causing intermittent test failures because a SIGINT signal was sent when the process was already in the destruction phase.
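For readers unfamiliar with the pattern, a bounded wait for a child process on POSIX can be sketched as below. This is an illustration under assumptions, not the code from this PR: `wait_until_completion_sketch` is a hypothetical name, and it assumes the caller is the parent of `pid`. It polls `waitpid` with `WNOHANG` until the child has been reaped or the timeout expires.

```cpp
#include <sys/types.h>
#include <sys/wait.h>

#include <cerrno>
#include <chrono>
#include <thread>

// Hypothetical sketch: returns true if the child `pid` terminated and was
// reaped within `timeout`, false on timeout or on a waitpid error.
bool wait_until_completion_sketch(pid_t pid, std::chrono::milliseconds timeout)
{
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  for (;; ) {
    int status = 0;
    // WNOHANG makes waitpid return 0 immediately if the child is still running.
    const pid_t ret = waitpid(pid, &status, WNOHANG);
    if (ret == pid) {
      return true;  // Child has exited and been reaped.
    }
    if (ret == -1 && errno != EINTR) {
      return false;  // For example, no such child.
    }
    if (std::chrono::steady_clock::now() >= deadline) {
      return false;  // Still running after the timeout.
    }
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
  }
}
```

Waiting like this after sending a stop signal gives the caller a definite answer about whether the process is gone, rather than assuming it still exists when the next signal is sent.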
It looks like this probably needs some PUBLIC annotations for Windows CI to pass. |
…rror Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
@clalancette Nope. It's a bit more straightforward. The Windows build was failing because I started using wait_until_completion inside stop_execution, and the former was declared after the latter. I made a fix in the new commit. Re-running CI for Windows.
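The declaration-order issue described above is general C++ behavior: a free function must be declared before the point where it is called. A minimal sketch with hypothetical names (`wait_done` and `stop` stand in for the real helpers and are not the PR's code):

```cpp
// Declaring the callee first makes the name visible to the caller below.
// Without this line (or without moving the definition up), the call inside
// stop() would not compile.
bool wait_done(int handle);

// Caller: uses wait_done before its definition appears in the file.
bool stop(int handle)
{
  return wait_done(handle);
}

// Definition of the callee. In this toy, any non-negative handle is "done".
bool wait_done(int handle)
{
  return handle >= 0;
}
```

The fix described in the comment moved the definition ahead of its caller; a forward declaration as shown here is the other common way to get the same result.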
@clalancette This PR has ABI and API changes in the test framework, but I would like to backport it to the Iron branch if you don't mind. It will help us have a more stable CI for Iron.
Yeah, that should be fine here. |
Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
c7beaef to 3966c81
CI failed on Windows in the
Line
This means the service call timed out while waiting for a response. The timeout is defined there as 4 seconds, which might not be enough on Windows.
Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
7a697a0 to 0e5ab52
@Mergifyio backport iron
✅ Backports have been created
* Make test_rosbag2_play_end_to_end more deterministic
- Change QoS depth in test databases to correspond to the number of messages
- Change QoS durability to transient local in test DB and mcap file
- Explicitly specify QoS depth = 10 for subscribers
- Explicitly specify QoS reliability to reliable for subscribers
- Explicitly specify QoS durability to transient local for subscribers
- Update metadata in test DB and mcap files to the latest version (7)
- Remove xfail test_rosbag2_play_end_to_end
Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Add wait_for_matched for record_end_to_end_exits_gracefully_on_sigterm
- Remove xfail for test_rosbag2_info_end_to_end
Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Fix for play_filters_by_topic test
- Uncomment play_filters_by_topic test
- Use proper qos settings for subscribers in `play_filters_by_topic` and fix expectations about number of published messages.
- Log warning if SubscriptionManager::continue_spinning(..) finished by timeout.
- Enable `play_end_to_end_test` for windows.
Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Make test_rosbag2_play_end_to_end deterministic
- Start player in pause mode and wait on subscribers for matched publishers from player then send resume service call to unpause.
- Add spin_and_wait_for_matched(topic_names) for SubscriptionManager
Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Remove redundant includes from test_rosbag2_play_end_to_end.cpp
Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Sleep for a few milliseconds in SubscriptionManager to avoid busy loop
Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Update rosbag2_test_common/include/rosbag2_test_common/subscription_manager.hpp
Co-authored-by: Chris Lalancette <clalancette@gmail.com>
Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
Co-authored-by: Chris Lalancette <clalancette@gmail.com>

* Add missing include<thread> in process_execution_helpers_unix.hpp
Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Address race condition in process termination routines
- Added wait_until_completion(process_id, timeout) helper function
Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Move wait_until_completion before stop_execution to fix compilation error
Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Fix for Windows build error. Rename process_id to handle.
Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Increase timeout for service call and wait_until_completion up to 10 sec
Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

---------
Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
Co-authored-by: Chris Lalancette <clalancette@gmail.com>
(cherry picked from commit af4ca0c)
This pull request has been mentioned on ROS Discourse. There might be relevant details there: https://discourse.ros.org/t/ros-2-tsc-meeting-minutes-2023-05-18/31587/1 |
Make test_rosbag2_play_end_to_end more deterministic
- Start player in pause mode and wait on subscribers for matched publishers from player then send resume service call to unpause.

Add wait_for_matched for record_end_to_end_exits_gracefully_on_sigterm

Fix for play_filters_by_topic test
- Use proper qos settings for subscribers in `play_filters_by_topic` and fix expectations about number of published messages.
- Log warning if SubscriptionManager::continue_spinning(..) finished by timeout.
- Enable `play_end_to_end_test` for windows.

Note: Didn't remove xfail for test_rosbag2_record_end_to_end because the tests fail on the Windows build due to the absence of console control handling for `CTRL_C_EVENT` and `CTRL_SHUTDOWN_EVENT` events. Will address the `test_rosbag2_record_end_to_end` tests in follow-up PRs.