Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pause/resume options to the bag recorder #905

Merged
merged 10 commits into from
Nov 24, 2021

Conversation

ivanpauno
Copy link
Member

Implements half of #558: ability to pause/resume recording.

Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>
@ivanpauno ivanpauno added the enhancement New feature or request label Nov 5, 2021
@ivanpauno ivanpauno self-assigned this Nov 5, 2021
@ivanpauno ivanpauno requested a review from a team as a code owner November 5, 2021 16:53
@ivanpauno ivanpauno requested review from emersonknapp and MichaelOrlov and removed request for a team November 5, 2021 16:53
Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>
Copy link
Contributor

@MichaelOrlov MichaelOrlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ivanpauno Thanks for your contribution.
I think that paused_ defined as atomic<int> is overkill and I don't see a real use case when it could be justified to use. I would propose to change it to the std::atomic<bool> or std::atomic_bool.

I would hide toggle_paused() API as private method unless you need it outside. e.g. in Rqtbag. It's sort of ambiguous and difficult to determine an outcome from it and would be preferable to use pause resume explicitly outside of the Rosbag2.

I am also worry about recording termination by CRL+C keyboard binding. As far as I understand, before start using KeyboardHandler in recording SIGINT = CTRL+C was handling somewhere inside Node class and then re-translating to the SIGTERM and we were catching and processing SIGTERM in _transport.cpp.
However KeyboardHandler override SIGINT handler on construction and it looks like now rclcpp::shutdown(); will not be called and we will abnormally exit from process without properly finishing recording.

rosbag2_transport/include/rosbag2_transport/recorder.hpp Outdated Show resolved Hide resolved
rosbag2_transport/src/rosbag2_transport/recorder.cpp Outdated Show resolved Hide resolved
rosbag2_transport/src/rosbag2_transport/recorder.cpp Outdated Show resolved Hide resolved
rosbag2_transport/include/rosbag2_transport/recorder.hpp Outdated Show resolved Hide resolved
@ivanpauno
Copy link
Member Author

I am also worry about recording termination by CRL+C keyboard binding. As far as I understand, before start using KeyboardHandler in recording SIGINT = CTRL+C was handling somewhere inside Node class and then re-translating to the SIGTERM and we were catching and processing SIGTERM in _transport.cpp.
However KeyboardHandler override SIGINT handler on construction and it looks like now rclcpp::shutdown(); will not be called and we will abnormally exit from process without properly finishing recording.

Interesting, I didn't thought about signal handling.
I will take another look to that.

Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>
Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>
Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>
@ivanpauno
Copy link
Member Author

I would hide toggle_paused() API as private method unless you need it outside. e.g. in Rqtbag. It's sort of ambiguous and difficult to determine an outcome from it and would be preferable to use pause resume explicitly outside of the Rosbag2.

I think it makes sense to offer the same API that the Player class uses.

@ivanpauno
Copy link
Member Author

I am also worry about recording termination by CRL+C keyboard binding. As far as I understand, before start using KeyboardHandler in recording SIGINT = CTRL+C was handling somewhere inside Node class and then re-translating to the SIGTERM and we were catching and processing SIGTERM in _transport.cpp.

SIGINT isn't being retranslated to sigterm, the default rclcpp sigint signal handler is calling shutdown() asynchronously.

However KeyboardHandler override SIGINT handler on construction and it looks like now rclcpp::shutdown(); will not be called and we will abnormally exit from process without properly finishing recording.

Yes, this is an issue.
IMO KeyboardHandler should have an option to disable installing the signal handler, or it should be using signal chaining by default.
I don't like signal chaining though.

I will open a PR in the keyboard_handler repo and link it here.

@MichaelOrlov
Copy link
Contributor

Yes, this is an issue. IMO KeyboardHandler should have an option to disable installing the signal handler, or it should be using signal chaining by default. I don't like signal chaining though.

I will open a PR in the keyboard_handler repo and link it here.

I think chaining signal handler would be a valid workaround. I can help with PR for KeyboardHandler

@MichaelOrlov
Copy link
Contributor

@ivanpauno Wouldn't you mind to add at least one unit test to the https://github.com/ros2/rosbag2/blob/master/rosbag2_tests/test/rosbag2_tests/test_rosbag2_record_end_to_end.cpp
to cover new --start-paused command line option?
Ideally it would be nice to have one more test for pause/resume via simulating keyboard press. Similar to how it was done for the Player class. I hope it would be possible to create some mock writer to check in test if writer_->write(..) has been called.

@MichaelOrlov
Copy link
Contributor

@ivanpauno If you wouldn't mind I would suggest that I will take over issue relating to the signal handler in KeyboardHandler ros-tooling/keyboard_handler#9 and you will focus on unit tests for the pause/resume options in bag recorder.
What do you think?

@ivanpauno
Copy link
Member Author

@ivanpauno If you wouldn't mind I would suggest that I will take over issue relating to the signal handler in KeyboardHandler ros-tooling/keyboard_handler#9 and you will focus on unit tests for the pause/resume options in bag recorder.
What do you think?

That sounds fine to me!
I've left a comment in the issue.

…resume toggle key

Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>
Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>
Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>
@ivanpauno
Copy link
Member Author

Ideally it would be nice to have one more test for pause/resume via simulating keyboard press. Similar to how it was done for the Player class. I hope it would be possible to create some mock writer to check in test if writer_->write(..) has been called.

Test added in b6c35a6.

Copy link
Collaborator

@emersonknapp emersonknapp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API looks good to me - I like a match with the Player. Sounds like you have the shutdown issue under control

@MichaelOrlov
Copy link
Contributor

@emersonknapp @ivanpauno I think we shouldn't merge this PR untill ros-tooling/keyboard_handler#10 will be merged, otherwise recording will be broken.

Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>
@ivanpauno
Copy link
Member Author

@emersonknapp @ivanpauno I think we shouldn't merge this PR untill ros-tooling/keyboard_handler#10 will be merged, otherwise recording will be broken.

Agreed, I will also have to update this PR to use the new signal handling options there.

@MichaelOrlov
Copy link
Contributor

Agreed, I will also have to update this PR to use the new signal handling options there.

You can, but you don't have to.
The correct CTRL+C handling will be correctly processed by default via calling original signal handler from rclcpp context after calling KeyboardHandler signal handler.

@ivanpauno
Copy link
Member Author

The correct CTRL+C handling will be correctly processed by default via calling original signal handler from rclcpp context after calling KeyboardHandler signal handler.

That's right, I forgot about that.
It seems that we only need to wait until that PR gets merged and then rerun CI here.

@MichaelOrlov
Copy link
Contributor

@ivanpauno I see you have added unit test to cover keyboard press handling for pause/resume during recording.
However I don't see coverage for the new --start-paused command line option.
Could you please add one more test to the
https://github.com/ros2/rosbag2/blob/master/rosbag2_tests/test/rosbag2_tests/test_rosbag2_record_end_to_end.cpp

Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>
@ivanpauno
Copy link
Member Author

Could you please add one more test to the
https://github.com/ros2/rosbag2/blob/master/rosbag2_tests/test/rosbag2_tests/test_rosbag2_record_end_to_end.cpp

I'm having issues when running those tests locally, they always timeout even when using the master branch.
Do you have any idea what could be happening?

Example output

[==========] Running 16 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 16 tests from RecordFixture
[ RUN ] RecordFixture.record_end_to_end_test_with_zstd_file_compression
/home/ivanpauno/ros2_ws/src/ros2/rosbag2/rosbag2_tests/test/rosbag2_tests/test_rosbag2_record_end_to_end.cpp:86: Failure
Value of: pub_manager.wait_for_matched(topic_name)
Actual: false
Expected: true
Expected find rosbag subscription

I have added a ros2 bag record --start-paused test in b11614f.
It's at least building locally, but IDK if it's passing.

@ivanpauno
Copy link
Member Author

Rpr checker failure seems unrelated.

CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@emersonknapp
Copy link
Collaborator

emersonknapp commented Nov 23, 2021

The end_to_end tests are marked xfail and therefore nothing is enforcing that they keep passing.

However - they are passing locally for me when built against master branch - I'm not sure what the issue is from the information you've provided.

Are you building all of ros2 up to rosbag2, or are you building against something like Rolling?

@ivanpauno
Copy link
Member Author

Are you building all of ros2 up to rosbag2

Yes, I'm doing that.
I have pulled and rebuilt everything. I might have a random issue in my workspace, I will try a clean rebuild.

The end_to_end tests are marked xfail and therefore nothing is enforcing that they keep passing.

I have checked the output of the Test rosbag2 / build_and_test (pull_request) check in the Run xfail tests section, and the test I added is passing.

output
1: [ RUN      ] RecordFixture.record_end_to_end_test_start_paused
1: stdin is not a terminal device. Keyboard handling disabled.[INFO] [1637682700.124453840] [rosbag2_recorder]: Press SPACE for pausing/resuming
1: [INFO] [1637682700.127076721] [rosbag2_storage]: Opened database '/tmp/tmp_test_dir_ZUrO9M/record_end_to_end_test_start_paused/record_end_to_end_test_start_paused_0.db3' for READ_WRITE.
1: [INFO] [1637682700.127164920] [rosbag2_recorder]: Listening for topics...
1: [INFO] [1637682700.131385888] [rosbag2_recorder]: Subscribed to topic '/test_topic'
1: [INFO] [1637682700.133184375] [rosbag2_recorder]: All requested topics are subscribed. Stopping discovery...
1: [INFO] [1637682700.687889545] [rclcpp]: signal_handler(signum=2)
1: [INFO] [1637682700.913097269] [rosbag2_storage]: Opened database '/tmp/tmp_test_dir_ZUrO9M/record_end_to_end_test_start_paused/record_end_to_end_test_start_paused_0.db3' for READ_ONLY.
1: [       OK ] RecordFixture.record_end_to_end_test_start_paused (1262 ms)

@emersonknapp
Copy link
Collaborator

I'm inclined to move forward, given that everything is green.

@ivanpauno
Copy link
Member Author

I'm inclined to move forward, given that everything is green.

Agreed, I think it's ready to go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants