Use rw_lock to protect mcap metadata lists. #1561

Merged

Conversation

fujitatomoya
Contributor

address #1542

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
@fujitatomoya fujitatomoya requested a review from a team as a code owner February 1, 2024 17:50
@fujitatomoya fujitatomoya requested review from gbiggs and jhdcs and removed request for a team February 1, 2024 17:50
@MichaelOrlov MichaelOrlov requested review from MichaelOrlov and removed request for gbiggs February 2, 2024 03:29
Contributor

@MichaelOrlov MichaelOrlov left a comment

@fujitatomoya Thanks for working on a fix for this race condition.
At first glance, the fix is trivial. However, we can do much better.

  • We don't need shared_mutex, because there is no situation in rosbag2 where the write() method is called from multiple threads, and that is unlikely to change, because:
    1. It is protected by a mutex in the upper-level rosbag2_cpp::writer::write(..) method for the cases when we write without the cache.
    2. We use SingleThreadedExecutor, which processes all subscription callbacks sequentially, one by one, in a single thread.
    3. When the double-buffer cache is enabled, we use another method:
      void MCAPStorage::write(
        const std::vector<std::shared_ptr<const rosbag2_storage::SerializedBagMessage>> & msgs)
      {
        for (const auto & msg : msgs) {
          write(msg);
        }
      }
  • And here we can optimize. It is more efficient to lock the mutex once, before iterating over the vector of messages and calling the write method for each one. Other API calls such as create_topic(..) or remove_topic(..) have lower priority and can wait until we finish dumping the vector of messages from our double-buffer cache to the storage. It would be more elegant to create a new private method that writes one message without locking. We can name it write_lock_free(msg) or write_locked(msg), then lock the mutex in the public write(..) method right before calling the newly created write_lock_free(msg), and in the vector overload right before the for loop:
    void MCAPStorage::write(
      const std::vector<std::shared_ptr<const rosbag2_storage::SerializedBagMessage>> & msgs)
    {
      for (const auto & msg : msgs) {
        write(msg);
      }
    }
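
The locking structure proposed above could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the code merged in this PR: `SketchStorage` and `Message` are hypothetical stand-ins for MCAPStorage and rosbag2_storage::SerializedBagMessage, and `written_count()` exists only so the sketch can be checked. Note that "lock free" in the method name means "does not take the lock itself", not a lock-free algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for rosbag2_storage::SerializedBagMessage.
struct Message
{
  std::string topic_name;
};

// Hypothetical stand-in for MCAPStorage; only the locking layout matters here.
class SketchStorage
{
public:
  // Single-message entry point: take the lock, then delegate.
  void write(std::shared_ptr<const Message> msg)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    write_lock_free(msg);
  }

  // Batch entry point: lock once for the whole vector instead of once per message.
  void write(const std::vector<std::shared_ptr<const Message>> & msgs)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    for (const auto & msg : msgs) {
      write_lock_free(msg);
    }
  }

  // Other API calls contend for the same mutex and wait while a batch is written.
  void create_topic(const std::string & name)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    topics_.push_back(name);
  }

  // Helper added only for this sketch.
  std::size_t written_count()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    return written_;
  }

private:
  // Precondition: the caller already holds mutex_.
  void write_lock_free(const std::shared_ptr<const Message> & msg)
  {
    assert(msg != nullptr);
    ++written_;  // A real storage would serialize the message here.
  }

  std::mutex mutex_;
  std::vector<std::string> topics_;
  std::size_t written_ = 0;
};
```

Keeping the unlocked method private means callers cannot bypass the mutex by accident, and because std::mutex is not recursive, the public overloads must call the private method rather than each other.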

@fujitatomoya
Contributor Author

fujitatomoya commented Feb 2, 2024

@MichaelOrlov thanks!

because there are no situations in rosbag2 when write() method could be called from multiple threads and unlikely would be because

I see, rosbag2_cpp::writer::write protects the call with a mutex lock.

The whole point is rosbag2_cpp::cache::CacheConsumer::exec_consuming(): this one stays outside of that lock mechanism.

We are using SingleThreadedExecutor which is processing all subscription callbacks one by one sequentially in the one thread.

Currently true. My minor concern was that if we support MultiThreadedExecutor in the future, we will need to come back to this discussion. That is probably not a concern; I guess rosbag2 takes a different path and relies on the cache with a dedicated single thread for performance?

It will be more elegant to create a new private method for writing one message without lock. We can name it write_lock_free(msg) or write_locked(msg) and lock mutex in public write(..) method right before calling newly created write_lock_free(msg)

Okay, can do that.

  • Will use std::mutex instead of shared_mutex.
  • Conceal the lock-free implementation in private scope, then control the lock in public scope.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
@fujitatomoya
Contributor Author

@MichaelOrlov requesting another look! thanks!

@MichaelOrlov
Contributor

@fujitatomoya Thanks for the follow-up. As regards:

currently true. minor concern was if we support MultiThreadedExecutor in the future, we will need to come back to this discussion. probably that is not concern, i guess rosbag2 takes the different path to take advantage of cache with dedicated single thread for the performance?

Yes. Currently, using the double-buffer cache gives the same performance gain as a multithreaded executor, but with two threads, and it decouples writing to the storage from the transport layer. This is very important because of possible delays in filesystem or DB operations.
Even if we started using MultiThreadedExecutor for whatever reason in the future, we would still be bound by locking on one std::mutex at the rosbag2_cpp writer(message) layer here:

void Writer::write(std::shared_ptr<const rosbag2_storage::SerializedBagMessage> message)
{
  std::lock_guard<std::mutex> writer_lock(writer_mutex_);
  writer_impl_->write(message);
}

We would need to reconsider and make changes at that level as well.
Switching to shared_mutex in the rosbag2_storage_mcap layer, if it is needed in the future, will not cause difficulties even with backporting, since all the changes are in the .cpp file only and will be API/ABI compatible.
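
For reference, a reader/writer lock of the kind discussed here could be sketched as below. This assumes C++17 and uses a hypothetical `SketchTopicList` class, not the actual MCAPStorage members: readers take a std::shared_lock and may run concurrently, while writers take a std::unique_lock and run exclusively.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <vector>

// Hypothetical metadata list guarded by a reader/writer lock.
class SketchTopicList
{
public:
  // Writers need exclusive access.
  void add(const std::string & name)
  {
    assert(!name.empty());  // Assumption for this sketch: names are non-empty.
    std::unique_lock<std::shared_mutex> lock(mutex_);
    names_.push_back(name);
  }

  // Readers share the lock, so several lookups can run at the same time.
  bool contains(const std::string & name) const
  {
    std::shared_lock<std::shared_mutex> lock(mutex_);
    for (const auto & n : names_) {
      if (n == name) {
        return true;
      }
    }
    return false;
  }

private:
  mutable std::shared_mutex mutex_;  // mutable so const readers can lock it
  std::vector<std::string> names_;
};
```

Because the mutex type is a private implementation detail, swapping std::mutex for std::shared_mutex in a class whose members live behind the .cpp file does not change the public interface, which is the point made above.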

Contributor

@MichaelOrlov MichaelOrlov left a comment

@fujitatomoya Thanks. LGTM with green CI.

@MichaelOrlov MichaelOrlov changed the title use rw_lock to protect mcap metadata lists. Use rw_lock to protect mcap metadata lists. Feb 2, 2024
@MichaelOrlov
Contributor

Gist: https://gist.githubusercontent.com/MichaelOrlov/a06b14d20610f8f8cc8007db7fc61efe/raw/06daf70cbe74c9b1eb6715c512d655cd101f648f/ros2.repos
BUILD args: --packages-above-and-dependencies rosbag2_storage_mcap rosbag2_tests
TEST args: --packages-above rosbag2_storage_mcap rosbag2_tests
ROS Distro: rolling
Job: ci_launcher
ci_launcher ran: https://ci.ros2.org/job/ci_launcher/13233

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@MichaelOrlov MichaelOrlov merged commit 90d1da8 into ros2:rolling Feb 3, 2024
14 checks passed
@MichaelOrlov
Contributor

https://github.com/Mergifyio backport iron humble

mergify bot commented Feb 3, 2024

backport iron humble

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Feb 3, 2024
* use rw_lock to protect mcap metadata lists.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

* introduce MCAPStorage::write_lock_free private method.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

---------

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
(cherry picked from commit 90d1da8)

# Conflicts:
#	rosbag2_storage_mcap/src/mcap_storage.cpp
mergify bot pushed a commit that referenced this pull request Feb 3, 2024
* use rw_lock to protect mcap metadata lists.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

* introduce MCAPStorage::write_lock_free private method.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

---------

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
(cherry picked from commit 90d1da8)

# Conflicts:
#	rosbag2_storage_mcap/src/mcap_storage.cpp
MichaelOrlov added a commit that referenced this pull request Feb 4, 2024
…#1567)

* Use rw_lock to protect mcap metadata lists. (#1561)

* use rw_lock to protect mcap metadata lists.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

* introduce MCAPStorage::write_lock_free private method.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

---------

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
(cherry picked from commit 90d1da8)

# Conflicts:
#	rosbag2_storage_mcap/src/mcap_storage.cpp

* Resolve merge conflicts

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

---------

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
Co-authored-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
Co-authored-by: Michael Orlov <michael.orlov@apex.ai>
MichaelOrlov added a commit that referenced this pull request Feb 5, 2024
…1566)

* Use rw_lock to protect mcap metadata lists. (#1561)

* use rw_lock to protect mcap metadata lists.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

* introduce MCAPStorage::write_lock_free private method.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

---------

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
(cherry picked from commit 90d1da8)

# Conflicts:
#	rosbag2_storage_mcap/src/mcap_storage.cpp

* Resolve merge conflicts

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Suppress warning STL4015: The std::iterator class template is deprecated in C++17

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

---------

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
Co-authored-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
Co-authored-by: Michael Orlov <michael.orlov@apex.ai>
anrp-tri pushed a commit to anrp-tri/rosbag2 that referenced this pull request Feb 6, 2024
) (ros2#1567)

* Use rw_lock to protect mcap metadata lists. (ros2#1561)

* use rw_lock to protect mcap metadata lists.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

* introduce MCAPStorage::write_lock_free private method.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

---------

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
(cherry picked from commit 90d1da8)

# Conflicts:
#	rosbag2_storage_mcap/src/mcap_storage.cpp

* Resolve merge conflicts

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

---------

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
Co-authored-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
Co-authored-by: Michael Orlov <michael.orlov@apex.ai>
@ros-discourse
This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/ros-2-meeting-minutes-2024-02-15/36221/1

emersonknapp added a commit that referenced this pull request Jun 10, 2024
* Link and compile against rosbag2_storage_mcap: Fixed issue 1492 (#1496) (#1498)

Signed-off-by: Alejandro Hernández Cordero <ahcorde@gmail.com>
(cherry picked from commit 7fcb703)

Co-authored-by: Alejandro Hernández Cordero <ahcorde@gmail.com>

* [humble] Bugfix for incorrect playback rate changes when pressing buttons (backport #1513) (#1515)

* Bugfix for incorrect playback rate changes when pressing buttons (#1513)

- Playback rate expected to be changed by 10% with each
increase/decrease step.
- Use +0.1 and -0.1 in decrease/increase rate formula instead of
multiply by factor of the 1.1 and 0.9 respectively.

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
(cherry picked from commit 95f78b6)

# Conflicts:
#	rosbag2_transport/src/rosbag2_transport/player.cpp

* Address merge conflicts after auto-backporting

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

---------

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
Co-authored-by: Michael Orlov <michael.orlov@apex.ai>

* call cv.wait_until only if necessary. (#1521) (#1523)

* call cv.wait_until only if necessary.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

* add comment to avoid extra delay for performance.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

---------

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
(cherry picked from commit a16704b)

Co-authored-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

* [humble] Install signal handlers in recorder only inside record method (backport #1464) (#1526)

* Install signal handlers in recorder only inside record method (#1464)

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
(cherry picked from commit 195e406)

# Conflicts:
#	rosbag2_py/src/rosbag2_py/_transport.cpp

* Address merge conflicts

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

---------

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
Co-authored-by: Michael Orlov <michael.orlov@apex.ai>

* [humble] `Recording stopped` prints only once. (backport #1530) (#1535)

* `Recording stopped` prints only once. (#1530)

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
(cherry picked from commit 73b0772)

# Conflicts:
#	rosbag2_transport/src/rosbag2_transport/recorder.cpp

* Address merge conflicts

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

---------

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
Co-authored-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
Co-authored-by: Michael Orlov <michael.orlov@apex.ai>

* [humble] Give proper log message for `--start-paused` (backport #1537) (#1541)

* Add proper message for --start-paused (#1537)

Signed-off-by: Christoph Froehlich <christoph.froehlich@ait.ac.at>
(cherry picked from commit 317286c)

# Conflicts:
#	rosbag2_transport/src/rosbag2_transport/recorder.cpp

* Address merge conflicts

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

---------

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
Co-authored-by: Christoph Fröhlich <christophfroehlich@users.noreply.github.com>
Co-authored-by: Michael Orlov <michael.orlov@apex.ai>

* 0.15.9 (#1551)

* 0.15.9

Signed-off-by: Audrow Nash <audrow@intrinsic.ai>

* Update rosbag2_transport/CHANGELOG.rst

Co-authored-by: Michael Orlov <michael.orlov@apex.ai>
Signed-off-by: Audrow Nash <audrow@intrinsic.ai>

---------

Signed-off-by: Audrow Nash <audrow@intrinsic.ai>
Co-authored-by: Michael Orlov <michael.orlov@apex.ai>

* [humble] Add default initialization for CompressionOptions (backport #1539) (#1546)

* Add default initialization for CompressionOptions (#1539)

* feat: add sane defaults for CompressionOptions

Signed-off-by: Arne Böckmann <a.boeckmann@cellumation.com>

* Update rosbag2_compression/include/rosbag2_compression/compression_options.hpp

Co-authored-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
Signed-off-by: Arne Böckmann <a.boeckmann@cellumation.com>

---------

Signed-off-by: Arne Böckmann <a.boeckmann@cellumation.com>
Co-authored-by: Arne Böckmann <a.boeckmann@cellumation.com>
Co-authored-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
(cherry picked from commit 931bf54)

# Conflicts:
#	rosbag2_compression/include/rosbag2_compression/compression_options.hpp

* Address merge conflicts

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

---------

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
Co-authored-by: Arne B <arne@rnae.de>
Co-authored-by: Michael Orlov <michael.orlov@apex.ai>

* Fix/zstd vendor does not find system zstd (#1111) (#1560)

cmake did not find the Findzstd.cmake in cmake/Modules

Since we cannot use pkg-config for some Windows issues, the parsing of
the version is done by looking for the string in zstd.h.

Signed-off-by: Matthias Schoepfer <m.schoepfer@rethinkrobotics.com>
(cherry picked from commit e7e7269)

Co-authored-by: DasRoteSkelett <matthias.schoepfer@googlemail.com>

* [humble] Use rw_lock to protect mcap metadata lists. (backport #1561) (#1567)

* Use rw_lock to protect mcap metadata lists. (#1561)

* use rw_lock to protect mcap metadata lists.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

* introduce MCAPStorage::write_lock_free private method.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

---------

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
(cherry picked from commit 90d1da8)

# Conflicts:
#	rosbag2_storage_mcap/src/mcap_storage.cpp

* Resolve merge conflicts

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

---------

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
Co-authored-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
Co-authored-by: Michael Orlov <michael.orlov@apex.ai>

* Add /bigobj to MSVC compiles. (#1571)

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

* Fix split by time. (backport #1022) (#1616)

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

* [humble] Add BagSplitInfo service call on bag close (backport #1422) (#1637)

* Add BagSplitInfo service call on bag close (#1422)

- Note: The `BagSplitInfo::opened_file` will have empty string to
indicate that it was "bag close" and not bag split event.

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
(cherry picked from commit ba199d0)

# Conflicts:
#	rosbag2_cpp/test/rosbag2_cpp/test_sequential_writer.cpp

* Fix merge conflicts

- Ensure that writer_ is destructed before intercepted fake_metadata_

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

---------

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
Co-authored-by: Michael Orlov <michael.orlov@apex.ai>

* [Humble] Resolve recording option problem (backport #1649) (#1651)

* Resolve recording option problem (#1649)

Signed-off-by: Barry Xu <barry.xu@sony.com>
(cherry picked from commit 4914ab3)

# Conflicts:
#	ros2bag/ros2bag/verb/record.py

* Fix cherry-pick conflicts for mergify/bp/humble/pr-1649 (#1652)

Signed-off-by: Barry Xu <barry.xu@sony.com>

---------

Signed-off-by: Barry Xu <barry.xu@sony.com>
Co-authored-by: Barry Xu <barry.xu@sony.com>

* [humble] Add --log-level to ros2 bag play and record (#1655)

* Add --log-level to ros2 bag play and record

Co-authored-by: Michael Orlov <michael.orlov@apex.ai>
Signed-off-by: Roman Sokolkov <rsokolkov@gmail.com>

* Fix missing import

Signed-off-by: Roman Sokolkov <rsokolkov@gmail.com>

---------

Signed-off-by: Roman Sokolkov <rsokolkov@gmail.com>
Co-authored-by: Michael Orlov <michael.orlov@apex.ai>

* [humble] Bugfix for writer not being able to open again after closing (backport #1599) (#1653)

* [iron] Bugfix for writer not being able to open again after closing (backport #1599) (#1635)

* re-applies fixes from #1590 to rolling. Also removes new message definition in sequential writer test for multiple open operations. Also clears topic_names_to_message_definitions_ and handles message_definitions_s underlying container similarly. Lastly, also avoids reset of factory in the compression writer, adds unit test there too.

Signed-off-by: Yannick Schulz <yschulz854@gmail.com>
Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* removes unused compressor_ member from compresser writer class. Also delegates rest of the closing behavior to the base class in close method, as it is handled in the open and write methods of the compression writer

Signed-off-by: Yannick Schulz <yschulz854@gmail.com>

* Remove unrelated delta

- message_definitions_ was intentionally allocated on the stack and
should persist between writer close() and open() because it represents
cache for message definitions which is not changes.

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Don't call virtual methods from destructors

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Cleanup 'rosbag2_directory_next' after the test run

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Protect Writer::open(..) and Writer::close() with mutex on upper level

- Rationale: To avoid race conditions if open(..) and close() could be
ever be called from different threads.

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Bugfix for WRITE_SPLIT callback not called for the last compressed file

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Bugfix for lost messages from cache when closing compression writer

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Address build failure by using rcpputils::fs instead of std::filesystem

- Note: On Iron we haven't migrated to the std::filesystem and using
rcpputils::fs

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Adopt failing 'open_succeeds_twice' test for Iron

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Return from writer's open() immediately if storage already open

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

---------

Signed-off-by: Yannick Schulz <yschulz854@gmail.com>
Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
Co-authored-by: Yannick Schulz <yschulz854@gmail.com>
(cherry picked from commit a360d9b)

# Conflicts:
#	rosbag2_compression/src/rosbag2_compression/sequential_compression_writer.cpp
#	rosbag2_cpp/src/rosbag2_cpp/writer.cpp
#	rosbag2_cpp/src/rosbag2_cpp/writers/sequential_writer.cpp

* Address merge conflicts

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Fix for segfault in open_twice test

- Ensure that writer_ is destructed before intercepted fake_metadata_

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Fix for "open_succeeds_twice" test failure on second run

- Use std::filesystem for temp files and folders operation. For some
reason rcpputils::fs::delete_all(folder_name) wasn't able to delete temp
folder with subfolders.

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Adopt changes in TestRosbag2CPPAPI::minimal_writer_example for humble

- The `serialized_msg2` is not owning the serialized data after the
first call writer.write(serialized_msg2,..). i.e. need to use another
message or another API in test. This is not a bug - this is by design.

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

---------

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
Co-authored-by: Michael Orlov <michael.orlov@apex.ai>

---------

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
Signed-off-by: Audrow Nash <audrow@intrinsic.ai>
Signed-off-by: Chris Lalancette <clalancette@gmail.com>
Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
Signed-off-by: Barry Xu <barry.xu@sony.com>
Signed-off-by: Roman Sokolkov <rsokolkov@gmail.com>
Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Alejandro Hernández Cordero <ahcorde@gmail.com>
Co-authored-by: Michael Orlov <michael.orlov@apex.ai>
Co-authored-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
Co-authored-by: Christoph Fröhlich <christophfroehlich@users.noreply.github.com>
Co-authored-by: Audrow Nash <audrow@openrobotics.org>
Co-authored-by: Arne B <arne@rnae.de>
Co-authored-by: DasRoteSkelett <matthias.schoepfer@googlemail.com>
Co-authored-by: Chris Lalancette <clalancette@gmail.com>
Co-authored-by: Barry Xu <barry.xu@sony.com>
Co-authored-by: Roman <rsokolkov@gmail.com>