
Use rw_lock to protect mcap metadata lists. #1561

Merged

Conversation

@fujitatomoya (Contributor)

address #1542

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
@fujitatomoya fujitatomoya requested a review from a team as a code owner February 1, 2024 17:50
@fujitatomoya fujitatomoya requested review from gbiggs and jhdcs and removed request for a team February 1, 2024 17:50
@MichaelOrlov MichaelOrlov requested review from MichaelOrlov and removed request for gbiggs February 2, 2024 03:29
@MichaelOrlov (Contributor) left a comment

@fujitatomoya Thanks for trying to fix this issue with race condition.
At first glance, the fix is trivial. However, we can do way better.

  • We don't need to use shared_mutex, because there are no situations in rosbag2 where the write() method could be called from multiple threads, and there are unlikely to be any, because:
    1. It is protected with a mutex in the upper rosbag2_cpp::writer::write(..) method for the cases when we are writing without the cache.
    2. We are using a SingleThreadedExecutor, which processes all subscription callbacks sequentially, one by one, in a single thread.
    3. When the double-buffer cache is enabled, we are using another method:

       ```cpp
       void MCAPStorage::write(
         const std::vector<std::shared_ptr<const rosbag2_storage::SerializedBagMessage>> & msgs)
       {
         for (const auto & msg : msgs) {
           write(msg);
         }
       }
       ```
  • And here we can do an optimization. It is more efficient to lock the mutex once, before we iterate through the vector of messages and call the write method for each message. Other API calls such as create_topic(..) or remove_topic(..) have lower priority and can wait until we finish dumping the vector of messages from our double-buffer cache to the storage. It would be more elegant to create a new private method for writing one message without a lock. We can name it write_lock_free(msg) or write_locked(msg), lock the mutex in the public write(..) method right before calling the newly created write_lock_free(msg), and likewise lock in

    ```cpp
    void MCAPStorage::write(
      const std::vector<std::shared_ptr<const rosbag2_storage::SerializedBagMessage>> & msgs)
    {
      for (const auto & msg : msgs) {
        write(msg);
      }
    }
    ```

    right before the for loop.
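The refactor suggested above can be sketched roughly as follows. This is a minimal sketch, assuming a simplified stand-in class (`Storage`) and payload type (`std::string`); it is not the actual MCAPStorage implementation, only the locking pattern under discussion.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <vector>

// Simplified stand-in for MCAPStorage; the real code lives in
// rosbag2_storage_mcap/src/mcap_storage.cpp.
class Storage
{
public:
  // Public single-message write: take the lock, then delegate.
  void write(const std::shared_ptr<const std::string> & msg)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    write_lock_free(msg);
  }

  // Public batch write: lock ONCE before the loop instead of re-locking
  // per message; create_topic(..)/remove_topic(..) callers simply wait
  // until the whole cache batch has been dumped.
  void write(const std::vector<std::shared_ptr<const std::string>> & msgs)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    for (const auto & msg : msgs) {
      write_lock_free(msg);
    }
  }

  std::size_t size() const {return written_;}

private:
  // Does the actual work; callers must already hold mutex_.
  void write_lock_free(const std::shared_ptr<const std::string> & msg)
  {
    if (msg) {
      ++written_;
    }
  }

  std::mutex mutex_;
  std::size_t written_ = 0;
};
```

The key design point is that `write_lock_free` never touches the mutex itself, so both public overloads can choose their own locking granularity without risk of self-deadlock.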


fujitatomoya commented Feb 2, 2024

@MichaelOrlov thanks!

because there are no situations in rosbag2 when write() method could be called from multiple threads and unlikely would be because

I see, rosbag2_cpp::writer::write protects the methods with a mutex lock.

The whole point is rosbag2_cpp::cache::CacheConsumer::exec_consuming(); this one stays outside of the lock mechanism.

We are using SingleThreadedExecutor which is processing all subscription callbacks one by one sequentially in the one thread.

Currently true. A minor concern was that if we support MultiThreadedExecutor in the future, we will need to come back to this discussion. Probably that is not a concern; I guess rosbag2 takes a different path to take advantage of the cache, with a dedicated single thread, for performance?

It will be more elegant to create a new private method for writing one message without lock. We can name it write_lock_free(msg) or write_locked(msg) and lock mutex in public write(..) method right before calling newly created write_lock_free(msg)

okay, can do that.

  • will use std::mutex instead of shared_mutex.
  • conceal the lock-free implementation in private scope, then control the lock in public scope.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
@fujitatomoya (Contributor, Author)

@MichaelOrlov requesting another look! thanks!

@MichaelOrlov (Contributor)

@fujitatomoya Thanks for the follow-up. As regards:

currently true. minor concern was if we support MultiThreadedExecutor in the future, we will need to come back to this discussion. probably that is not concern, i guess rosbag2 takes the different path to take advantage of cache with dedicated single thread for the performance?

Yes. Currently, using the double-buffer cache gives the same performance gain as a multithreaded executor, but with two threads, and decouples writing to the storage from the transport layer, which is very important due to possible delays with filesystem or DB operations.
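The double-buffer decoupling described above can be illustrated with a minimal sketch. This is illustrative only, not rosbag2's actual CacheConsumer or message-cache classes; the class and method names here are hypothetical.

```cpp
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Subscription callbacks append to a "front" buffer under a short lock,
// while a dedicated consumer thread periodically swaps buffers and drains
// the returned batch, so slow filesystem/DB writes never block transport.
class DoubleBuffer
{
public:
  void push(std::string msg)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    front_.push_back(std::move(msg));
  }

  // Swap the buffers under the lock, then let the caller (the consumer
  // thread) write the returned batch to storage WITHOUT holding the lock.
  std::vector<std::string> take_batch()
  {
    std::vector<std::string> batch;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      std::swap(front_, batch);
    }
    return batch;
  }

private:
  std::mutex mutex_;
  std::vector<std::string> front_;
};
```

Because the lock is held only for the pointer swap, the producer side is blocked for a few instructions at most, regardless of how long the storage write of the drained batch takes.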
Even if we were to start using MultiThreadedExecutor for whatever reason in the future, we are still bound to locking on one std::mutex at the rosbag2_cpp writer(message) layer, here:

```cpp
void Writer::write(std::shared_ptr<const rosbag2_storage::SerializedBagMessage> message)
{
  std::lock_guard<std::mutex> writer_lock(writer_mutex_);
  writer_impl_->write(message);
}
```

And we would need to reconsider and make changes at that level as well.
Switching to shared_mutex at the rosbag2_storage_mcap layer, if it were needed in the future, would not cause difficulties even with backporting, since the whole change is in the .cpp file only and is API/ABI compatible.

@MichaelOrlov (Contributor) left a comment

@fujitatomoya Thanks. LGTM with green CI.

@MichaelOrlov MichaelOrlov changed the title use rw_lock to protect mcap metadata lists. Use rw_lock to protect mcap metadata lists. Feb 2, 2024
@MichaelOrlov (Contributor)

Gist: https://gist.githubusercontent.com/MichaelOrlov/a06b14d20610f8f8cc8007db7fc61efe/raw/06daf70cbe74c9b1eb6715c512d655cd101f648f/ros2.repos
BUILD args: --packages-above-and-dependencies rosbag2_storage_mcap rosbag2_tests
TEST args: --packages-above rosbag2_storage_mcap rosbag2_tests
ROS Distro: rolling
Job: ci_launcher
ci_launcher ran: https://ci.ros2.org/job/ci_launcher/13233

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@MichaelOrlov MichaelOrlov merged commit 90d1da8 into ros2:rolling Feb 3, 2024
14 checks passed
@MichaelOrlov (Contributor)

https://github.com/Mergifyio backport iron humble


mergify bot commented Feb 3, 2024

backport iron humble

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Feb 3, 2024
* use rw_lock to protect mcap metadata lists.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

* introduce MCAPStorage::write_lock_free private method.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

---------

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
(cherry picked from commit 90d1da8)

# Conflicts:
#	rosbag2_storage_mcap/src/mcap_storage.cpp
MichaelOrlov added a commit that referenced this pull request Feb 4, 2024
…#1567)

* Use rw_lock to protect mcap metadata lists. (#1561)

* use rw_lock to protect mcap metadata lists.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

* introduce MCAPStorage::write_lock_free private method.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

---------

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
(cherry picked from commit 90d1da8)

# Conflicts:
#	rosbag2_storage_mcap/src/mcap_storage.cpp

* Resolve merge conflicts

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

---------

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
Co-authored-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
Co-authored-by: Michael Orlov <michael.orlov@apex.ai>
MichaelOrlov added a commit that referenced this pull request Feb 5, 2024
…1566)

* Use rw_lock to protect mcap metadata lists. (#1561)

* use rw_lock to protect mcap metadata lists.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

* introduce MCAPStorage::write_lock_free private method.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

---------

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
(cherry picked from commit 90d1da8)

# Conflicts:
#	rosbag2_storage_mcap/src/mcap_storage.cpp

* Resolve merge conflicts

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

* Suppress warning STL4015: The std::iterator class template is deprecated in C++17

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

---------

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
Co-authored-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
Co-authored-by: Michael Orlov <michael.orlov@apex.ai>
anrp-tri pushed a commit to anrp-tri/rosbag2 that referenced this pull request Feb 6, 2024
) (ros2#1567)

* Use rw_lock to protect mcap metadata lists. (ros2#1561)

@ros-discourse

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/ros-2-meeting-minutes-2024-02-15/36221/1
