Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving performance through asynchronous storage writes #436

Closed
adamdbrw opened this issue Jun 10, 2020 · 2 comments
Closed

Improving performance through asynchronous storage writes #436

adamdbrw opened this issue Jun 10, 2020 · 2 comments
Assignees
Labels
enhancement New feature or request performance

Comments

@adamdbrw
Copy link
Collaborator

adamdbrw commented Jun 10, 2020

Description

With the current implementation, callbacks are not processed until storage finishes writing. This degrades performance with sufficient load as messages are lost when the subscriber queue gets full. Also, I/O use can oscillate between 100% and 0% for large messages resulting in inefficient use of a bottleneck resource (disk write bandwidth)

A solution is to implement a consumer-producer scheme, where callbacks are put in a shared queue for storage writer to process.

The transaction should be opened right when the first write comes and writing should be continuous. When transaction is committed, a buffer is to be used to accumulate messages that arrived in the meanwhile.

Related Issues

This issue comes from work on performance testing of rosbag2: #435

Completion Criteria

  • implementation is reviewed and pushed out
  • debugging shows that callbacks are processed with the desired frequency, without waiting for disk writes.
  • I/O and system tools such as iotop show smoothed I/O profile

Testing Notes / Suggestions

  • Use performance test packages (coming soon, see Performance testing of rosbag2 #435) to compare performance before and after the change (it is expected to improve for heavier loads).
  • Run rosbag2_tests to verify that no new issues were introduced
@adamdbrw adamdbrw added the enhancement New feature or request label Jun 10, 2020
@emersonknapp emersonknapp added performance help wanted Extra attention is needed labels Jul 17, 2020
@emersonknapp emersonknapp removed the help wanted Extra attention is needed label Jul 17, 2020
@Karsten1987
Copy link
Collaborator

I think a double buffering approach could be a good solution to this. Saying instead of having only one vector, we feature two (or many) cache instances and write the one being filled asynchronously to disk.

@emersonknapp
Copy link
Collaborator

That sounds like a good approach - I don't see why we would need more than two, though.

  • Buffer0 receives messages (transaction is open and storage thread is writing)
  • Buffer0 full - swap message callback pointer to Buffer1 - commit Buffer0 transaction
  • Buffer1 receives messages
  • Buffer0 transaction commit complete - open transaction for Buffer 1
  • Buffer1 full - swap message callback pointer to Buffer0 - commit Buffer1 transaction
  • etc...

Like so? I suppose we might need more buffers if the "transaction commit" takes longer than filling the next buffer - but in that case we're just writing too slowly and are going to probably drop data anyways, right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request performance
Projects
None yet
Development

No branches or pull requests

3 participants