Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SequentialWriter to cache by message size instead of message count #530

Merged
merged 3 commits into from
Oct 9, 2020

Conversation

jaisontj
Copy link
Contributor

@jaisontj jaisontj commented Oct 2, 2020

This is part 1 of 2 PRs addressing #464

  • The SequentialWriter cache now caches by message size and not by message count

As for:

Consider how this might live side-by-side with "cache by time" (maybe "whichever comes first")

It can work the same way as split bag file by duration and split bag file by size in that, it will be "whichever comes first"

@jaisontj jaisontj requested a review from a team as a code owner October 2, 2020 19:55
@jaisontj jaisontj requested review from thomas-moulard and Karsten1987 and removed request for a team October 2, 2020 19:55
@jaisontj jaisontj marked this pull request as draft October 2, 2020 19:56
@@ -249,11 +249,13 @@ void SequentialWriter::write(std::shared_ptr<rosbag2_storage::SerializedBagMessa
storage_->write(converted_msg);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emersonknapp we discussed cleaning this up better, however, I think this initial 0 check is required cause the cache size is set to 0 in this scenario (cause max_cache_size is 0) and so we need to handle it separately.

Let me know what you think.

@@ -314,8 +319,13 @@ TEST_F(SequentialWriterTest, do_not_use_cache_if_cache_size_is_zero) {

std::string rmw_format = "rmw_format";

std::string msg_content = "Hello";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test ensures that cache is not used if cache size is 0 and previously did not care about the serialized_data in the message. However, I thought it is better if we do test with a close to real world scenario where the message has some data as well.

@jaisontj jaisontj changed the title SequentialWriter caches by message size and not message count SequentialWriter to cache by message size and not message count Oct 5, 2020
@@ -117,7 +118,7 @@ TEST_F(RecordFixture, record_end_to_end_test) {
wrong_message->string_value = "wrong_content";

auto process_handle = start_execution(
"ros2 bag record --output " + root_bag_path_.string() + " /test_topic");
"ros2 bag record --max-cache-size 0 --output " + root_bag_path_.string() + " /test_topic");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't 0 the default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I should add this to the comment and the title also.

0 default = dont use cache and write to disk always, which isnt great for performance. So as per the issue #464 I've also changed the default to 1MB

@jaisontj jaisontj changed the title SequentialWriter to cache by message size and not message count SequentialWriter to cache by message size and change default cache size to 1MB Oct 5, 2020
@jaisontj jaisontj marked this pull request as ready for review October 5, 2020 22:49
@Karsten1987
Copy link
Collaborator

Karsten1987 commented Oct 6, 2020

Thanks for picking this up. I haven't had time to fully look into this (will hopefully get to it tomorrow), but as a first glance I just want to challenge the fact that number of messages is not useful. I could see that there's a use case of splitting bag files with a specific message count, e.g. recording a single image topic with the result of having 100 images per bag file.

Could we support both? I would see that each condition results in its own predicate which determines when to split. That way we could also easily support splitting by time at some point in the future.

@jaisontj
Copy link
Contributor Author

jaisontj commented Oct 6, 2020

Thanks for picking this up. I haven't had time to fully look into this (will hopefully get to it tomorrow), but as a first glance I just want to challenge the fact that number of messages is not useful. I could see that there's a use case of splitting bag files with a specific message count, e.g. recording a single image topic with the result of having 100 images per bag file.

Could we support both? I would see that each condition results in its own predicate which determines when to split. That way we could also easily support splitting by time at some point in the future.

Hi @Karsten1987, this PR does not change the bag splitting functionality, rather it addresses the way messages are stored in cache before being written to a bag. Previously, it used the number of messages to hold in cache whereas bag splitting happens based on bag size or duration. I didnt see any support for split by message count (I could be wrong though).

Let me know if I've understood you correctly.

@Karsten1987
Copy link
Collaborator

I haven't had time to fully look into this

That's what I meant with this 🤭
I am sorry, I really should have looked better at this. Please ignore my comment.

@jaisontj
Copy link
Contributor Author

jaisontj commented Oct 6, 2020

I haven't had time to fully look into this

That's what I meant with this 🤭
I am sorry, I really should have looked better at this. Please ignore my comment.

Haha, its all good :)

@emersonknapp
Copy link
Collaborator

Just noting that this touches the same code as #506 for full visibility

@jaisontj jaisontj force-pushed the jaisontj/fix-464 branch 2 times, most recently from e810f1c to f6e3104 Compare October 7, 2020 05:16
@jaisontj jaisontj changed the title SequentialWriter to cache by message size and change default cache size to 1MB SequentialWriter to cache by message size instead of message count Oct 7, 2020
Copy link
Contributor

@dabonnie dabonnie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable with green CI

Copy link
Collaborator

@emersonknapp emersonknapp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice and minimal!

@jaisontj
Copy link
Contributor Author

jaisontj commented Oct 7, 2020

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

Signed-off-by: Jaison Titus <jaisontj@amazon.com>
current_cache_size_

Signed-off-by: Jaison Titus <jaisontj@amazon.com>
@jaisontj
Copy link
Contributor Author

jaisontj commented Oct 7, 2020

Looks like there was a change in master which caused the CI to fail. Pushed again after a rebase with master. Running CI again:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@jaisontj
Copy link
Contributor Author

jaisontj commented Oct 8, 2020

The CI run above tried testing rosbag2_performance_writer_benchmarking which was not built since I used build --packages-up-to rosbag2

Here is the correct one with:
build_args: --packages-up-to rosbag2 ros2bag rosbag2_performance_writer_benchmarking rosbag2_py
test_args: --packages-select-regex rosbag2 ros2bag

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

Signed-off-by: Jaison Titus <jaisontj@amazon.com>
@jaisontj
Copy link
Contributor Author

jaisontj commented Oct 9, 2020

New CI Build after fixing the warning on windows:

build_args: --packages-up-to rosbag2 ros2bag rosbag2_performance_writer_benchmarking rosbag2_py
test_args: --packages-select-regex rosbag2 ros2bag

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@emersonknapp
Copy link
Collaborator

Instability on OSX and aarch64 is related to xmllint, which is failing on all packages currently in the nightly build - so it is definitely not related to this PR

@emersonknapp emersonknapp merged commit 8a917af into ros2:master Oct 9, 2020
@jaisontj jaisontj deleted the jaisontj/fix-464 branch October 9, 2020 21:04
emersonknapp pushed a commit that referenced this pull request Feb 2, 2021
)

* Fixes #464 - caches by message size and not message count

Signed-off-by: Jaison Titus <jaisontj@amazon.com>
emersonknapp pushed a commit that referenced this pull request Feb 17, 2021
)

* Fixes #464 - caches by message size and not message count

Signed-off-by: Jaison Titus <jaisontj@amazon.com>
skudryas pushed a commit to skudryas/rosbag2 that referenced this pull request Mar 12, 2021
…os2#530)

* Fixes ros2#464 - caches by message size and not message count

Signed-off-by: Jaison Titus <jaisontj@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants