Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Buffer size used in writeDelimitedTo method is often too small #4177

Open
pstibrany opened this issue Jan 17, 2018 · 4 comments
Open

Buffer size used in writeDelimitedTo method is often too small #4177

pstibrany opened this issue Jan 17, 2018 · 4 comments

Comments

@pstibrany
Copy link

When sending small messages (e.g. message consisting of single enum) by using AbstractMessageLite#writeDelimitedTo method, this method together with CodedOutputStream will compute very small buffer size of 20.

AbstractMessageLite#writeDelimitedTo then writes single byte (message size) to the buffer, not leaving enough space for CodedOutputStream.writeEnum method, which always wants at least 20 available bytes, and which then must flush that single byte (message size) to make enough room to write enums.

This interplay between AbstractMessageLite#writeDelimitedTo and CodedOutputStream, which needs more free space in the buffer now for many types, causes lot of unnecessary writes to the output stream (which is bad especially when using SocketOutputStream).

We have observed this as a major performance hit when upgrading from protobuf 2.5.0 to version 3.

For a simple test case, please see https://github.com/pstibrany/protobuf-small-buffer-issue

@pstibrany
Copy link
Author

Small message is just an example, problem can of course be observed for bigger messages as well. CodedOutputStream now simply needs more space in the buffer to avoid flushing, than what AbstractMessageLite#writeDelimitedTo provides.

@TeBoring TeBoring assigned ghost Jun 15, 2019
@ghost ghost removed their assignment Jul 19, 2019
@elharo elharo self-assigned this Mar 1, 2022
@elharo
Copy link
Contributor

elharo commented Mar 4, 2022

The big question here is, as with all optimizations, does it matter? That is, suppose AbstractMessageLite#writeDelimitedTo is ten times slower. Does this have any noticeable impact on the performance of anything? E.g. does an app take five seconds longer to start up? Does a batch job complete in a day instead of an hour? Or does a BigQuery cost twice as much to run? Can any significant impact be shown that justifies the effort of improving this?

Any evidence you have along these lines would be helpful.

@pstibrany
Copy link
Author

pstibrany commented Mar 4, 2022

The big question here is, as with all optimizations, does it matter?

We definitely noticed it on our system ("We have observed this as a major performance hit when upgrading from protobuf 2.5.0 to version 3."), when I investigated this. Unfortunately that was 4 years ago, and I don't remember or work on the same system anymore (I don't have access to it), so I cannot provide more details.

@elharo
Copy link
Contributor

elharo commented Mar 4, 2022

Confirmed this still exists as of 3.19.4:

 [java] Serialized size: 2, writes: 2
 [java] Serialized size: 9, writes: 2
 [java] Serialized size: 11, writes: 3

@elharo elharo removed their assignment Jul 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants