
Check memory usage before decoding async response #74594

Merged — merged 14 commits into elastic:master from the async-search-decode branch on Jun 30, 2021

Conversation

@dnhatn dnhatn (Member) commented Jun 27, 2021

This change makes sure the system has enough memory before decoding an async search response as a large response can lead to OOM.
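
At a high level, the change reserves an estimate of the decoded size on a circuit breaker before Base64-decoding the stored response, and gives the reservation back if decoding does not complete. A minimal sketch of that idea, using the Elasticsearch CircuitBreaker API but with illustrative names (decodeWithBreaker is not a method in this PR):

import java.util.Base64;
import org.elasticsearch.common.breaker.CircuitBreaker;

class DecodeWithBreaker {
    // Reserve the estimated size on the breaker before decoding; release it if decoding fails.
    static byte[] decodeWithBreaker(CircuitBreaker circuitBreaker, String encoded) {
        // The decoded payload is no larger than the encoded string, so its length is a safe estimate.
        final long estimatedBytes = encoded.length();
        circuitBreaker.addEstimateBytesAndMaybeBreak(estimatedBytes, "decode async search response");
        try {
            return Base64.getDecoder().decode(encoded);
        } catch (RuntimeException e) {
            // Give the reserved bytes back if decoding fails; the caller releases them otherwise.
            circuitBreaker.addWithoutBreaking(-estimatedBytes);
            throw e;
        }
    }
}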

@dnhatn dnhatn marked this pull request as ready for review June 27, 2021 15:01
@dnhatn dnhatn added the :Search/Search label Jun 27, 2021
@elasticmachine elasticmachine added the Team:Search label Jun 27, 2021
@elasticmachine (Collaborator)

Pinging @elastic/es-search (Team:Search)

@dnhatn dnhatn added >enhancement, v8.0.0, v7.14.0 and removed Team:Search labels Jun 27, 2021
switch (fieldName) {
    case RESULT_FIELD:
        ensureExpectedToken(XContentParser.Token.VALUE_STRING, parser.currentToken(), parser);
        final CharBuffer encodedBuffer = parser.charBuffer();
Contributor

Do you think we can use parser.binaryValue() instead, which would provide an array already decoded from Base64?

Member Author

I looked at the code of getBinaryValue and readBinaryValue in JsonParser. Both use two intermediate buffers while decoding a Base64 string, whereas the current approach doesn't require extra memory. I know we shouldn't have to rely on optimizations like these, but they help when encoding/decoding huge responses.
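
For illustration only (this is not the code in this PR, which decodes straight from the parser's CharBuffer), the difference between materializing everything up front and decoding incrementally can be sketched with plain JDK classes:

import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class DecodeStyles {
    // Eager: the encoded String, its byte[] copy, and the full decoded byte[] are all live at once.
    static byte[] decodeEager(String encoded) {
        return Base64.getDecoder().decode(encoded.getBytes(StandardCharsets.ISO_8859_1));
    }

    // Incremental: decoded bytes are produced as the consumer reads them, avoiding a second
    // full-size intermediate buffer on top of the encoded input.
    static InputStream decodeIncrementally(byte[] encodedBytes) {
        return Base64.getDecoder().wrap(new ByteArrayInputStream(encodedBytes));
    }
}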

Contributor

Thanks for the explanation, Nhat.

I've checked the parser.charBuffer implementation; it uses String::toCharArray(), which allocates a new character array whose length is the length of the string. So doesn't that mean we first allocate 2x the response size in memory, and only use the circuit breaker for another 1x?

Contributor

I had some other ideas; what do you think of them?

  1. You said we can record the size of the search response; what if we record the size of just the encoded Base64 string (which we should know)?
  2. Another very simple way is to use the search.max_async_search_response_size setting that Set max allowed size for stored async response #74455 introduces. We can just check that the maximum possible size is available in memory (a sketch follows below).
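
A sketch of the second idea, assuming the configured limit is available as a ByteSizeValue named maxResponseSize (the names here are illustrative, not code from this PR or #74455):

import org.elasticsearch.common.breaker.CircuitBreaker;
import org.elasticsearch.common.unit.ByteSizeValue;

class UpperBoundReservation {
    // Reserve the worst case (the configured maximum stored response size) instead of the
    // actual encoded size, trading precision for simplicity.
    static void reserveUpperBound(CircuitBreaker circuitBreaker, ByteSizeValue maxResponseSize) {
        circuitBreaker.addEstimateBytesAndMaybeBreak(maxResponseSize.getBytes(),
            "decode async search response (upper bound)");
    }
}

The trade-off, as the reply below points out, is that this over-reserves memory when the configured maximum is much larger than the typical response.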

Member Author

I've checked the parser.charBuffer implementation; it uses String::toCharArray(), which allocates a new character array whose length is the length of the string.

Great catch, thanks Mayya! I have updated this PR to account for an extra buffer for the XContentParser. Unlike the memory reserved for the response, we can release this memory immediately after parsing.

You said we can record the size of the search response; what if we record the size of just the encoded Base64 string (which we should know)?

We already have it when parsing the xContent (i.e., encodedBuffer.length()).

Another very simple way is to use the search.max_async_search_response_size setting that Set max allowed size for stored async response #74455 introduces. We can just check that the maximum possible size is available in memory.

That could over-reserve memory, especially when users configure a large value for this setting.

@dnhatn dnhatn (Member Author) commented Jun 29, 2021

Thanks Mayya for your review. I have addressed your feedback. Would you please take another look?

boolean restoreResponseHeaders, boolean checkAuthentication,
Counter reservedBytes) {
    // Reserve an extra buffer as XContentParser can use it to hold parsed values.
    circuitBreaker.addEstimateBytesAndMaybeBreak(source.length(), "parse xContent of async response");
Contributor

What happens if this circuitBreaker call raises an exception? Should it be inside a try statement to make sure we release the allocated bytes?

Member Author

If this circuitBreaker call raises an exception, then we haven't reserved the memory yet, so there is nothing to release.
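
A minimal sketch of the pattern being discussed (illustrative names, not the exact PR code): the reservation call stays outside the try block because a tripped breaker leaves nothing reserved, while everything after a successful reservation is wrapped so the bytes are always returned.

import java.util.function.Supplier;
import org.elasticsearch.common.breaker.CircuitBreaker;

class ReserveThenRelease {
    static <T> T withReservation(CircuitBreaker breaker, long bytes, Supplier<T> parse) {
        // If this call throws, the bytes were never added to the breaker, so there is nothing to undo.
        breaker.addEstimateBytesAndMaybeBreak(bytes, "parse xContent of async response");
        try {
            return parse.get();
        } finally {
            // Release the temporary parsing reservation whether parsing succeeded or failed.
            breaker.addWithoutBreaking(-bytes);
        }
    }
}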

@mayya-sharipova mayya-sharipova (Contributor) commented Jun 29, 2021

@dnhatn Nhat, thanks for iterating. The PR overall LGTM.
But one concern I have is that the code has become quite complex. I was wondering if we could simplify it, even at the expense of a less precise estimate: for example, we could keep the old code and just add a circuit breaker reservation of 2x the source length: circuitBreaker.addEstimateBytesAndMaybeBreak(2 * source.length(), "parse xContent of async response");
I think this should give us a good enough estimate? WDYT?

Or is there some extra benefit of this modified code? Reduced memory?

@dnhatn dnhatn (Member Author) commented Jun 29, 2021

I would prefer not to introduce these optimizations either, but the old implementation can use up to 4 times the memory of the source length: the internal buffer of the parser, a Base64-encoded string, a Base64-decoded buffer, and the response itself. I will try to simplify the code.

@dnhatn dnhatn (Member Author) commented Jun 29, 2021

@mayya-sharipova I pushed cc66b09 to simplify the code. Would you mind taking another look? Thank you!

@mayya-sharipova mayya-sharipova (Contributor) left a comment

@dnhatn Thanks for iterating, Nhat! I like how the code looks now! Great job!

@dnhatn dnhatn (Member Author) commented Jun 30, 2021

@mayya-sharipova Thanks so much for your reviews.

@dnhatn dnhatn merged commit 3ca6077 into elastic:master Jun 30, 2021
@dnhatn dnhatn deleted the async-search-decode branch June 30, 2021 02:10
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Jun 30, 2021
This change makes sure the system has enough memory before decoding an
async search response as a large response can lead to OOM.
dnhatn added a commit that referenced this pull request Jun 30, 2021
This change makes sure the system has enough memory before decoding an
async search response as a large response can lead to OOM.
Labels
>enhancement :Search/Search v7.14.0 v8.0.0-alpha1