
Check memory usage before decoding async response #74594

Merged — merged 14 commits into elastic:master from the async-search-decode branch on Jun 30, 2021

Conversation

@dnhatn dnhatn (Member) commented Jun 27, 2021

This change makes sure the system has enough memory before decoding an async search response as a large response can lead to OOM.
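
At a high level, the change reserves an estimate of the decoded size on a circuit breaker before Base64-decoding the stored response, and gives the reservation back if decoding does not complete. A minimal sketch of that idea, using the Elasticsearch CircuitBreaker API but with illustrative names (decodeWithBreaker is not a method in this PR):

import java.util.Base64;
import org.elasticsearch.common.breaker.CircuitBreaker;

class DecodeWithBreaker {
    // Reserve the estimated size on the breaker before decoding; release it if decoding fails.
    static byte[] decodeWithBreaker(CircuitBreaker circuitBreaker, String encoded) {
        // The decoded payload is no larger than the encoded string, so its length is a safe estimate.
        final long estimatedBytes = encoded.length();
        circuitBreaker.addEstimateBytesAndMaybeBreak(estimatedBytes, "decode async search response");
        try {
            return Base64.getDecoder().decode(encoded);
        } catch (RuntimeException e) {
            // Give the reserved bytes back if decoding fails; the caller releases them otherwise.
            circuitBreaker.addWithoutBreaking(-estimatedBytes);
            throw e;
        }
    }
}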

@dnhatn dnhatn marked this pull request as ready for review June 27, 2021 15:01
@dnhatn dnhatn added the :Search/Search label Jun 27, 2021
@elasticmachine elasticmachine added the Team:Search label Jun 27, 2021
@elasticmachine (Collaborator)

Pinging @elastic/es-search (Team:Search)

@dnhatn dnhatn added >enhancement, v8.0.0, v7.14.0 and removed Team:Search labels Jun 27, 2021
switch (fieldName) {
    case RESULT_FIELD:
        ensureExpectedToken(XContentParser.Token.VALUE_STRING, parser.currentToken(), parser);
        final CharBuffer encodedBuffer = parser.charBuffer();
Contributor

Do you think we can use parser.binaryValue() instead, which would provide an array already decoded from Base64?

Member Author

I looked at the code of getBinaryValue and readBinaryValue in JsonParser. Both use two intermediate buffers while decoding a Base64 string, whereas the current approach doesn't require extra memory. I know we shouldn't have to rely on optimizations like these, but they help when encoding/decoding huge responses.
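
For illustration only (this is not the code in this PR, which decodes straight from the parser's CharBuffer), the difference between materializing everything up front and decoding incrementally can be sketched with plain JDK classes:

import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class DecodeStyles {
    // Eager: the encoded String, its byte[] copy, and the full decoded byte[] are all live at once.
    static byte[] decodeEager(String encoded) {
        return Base64.getDecoder().decode(encoded.getBytes(StandardCharsets.ISO_8859_1));
    }

    // Incremental: decoded bytes are produced as the consumer reads them, avoiding a second
    // full-size intermediate buffer on top of the encoded input.
    static InputStream decodeIncrementally(byte[] encodedBytes) {
        return Base64.getDecoder().wrap(new ByteArrayInputStream(encodedBytes));
    }
}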

Contributor

Thanks for the explanation, Nhat.

I've checked the parser.charBuffer implementation; it uses String::toCharArray(), which allocates a new character array whose length is the length of the string. So doesn't that mean we first allocate 2x the response size in memory, and only use the circuit breaker for another 1x?

Contributor

I had some other ideas; what do you think of them?

  1. You said we can record the size of the search response; what if we record the size of just the encoded Base64 string (which we should know)?
  2. Another very simple way is to use the search.max_async_search_response_size setting that Set max allowed size for stored async response #74455 introduces. We can just check that the maximum possible size is available in memory (a sketch follows below).
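
A sketch of the second idea, assuming the configured limit is available as a ByteSizeValue named maxResponseSize (the names here are illustrative, not code from this PR or #74455):

import org.elasticsearch.common.breaker.CircuitBreaker;
import org.elasticsearch.common.unit.ByteSizeValue;

class UpperBoundReservation {
    // Reserve the worst case (the configured maximum stored response size) instead of the
    // actual encoded size, trading precision for simplicity.
    static void reserveUpperBound(CircuitBreaker circuitBreaker, ByteSizeValue maxResponseSize) {
        circuitBreaker.addEstimateBytesAndMaybeBreak(maxResponseSize.getBytes(),
            "decode async search response (upper bound)");
    }
}

The trade-off, as the reply below points out, is that this over-reserves memory when the configured maximum is much larger than the typical response.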

Member Author

I've checked the parser.charBuffer implementation; it uses String::toCharArray(), which allocates a new character array whose length is the length of the string.

Great catch, thanks Mayya! I have updated this PR to account for an extra buffer for the XContentParser. Unlike the memory reserved for the response, we can release this memory immediately after parsing.

You said we can record the size of the search response; what if we record the size of just the encoded Base64 string (which we should know)?

We already have it when parsing the xContent (i.e., encodedBuffer.length()).

Another very simple way is to use the search.max_async_search_response_size setting that Set max allowed size for stored async response #74455 introduces. We can just check that the maximum possible size is available in memory.

That could over-reserve memory, especially when users configure a large value for this setting.

@dnhatn dnhatn (Member Author) commented Jun 29, 2021

Thanks Mayya for your review. I have addressed your feedback. Would you please take another look?

boolean restoreResponseHeaders, boolean checkAuthentication,
Counter reservedBytes) {
    // Reserve an extra buffer as XContentParser can use it to hold parsed values.
    circuitBreaker.addEstimateBytesAndMaybeBreak(source.length(), "parse xContent of async response");
Contributor

What happens if this circuitBreaker call raises an exception? Should it be inside a try statement to make sure we release the allocated bytes?

Member Author

If this circuitBreaker call raises an exception, then we haven't reserved the memory yet, so there is nothing to release.
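
A minimal sketch of the pattern being discussed (illustrative names, not the exact PR code): the reservation call stays outside the try block because a tripped breaker leaves nothing reserved, while everything after a successful reservation is wrapped so the bytes are always returned.

import java.util.function.Supplier;
import org.elasticsearch.common.breaker.CircuitBreaker;

class ReserveThenRelease {
    static <T> T withReservation(CircuitBreaker breaker, long bytes, Supplier<T> parse) {
        // If this call throws, the bytes were never added to the breaker, so there is nothing to undo.
        breaker.addEstimateBytesAndMaybeBreak(bytes, "parse xContent of async response");
        try {
            return parse.get();
        } finally {
            // Release the temporary parsing reservation whether parsing succeeded or failed.
            breaker.addWithoutBreaking(-bytes);
        }
    }
}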

@mayya-sharipova mayya-sharipova (Contributor) commented Jun 29, 2021

@dnhatn Nhat, thanks for iterating. The PR overall LGTM.
But one concern I have is that the code has become quite complex. I was wondering if we could simplify it, even at the expense of a less precise estimate: for example, we could keep the old code and just add a circuit breaker reservation of 2x the source length: circuitBreaker.addEstimateBytesAndMaybeBreak(2 * source.length(), "parse xContent of async response");
I think this should give us a good enough estimate? WDYT?

Or is there some extra benefit of this modified code? Reduced memory?

@dnhatn dnhatn (Member Author) commented Jun 29, 2021

I would prefer not to introduce these optimizations either, but the old implementation can use up to 4 times the memory of the source length: the internal buffer of the parser, a Base64-encoded string, a Base64-decoded buffer, and the response itself. I will try to simplify the code.

@dnhatn dnhatn (Member Author) commented Jun 29, 2021

@mayya-sharipova I pushed cc66b09 to simplify the code. Would you mind taking another look? Thank you!

@mayya-sharipova mayya-sharipova (Contributor) left a comment

@dnhatn Thanks for iterating, Nhat! I like how the code looks now! Great job!

@dnhatn dnhatn (Member Author) commented Jun 30, 2021

@mayya-sharipova Thanks so much for your reviews.

@dnhatn dnhatn merged commit 3ca6077 into elastic:master Jun 30, 2021
@dnhatn dnhatn deleted the async-search-decode branch June 30, 2021 02:10
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Jun 30, 2021
This change makes sure the system has enough memory before decoding an
async search response as a large response can lead to OOM.
dnhatn added a commit that referenced this pull request Jun 30, 2021
This change makes sure the system has enough memory before decoding an
async search response as a large response can lead to OOM.
Labels
>enhancement :Search/Search v7.14.0 v8.0.0-alpha1