Expose batched_reduce_size via _search #23288
Conversation
In elastic#23253 we added the ability to incrementally reduce search results. This change exposes the parameter to control the batch size and therefore the memory consumption of a large search request.
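To illustrate what the new parameter looks like from a client's point of view, here is a minimal sketch of assembling a `_search` request that sets it. The helper function and the example index name are hypothetical; the endpoint path, the parameter name, and the default of 512 come from this PR's discussion.

```python
import json

def build_search_request(index, query, batched_reduce_size=512):
    """Assemble path, query params, and body for a _search call.

    batched_reduce_size caps how many shard results the coordinating
    node reduces at once, bounding per-request memory. 512 is the
    default mentioned in the review below.
    """
    path = "/%s/_search" % index
    params = {"batched_reduce_size": batched_reduce_size}
    body = json.dumps({"query": query})
    return path, params, body

# Hypothetical usage: reduce shard results in smaller batches of 64.
path, params, body = build_search_request("logs-*", {"match_all": {}},
                                          batched_reduce_size=64)
print(params["batched_reduce_size"])  # 64
```

A lower value trades more reduction phases for a smaller memory footprint on the coordinating node, which matters when a request can fan out to a large number of shards.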
LGTM, but it's missing the doc changes.
},
"batched_reduce_size" : {
  "type" : "number",
  "description" : "The number of shard results that should be reduced at once on the coordinating node. This value should be used as a protection mechanism to reduce the memory overhead per search request if the potential number of shards in the request can be large."
}
add "default" : 512
I am not sure what you mean? I didn't add documentation for this since it's so specialized. I want to expose it once we remove the soft limit?
I think it'd be useful to return the number of reduction phases that we did so we can add a test that asserts that we did the right number. Just to make sure we didn't drop setting the parameter on the floor. I'm fine with you doing that in a followup or doing it myself since I'm the one that wants it.
@@ -60,7 +60,7 @@ public RandomizingClient(Client client, Random random) {

     @Override
     public SearchRequestBuilder prepareSearch(String... indices) {
-        return in.prepareSearch(indices).setSearchType(defaultSearchType).setPreference(defaultPreference).setReduceUpTo(reduceUpTo);
+        return in.prepareSearch(indices).setSearchType(defaultSearchType).setPreference(defaultPreference).setBatchedReduceSize(reduceUpTo);
Do you want to rename reduceUpTo here as well?
I don't think we should clutter the API with such an internal optimization. What do you expect from it?
Just to know if the parameter had any effect at all. Right now you can't tell.
This argument is odd. We are testing it throughout the stack with unit tests and we know it's passed to the SearchRequest, since that's where the exception comes from. Do we really have to pass any pointers back just for the integration test's sake?
Once we've removed the limit we can test with a really large reduce. We'll want to do that anyway. I like returning the reduction count as well because it is simpler to debug if it fails and because it'll give us more information if the huge reduce fails and the reduction count test doesn't. I'm ok with not doing the test.
We have a whole bunch of tests for this that were added in the original PR.
I spoke to @nik9000 and I'm starting to agree we should have this response parameter, especially for debugging purposes of this feature on the user's end. I added new commits.
@@ -179,13 +179,6 @@ public void scrollId(String scrollId) {
         return internalResponse.profile();
     }

-    static final class Fields {
❤️
OK
The assertion that if there are buffered aggs at least one incremental reduce phase should have happened doesn't hold if there are shard failures. This commit removes this assertion. Relates to #23288
Both PRs below have been backported to 5.4 such that we can enable BWC tests of this feature as well as remove version-dependent serialization for search requests / responses. Relates to elastic#23288 Relates to elastic#23253
    builder.field("terminated_early", isTerminatedEarly());
}
if (getNumReducePhases() != 1) {
    builder.field("num_reduce_phases", getNumReducePhases());
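Per the diff above, `num_reduce_phases` is only serialized when more than one reduce phase ran, so a client should treat its absence as a single phase. A minimal sketch of reading it from a parsed response body (the response dicts here are hypothetical, trimmed-down examples):

```python
def num_reduce_phases(search_response):
    # The field is omitted when exactly one reduce phase ran,
    # so absence means 1.
    return search_response.get("num_reduce_phases", 1)

# Hypothetical trimmed response bodies for illustration:
print(num_reduce_phases({"took": 5, "num_reduce_phases": 3}))  # 3
print(num_reduce_phases({"took": 5}))                          # 1
```

This is the kind of check the reviewers wanted for tests: asserting the expected number of reduction phases confirms the `batched_reduce_size` parameter wasn't silently dropped.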
❤️
* Updated api gen to 5.4 and added a way to patch specification files through special *.patch.json companion files, due to pending discussion on elastic/elasticsearch@e579629
* Updated x-pack spec to 5.4
* Added codegen part for X-Pack info related APIs
* Added support for the Field Caps API
* Added support for the RemoteInfo API and added cross cluster support to IndexName
* Added support for SourceExists()
* Added skip version; even though this API existed, it was undocumented prior to 5.4
* Exposed word delimiter graph token filter as per elastic/elasticsearch#23327
* spaces => tabs
* Exposed num_reduce_phases as per elastic/elasticsearch#23288
* Implemented XPackInfo() and started on XPackUsage()
* Added response structure for XPackUsage()
* Changed license date from DateTime to DateTimeOffset
* Implemented PR feedback on #2743
* Removed explicit folder includes in csproj files
Conflicts: src/Nest/Search/Search/SearchResponse.cs
In #23253 we added the ability to incrementally reduce search results.
This change exposes the parameter to control the batch size and therefore
the memory consumption of a large search request.