Change pre_filter_shard_size default to 1 for frozen index searches #39835

timroes · 2019-03-08T11:36:54Z

We currently have an open issue in Kibana (elastic/kibana#32742) about setting pre_filter_shard_size to 1 for all requests when you want to query frozen indices. This raised a couple of questions, which I am copying over from the other issue:

As far as I understand the documentation, the pre-filtering phase does not have any "significant overhead". Why is this value 128 by default, and not 1? If there is no significant overhead, and we know that we gain a lot of performance when a frozen index is included, what is the reason for not enabling that phase by default? Once you have enabled querying frozen indices in the advanced settings, we would set this to 1 in every request (since we don't know whether a request includes a frozen index or not), so I would really like to understand the drawbacks of setting this to 1.

Would it potentially make more sense to set this to 1 by default on the Elasticsearch side once ignore_throttled=false is set? Or maybe ES could even determine whether a frozen index will be hit by a query and then use the appropriate value? Or (much like the question above): could it be 1 for all requests, and what are the drawbacks there?
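For illustration, the per-request workaround discussed here amounts to adding two query-string parameters to the `_search` call. This is a minimal sketch that only builds the URL: the host and index pattern are placeholders, and `build_search_url` is a hypothetical helper, not part of Kibana or any Elasticsearch client.

```python
from urllib.parse import urlencode


def build_search_url(base_url: str, index_pattern: str, include_frozen: bool) -> str:
    """Build a _search URL (hypothetical helper; host and pattern are placeholders)."""
    params = {}
    if include_frozen:
        # Also search throttled (frozen) indices, and lower the pre-filter
        # threshold so the pre-filter round trip is not skipped for
        # requests that expand to few shards.
        params["ignore_throttled"] = "false"
        params["pre_filter_shard_size"] = "1"
    url = f"{base_url}/{index_pattern}/_search"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


print(build_search_url("http://localhost:9200", "logs-*", include_frozen=True))
```

Without the frozen-index option the helper leaves both parameters out, so the server-side default applies.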


elasticmachine · 2019-03-08T11:40:32Z

Pinging @elastic/es-search

When a search on some indices may take a long time, it may cause problems for other indices that are being searched as part of the same search request, because the search context needs to stay open for a long time, in case the faster indices are being written to. This is especially a problem when searching against throttled and non-throttled indices as part of the same request. This commit splits the search in two sub-searches in this case: one for throttled indices, and one for non-throttled indices. This way the two don't interfere with each other. Also, the sub-search against the throttled indices can have pre_filter_shard_size set to 1 automatically, which is what we currently recommend our users to do.

Closes elastic#39835
Closes elastic#40900
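The split described in this commit message can be pictured roughly as follows. This is only a sketch of the idea, not the actual Elasticsearch implementation: the index names, the "is throttled" flags, and the `split_by_throttling` helper are all made up for illustration.

```python
from typing import Dict, List, Tuple


def split_by_throttling(indices: Dict[str, bool]) -> Tuple[List[str], List[str]]:
    """Partition index names into (throttled, non_throttled).

    `indices` maps an index name to a hypothetical "is throttled" flag.
    """
    throttled = [name for name, is_throttled in indices.items() if is_throttled]
    regular = [name for name, is_throttled in indices.items() if not is_throttled]
    return throttled, regular


throttled, regular = split_by_throttling(
    {"logs-2018": True, "logs-2019-03": False, "logs-2017": True}
)

# Each non-empty group becomes its own sub-search, so a slow throttled
# search does not hold up the search against the regular indices.
sub_searches = []
if throttled:
    sub_searches.append({"indices": throttled, "pre_filter_shard_size": 1})
if regular:
    sub_searches.append({"indices": regular})  # regular default applies
print(sub_searches)
```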

Now that we split the search execution in two whenever searching read-only and write indices as part of the same request (see elastic#42510), we can also automatically set `pre_filter_shard_size` to the appropriate value whenever not explicitly provided: `1` for read-only indices, and `128` (like before this change) for write indices. Note that we may still end up searching write and read-only indices as part of the same search execution, for instance when a scroll is provided or size is set to `0`, in which case we set `pre_filter_shard_size` to `128` when not explicitly set.

Closes elastic#39835
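The defaulting rule described above fits in a few lines. This is a sketch under stated assumptions: `resolve_pre_filter_shard_size` and its `read_only_only` flag are hypothetical names, and the mixed case (scroll, or size `0`) is modelled simply as "not exclusively read-only".

```python
from typing import Optional

DEFAULT_WRITE = 128     # default before this change, per the description above
DEFAULT_READ_ONLY = 1   # value applied to searches against read-only indices


def resolve_pre_filter_shard_size(explicit: Optional[int], read_only_only: bool) -> int:
    """Return the threshold to use for one search execution.

    `read_only_only` is True only when the execution targets read-only
    indices exclusively; mixed executions fall back to the old default.
    """
    if explicit is not None:
        return explicit  # an explicitly provided value always wins
    return DEFAULT_READ_ONLY if read_only_only else DEFAULT_WRITE


print(resolve_pre_filter_shard_size(None, read_only_only=True))
print(resolve_pre_filter_shard_size(None, read_only_only=False))
```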

jimczi · 2020-03-13T00:16:59Z

Since we reverted #42510, I think it makes sense to re-evaluate a simple solution that automatically executes the can match phase despite the default. This shouldn't be an issue going forward in Kibana, since they plan to use the new async search, which sets pre_filter_shard_size to 1 by default, but we should make the change in blocking _search nevertheless. We've improved the handling of queries on frozen indices that don't run the can match phase by avoiding the throttled search thread pool on shards that cannot match the date range filter, so it shouldn't be required to run the can match phase for these indices anymore. However, the can match phase is now also used to pre-sort shards on sorted queries, so we could automatically run this phase if it can optimize the query phase significantly.
@javanna WDYT?
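To make the pre-sorting idea concrete, here is one plausible way to order shards once each shard has reported the minimum and maximum value of the primary sort field. It is an illustrative sketch, not the Elasticsearch code: `ShardRange`, `pre_sort_shards`, and the sample values are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ShardRange:
    shard: str
    min_value: int  # smallest value of the sort field on this shard
    max_value: int  # largest value of the sort field on this shard


def pre_sort_shards(shards: List[ShardRange], descending: bool) -> List[str]:
    """Order shards so the most promising ones are queried first."""
    if descending:
        # Descending sort: shards holding the largest values go first.
        ordered = sorted(shards, key=lambda s: s.max_value, reverse=True)
    else:
        # Ascending sort: shards holding the smallest values go first.
        ordered = sorted(shards, key=lambda s: s.min_value)
    return [s.shard for s in ordered]


shards = [ShardRange("a", 10, 20), ShardRange("b", 0, 5), ShardRange("c", 30, 40)]
print(pre_sort_shards(shards, descending=False))
print(pre_sort_shards(shards, descending=True))
```

Querying the most promising shards first gives the query phase a chance to fill the top hits early, which is the kind of optimization the comment refers to.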

javanna · 2020-03-13T08:56:08Z

heya @jimczi, what do you mean by "automatically execute the can match phase despite the default"?

jimczi · 2020-03-13T09:32:13Z

Defaulting to a static value of 128 is not flexible enough, so today we require our users to set this value correctly. What I meant is that we should make the default (when pre_filter_shard_size is not present in the request) dynamic, in order to be able to automatically run the can_match phase:

- if frozen indices are part of the query,
- or if a primary field sort is used (sort by a field),
- ... (we can add more conditions here)

Users would be able to opt out by setting pre_filter_shard_size to a static value in their requests, but that shouldn't be needed in the majority of cases.

javanna · 2020-03-13T13:35:48Z

++ to having a more dynamic default, good idea @jimczi . We could then probably open a discussion on the need for the request parameter, and whether it still needs to be a threshold based on number of shards. Maybe it should become something to force enabling/disabling the execution of the can match phase.

This commit changes the pre_filter_shard_size default from 128 to unspecified. This allows applying heuristics based on the request and the target indices when deciding whether the can match phase should run or not. When unspecified, this PR runs the can match phase automatically if one of these conditions is met:

* The request targets more than 128 shards.
* The request contains read-only indices.
* The primary sort of the query targets an indexed field.

Users can opt out of this behavior by setting the `pre_filter_shard_size` to a static value.

Closes elastic#39835
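The conditions listed in this commit description can be sketched as a single predicate. The function and parameter names below are hypothetical, and the handling of an explicit value (run the phase when the shard count exceeds the given threshold) is an assumption based on how the parameter is described as a shard-count threshold in this thread.

```python
from typing import Optional

SHARD_THRESHOLD = 128  # shard count from the first condition above


def should_run_can_match(
    num_shards: int,
    has_read_only_indices: bool,
    primary_sort_on_indexed_field: bool,
    pre_filter_shard_size: Optional[int] = None,
) -> bool:
    """Decide whether the can match phase runs for a request."""
    if pre_filter_shard_size is not None:
        # Explicit value: the user opted out of the heuristics (assumed
        # semantics: run only above the user-supplied threshold).
        return num_shards > pre_filter_shard_size
    # Unspecified: any one of the three conditions is enough.
    return (
        num_shards > SHARD_THRESHOLD
        or has_read_only_indices
        or primary_sort_on_indexed_field
    )


print(should_run_can_match(5, has_read_only_indices=True,
                           primary_sort_on_indexed_field=False))
```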


timroes mentioned this issue Mar 8, 2019

set pre_filter_shard_size to 1 when includeFrozen is specified and frozen indices are queried elastic/kibana#32742

Closed

jimczi added the >enhancement, discuss, and :Search/Search labels Mar 8, 2019

tomcallahan added stalled and removed discuss labels Mar 14, 2019

javanna self-assigned this May 24, 2019

javanna removed the stalled label May 24, 2019

javanna mentioned this issue May 24, 2019

Split search in two when made against read-only and write indices #42510

Merged

javanna mentioned this issue Jun 19, 2019

Automatically adjust pre_filter_shard_size to 1 for readonly indices #43377

Closed

javanna removed their assignment Feb 28, 2020

jimczi mentioned this issue Mar 20, 2020

Add heuristics to compute pre_filter_shard_size when unspecified #53873

Merged

jimczi closed this as completed in #53873 Mar 23, 2020

codebrain mentioned this issue Apr 1, 2020

7.7.0 meta ticket (Part 2) elastic/elasticsearch-net#4533

Closed
