Add heuristics to compute pre_filter_shard_size when unspecified #53873

Merged: jimczi merged 4 commits into elastic:master from pre_filter_shard_size_heuristic on Mar 23, 2020

Conversation

jimczi
Contributor

@jimczi jimczi commented Mar 20, 2020

This commit changes the pre_filter_shard_size default from 128 to unspecified.
This makes it possible to apply heuristics based on the request and the target indices when deciding
whether the can match phase should run. When unspecified, this PR runs the can match phase automatically if any of these conditions is met:

  • The request targets more than 128 shards.
  • The request contains read-only indices.
  • The primary sort of the query targets an indexed field.

Users can opt out of this behavior by setting pre_filter_shard_size to a static value.
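
The decision described above can be sketched roughly as follows. This is only an illustration of the three conditions and the opt-out, not the actual Elasticsearch implementation: the `SearchContextSketch` class, its field names, and the `should_pre_filter` function are hypothetical, and the explicit-threshold branch assumes the documented behavior that the pre-filter round-trip runs when the number of shards exceeds the configured value.

```python
from dataclasses import dataclass
from typing import Optional

# Shard-count threshold used by the heuristic when nothing is specified (from the PR description).
DEFAULT_SHARD_THRESHOLD = 128


@dataclass
class SearchContextSketch:
    """Hypothetical summary of a search request, for illustration only."""
    num_shards: int
    has_read_only_indices: bool
    primary_sort_on_indexed_field: bool
    # None means the user left pre_filter_shard_size unspecified.
    pre_filter_shard_size: Optional[int] = None


def should_pre_filter(ctx: SearchContextSketch) -> bool:
    """Return True if the can match (pre-filter) phase should run."""
    if ctx.pre_filter_shard_size is not None:
        # Static value: only the shard count is compared against the threshold.
        return ctx.num_shards > ctx.pre_filter_shard_size
    # Unspecified: run the phase if any one of the three conditions holds.
    return (
        ctx.num_shards > DEFAULT_SHARD_THRESHOLD
        or ctx.has_read_only_indices
        or ctx.primary_sort_on_indexed_field
    )
```

Under this sketch, a request over 5 shards that includes a read-only index triggers the phase when the parameter is unspecified, but not when the user has pinned `pre_filter_shard_size` to `128`.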

Closes #39835

@jimczi jimczi added the >enhancement, :Search/Search, v8.0.0, and v7.7.0 labels Mar 20, 2020
@jimczi jimczi requested a review from javanna March 20, 2020 14:13
@elasticmachine
Collaborator

Pinging @elastic/es-search (:Search/Search)

@jimczi jimczi changed the title from "Add better heuristic to compute pre_filter_shard_size when unspecified" to "Add heuristics to compute pre_filter_shard_size when unspecified" Mar 20, 2020
Member

@javanna javanna left a comment

left some questions and nits but nothing major. This change makes so much sense that I now wonder "why didn't we do this before?" :)

docs/reference/search/search.asciidoc (outdated)
When unspecified, the pre-filter phase is executed if any of these conditions is met:
- The request targets more than `128` shards.
- The request contains read-only indices.
- The primary sort of the query targets an indexed field.
Member

thinking out loud: this could hit people who use CCS without minimizing roundtrips and experience latency, but given that we minimize roundtrips by default, it should be ok. and they can always increase the parameter manually should they see their CCS search slow down.

docs/reference/search/search.asciidoc (outdated)
docs/reference/search/search.asciidoc
a threshold that, when exceeded, will enforce a round-trip to pre-filter search shards that cannot possibly match.
This filter phase can limit the number of shards significantly. For instance, if a date range filter is applied, then all indices (frozen or unfrozen) that do not contain documents within the date range can be skipped efficiently.
The default value for `pre_filter_shard_size` is `128` but it's recommended to set it to `1` when searching frozen indices. There is no
significant overhead associated with this pre-filter phase.
Member

would it make sense to still explain and make sure that users don't mess with pre_filter_shard_size at this point? I would not want them to end up setting it when searching frozen indices. It sounds like there is never a good reason to do so?

Member

@javanna javanna left a comment

LGTM thanks @jimczi

@jimczi jimczi merged commit 04bd154 into elastic:master Mar 23, 2020
@jimczi jimczi deleted the pre_filter_shard_size_heuristic branch March 23, 2020 18:06
jimczi added a commit to jimczi/elasticsearch that referenced this pull request Mar 23, 2020
…stic#53873)

nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Mar 23, 2020
nik9000 added a commit that referenced this pull request Mar 24, 2020
jimczi added a commit that referenced this pull request Mar 24, 2020
) (#54007)

@jpountz
Contributor

jpountz commented Mar 24, 2020

@jimczi We have the following statement in the docs: "The default value for `pre_filter_shard_size` is `128` but it's recommended to set it to `1` when searching frozen indices." Do you think we can remove it now given this PR? Never mind, I misread the PR!

Labels
>enhancement :Search/Search Search-related issues that do not fall into other categories v7.7.0 v8.0.0-alpha1
Development

Successfully merging this pull request may close these issues.

Change pre_filter_shard_size default to 1 for frozen index searches
5 participants