advanced setting to control search request preference #17271

Merged
merged 5 commits into from
Mar 28, 2018

nreese
Contributor

@nreese nreese commented Mar 20, 2018

fixes #15573

Provides the advanced setting courier:setRequestPreference. When true (default), preference is set to sessionId, as it always has been. When false, preference is not set in the search header.

@elasticmachine
Contributor

💔 Build Failed

@nreese
Contributor Author

nreese commented Mar 20, 2018

@gchaps Can you please provide feedback on the wording for courier:setRequestPreference description?

@elasticmachine
Contributor

💚 Build Succeeded

Contributor

@stacey-gammon stacey-gammon left a comment

Cool!

value: true,
description: '<a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-preference.html" target="_blank" rel="noopener noreferrer">Request Preference</a> ' +
' controls the shard copies used for search execution.' +
' Set to true to execute all search requests on the same shards. ' +
Contributor

I think the description should mention the original reason we started using preference, which I believe was caching (Please correct me if I'm wrong @rashidkpc).

Like your description states, not sending preference sort of randomizes the node that handles the request, which means that the caches from a previous request might not be reusable on a second request and will need to be rebuilt.

Contributor Author

What about this

"Set to true to execute all search requests on the same shards. This has the benefit of reusing Shard Request Cache across requests. Set to false to have search request execution randomized among all available shard copies. Setting to false may provide better performance since requests can be spread across all shard copies but may result in inconsistent results as different shards may be in different refresh states."

Contributor Author

I wonder how many requests can actually use the cache, since "Most queries that use now (see Date Math) cannot be cached." (from the warning in shard-request-cache.html).

Contributor

There are other caches, like the caches from filters, which I'm almost positive are not impacted by using now.

Contributor

👍

Contributor

@spalger spalger left a comment

LGTM

@elasticmachine
Contributor

💚 Build Succeeded

@LucaWintergerst
Contributor

LucaWintergerst commented Mar 21, 2018

The way this is implemented right now might conflict with zone awareness in Elasticsearch. From the Elasticsearch docs:

When executing search or GET requests, with shard awareness enabled, Elasticsearch will prefer using local shards — shards in the same awareness group — to execute the request. This is usually faster than crossing between racks or across zone boundaries.

If I set this setting to false, the performance might be much worse as I will only hit half of the cluster!

A solution would be to set the preference to an empty string. This would then override the ES default when awareness is being used.
The reason I'm proposing this is that it preserves the behaviour users experience right now.

Ideally a user would be able to choose between sessionId, none (which would be an empty string), and local (which would mirror the behaviour when using zones).

@nreese
Contributor Author

nreese commented Mar 21, 2018

@LucaWintergerst Thanks for bringing this up before the changes were committed. There is no reason why the setting can't be a list instead of a boolean.

Just a couple of questions to make sure I understand everything.

  1. Why would not setting the preference with zone awareness enabled only hit half of the cluster?
  2. What happens if zone awareness is not enabled and preference is set to an empty string?

@LucaWintergerst
Contributor

LucaWintergerst commented Mar 21, 2018

  1. Let me try to explain in a bit more detail.
    Shard awareness is used to make Elasticsearch nodes aware of the underlying hardware. In most cases a cluster will be tagged with two awareness values to ensure that the replica shards reside on a different set of hardware than the primaries. (A set could be a different server rack, a different physical host when running in VMs, a different region in AWS, etc.)
    When awareness is used, the preference always defaults to hitting nodes tagged with the same attribute as the node that received the request from Kibana. If I have two zones, my query will only run on 50% of my nodes.
    Since Kibana is currently using preference: sessionId, it overrides that default behaviour.
    If we were to completely remove the preference, we would suddenly default to the local-shards behaviour, which would cause queries to run on only a subset of the nodes for many users.

  2. An empty string when zone awareness is not set will behave the same way as no preference. I will get someone from the ES team to confirm.
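
A toy model may help make the "50% of my nodes" point concrete. The sketch below only encodes the behaviour as described in this comment; it is not how Elasticsearch implements awareness, and `EsNode` and `candidateNodes` are hypothetical names introduced for illustration.

```typescript
// Toy model only: mirrors the behaviour described in the comment above,
// NOT Elasticsearch's actual shard-selection logic.
interface EsNode {
  name: string;
  zone: string; // awareness attribute value, e.g. a rack or availability zone
}

// Returns the nodes a search could run on, given the node that received
// the request and an optional preference string.
function candidateNodes(
  nodes: EsNode[],
  receivingNode: EsNode,
  preference?: string
): EsNode[] {
  if (preference === undefined) {
    // No preference sent: with awareness enabled, stay in the local zone.
    return nodes.filter((n) => n.zone === receivingNode.zone);
  }
  // Any explicit preference (a session ID, or even an empty string, per the
  // comment above) overrides the zone-local default in this model.
  return nodes;
}

const cluster: EsNode[] = [
  { name: 'a', zone: 'zone-1' },
  { name: 'b', zone: 'zone-1' },
  { name: 'c', zone: 'zone-2' },
  { name: 'd', zone: 'zone-2' },
];

const withoutPreference = candidateNodes(cluster, cluster[0]);
const withSessionId = candidateNodes(cluster, cluster[0], 'session-123');
```

In this model, `withoutPreference` contains only the two `zone-1` nodes, while `withSessionId` contains all four.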

@LucaWintergerst
Contributor

LucaWintergerst commented Mar 21, 2018

After talking to a developer from the ES team, we agreed that since this is an advanced setting, it would be fine the way it is now.
However, it would be even better if we could make it fully customizable so that all possible preferences can be used (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-preference.html).

How much work would it be to make it customizable?
We would need the following functions:

  • setRequestPreference: $SESSION_ID <- default
  • setRequestPreference: custom (set it to a user defined string)
  • setRequestPreference: none (do not set it at all)

We need this distinction as none will not be the same as an empty string.
I would love to get additional feedback on this.

@nreese
Contributor Author

nreese commented Mar 21, 2018

I don't think we could do it with a single parameter but I could do this with 2 parameters.

  1. setRequestPreference: list of sessionId, custom, none
  2. customRequestPreference: string input that allows custom preference (defaults to _local). customRequestPreference is only used when setRequestPreference is set to custom.

How does that sound?
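
As a rough illustration of the two-setting proposal, the sketch below resolves the preference value from the two settings. The setting names come from the list above; the `PreferenceSettings` type and the `resolvePreference` function are hypothetical and are not taken from the Kibana code in this PR.

```typescript
// Hypothetical sketch of the two-setting design proposed above.
// The setting names come from the discussion; everything else is assumed.
type RequestPreferenceMode = 'sessionId' | 'custom' | 'none';

interface PreferenceSettings {
  setRequestPreference: RequestPreferenceMode;
  // Only read when setRequestPreference is 'custom'.
  customRequestPreference: string;
}

// Returns the preference string to send, or undefined to send none at all.
function resolvePreference(
  settings: PreferenceSettings,
  sessionId: string
): string | undefined {
  switch (settings.setRequestPreference) {
    case 'sessionId':
      return sessionId;
    case 'custom':
      return settings.customRequestPreference;
    case 'none':
      // Distinct from returning '': the preference is omitted entirely.
      return undefined;
  }
}

const example = resolvePreference(
  { setRequestPreference: 'custom', customRequestPreference: '_local' },
  'session-123'
);
```

Returning `undefined` for `none`, rather than an empty string, keeps the distinction raised earlier in the thread: an omitted preference and an empty-string preference are not the same thing.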

@LucaWintergerst
Contributor

that sounds great to me

@gchaps
Contributor

gchaps commented Mar 22, 2018

I'm thinking of something like this:

Allows you to set which shards handle your search requests. $Session_ID restricts operations to shards with your session id. custom allows you to define your own preference, such as a web session ID or username. none means do not set a preference. This differs from an empty string, which means no preference. Learn more

I'm not sure how much you want to explain about customRequestPreference

@nreese
Contributor Author

nreese commented Mar 27, 2018

@spalger @stacey-gammon As discussed above, I have changed courier:setRequestPreference to be a select and added courier:customRequestPreference.

[Image: screen shot 2018-03-27 at 9 15 12 am]

@gchaps I changed the wording of the session id to keep the bit about caching and added more details to none.

@elasticmachine
Contributor

💚 Build Succeeded

@gchaps
Contributor

gchaps commented Mar 27, 2018

@nreese I recommend breaking the last sentence into two sentences:

This might provide better performance because requests can be spread across all shard copies. However, results might be inconsistent because different shards might be in different refresh states.

We typically use "might" instead of "may" for localization reasons.

@elasticmachine
Contributor

💚 Build Succeeded

@stacey-gammon
Contributor

@LucaWintergerst - The Preference documents say:

By default, the operation is randomized among the available shard copies.

But if I am understanding correctly, that is only the default behavior when zone awareness is not set? When zone awareness is set, then the default behavior, when no preference is given, is _local? If I am understanding all this correctly, maybe the docs should be updated? (cc @gchaps)

@nreese - Looks great! Confirmed the requests look as expected and if I set it to something like _shards:8,9 I get no results (since I don't actually have those shards).
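
For readers who want to picture what such a request looks like: in an `_msearch` call, the preference travels in the header line that precedes each search body. The sketch below builds that newline-delimited payload. `SearchRequest` and `buildMsearchBody` are illustrative names, not Kibana's courier code, and the index name is made up.

```typescript
// Illustrative only: builds the newline-delimited body of an _msearch request.
// The helper and its types are assumptions, not Kibana's implementation.
interface SearchRequest {
  index: string;
  body: object;
}

function buildMsearchBody(
  requests: SearchRequest[],
  preference?: string
): string {
  const lines: string[] = [];
  for (const req of requests) {
    const header: { index: string; preference?: string } = { index: req.index };
    // Only add the key when a preference was resolved; "none" omits it.
    if (preference !== undefined) {
      header.preference = preference;
    }
    lines.push(JSON.stringify(header), JSON.stringify(req.body));
  }
  // _msearch bodies are newline-delimited and end with a trailing newline.
  return lines.join('\n') + '\n';
}

const payload = buildMsearchBody(
  [{ index: 'logs', body: { query: { match_all: {} } } }],
  '_shards:8,9'
);
```

With a custom preference of `_shards:8,9`, the header line carries `"preference":"_shards:8,9"`; with no preference, the key is left out of the header entirely.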

@LucaWintergerst
Contributor

@stacey-gammon It's not just _local; it is more like this: _prefer_nodes=<all nodes in that awareness zone> (I have to admit that I don't know exactly how it is done within Elasticsearch).
Updating the docs sounds like a good idea. Perhaps change it to this:
By default, the operation is randomized among the available shard copies, unless allocation awareness is used.

@nreese nreese merged commit 3fecd2d into elastic:master Mar 28, 2018
nreese added a commit to nreese/kibana that referenced this pull request Mar 28, 2018
* advanced setting to control search request preference

* add header tests

* add sentince about caching to description

* change courier:setRequestPreference to list and add courier:customRequestPreference

* update setting text
nreese added a commit that referenced this pull request Mar 28, 2018
* advanced setting to control search request preference

* add header tests

* add sentince about caching to description

* change courier:setRequestPreference to list and add courier:customRequestPreference

* update setting text
Development

Successfully merging this pull request may close these issues.

Allow customisation of "preferences" for _msearch
6 participants