Query prioritisation support #1017

Open
Bukhtawar opened this issue Jul 27, 2021 · 3 comments
Labels
enhancement: Enhancement or improvement to existing feature or request
Search:Performance
Search: Search query, autocomplete ...etc

Comments

@Bukhtawar
Collaborator

Problem

We lack a prioritisation mechanism for queries. For instance:

  1. At the shard level, fetch-phase requests could have a higher priority than query-phase requests, so that requests already in progress are more likely to complete quickly
  2. Async search queries running across multiple clusters should have a lower priority than regular search requests
  3. Resource-intensive queries could similarly have a lower priority
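
To make the ordering in the list above concrete, here is a minimal sketch of a shard-level queue that serves fetch-phase work before query-phase work. It is only an illustration, not OpenSearch code: the `ShardTaskQueue` class, the `PHASE_PRIORITY` values, and the request ids are all hypothetical.

```python
import heapq
import itertools

# Hypothetical priority values: a lower number is served first.
PHASE_PRIORITY = {"fetch": 0, "query": 1, "async_query": 2}


class ShardTaskQueue:
    """Toy shard-level queue that serves higher-priority phases first."""

    def __init__(self):
        self._heap = []
        # Monotonic counter used as a tie-breaker, so tasks with the
        # same priority are served in submission (FIFO) order.
        self._counter = itertools.count()

    def submit(self, phase, request_id):
        entry = (PHASE_PRIORITY[phase], next(self._counter), request_id)
        heapq.heappush(self._heap, entry)

    def next_task(self):
        _, _, request_id = heapq.heappop(self._heap)
        return request_id


queue = ShardTaskQueue()
queue.submit("query", "q1")
queue.submit("async_query", "a1")
queue.submit("fetch", "f1")
queue.submit("query", "q2")

order = [queue.next_task() for _ in range(4)]
print(order)  # ['f1', 'q1', 'q2', 'a1']
```

A strict priority order like this can starve low-priority work under sustained load, so a real design would probably need ageing or per-priority quotas on top.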
@Bukhtawar Bukhtawar added the enhancement Enhancement or improvement to existing feature or request label Jul 27, 2021
@AmiStrn
Contributor

AmiStrn commented Jul 27, 2021

I was having a discussion about this literally just today. Thanks for this @Bukhtawar

@getsaurabh02
Member

Adding some thoughts and copying details over from #1140, which I am closing since this issue was created first.

As part of #1042 we are planning to do resource mapping of queries, and selective rejection when under duress. We want to extend that solution to also allow query prioritisation, which provides a mechanism to selectively execute queries when multiple queries with different priorities are contending for the same resources.

Not all queries in a workload are of equal importance to the customer. Often the performance of one request or set of queries is more important than that of others. With query prioritisation, customers can define the relative importance of queries in a workload by setting a priority value. The priority is specified for a dynamic queue, for example as one of (CRITICAL, HIGHEST, HIGH, NORMAL, LOW, LOWEST).

OpenSearch will use the priority when accepting queries for execution, and to determine the amount of resources to allocate to a query. By default, queries run with their priority set to NORMAL. The query priority is also used under duress, to selectively cancel resource-guzzling queries and recover the system.
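
As a rough sketch of how cancellation under duress could use these levels: the level names below come from the list above, but the numeric values, the `RunningQuery` type, the `heap_bytes` field, and the `pick_cancellations` helper are assumptions made for illustration and are not an existing OpenSearch API.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    # Level names are from the proposal; numeric values are assumed.
    CRITICAL = 5
    HIGHEST = 4
    HIGH = 3
    NORMAL = 2
    LOW = 1
    LOWEST = 0


@dataclass
class RunningQuery:
    query_id: str
    heap_bytes: int  # hypothetical resource-usage estimate
    priority: Priority = Priority.NORMAL  # default, as in the proposal


def pick_cancellations(queries, bytes_to_free):
    """Choose queries to cancel: lowest priority first, and within a
    priority level the largest memory consumer first."""
    victims, freed = [], 0
    for q in sorted(queries, key=lambda q: (q.priority, -q.heap_bytes)):
        if freed >= bytes_to_free:
            break
        victims.append(q.query_id)
        freed += q.heap_bytes
    return victims


running = [
    RunningQuery("dashboard", 200, Priority.HIGH),
    RunningQuery("report", 500, Priority.LOW),
    RunningQuery("adhoc", 300),
    RunningQuery("export", 100, Priority.LOW),
]
victims = pick_cancellations(running, bytes_to_free=700)
print(victims)  # ['report', 'export', 'adhoc']
```

Here the two LOW queries are cancelled first, and the NORMAL query is only touched because they do not free enough on their own; the HIGH query is left running.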

@jhinch-at-atlassian-com

I wanted to capture a practical use case for this feature in the company I work at. We have three main sources of load to OpenSearch:

  • Live user queries (essentially powering the results of a search box), which have strict reliability, availability and latency SLOs. These queries need to run immediately, as there is an end user waiting for search results. Queries of this type show daily peaks during business hours and lows outside business hours. An outage of these use cases would have material customer impact.
  • Background batch jobs, whose queries can be slower to execute, can be retried, and can be run at any point in the day.

Ideally, if OpenSearch were under stress (particularly on specific data nodes), it would shed load from the background jobs first, before it starts to shed live user queries. This could be done outside of OpenSearch, but then either the backpressure would be too crude, failing requests that would have executed just fine because their target data node or shard was not under stress while others were, or it would require complex logic to track which shards and data nodes a request maps to, plus external monitoring to determine whether those shards or data nodes are under stress. The OpenSearch cluster itself is much better placed to determine which shards are overloaded, provided it knows which requests are more or less important.
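
A minimal sketch of the node-aware load shedding described above, assuming the cluster knows which data nodes a request targets and a load figure per node. The `SHED_THRESHOLD` values, the node names, and the `should_shed` helper are hypothetical.

```python
# Hypothetical per-class load thresholds (0.0 to 1.0): background work
# is shed at a lower node load than live user traffic.
SHED_THRESHOLD = {"background": 0.80, "live": 0.95}


def should_shed(request_class, target_nodes, node_load):
    """Return True if any node the request would hit is at or above
    the shedding threshold for this class of request."""
    limit = SHED_THRESHOLD[request_class]
    return any(node_load.get(node, 0.0) >= limit for node in target_nodes)


node_load = {"data-1": 0.85, "data-2": 0.40}

# A background job hitting the stressed node is shed...
print(should_shed("background", ["data-1"], node_load))  # True
# ...a live query on the same node still runs...
print(should_shed("live", ["data-1"], node_load))        # False
# ...and a background job on a healthy node is unaffected.
print(should_shed("background", ["data-2"], node_load))  # False
```

The third case is the one that is hard to get right from outside the cluster: a crude client-side limit would also fail the background request that only touches `data-2`.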

Projects
Status: Later (6 months plus)
6 participants