Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

threshold based automatic query prioritization #9493

Merged
merged 8 commits into from
Mar 13, 2020

Conversation

clintropolis
Copy link
Member

Description

This PR is a follow-up to #9407 that adds a new interface QueryPrioritizationStrategy intended to enable implementations to automatically prioritize queries based on some criteria. As a proof of concept implementation of this functionality, it provides ThresholdBasedQueryDeprioritizationStrategy, which offers the 3 thresholds of: period from the current time of the query, duration of the interval of the query, and number of segments taking part in the query, described in #6993.

This strategy can be enabled by setting druid.query.scheduler.prioritization.strategy to threshold.

Property Description Default
druid.query.scheduler.prioritization.periodThreshold ISO duration threshold for how old data can be queried before automatically adjusting query priority. None
druid.query.scheduler.prioritization.durationThreshold ISO duration threshold for maximum duration a queries interval can span before the priority is automatically adjusted. None
druid.query.scheduler.prioritization.segmentCountThreshold Number threshold for maximum number of segments that can take part in a query before its priority is automatically adjusted. None
druid.query.scheduler.prioritization.adjustment Amount to reduce the priority of queries which cross any threshold. None

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths.
  • added integration tests.
  • been tested in a test Druid cluster.

@sascha-coenen
Copy link

sascha-coenen commented Mar 10, 2020

AWESOME! I love it. This will be so useful.

If a query surpasses the thresholds several times, for instance it would be several times the segment threshold or the duration threshold would fit into the query time range several times, would the "adjustment" be decremented several times too?
With the laning being implemented, will this become an alternative to the 'adjustment" property, to specify different lanes based on the query weight?


##### Prioritization strategies

###### No auto prioritization strategy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: hmm, is 'no auto' = 'manual'? Just wondering what is a better name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"None prioritization strategy" ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, after thinking about it, I think manual maybe makes the most sense, to be symmetrical with the ManualQueryLaningStrategy of #9492 because both rely on the user adding a value to the query context.


###### Threshold deprioritization strategy

This prioritization strategy deprioritizes queries that cross any of a configurable set of thresholds, such as how far in the past the data is, how large of an interval a query covers, or the number of segments taking part in a query.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: by rephrasing it to say, "... strategy lowers priority of queries that..." , we could make the heading be "Threshold prioritization strategy"

@clintropolis
Copy link
Member Author

thanks for the review @jihoonson and @himanshug

@clintropolis clintropolis merged commit 6afd55c into apache:master Mar 13, 2020
@clintropolis clintropolis deleted the query-auto-prioritization branch March 13, 2020 08:42
@jihoonson jihoonson added this to the 0.18.0 milestone Mar 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants