Split query by interval #713

kolesnikovae · 2023-05-24T19:27:38Z

Resolves #696

The PR contains a basic implementation of query parallelization by time sub-ranges. I decided not to over-complicate the solution for now, because the primary goal of the work is to find bottlenecks in the read path. Over time, however, we should have a composable query plan (logical and physical).
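To illustrate the idea, here is a minimal sketch of cutting a query time range into sub-ranges of a fixed interval. It is not the code from this PR: the `subRange` type and the `splitByInterval` function are hypothetical names, and the sketch assumes timestamps are Unix milliseconds and that a non-positive interval means "do not split".

```go
package main

import "fmt"

// subRange is a hypothetical half-open time range [Start, End),
// expressed in Unix milliseconds (an assumption of this sketch).
type subRange struct {
	Start, End int64
}

// splitByInterval cuts [start, end) into consecutive sub-ranges that are
// at most interval long. The last sub-range may be shorter. A non-positive
// interval (or an empty range) returns the input range unchanged.
func splitByInterval(start, end, interval int64) []subRange {
	if interval <= 0 || start >= end {
		return []subRange{{Start: start, End: end}}
	}
	ranges := make([]subRange, 0, (end-start)/interval+1)
	for cur := start; cur < end; cur += interval {
		next := cur + interval
		if next > end {
			next = end // clamp the tail to the query end
		}
		ranges = append(ranges, subRange{Start: cur, End: next})
	}
	return ranges
}

func main() {
	const hour = int64(3600 * 1000)
	// A 2.5 hour query split by a 1 hour interval.
	for _, r := range splitByInterval(0, 2*hour+hour/2, hour) {
		fmt.Printf("[%d, %d)\n", r.Start, r.End)
	}
}
```

The 2.5 hour range in `main` yields three sub-ranges, the last one 30 minutes long. Each sub-range can then be sent as an independent sub-query and the partial results merged.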

The implementation only covers the following APIs:

SelectMergeStacktraces
SelectSeries

The following configuration options were added:

querier.split-queries-by-interval. Defaults to 0, which disables the new mechanism.
querier.max-query-parallelism. Defaults to 0, which disables the limit. Specifies how many sub-queries of a single query the frontend can execute simultaneously. In practice, I don't think we need to limit this (or, if we do, the value can be very large, in the thousands), because it is up to the scheduler (query-scheduler.max-outstanding-requests-per-tenant defaults to 100) and the querier to decide when to execute a sub-query. The parameter is present for consistency with the existing frontend implementations.
querier.max-concurrent. Defaults to 4, the value that was previously set statically. Indicates how many requests (queries or sub-queries) a single querier can handle concurrently.

(I propose setting the values separately, once sane defaults are found.)
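As a rough sketch of what a per-query parallelism limit such as querier.max-query-parallelism implies, the snippet below runs sub-queries concurrently behind a counting semaphore. It is not the frontend code from this PR: `runSubQueries` is a hypothetical helper, sub-queries are modeled as plain functions returning an `int`, and a limit of 0 (or less) is treated as "unlimited", matching the default described above.

```go
package main

import (
	"fmt"
	"sync"
)

// runSubQueries executes all sub-queries and returns their results in
// input order. At most maxParallelism sub-queries run at the same time;
// maxParallelism <= 0 disables the limit.
func runSubQueries(queries []func() int, maxParallelism int) []int {
	results := make([]int, len(queries))
	var sem chan struct{} // nil when the limit is disabled
	if maxParallelism > 0 {
		sem = make(chan struct{}, maxParallelism)
	}
	var wg sync.WaitGroup
	for i, q := range queries {
		if sem != nil {
			sem <- struct{}{} // blocks while the limit is reached
		}
		wg.Add(1)
		go func(i int, q func() int) {
			defer wg.Done()
			if sem != nil {
				defer func() { <-sem }() // release the slot
			}
			results[i] = q() // each goroutine writes its own index
		}(i, q)
	}
	wg.Wait()
	return results
}

func main() {
	queries := make([]func() int, 5)
	for i := range queries {
		n := i
		queries[i] = func() int { return n * n }
	}
	fmt.Println(runSubQueries(queries, 2))
}
```

Note that this only bounds how many sub-queries are in flight from the frontend side; when a sub-query actually executes is still decided by the scheduler and the querier.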

There's an issue worth mentioning: a flame graph aggregated from many SelectMergeStacktraces calls differs from the one fetched without parallelization because of how truncation works. The difference is small but might be noticeable in some cases:

Comparison

The discrepancy comes from the fact that the set of truncated nodes for each "intermediate" flame graph (read: tree) is different and their weights vary from one to another. I'm not sure if this is something that requires an immediate fix, but we should keep an eye on this.

Before is on the left. Notice that the share of the other node has decreased. However, the effect may be the opposite in some cases, where nodes are truncated early (when we don't know about other trees), before a critical mass has built up, contributing to other more than we want. This is a problem because our assumption that we preserve the top N most significant nodes no longer holds. If it becomes a real issue, I propose to first simply increase the number of nodes for trees that are inputs of the aggregation (e.g. by 20-100%).
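The effect can be reproduced with a toy model. The sketch below is an illustration only, not the tree truncation used in the code base: it flattens each "flame graph" into a map of node name to weight, and the `truncate` and `merge` helpers are hypothetical. It compares truncating the fully merged profile with merging two already truncated sub-range profiles.

```go
package main

import (
	"fmt"
	"sort"
)

const other = "other"

// truncate keeps the maxNodes heaviest nodes and folds the rest into the
// "other" node. An existing "other" weight is always carried over.
func truncate(p map[string]int, maxNodes int) map[string]int {
	names := make([]string, 0, len(p))
	for name := range p {
		if name != other {
			names = append(names, name)
		}
	}
	// Heaviest first; ties are broken by name to keep the result stable.
	sort.Slice(names, func(i, j int) bool {
		if p[names[i]] != p[names[j]] {
			return p[names[i]] > p[names[j]]
		}
		return names[i] < names[j]
	})
	out := map[string]int{}
	if w := p[other]; w > 0 {
		out[other] = w
	}
	for i, name := range names {
		if i < maxNodes {
			out[name] = p[name]
		} else {
			out[other] += p[name]
		}
	}
	return out
}

// merge sums node weights across profiles.
func merge(ps ...map[string]int) map[string]int {
	out := map[string]int{}
	for _, p := range ps {
		for name, w := range p {
			out[name] += w
		}
	}
	return out
}

func main() {
	// Two sub-range results; "c" is never in the top 2 of either one,
	// but it is the heaviest node overall (7 + 7 = 14).
	first := map[string]int{"a": 10, "b": 8, "c": 7}
	second := map[string]int{"d": 9, "e": 8, "c": 7}

	direct := truncate(merge(first, second), 2)
	split := truncate(merge(truncate(first, 2), truncate(second, 2)), 2)
	// Mitigation: give the intermediate results a larger node budget.
	wider := truncate(merge(truncate(first, 3), truncate(second, 3)), 2)

	fmt.Println("direct:", direct)
	fmt.Println("split: ", split)
	fmt.Println("wider: ", wider)
}
```

In this example the direct result keeps c (14) and a (10) with other at 25, while the split result loses c entirely and ends up with a (10), d (9) and other at 30. Raising the node budget of the intermediate results from 2 to 3 restores the direct result here, which is the intuition behind increasing the number of nodes for the aggregation inputs.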

cyriltovena · 2023-05-25T08:19:41Z

Definitely like this approach.

pkg/frontend/frontend_select_merge_stacktraces.go

pkg/util/math/math.go

cyriltovena

LGTM

cyriltovena · 2023-05-31T15:05:57Z

I'm waiting for this one to be merged to start working on query limits in the frontend.

# Conflicts: # pkg/api/api.go

* Separate handlers in querier fronted
* Clarify http responce decompression implementation details
* Draft query time split
* Split SelectSeries by time
* Align interval to step duration
* Remove unused code
* Add querier.max-concurrent option
* Fix connect headers

Separate handlers in querier fronted

c5ce94d

kolesnikovae changed the title ~~Separate handlers in querier fronted~~ Split query by interval May 25, 2023

kolesnikovae added 2 commits May 25, 2023 10:53

Clarify http responce decompression implementation details

46ac6de

Draft query time split

8e344dd

kolesnikovae force-pushed the feat/query_split_by_interval branch 2 times, most recently from 2c5afd8 to f6efd32 Compare May 29, 2023 09:53

Split SelectSeries by time

73b6469

kolesnikovae force-pushed the feat/query_split_by_interval branch from f6efd32 to 73b6469 Compare May 29, 2023 10:57

Align interval to step duration

5ab5a2e

kolesnikovae commented May 29, 2023

pkg/frontend/frontend_select_merge_stacktraces.go (resolved)

kolesnikovae marked this pull request as ready for review May 29, 2023 15:39

kolesnikovae requested a review from cyriltovena May 30, 2023 08:15

cyriltovena reviewed May 30, 2023

pkg/util/math/math.go (outdated, resolved)

cyriltovena approved these changes May 30, 2023

kolesnikovae added 2 commits May 30, 2023 19:24

Remove unused code

12a8bdf

Add querier.max-concurrent option

5b0a109

kolesnikovae force-pushed the feat/query_split_by_interval branch from f2bc3a9 to 5b0a109 Compare May 31, 2023 13:15

kolesnikovae added 2 commits May 31, 2023 17:10

Fix connect headers

b4af8e3

Merge branch 'main' into feat/query_split_by_interval

24c90eb

# Conflicts: # pkg/api/api.go

kolesnikovae merged commit 9f19269 into main May 31, 2023
17 checks passed

kolesnikovae deleted the feat/query_split_by_interval branch May 31, 2023 15:27
