Create a physical plan for block querying #2586

Merged
simonswine merged 15 commits into grafana:main from 20230928_deduplicate-blocks on Nov 9, 2023

Conversation

simonswine
Contributor

@simonswine simonswine commented Oct 25, 2023

This PR will reduce the work necessary to query blocks by planning out
which blocks are fetched from which ingester/store-gateway.

That way, queries will potentially not need profile-by-profile
deduplication for blocks that have already been deduplicated by the compactor.

TODO:

  • limit the BlockMetadata call to the query start/end in
    ingester/store-gateway
  • respect hints in ingester
  • Reuse part of the compactor's block deletion-mark algorithm to disqualify blocks (it is a bit more nuanced, so keeping it separate for now)
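
For illustration only, here is a rough sketch of what such a per-replica plan could look like, assuming a shape similar to the hints iterated over later in this PR; the type and field names are hypothetical, not the actual protobuf messages:

// Hypothetical sketch of a block query plan: which blocks each
// ingester/store-gateway replica should read, and whether it still has to
// deduplicate profiles in them. Names are illustrative, not the PR's API.
package main

import "fmt"

// BlockHints carries the per-replica instructions of the plan.
type BlockHints struct {
    BlockIDs      []string // ULIDs of the blocks this replica should query
    Deduplication bool     // false when the compactor already deduplicated them
}

func main() {
    // The plan maps a replica address to its hints.
    plan := map[string]*BlockHints{
        "ingester-0:9095":      {BlockIDs: []string{"block-a", "block-b"}, Deduplication: true},
        "store-gateway-1:9095": {BlockIDs: []string{"block-c"}, Deduplication: false},
    }
    for addr, hints := range plan {
        fmt.Printf("%s -> %d block(s), dedup=%v\n", addr, len(hints.BlockIDs), hints.Deduplication)
    }
}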

@simonswine simonswine changed the title WIP: Create a physical plan for blcok querying WIP: Create a physical plan for block querying Oct 26, 2023
@simonswine simonswine force-pushed the 20230928_deduplicate-blocks branch 3 times, most recently from 4300368 to 9338f89 on October 31, 2023 18:22
@simonswine simonswine force-pushed the 20230928_deduplicate-blocks branch 3 times, most recently from 8df68f6 to c7c1e11 on November 7, 2023 17:03
pkg/phlaredb/phlaredb.go
pkg/querier/ingester_querier.go
Comment on lines +364 to +369
// adapt the plan to make sure all replicas will deduplicate
if deduplicate {
for _, hints := range plan {
hints.Deduplication = deduplicate
}
}
Collaborator

Does it mean that if we get a query like "last 2 days" we will force deduplication, as the plan also includes blocks that have not been compacted yet? I guess this should be mitigated by the query time range split – I'm wondering if we should set it to the block duration by default.

Contributor Author

Yes, this is true. The first step is for the query time range split to roughly align with the processing time of the compacted block. So it should be slightly longer than the block duration, simply because we introduce a fair bit of latency before a block is compacted/deduplicated and available to store-gateways.

Ideally, in a future where the query-frontend is also aware of the plan, we can split according to the actual block layout.
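
As a rough illustration of that kind of split (a hypothetical helper, not the actual query-frontend code), cutting the query range at interval boundaries slightly larger than the block duration would keep the older sub-ranges on compacted, already-deduplicated blocks:

// Sketch of interval-aligned query time range splitting. The interval would
// be set a bit above the block duration; names are illustrative only.
package main

import (
    "fmt"
    "time"
)

type timeRange struct{ start, end time.Time }

// splitByInterval cuts [start, end) into sub-ranges aligned to interval boundaries.
func splitByInterval(start, end time.Time, interval time.Duration) []timeRange {
    var out []timeRange
    for cur := start; cur.Before(end); {
        next := cur.Truncate(interval).Add(interval) // next aligned boundary after cur
        if next.After(end) {
            next = end
        }
        out = append(out, timeRange{cur, next})
        cur = next
    }
    return out
}

func main() {
    end := time.Now().UTC().Truncate(time.Minute)
    start := end.Add(-48 * time.Hour) // a "last 2 days" query
    for _, r := range splitByInterval(start, end, 3*time.Hour) {
        fmt.Println(r.start.Format(time.RFC3339), "->", r.end.Format(time.RFC3339))
    }
}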

api/ingester/v1/ingester.proto
@simonswine simonswine marked this pull request as ready for review November 8, 2023 19:29
@simonswine simonswine requested review from a team as code owners November 8, 2023 19:29
Collaborator

@kolesnikovae kolesnikovae left a comment

LGTM 🚀 Looking forward to seeing this in action!

}

// TODO(simonswine): Split profiles per row group and run the MergeByStacktraces in parallel.
merge, err := querier.MergeByStacktraces(ctx, iter.NewSliceIterator(querier.Sort(profiles)))
Contributor

I think we can avoid sorting and pulling the actual labels. That could be a big saving.

Contributor Author

So you mean all in one step: a SelectAndMergeProfiles that runs one iterator through the data?
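
Something along these lines, as a rough sketch of the idea (types and names are made up for illustration, not the existing phlaredb API): drive a single iterator over the selected profiles and fold samples into per-stacktrace totals as they stream by, with no intermediate slice, sort, or label lookup:

// Illustrative only: a single-pass "select and merge by stacktraces".
package sketch

// Profile stands in for whatever handle the block querier yields per row.
type Profile struct {
    StacktraceIDs []uint64
    Values        []int64
}

// Iterator is a minimal pull-style iterator over selected profiles.
type Iterator interface {
    Next() bool
    At() Profile
    Err() error
}

// SelectAndMergeByStacktraces consumes the iterator once, accumulating
// per-stacktrace totals; nothing is materialized or sorted along the way.
func SelectAndMergeByStacktraces(it Iterator) (map[uint64]int64, error) {
    totals := make(map[uint64]int64)
    for it.Next() {
        p := it.At()
        for i, id := range p.StacktraceIDs {
            totals[id] += p.Values[i]
        }
    }
    return totals, it.Err()
}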

@simonswine simonswine changed the title WIP: Create a physical plan for block querying Create a physical plan for block querying Nov 9, 2023
@simonswine simonswine merged commit d4e3b03 into grafana:main Nov 9, 2023
17 checks passed
simonswine added a commit to simonswine/pyroscope that referenced this pull request Nov 9, 2023
* Create a physical plan for block querying

This PR will reduce the work necessary to query blocks, by planning out
which blocks are fetched from which ingester/store-gateway.

That way queries will potentially not need profile-by-profile
deduplication for blocks that had been deduplicated by the compactor.

* Remove unused field

* Missing `make generate` change

* Signal that there are no profiles streamed

* Generate block metadata correctly

* Improve tracing for block plan

* Request BlockMetadata correctly

Query the head first (heads and flushing) and then the block queriers; this makes
sure that we do not miss a block moving into flushing.

* Forward hints correctly to store-gateway

* Sort iterator properly

* Ensure block metadata responses are bounded by the query time range for ingesters

* Ignore storeQueryAfter for plan building

* Move the signalling to close the connection to the right place

* Ensure we only read replica for the correct instance type

* Handle no result profiles

* Log full plan in debug level