feat(metadata): introduce a separate split interval for recent query window #11897

ashwanthgoli · 2024-02-08T11:23:06Z

What this PR does / why we need it:

Metadata queries are split using a 24h interval by default, and the same interval is used in the cache key.
However, it is not possible to extract a subset of labels or series from a cached extent because, unlike samples, they are not associated with a timestamp. To prevent short queries from returning the results of an entire 24h extent, caching is disabled for the last 24h by default using max_metadata_cache_freshness.

However, we have noticed in our cloud environments that most queries fall within the last 24h. This PR introduces the following changes to help cache recent metadata query results:

- Use a smaller split interval for caching recent metadata queries. The portion of the query within recent_metadata_query_window is split using a different interval, split_recent_metadata_queries_by_interval.
- Only use extents that are entirely within the requested metadata query range. This avoids using extents that contain results from outside the requested range.

To use a shorter split interval for recent metadata queries, configure the following:

- recent_metadata_query_window: the window inside which the shorter split interval is applied. Disabled by default.
- split_recent_metadata_queries_by_interval: the split interval for the portion of the query within the recent metadata query window. Defaults to 1h. We recommend setting it to a value smaller than split_metadata_queries_by_interval.
- max_metadata_cache_freshness: reduce this value to control the cache freshness.

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Checklist

Reviewed the CONTRIBUTING.md guide (required)
Documentation added
Tests updated
CHANGELOG.md updated
- If the change is worth mentioning in the release notes, add add-to-release-notes label
Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
For Helm chart changes bump the Helm chart version in production/helm/loki/Chart.yaml and update production/helm/loki/CHANGELOG.md and production/helm/loki/README.md. Example PR
If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

kavirajk

Overall looks good 👍 nice job @ashwanthgoli

Left a few minor comments. Two other things:

- I see you have some tests to lock in the behaviour of returning the "correct" split interval either inside or outside the recent metadata query window, but I don't think there are enough tests to check the results of the split. I'm asking because I'm curious: for a smaller recent window (say 30m), if a query asks for 5m of labels, how big would the time range of the returned data be? 1h? 1w?
- I'd like to see what others think of having a "second" window to split metadata queries before merging it.

kavirajk · 2024-02-08T13:47:44Z

docs/sources/configure/_index.md

+# `split_recent_metadata_queries_by_interval`. The value 0 disables using a
+# different split interval for recent metadata queries.
+# CLI flag: -experimental.querier.recent-metadata-query-window
+[recent_metadata_query_window: <duration> | default = 0s]


The comment does a pretty good job of explaining "what" it does. IMHO, we should also explain "why" you need this flag in the first place.

Say the metadata queries coming into the system fall into more than one bucket, e.g. (1) >1h and (2) <30m. Then it's preferable to have two ways of splitting the metadata queries: normal splitting by 1h for all queries >1h, and splitting by, say, 1m for all queries <=30m. Something similar.

kavirajk · 2024-02-08T13:50:55Z

pkg/querier/queryrange/labels_cache.go

+	}
+
+	// if the query start is not before window start, it would be split using recentMetadataQuerySplitInterval
+	if windowStart := ref.Add(-recentMetadataQueryWindow); !start.Before(windowStart) {


nit: !start.Before(windowStart) -> windowStart.After(start). Seems a bit more intuitive IMO.

ashwanthgoli · 2024-02-09

Had to use !Before because we also have to consider the case where start aligns with the start of recentMetadataQueryWindow.

kavirajk · 2024-02-08T13:54:05Z

pkg/querier/queryrange/splitters.go


-	// move the ingester splits to the end to maintain correct order
-	reqs = append(reqs, ingesterSplits...)
+	// move the ingester/"recent metadata" splits to the end to maintain correct order


kavirajk · 2024-02-08T13:54:09Z

pkg/querier/queryrange/splitters.go

 		if endTimeInclusive {
 			end = end.Add(-util.SplitGap)
 		}

-		// query only overlaps ingester query window, nothing more to do
+		// query only overlaps ingester/"recent metadata" query window, nothing more to do


nit: / always confuses me if it's "or" or "and". Can we be more specific?

kavirajk · 2024-02-08T14:00:47Z

pkg/querier/queryrange/splitters.go

 	)

-	start, end, needsIngesterSplits := ingesterQueryBounds(execTime, s.iqo, req)
+	switch req.(type) {
+	case *LokiSeriesRequest, *LabelRequest:


Curious: why aren't we considering split_ingester_queries_by_interval for series and labels queries?

ashwanthgoli · 2024-02-09

I feel these features have orthogonal goals.

split_ingester_queries_by_interval was introduced to reduce the number of sub-queries we send to ingesters. Currently, metadata queries send at most one sub-query to ingesters with the default split interval.

While split_recent_metadata_queries_by_interval would increase the number of sub-queries, they should ideally de-amplify over time as the results get cached.

I am not sure if we'd use these two together for metadata queries, or what the config would look like if we did.

kavirajk · 2024-02-09

Makes sense. Can you record this somewhere as a comment for future reference?

slim-bean · 2024-02-09T13:05:34Z

Haven't looked too closely at the implementation, but the idea makes good sense to me.

First thought, does this need to be configurable? What circumstances would we ever change these configs?

ashwanthgoli · 2024-02-09T13:14:41Z

> First thought, does this need to be configurable? What circumstances would we ever change these configs?

Mostly, recent_metadata_query_window will be left at 24h unless the original split interval is changed. Having this configurable also allows disabling recent metadata splits if they are not working as expected.

split_recent_metadata_queries_by_interval would be useful for creating even smaller cache buckets: the smaller the bucket, the higher the chance of a hit. To start with, 1h seems like a good default to me.

kavirajk

LGTM 👍

dannykopping

Great work! Just one minor nit.

dannykopping · 2024-02-13T13:30:55Z

pkg/querier/queryrange/labels_cache.go

+// metadataSplitIntervalForTimeRange returns split interval for series and label requests.
+// If `recent_metadata_query_window` is configured and the query start interval is within this window,
+// it returns `split_recent_metadata_queries_by_interval`.
+// For other cases, the default split interval of `split_metadata_queries_by_interval` will be used.


I actually think your code is pretty simple and self-documenting; all this comment can do is go stale.

dannykopping · 2024-02-13T13:32:15Z

pkg/storage/chunk/cache/resultscache/cache.go

@@ -334,6 +336,25 @@ func (s ResultsCache) partition(req Request, extents []Extent) ([]Request, []Res
 			continue
 		}

+		if s.onlyUseEntireExtent && (start > extent.GetStart() || end < extent.GetEnd()) {


This is awesome 👏
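
The condition in the diff above can be read as a filter over cached extents. A minimal sketch, assuming simplified extents with millisecond bounds; `extent` and `usableExtents` are hypothetical names and not the actual ResultsCache API, and the overlap check is an assumption about the surrounding code.

```go
package main

import "fmt"

// extent is a simplified cached extent with inclusive millisecond bounds.
type extent struct {
	Start, End int64
}

// usableExtents returns the cached extents that may serve a request for
// [start, end]. When onlyUseEntireExtent is set, an extent that reaches
// outside the requested range is skipped, mirroring the condition
// start > extent.GetStart() || end < extent.GetEnd() from the diff.
func usableExtents(start, end int64, extents []extent, onlyUseEntireExtent bool) []extent {
	var out []extent
	for _, e := range extents {
		// Skip extents that do not overlap the request at all (assumed check).
		if e.Start > end || e.End < start {
			continue
		}
		if onlyUseEntireExtent && (start > e.Start || end < e.End) {
			continue
		}
		out = append(out, e)
	}
	return out
}

func main() {
	cached := []extent{{0, 200}, {200, 300}, {300, 500}}
	fmt.Println(usableExtents(100, 400, cached, true))  // only the extent fully inside [100, 400]
	fmt.Println(usableExtents(100, 400, cached, false)) // every overlapping extent
}
```

Because labels and series carry no timestamps, a partially overlapping extent cannot be trimmed to the requested range, so under this rule it is skipped rather than used.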

metadata: introduce recent metadata split window

d835151

pull-request-size bot added the size/XL label Feb 8, 2024

github-actions bot added the type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories label Feb 8, 2024

ashwanthgoli changed the title ~~metadata: introduce recent metadata split window~~ metadata: Introduce a separate split interval for recent query window Feb 8, 2024

ashwanthgoli changed the title ~~metadata: Introduce a separate split interval for recent query window~~ feat(metadata): introduce a separate split interval for recent query window Feb 8, 2024

kavirajk reviewed Feb 8, 2024

ashwanthgoli added 2 commits February 9, 2024 12:49

only extract full extents for metadata caching

79f6a11

fix and refactor label/series cache test

77264f9

pull-request-size bot added size/XXL and removed size/XL labels Feb 9, 2024

ashwanthgoli added 3 commits February 9, 2024 17:56

review suggestions

ce40b99

review suggestions #2

41159b7

add changelog

0ca86f0

ashwanthgoli marked this pull request as ready for review February 9, 2024 12:45

ashwanthgoli requested a review from a team as a code owner February 9, 2024 12:45

Merge branch 'main' into ashwanth/recent-metadata-splits

7c1cc79

kavirajk approved these changes Feb 13, 2024

dannykopping approved these changes Feb 13, 2024

ashwanthgoli added 2 commits February 14, 2024 16:23

nit

1e90614

Merge branch 'main' into ashwanth/recent-metadata-splits

5d00e24

ashwanthgoli merged commit 9e7725b into main Feb 14, 2024
9 checks passed

ashwanthgoli deleted the ashwanth/recent-metadata-splits branch February 14, 2024 11:16

ashwanthgoli added the backport k189 label Feb 14, 2024

grafanabot pushed a commit that referenced this pull request Feb 14, 2024

feat(metadata): introduce a separate split interval for recent query …

a811ebd

…window (#11897) (cherry picked from commit 9e7725b)

grafanabot mentioned this pull request Feb 14, 2024

[k189] feat(metadata): introduce a separate split interval for recent query window #11942

Merged

8 tasks

loki-gh-app bot mentioned this pull request Mar 27, 2024

chore(add-major-release-workflow): release 3.0.0-rc.1 #12380

Closed

rhnasc pushed a commit to inloco/loki that referenced this pull request Apr 12, 2024

feat(metadata): introduce a separate split interval for recent query …

979e3a4

…window (grafana#11897)
