Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(caching): Support caching /series and /labels query results #11539

Merged
merged 27 commits into from Jan 4, 2024

Conversation

kavirajk
Copy link
Collaborator

@kavirajk kavirajk commented Dec 20, 2023

What this PR does / why we need it:
Add support for caching metadata queries (both series and labels).
caching happens after splitting similar to other types of queries.

This pr adds the following configs to enable them.

cache_series_results: true|false (default false)
cache_label_results: true|false (default false)

And the cache backend for them can be configured using series_results_cache and label_results_cache blocks under the query_range section.

Currently the split interval for metadata queries is fixed and defaults to 24h, this pr makes it configurable by introducing split_metadata_queries_by_interval

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • CHANGELOG.md updated
    • If the change is worth mentioning in the release notes, add add-to-release-notes label
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
  • For Helm chart changes bump the Helm chart version in production/helm/loki/Chart.yaml and update production/helm/loki/CHANGELOG.md and production/helm/loki/README.md. Example PR
  • If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

Copy link
Contributor

github-actions bot commented Dec 20, 2023

Trivy scan found the following vulnerabilities:

  • HIGH, Target: docker.io/grafana/loki:main-d2f4378 (alpine 3.18.4), Type: alpine openssl: Incorrect cipher key and IV length processing in libcrypto3 v3.1.3-r0. Fixed in v3.1.4-r0
  • HIGH, Target: docker.io/grafana/loki:main-d2f4378 (alpine 3.18.4), Type: alpine openssl: Incorrect cipher key and IV length processing in libssl3 v3.1.3-r0. Fixed in v3.1.4-r0
    \nTo see more details on these vulnerabilities, and how/where to fix them, please run docker build -t grafana/loki:main-d2f4378 -f cmd/loki/Dockerfile .
    trivy i grafana/loki:main-d2f4378 on your branch. If these were not introduced by your PR, please considering fixing them in via a subsequent PR. Thanks!

kavirajk and others added 10 commits December 21, 2023 18:37
Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
@github-actions github-actions bot added the type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories label Dec 26, 2023
@ashwanthgoli ashwanthgoli marked this pull request as ready for review December 29, 2023 10:01
@ashwanthgoli ashwanthgoli requested a review from a team as a code owner December 29, 2023 10:01
kavirajk and others added 2 commits January 2, 2024 15:03
Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
Copy link
Contributor

@dannykopping dannykopping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks great! Nice work fellas


# If series_results_cache is not configured and cache_series_results is true,
# the config for the results cache is used.
series_results_cache:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see much value in having separate cache configs for this, label results, and normal result cache. IMHO we should err on the side of simplicity until we find a very compelling case for needing separate caches for all 3.

Our config is already monstrously complex; we should do what we can to not make it more so.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree to this. one downside is that we would lose granularity with stats and metric collection since they are tried to the cache instance. labels, series and results caching all would be clubbed together.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How valuable are the metrics? Could we use a different label based on which "subcomponent" is observing metrics?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Thought about this when introducing it. Now we have so much repeated configs for normal results cache (metric range queries), index stats, volume, labels, series and for metric instant queries in future. All have exactly same configs.

One downside is it's hard to clean this up without breaking changes I think. Other option is to introduce shared config with "subcomponent" label but only for labels and series without breaking existing configs, then we loose the consistency with how we configure different results cache.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be fine with leaving this as is for now but combining them all in v3.0 with breaking changes, thoughts?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Adding it to the epic 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another thing that I am worried about is blowing up the size of stats proto. does encoding ignore fields that are not set? if that's the case we don't have to worry about this

only one of these would be set for a given request, I guess we could just collapse this into a single field with a new cache type field

Cache result = 3 [
(gogoproto.nullable) = false,
(gogoproto.jsontag) = "result"
];
Cache statsResult = 4 [
(gogoproto.nullable) = false,
(gogoproto.jsontag) = "statsResult"
];
Cache volumeResult = 5 [
(gogoproto.nullable) = false,
(gogoproto.jsontag) = "volumeResult"
];
Cache seriesResult = 6 [
(gogoproto.nullable) = false,
(gogoproto.jsontag) = "seriesResult"
];
Cache labelResult = 7 [
(gogoproto.nullable) = false,
(gogoproto.jsontag) = "labelResult"
];
}

pkg/logql/metrics.go Show resolved Hide resolved
pkg/loki/config_wrapper.go Show resolved Hide resolved
pkg/querier/queryrange/codec.go Show resolved Hide resolved
pkg/querier/queryrange/labels_cache.go Show resolved Hide resolved
pkg/querier/queryrange/labels_cache.go Show resolved Hide resolved
pkg/querier/queryrange/roundtrip.go Outdated Show resolved Hide resolved
pkg/querier/queryrange/series_cache.go Show resolved Hide resolved
Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
Copy link
Contributor

@dannykopping dannykopping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kavirajk kavirajk merged commit ce57448 into main Jan 4, 2024
8 checks passed
@kavirajk kavirajk deleted the kavirajk/metadata-caching branch January 4, 2024 12:12
kavirajk added a commit that referenced this pull request Jan 5, 2024
…11539)

**What this PR does / why we need it**:
Add support for caching metadata queries (both series and labels).
caching happens after splitting similar to other types of queries.

This pr adds the following configs to enable them.
```
cache_series_results: true|false (default false)
cache_label_results: true|false (default false)
```
And the cache backend for them can be configured using
`series_results_cache` and `label_results_cache` blocks under the
`query_range` section.

Currently the split interval for metadata queries is fixed and defaults
to 24h, this pr makes it configurable by introducing
`split_metadata_queries_by_interval`

**Which issue(s) this PR fixes**:
Fixes #<issue number>

**Special notes for your reviewer**:

**Checklist**
- [x] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [x] Documentation added
- [x] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] If the change is worth mentioning in the release notes, add
`add-to-release-notes` label
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/setup/upgrade/_index.md`
- [ ] For Helm chart changes bump the Helm chart version in
`production/helm/loki/Chart.yaml` and update
`production/helm/loki/CHANGELOG.md` and
`production/helm/loki/README.md`. [Example
PR](d10549e)
- [ ] If the change is deprecating or removing a configuration option,
update the `deprecated-config.yaml` and `deleted-config.yaml` files
respectively in the `tools/deprecated-config-checker` directory.
[Example
PR](0d4416a)

---------

Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
Co-authored-by: Ashwanth Goli <iamashwanth@gmail.com>
rhnasc pushed a commit to inloco/loki that referenced this pull request Apr 12, 2024
…rafana#11539)

**What this PR does / why we need it**:
Add support for caching metadata queries (both series and labels).
caching happens after splitting similar to other types of queries.

This pr adds the following configs to enable them.
```
cache_series_results: true|false (default false)
cache_label_results: true|false (default false)
```
And the cache backend for them can be configured using
`series_results_cache` and `label_results_cache` blocks under the
`query_range` section.

Currently the split interval for metadata queries is fixed and defaults
to 24h, this pr makes it configurable by introducing
`split_metadata_queries_by_interval`

**Which issue(s) this PR fixes**:
Fixes #<issue number>

**Special notes for your reviewer**:

**Checklist**
- [x] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [x] Documentation added
- [x] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] If the change is worth mentioning in the release notes, add
`add-to-release-notes` label
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/setup/upgrade/_index.md`
- [ ] For Helm chart changes bump the Helm chart version in
`production/helm/loki/Chart.yaml` and update
`production/helm/loki/CHANGELOG.md` and
`production/helm/loki/README.md`. [Example
PR](grafana@d10549e)
- [ ] If the change is deprecating or removing a configuration option,
update the `deprecated-config.yaml` and `deleted-config.yaml` files
respectively in the `tools/deprecated-config-checker` directory.
[Example
PR](grafana@0d4416a)

---------

Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
Co-authored-by: Ashwanth Goli <iamashwanth@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/XXL type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants