New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
config: adds frontend.max-query-capacity
to tune per-tenant query capacity
#11284
Conversation
Trivy scan found the following vulnerabilities:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[docs team] Docs LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good, but I think there's some refactoring that needs to be done for safety and clarity.
pkg/scheduler/scheduler.go
Outdated
|
||
pendingRequests: map[requestKey]*schedulerRequest{}, | ||
connectedFrontends: map[string]*connectedFrontend{}, | ||
queueMetrics: queueMetrics, | ||
ringManager: ringManager, | ||
requestQueue: queue.NewRequestQueue(cfg.MaxOutstandingPerTenant, cfg.QuerierForgetDelay, queueMetrics), | ||
requestQueue: queue.NewRequestQueue(cfg.MaxOutstandingPerTenant, cfg.QuerierForgetDelay, limits.QueueLimits(schedulerLimits), queueMetrics), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this would look neater if we move it to the queue pkg queue.Limits(frontendLimits)
but I left it in scheduler pkg since it is an adapter between scheduler and queue limits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the refactor! It reads much more clearly now.
There were a couple troubling things that caught my eye though with the changes, but they shouldn't take long to resolve
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking much cleaner now, thanks @ashwanthgoli!
Just minor nits now
Co-authored-by: Danny Kopping <danny.kopping@grafana.com>
Co-authored-by: Danny Kopping <danny.kopping@grafana.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
…apacity (grafana#11284) **What this PR does / why we need it**: Adds a new config `frontend.max-query-capacity` that allows users to configure what portion of the the available querier replicas can be used by a tenant. `max_query_capacity` is the corresponding YAML option that can be configured in limits or runtime overrides. For example, setting this to 0.5 would allow a tenant to use half of the available queriers. This complements the existing `frontend.max-queriers-per-tenant`. When both are configured, the smaller value of the resulting querier replica count is considered: ``` min(frontend.max-queriers-per-tenant, ceil(querier_replicas * frontend.max-query-capacity)) ``` *All* queriers will handle requests for a tenant if neither limits are applied. **Which issue(s) this PR fixes**: Fixes #<issue number> **Special notes for your reviewer**: noticed that we don't pass down the shuffle sharding limits for frontend (only using it with schedulers) https://github.com/grafana/loki/blob/26f097162a856db48ecbd16bef2f0b750029855b/pkg/loki/modules.go#L895 but the [docs](https://github.com/grafana/loki/blob/26f097162a856db48ecbd16bef2f0b750029855b/pkg/validation/limits.go#L276) mention that`frontend.max-queriers-per-tenant` applies to frontend as well. ``` This option only works with queriers connecting to the query-frontend / query-scheduler, not when using downstream URL. ``` **Checklist** - [x] Reviewed the [`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md) guide (**required**) - [x] Documentation added - [x] Tests updated - [x] `CHANGELOG.md` updated - [ ] If the change is worth mentioning in the release notes, add `add-to-release-notes` label - [ ] Changes that require user attention or interaction to upgrade are documented in `docs/sources/setup/upgrade/_index.md` - [ ] For Helm chart changes bump the Helm chart version in `production/helm/loki/Chart.yaml` and update `production/helm/loki/CHANGELOG.md` and `production/helm/loki/README.md`. [Example PR](grafana@d10549e) - [ ] If the change is deprecating or removing a configuration option, update the `deprecated-config.yaml` and `deleted-config.yaml` files respectively in the `tools/deprecated-config-checker` directory. [Example PR](grafana@0d4416a) --------- Co-authored-by: J Stickler <julie.stickler@grafana.com> Co-authored-by: Danny Kopping <danny.kopping@grafana.com>
What this PR does / why we need it:
Adds a new config
frontend.max-query-capacity
that allows users to configure what portion of the the available querier replicas can be used by a tenant.max_query_capacity
is the corresponding YAML option that can be configured in limits or runtime overrides.For example, setting this to 0.5 would allow a tenant to use half of the available queriers.
This complements the existing
frontend.max-queriers-per-tenant
. When both are configured, the smaller value of the resulting querier replica count is considered:All queriers will handle requests for a tenant if neither limits are applied.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
noticed that we don't pass down the shuffle sharding limits for frontend (only using it with schedulers)
loki/pkg/loki/modules.go
Line 895 in 26f0971
but the docs mention that
frontend.max-queriers-per-tenant
applies to frontend as well.Checklist
CONTRIBUTING.md
guide (required)CHANGELOG.md
updatedadd-to-release-notes
labeldocs/sources/setup/upgrade/_index.md
production/helm/loki/Chart.yaml
and updateproduction/helm/loki/CHANGELOG.md
andproduction/helm/loki/README.md
. Example PRdeprecated-config.yaml
anddeleted-config.yaml
files respectively in thetools/deprecated-config-checker
directory. Example PR