Return `Canceled` rather than `Aborted` when a `Series` request to a store-gateway is cancelled by the calling querier. #4007
…gateway is cancelled.
This looks good to me, thanks!
LGTM, thank you
LGTM! We can merge this, but can you also look at `LabelNames()` and `LabelValues()`? I think they suffer the same issue.
Yep, I'll take a look at that shortly.
* Update test * Add missing changelog entries for commits since Mimir 2.5 (#4006) All other commits weren't user-facing or were helm-chart specific. See #3979 Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com> * Add --concurrency support to 'mimirtool rules sync' command (#3996) * Add --concurrency support to 'mimirtool rules sync' command Signed-off-by: Marco Pracucci <marco@pracucci.com> * Update pkg/mimirtool/commands/rules.go Co-authored-by: Patrick Oyarzun <patrick.oyarzun@grafana.com> Signed-off-by: Marco Pracucci <marco@pracucci.com> Co-authored-by: Patrick Oyarzun <patrick.oyarzun@grafana.com> * store-gateway: ExpandedPostings shortcut: avoid LabelValues unless necessary (#3872) * Return `Canceled` rather than `Aborted` when a `Series` request to a store-gateway is cancelled by the calling querier. (#4007) * Return Canceled rather than Aborted when a Series request to a store-gateway is cancelled. * Add changelog entry. * Update mimir-prometheus, add support for align_evaluation_time_on_interval. (#4013) Signed-off-by: Peter Štibraný <pstibrany@gmail.com> * Fix title of guide in link text; reword phrase. (#4008) * Fix ExampleInitLogger to work in UTC (#4016) The test didn't pass in my time zone (tm). --- FAIL: ExampleInitLogger (0.00s) got: ts=1970-01-01T01:00:00+01:00 caller=log_test.go:31 level=info test=1 ts=1970-01-01T01:00:00+01:00 caller=log_test.go:33 level=info msg="test 3" want: ts=1970-01-01T00:00:00Z caller=log_test.go:31 level=info test=1 ts=1970-01-01T00:00:00Z caller=log_test.go:33 level=info msg="test 3" Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Create outline of Mimir 2.6 release notes (#4002) Includes notable features and bugfixes based on the CHANGELOG. Helm changes to be filled out later by product and engineering. See #3979 Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com> * Fix post-merge comments from PR #4013. 
(#4014) Signed-off-by: Peter Štibraný <pstibrany@gmail.com> * Update CODEOWNERS to include mimir-ruler-and-alertmanager-maintainers (#4019) For those who only want notifications re the ruler or Alertmanager. * Remove internal use of store.max-query-length (#4017) Make deprecation of the option more obvious and attempt to remove any use of store.max-query-length in our documentation, jsonnet, helm, and integration tests. See #2793 See #3825 Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com> Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com> * [otlp] Update OTel Collector to latest release (#3852) * [otlp] Update otel collector dependecy to latest * Update code to deal with deprecated functions Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com> * [otlp] Docs: Highlight common issues with OTLP --> Prometheus (#3629) Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com> Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com> * make it possible to inject memberlist kv codecs (#4018) * make it possible to inject memberlist kv codecs Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com> * add comment Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com> * improve comment wording Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com> Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com> * Limits and errors for ephemeral storage (#4004) * Add limits for ephemeral storage. * Add new reason when ingestion of ephemeral metrics fails. * Add tests for max ephemeral series limit. * Introduce new discard reasons when ingesting ephemeral series. Signed-off-by: Peter Štibraný <pstibrany@gmail.com> * Reduce maintainership and step down as team member. (#4023) * Reduce maintainership and step down as team member. My future priorities will be on the alerting aspects of Mimir, so I think it is right to reduce my maintainership accordingly and allow others to take my place. Similarly, remove myself as a team member. 
* Sort previous team members. Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com> Signed-off-by: Marco Pracucci <marco@pracucci.com> Signed-off-by: Peter Štibraný <pstibrany@gmail.com> Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com> Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Marco Pracucci <marco@pracucci.com> Co-authored-by: Patrick Oyarzun <patrick.oyarzun@grafana.com> Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com> Co-authored-by: Charles Korn <charleskorn@users.noreply.github.com> Co-authored-by: Peter Štibraný <pstibrany@gmail.com> Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com> Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com> Co-authored-by: gotjosh <josue.abreu@gmail.com> Co-authored-by: Goutham Veeramachaneni <gouthamve@gmail.com> Co-authored-by: Mauro Stettler <mauro.stettler@gmail.com> Co-authored-by: Steve Simpson <steve.simpson@grafana.com>
…ateway is cancelled by the caller, and return Internal otherwise. See #4007 for explanation.
What this PR does
Queriers make multiple requests to store-gateways simultaneously. If one of these requests fails, or if the querier decides to stop processing the request (eg. due to a query limit being reached, or an invalid query), the querier will cancel all in-flight store-gateway requests.
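The fan-out-and-cancel behaviour described above can be sketched roughly as follows. This is a minimal illustration using only the Go standard library, not the querier's actual code: `fetchFunc`, `fanOut`, `demo` and `errLimit` are hypothetical names, and each `fetchFunc` stands in for one store-gateway request.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// fetchFunc is a hypothetical stand-in for one store-gateway request.
type fetchFunc func(ctx context.Context) error

// errLimit is a hypothetical error representing a query limit being reached.
var errLimit = errors.New("query limit reached")

// fanOut runs every fetch concurrently and cancels the shared context as
// soon as one of them fails, so the remaining in-flight calls can stop early.
// It returns the first error observed, or nil if all fetches succeed.
func fanOut(parent context.Context, fetches []fetchFunc) error {
	ctx, cancel := context.WithCancel(parent)
	defer cancel()

	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
	)
	for _, f := range fetches {
		wg.Add(1)
		go func(f fetchFunc) {
			defer wg.Done()
			if err := f(ctx); err != nil {
				once.Do(func() {
					firstErr = err
					cancel() // stop all other in-flight requests
				})
			}
		}(f)
	}
	wg.Wait()
	return firstErr
}

// demo starts two slow requests and one failing request. It returns how many
// slow requests observed the cancellation, plus the error from fanOut.
func demo() (int, error) {
	var mu sync.Mutex
	cancelled := 0

	slow := func(ctx context.Context) error {
		select {
		case <-ctx.Done():
			mu.Lock()
			cancelled++
			mu.Unlock()
			return ctx.Err()
		case <-time.After(5 * time.Second):
			return nil
		}
	}
	failing := func(ctx context.Context) error { return errLimit }

	err := fanOut(context.Background(), []fetchFunc{slow, failing, slow})
	return cancelled, err
}

func main() {
	n, err := demo()
	fmt.Println("cancelled in-flight requests:", n, "error:", err)
}
```

From the point of view of each slow request, the cancellation is an expected event triggered by the caller rather than a failure of the request itself, which is the distinction the rest of this description is about.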
Previously, the store-gateway would return an `Aborted` gRPC error if the request is cancelled by the caller. However, this would then be recorded in logs and metrics with `status="error"`.
`Canceled` is the preferred error for this scenario (the caller cancelling the request). Requests returning `Canceled` are also recorded in our logs and metrics with `status="cancel"`, which more accurately reflects what happened. (I was confused while diagnosing an alert by a high number of requests logged with `status="error"` which were in fact the store-gateway handling an expected scenario in the desired way - `status="cancel"` is much clearer to me in this scenario.)
This PR changes the `Series` endpoint on store-gateways to return `Canceled` when the caller (eg. a querier) cancels the request.
This does not require any changes on the querier side, as we already have special handling for the scenario where the querier cancels the request.
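As a rough illustration of the mapping described above, here is a standard-library-only sketch. It is not the store-gateway implementation: the `code` type and the `codeForSeriesError`, `statusLabelForFailure` and `cancelledCallErr` helpers are hypothetical, and real code would more likely use the `codes` and `status` packages from grpc-go. Keeping `Aborted` for every non-cancellation error is also an assumption made for the sketch.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// code is a hypothetical stand-in for a gRPC status code.
type code string

const (
	codeOK       code = "OK"
	codeCanceled code = "Canceled"
	codeAborted  code = "Aborted"
)

// codeForSeriesError picks the code a Series-style handler would return.
// Caller-side cancellation maps to Canceled; in this sketch every other
// error keeps the previous Aborted code (an assumption).
func codeForSeriesError(err error) code {
	switch {
	case err == nil:
		return codeOK
	case errors.Is(err, context.Canceled):
		return codeCanceled
	default:
		return codeAborted
	}
}

// statusLabelForFailure mirrors the two label values mentioned above for
// requests that did not succeed: "cancel" for Canceled, "error" otherwise.
func statusLabelForFailure(c code) string {
	if c == codeCanceled {
		return "cancel"
	}
	return "error"
}

// cancelledCallErr simulates the error a handler sees once the caller has
// cancelled the request context, wrapped the way handler code often does.
func cancelledCallErr() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return fmt.Errorf("series request: %w", ctx.Err())
}

func main() {
	c := codeForSeriesError(cancelledCallErr())
	fmt.Println(c, statusLabelForFailure(c))
}
```

Using `errors.Is` rather than a direct comparison means the check still works when the cancellation error has been wrapped on its way up the call stack.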
Which issue(s) this PR fixes or relates to
(none)
Checklist
`CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`