-
Notifications
You must be signed in to change notification settings - Fork 526
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add store gateway consistency check errors to errors catalog #2150
Conversation
862457e
to
ab3b727
Compare
- At query time the querier and ruler determine how old a bucket index is based on the time it was last updated by the compactor. | ||
- If the age is older than the maximum stale period configured via `-blocks-storage.bucket-store.bucket-index.max-stale-period`, the query fails. | ||
- This circuit breaker ensures queriers and rulers do not return any partial query results due to a stale view over the long-term storage. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- At query time the querier and ruler determine how old a bucket index is based on the time it was last updated by the compactor. | |
- If the age is older than the maximum stale period configured via `-blocks-storage.bucket-store.bucket-index.max-stale-period`, the query fails. | |
- This circuit breaker ensures queriers and rulers do not return any partial query results due to a stale view over the long-term storage. | |
- At query time, the querier and the ruler determine how old a bucket index is based on the time that it was last updated by the compactor. | |
- If the age is older than the maximum stale period that is configured via `-blocks-storage.bucket-store.bucket-index.max-stale-period`, the query fails. | |
- This circuit breaker ensures that the queriers and rulers do not return any partial query results, due to a stale view that spans long-term storage. |
Not 100% what "over" is trying to convey here. Do you mean, ", due to a stale overview of the entire long-term storage."?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Applied your suggestion except for the last part, please check that it makes sense.
@@ -112,3 +114,7 @@ func (f *BucketIndexBlocksFinder) GetBlocks(ctx context.Context, userID string, | |||
|
|||
return blocks, matchingDeletionMarks, nil | |||
} | |||
|
|||
func newBucketIndexTooOldError(updatedAt time.Time, maxStalePeriod time.Duration) error { | |||
return errors.New(globalerror.BucketIndexTooOld.MessageWithLimitConfig(fmt.Sprintf("bucket index is too old. It was last updated at %s which exceeds the maximum allowed staleness period of %v", updatedAt.UTC().Format(time.RFC3339Nano), maxStalePeriod), tsdb.BucketIndexConfigPrefix+tsdb.MaxStalePeriodFlag)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return errors.New(globalerror.BucketIndexTooOld.MessageWithLimitConfig(fmt.Sprintf("bucket index is too old. It was last updated at %s which exceeds the maximum allowed staleness period of %v", updatedAt.UTC().Format(time.RFC3339Nano), maxStalePeriod), tsdb.BucketIndexConfigPrefix+tsdb.MaxStalePeriodFlag)) | |
return errors.New(globalerror.BucketIndexTooOld.MessageWithLimitConfig(fmt.Sprintf("The bucket index is too old. It was last updated at %s, which exceeds the maximum allowed staleness period of %v", updatedAt.UTC().Format(time.RFC3339Nano), maxStalePeriod), tsdb.BucketIndexConfigPrefix+tsdb.MaxStalePeriodFlag)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit before applying this: golang errors have to start with lowercase.
If we're going to hack that rule (which may make sense since we're building end-user facing messages here, not just internal wrappable errors), we should make the hacking standard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Applied accept for the initial capital. If we decide to change that, let's do it in a separate PR since it would affect many errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No separate PR needed. Let’s keep this as is per @colega’s rationale.
}{ | ||
"newBucketIndexTooOldError": { | ||
err: newBucketIndexTooOldError(time.Unix(1000000000, 0), time.Hour), | ||
msg: `bucket index is too old. It was last updated at 2001-09-09T01:46:40Z which exceeds the maximum allowed staleness period of 1h0m0s (err-mimir-bucket-index-too-old). You can adjust the related per-tenant limit by configuring -blocks-storage.bucket-store.bucket-index.max-stale-period, or by contacting your service administrator.`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
msg: `bucket index is too old. It was last updated at 2001-09-09T01:46:40Z which exceeds the maximum allowed staleness period of 1h0m0s (err-mimir-bucket-index-too-old). You can adjust the related per-tenant limit by configuring -blocks-storage.bucket-store.bucket-index.max-stale-period, or by contacting your service administrator.`, | |
msg: "The bucket index is too old. It was last updated at 2001-09-09T01:46:40Z, which exceeds the maximum allowed staleness period of 1h0m0s (err-mimir-bucket-index-too-old). To adjust the related per-tenant limit, configure `-blocks-storage.bucket-store.bucket-index.max-stale-period`, or contact your service administrator.", |
Change from single to double quotes; please make sure I did not break things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Applied the first part. The second part of this message is part of a template, so if we decide to change that, let's do it in a separate PR since it would affect many errors.
}{ | ||
"newStoreConsistencyCheckFailedError": { | ||
err: newStoreConsistencyCheckFailedError([]ulid.ULID{ulid.MustNew(1, nil)}), | ||
msg: `consistency check failed because some blocks were not queried (err-mimir-store-consistency-check-failed). The blocks are: 00000000010000000000000000`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
msg: `consistency check failed because some blocks were not queried (err-mimir-store-consistency-check-failed). The blocks are: 00000000010000000000000000`, | |
msg: `The consistency check failed because some blocks were not queried (err-mimir-store-consistency-check-failed). The blocks are: 00000000010000000000000000`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Applied except for the initial capital
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unblocking with feedback; please have a look so things can be more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change(s) look good to me, but I agree with the other reviewer comments.
Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com> Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>
db27a98
to
7e39f62
Compare
@osg-grafana Please check the latest changes when you're free, thank you! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job, LGTM! I just left a couple of final nits.
"Ship it!" :) |
Co-authored-by: Marco Pracucci <marco@pracucci.com>
…#2150) * Add store gateway consistency check errors to errors catalogue * Apply suggestions from code review Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com> * Manually apply code review changes * Apply suggestions from code review Co-authored-by: Marco Pracucci <marco@pracucci.com> Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com> * Update based on changes from code review * Apply suggestions from code review Co-authored-by: Marco Pracucci <marco@pracucci.com> Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com> Co-authored-by: Marco Pracucci <marco@pracucci.com>
What this PR does
Following what we've done in #2066, this PR adds common store gateway consistency check errors to errors catalogue.
Which issue(s) this PR fixes or relates to
N/A
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]