Add mixin alert for ring member mismatch #2404
Thanks! I left a few comments, then LGTM!
Please add a CHANGELOG entry under `### Mixin`. You can consider this PR as `[ENHANCEMENT]`.
Force-pushed from 4ef661f to dad6955.
@pracucci bump for when you have time please :-)
Sorry for the late reply. I backtested this alert again and I would keep it only for the ingester, which is realistically the only critical component (where things would really go nuts if there's this mismatch).
Unblocking with minor feedback
How to **investigate**:

- Check the [hash ring web page]({{< relref "../reference-http-api/index.md#ingesters-ring-status" >}}) for the component for which the alert has fired, and look for unexpected instances in the list.
Suggested change, replacing:
- Check the [hash ring web page]({{< relref "../reference-http-api/index.md#ingesters-ring-status" >}}) for the component for which the alert has fired, and look for unexpected instances in the list.
with:
- For the component for which the alert has fired, and to look for unexpected instances in the list, see [Ingesters ring status]({{< relref "../reference-http-api/index.md#ingesters-ring-status" >}}).
I'm not sure this suggestion makes sense. This is an action to take once you know which component is mismatched.
The CHANGELOG has just been cut to prepare for the next Mimir release. Please rebase
Force-pushed from 66e5347 to 824e850.
This alert fires when the expected number of ring members does not match the running number for each component (one of `compactor`, `distributor`, `ingester`, `ruler`, or `store-gateway`). Fixes: grafana#1960
Because we aren't building dashboards with `$namespace`/`$cluster` etc. in the alerts, we can't use `dashboard-utils.jobMatcher` and thus need to optionally match on a possible namespace prefix (`.*/`).
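The optional namespace prefix can be illustrated with a small Python sketch. The helper names `job_regex` and `job_matches` and the sample job labels are hypothetical, and the matcher actually used in the mixin may be written differently. Prometheus anchors label regexes at both ends, which `re.fullmatch` emulates here.

```python
import re


def job_regex(component: str) -> str:
    """Build a regex for a job label with an optional namespace prefix.

    Hypothetical helper for illustration only; it assumes the component
    name contains no regex metacharacters (true for names like "ingester").
    """
    return rf"(.*/)?{component}"


def job_matches(component: str, job: str) -> bool:
    # Prometheus label regexes are fully anchored, so use fullmatch.
    return re.fullmatch(job_regex(component), job) is not None


if __name__ == "__main__":
    for job in ("ingester", "mimir/ingester", "mimir/store-gateway"):
        print(job, job_matches("ingester", job))
```

With this pattern both a bare job label and a namespaced one match the same component, while a label for a different component does not.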
Force-pushed from 824e850 to b855870.
We generally prefer not to keep commented code, but this PR has been open for too long so let's start using it ;)
What this PR does
This alert fires when the expected number of ring members does not match the running number for each component (one of `compactor`, `distributor`, `ingester`, `ruler`, or `store-gateway`).
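As a rough illustration of the condition described here, the following Python sketch compares two per-component counts and reports the components where they differ. The function name, argument names, and sample numbers are assumptions made for this example; the real alert is a query in the mixin and is not reproduced here.

```python
from typing import Dict, Tuple


def mismatched_components(
    ring_members: Dict[str, int],
    running_instances: Dict[str, int],
) -> Dict[str, Tuple[int, int]]:
    """Return components whose ring member count differs from the running count.

    Hypothetical helper: maps each mismatched component to a
    (ring_members, running_instances) pair. A component missing from
    running_instances is treated as having zero running instances.
    """
    mismatches = {}
    for component, members in ring_members.items():
        running = running_instances.get(component, 0)
        if members != running:
            mismatches[component] = (members, running)
    return mismatches


if __name__ == "__main__":
    # Made-up numbers: one ingester is in the ring but not running.
    ring = {"ingester": 3, "compactor": 1}
    running = {"ingester": 2, "compactor": 1}
    print(mismatched_components(ring, running))
```

Restricting the input to a single component, as suggested in the review for the ingester, only requires passing dictionaries that contain that one key.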
Which issue(s) this PR fixes or relates to
Fixes #1960
Checklist
- `CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`