-
Notifications
You must be signed in to change notification settings - Fork 461
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use BasicLifecycler for distributors and auto-forget #2154
Conversation
e461b3e
to
2a5c10c
Compare
100669f
to
a096530
Compare
c05e273
to
83c065d
Compare
LGTM, left some nitpicks, waiting to be un-drafted. |
Hi, let me know if there's anything useful here: #684 , I'd like to close that old PR of mine - I never had time to finish it, plus lifecycler had some feature that I didn't know how to emulate. |
Definitely some useful parts! I see some similarity with needing a number of healthy ring members:
I'm not sure what the "MinReadyDuration" feature would look like as a delegate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Nice work.
Use the BasicLifecycler in distributors for managing their lifecycle so that we can take advantage of the "auto-forget" delegates feature. This prevents the ring from filling up with "unhealthy" distributors that are never removed. This wasn't a bug but it was confusing for users and operators. Fixes #2138 Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
315bc07
to
ed3350f
Compare
Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
* main: (63 commits) Add new section on website for links to blog posts, podcasts and talks. (grafana#2216) Rename codified errors to errors catalog (grafana#2256) Helm: add a step to contributing doc (grafana#2257) Signal that 2.2 release is now in progress. (grafana#2254) Removed migration of alertmanager local state files from old hierarchy (Cortex 1.8 and earlier) (grafana#2253) operations/mimir: Change multi_zone_ingester_max_unavailable to 25 (grafana#2251) Helm: weekly release (grafana#2252) Jsonnet: Configure ingester max global metadata per user and per metric (grafana#2250) Helm: metamonitor naming (grafana#2236) Mimir documentation about out-of-order (grafana#2183) Vendor latest mimir-prometheus/main (grafana#2243) Set CODEOWNERS to primary technical writer (grafana#2242) Use BasicLifecycler for distributors and auto-forget (grafana#2154) Docs: Basic documentation for deploying the ruler using jsonnet. (grafana#2127) Fix post merge reviews on 2187 (grafana#2230) Add tests for user metadata in the ingester (grafana#2184) Change the error message template for per-tenant limits (grafana#2234) helm: meta-monitoring (grafana#2068) Article about migrating from Consul to memberlist. Added documentation for /memberlist endpoint. (grafana#2166) Update runbooks to mention possibility to investigate memberlist KV store in various alerts (grafana#2158) ...
Use the BasicLifecycler in distributors for managing their lifecycle so that we can take advantage of the "auto-forget" delegates feature. This prevents the ring from filling up with "unhealthy" distributors that are never removed. This wasn't a bug but it was confusing for users and operators. Fixes grafana#2138 Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
|
||
// ringAutoForgetUnhealthyPeriods is how many consecutive timeout periods an unhealthy instance | ||
// in the ring will be automatically removed after. | ||
ringAutoForgetUnhealthyPeriods = 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks very high. I would be more aggressive with distributors, like 2 should be enough.
What this PR does
Use the BasicLifecycler in distributors for managing their lifecycle so
that we can take advantage of the "auto-forget" delegates feature. This
prevents the ring from filling up with "unhealthy" distributors that are
never removed. This wasn't a bug but it was confusing for users and
operators.
Which issue(s) this PR fixes or relates to
Fixes #2138
Signed-off-by: Nick Pillitteri nick.pillitteri@grafana.com
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]