cluster: fix a case where version updates weren't generated #8902
Calculating deltas on health reports was not robust: after a leadership change, the controller can come up on a node that already has a health report showing a newer version, i.e. there is no delta with respect to the next report.

Instead, generate an update to _node_versions whenever we see a version that is newer than the one in the map, and clear _node_versions on leadership/term changes.

This guarantees that within each controller term, we pay attention to node versions from all nodes until we have accumulated a version from each node, and thereafter only submit updates if the version in a health report is newer than the one in _node_versions.

Fixes: redpanda-data#8758
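For readers who want the rule in concrete form, here is a minimal sketch of the behaviour described above. It is not the actual Redpanda C++ code: the class name `NodeVersionTracker`, the methods `on_health_report` and `on_term_change`, and the `updates` list are hypothetical names used only for illustration; only `_node_versions` comes from the description.

```python
class NodeVersionTracker:
    """Illustrative model of the per-term version map (hypothetical class)."""

    def __init__(self):
        # node id -> newest version seen during the current controller term
        self._node_versions = {}
        # updates that would be submitted (stand-in for the real update path)
        self.updates = []

    def on_term_change(self):
        # Forget everything on a leadership/term change, so the new term
        # re-learns a version from every node.
        self._node_versions.clear()

    def on_health_report(self, node_id, version):
        # Generate an update if the node is unknown in this term, or if the
        # reported version is newer than the one in the map.
        known = self._node_versions.get(node_id)
        if known is None or version > known:
            self._node_versions[node_id] = version
            self.updates.append((node_id, version))
            return True
        return False
```

Under these assumptions, a report that repeats a known version produces no update within a term, but the same report does produce one after `on_term_change()`, which is the case the delta-based approach missed.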
I did a first cut of this that explicitly tracked the controller term of each node's most recent update (jcsp@21fab3b), but that felt a bit over-complex. The approach in this PR should be easier to reason about.
Test failures:
lgtm
thanks!
/backport v22.3.x
/backport v22.2.x
This could result in clusters failing to activate features after an upgrade, if the controller leadership changed at just the wrong moment.
Fixes: #8758
Backports Required
UX Changes
None
Release Notes
Bug Fixes/