Update rollout-operator controller to use zpdb before delete #324
Merged
charleskorn
reviewed
Oct 22, 2025
Author
@charleskorn - can you have another look at this one please. Note that this change means the eviction controller always runs, and note the jsonnet updates to support this.
charleskorn
reviewed
Oct 23, 2025
Co-authored-by: Charles Korn <charleskorn@users.noreply.github.com>
Author
@charleskorn Thanks for the feedback - I've adjusted accordingly.
charleskorn
reviewed
Oct 24, 2025
charleskorn
reviewed
Oct 24, 2025
Contributor
charleskorn
left a comment
LGTM modulo suggestion below
charleskorn
approved these changes
Oct 27, 2025
This PR addresses a race condition that can occur when a voluntary pod eviction is received while rolling pod updates are running.
During the main `reconcileStatefulSetsGroup()` reconcile loop, a voluntary eviction can occur without the loop detecting it. The loop can then delete a pod (for updating) in a way that breaches the ZPDB.
This is more likely to occur when running the zone-aware PodDisruptionBudget in partition awareness mode. With the traditional max unavailable = 1 PDB, a voluntary eviction is denied if there is any disruption in any zone, so while the rolling update controller is performing updates the likelihood of a voluntary eviction being allowed is small.
However, in partition awareness mode, a voluntary eviction is allowed if both pods in the partition are ready. For example, a reconcile over the `ingester-zone-a` pods is in progress, with each pod being deleted in turn, and a voluntary eviction comes in for `ingester-zone-b-50`. This eviction is allowed if the reconcile loop has not yet reached `ingester-zone-a-50`. The update loop is not aware of the eviction and issues a delete on `ingester-zone-a-50`.
The fix is to call the zpdb eviction controller prior to each delete and confirm that the delete will not breach the zpdb.
Note - this will result in additional statefulset and pod checks before each rolling pod update (delete).
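
The check-before-delete idea described above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the `Pod` type, `checkDelete`, `rollOut`, and the `listPods`/`deletePod` callbacks are all hypothetical names, and the budget rule (at most `maxUnavailable` unavailable pods per partition) is an assumed stand-in for the real zpdb evaluation.

```go
package main

import "fmt"

// Pod is a hypothetical, simplified stand-in for a Kubernetes pod.
type Pod struct {
	Name      string
	Partition int  // e.g. 50 for ingester-zone-a-50 and ingester-zone-b-50
	Ready     bool // false while the pod is evicted or restarting
}

// checkDelete returns an error if deleting the named pod would leave more
// than maxUnavailable pods unavailable in that pod's partition.
// Assumption: this mirrors the partition-aware budget only in spirit.
func checkDelete(pods []Pod, name string, maxUnavailable int) error {
	var candidate *Pod
	for i := range pods {
		if pods[i].Name == name {
			candidate = &pods[i]
			break
		}
	}
	if candidate == nil {
		return fmt.Errorf("pod %s not found", name)
	}

	// Count pods in the same partition that are already unavailable.
	unavailable := 0
	for _, p := range pods {
		if p.Name != name && p.Partition == candidate.Partition && !p.Ready {
			unavailable++
		}
	}

	// The delete itself makes one more pod unavailable.
	if unavailable+1 > maxUnavailable {
		return fmt.Errorf("deleting %s would breach the budget for partition %d",
			name, candidate.Partition)
	}
	return nil
}

// rollOut walks the pods that need updating and re-reads live pod state
// immediately before every delete, so an eviction that landed since the
// previous iteration is taken into account.
func rollOut(toUpdate []string, listPods func() []Pod, deletePod func(string),
	maxUnavailable int) (deleted, skipped []string) {
	for _, name := range toUpdate {
		if err := checkDelete(listPods(), name, maxUnavailable); err != nil {
			// A real controller would presumably retry on a later reconcile.
			skipped = append(skipped, name)
			continue
		}
		deletePod(name)
		deleted = append(deleted, name)
	}
	return deleted, skipped
}
```

In this sketch, if `ingester-zone-b-50` is evicted after the loop has deleted `ingester-zone-a-49`, the fresh `listPods()` call made before the next delete sees the eviction and `ingester-zone-a-50` is skipped rather than deleted. The cost is the one noted in the description: an extra state read before every delete.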