Guard against a concurrent downscaling and rollout #79
Comments
I'd like to propose an alternative that I think would be a little simpler, and not involve modifications to the statefulset annotations. Instead of adding a lock annotation, the downscale webhook could check whether any statefulset in the rollout group is currently mid-rollout, based on its status. In pseudo code:

```python
def is_being_rolled_out(statefulsets) -> bool:
    for sts in statefulsets:
        # A statefulset is mid-rollout unless all of its pods are updated and ready,
        # i.e. status.replicas == status.updatedReplicas == status.readyReplicas.
        if not (sts.state.replicas == sts.state.updated == sts.state.ready):
            return True
    return False
```

I also think this would be a better fit for the way the rollout operator works: there's not really any "state" to a rollout; the rollout operator reads the state of the world and adjusts statefulsets accordingly.
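For concreteness, here's a minimal sketch of how the downscale webhook might apply that check. The `rollout-group` label selector, the kubernetes Python client usage, and the function name are illustrative assumptions, not the operator's actual code:

```python
from kubernetes import client

def review_downscale(namespace: str, rollout_group: str, request_uid: str) -> dict:
    """Deny a downscale while any statefulset in the rollout group is mid-rollout."""
    apps = client.AppsV1Api()
    stss = apps.list_namespaced_stateful_set(
        namespace, label_selector=f"rollout-group={rollout_group}"
    ).items

    # Same check as the pseudo code above, against the real status fields.
    rolling = any(
        not (s.status.replicas == s.status.updated_replicas == s.status.ready_replicas)
        for s in stss
    )

    # Standard AdmissionReview response envelope.
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": request_uid,
            "allowed": not rolling,
            "status": {"message": "downscale denied: rollout in progress" if rolling else ""},
        },
    }
```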
I like that proposal. There is a race though, where if a rollout is in progress and gets back to a steady state where replicas == updated == ready for every statefulset, a downscale could slip in while the rollout is still underway. One way we could mitigate this is by making sure each rollout iteration doesn't delete all of the pods at once.
I'm not clear about how you think it would get back to steady state in the middle of a rollout. Once a statefulset is updated as part of a rollout, all the pods in that statefulset would be considered not "updated".
You're right. I was thinking that nodes were marked as updated in batches for some reason, but of course that's not the case.
At present the rollout operator prevents concurrent rollout operations, but doesn’t prevent concurrent downscale + rollout operations. This can happen, for example, if the controller is currently rolling zone B or C, and then something downscales zone A.
In order to guard against this, I propose using an annotation on the resource being scaled to represent a "lock" across the scaling group.
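For example, the lock could be a plain annotation on the statefulset that currently "owns" scaling for the group. The key and value below are illustrative; this issue doesn't pin down the exact names:

```yaml
metadata:
  annotations:
    # Hypothetical key: present while this statefulset's pods are being rolled,
    # meaning no other statefulset in the rollout group may be downscaled.
    rollout-scaling-lock: "ingester-zone-b"
```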
Creating a scaling-lock
When the rollout operator reconciles, it will first choose a resource, if any, that already has a scaling lock, allowing reentrancy. If none exists, it will add a scaling-lock annotation to the sts when it begins rolling its pods. A scaling lock is only needed when rolling pods; it should not be needed if the controller is just adjusting replicas up/down, since last-downscaled already provides protection there.
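A rough sketch of that selection and locking step, using the hypothetical annotation key from above and the kubernetes Python client standing in for the operator's actual Go code:

```python
from kubernetes import client

LOCK_ANNOTATION = "rollout-scaling-lock"  # hypothetical key

def pick_statefulset_to_reconcile(statefulsets):
    """Prefer a statefulset that already holds the scaling lock (reentrancy),
    otherwise fall back to whatever ordering the operator normally uses."""
    for sts in statefulsets:
        if (sts.metadata.annotations or {}).get(LOCK_ANNOTATION):
            return sts
    return statefulsets[0] if statefulsets else None

def acquire_scaling_lock(apps: client.AppsV1Api, sts) -> None:
    """Annotate the statefulset before rolling its pods. Per the proposal, this
    is not needed when the controller is only adjusting replicas up/down."""
    patch = {"metadata": {"annotations": {LOCK_ANNOTATION: sts.metadata.name}}}
    apps.patch_namespaced_stateful_set(sts.metadata.name, sts.metadata.namespace, patch)
```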
Validating the scaling-lock
When the scaling-lock is present on any resource in the rollout-group, the downscale webhook will reject downscale attempts for resources that don't have the scaling lock. For example: if the scaling lock is present on the zone B statefulset, zone A's statefulset cannot be downscaled.
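In the webhook, the check might look something like this sketch (again using the hypothetical annotation key):

```python
LOCK_ANNOTATION = "rollout-scaling-lock"  # hypothetical key, as above

def downscale_allowed(target_sts_name: str, statefulsets) -> bool:
    """Reject a downscale when another statefulset in the rollout group holds
    the scaling lock, e.g. lock on zone B means zone A cannot be downscaled."""
    for sts in statefulsets:
        holder = (sts.metadata.annotations or {}).get(LOCK_ANNOTATION)
        if holder and sts.metadata.name != target_sts_name:
            return False
    return True
```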
Removing the scaling-lock
When the rollout-operator reconciles a resource, if it sees no non-ready pods and has nothing left to do for that resource, it will clear the scaling-lock on it, if one is present.
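And the cleanup side, as a sketch under the same assumptions:

```python
from kubernetes import client

LOCK_ANNOTATION = "rollout-scaling-lock"  # hypothetical key, as above

def maybe_release_scaling_lock(apps: client.AppsV1Api, sts, has_work: bool) -> None:
    """Clear the scaling lock once the statefulset has no non-ready pods and the
    operator has nothing left to do for it."""
    s = sts.status
    all_ready = s.replicas == s.ready_replicas
    holds_lock = LOCK_ANNOTATION in (sts.metadata.annotations or {})
    if all_ready and not has_work and holds_lock:
        # A null value removes the annotation when applied as a merge patch.
        patch = {"metadata": {"annotations": {LOCK_ANNOTATION: None}}}
        apps.patch_namespaced_stateful_set(sts.metadata.name, sts.metadata.namespace, patch)
```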
Other alternatives
While this could also be accomplished by extending the last-updated annotation to be updated by the rollout-operator’s controller, that would block downscales from taking place for some time after rollouts are complete, which is not necessary. We just want to prevent concurrent scaling from occurring.