Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for OnDelete update strategy when checking for a successful deploy #912

Closed
1 task done
null-sleep opened this issue Dec 7, 2022 · 5 comments
Closed
1 task done
Assignees
Labels
🆕 feature Makes something new possible

Comments

@null-sleep
Copy link

null-sleep commented Dec 7, 2022

Feature request

  • If the maintainers agree with the feature as described here, I intend to submit a Pull Request myself.1

Proposal:

For the OnDelete update strategy when checking for a successful deploy using deploy_succeeded?, krane does not check if the rollout is successful and instead prints a warning message and moves on. The message given is: # Gem cannot monitor update since it doesn't occur until delete.

  1. This presumes if the update strategy is OnDelete, there is nothing in place to roll the pods which is not always true. Ex. K8s operators used to rollout pods can use the OnDelete strategy to control the order of pod updates.
  2. deploy_succeeded? in krane simply checks if all pods are on the updated revision and ready, the same logical check works for OnDelete, and so updates can still be monitored. (It's still useful to print a WARN saying OnDelete requires pod deletes.)

I propose that krane should support checking for a deploy being successful when the update strategy in OnDelete and prints a warning message to inform the user that pods have to be deleted to be updated.

A project at Shopify would already benefit from this addition. This project is used to control how pods of a statefulsets are updated by updating all pods in a zone and uses the OnDelete strategy. By allowing to watch for successful deploys, this will allow krane and shipit/buildkite pipelines at Shopify to work seamlessly. Currently if we use OnDelete, shipit is unable to track the status of the deploy.

@null-sleep null-sleep added the 🆕 feature Makes something new possible label Dec 7, 2022
@null-sleep
Copy link
Author

null-sleep commented Dec 7, 2022

One note: There is a bug in K8s currently where for Statefulsets with an update strategy of OnDelete, the status.currentRevision will NOT be updated when all the pods have been updated to status.updateRevision. There is an active PR with a fix being worked on. Until then, the status can still be checked using readyReplicas, replicas and updatedReplicas (which does get updated). If updatedReplicas does not match replicas then pods have not been updated to the latest revision and readyReplicas can be used to check for their actual health.

observed_generation == current_generation &&
        desired_replicas == status_data['readyReplicas'].to_i == status_data['updatedReplicas'].to_i

Once the PR with the fix is merged, the same existing logic used for RollingUpdate can be used for OnDelete

As indicated with the checkbox, I am more than happy to work on this

@null-sleep
Copy link
Author

We seem to have agreement on adding this behind a flag. The flag used when discussing was --watch-statefulsets but I think --watch-ondelete might be more accurate.

@KnVerey
Copy link
Contributor

KnVerey commented Dec 7, 2022

We seem to have agreement on adding this behind a flag. The flag used when discussing was --watch-statefulsets but I think --watch-ondelete might be more accurate.

I suggest extending the krane.shopify.io/required-rollout annotation to StatefulSet for this purpose, to allow the behaviour to be controlled on a per-resource basis using a style Krane users are already accustomed to. I suggest the only accepted values be full and none. The effective default value should be "full" for RollingUpdate SS. For OnDelete SS, I think it depends on whether or not the Krane team is willing to cut a major version for this, since defaulting to expecting progress where we used to expect nothing could be a breaking change. Arguably, defaulting to watching as much of the rollout as possible is the better default for a rollout monitoring tool, and having the default value not change based on part of the SS is easier to understand.

@ayana-s
Copy link
Contributor

ayana-s commented Jun 6, 2023

The current possible annotation values are as follows:

full: The deployment is successful when all pods in the new replicaSet are ready.
none: The deployment is successful as soon as the new replicaSet is created for the deployment.

We don't believe none is compatible with StatefulSets: eg. currently, a Deployment with the none annotation is marked as successful once the latest replicaSet has been created, but StatefulSets don't use replicaSets create/delete pods (unlike Deployments).

Any recommendations for a new definition to use for none? As a starting point, thoughts on:

@ayana-s
Copy link
Contributor

ayana-s commented Aug 30, 2023

In this PR, I've updated krane to support checking for successful StatefulSet deploys when updateStrategy: OnDelete. I've extended the krane.shopify.io/required-rollout annotation to be compatible with StatefulSet resources.

Now, for all StatefulSets, krane checks the following to confirm a successful deploy:

observed_generation == current_generation
status_data['updateRevision'] == pod_data["metadata"]["labels"]["controller-revision-hash"]

For updateStrategy: OnDelete && required-rollout: full, krane also checks:

replicas == readyReplicas == updatedReplicas

For updateStrategy: OnDelete with no specified required-rollout value, no additional checks are conducted. The code runs as it did previously.

For updateStrategy: RollingUpdate, the additional checks remain the same as before:

replicas == readyReplicas == currentReplicas && currentRevision == updateRevision

@ayana-s ayana-s closed this as completed Aug 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🆕 feature Makes something new possible
Projects
None yet
Development

No branches or pull requests

3 participants