
Document rollout support for K Worker Nodes #3401

Closed
Arvinderpal opened this issue Jul 27, 2020 · 13 comments · Fixed by #4285
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor.

@Arvinderpal
Contributor

Arvinderpal commented Jul 27, 2020

User Story

As a user/operator I would like to roll out all K worker nodes to new hardware for various reasons.
As a developer/user/operator I would like to have symmetry between control-plane and worker node roll outs for added simplicity.

Detailed Description

KCP.Spec.UpgradeAfter allows machines to be rolled out after a specific date and time even if nothing in the Spec has changed. This approach has some benefits - it can be used to move control-plane nodes to new hardware, perform cert rotation, allow changes in infra machine templates to be reflected in control plane nodes (w/o creating a brand new template), ...

[edited]
CAPI should support a similar approach for worker nodes.
CAPI already supports immediate rollout of worker Nodes -- see comment below. We should improve the docs to reflect this.

/kind feature

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Jul 27, 2020
@Arvinderpal
Contributor Author

One approach would be to introduce the upgradeAfter field in MachineDeployment and use logic similar to KCP's to decide when a rollout should be triggered.

@detiber
Member

detiber commented Jul 27, 2020

I don't necessarily see an issue with introducing an upgradeAfter field, however I'm wondering if it would make sense to have more generic support for something similar to kubectl rollout for MachineDeployments and possibly also KubeadmControlPlane?

@vincepri
Member

+1 to adding similar functionality to MachineDeployment as well. I'd wait for v0.4.0 and rename these fields to RolloutAfter. @detiber's suggestion to support something similar to kubectl rollout sounds interesting as well.

@Arvinderpal
Contributor Author

A rollout command for clusterctl would be great. Perhaps we can also add support for the various sub-commands like undo/status/history as with kubectl rollout.

We can wait for 0.4.0

@vincepri
Member

/milestone v0.4.0

@k8s-ci-robot k8s-ci-robot added this to the v0.4.0 milestone Jul 29, 2020
@Arvinderpal Arvinderpal mentioned this issue Aug 3, 2020
9 tasks
@vincepri vincepri changed the title Support upgradeAfter for K Worker Nodes Support rolloutAfter for K Worker Nodes Sep 28, 2020
@Arvinderpal
Contributor Author

After looking into this a bit further, MachineDeployment rollout is already supported. In fact, it follows the same conventions as a regular Deployment rollout: a Deployment's rollout is triggered if and only if the Deployment's Pod template (that is, .spec.template) is changed, so a restartedAt annotation is added to the Deployment's Pod template to trigger an immediate restart.

Taking the same approach, when an (arbitrary) annotation is added to the MD.spec.template, a rollout is triggered:

kubectl patch machinedeployment test-md-0 --type merge -p '{"spec":{"template":{"metadata":{"annotations":{"cluster.x-k8s.io/restartedAt": "2020-09-23T09:47:07-07:00"}}}}}'

Changes to the MD.spec.template result in a new MachineSet. We can see the revision number annotation in the MD updated to match the latest MS. We can also see our restartedAt annotation in the new MS.spec.template.
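
For repeated use, the one-off patch above could be wrapped in a small script that fills in the current time. This is only a sketch: MD_NAME is a hypothetical environment variable naming the MachineDeployment, and the annotation key is simply the one used in the command above (any template annotation change should have the same effect). When MD_NAME is unset or kubectl is not installed, the script just prints the patch it would send.

```shell
#!/bin/sh
# Build a JSON merge patch that sets a restartedAt annotation on MD.spec.template.
# $1: RFC 3339 timestamp to store as the annotation value.
build_restart_patch() {
  printf '{"spec":{"template":{"metadata":{"annotations":{"cluster.x-k8s.io/restartedAt":"%s"}}}}}' "$1"
}

# Current UTC time in RFC 3339 format.
now="$(date -u +%Y-%m-%dT%H:%M:%SZ)"

# MD_NAME is a hypothetical variable (not from the issue) naming the MachineDeployment.
# The patch is only sent when MD_NAME is set and kubectl is available.
if [ -n "${MD_NAME:-}" ] && command -v kubectl >/dev/null 2>&1; then
  kubectl patch machinedeployment "$MD_NAME" --type merge -p "$(build_restart_patch "$now")"
else
  # Dry run: print the patch instead of applying it.
  build_restart_patch "$now"
  echo
fi
```

Usage would look like `MD_NAME=test-md-0 sh rollout-md.sh`, where rollout-md.sh is whatever name the script is saved under.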

@vincepri Should we document the above approach? The rollout process for KCP is a bit different, but if that's not documented, we should add that as well.

@fabriziopandini
Member

Am I wrong, or does the approach described above trigger an immediate upgrade, while this issue is seeking a way to trigger a deferred upgrade?

@Arvinderpal
Contributor Author

Yes, it triggers an immediate rollout, which IMO is fine. This approach closely follows the general Deployment/ReplicaSet model.

KCP handles things a bit differently: it has a specific field for "forcing" a rollout to occur immediately or at some time in the future.
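
For comparison, a sketch of setting that KCP field from the command line, assuming it is the spec.upgradeAfter field described at the top of this issue (the name may change in later API versions). KCP_NAME and ROLLOUT_AT are hypothetical environment variables: the first names the KubeadmControlPlane object, the second is an RFC 3339 time that defaults to now. When KCP_NAME is unset or kubectl is not installed, the script only prints the patch.

```shell
#!/bin/sh
# Build a JSON merge patch that sets spec.upgradeAfter on a KubeadmControlPlane.
# $1: RFC 3339 timestamp after which the control plane machines should roll out.
build_kcp_rollout_patch() {
  printf '{"spec":{"upgradeAfter":"%s"}}' "$1"
}

# ROLLOUT_AT is a hypothetical variable; it defaults to the current UTC time
# (immediate rollout). Set it to a future time to defer the rollout.
rollout_at="${ROLLOUT_AT:-$(date -u +%Y-%m-%dT%H:%M:%SZ)}"

# KCP_NAME is a hypothetical variable naming the KubeadmControlPlane object.
if [ -n "${KCP_NAME:-}" ] && command -v kubectl >/dev/null 2>&1; then
  kubectl patch kubeadmcontrolplane "$KCP_NAME" --type merge -p "$(build_kcp_rollout_patch "$rollout_at")"
else
  # Dry run: print the patch instead of applying it.
  build_kcp_rollout_patch "$rollout_at"
  echo
fi
```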

@fabriziopandini
Member

Thanks, if this is the case we already have this doc https://cluster-api.sigs.k8s.io/tasks/change-machine-template.html#changing-infrastructure-machine-templates; I'm +1 to improve it if necessary

@Arvinderpal
Contributor Author

Yes, IMO the doc should be improved. That specific page talks about changing machine templates. However, the use case here is to issue an immediate rollout, regardless of whether anything has changed in the MD.

@Arvinderpal Arvinderpal changed the title Support rolloutAfter for K Worker Nodes Document rollout support for K Worker Nodes Nov 8, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 6, 2021
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 8, 2021
mig4 added a commit to mig4/cluster-api that referenced this issue Mar 10, 2021
Add a section that documents `KubeadmControlPlane.Spec.UpgradeAfter` and
how to achieve a similar effect for machines managed by a
`MachineDeployment`.

Fixes kubernetes-sigs#3401
@mig4
Contributor

mig4 commented Mar 11, 2021

/lifecycle active

@k8s-ci-robot k8s-ci-robot added lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Mar 11, 2021

7 participants