Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set containerd as default container runtime for k8s>=1.22 #4222

Merged
merged 7 commits into from
Jun 24, 2021

Conversation

voelzmo
Copy link
Member

@voelzmo voelzmo commented Jun 16, 2021

How to categorize this PR?

/kind api-change

What this PR does / why we need it:
An empty cri.name field for a worker pool means all nodes are configured with docker as the container runtime. When users upgrade their shoots to k8s>=1.22 (or create new ones) the default is changed to containerd. This also means that starting with k8s 1.22 the shoot object will always contain an explicit configuration of the container runtime.

Which issue(s) this PR fixes:
related #4110

Release note:

Shoots created with or updated to Kubernetes version >= 1.22 will get `containerd` as default container runtime. If you upgrade an existing shoot which doesn't specify a `cri.name` property in its worker pools, this will trigger a graceful node rollout and the container runtime is switched from `docker` to `containerd`.

/cc @BeckerMax

An empty `cri.name` field for a worker pool means all nodes are configured with `docker` as the container runtime. When users upgrade their shoots to k8s>=1.22 (or create new ones) the default is changed to `containerd`. This also means that starting with k8s 1.22 the shoot object will always contain an explicit configuration of the container runtime.
@voelzmo voelzmo requested a review from a team as a code owner June 16, 2021 14:59
@gardener-robot gardener-robot added kind/api-change API change with impact on API users needs/review size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jun 16, 2021
Copy link
Member

@vpnachev vpnachev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one topic that I think we should discuss.
In the Cloud Profile, each machine version can specify the supporter CRs and CRIs.
Defaulting to containerd for OS name/version that does not support containerd should result in a shoot validation failure later, hence I think the static validation is fine, but wanted to mention this as I might be wrong.

pkg/apis/core/v1beta1/defaults.go Outdated Show resolved Hide resolved
@voelzmo
Copy link
Member Author

voelzmo commented Jun 16, 2021

In the Cloud Profile, each machine version can specify the supporter CRs and CRIs.
Defaulting to containerd for OS name/version that does not support containerd should result in a shoot validation failure later, hence I think the static validation is fine, but wanted to mention this as I might be wrong.

You're absolutely correct, this is what happens when you (or the defaulting mechanism implemented here) is setting worker.cri.name to containerd, but your cloudprofile doesn't list containerd as supported in the MachineImageVersion you're using:

shoots.core.gardener.cloud "shoot-gcp-contd" is forbidden: [spec.provider.workers[0].cri.name: Unsupported value: "containerd"]

It doesn't quite hint at how to resolve this (use a MachineImageVersion which supports the selected container runtime or switch to one of the supported runtimes of your MachineImageVersion). Maybe we could add a hint to the CloudProfile's MachineImageVersion as the source of "supported runtimes" in the error message, wdyt?

Also do this for v1alpha1, not only v1beta1.
@vpnachev
Copy link
Member

Maybe we could add a hint to the CloudProfile's MachineImageVersion as the source of "supported runtimes" in the error message, wdyt?

Yes, this makes sense. The error message can return the list of the supported runtimes for the selected OS/version.

@voelzmo
Copy link
Member Author

voelzmo commented Jun 17, 2021

The error message can return the list of the supported runtimes for the selected OS/version.

I think this is what the error message already does, for example

shoots.core.gardener.cloud "shoot-gcp-contd" is forbidden: [spec.provider.workers[0].cri.name: Unsupported value: "docker": supported values: "containerd"]

When I first encountered this issue, it was hard for me to understand that this is due to the MachineImageVersion I'm using. Therefore, I was thinking to add something like "Selected machine image gardenlinux version 0.184 supports these values: ["containerd"]", but given that we're constructing the error message with a generic field.NotSupported, I'm not sure if that's possible or even desired. WDYT?

pkg/apis/core/v1alpha1/defaults_test.go Outdated Show resolved Hide resolved
pkg/apis/core/v1alpha1/defaults_test.go Outdated Show resolved Hide resolved
pkg/apis/core/v1alpha1/defaults.go Outdated Show resolved Hide resolved
rfranzke
rfranzke previously approved these changes Jun 21, 2021
@rfranzke rfranzke requested a review from vpnachev June 21, 2021 07:55
@rfranzke
Copy link
Member

/assign @vpnachev

vpnachev
vpnachev previously approved these changes Jun 23, 2021
Copy link
Member

@vpnachev vpnachev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Just a small suggestion for the for loop.

pkg/apis/core/v1alpha1/defaults.go Outdated Show resolved Hide resolved
pkg/apis/core/v1beta1/defaults.go Outdated Show resolved Hide resolved
Copy link
Member

@vpnachev vpnachev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@gardener-robot gardener-robot added reviewed/lgtm lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. and removed needs/second-opinion labels Jun 23, 2021
@BeckerMax BeckerMax merged commit 616b60a into gardener:master Jun 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/api-change API change with impact on API users lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants