WINC-505: Windows containerd runtime enablement
Enhancement proposal for making containerd as default runtime in windows node. Signed-off-by: selansen <esiva@redhat.com>
selansen committed Nov 24, 2021
1 parent 05f2817, commit a1d9d1d
Showing 1 changed file with 175 additions and 0 deletions.
enhancements/windows-containers/container-runtime-containerd.md
@@ -0,0 +1,175 @@
---
title: container-runtime-containerd
authors:
  - "@selansen"
reviewers:
  - "@aravindhp"
  - "@openshift/openshift-team-windows-containers"
approvers:
  - "@aravindhp"
  - "@mrunalp"
creation-date: 2021-11-19
last-updated: 2021-11-20
status: implementable
---

# containerd - new container runtime

## Release Signoff Checklist

- [x] Enhancement is `implementable`
- [x] Design details are appropriately documented from clear requirements
- [x] Test plan is defined
- [x] Graduation criteria for dev preview, tech preview, GA
- [ ] User-facing documentation is created in [OpenShift-docs](https://github.com/OpenShift/OpenShift-docs/)

## Summary

The intent of this enhancement is to allow customers to bring up Windows nodes with containerd
as the default runtime, starting with the Kubernetes 1.24 based OpenShift 4.11. When customers upgrade
an existing cluster to the Kubernetes 1.24 based OpenShift 4.11, the runtime will be migrated from Docker to containerd.

## Motivation

In Kubernetes, the Container Runtime Interface (CRI) is used to talk to a container runtime. CRI is designed so that
a CRI implementation can run as a separate binary. However, the CRI implementation for Docker (a.k.a. dockershim)
is currently part of the kubelet code, runs as part of the kubelet, and is tightly coupled with the kubelet's lifecycle.
From Kubernetes 1.24 onwards, dockershim will be removed from the kubelet code. The Windows Machine Config Operator (WMCO)
currently uses Docker as the default runtime. The aim is to make containerd the default runtime and move away from
Docker before dockershim is removed from the kubelet.

### Goals

As part of this enhancement we plan to do the following:
* Make containerd the default runtime.
* When a cluster is upgraded from an older release to a newer one, containerd becomes the default runtime.

### Non-Goals

* De-configuring the Docker runtime is not part of this enhancement.
* Refactoring or redesigning WMCO due to the dockershim deprecation.
* Switching between the Docker and containerd runtimes is not supported.

## Proposal

To make containerd the default runtime, containerd must be installed before the kubelet, and the
kubelet parameters need to be updated so that the kubelet uses containerd. In the
upgrade case, each Windows VM configured by a previous version of WMCO has its Machine
object deleted, resulting in the drain and deletion of the Windows node, and the termination
and subsequent recreation of the VM. The upgraded WMCO instance will then configure
the newly created VMs with containerd as the default runtime. All minimum requirements stated
for a WMCO upgrade apply here as well.

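As a rough illustration of the kubelet parameter change described above, the following Go sketch builds the runtime-related kubelet flags. It assumes the kubelet flags `--container-runtime` and `--container-runtime-endpoint` as they exist in kubelet releases around Kubernetes 1.23/1.24, and it assumes containerd's default Windows named pipe; the helper name `kubeletRuntimeArgs` is hypothetical and not taken from WMCO.

```go
package main

import "fmt"

// containerdEndpoint is the named pipe containerd listens on by default on
// Windows. Treat it as an assumption and confirm it against your install.
const containerdEndpoint = "npipe://./pipe/containerd-containerd"

// kubeletRuntimeArgs (hypothetical helper) returns the kubelet flags that
// select a remote CRI runtime instead of the built-in dockershim.
func kubeletRuntimeArgs(endpoint string) []string {
	return []string{
		"--container-runtime=remote",
		"--container-runtime-endpoint=" + endpoint,
	}
}

func main() {
	for _, arg := range kubeletRuntimeArgs(containerdEndpoint) {
		fmt.Println(arg)
	}
}
```

These flags would be appended to the existing kubelet service arguments; the rest of the kubelet configuration is left untouched in this sketch.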
Containerd migration plan:
* Containerd will become the default runtime as part of OpenShift 4.11.
* For now, we plan to introduce a feature flag and release it only in the community operator.
* The same applies to a downgrade: the Windows VM will be de-configured and re-configured with
  the runtime supported by that WMCO version. (Note: downgrades are not supported by OLM.)

### User Stories

Stories can be found in [Windows Containers: containerd](https://issues.redhat.com/browse/WINC-505).

### Justification

If we don't make containerd the default runtime, we will be left with the Mirantis-supported dockershim
and will have to redesign WMCO to incorporate it. If we wanted to use CRI-O on Windows,
the time and engineering effort needed to make CRI-O work on Windows would be huge, and
the high cost makes it hard to justify from a business perspective. Containerd is widely adopted and supported
by the open source community. Microsoft is also a major contributor to Windows containerd development and is making
containerd the default runtime for its Kubernetes offerings. Most Kubernetes orchestrators that support Windows
have already moved to containerd.

### Design Details

We plan to target containerd 1.6.0 for integration into WMCO. Containerd will be installed
as the first service, before the kubelet, because the kubelet requires containerd to be
running. The containerd package is already bundled with WMCO. We may include crictl
in the package for debugging purposes. Once containerd is installed, the rest of the service
installation steps remain the same. The kubelet's dependency on the Docker runtime will be removed and
replaced by containerd. A separate folder will be created under C:\k\ to store the containerd config
and related files.

Steps to install containerd as a service:
* scp the containerd and related executables onto the Windows VM
* copy the files into the appropriate folder
* create the containerd config file
* run an extra command for performance: `Add-MpPreference -ExclusionProcess "$Env:containerd.exe"`
* register containerd as a service
* start the containerd service

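The install steps above could be sketched in Go as an ordered list of PowerShell commands to run on the Windows VM after the binaries have been copied over. This is only a sketch under stated assumptions: the directory and config path are hypothetical, the helper `installCommands` is not WMCO code, and the containerd options used (`config default`, `--register-service`, `--config`) should be checked against the containerd version actually shipped.

```go
package main

import "fmt"

// installCommands (hypothetical helper) returns, in order, the PowerShell
// commands to run once the containerd binaries are on the Windows VM.
// installDir and configPath are illustrative, not WMCO's actual layout.
func installCommands(installDir, configPath string) []string {
	exe := installDir + `\containerd.exe`
	return []string{
		// Write a default config file as a starting point.
		fmt.Sprintf(`& "%s" config default | Out-File -Encoding ascii "%s"`, exe, configPath),
		// Exclude containerd from Windows Defender scanning for performance.
		fmt.Sprintf(`Add-MpPreference -ExclusionProcess "%s"`, exe),
		// Register containerd as a Windows service.
		fmt.Sprintf(`& "%s" --register-service --config "%s"`, exe, configPath),
		// Start the service; the kubelet is installed only after this step.
		"Start-Service containerd",
	}
}

func main() {
	cmds := installCommands(`C:\k\containerd`, `C:\k\containerd\config.toml`)
	for i, c := range cmds {
		fmt.Printf("%d. %s\n", i+1, c)
	}
}
```

Keeping the commands in one ordered slice makes the "containerd before kubelet" ordering explicit and easy to assert in a test.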
## Network changes
The current CNI/IPAM plugins will be used with containerd, and the HNS-Network and HNS-Endpoint creation
steps do not change. The config file used by containerd will point to the same CNI/IPAM executables.

## Feature flag
A containerd feature flag will be used to enable this feature. It will be enabled by default in the
community operator so that customers and developers can try it out. Containerd will become the default in OpenShift
4.11. At any given point in time only one runtime is supported, and switching between runtimes is not allowed.

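One way to picture the feature flag is a single boolean setting that resolves to exactly one runtime. The Go sketch below is an assumption about how such a flag could be parsed; the type `Runtime`, the function `runtimeFromFlag`, and the choice of an unset value meaning Docker are all hypothetical and not part of WMCO.

```go
package main

import (
	"fmt"
	"strconv"
)

// Runtime names the single container runtime configured on a node.
type Runtime string

const (
	Docker     Runtime = "docker"
	Containerd Runtime = "containerd"
)

// runtimeFromFlag (hypothetical helper) maps a flag value to one runtime.
// An empty value keeps the pre-4.11 default (Docker) in this sketch.
func runtimeFromFlag(value string) (Runtime, error) {
	if value == "" {
		return Docker, nil
	}
	enabled, err := strconv.ParseBool(value)
	if err != nil {
		return "", fmt.Errorf("invalid containerd flag %q: %w", value, err)
	}
	if enabled {
		return Containerd, nil
	}
	return Docker, nil
}

func main() {
	for _, v := range []string{"", "true", "false"} {
		rt, err := runtimeFromFlag(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("flag=%q runtime=%s\n", v, rt)
	}
}
```

Returning a single `Runtime` value, rather than a set, mirrors the rule that only one runtime is supported at a time.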
## Logging
Containerd can be started with parameters that enable logging and specify the file path to which
warnings and errors are logged. Log files will be stored under c:\var\log\containerd.

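As a sketch of the logging parameters, the Go snippet below assembles containerd arguments for a log file and log level. It assumes containerd's `--log-file` option (available in Windows builds) and the global `--log-level` option; the file name `containerd.log` and the helper `containerdLogArgs` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// containerdLogArgs (hypothetical helper) returns the arguments that send
// containerd's log output to a file under logDir at the given level.
func containerdLogArgs(logDir, level string) []string {
	return []string{
		"--log-file", logDir + `\containerd.log`, // file name is an assumption
		"--log-level", level, // e.g. "warn" to record warnings and errors
	}
}

func main() {
	args := containerdLogArgs(`c:\var\log\containerd`, "warn")
	fmt.Println(strings.Join(args, " "))
}
```

These arguments would be passed when registering the service, so that the service keeps the same logging setup across restarts.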
## Upgrade
When a cluster is upgraded, OLM will switch to using a new Red Hat operators
index. Because WMCO has the same name in both indexes, OLM will upgrade WMCO
from the previous version to the latest version available in the new
cluster.

The procedure for an upgrade is as follows:
1) As part of the upgrade, basic validation is performed and de-configuration takes place.
2) During de-configuration, the older versions of kubelet, kube-proxy and CNI are uninstalled.
3) The newer version of WMCO installs all the required components along with containerd.
4) The kubelet starts using containerd for pulling and managing container images.

### Risks and Mitigations

* If the cluster is upgraded and the new version introduces an issue caused by containerd,
  the issue must be fixed, because Docker runtime support will no longer be available in the kubelet.
  Going back to an older WMCO might also run into issues due to a version mismatch
  between the API server and the kubelet. As we plan to bring in this feature before 4.11,
  it can be tested well, and we should make sure we address all the issues by working
  with the containerd open source community.
* containerd doesn't support image-pull-progress-deadline as of now. There is a work-in-progress PR,
  https://github.com/containerd/containerd/pull/6150. Until it
  is merged, if a Windows image pull takes longer than the default value, we might
  run into an image pull timeout error. The proposed workaround is to pull the image first
  with the ctr or crictl command-line tool and then create the pods.
* Currently windows_exporter is used to collect metrics from Windows nodes. Containerd
  support is available in https://github.com/prometheus-community/windows_exporter/releases/tag/v0.16.0.
  All the functionality supported with the Docker runtime needs to be checked with containerd.

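The pre-pull workaround for the image pull timeout could look roughly like the Go sketch below, which builds the command line for either tool. It assumes the `crictl pull` and `ctr images pull` subcommands and the `k8s.io` namespace used by containerd's CRI plugin; the helper `prePullCommand` and the example image are illustrative only.

```go
package main

import "fmt"

// prePullCommand (hypothetical helper) returns the command line that pulls
// an image ahead of pod creation with the chosen tool.
func prePullCommand(tool, image string) (string, error) {
	switch tool {
	case "crictl":
		return "crictl pull " + image, nil
	case "ctr":
		// Images used by the kubelet live in the "k8s.io" namespace.
		return "ctr -n k8s.io images pull " + image, nil
	default:
		return "", fmt.Errorf("unsupported tool %q", tool)
	}
}

func main() {
	// Example image only; substitute the images your workloads need.
	image := "mcr.microsoft.com/windows/servercore:ltsc2019"
	for _, tool := range []string{"crictl", "ctr"} {
		cmd, err := prePullCommand(tool, image)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(cmd)
	}
}
```

Depending on the install, crictl may additionally need to be told which runtime endpoint to use; that is left out of this sketch.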
### Test Plan

* New e2e test cases will be added for containerd, replicating the existing
  test cases that cover WMCO functionality.
* Containerd is platform agnostic, so testing on any platform should be sufficient.
* Update the WMCO community image in the release repo so that the CI workflow uses the containerd
  based WMCO community operator.

### Graduation Criteria

This enhancement will start with the WMCO community operator. It will become the default from the
OpenShift 4.11 release onwards.

### Upgrade / Downgrade Strategy

The upgrade is discussed in the Design Details section. Downgrades are [not supported](https://github.com/operator-framework/operator-lifecycle-manager/issues/1177)
by OLM.

### Version Skew Strategy
We plan to maintain parity with upstream [containerd](https://github.com/containerd/containerd/releases) releases.

## Implementation History

v1: Initial Proposal

## Alternatives

There are a few alternatives, but they are either not cost-effective or depend on competitors' less modular
components.
* Implementing a CRI-O runtime for Windows involves a huge engineering effort and has less community
  support (most community contributors have already moved to containerd).
* There is an ongoing effort to continue using dockershim and the Docker runtime. As the kubelet is going to
  remove the dockershim-specific code, we would still have to come up with a design change to make it work
  from Kubernetes 1.24 onwards.