[Instruction] Use cri-dockerd for docker runtime since KubeEdge v1.14 #4843

Open
Shelley-BaoYue opened this issue Jul 11, 2023 · 13 comments
Labels
kind/documentation Categorizes issue or PR as related to documentation. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@Shelley-BaoYue
Collaborator

Shelley-BaoYue commented Jul 11, 2023

Since KubeEdge v1.14, the Kubernetes dependency has been upgraded to v1.24, and the container runtime at the edge only supports the remote runtime type. Because Kubernetes has removed dockershim, you need to install cri-dockerd first if you still want to use Docker Engine.

What is cri-dockerd

In Kubernetes 1.23 and earlier, you could use Docker Engine with Kubernetes by relying on a built-in Kubernetes component named dockershim. The dockershim component was removed in the Kubernetes 1.24 release, and KubeEdge v1.14 upgraded its Kubernetes dependency to v1.24. However, a third-party replacement, cri-dockerd, is available. The cri-dockerd adapter lets you use Docker Engine through the Container Runtime Interface (CRI).

How to use cri-dockerd

1. Install cri-dockerd

If you have Go v1.18 or later installed, you can follow the cri-dockerd installation guide to build, install, and start cri-dockerd.

If your Go version is lower than 1.18, building cri-dockerd fails with the error cannot find package "." in: cri-dockerd/vendor/net/netip. In that case, you can install cri-dockerd from the release package with the following steps.

# Download and unpack the release archive, then install the binary
wget https://github.com/Mirantis/cri-dockerd/releases/download/{VERSION}/cri-dockerd-{VERSION}.{ARCH}.tgz
tar zxf cri-dockerd-{VERSION}.{ARCH}.tgz
cp cri-dockerd/cri-dockerd /usr/local/bin/cri-dockerd

# Fetch the systemd unit files and point the service at /usr/local/bin
wget https://raw.githubusercontent.com/Mirantis/cri-dockerd/{VERSION}/packaging/systemd/cri-docker.service
wget https://raw.githubusercontent.com/Mirantis/cri-dockerd/{VERSION}/packaging/systemd/cri-docker.socket
cp cri-docker.service cri-docker.socket /etc/systemd/system/
sed -i -e 's,/usr/bin/cri-dockerd,/usr/local/bin/cri-dockerd,' /etc/systemd/system/cri-docker.service

# Reload systemd, then enable the service and start the socket
systemctl daemon-reload
systemctl enable cri-docker.service
systemctl enable --now cri-docker.socket
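
To fill in the {VERSION} and {ARCH} placeholders in the steps above, a small helper along these lines might be useful. This is only a sketch: the function names and the CRI_DOCKERD_VERSION variable are made up for illustration, v0.3.4 is just an example tag, and the naming convention (the release tag keeps its leading "v" while the archive name drops it) is an assumption based on recent cri-dockerd releases, so please check the release page before relying on it.

```shell
# Hypothetical variable; v0.3.4 is only an example release tag.
CRI_DOCKERD_VERSION="${CRI_DOCKERD_VERSION:-v0.3.4}"

# Map `uname -m` output to the architecture suffix used in release file names.
map_arch() {
  case "$1" in
    x86_64)  echo "amd64" ;;
    aarch64) echo "arm64" ;;
    *)       echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Print the tarball URL for a given tag and architecture.
# Assumption: the tag carries a leading "v", the archive name does not.
cri_dockerd_url() {
  version="$1"
  arch="$2"
  echo "https://github.com/Mirantis/cri-dockerd/releases/download/${version}/cri-dockerd-${version#v}.${arch}.tgz"
}

# Example usage (not executed here):
#   ARCH="$(map_arch "$(uname -m)")"
#   wget "$(cri_dockerd_url "$CRI_DOCKERD_VERSION" "$ARCH")"
```

The block only defines functions, so sourcing it has no side effects; the download itself stays a manual step.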

2. CNI Plugin

Install the CNI plugins on the edge node.
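
For this step, a rough sketch of fetching the reference CNI plugins from the containernetworking/plugins releases could look like the following. The variable names, the function names, and the default version v1.1.1 are assumptions for illustration only; /opt/cni/bin is the conventional plugin directory, but your setup may differ. This is not the project's official install script.

```shell
# Hypothetical defaults; adjust to the version and architecture you need.
CNI_VERSION="${CNI_VERSION:-v1.1.1}"
CNI_ARCH="${CNI_ARCH:-amd64}"
CNI_BIN_DIR="${CNI_BIN_DIR:-/opt/cni/bin}"

# Print the download URL of the plugin archive for the configured version/arch.
cni_plugins_url() {
  echo "https://github.com/containernetworking/plugins/releases/download/${CNI_VERSION}/cni-plugins-linux-${CNI_ARCH}-${CNI_VERSION}.tgz"
}

# Download the archive and unpack the plugin binaries into CNI_BIN_DIR.
# Needs curl, tar, and write access to CNI_BIN_DIR (usually root).
install_cni_plugins_sketch() {
  mkdir -p "$CNI_BIN_DIR" || return 1
  curl -fsSL "$(cni_plugins_url)" | tar -xz -C "$CNI_BIN_DIR"
}

# Example usage (not executed here):
#   sudo sh -c '. ./cni.sh && install_cni_plugins_sketch'
```

Installing the binaries is only half of the job: a network configuration under /etc/cni/net.d is still needed before containers can be created.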

3. Configure runtime endpoint

Set --runtimetype=remote and --remote-runtime-endpoint=unix:///var/run/cri-dockerd.sock when running keadm join.
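
Put together, the join step might look like the sketch below. The helper functions and the KEADM and CRI_ENDPOINT variables are hypothetical; --cloudcore-ipport and --token are the usual keadm join flags, and their values are placeholders you need to supply. The socket check is an extra precaution added here, not something keadm requires.

```shell
# Hypothetical overrides; KEADM can be pointed at another binary for a dry run.
KEADM="${KEADM:-keadm}"
CRI_ENDPOINT="${CRI_ENDPOINT:-unix:///var/run/cri-dockerd.sock}"

# Strip the unix:// scheme to get the socket path on disk.
endpoint_to_path() {
  echo "${1#unix://}"
}

# Succeed only if the endpoint points at an existing unix socket.
check_endpoint() {
  [ -S "$(endpoint_to_path "$1")" ]
}

# Join the edge node through cri-dockerd; refuses to run if the socket is missing.
join_edge() {
  cloudcore="$1"   # e.g. <cloud-ip>:10000
  token="$2"       # token obtained on the cloud side
  if ! check_endpoint "$CRI_ENDPOINT"; then
    echo "CRI socket not found: $(endpoint_to_path "$CRI_ENDPOINT")" >&2
    return 1
  fi
  $KEADM join \
    --cloudcore-ipport="$cloudcore" \
    --token="$token" \
    --runtimetype=remote \
    --remote-runtime-endpoint="$CRI_ENDPOINT"
}
```

Failing early on a missing socket should make it easier to tell a cri-dockerd problem apart from a keadm problem.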

FAQ

If you have any questions, feel free to ask here.

@Shelley-BaoYue added the kind/documentation and sig/node labels on Jul 11, 2023
@Shelley-BaoYue changed the title from "Use cri-dockerd for docker runtime since KubeEdge v1.14" to "[Instruction] Use cri-dockerd for docker runtime since KubeEdge v1.14" on Jul 12, 2023
@fisherxu
Member

fisherxu commented Jul 13, 2023

Need to also add to docs :)

@whsasf

whsasf commented Sep 4, 2023

it's not easy to install CNI, does anyone have a doc for this ? thanks.

@Shelley-BaoYue
Collaborator Author

it's not easy to install CNI, does anyone have a doc for this ? thanks.

You can refer to func install_cni_plugins() in https://github.com/kubeedge/kubeedge/blob/master/hack/lib/install.sh, for reference only 😄

@Windrow14
Contributor

Set --runtimetype=remote and --remote-runtime-endpoint=unix:///var/run/cri-dockerd.sock when using keadm join.

Do I need to add anything on the cloud side when running keadm init?

windrow@k8s-node:~$ sudo keadm init --advertise-address="192.168.0.120" --profile version=v1.14.2 --kube-config=/home/windrow/.kube/config
Kubernetes version verification passed, KubeEdge installation will start...
Error: timed out waiting for the condition

@Shelley-BaoYue
Collaborator Author

kubectl get all -nkubeedge to see why cloudcore failed to start.

@Windrow14
Contributor

kubectl get all -nkubeedge to see why cloudcore failed to start.

Oh I see, it's the node-role.kubernetes.io/control-plane taint problem. It used to be node-role.kubernetes.io/master, didn't it? Never mind.
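
For anyone hitting the same timeout, a sketch of how the taint could be inspected and, on a test cluster, removed is shown below. The thread does not say how it was actually resolved, so removing the taint is only one possible option, and it assumes you are fine with scheduling workloads on the control-plane node. The function names and the KUBECTL variable are hypothetical.

```shell
# Hypothetical override; KUBECTL can be pointed at another binary for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# Show what is running in the kubeedge namespace and the taints on each node.
show_cloudcore_state() {
  $KUBECTL get all -n kubeedge
  $KUBECTL get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
}

# Remove the control-plane NoSchedule taint from one node.
# The trailing "-" tells kubectl to remove the taint instead of adding it.
untaint_control_plane() {
  node="$1"
  $KUBECTL taint nodes "$node" node-role.kubernetes.io/control-plane:NoSchedule-
}

# Example usage (not executed here):
#   show_cloudcore_state
#   untaint_control_plane k8s-node
```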

@Windrow14
Contributor

@Shelley-BaoYue I've encountered a problem when running keadm join.

As we all know, during the join process keadm pulls docker.io/kubeedge/installation-package and creates a container from it to get the edgecore binary. However, if cri-dockerd is installed and --runtimetype=remote --remote-runtime-endpoint=unix:///var/run/cri-dockerd.sock is added to the keadm join command, but CNI is not configured in /etc/cni/net.d, creating the container fails, and as a result joining the new edge node fails.
This behavior is different from earlier versions. The container used to install the binaries should be agnostic to CNI's state, as it used to be. Some users, including me, expect to use a CNI plugin DaemonSet to install CNI on all nodes, rather than doing it manually before creating the cluster.

I've found that the root cause of this problem is in cri-docker.service. Its ExecStart does not specify the --network-plugin flag, whose default value is cni, so cri-dockerd has to find a CNI configuration when creating a new container.
So we can add --network-plugin= to ExecStart in cri-docker.service to set it to empty, then reload the systemd daemon and restart cri-docker.service before keadm join.
But then we have to change it back after the CNI plugin has been installed and configured, and restart the containers on that node that need the CNI network.
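
A sketch of this workaround, using the same sed approach as the installation steps, might look like the following. The function names and the UNIT_FILE variable are hypothetical, the unit file path assumes the install location from the first comment, and whether an empty --network-plugin= really lets cri-dockerd create containers without CNI is taken from the description above rather than verified here.

```shell
# Hypothetical default; matches the install location used earlier in this issue.
UNIT_FILE="${UNIT_FILE:-/etc/systemd/system/cri-docker.service}"

# Append an empty --network-plugin= to ExecStart unless the flag is already set.
disable_cni() {
  grep -q -e '^ExecStart=.*--network-plugin' "$1" && return 0
  sed -i -e '/^ExecStart=/ s/$/ --network-plugin=/' "$1"
}

# Drop the flag again so cri-dockerd goes back to its default (cni).
restore_cni() {
  sed -i -e '/^ExecStart=/ s/ --network-plugin=[^ ]*//' "$1"
}

# After either change, reload and restart (not executed here):
#   disable_cni "$UNIT_FILE"
#   systemctl daemon-reload && systemctl restart cri-docker.service
```

Both helpers edit the file in place with GNU sed, so keeping a backup copy of the unit file before trying this is probably a good idea.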

I'm wondering whether it is possible to add a flag in ContainerConfig to make keadm use the host network when creating the container for installation-package. Could this be helpful?

cc: @fujitatomoya

@fujitatomoya
Contributor

The container for binaries installation should be agnostic to cni's state as it used to be. Some users, including me, expect to use a CNI plugin daemon set to install CNI on all nodes, not to do it manually before creating the cluster.

agree. i think that this is a huge burden from the user's perspective.

But then we have to change it back after the installation and configuration of the CNI plugin, and restart containers need cni network on that node.

same here, this doesn't sound really acceptable. but this constraint is NOT with KubeEdge, right? it is because cri-dockerd requires CNI by default? (In order for a cluster to become operational, Calico, Flannel, Weave, or another CNI should be used.)

cni by default, which makes cri-docker have to find a cni configuration when creating a new container.
So we can add --network-plugin= to ExecStart in cri-docker.service to set it to empty.

it sounds like this falls back to the default (CNI) behavior for cri-dockerd? i am not sure if this can make a difference though.

But then we have to change it back after the installation and configuration of the CNI plugin, and restart containers need cni network on that node.

even if the above is doable, that is not the solution, right? the problem is that we need to set up cri-dockerd statically configured with --network-plugin?

@Shelley-BaoYue
Collaborator Author

Hi @Windrow14 @fujitatomoya, sorry for the late reply, and thank you for your advice. I also think configuring the CNI plugin brings a lot of extra work to users. I will try the method provided by @Windrow14 and I'll also explore whether there's any other solution. If you have any new suggestions, you're welcome to discuss them here.

@Windrow14
Contributor

@fujitatomoya The point is, we don't really want to run any process inside docker.io/kubeedge/installation-package. It seems to be just for downloading binaries. So we can run the image in a container without any network capability, or just download the binaries by other methods when creation of the container fails.
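
To illustrate the idea with the plain Docker CLI (this is not how keadm does it, which goes through the CRI endpoint): a container can be created without any network and never started, and the binary copied out of it. The function name and the DOCKER variable are hypothetical, and the in-image path /usr/local/bin/edgecore is an assumption that should be checked against the actual image.

```shell
# Hypothetical override; DOCKER can be pointed at another binary for a dry run.
DOCKER="${DOCKER:-docker}"

# Copy the edgecore binary out of an image without ever starting a container.
extract_edgecore() {
  image="$1"   # e.g. kubeedge/installation-package:v1.14.2
  dest="$2"    # where to place the binary on the host
  # Create (but do not start) a container with networking disabled.
  cid="$($DOCKER create --network none "$image")" || return 1
  # Assumption: the binary lives at /usr/local/bin/edgecore inside the image.
  $DOCKER cp "${cid}:/usr/local/bin/edgecore" "$dest"
  status=$?
  # Clean up the temporary container whether or not the copy worked.
  $DOCKER rm "$cid" >/dev/null
  return "$status"
}

# Example usage (not executed here):
#   extract_edgecore kubeedge/installation-package:v1.14.2 /usr/local/bin/edgecore
```

If the image defines no default command, docker create may need a placeholder command appended; the container is never started, so the command itself does not matter.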

@xiaochaomi

E1123 20:44:32.233140 673700 cri_stats_provider.go:455] "Failed to get the info of the filesystem with mountpoint" err="cannot find filesystem info for device "/dev/mmcblk0p2"" mountpoint="/var/lib/docker"
E1123 20:44:42.235693 673700 cri_stats_provider.go:455] "Failed to get the info of the filesystem with mountpoint" err="cannot find filesystem info for device "/dev/mmcblk0p2"" mountpoint="/var/lib/docker"
E1123 20:44:52.240966 673700 cri_stats_provider.go:455] "Failed to get the info of the filesystem with mountpoint" err="cannot find filesystem info for device "/dev/mmcblk0p2"" mountpoint="/var/lib/docker"
This is the error I am getting.

@fujitatomoya
Contributor

I've found the root cause of this problem is in cri-docker.service. In ExecStart of it, --network-plugin flag is not specified whose value is cni by default, which makes cri-docker have to find a cni configuration when creating a new container.

i think this is NOT the cause; this default behavior is intended.

or just download the binaries by other methods when creation of the container fails

whether container creation fails or not, just downloading the binaries by other methods would be the appropriate way to go for me...

@Windrow14
Contributor

By using the /etc/cni/net.d/10-containerd-net.conflist provided in https://kubernetes.io/docs/tasks/administer-cluster/migrating-from-dockershim/troubleshooting-cni-plugin-related-errors/#an-example-containerd-configuration-file, it seems we don't need cri-dockerd at all, do we?
Though containerd still needs CNI installed first.
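
For reference, a trimmed-down, IPv4-only configuration loosely modeled on that example could be written like this. The function name is hypothetical, the subnet is illustrative, and the cniVersion must be one that the runtime's CNI library supports, so treat this as a sketch rather than a copy of the linked file. It also assumes the bridge, host-local, and portmap plugin binaries are already installed.

```shell
# Write a minimal bridge-based CNI config into the given directory.
# The target directory is a parameter so the sketch can be tried outside /etc.
write_bridge_conflist() {
  dir="$1"
  mkdir -p "$dir" || return 1
  cat > "${dir}/10-containerd-net.conflist" <<'EOF'
{
  "cniVersion": "1.0.0",
  "name": "containerd-net",
  "plugins": [
    {
      "type": "bridge",
      "bridge": "cni0",
      "isGateway": true,
      "ipMasq": true,
      "ipam": {
        "type": "host-local",
        "ranges": [[{ "subnet": "10.88.0.0/16" }]],
        "routes": [{ "dst": "0.0.0.0/0" }]
      }
    },
    {
      "type": "portmap",
      "capabilities": { "portMappings": true }
    }
  ]
}
EOF
}

# Example usage (not executed here):
#   sudo sh -c '. ./conflist.sh && write_bridge_conflist /etc/cni/net.d'
```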


6 participants