
rke v1.2.4-rc9 canal cni does not work with k8s v1.18, install-cni.sh missing #2405

Closed
pasikarkkainen opened this issue Jan 6, 2021 · 3 comments

Comments


pasikarkkainen commented Jan 6, 2021

RKE version:
rke v1.2.4-rc9

Docker version: (docker version, docker info preferred)
native docker-1.13.1-162.git64e9980.el7.centos

Operating system and kernel: (cat /etc/os-release, uname -r preferred)
CentOS 7.9, Linux kernel 3.10.0-1160.11.1.el7.x86_64
SELinux enabled.

Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO)
VMware VMs

cluster.yml file:
default cluster.yml generated by rke v1.2.4-rc9, using canal as the cni networking plugin, and kubernetes v1.18.14-rancher1

Steps to Reproduce:

  • use "rke config" to generate rke cluster.yml with kubernetes v1.18.14-rancher1 and canal cni as the networking plugin.
  • use "rke up" to provision the kubernetes cluster.
  • rke finishes successfully, but the canal CNI has actually failed and is not working.

Results:

  • canal cni errors on nodes:
[root@node01 ~]# docker ps -a | grep -i cni
ce02cb5fa3b9        9165569ec236                                                                                                         "/install-cni.sh"        3 minutes ago       Created                                         k8s_install-cni_canal-s58s2_kube-system_feddd4ca-d1db-4d0b-b512-9cb63cd12b46_6

[root@node01 ~]# docker logs ce02cb5fa3b9
container_linux.go:235: starting container process caused "exec: \"/install-cni.sh\": stat /install-cni.sh: no such file or directory"

[root@node01 ~]# docker ps -a | grep -i network
73f530929a2c        c12fea8dd81b                                                                                                         "kubectl apply -f ..."   11 minutes ago      Exited (0) 11 minutes ago                       k8s_rke-network-plugin-pod_rke-network-plugin-deploy-job-rzcvz_kube-system_da4a05d2-7fb2-4794-941b-58a0bb8cf976_0
b530657ffa9d        rancher/pause:3.2                                                                                                    "/pause"                 11 minutes ago      Exited (0) 11 minutes ago                       k8s_POD_rke-network-plugin-deploy-job-rzcvz_kube-system_da4a05d2-7fb2-4794-941b-58a0bb8cf976_0

[root@node01 ~]# docker logs 73f530929a2c
serviceaccount/canal created
clusterrole.rbac.authorization.k8s.io/calico created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/canal-flannel created
clusterrolebinding.rbac.authorization.k8s.io/canal-calico created
configmap/canal-config created
daemonset.apps/canal created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created

[root@node01 ~]# docker logs b530657ffa9d
Shutting down, got signal: Terminated

[root@node01 ~]# docker logs kubelet
...
E0106 21:48:36.556228    4611 remote_runtime.go:222] StartContainer "6f6663505cc53122a32140d0c628d53a90b1550f1628c6469cd09882da698766" from runtime service failed: rpc error: code = Unknown desc = failed to start container "6f6663505cc53122a32140d0c628d53a90b1550f1628c6469cd09882da698766": Error response from daemon: oci runtime error: container_linux.go:235: starting container process caused "exec: \"/install-cni.sh\": stat /install-cni.sh: no such file or directory"
E0106 21:48:36.556309    4611 kuberuntime_manager.go:818] init container start failed: RunContainerError: failed to start container "6f6663505cc53122a32140d0c628d53a90b1550f1628c6469cd09882da698766": Error response from daemon: oci runtime error: container_linux.go:235: starting container process caused "exec: \"/install-cni.sh\": stat /install-cni.sh: no such file or directory"
E0106 21:48:36.556362    4611 pod_workers.go:191] Error syncing pod feddd4ca-d1db-4d0b-b512-9cb63cd12b46 ("canal-s58s2_kube-system(feddd4ca-d1db-4d0b-b512-9cb63cd12b46)"), skipping: failed to "StartContainer" for "install-cni" with RunContainerError: "failed to start container \"6f6663505cc53122a32140d0c628d53a90b1550f1628c6469cd09882da698766\": Error response from daemon: oci runtime error: container_linux.go:235: starting container process caused \"exec: \\\"/install-cni.sh\\\": stat /install-cni.sh: no such file or directory\""
E0106 21:48:36.656489    4611 kubelet.go:2190] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
E0106 21:48:37.205819    4611 pod_workers.go:191] Error syncing pod 6a6ccba6-df25-442e-8cd0-ef997d1eb2ca ("cattle-cluster-agent-8b675bbdb-qk6wj_cattle-system(6a6ccba6-df25-442e-8cd0-ef997d1eb2ca)"), skipping: network is not ready: runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
E0106 21:48:39.205713    4611 pod_workers.go:191] Error syncing pod 6a6ccba6-df25-442e-8cd0-ef997d1eb2ca ("cattle-cluster-agent-8b675bbdb-qk6wj_cattle-system(6a6ccba6-df25-442e-8cd0-ef997d1eb2ca)"), skipping: network is not ready: runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
W0106 21:48:39.606279    4611 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
E0106 21:48:41.206091    4611 pod_workers.go:191] Error syncing pod 6a6ccba6-df25-442e-8cd0-ef997d1eb2ca ("cattle-cluster-agent-8b675bbdb-qk6wj_cattle-system(6a6ccba6-df25-442e-8cd0-ef997d1eb2ca)"), skipping: network is not ready: runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
E0106 21:48:41.657895    4611 kubelet.go:2190] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
E0106 21:48:43.209285    4611 pod_workers.go:191] Error syncing pod 6a6ccba6-df25-442e-8cd0-ef997d1eb2ca ("cattle-cluster-agent-8b675bbdb-qk6wj_cattle-system(6a6ccba6-df25-442e-8cd0-ef997d1eb2ca)"), skipping: network is not ready: runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
W0106 21:48:44.606660    4611 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
E0106 21:48:45.205740    4611 pod_workers.go:191] Error syncing pod 6a6ccba6-df25-442e-8cd0-ef997d1eb2ca ("cattle-cluster-agent-8b675bbdb-qk6wj_cattle-system(6a6ccba6-df25-442e-8cd0-ef997d1eb2ca)"), skipping: network is not ready: runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
  • here are the cluster.yml system_images cni versions that do NOT work with k8s v1.18.x on CentOS 7.9 nodes with native docker:
  calico_node: rancher/calico-node:v3.16.5
  calico_cni: rancher/calico-cni:v3.16.5
  calico_controllers: rancher/calico-kube-controllers:v3.16.5
  calico_ctl: rancher/calico-ctl:v3.16.5
  calico_flexvol: rancher/calico-pod2daemon-flexvol:v3.16.5
  canal_node: rancher/calico-node:v3.16.5
  canal_cni: rancher/calico-cni:v3.16.5
  canal_controllers: rancher/calico-kube-controllers:v3.16.5
  canal_flannel: rancher/coreos-flannel:v0.13.0-rancher1
  canal_flexvol: rancher/calico-pod2daemon-flexvol:v3.16.5
  • here are the older cluster.yml system_images cni versions that do work OK with k8s v1.18.x on CentOS 7.9 nodes with native docker:
  calico_node: rancher/calico-node:v3.13.4
  calico_cni: rancher/calico-cni:v3.13.4
  calico_controllers: rancher/calico-kube-controllers:v3.13.4
  calico_ctl: rancher/calico-ctl:v3.13.4
  calico_flexvol: rancher/calico-pod2daemon-flexvol:v3.13.4
  canal_node: rancher/calico-node:v3.13.4
  canal_cni: rancher/calico-cni:v3.13.4
  canal_flannel: rancher/coreos-flannel:v0.12.0
  canal_flexvol: rancher/calico-pod2daemon-flexvol:v3.13.4

Any idea why the "install-cni.sh" script is missing with k8s v1.18.x? Should the CNI use a different script name with v1.18.x?

rke v1.2.4-rc9 with the default cluster.yml using k8s v1.19.6-rancher1 seems to work OK, and the canal CNI works OK.

@pasikarkkainen pasikarkkainen changed the title rke v1.2.4-rc9 canal cni does not work with CentOS 7.9 nodes, install-cni.sh missing rke v1.2.4-rc9 canal cni does not work with k8s v1.18, install-cni.sh missing Jan 7, 2021

pasikarkkainen commented Jan 7, 2021

I think I figured out what's going on.

When one runs "rke config" and the config generation asks which Kubernetes image to use, rke does not actually use the answer: even when I enter "rancher/hyperkube:v1.18.14-rancher1" there, it ends up generating config/system_images for the default k8s version, which is v1.19.6-rancher1 in rke v1.2.4-rc9.

If I instead check the system_images with "rke config --version v1.18.14-rancher1 -s", I do get the correct system_images versions listed for that specific k8s version, and that lists the calico/canal v3.13.4 images for k8s v1.18.x, which do work OK.

So I think the only issue here is that "rke config", even though it asks which Kubernetes image to use, does not actually use the image version given by the user, and the system_images versions end up being generated for the default k8s version anyway.
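The mismatch described above can be confirmed by inspecting the tags that "rke config" actually wrote into cluster.yml and comparing them against the per-version defaults printed by "rke config --version v1.18.14-rancher1 -s". A minimal sketch (the cluster.yml here is a simulated fragment using the tags from this report, not a real generated file):

```shell
# Simulated fragment of the cluster.yml generated for an intended
# k8s v1.18.14-rancher1 cluster (tags taken from the report above).
cat > cluster.yml <<'EOF'
system_images:
  canal_node: rancher/calico-node:v3.16.5
  canal_cni: rancher/calico-cni:v3.16.5
  canal_flannel: rancher/coreos-flannel:v0.13.0-rancher1
EOF

# The v3.16.5 tags are the v1.19.6-rancher1 defaults; per the report,
# k8s v1.18.x expects v3.13.4, so this grep exposes the mismatch.
grep -E 'canal_(node|cni):' cluster.yml
```

On a real cluster the expected tags come from "rke config --version v1.18.14-rancher1 -s", which (per the report) lists the calico/canal v3.13.4 images for k8s v1.18.x.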


superseb commented Jan 7, 2021

Yes, this will be covered in #2149, where the new defaults that were introduced for configuration will also be used in rke config. The preferred way is to use kubernetes_version in cluster.yml. This is described at https://rancher.com/docs/rke/latest/en/upgrades/
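The recommended approach from this comment can be sketched as a cluster.yml fragment; the version string below is the one from this issue, and rke then resolves the matching system_images tags itself:

```yaml
# cluster.yml: pin the Kubernetes version instead of hand-editing
# system_images; rke resolves the matching image tags for this version.
kubernetes_version: v1.18.14-rancher1
```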


stale bot commented Mar 8, 2021

This issue/PR has been automatically marked as stale because it has not had activity (commit/comment/label) for 60 days. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the status/stale label Mar 8, 2021
@stale stale bot closed this as completed Mar 22, 2021