1.16 upgrade causes nodes to never enter ready state arm64 #82997

Closed
nbroeking opened this issue Sep 22, 2019 · 24 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@nbroeking

nbroeking commented Sep 22, 2019

What happened:
Upgrading to 1.16.0 causes cluster nodes to never become ready on arm64.

What you expected to happen:
Cluster should become ready

How to reproduce it (as minimally and precisely as possible):
1.) kubeadm init --pod-network-cidr=10.244.0.0/16
2.) Join from 2 other ARM boards using the command output by kubeadm init (general shape sketched below)
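For reference, the join command printed by kubeadm init generally has this shape (the address below is this cluster's control-plane IP; the token and hash are placeholders, not real values):

$ kubeadm join 192.168.0.10:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>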

Anything else we need to know?:
Docker version:

$ docker version
Client:
 Version:           18.09.1
 API version:       1.39
 Go version:        go1.11.6
 Git commit:        4c52b90
 Built:             Fri, 13 Sep 2019 10:45:43 +0100
 OS/Arch:           linux/arm
 Experimental:      false

Server:
 Engine:
  Version:          18.09.1
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.11.6
  Git commit:       4c52b90
  Built:            Fri Sep 13 09:45:43 2019
  OS/Arch:          linux/arm
  Experimental:     false
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.0", GitCommit:"2bd9643cee5b3b3a5ecbd3af49d09018f0773c77", GitTreeState:"clean", BuildDate:"2019-09-19T13:57:45Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.0", GitCommit:"2bd9643cee5b3b3a5ecbd3af49d09018f0773c77", GitTreeState:"clean", BuildDate:"2019-09-18T14:27:17Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/arm"}
$ kubectl describe node pi0
Name:               pi0
Roles:              master
Labels:             beta.kubernetes.io/arch=arm
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=arm
                    kubernetes.io/hostname=pi0
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/master=
Annotations:        flannel.alpha.coreos.com/backend-data: {"VtepMAC":"5e:4b:79:7e:8f:18"}
                    flannel.alpha.coreos.com/backend-type: vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager: true
                    flannel.alpha.coreos.com/public-ip: 192.168.0.10
                    kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Sun, 22 Sep 2019 17:08:17 -0600
Taints:             node-role.kubernetes.io/master:NoSchedule
                    node.kubernetes.io/not-ready:NoSchedule
Unschedulable:      false
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Sun, 22 Sep 2019 17:33:20 -0600   Sun, 22 Sep 2019 17:08:17 -0600   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Sun, 22 Sep 2019 17:33:20 -0600   Sun, 22 Sep 2019 17:08:17 -0600   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Sun, 22 Sep 2019 17:33:20 -0600   Sun, 22 Sep 2019 17:08:17 -0600   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            False   Sun, 22 Sep 2019 17:33:20 -0600   Sun, 22 Sep 2019 17:08:17 -0600   KubeletNotReady              runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized. WARNING: CPU hardcapping unsupported
Addresses:
  InternalIP:  192.168.0.10
  Hostname:    pi0
Capacity:
 cpu:                4
 ephemeral-storage:  30400236Ki
 memory:             3999784Ki
 pods:               110
Allocatable:
 cpu:                4
 ephemeral-storage:  28016857452
 memory:             3897384Ki
 pods:               110
System Info:
 Machine ID:                 76ffec47b3ac4c9e83794145c04ce02f
 System UUID:                76ffec47b3ac4c9e83794145c04ce02f
 Boot ID:                    20eed1f7-6a75-46af-8cff-c54e42b0896f
 Kernel Version:             4.19.66-v7l+
 OS Image:                   Raspbian GNU/Linux 10 (buster)
 Operating System:           linux
 Architecture:               arm
 Container Runtime Version:  docker://18.9.1
 Kubelet Version:            v1.16.0
 Kube-Proxy Version:         v1.16.0
PodCIDR:                     10.244.0.0/24
PodCIDRs:                    10.244.0.0/24
Non-terminated Pods:         (6 in total)
  Namespace                  Name                           CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                  ----                           ------------  ----------  ---------------  -------------  ---
  kube-system                etcd-pi0                       0 (0%)        0 (0%)      0 (0%)           0 (0%)         24m
  kube-system                kube-apiserver-pi0             250m (6%)     0 (0%)      0 (0%)           0 (0%)         24m
  kube-system                kube-controller-manager-pi0    200m (5%)     0 (0%)      0 (0%)           0 (0%)         24m
  kube-system                kube-flannel-ds-arm-d78ct      100m (2%)     100m (2%)   50Mi (1%)        50Mi (1%)      18m
  kube-system                kube-proxy-tmtm8               0 (0%)        0 (0%)      0 (0%)           0 (0%)         25m
  kube-system                kube-scheduler-pi0             100m (2%)     0 (0%)      0 (0%)           0 (0%)         24m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                650m (16%)  100m (2%)
  memory             50Mi (1%)   50Mi (1%)
  ephemeral-storage  0 (0%)      0 (0%)
Events:
  Type    Reason                   Age                From             Message
  ----    ------                   ----               ----             -------
  Normal  NodeAllocatableEnforced  26m                kubelet, pi0     Updated Node Allocatable limit across pods
  Normal  NodeHasSufficientMemory  26m (x8 over 26m)  kubelet, pi0     Node pi0 status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    26m (x8 over 26m)  kubelet, pi0     Node pi0 status is now: NodeHasNoDiskPressure
  Normal  NodeHasSufficientPID     26m (x7 over 26m)  kubelet, pi0     Node pi0 status is now: NodeHasSufficientPID
  Normal  Starting                 25m                kube-proxy, pi0  Starting kube-proxy.

Environment:
Raspberry Pi 4 (4 cores, 4 GB of RAM)

  • Kubernetes version (use kubectl version):
    Included above

  • Cloud provider or hardware configuration:
    Hardware

  • OS (e.g: cat /etc/os-release):
    Raspbian

  • Kernel (e.g. uname -a):
Linux pi0 4.19.66-v7l+ #1253 SMP Thu Aug 15 12:02:08 BST 2019 armv7l GNU/Linux

  • Install tools:
    kubeadm

  • Network plugin and version (if this is a network-related bug):
    I tried adding flannel after the fact but it does not change the state

  • Others:
    Sig - @kubernetes/network

@nbroeking nbroeking added the kind/bug Categorizes issue or PR as related to a bug. label Sep 22, 2019
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Sep 22, 2019
@charleslong

charleslong commented Sep 23, 2019

Can confirm on both CentOS 7 x86_64 and Raspbian armv7l. After upgrading to v1.16.0, previously functional nodes all report:

KubeletNotReady runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

A downgrade to v1.15 and a reset restored functionality.

@nbroeking
Author

@kubernetes/Network I think this is the right sig ^

@tedyu
Contributor

tedyu commented Sep 23, 2019

W.r.t. 'network plugin is not ready', can you show more from the log?

	network, err := getDefaultCNINetwork(plugin.confDir, plugin.binDirs)
	if err != nil {
		klog.Warningf("Unable to update cni config: %s", err)

Looks like getDefaultCNINetwork should have given some detail in the log.
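For example, the CNI-related kubelet messages can usually be pulled out of the journal with something like:

$ journalctl -u kubelet --no-pager | grep -iE 'cni|network plugin'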

@charleslong

charleslong commented Sep 23, 2019

I re-broke my Raspbian node and collected this from the logs:

> Sep 22 22:29:16 kubelet[5427]: W0922 22:29:16.598269    5427 cni.go:202] Error validating CNI config &{cbr0  false [0x8d20e40 0x8d20e80] [123 10 32 32 34 110 97 109 101 34 58 32 34 99 98 114 48 34 44 10 32 32 34 112 108 117 103 105 110 115 34 58 32 91 10 32 32 32 32 123 10 32 32 32 32 32 32 34 116 121 112 101 34 58 32 34 102 108 97 110 110 101 108 34 44 10 32 32 32 32 32 32 34 100 101 108 101 103 97 116 101 34 58 32 123 10 32 32 32 32 32 32 32 32 34 104 97 105 114 112 105 110 77 111 100 101 34 58 32 116 114 117 101 44 10 32 32 32 32 32 32 32 32 34 105 115 68 101 102 97 117 108 116 71 97 116 101 119 97 121 34 58 32 116 114 117 101 10 32 32 32 32 32 32 125 10 32 32 32 32 125 44 10 32 32 32 32 123 10 32 32 32 32 32 32 34 116 121 112 101 34 58 32 34 112 111 114 116 109 97 112 34 44 10 32 32 32 32 32 32 34 99 97 112 97 98 105 108 105 116 105 101 115 34 58 32 123 10 32 32 32 32 32 32 32 32 34 112 111 114 116 77 97 112 112 105 110 103 115 34 58 32 116 114 117 101 10 32 32 32 32 32 32 125 10 32 32 32 32 125 10 32 32 93 10 125 10]}: [plugin flannel does not support config version ""]
> Sep 22 22:29:16 kubelet[5427]: W0922 22:29:16.601212    5427 cni.go:237] Unable to update cni config: no valid networks found in /etc/cni/net.d
> Sep 22 22:29:18 kubelet[5427]: E0922 22:29:18.191633    5427 kubelet.go:2187] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

Specifically this part in the CNI configuration validation seems pertinent:

flannel does not support config version ""

Looking at "/etc/cni/net.d/10-flannel.conflist" it is, in fact, missing "cniVersion" (i.e. "cniVersion": "<VERSION>,"):

{
  "name": "cbr0",
  "plugins": [
    {
      "type": "flannel",
      "delegate": {
        "hairpinMode": true,
        "isDefaultGateway": true
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    }
  ]
}

I understand that Kubernetes 1.6 required CNI plugin version 0.5.1 or later. (Edit: cniVersion is the version of the CNI interface, not the plugin.)

@snowball77

I can confirm the issue on Ubuntu 18.04 with all patches and kubernetes-cni 0.7.5. Is there a way to manually add a CNI version line to the config? What should it look like?

@caoruidong

Adding "cniVersion": "0.2.0" solved the issue.

See https://stackoverflow.com/questions/58037620/how-to-fix-flannel-cni-plugin-error-plugin-flannel-does-not-support-config-ve

@snowball77

After finding a variant of the config file online, I used:

{
  "name": "cbr0",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "flannel",
      "delegate": {
        "hairpinMode": true,
        "isDefaultGateway": true
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    }
  ]
}

Then it seemed to work; at least the node became ready again.

@snowball77

It would be good for kubeadm to check for this in the config file beforehand and give a warning or error, or update the file accordingly.
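A rough sketch of such a check, done by hand today rather than by kubeadm (the path matches the one used elsewhere in this thread):

$ for f in /etc/cni/net.d/*.conflist; do grep -q '"cniVersion"' "$f" || echo "warning: $f has no cniVersion field"; done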

@charleslong

To correct myself above: cniVersion is the version of the CNI interface that talks to flannel, not the version of the plugin. The supported versions are listed here; the recommended version is '0.3.1', which snowball77 posted above with the same syntax.

The conflist is created by applying kube-flannel.yml, pulled from https://github.com/coreos/flannel. Even the most recent kube-flannel.yml does not contain a cniVersion. Per a discussion in #flannel-users, this just requires an update to kube-flannel.yml, and for Kubernetes to update its documentation as to which kube-flannel.yml should be applied.
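To see the exact CNI config the DaemonSet writes out on a running cluster, the ConfigMap can be dumped; the names below assume the stock kube-flannel.yml:

$ kubectl -n kube-system get configmap kube-flannel-cfg -o jsonpath='{.data.cni-conf\.json}'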

@snowball77

snowball77 commented Sep 23, 2019

Thank you for the clarification. In the meantime, I figured out that the CNI file gets overwritten every time kubelet starts, so I would welcome a permanent fix. Is there a workaround so I can update the flannel.yml file myself to fix it? Just take the linked file, fix the config, and then apply the new version?
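For reference, a minimal sketch of that workaround, assuming the raw manifest URL below (use whichever copy of kube-flannel.yml matches your setup) and GNU sed:

$ curl -fsSL -o kube-flannel.yml https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
$ sed -i 's/"name": "cbr0",/"name": "cbr0",\n      "cniVersion": "0.3.1",/' kube-flannel.yml
$ kubectl apply -f kube-flannel.yml

The flannel pods then have to be restarted so the file under /etc/cni/net.d is rewritten, as described below.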

@nbroeking
Author

@tedyu Here are some logs

$ kubectl logs kube-flannel-ds-arm-vpgml -n kube-system
I0923 14:49:33.673970       1 main.go:514] Determining IP address of default interface
I0923 14:49:33.674695       1 main.go:527] Using interface with name eth0 and address 192.168.0.11
I0923 14:49:33.674743       1 main.go:544] Defaulting external address to interface address (192.168.0.11)
I0923 14:49:33.767988       1 kube.go:126] Waiting 10m0s for node controller to sync
I0923 14:49:33.768174       1 kube.go:309] Starting kube subnet manager
I0923 14:49:34.768470       1 kube.go:133] Node controller sync successful
I0923 14:49:34.768606       1 main.go:244] Created subnet manager: Kubernetes Subnet Manager - pi1
I0923 14:49:34.768665       1 main.go:247] Installing signal handlers
I0923 14:49:34.769150       1 main.go:386] Found network config - Backend type: vxlan
I0923 14:49:34.769348       1 vxlan.go:120] VXLAN config: VNI=1 Port=0 GBP=false DirectRouting=false
E0923 14:49:34.770960       1 main.go:289] Error registering network: failed to configure interface flannel.1: failed to ensure address of interface flannel.1: link has incompatible addresses. Remove additional addresses and try again. &netlink.Vxlan{LinkAttrs:netlink.LinkAttrs{Index:5, MTU:1450, TxQLen:0, Name:"flannel.1", HardwareAddr:net.HardwareAddr{0xda, 0xdf, 0x42, 0xd3, 0x5b, 0x94}, Flags:0x13, RawFlags:0x11043, ParentIndex:0, MasterIndex:0, Namespace:interface {}(nil), Alias:"", Statistics:(*netlink.LinkStatistics)(0x12f56104), Promisc:0, Xdp:(*netlink.LinkXdp)(0x12fe6970), EncapType:"ether", Protinfo:(*netlink.Protinfo)(nil), OperState:0x0}, VxlanId:1, VtepDevIndex:2, SrcAddr:net.IP{0xc0, 0xa8, 0x0, 0xb}, Group:net.IP(nil), TTL:0, TOS:0, Learning:false, Proxy:false, RSC:false, L2miss:false, L3miss:false, UDPCSum:true, NoAge:false, GBP:false, Age:300, Limit:0, Port:8472, PortLow:0, PortHigh:0}
I0923 14:49:34.771091       1 main.go:366] Stopping shutdownHandler...
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Mon, 23 Sep 2019 08:50:25 -0600   Sun, 22 Sep 2019 17:08:17 -0600   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Mon, 23 Sep 2019 08:50:25 -0600   Sun, 22 Sep 2019 17:08:17 -0600   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Mon, 23 Sep 2019 08:50:25 -0600   Sun, 22 Sep 2019 17:08:17 -0600   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            False   Mon, 23 Sep 2019 08:50:25 -0600   Sun, 22 Sep 2019 17:08:17 -0600   KubeletNotReady              runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized. WARNING: CPU hardcapping unsupported
nbroeking@pi0:~ $ journalctl -u kubelet
-- Logs begin at Sun 2019-09-22 22:01:54 BST, end at Mon 2019-09-23 15:52:07 BST. --
Sep 22 22:10:39 pi0 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Sep 22 22:10:41 pi0 kubelet[340]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-
Sep 22 22:10:41 pi0 kubelet[340]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-
Sep 22 22:10:41 pi0 kubelet[340]: I0922 22:10:41.459178     340 server.go:410] Version: v1.16.0
Sep 22 22:10:41 pi0 kubelet[340]: I0922 22:10:41.464848     340 plugins.go:100] No cloud provider specified.
Sep 22 22:10:41 pi0 kubelet[340]: I0922 22:10:41.465008     340 server.go:773] Client rotation is on, will bootstrap in background
Sep 22 22:10:41 pi0 kubelet[340]: I0922 22:10:41.495236     340 certificate_store.go:129] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-current.pem".
Sep 22 22:10:59 pi0 kubelet[340]: E0922 22:10:59.706453     340 machine.go:288] failed to get cache information for node 0: open /sys/devices/system/cpu/cpu0/cache: no such file or directory
Sep 22 22:10:59 pi0 kubelet[340]: I0922 22:10:59.789783     340 server.go:644] --cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /
Sep 22 22:10:59 pi0 kubelet[340]: I0922 22:10:59.795425     340 container_manager_linux.go:265] container manager verified user specified cgroup-root exists: []
Sep 22 22:10:59 pi0 kubelet[340]: I0922 22:10:59.795488     340 container_manager_linux.go:270] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker Cg
Sep 22 22:10:59 pi0 kubelet[340]: I0922 22:10:59.795910     340 fake_topology_manager.go:29] [fake topologymanager] NewFakeManager
Sep 22 22:10:59 pi0 kubelet[340]: I0922 22:10:59.795944     340 container_manager_linux.go:305] Creating device plugin manager: true
Sep 22 22:10:59 pi0 kubelet[340]: I0922 22:10:59.799085     340 fake_topology_manager.go:39] [fake topologymanager] AddHintProvider HintProvider:  &{kubelet.sock /var/lib/kubelet/device-plugins/ map[] {0 0} <nil> {{} [0 0 0]} 0x13f1cd0 0x
Sep 22 22:10:59 pi0 kubelet[340]: I0922 22:10:59.799275     340 state_mem.go:36] [cpumanager] initializing new in-memory state store
Sep 22 22:10:59 pi0 kubelet[340]: I0922 22:10:59.800264     340 state_mem.go:84] [cpumanager] updated default cpuset: ""
Sep 22 22:10:59 pi0 kubelet[340]: I0922 22:10:59.800304     340 state_mem.go:92] [cpumanager] updated cpuset assignments: "map[]"
Sep 22 22:10:59 pi0 kubelet[340]: I0922 22:10:59.800338     340 fake_topology_manager.go:39] [fake topologymanager] AddHintProvider HintProvider:  &{{0 0} 0x62bf09c 10000000000 0x6c1ca80 <nil> <nil> <nil> <nil> map[memory:{{104857600 0} {
Sep 22 22:10:59 pi0 kubelet[340]: I0922 22:10:59.804230     340 kubelet.go:287] Adding pod path: /etc/kubernetes/manifests
Sep 22 22:10:59 pi0 kubelet[340]: I0922 22:10:59.804330     340 kubelet.go:312] Watching apiserver
Sep 22 22:10:59 pi0 kubelet[340]: I0922 22:10:59.814810     340 client.go:75] Connecting to docker on unix:///var/run/docker.sock
Sep 22 22:10:59 pi0 kubelet[340]: I0922 22:10:59.820859     340 client.go:104] Start docker client with request timeout=2m0s
Sep 22 22:10:59 pi0 kubelet[340]: E0922 22:10:59.835400     340 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/kubelet.go:459: Failed to list *v1.Node: Get https://192.168.0.10:6443/api/v1/nodes?fieldSelector=metadata.name%3Dpi0&limit=50
Sep 22 22:10:59 pi0 kubelet[340]: E0922 22:10:59.835503     340 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/kubelet.go:450: Failed to list *v1.Service: Get https://192.168.0.10:6443/api/v1/services?limit=500&resourceVersion=0: dial tc
Sep 22 22:10:59 pi0 kubelet[340]: E0922 22:10:59.835526     340 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Get https://192.168.0.10:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dpi0&li
Sep 22 22:10:59 pi0 kubelet[340]: W0922 22:10:59.880983     340 docker_service.go:563] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
Sep 22 22:10:59 pi0 kubelet[340]: I0922 22:10:59.881063     340 docker_service.go:240] Hairpin mode set to "hairpin-veth"
Sep 22 22:10:59 pi0 kubelet[340]: W0922 22:10:59.986613     340 cni.go:202] Error validating CNI config &{cbr0  false [0x6fc08e0 0x6fc0920] [123 10 32 32 34 110 97 109 101 34 58 32 34 99 98 114 48 34 44 10 32 32 34 112 108 117 103 105 110
Sep 22 22:10:59 pi0 kubelet[340]: W0922 22:10:59.986954     340 cni.go:237] Unable to update cni config: no valid networks found in /etc/cni/net.d
Sep 22 22:11:00 pi0 kubelet[340]: W0922 22:11:00.011658     340 cni.go:202] Error validating CNI config &{cbr0  false [0x6f6c7c0 0x6f6c840] [123 10 32 32 34 110 97 109 101 34 58 32 34 99 98 114 48 34 44 10 32 32 34 112 108 117 103 105 110
Sep 22 22:11:00 pi0 kubelet[340]: W0922 22:11:00.011932     340 cni.go:237] Unable to update cni config: no valid networks found in /etc/cni/net.d
Sep 22 22:11:00 pi0 kubelet[340]: I0922 22:11:00.012196     340 docker_service.go:255] Docker cri networking managed by cni
Sep 22 22:11:00 pi0 kubelet[340]: W0922 22:11:00.030760     340 cni.go:202] Error validating CNI config &{cbr0  false [0x6f1a180 0x6f1a200] [123 10 32 32 34 110 97 109 101 34 58 32 34 99 98 114 48 34 44 10 32 32 34 112 108 117 103 105 110
Sep 22 22:11:00 pi0 kubelet[340]: W0922 22:11:00.031141     340 cni.go:237] Unable to update cni config: no valid networks found in /etc/cni/net.d
Sep 22 22:11:00 pi0 kubelet[340]: I0922 22:11:00.047139     340 docker_service.go:260] Docker Info: &{ID:YNKI:2WKN:R67V:MKZF:PWOS:R63R:FFHL:GNWE:GWZD:B7I3:T7YE:DNT7 Containers:23 ContainersRunning:11 ContainersPaused:0 ContainersStopped:1
Sep 22 22:11:00 pi0 kubelet[340]: I0922 22:11:00.047409     340 docker_service.go:273] Setting cgroupDriver to cgroupfs
Sep 22 22:11:00 pi0 kubelet[340]: I0922 22:11:00.113425     340 remote_runtime.go:59] parsed scheme: ""
Sep 22 22:11:00 pi0 kubelet[340]: I0922 22:11:00.113485     340 remote_runtime.go:59] scheme "" not registered, fallback to default scheme
Sep 22 22:11:00 pi0 kubelet[340]: I0922 22:11:00.116890     340 passthrough.go:48] ccResolverWrapper: sending update to cc: {[{/var/run/dockershim.sock 0  <nil>}] <nil>}
Sep 22 22:11:00 pi0 kubelet[340]: I0922 22:11:00.116942     340 clientconn.go:577] ClientConn switching balancer to "pick_first"
Sep 22 22:11:00 pi0 kubelet[340]: I0922 22:11:00.117150     340 remote_image.go:50] parsed scheme: ""
Sep 22 22:11:00 pi0 kubelet[340]: I0922 22:11:00.117206     340 remote_image.go:50] scheme "" not registered, fallback to default scheme
Sep 22 22:11:00 pi0 kubelet[340]: I0922 22:11:00.117280     340 passthrough.go:48] ccResolverWrapper: sending update to cc: {[{/var/run/dockershim.sock 0  <nil>}] <nil>}
Sep 22 22:11:00 pi0 kubelet[340]: I0922 22:11:00.117310     340 clientconn.go:577] ClientConn switching balancer to "pick_first"
Sep 22 22:11:00 pi0 kubelet[340]: E0922 22:11:00.837295     340 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/kubelet.go:459: Failed to list *v1.Node: Get https://192.168.0.10:6443/api/v1/nodes?fieldSelector=metadata.name%3Dpi0&limit=50
Sep 22 22:11:00 pi0 kubelet[340]: E0922 22:11:00.838952     340 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/kubelet.go:450: Failed to list *v1.Service: Get https://192.168.0.10:6443/api/v1/services?limit=500&resourceVersion=0: dial tc
Sep 22 22:11:00 pi0 kubelet[340]: E0922 22:11:00.841031     340 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Get https://192.168.0.10:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dpi0&li
Sep 22 22:11:01 pi0 kubelet[340]: E0922 22:11:01.839010     340 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/kubelet.go:459: Failed to list *v1.Node: Get https://192.168.0.10:6443/api/v1/nodes?fieldSelector=metadata.name%3Dpi0&limit=50
Sep 22 22:11:01 pi0 kubelet[340]: E0922 22:11:01.841516     340 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/kubelet.go:450: Failed to list *v1.Service: Get https://192.168.0.10:6443/api/v1/services?limit=500&resourceVersion=0: dial tc
Sep 22 22:11:01 pi0 kubelet[340]: E0922 22:11:01.843629     340 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Get https://192.168.0.10:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dpi0&li
Sep 22 22:11:02 pi0 kubelet[340]: E0922 22:11:02.840729     340 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/kubelet.go:459: Failed to list *v1.Node: Get https://192.168.0.10:6443/api/v1/nodes?fieldSelector=metadata.name%3Dpi0&limit=50
Sep 22 22:11:02 pi0 kubelet[340]: E0922 22:11:02.843747     340 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/kubelet.go:450: Failed to list *v1.Service: Get https://192.168.0.10:6443/api/v1/services?limit=500&resourceVersion=0: dial tc
Sep 22 22:11:02 pi0 kubelet[340]: E0922 22:11:02.845638     340 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Get https://192.168.0.10:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dpi0&li
Sep 22 22:11:03 pi0 kubelet[340]: E0922 22:11:03.842740     340 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/kubelet.go:459: Failed to list *v1.Node: Get https://192.168.0.10:6443/api/v1/nodes?fieldSelector=metadata.name%3Dpi0&limit=50
Sep 22 22:11:03 pi0 kubelet[340]: E0922 22:11:03.846900     340 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/kubelet.go:450: Failed to list *v1.Service: Get https://192.168.0.10:6443/api/v1/services?limit=500&resourceVersion=0: dial tc
Sep 22 22:11:03 pi0 kubelet[340]: E0922 22:11:03.847451     340 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Get https://192.168.0.10:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dpi0&li
Sep 22 22:11:04 pi0 kubelet[340]: E0922 22:11:04.844868     340 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/kubelet.go:459: Failed to list *v1.Node: Get https://192.168.0.10:6443/api/v1/nodes?fieldSelector=metadata.name%3Dpi0&limit=50
Sep 22 22:11:04 pi0 kubelet[340]: E0922 22:11:04.848600     340 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/kubelet.go:450: Failed to list *v1.Service: Get https://192.168.0.10:6443/api/v1/services?limit=500&resourceVersion=0: dial tc

@charleslong

I was able to resolve this by applying a fixed kube-flannel.yml (with cniVersion) and then deleting the impacted flannel pod so that it restarted; now my upgraded test nodes are reporting Ready. There is a PR for kube-flannel.yml from lwr20 which should resolve this issue going forward.
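Roughly, the steps described above (the label selector assumes the stock kube-flannel.yml, which labels the DaemonSet pods with app=flannel):

$ kubectl apply -f kube-flannel.yml                 # the fixed manifest with cniVersion
$ kubectl -n kube-system delete pod -l app=flannel  # let the DaemonSet recreate the pods
$ kubectl get nodes                                 # nodes should report Ready again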

@jeremyje
Contributor

I can confirm that this fix works on a vanilla install of Ubuntu 18.04 LTS (fully updated), running through the setup steps as closely as I can: https://gist.github.com/jeremyje/14e26148909734ebe1d6395cc8b0e156

@charleslong

@jeremyje sudo sed -i '/"name": "cbr0",/a"cniVersion": "0.2.0",' /etc/cni/net.d/10-flannel.conflist is a temporary fix and will be removed if the flannel pod is restarted.

@liggitt
Member

liggitt commented Sep 24, 2019

/sig node network

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. sig/network Categorizes an issue or PR as relevant to SIG Network. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Sep 24, 2019
@liggitt
Member

liggitt commented Sep 24, 2019

/sig cluster-lifecycle

@k8s-ci-robot k8s-ci-robot added the sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. label Sep 24, 2019
@charleslong

New patch for this:

flannel-io/flannel#1174

@athenabot

/triage unresolved

Comment /remove-triage unresolved when the issue is assessed and confirmed.

🤖 I am a bot run by vllry. 👩‍🔬

@k8s-ci-robot k8s-ci-robot added the triage/unresolved Indicates an issue that can not or will not be resolved. label Sep 25, 2019
@bowei
Member

bowei commented Oct 3, 2019

/assign @dcbw

@bowei
Member

bowei commented Oct 3, 2019

/remove-triage unresolved

@k8s-ci-robot k8s-ci-robot removed the triage/unresolved Indicates an issue that can not or will not be resolved. label Oct 3, 2019
@dcbw
Member

dcbw commented Oct 3, 2019

This is actually a CNI 'flannel' plugin bug. The documented flannel CNI configuration JSON does not include a "cniVersion", which leaves the version empty; an empty version is technically a legal CNI version number and is accepted as version 0.1.0 of the spec. The CNI flannel plugin's VERSION command response claims to support v0.1.0 and up, but does not support "" (even though that is equivalent to 0.1.0).

Thus the mismatch. The bug should be fixed in CNI's libcni itself, either in validatePlugin() or by adding "" to var All = PluginSupports("0.1.0", "0.2.0", "0.3.0", "0.3.1", "0.4.0") in version.go.

Filed upstream issue containernetworking/cni#720 for this.

@squeed @bboreham

@nbroeking
Author

Thanks everyone. Closing due to issue being in flannel.

@mikechen66

The quickest way is to configure kube-flannel and apply it on ARM64, including Jetson TX2/Nano and Raspberry Pi. Please have a look at "Config the master node".

1. Update kube-flannel

$ curl https://rawgit.com/coreos/flannel/master/Documentation/kube-flannel.yml \
    | sed "s/amd64/arm/g" | sed "s/vxlan/host-gw/g" \
    > kube-flannel.yaml

2. Apply kube-flannel.yaml

$ kubectl apply -f kube-flannel.yaml

3. Check the configuration

$ kubectl describe --namespace=kube-system configmaps/kube-flannel-cfg
$ kubectl describe --namespace=kube-system daemonsets/kube-flannel-ds-arm
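Once the DaemonSet pods are up, node readiness can be re-checked, for example with (the label selector assumes the stock manifest):

$ kubectl get nodes -o wide
$ kubectl -n kube-system get pods -l app=flannel -o wide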

@xwdreamer

@mikechen66, what's the difference between vxlan and host-gw?
