One node loses private and external IP address #68270

Closed
shahbour opened this issue Sep 5, 2018 · 32 comments
Labels
area/cloudprovider  kind/bug  sig/cloud-provider  sig/node

Comments

@shahbour

shahbour commented Sep 5, 2018

Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug
/sig node

What happened:
One of my nodes, engine02, loses its internal and external IP after some time:

NAME                          STATUS    ROLES     AGE       VERSION   INTERNAL-IP      EXTERNAL-IP      OS-IMAGE                KERNEL-VERSION               CONTAINER-RUNTIME
engine01                      Ready     <none>    60d       v1.11.2   192.168.70.230   192.168.70.230   CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1
engine02                      Ready     <none>    60d       v1.11.2   <none>           <none>           CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1
engine03                      Ready     <none>    60d       v1.11.2   172.16.71.11     172.16.71.11     CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1
kube-master                   Ready     master    90d       v1.11.2   192.168.70.232   192.168.70.232   CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1

What you expected to happen:

(⎈ |production:default)➜  ~ kubectl get node -o wide
NAME                          STATUS                     ROLES     AGE       VERSION   INTERNAL-IP      EXTERNAL-IP      OS-IMAGE                KERNEL-VERSION               CONTAINER-RUNTIME
engine01                      Ready                      <none>    59d       v1.11.2   192.168.70.230   192.168.70.230   CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1
engine02                      Ready,SchedulingDisabled   <none>    59d       v1.11.2   192.168.70.231   192.168.70.231   CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1
engine03                      Ready                      <none>    59d       v1.11.2   172.16.71.11     172.16.71.11     CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1
kube-master                   Ready                      master    89d       v1.11.2   192.168.70.232   192.168.70.232   CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1

How to reproduce it (as minimally and precisely as possible):
If I restart the engine02 node, the IPs show up for some time, then after a while they just disappear. I found this while debugging an issue.

Anything else we need to know?:
I did not find anything in the logs, but I did not know exactly which log I should search in.
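
For anyone hitting the same symptom, two quick checks narrow down where to look: the address list the API server currently holds for the node, and the kubelet journal on that node. A minimal sketch; the node name comes from the output above, and journalctl assumes a systemd-managed kubelet:

# Addresses recorded in the node's status (missing entries reproduce the <none> above)
kubectl get node engine02 -o jsonpath='{.status.addresses}'

# Search the kubelet journal for node-status update problems
journalctl -u kubelet --no-pager | grep -i "node status"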

Environment:

  • Kubernetes version (use kubectl version):
➜ kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.1", GitCommit:"b1b29978270dc22fecc592ac55d903350454310a", GitTreeState:"clean", BuildDate:"2018-07-17T18:53:20Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", BuildDate:"2018-08-07T23:08:19Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
    Vsphere
  • OS (e.g. from /etc/os-release):
➜  ~ cat /etc/os-release 
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
@k8s-ci-robot added kind/bug, needs-sig, sig/node and removed needs-sig labels Sep 5, 2018
@stieler-it

stieler-it commented Sep 5, 2018

We are already discussing this problem, but I think this might be the correct place for it. What we currently believe we know:

  • Nodes lose the IP information in their status (internalIP and externalIP)
  • The information can come back after some time, and comes back immediately when kubelet is restarted
  • Happens with different cloud providers (so far OpenStack, Azure, vSphere)
  • Seems to be a regression in K8S 1.11.x; it does not happen in 1.10.5
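
Because the addresses come and go, a periodic check across all nodes makes the window easier to catch. A sketch assuming plain kubectl access; adjust the interval as needed:

# Print each node's name and its InternalIP every minute (the field is empty while the address is lost)
while true; do
  date
  kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}'
  sleep 60
done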

Related issues:

@shahbour
Author

shahbour commented Sep 5, 2018

Some notes:
It started with version 1.11.0, as you can see in the Weave issue above.

(⎈ |production:kube-system)➜  ~ kubectl get node -o wide
NAME                          STATUS                     ROLES     AGE       VERSION   INTERNAL-IP      EXTERNAL-IP      OS-IMAGE                KERNEL-VERSION               CONTAINER-RUNTIME
engine01                      Ready,SchedulingDisabled   <none>    59d       v1.11.2   192.168.70.230   192.168.70.230   CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1
engine02                      Ready                      <none>    59d       v1.11.0   <none>           <none>           CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1
engine03                      Ready                      <none>    59d       v1.11.2   172.16.71.11     172.16.71.11     CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1
kube-master                   Ready                      master    89d       v1.11.2   192.168.70.232   192.168.70.232   CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1

I do have the vSphere cloud provider; I installed it on our own vSphere using kubeadm.

If you need any logs, I would be happy to share them.

@stieler-it

stieler-it commented Sep 5, 2018

Do you have any connection errors when grep'ing kubelet logs for "node status"?

@shahbour
Author

shahbour commented Sep 5, 2018

Yes, the log is full of that. Please note that I restarted kubelet:

2018-09-05 07:12 systemctl restart kubelet.service

➜  ~ grep "node status" /var/log/messages
Sep  3 07:35:28 engine02 kubelet: E0903 07:35:28.899230   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?resourceVersion=0&timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:35:28 engine02 kubelet: E0903 07:35:28.900518   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:35:28 engine02 kubelet: E0903 07:35:28.902115   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:35:28 engine02 kubelet: E0903 07:35:28.903700   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:35:28 engine02 kubelet: E0903 07:35:28.904867   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:35:28 engine02 kubelet: E0903 07:35:28.904916   27639 kubelet_node_status.go:379] Unable to update node status: update node status exceeds retry count
Sep  3 07:35:48 engine02 kubelet: E0903 07:35:48.906432   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?resourceVersion=0&timeout=10s: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Sep  3 07:35:58 engine02 kubelet: E0903 07:35:58.907013   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Sep  3 07:36:08 engine02 kubelet: E0903 07:36:08.907768   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Sep  3 07:36:08 engine02 kubelet: E0903 07:36:08.908883   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:08 engine02 kubelet: E0903 07:36:08.909872   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:08 engine02 kubelet: E0903 07:36:08.909926   27639 kubelet_node_status.go:379] Unable to update node status: update node status exceeds retry count
Sep  3 07:36:18 engine02 kubelet: E0903 07:36:18.911107   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?resourceVersion=0&timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:18 engine02 kubelet: E0903 07:36:18.912322   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:18 engine02 kubelet: E0903 07:36:18.913246   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:18 engine02 kubelet: E0903 07:36:18.914313   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:18 engine02 kubelet: E0903 07:36:18.915218   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:18 engine02 kubelet: E0903 07:36:18.915278   27639 kubelet_node_status.go:379] Unable to update node status: update node status exceeds retry count
Sep  3 07:36:28 engine02 kubelet: E0903 07:36:28.916545   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?resourceVersion=0&timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:28 engine02 kubelet: E0903 07:36:28.917366   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:28 engine02 kubelet: E0903 07:36:28.918103   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:28 engine02 kubelet: E0903 07:36:28.918837   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:28 engine02 kubelet: E0903 07:36:28.919612   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:28 engine02 kubelet: E0903 07:36:28.919686   27639 kubelet_node_status.go:379] Unable to update node status: update node status exceeds retry count
Sep  3 07:36:38 engine02 kubelet: E0903 07:36:38.920985   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?resourceVersion=0&timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:38 engine02 kubelet: E0903 07:36:38.921887   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:38 engine02 kubelet: E0903 07:36:38.922680   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:38 engine02 kubelet: E0903 07:36:38.923490   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:38 engine02 kubelet: E0903 07:36:38.924282   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:38 engine02 kubelet: E0903 07:36:38.924358   27639 kubelet_node_status.go:379] Unable to update node status: update node status exceeds retry count
Sep  3 07:36:48 engine02 kubelet: E0903 07:36:48.925726   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?resourceVersion=0&timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:48 engine02 kubelet: E0903 07:36:48.926672   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:48 engine02 kubelet: E0903 07:36:48.927563   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:48 engine02 kubelet: E0903 07:36:48.928365   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:48 engine02 kubelet: E0903 07:36:48.929118   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:48 engine02 kubelet: E0903 07:36:48.929150   27639 kubelet_node_status.go:379] Unable to update node status: update node status exceeds retry count
Sep  3 07:37:08 engine02 kubelet: E0903 07:37:08.930270   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?resourceVersion=0&timeout=10s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Sep  3 07:37:18 engine02 kubelet: E0903 07:37:18.930970   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: context deadline exceeded
Sep  3 08:09:28 engine02 kubelet: E0903 08:09:28.865697   27639 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0.vmdk\"" failed. No retries permitted until 2018-09-03 08:09:30.865628329 +0000 UTC m=+4628926.006948413 (durationBeforeRetry 2s). Error: "Volume not attached according to node status for volume \"pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0.vmdk\") pod \"rabbitmq-rabbitmq-ha-1\" (UID: \"ad1e687c-af50-11e8-b7be-0050568166d0\") "
Sep  3 08:09:30 engine02 kubelet: E0903 08:09:30.881016   27639 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0.vmdk\"" failed. No retries permitted until 2018-09-03 08:09:34.880955415 +0000 UTC m=+4628930.022275387 (durationBeforeRetry 4s). Error: "Volume not attached according to node status for volume \"pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0.vmdk\") pod \"rabbitmq-rabbitmq-ha-1\" (UID: \"ad1e687c-af50-11e8-b7be-0050568166d0\") "
Sep  3 08:09:55 engine02 kubelet: E0903 08:09:55.775875   27639 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0.vmdk\"" failed. No retries permitted until 2018-09-03 08:10:03.775785447 +0000 UTC m=+4628958.917105455 (durationBeforeRetry 8s). Error: "Volume not attached according to node status for volume \"pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0.vmdk\") pod \"rabbitmq-rabbitmq-ha-1\" (UID: \"ad1e687c-af50-11e8-b7be-0050568166d0\") "
Sep  3 08:10:03 engine02 kubelet: E0903 08:10:03.787976   27639 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0.vmdk\"" failed. No retries permitted until 2018-09-03 08:10:19.787874552 +0000 UTC m=+4628974.929194625 (durationBeforeRetry 16s). Error: "Volume not attached according to node status for volume \"pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0.vmdk\") pod \"rabbitmq-rabbitmq-ha-1\" (UID: \"ad1e687c-af50-11e8-b7be-0050568166d0\") "
Sep  4 11:56:34 engine02 kubelet: E0904 11:56:34.438862     804 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?resourceVersion=0&timeout=10s: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Sep  4 11:56:38 engine02 kubelet: E0904 11:56:38.462093     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 11:56:46.462032212 +0000 UTC m=+4086.955907754 (durationBeforeRetry 8s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-vvnn8\" (UID: \"820b5be9-b039-11e8-b7be-0050568166d0\") "
Sep  4 11:57:04 engine02 kubelet: E0904 11:57:04.398858     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 11:57:20.398806711 +0000 UTC m=+4120.892682230 (durationBeforeRetry 16s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-vvnn8\" (UID: \"820b5be9-b039-11e8-b7be-0050568166d0\") "
Sep  4 11:57:34 engine02 kubelet: E0904 11:57:34.470714     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 11:58:06.470673606 +0000 UTC m=+4166.964549132 (durationBeforeRetry 32s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-vvnn8\" (UID: \"820b5be9-b039-11e8-b7be-0050568166d0\") "
Sep  4 11:58:06 engine02 kubelet: E0904 11:58:06.550371     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 11:59:10.550304065 +0000 UTC m=+4231.044179614 (durationBeforeRetry 1m4s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-vvnn8\" (UID: \"820b5be9-b039-11e8-b7be-0050568166d0\") "
Sep  4 11:59:14 engine02 kubelet: E0904 11:59:14.364514     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:01:16.364403626 +0000 UTC m=+4356.858279268 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-vvnn8\" (UID: \"820b5be9-b039-11e8-b7be-0050568166d0\") "
Sep  4 11:59:59 engine02 kubelet: W0904 11:59:59.952086     804 kubelet_node_status.go:1114] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Sep  4 12:01:02 engine02 kubelet: E0904 12:01:02.588554     804 kubelet_node_status.go:391] Error updating node status, will retry: failed to patch status "{\"status\":{\"$setElementOrder/conditions\":[{\"type\":\"OutOfDisk\"},{\"type\":\"MemoryPressure\"},{\"type\":\"DiskPressure\"},{\"type\":\"PIDPressure\"},{\"type\":\"Ready\"}],\"conditions\":[{\"lastHeartbeatTime\":\"2018-09-04T12:00:40Z\",\"type\":\"OutOfDisk\"},{\"lastHeartbeatTime\":\"2018-09-04T12:00:40Z\",\"type\":\"MemoryPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:00:40Z\",\"type\":\"DiskPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:00:40Z\",\"type\":\"PIDPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:00:40Z\",\"type\":\"Ready\"}]}}" for node "engine02": Patch https://192.168.70.232:6443/api/v1/nodes/engine02/status?timeout=10s: context deadline exceeded
Sep  4 12:01:16 engine02 kubelet: E0904 12:01:16.457135     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:03:18.457065655 +0000 UTC m=+4478.950941239 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:03:18 engine02 kubelet: E0904 12:03:18.528831     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:05:20.528766851 +0000 UTC m=+4601.022642408 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:05:20 engine02 kubelet: E0904 12:05:20.610568     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:07:22.610499519 +0000 UTC m=+4723.104375101 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:07:42 engine02 kubelet: E0904 12:07:42.787880     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:09:44.787797916 +0000 UTC m=+4865.281673428 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:09:44 engine02 kubelet: E0904 12:09:44.848776     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:11:46.848716767 +0000 UTC m=+4987.342592332 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:11:46 engine02 kubelet: E0904 12:11:46.938195     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:13:48.938134262 +0000 UTC m=+5109.432009858 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:13:48 engine02 kubelet: E0904 12:13:48.976105     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:15:50.976009545 +0000 UTC m=+5231.469885124 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:15:51 engine02 kubelet: E0904 12:15:51.079056     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:17:53.078988077 +0000 UTC m=+5353.572863672 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:17:53 engine02 kubelet: E0904 12:17:53.104947     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:19:55.104867527 +0000 UTC m=+5475.598743103 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:19:55 engine02 kubelet: E0904 12:19:55.142301     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:21:57.142196146 +0000 UTC m=+5597.636071728 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:21:57 engine02 kubelet: E0904 12:21:57.216295     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:23:59.216188034 +0000 UTC m=+5719.710063606 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:22:58 engine02 kubelet: E0904 12:22:58.236983     804 kubelet_node_status.go:391] Error updating node status, will retry: failed to patch status "{\"status\":{\"$setElementOrder/conditions\":[{\"type\":\"OutOfDisk\"},{\"type\":\"MemoryPressure\"},{\"type\":\"DiskPressure\"},{\"type\":\"PIDPressure\"},{\"type\":\"Ready\"}],\"conditions\":[{\"lastHeartbeatTime\":\"2018-09-04T12:22:35Z\",\"type\":\"OutOfDisk\"},{\"lastHeartbeatTime\":\"2018-09-04T12:22:35Z\",\"type\":\"MemoryPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:22:35Z\",\"type\":\"DiskPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:22:35Z\",\"type\":\"PIDPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:22:35Z\",\"type\":\"Ready\"}]}}" for node "engine02": Patch https://192.168.70.232:6443/api/v1/nodes/engine02/status?timeout=10s: context deadline exceeded
Sep  4 12:23:59 engine02 kubelet: E0904 12:23:59.307749     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:26:01.307682829 +0000 UTC m=+5841.801558371 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:26:01 engine02 kubelet: E0904 12:26:01.337923     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:28:03.337849408 +0000 UTC m=+5963.831724943 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:28:03 engine02 kubelet: E0904 12:28:03.916377     804 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?resourceVersion=0&timeout=10s: context deadline exceeded
Sep  4 12:28:03 engine02 kubelet: E0904 12:28:03.926029     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:30:05.925993854 +0000 UTC m=+6086.419869388 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:30:05 engine02 kubelet: E0904 12:30:05.967223     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:32:07.967150273 +0000 UTC m=+6208.461025812 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:30:36 engine02 kubelet: E0904 12:30:36.809110     804 kubelet_node_status.go:391] Error updating node status, will retry: failed to patch status "{\"status\":{\"$setElementOrder/conditions\":[{\"type\":\"OutOfDisk\"},{\"type\":\"MemoryPressure\"},{\"type\":\"DiskPressure\"},{\"type\":\"PIDPressure\"},{\"type\":\"Ready\"}],\"conditions\":[{\"lastHeartbeatTime\":\"2018-09-04T12:30:14Z\",\"type\":\"OutOfDisk\"},{\"lastHeartbeatTime\":\"2018-09-04T12:30:14Z\",\"type\":\"MemoryPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:30:14Z\",\"type\":\"DiskPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:30:14Z\",\"type\":\"PIDPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:30:14Z\",\"type\":\"Ready\"}]}}" for node "engine02": Patch https://192.168.70.232:6443/api/v1/nodes/engine02/status?timeout=10s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Sep  4 12:32:08 engine02 kubelet: E0904 12:32:08.033116     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:34:10.033039767 +0000 UTC m=+6330.526915349 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:33:09 engine02 kubelet: E0904 12:33:09.624995     804 kubelet_node_status.go:391] Error updating node status, will retry: failed to patch status "{\"status\":{\"$setElementOrder/conditions\":[{\"type\":\"OutOfDisk\"},{\"type\":\"MemoryPressure\"},{\"type\":\"DiskPressure\"},{\"type\":\"PIDPressure\"},{\"type\":\"Ready\"}],\"conditions\":[{\"lastHeartbeatTime\":\"2018-09-04T12:32:47Z\",\"type\":\"OutOfDisk\"},{\"lastHeartbeatTime\":\"2018-09-04T12:32:47Z\",\"type\":\"MemoryPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:32:47Z\",\"type\":\"DiskPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:32:47Z\",\"type\":\"PIDPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:32:47Z\",\"type\":\"Ready\"}]}}" for node "engine02": Patch https://192.168.70.232:6443/api/v1/nodes/engine02/status?timeout=10s: context deadline exceeded
Sep  4 12:34:10 engine02 kubelet: E0904 12:34:10.109208     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:36:12.109124111 +0000 UTC m=+6452.602999713 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:36:12 engine02 kubelet: E0904 12:36:12.155042     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:38:14.154960161 +0000 UTC m=+6574.648835750 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:38:14 engine02 kubelet: E0904 12:38:14.392556     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:40:16.39245309 +0000 UTC m=+6696.886328618 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:40:16 engine02 kubelet: E0904 12:40:16.435047     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:42:18.434977275 +0000 UTC m=+6818.928852849 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 14:32:24 engine02 kubelet: E0904 14:32:24.080894     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-03005cea-6a2c-11e8-87de-0050568166d0.vmdk\"" failed. No retries permitted until 2018-09-04 14:32:25.080834994 +0000 UTC m=+13425.574710508 (durationBeforeRetry 1s). Error: "Volume not attached according to node status for volume \"pvc-03005cea-6a2c-11e8-87de-0050568166d0\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-03005cea-6a2c-11e8-87de-0050568166d0.vmdk\") pod \"rabbitmq-rabbitmq-ha-0\" (UID: \"5672f576-b04f-11e8-b7be-0050568166d0\") "
Sep  4 14:32:25 engine02 kubelet: E0904 14:32:25.186936     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-03005cea-6a2c-11e8-87de-0050568166d0.vmdk\"" failed. No retries permitted until 2018-09-04 14:32:27.186869203 +0000 UTC m=+13427.680744731 (durationBeforeRetry 2s). Error: "Volume not attached according to node status for volume \"pvc-03005cea-6a2c-11e8-87de-0050568166d0\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-03005cea-6a2c-11e8-87de-0050568166d0.vmdk\") pod \"rabbitmq-rabbitmq-ha-0\" (UID: \"5672f576-b04f-11e8-b7be-0050568166d0\") "
Sep  4 14:32:27 engine02 kubelet: E0904 14:32:27.200035     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-03005cea-6a2c-11e8-87de-0050568166d0.vmdk\"" failed. No retries permitted until 2018-09-04 14:32:31.199965396 +0000 UTC m=+13431.693841018 (durationBeforeRetry 4s). Error: "Volume not attached according to node status for volume \"pvc-03005cea-6a2c-11e8-87de-0050568166d0\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-03005cea-6a2c-11e8-87de-0050568166d0.vmdk\") pod \"rabbitmq-rabbitmq-ha-0\" (UID: \"5672f576-b04f-11e8-b7be-0050568166d0\") "
Sep  4 14:32:52 engine02 kubelet: E0904 14:32:52.913975     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-03005cea-6a2c-11e8-87de-0050568166d0.vmdk\"" failed. No retries permitted until 2018-09-04 14:33:00.913881446 +0000 UTC m=+13461.407756984 (durationBeforeRetry 8s). Error: "Volume not attached according to node status for volume \"pvc-03005cea-6a2c-11e8-87de-0050568166d0\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-03005cea-6a2c-11e8-87de-0050568166d0.vmdk\") pod \"rabbitmq-rabbitmq-ha-0\" (UID: \"5672f576-b04f-11e8-b7be-0050568166d0\") "
Sep  4 14:33:00 engine02 kubelet: E0904 14:33:00.948623     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-03005cea-6a2c-11e8-87de-0050568166d0.vmdk\"" failed. No retries permitted until 2018-09-04 14:33:16.948498532 +0000 UTC m=+13477.442374068 (durationBeforeRetry 16s). Error: "Volume not attached according to node status for volume \"pvc-03005cea-6a2c-11e8-87de-0050568166d0\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-03005cea-6a2c-11e8-87de-0050568166d0.vmdk\") pod \"rabbitmq-rabbitmq-ha-0\" (UID: \"5672f576-b04f-11e8-b7be-0050568166d0\") "
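
Most of the entries above are failures to reach the API server at 192.168.70.232:6443 rather than address-specific errors; a quick check from the affected node separates connectivity problems from the missing-address problem. A sketch, with the endpoint taken from the log lines above:

# Is the API server reachable from engine02? Even a 401/403 response proves connectivity.
curl -k https://192.168.70.232:6443/healthz

# Any established connections from the node to the API server port?
ss -tnp | grep 6443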

@FengyunPan2
Contributor

/cc

@twittyc

twittyc commented Sep 6, 2018

Hey guys, we are getting this same problem with the OpenStack cloud provider.

cat kubelet-log.json |  grep -v "Volume not attached" | grep "node status"
{"log":"E0813 20:38:56.118941   18989 kubelet_node_status.go:391] Error updating node status, will retry: error getting node \"k8s-corp-prod-0-worker-us-corp-kc-8b-1\": Get https://127.0.0.1:6443/api/v1/nodes/k8s-corp-prod-0-worker-us-corp-kc-8b-1?resourceVersion=0\u0026timeout=10s: unexpected EOF\n","stream":"stderr","time":"2018-08-13T20:38:56.121316062Z"}
{"log":"E0813 23:50:34.361091   18989 kubelet_node_status.go:391] Error updating node status, will retry: error getting node \"k8s-corp-prod-0-worker-us-corp-kc-8b-1\": Get https://127.0.0.1:6443/api/v1/nodes/k8s-corp-prod-0-worker-us-corp-kc-8b-1?resourceVersion=0\u0026timeout=10s: unexpected EOF\n","stream":"stderr","time":"2018-08-13T23:50:34.361316592Z"}
{"log":"W0814 01:13:16.823637   18989 kubelet_node_status.go:1114] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s\n","stream":"stderr","time":"2018-08-14T01:13:16.823960043Z"}
{"log":"E0814 07:01:54.730030   18989 kubelet_node_status.go:391] Error updating node status, will retry: error getting node \"k8s-corp-prod-0-worker-us-corp-kc-8b-1\": Get https://127.0.0.1:6443/api/v1/nodes/k8s-corp-prod-0-worker-us-corp-kc-8b-1?resourceVersion=0\u0026timeout=10s: unexpected EOF\n","stream":"stderr","time":"2018-08-14T07:01:54.731184923Z"}

@wanghaoran1988
Contributor

We met this problem with the AWS cloud provider, and #65226 fixed it.

@openstacker

We met this problem with the AWS cloud provider, and #65226 fixed it.

Thanks for the feedback. So far, we know almost all cloud providers (AWS, Azure, OpenStack and vSphere) are affected by this bug.

@MaxDiOrio

I'm going to add my 2c to this one. I have had issues bringing up clusters, not where the IP goes missing, but where the internal/external IPs are set to the eth0 self-assigned IPv6 address, and the only way to fix it is to delete the node and recreate it.
rancher/rancher#16597

cloudboss pushed a commit to cloudboss/keights that referenced this issue Dec 17, 2018
This is a workaround for the bug described at
kubernetes/kubernetes#68270, where nodes
lose their IP address. A fix is supposed to be in 1.11.6, but this
should overcome the issue for the time being.
@stieler-it

Seems to be fixed in k8s v1.11.5

@mtougeron
Contributor

@stieler-it fyi, fwiw, we just encountered it today on v1.11.5 a little less than an hour ago.

@openstacker

It's merged into v1.11.6 not v1.11.5, just FYI.

@bosatsu

bosatsu commented Jan 2, 2019

New to kubernetes, but I believe I may be seeing this issue as well in v1.12.3. Let me know if there is any additional info I can provide to confirm or refute.

NAME            STATUS   ROLES    AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION               CONTAINER-RUNTIME
p3dk8sautoing   Ready    node     72m   v1.12.3   <none>          <none>        CentOS Linux 7 (Core)   3.10.0-862.14.4.el7.x86_64   docker://18.6.1
p3dk8sautomon   Ready    node     72m   v1.12.3   10.33.106.138   <none>        CentOS Linux 7 (Core)   3.10.0-862.14.4.el7.x86_64   docker://18.6.1
p3dk8sautomtr   Ready    master   73m   v1.12.3   <none>          <none>        CentOS Linux 7 (Core)   3.10.0-862.14.4.el7.x86_64   docker://18.6.1
p3dk8sautowkr   Ready    node     72m   v1.12.3   10.33.108.118   <none>        CentOS Linux 7 (Core)   3.10.0-862.14.4.el7.x86_64   docker://18.6.1
Jan 02 11:40:56 p3dk8sautomtr kubelet[10488]: I0102 11:40:56.885054   10488 setters.go:72] Using node IP: "10.33.88.37"
Jan 02 11:40:56 p3dk8sautomtr kubelet[10488]: W0102 11:40:56.885105   10488 kubelet_node_status.go:463] Failed to set some node status fields: failed to get node address from cloud provider that matches ip: 10.33.88.37
Jan 02 11:41:06 p3dk8sautomtr kubelet[10488]: I0102 11:41:06.900189   10488 setters.go:72] Using node IP: "10.33.88.37"
Jan 02 11:41:06 p3dk8sautomtr kubelet[10488]: W0102 11:41:06.900273   10488 kubelet_node_status.go:463] Failed to set some node status fields: failed to get node address from cloud provider that matches ip: 10.33.88.37
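
The warning above means the cloud provider did not return an address matching the kubelet's node IP (10.33.88.37) within the 10s timeout. Comparing what OpenStack itself reports for the instance with what the API server has recorded can help narrow it down. A sketch only; it assumes the Nova server name matches the node name and that OpenStack CLI credentials are available:

# Addresses OpenStack reports for the instance
openstack server show p3dk8sautomtr -c addresses

# Addresses currently recorded on the Kubernetes node object
kubectl describe node p3dk8sautomtr | grep -A5 Addresses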

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.1", GitCommit:"eec55b9ba98609a46fee712359c7b5b365bdd920", GitTreeState:"clean", BuildDate:"2018-12-13T10:39:04Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.3", GitCommit:"435f92c719f279a3a67808c80521ea17d5715c66", GitTreeState:"clean", BuildDate:"2018-11-26T12:46:57Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
Private Openstack cloud
  • OS (e.g. from /etc/os-release):
$ cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
  • Kernel (e.g. uname -a):
$ uname -a
Linux p3dk8sautomtr 3.10.0-862.14.4.el7.x86_64 #1 SMP Wed Sep 26 15:12:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools:
Kubespray release 2.8

@yogeshnath

We are running into the same issue. Does anyone see this in 1.13 as well?

@lcd2002

lcd2002 commented Mar 2, 2019

I'm having this issue with 1.13.1. One of the masters (HA configuration) lost the "addresses" object, where InternalIP, InternalDNS, and Hostname are listed. I restarted kubelet several times and even rebooted the machine, to no avail. Anybody know how to get it back?

@dassinion

I've got this issue on v1.14.1 as well. My cloud provider is AWS, and two nodes have lost their internal IP address. Restarting kubelet didn't fix the problem, and restarting the node didn't help either.
Only after a kubeadm reset and rejoining the cluster was my node back in the cluster.

@sfxworks

Tried everything in kubernetes/kubeadm#203 to no avail

@sfxworks

sfxworks commented Jun 19, 2019

Active in 1.14.3

NAME                  STATUS   ROLES    AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                       KERNEL-VERSION   CONTAINER-RUNTIME
k8s-internal-master   Ready    master   5m51s   v1.14.3   <none>        <none>        Debian GNU/Linux 9 (stretch)   4.9.0-9-amd64    containerd://1.2.6

Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.3", GitCommit:"5e53fd6bc17c0dec8434817e69b04a25d8ae0ff0", GitTreeState:"clean", BuildDate:"2019-06-06T01:44:30Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.3", GitCommit:"5e53fd6bc17c0dec8434817e69b04a25d8ae0ff0", GitTreeState:"clean", BuildDate:"2019-06-06T01:36:19Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider or hardware configuration:

Private Openstack cloud

OS (e.g. from /etc/os-release):

PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
NAME="Debian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

Kernel (e.g. uname -a):

Linux k8s-internal-master 4.9.0-9-amd64 #1 SMP Debian 4.9.168-1+deb9u2 (2019-05-13) x86_64 GNU/Linux

Install tools:
Kubeadm

apiVersion: kubeadm.k8s.io/v1beta1
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 198.18.2.118
  bindPort: 6443
nodeRegistration:
  name: k8s-internal-master
---
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
clusterName: imh-openstack
controllerManager:
  extraArgs:
    external-cloud-volume-plugin: openstack
networking:
  dnsDomain: demo.imh
  serviceSubnet: 10.244.0.0/16
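
If the two kubeadm objects above live in a single file, they need to be separated by a "---" document separator (as shown above); a dry run is a cheap way to confirm the file parses as intended. A sketch, assuming the config is saved as kubeadm-config.yaml:

# Parse and validate the config without changing the host
kubeadm init --config kubeadm-config.yaml --dry-run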

@andrewsykim
Member

Sorry folks, thought this was fixed in #65226 but based on recent reports looks like it isn't. Added this on the SIG Cloud Provider backlog here kubernetes/cloud-provider#37 and will prioritize for v1.16.

@andrewsykim
Member

andrewsykim commented Jun 19, 2019

/sig cloud-provider
/area cloudprovider

@k8s-ci-robot added area/cloudprovider and sig/cloud-provider labels Jun 19, 2019
@andrewsykim
Member

Spoke with @sfxworks on Slack a few weeks ago; it turns out his cluster was misconfigured. Are folks still running into this issue, or was it resolved in #65226?

@stieler-it

We haven't seen this issue for a very long time now.

@bashofmann

We also have not seen this issue anymore since 1.11.6.

@andrewsykim
Member

Thanks folks! Closing for now, please re-open if you can reproduce

/close

@k8s-ci-robot
Contributor

@andrewsykim: Closing this issue.

In response to this:

Thanks folks! Closing for now, please re-open if you can reproduce

/close


@mohideen

mohideen commented Jul 9, 2020

I'm seeing this issue in v1.18.5. We have a 6-node (3 control-plane, 3 worker) HA cluster, and this happens after a restart of the nodes. Also, I see that the order of restarts affects which node loses the IPs. I want to note that the loss of IP happened even if the node was drained before the restart.

Of the three control-plane nodes, whichever gets restarted last loses its internal and external IPs.

For example, if cn1, cn2, and cn3 are my control plane nodes:

  1. If I restart cn1, it loses its IPs when it comes back online.
  2. Then if I restart cn2, it loses its IPs when it comes back online, but after the cn2 restart cn1 gets its IPs back.
  3. If I restart cn1, cn2, and cn3 at the exact same moment, all the nodes have their IPs right.

Environment:

  • Baremetal
  • RHEL 7
  • Kubeadm provisioned HA with stacked etcd
  • vSphere cloud provider

/open

@andrewsykim
Member

@mohideen what's your CNI?

@mohideen

mohideen commented Jul 9, 2020

We have the latest canal (Calico v3.15.0 + Flannel v0.11.0 in host-gw mode)

@AcidAngel21

We have exactly the same problem in v1.18.3 and v1.18.6.

Environment:
Rancher v2.4.5
Rancher OS 1.5.1
RKE provisioned HA with stacked etcd
vSphere cloud provider

@AcidAngel21

For me it only happens when the following is true. Tested with K8s v1.14.6, 1.17.9 and 1.18.6

It does not happen when the flag is set to --cloud-provider=vsphere
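
For reference, a minimal sketch of how the in-tree vSphere provider is typically wired into the kubelet when that flag is used. --cloud-provider and --cloud-config are existing kubelet flags, but the environment-file path and cloud-config location below are assumptions for this environment:

# e.g. in the kubelet's environment file or systemd drop-in (path assumed)
KUBELET_EXTRA_ARGS=--cloud-provider=vsphere --cloud-config=/etc/kubernetes/vsphere.conf

systemctl restart kubelet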

@andrewsykim
Member

@AcidAngel21 your issue is likely related to kubernetes/cloud-provider-vsphere#338

@johnr84

johnr84 commented Jul 24, 2020

@andrewsykim Thanks, that really helped. We had 3 workers and all 3 were not showing the IP. I added exclude-nics=cali*,docker*,tun* to /etc/vmware-tools/tools.conf and restarted open-vm-tools.service. I also restarted the CPI pods, and the IPs are showing now.
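
For others hitting the same vSphere symptom, a sketch of the tools.conf change described above; the [guestinfo] section is where open-vm-tools reads exclude-nics, but verify against your open-vm-tools version:

# /etc/vmware-tools/tools.conf
[guestinfo]
exclude-nics=cali*,docker*,tun*

# then restart the service so the guest info (including IPs) is re-published
systemctl restart open-vm-tools.service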
