One node loses private and external IP address #68270

Closed
shahbour opened this issue Sep 5, 2018 · 32 comments
Labels
area/cloudprovider  kind/bug  sig/cloud-provider  sig/node

Comments

@shahbour

shahbour commented Sep 5, 2018

Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug
/sig node

What happened:
One of my nodes, engine02, loses its internal and external IP after some time:

NAME                          STATUS    ROLES     AGE       VERSION   INTERNAL-IP      EXTERNAL-IP      OS-IMAGE                KERNEL-VERSION               CONTAINER-RUNTIME
engine01                      Ready     <none>    60d       v1.11.2   192.168.70.230   192.168.70.230   CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1
engine02                      Ready     <none>    60d       v1.11.2   <none>           <none>           CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1
engine03                      Ready     <none>    60d       v1.11.2   172.16.71.11     172.16.71.11     CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1
kube-master                   Ready     master    90d       v1.11.2   192.168.70.232   192.168.70.232   CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1

What you expected to happen:

(⎈ |production:default)➜  ~ kubectl get node -o wide
NAME                          STATUS                     ROLES     AGE       VERSION   INTERNAL-IP      EXTERNAL-IP      OS-IMAGE                KERNEL-VERSION               CONTAINER-RUNTIME
engine01                      Ready                      <none>    59d       v1.11.2   192.168.70.230   192.168.70.230   CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1
engine02                      Ready,SchedulingDisabled   <none>    59d       v1.11.2   192.168.70.231   192.168.70.231   CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1
engine03                      Ready                      <none>    59d       v1.11.2   172.16.71.11     172.16.71.11     CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1
kube-master                   Ready                      master    89d       v1.11.2   192.168.70.232   192.168.70.232   CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1

How to reproduce it (as minimally and precisely as possible):
If I restart the engine02 node, the IPs show up for some time, then after a while they just disappear. I found this while debugging an issue.

Anything else we need to know?:
I did not find anything in the logs, but I did not know exactly which log I should search in.
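
For anyone hitting the same symptom, two quick checks narrow down where to look: the address list the API server currently holds for the node, and the kubelet journal on that node. A minimal sketch; the node name comes from the output above, and journalctl assumes a systemd-managed kubelet:

# Addresses recorded in the node's status (missing entries reproduce the <none> above)
kubectl get node engine02 -o jsonpath='{.status.addresses}'

# Search the kubelet journal for node-status update problems
journalctl -u kubelet --no-pager | grep -i "node status"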

Environment:

  • Kubernetes version (use kubectl version):
➜ kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.1", GitCommit:"b1b29978270dc22fecc592ac55d903350454310a", GitTreeState:"clean", BuildDate:"2018-07-17T18:53:20Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", BuildDate:"2018-08-07T23:08:19Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
    Vsphere
  • OS (e.g. from /etc/os-release):
➜  ~ cat /etc/os-release 
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
@k8s-ci-robot added kind/bug, needs-sig, sig/node and removed needs-sig labels Sep 5, 2018
@stieler-it

stieler-it commented Sep 5, 2018

We are already discussing this problem, but I think this might be the correct place for it. What we currently believe we know:

  • Nodes lose the IP information in their status (internalIP and externalIP)
  • The information can come back after some time, and comes back immediately when kubelet is restarted
  • Happens with different cloud providers (so far OpenStack, Azure, vSphere)
  • Seems to be a regression in K8S 1.11.x; it does not happen in 1.10.5
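
Because the addresses come and go, a periodic check across all nodes makes the window easier to catch. A sketch assuming plain kubectl access; adjust the interval as needed:

# Print each node's name and its InternalIP every minute (the field is empty while the address is lost)
while true; do
  date
  kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}'
  sleep 60
done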

Related issues:

@shahbour
Author

shahbour commented Sep 5, 2018

Some notes:
It started with version 1.11.0, as you can see in the Weave issue above.

(⎈ |production:kube-system)➜  ~ kubectl get node -o wide
NAME                          STATUS                     ROLES     AGE       VERSION   INTERNAL-IP      EXTERNAL-IP      OS-IMAGE                KERNEL-VERSION               CONTAINER-RUNTIME
engine01                      Ready,SchedulingDisabled   <none>    59d       v1.11.2   192.168.70.230   192.168.70.230   CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1
engine02                      Ready                      <none>    59d       v1.11.0   <none>           <none>           CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1
engine03                      Ready                      <none>    59d       v1.11.2   172.16.71.11     172.16.71.11     CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1
kube-master                   Ready                      master    89d       v1.11.2   192.168.70.232   192.168.70.232   CentOS Linux 7 (Core)   3.10.0-693.21.1.el7.x86_64   docker://1.13.1

I do have the vSphere cloud provider; I installed it on our own vSphere using kubeadm.

If you need any logs, I would be happy to share them.

@stieler-it

stieler-it commented Sep 5, 2018

Do you have any connection errors when grep'ing kubelet logs for "node status"?

@shahbour
Author

shahbour commented Sep 5, 2018

Yes, the log is full of that. Please note that I restarted kubelet:

2018-09-05 07:12 systemctl restart kubelet.service

➜  ~ grep "node status" /var/log/messages
Sep  3 07:35:28 engine02 kubelet: E0903 07:35:28.899230   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?resourceVersion=0&timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:35:28 engine02 kubelet: E0903 07:35:28.900518   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:35:28 engine02 kubelet: E0903 07:35:28.902115   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:35:28 engine02 kubelet: E0903 07:35:28.903700   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:35:28 engine02 kubelet: E0903 07:35:28.904867   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:35:28 engine02 kubelet: E0903 07:35:28.904916   27639 kubelet_node_status.go:379] Unable to update node status: update node status exceeds retry count
Sep  3 07:35:48 engine02 kubelet: E0903 07:35:48.906432   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?resourceVersion=0&timeout=10s: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Sep  3 07:35:58 engine02 kubelet: E0903 07:35:58.907013   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Sep  3 07:36:08 engine02 kubelet: E0903 07:36:08.907768   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Sep  3 07:36:08 engine02 kubelet: E0903 07:36:08.908883   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:08 engine02 kubelet: E0903 07:36:08.909872   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:08 engine02 kubelet: E0903 07:36:08.909926   27639 kubelet_node_status.go:379] Unable to update node status: update node status exceeds retry count
Sep  3 07:36:18 engine02 kubelet: E0903 07:36:18.911107   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?resourceVersion=0&timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:18 engine02 kubelet: E0903 07:36:18.912322   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:18 engine02 kubelet: E0903 07:36:18.913246   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:18 engine02 kubelet: E0903 07:36:18.914313   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:18 engine02 kubelet: E0903 07:36:18.915218   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:18 engine02 kubelet: E0903 07:36:18.915278   27639 kubelet_node_status.go:379] Unable to update node status: update node status exceeds retry count
Sep  3 07:36:28 engine02 kubelet: E0903 07:36:28.916545   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?resourceVersion=0&timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:28 engine02 kubelet: E0903 07:36:28.917366   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:28 engine02 kubelet: E0903 07:36:28.918103   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:28 engine02 kubelet: E0903 07:36:28.918837   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:28 engine02 kubelet: E0903 07:36:28.919612   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:28 engine02 kubelet: E0903 07:36:28.919686   27639 kubelet_node_status.go:379] Unable to update node status: update node status exceeds retry count
Sep  3 07:36:38 engine02 kubelet: E0903 07:36:38.920985   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?resourceVersion=0&timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:38 engine02 kubelet: E0903 07:36:38.921887   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:38 engine02 kubelet: E0903 07:36:38.922680   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:38 engine02 kubelet: E0903 07:36:38.923490   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:38 engine02 kubelet: E0903 07:36:38.924282   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:38 engine02 kubelet: E0903 07:36:38.924358   27639 kubelet_node_status.go:379] Unable to update node status: update node status exceeds retry count
Sep  3 07:36:48 engine02 kubelet: E0903 07:36:48.925726   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?resourceVersion=0&timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:48 engine02 kubelet: E0903 07:36:48.926672   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:48 engine02 kubelet: E0903 07:36:48.927563   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:48 engine02 kubelet: E0903 07:36:48.928365   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:48 engine02 kubelet: E0903 07:36:48.929118   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: dial tcp 192.168.70.232:6443: connect: connection refused
Sep  3 07:36:48 engine02 kubelet: E0903 07:36:48.929150   27639 kubelet_node_status.go:379] Unable to update node status: update node status exceeds retry count
Sep  3 07:37:08 engine02 kubelet: E0903 07:37:08.930270   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?resourceVersion=0&timeout=10s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Sep  3 07:37:18 engine02 kubelet: E0903 07:37:18.930970   27639 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?timeout=10s: context deadline exceeded
Sep  3 08:09:28 engine02 kubelet: E0903 08:09:28.865697   27639 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0.vmdk\"" failed. No retries permitted until 2018-09-03 08:09:30.865628329 +0000 UTC m=+4628926.006948413 (durationBeforeRetry 2s). Error: "Volume not attached according to node status for volume \"pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0.vmdk\") pod \"rabbitmq-rabbitmq-ha-1\" (UID: \"ad1e687c-af50-11e8-b7be-0050568166d0\") "
Sep  3 08:09:30 engine02 kubelet: E0903 08:09:30.881016   27639 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0.vmdk\"" failed. No retries permitted until 2018-09-03 08:09:34.880955415 +0000 UTC m=+4628930.022275387 (durationBeforeRetry 4s). Error: "Volume not attached according to node status for volume \"pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0.vmdk\") pod \"rabbitmq-rabbitmq-ha-1\" (UID: \"ad1e687c-af50-11e8-b7be-0050568166d0\") "
Sep  3 08:09:55 engine02 kubelet: E0903 08:09:55.775875   27639 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0.vmdk\"" failed. No retries permitted until 2018-09-03 08:10:03.775785447 +0000 UTC m=+4628958.917105455 (durationBeforeRetry 8s). Error: "Volume not attached according to node status for volume \"pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0.vmdk\") pod \"rabbitmq-rabbitmq-ha-1\" (UID: \"ad1e687c-af50-11e8-b7be-0050568166d0\") "
Sep  3 08:10:03 engine02 kubelet: E0903 08:10:03.787976   27639 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0.vmdk\"" failed. No retries permitted until 2018-09-03 08:10:19.787874552 +0000 UTC m=+4628974.929194625 (durationBeforeRetry 16s). Error: "Volume not attached according to node status for volume \"pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-ac6a1d7d-6a2c-11e8-87de-0050568166d0.vmdk\") pod \"rabbitmq-rabbitmq-ha-1\" (UID: \"ad1e687c-af50-11e8-b7be-0050568166d0\") "
Sep  4 11:56:34 engine02 kubelet: E0904 11:56:34.438862     804 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?resourceVersion=0&timeout=10s: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Sep  4 11:56:38 engine02 kubelet: E0904 11:56:38.462093     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 11:56:46.462032212 +0000 UTC m=+4086.955907754 (durationBeforeRetry 8s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-vvnn8\" (UID: \"820b5be9-b039-11e8-b7be-0050568166d0\") "
Sep  4 11:57:04 engine02 kubelet: E0904 11:57:04.398858     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 11:57:20.398806711 +0000 UTC m=+4120.892682230 (durationBeforeRetry 16s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-vvnn8\" (UID: \"820b5be9-b039-11e8-b7be-0050568166d0\") "
Sep  4 11:57:34 engine02 kubelet: E0904 11:57:34.470714     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 11:58:06.470673606 +0000 UTC m=+4166.964549132 (durationBeforeRetry 32s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-vvnn8\" (UID: \"820b5be9-b039-11e8-b7be-0050568166d0\") "
Sep  4 11:58:06 engine02 kubelet: E0904 11:58:06.550371     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 11:59:10.550304065 +0000 UTC m=+4231.044179614 (durationBeforeRetry 1m4s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-vvnn8\" (UID: \"820b5be9-b039-11e8-b7be-0050568166d0\") "
Sep  4 11:59:14 engine02 kubelet: E0904 11:59:14.364514     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:01:16.364403626 +0000 UTC m=+4356.858279268 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-vvnn8\" (UID: \"820b5be9-b039-11e8-b7be-0050568166d0\") "
Sep  4 11:59:59 engine02 kubelet: W0904 11:59:59.952086     804 kubelet_node_status.go:1114] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s
Sep  4 12:01:02 engine02 kubelet: E0904 12:01:02.588554     804 kubelet_node_status.go:391] Error updating node status, will retry: failed to patch status "{\"status\":{\"$setElementOrder/conditions\":[{\"type\":\"OutOfDisk\"},{\"type\":\"MemoryPressure\"},{\"type\":\"DiskPressure\"},{\"type\":\"PIDPressure\"},{\"type\":\"Ready\"}],\"conditions\":[{\"lastHeartbeatTime\":\"2018-09-04T12:00:40Z\",\"type\":\"OutOfDisk\"},{\"lastHeartbeatTime\":\"2018-09-04T12:00:40Z\",\"type\":\"MemoryPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:00:40Z\",\"type\":\"DiskPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:00:40Z\",\"type\":\"PIDPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:00:40Z\",\"type\":\"Ready\"}]}}" for node "engine02": Patch https://192.168.70.232:6443/api/v1/nodes/engine02/status?timeout=10s: context deadline exceeded
Sep  4 12:01:16 engine02 kubelet: E0904 12:01:16.457135     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:03:18.457065655 +0000 UTC m=+4478.950941239 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:03:18 engine02 kubelet: E0904 12:03:18.528831     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:05:20.528766851 +0000 UTC m=+4601.022642408 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:05:20 engine02 kubelet: E0904 12:05:20.610568     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:07:22.610499519 +0000 UTC m=+4723.104375101 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:07:42 engine02 kubelet: E0904 12:07:42.787880     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:09:44.787797916 +0000 UTC m=+4865.281673428 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:09:44 engine02 kubelet: E0904 12:09:44.848776     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:11:46.848716767 +0000 UTC m=+4987.342592332 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:11:46 engine02 kubelet: E0904 12:11:46.938195     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:13:48.938134262 +0000 UTC m=+5109.432009858 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:13:48 engine02 kubelet: E0904 12:13:48.976105     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:15:50.976009545 +0000 UTC m=+5231.469885124 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:15:51 engine02 kubelet: E0904 12:15:51.079056     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:17:53.078988077 +0000 UTC m=+5353.572863672 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:17:53 engine02 kubelet: E0904 12:17:53.104947     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:19:55.104867527 +0000 UTC m=+5475.598743103 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:19:55 engine02 kubelet: E0904 12:19:55.142301     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:21:57.142196146 +0000 UTC m=+5597.636071728 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:21:57 engine02 kubelet: E0904 12:21:57.216295     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:23:59.216188034 +0000 UTC m=+5719.710063606 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:22:58 engine02 kubelet: E0904 12:22:58.236983     804 kubelet_node_status.go:391] Error updating node status, will retry: failed to patch status "{\"status\":{\"$setElementOrder/conditions\":[{\"type\":\"OutOfDisk\"},{\"type\":\"MemoryPressure\"},{\"type\":\"DiskPressure\"},{\"type\":\"PIDPressure\"},{\"type\":\"Ready\"}],\"conditions\":[{\"lastHeartbeatTime\":\"2018-09-04T12:22:35Z\",\"type\":\"OutOfDisk\"},{\"lastHeartbeatTime\":\"2018-09-04T12:22:35Z\",\"type\":\"MemoryPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:22:35Z\",\"type\":\"DiskPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:22:35Z\",\"type\":\"PIDPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:22:35Z\",\"type\":\"Ready\"}]}}" for node "engine02": Patch https://192.168.70.232:6443/api/v1/nodes/engine02/status?timeout=10s: context deadline exceeded
Sep  4 12:23:59 engine02 kubelet: E0904 12:23:59.307749     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:26:01.307682829 +0000 UTC m=+5841.801558371 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:26:01 engine02 kubelet: E0904 12:26:01.337923     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:28:03.337849408 +0000 UTC m=+5963.831724943 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:28:03 engine02 kubelet: E0904 12:28:03.916377     804 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "engine02": Get https://192.168.70.232:6443/api/v1/nodes/engine02?resourceVersion=0&timeout=10s: context deadline exceeded
Sep  4 12:28:03 engine02 kubelet: E0904 12:28:03.926029     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:30:05.925993854 +0000 UTC m=+6086.419869388 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:30:05 engine02 kubelet: E0904 12:30:05.967223     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:32:07.967150273 +0000 UTC m=+6208.461025812 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:30:36 engine02 kubelet: E0904 12:30:36.809110     804 kubelet_node_status.go:391] Error updating node status, will retry: failed to patch status "{\"status\":{\"$setElementOrder/conditions\":[{\"type\":\"OutOfDisk\"},{\"type\":\"MemoryPressure\"},{\"type\":\"DiskPressure\"},{\"type\":\"PIDPressure\"},{\"type\":\"Ready\"}],\"conditions\":[{\"lastHeartbeatTime\":\"2018-09-04T12:30:14Z\",\"type\":\"OutOfDisk\"},{\"lastHeartbeatTime\":\"2018-09-04T12:30:14Z\",\"type\":\"MemoryPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:30:14Z\",\"type\":\"DiskPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:30:14Z\",\"type\":\"PIDPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:30:14Z\",\"type\":\"Ready\"}]}}" for node "engine02": Patch https://192.168.70.232:6443/api/v1/nodes/engine02/status?timeout=10s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Sep  4 12:32:08 engine02 kubelet: E0904 12:32:08.033116     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:34:10.033039767 +0000 UTC m=+6330.526915349 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:33:09 engine02 kubelet: E0904 12:33:09.624995     804 kubelet_node_status.go:391] Error updating node status, will retry: failed to patch status "{\"status\":{\"$setElementOrder/conditions\":[{\"type\":\"OutOfDisk\"},{\"type\":\"MemoryPressure\"},{\"type\":\"DiskPressure\"},{\"type\":\"PIDPressure\"},{\"type\":\"Ready\"}],\"conditions\":[{\"lastHeartbeatTime\":\"2018-09-04T12:32:47Z\",\"type\":\"OutOfDisk\"},{\"lastHeartbeatTime\":\"2018-09-04T12:32:47Z\",\"type\":\"MemoryPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:32:47Z\",\"type\":\"DiskPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:32:47Z\",\"type\":\"PIDPressure\"},{\"lastHeartbeatTime\":\"2018-09-04T12:32:47Z\",\"type\":\"Ready\"}]}}" for node "engine02": Patch https://192.168.70.232:6443/api/v1/nodes/engine02/status?timeout=10s: context deadline exceeded
Sep  4 12:34:10 engine02 kubelet: E0904 12:34:10.109208     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:36:12.109124111 +0000 UTC m=+6452.602999713 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:36:12 engine02 kubelet: E0904 12:36:12.155042     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:38:14.154960161 +0000 UTC m=+6574.648835750 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:38:14 engine02 kubelet: E0904 12:38:14.392556     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:40:16.39245309 +0000 UTC m=+6696.886328618 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 12:40:16 engine02 kubelet: E0904 12:40:16.435047     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\"" failed. No retries permitted until 2018-09-04 12:42:18.434977275 +0000 UTC m=+6818.928852849 (durationBeforeRetry 2m2s). Error: "Volume not attached according to node status for volume \"vsphere-es-pv\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubernetes/elasticsearch\") pod \"elasticsearch-logging-749d8bc9bc-8jlqk\" (UID: \"f6472567-b039-11e8-b7be-0050568166d0\") "
Sep  4 14:32:24 engine02 kubelet: E0904 14:32:24.080894     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-03005cea-6a2c-11e8-87de-0050568166d0.vmdk\"" failed. No retries permitted until 2018-09-04 14:32:25.080834994 +0000 UTC m=+13425.574710508 (durationBeforeRetry 1s). Error: "Volume not attached according to node status for volume \"pvc-03005cea-6a2c-11e8-87de-0050568166d0\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-03005cea-6a2c-11e8-87de-0050568166d0.vmdk\") pod \"rabbitmq-rabbitmq-ha-0\" (UID: \"5672f576-b04f-11e8-b7be-0050568166d0\") "
Sep  4 14:32:25 engine02 kubelet: E0904 14:32:25.186936     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-03005cea-6a2c-11e8-87de-0050568166d0.vmdk\"" failed. No retries permitted until 2018-09-04 14:32:27.186869203 +0000 UTC m=+13427.680744731 (durationBeforeRetry 2s). Error: "Volume not attached according to node status for volume \"pvc-03005cea-6a2c-11e8-87de-0050568166d0\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-03005cea-6a2c-11e8-87de-0050568166d0.vmdk\") pod \"rabbitmq-rabbitmq-ha-0\" (UID: \"5672f576-b04f-11e8-b7be-0050568166d0\") "
Sep  4 14:32:27 engine02 kubelet: E0904 14:32:27.200035     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-03005cea-6a2c-11e8-87de-0050568166d0.vmdk\"" failed. No retries permitted until 2018-09-04 14:32:31.199965396 +0000 UTC m=+13431.693841018 (durationBeforeRetry 4s). Error: "Volume not attached according to node status for volume \"pvc-03005cea-6a2c-11e8-87de-0050568166d0\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-03005cea-6a2c-11e8-87de-0050568166d0.vmdk\") pod \"rabbitmq-rabbitmq-ha-0\" (UID: \"5672f576-b04f-11e8-b7be-0050568166d0\") "
Sep  4 14:32:52 engine02 kubelet: E0904 14:32:52.913975     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-03005cea-6a2c-11e8-87de-0050568166d0.vmdk\"" failed. No retries permitted until 2018-09-04 14:33:00.913881446 +0000 UTC m=+13461.407756984 (durationBeforeRetry 8s). Error: "Volume not attached according to node status for volume \"pvc-03005cea-6a2c-11e8-87de-0050568166d0\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-03005cea-6a2c-11e8-87de-0050568166d0.vmdk\") pod \"rabbitmq-rabbitmq-ha-0\" (UID: \"5672f576-b04f-11e8-b7be-0050568166d0\") "
Sep  4 14:33:00 engine02 kubelet: E0904 14:33:00.948623     804 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-03005cea-6a2c-11e8-87de-0050568166d0.vmdk\"" failed. No retries permitted until 2018-09-04 14:33:16.948498532 +0000 UTC m=+13477.442374068 (durationBeforeRetry 16s). Error: "Volume not attached according to node status for volume \"pvc-03005cea-6a2c-11e8-87de-0050568166d0\" (UniqueName: \"kubernetes.io/vsphere-volume/[3PAR_Datastore06] kubevols/kubernetes-dynamic-pvc-03005cea-6a2c-11e8-87de-0050568166d0.vmdk\") pod \"rabbitmq-rabbitmq-ha-0\" (UID: \"5672f576-b04f-11e8-b7be-0050568166d0\") "
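
Most of the entries above are failures to reach the API server at 192.168.70.232:6443 rather than address-specific errors; a quick check from the affected node separates connectivity problems from the missing-address problem. A sketch, with the endpoint taken from the log lines above:

# Is the API server reachable from engine02? Even a 401/403 response proves connectivity.
curl -k https://192.168.70.232:6443/healthz

# Any established connections from the node to the API server port?
ss -tnp | grep 6443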

@FengyunPan2
Contributor

/cc

@twittyc

twittyc commented Sep 6, 2018

Hey guys, we are getting this same problem with the OpenStack cloud provider.

cat kubelet-log.json |  grep -v "Volume not attached" | grep "node status"
{"log":"E0813 20:38:56.118941   18989 kubelet_node_status.go:391] Error updating node status, will retry: error getting node \"k8s-corp-prod-0-worker-us-corp-kc-8b-1\": Get https://127.0.0.1:6443/api/v1/nodes/k8s-corp-prod-0-worker-us-corp-kc-8b-1?resourceVersion=0\u0026timeout=10s: unexpected EOF\n","stream":"stderr","time":"2018-08-13T20:38:56.121316062Z"}
{"log":"E0813 23:50:34.361091   18989 kubelet_node_status.go:391] Error updating node status, will retry: error getting node \"k8s-corp-prod-0-worker-us-corp-kc-8b-1\": Get https://127.0.0.1:6443/api/v1/nodes/k8s-corp-prod-0-worker-us-corp-kc-8b-1?resourceVersion=0\u0026timeout=10s: unexpected EOF\n","stream":"stderr","time":"2018-08-13T23:50:34.361316592Z"}
{"log":"W0814 01:13:16.823637   18989 kubelet_node_status.go:1114] Failed to set some node status fields: failed to get node address from cloud provider: Timeout after 10s\n","stream":"stderr","time":"2018-08-14T01:13:16.823960043Z"}
{"log":"E0814 07:01:54.730030   18989 kubelet_node_status.go:391] Error updating node status, will retry: error getting node \"k8s-corp-prod-0-worker-us-corp-kc-8b-1\": Get https://127.0.0.1:6443/api/v1/nodes/k8s-corp-prod-0-worker-us-corp-kc-8b-1?resourceVersion=0\u0026timeout=10s: unexpected EOF\n","stream":"stderr","time":"2018-08-14T07:01:54.731184923Z"}

@wanghaoran1988
Contributor

We met this problem with the AWS cloud provider, and #65226 fixed it.

@openstacker

We met this problem with the AWS cloud provider, and #65226 fixed it.

Thanks for the feedback. So far, we know almost all cloud providers (AWS, Azure, OpenStack and vSphere) are affected by this bug.

@MaxDiOrio

I'm going to add my 2c to this one. I have had issues bringing up clusters, not where the IP goes missing, but where the internal/external IPs are set to the eth0 self-assigned IPv6 address, and the only way to fix it is to delete the node and recreate it.
rancher/rancher#16597

cloudboss pushed a commit to cloudboss/keights that referenced this issue Dec 17, 2018
This is a workaround for the bug described at
kubernetes/kubernetes#68270, where nodes
lose their IP address. A fix is supposed to be in 1.11.6, but this
should overcome the issue for the time being.
@stieler-it

Seems to be fixed in k8s v1.11.5

@mtougeron
Contributor

@stieler-it fyi, fwiw, we just encountered it today on v1.11.5 a little less than an hour ago.

@openstacker

It's merged into v1.11.6 not v1.11.5, just FYI.

@bosatsu

bosatsu commented Jan 2, 2019

New to kubernetes, but I believe I may be seeing this issue as well in v1.12.3. Let me know if there is any additional info I can provide to confirm or refute.

NAME            STATUS   ROLES    AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION               CONTAINER-RUNTIME
p3dk8sautoing   Ready    node     72m   v1.12.3   <none>          <none>        CentOS Linux 7 (Core)   3.10.0-862.14.4.el7.x86_64   docker://18.6.1
p3dk8sautomon   Ready    node     72m   v1.12.3   10.33.106.138   <none>        CentOS Linux 7 (Core)   3.10.0-862.14.4.el7.x86_64   docker://18.6.1
p3dk8sautomtr   Ready    master   73m   v1.12.3   <none>          <none>        CentOS Linux 7 (Core)   3.10.0-862.14.4.el7.x86_64   docker://18.6.1
p3dk8sautowkr   Ready    node     72m   v1.12.3   10.33.108.118   <none>        CentOS Linux 7 (Core)   3.10.0-862.14.4.el7.x86_64   docker://18.6.1
Jan 02 11:40:56 p3dk8sautomtr kubelet[10488]: I0102 11:40:56.885054   10488 setters.go:72] Using node IP: "10.33.88.37"
Jan 02 11:40:56 p3dk8sautomtr kubelet[10488]: W0102 11:40:56.885105   10488 kubelet_node_status.go:463] Failed to set some node status fields: failed to get node address from cloud provider that matches ip: 10.33.88.37
Jan 02 11:41:06 p3dk8sautomtr kubelet[10488]: I0102 11:41:06.900189   10488 setters.go:72] Using node IP: "10.33.88.37"
Jan 02 11:41:06 p3dk8sautomtr kubelet[10488]: W0102 11:41:06.900273   10488 kubelet_node_status.go:463] Failed to set some node status fields: failed to get node address from cloud provider that matches ip: 10.33.88.37
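
The warning above means the cloud provider did not return an address matching the kubelet's node IP (10.33.88.37) within the 10s timeout. Comparing what OpenStack itself reports for the instance with what the API server has recorded can help narrow it down. A sketch only; it assumes the Nova server name matches the node name and that OpenStack CLI credentials are available:

# Addresses OpenStack reports for the instance
openstack server show p3dk8sautomtr -c addresses

# Addresses currently recorded on the Kubernetes node object
kubectl describe node p3dk8sautomtr | grep -A5 Addresses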

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.1", GitCommit:"eec55b9ba98609a46fee712359c7b5b365bdd920", GitTreeState:"clean", BuildDate:"2018-12-13T10:39:04Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.3", GitCommit:"435f92c719f279a3a67808c80521ea17d5715c66", GitTreeState:"clean", BuildDate:"2018-11-26T12:46:57Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
Private Openstack cloud
  • OS (e.g. from /etc/os-release):
$ cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
  • Kernel (e.g. uname -a):
$ uname -a
Linux p3dk8sautomtr 3.10.0-862.14.4.el7.x86_64 #1 SMP Wed Sep 26 15:12:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools:
Kubespray release 2.8

@yogeshnath

We are running into the same issue. Does anyone see this in 1.13 as well?

@lcd2002

lcd2002 commented Mar 2, 2019

I'm having this issue with 1.13.1. One of the masters (HA configuration) lost the "addresses" object, where InternalIP, InternalDNS, and Hostname are listed. I restarted kubelet several times and even rebooted the machine, to no avail. Anybody know how to get it back?

@dassinion

I've got this issue on v1.14.1 as well. My cloud provider is AWS, and two nodes have lost their internal IP address. Restarting kubelet didn't fix the problem, and restarting the node didn't help either.
Only after a kubeadm reset and rejoining the cluster was my node back in the cluster.

@sfxworks

Tried everything in kubernetes/kubeadm#203 to no avail

@sfxworks

sfxworks commented Jun 19, 2019

Active in 1.14.3

NAME                  STATUS   ROLES    AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                       KERNEL-VERSION   CONTAINER-RUNTIME
k8s-internal-master   Ready    master   5m51s   v1.14.3   <none>        <none>        Debian GNU/Linux 9 (stretch)   4.9.0-9-amd64    containerd://1.2.6

Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.3", GitCommit:"5e53fd6bc17c0dec8434817e69b04a25d8ae0ff0", GitTreeState:"clean", BuildDate:"2019-06-06T01:44:30Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.3", GitCommit:"5e53fd6bc17c0dec8434817e69b04a25d8ae0ff0", GitTreeState:"clean", BuildDate:"2019-06-06T01:36:19Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider or hardware configuration:

Private Openstack cloud

OS (e.g. from /etc/os-release):

PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
NAME="Debian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

Kernel (e.g. uname -a):

Linux k8s-internal-master 4.9.0-9-amd64 #1 SMP Debian 4.9.168-1+deb9u2 (2019-05-13) x86_64 GNU/Linux

Install tools:
Kubeadm

apiVersion: kubeadm.k8s.io/v1beta1
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 198.18.2.118
  bindPort: 6443
nodeRegistration:
  name: k8s-internal-master
---
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
clusterName: imh-openstack
controllerManager:
  extraArgs:
    external-cloud-volume-plugin: openstack
networking:
  dnsDomain: demo.imh
  serviceSubnet: 10.244.0.0/16
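
If the two kubeadm objects above live in a single file, they need to be separated by a "---" document separator (as shown above); a dry run is a cheap way to confirm the file parses as intended. A sketch, assuming the config is saved as kubeadm-config.yaml:

# Parse and validate the config without changing the host
kubeadm init --config kubeadm-config.yaml --dry-run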

@andrewsykim
Member

Sorry folks, thought this was fixed in #65226 but based on recent reports looks like it isn't. Added this on the SIG Cloud Provider backlog here kubernetes/cloud-provider#37 and will prioritize for v1.16.

@andrewsykim
Member

andrewsykim commented Jun 19, 2019

/sig cloud-provider
/area cloudprovider

@k8s-ci-robot added area/cloudprovider and sig/cloud-provider labels Jun 19, 2019
@andrewsykim
Member

Spoke with @sfxworks on Slack a few weeks ago; it turns out his cluster was misconfigured. Are folks still running into this issue, or was it resolved in #65226?

@stieler-it

We haven't seen this issue for a very long time now.

@bashofmann

We also have not seen this issue anymore since 1.11.6.

@andrewsykim
Member

Thanks folks! Closing for now, please re-open if you can reproduce

/close

@k8s-ci-robot
Contributor

@andrewsykim: Closing this issue.

In response to this:

Thanks folks! Closing for now, please re-open if you can reproduce

/close


@mohideen

mohideen commented Jul 9, 2020

I'm seeing this issue in v1.18.5. We have a 6-node (3 control-plane, 3 worker) HA cluster, and this happens after a restart of the nodes. Also, I see that the order of restarts affects which node loses the IPs. I want to note that the loss of IP happened even if the node was drained before the restart.

Of the three control-plane nodes, whichever gets restarted last loses its internal and external IPs.

For example, if cn1, cn2, and cn3 are my control plane nodes:

  1. If I restart cn1, it loses its IPs when it comes back online.
  2. Then if I restart cn2, it loses its IPs when it comes back online, but after the cn2 restart cn1 gets its IPs back.
  3. If I restart cn1, cn2, and cn3 at the exact same moment, all the nodes have their IPs right.

Environment:

  • Baremetal
  • RHEL 7
  • Kubeadm provisioned HA with stacked etcd
  • vSphere cloud provider

/open

@andrewsykim
Member

@mohideen what's your CNI?

@mohideen

mohideen commented Jul 9, 2020

We have the latest canal (Calico v3.15.0 + Flannel v0.11.0 in host-gw mode)

@AcidAngel21

We have exactly the same problem in v1.18.3 and v1.18.6.

Environment:
Rancher v2.4.5
Rancher OS 1.5.1
RKE provisioned HA with stacked etcd
vSphere cloud provider

@AcidAngel21

For me it only happens when the following is true. Tested with K8s v1.14.6, 1.17.9 and 1.18.6

It does not happen when the flag is set to --cloud-provider=vsphere
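
For reference, a minimal sketch of how the in-tree vSphere provider is typically wired into the kubelet when that flag is used. --cloud-provider and --cloud-config are existing kubelet flags, but the environment-file path and cloud-config location below are assumptions for this environment:

# e.g. in the kubelet's environment file or systemd drop-in (path assumed)
KUBELET_EXTRA_ARGS=--cloud-provider=vsphere --cloud-config=/etc/kubernetes/vsphere.conf

systemctl restart kubelet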

@andrewsykim
Member

@AcidAngel21 your issue is likely related to kubernetes/cloud-provider-vsphere#338

@johnr84

johnr84 commented Jul 24, 2020

@andrewsykim Thanks, that really helped. We had 3 workers and all 3 were not showing the IP. I added exclude-nics=cali*,docker*,tun* to /etc/vmware-tools/tools.conf and restarted open-vm-tools.service. I also restarted the CPI pods, and the IPs are showing now.
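
For others hitting the same vSphere symptom, a sketch of the tools.conf change described above; the [guestinfo] section is where open-vm-tools reads exclude-nics, but verify against your open-vm-tools version:

# /etc/vmware-tools/tools.conf
[guestinfo]
exclude-nics=cali*,docker*,tun*

# then restart the service so the guest info (including IPs) is re-published
systemctl restart open-vm-tools.service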
