Failure during upgrade due to short drain_grace_period and drain_timeout #1453

Closed
Abdelsalam-Abbas opened this issue Jul 16, 2017 · 1 comment

Abdelsalam-Abbas commented Jul 16, 2017

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
Bug Report

Environment:

  • Cloud provider or hardware configuration:
    azure

  • OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):
    Linux 4.4.0-81-generic x86_64

  • Version of Ansible (ansible --version):
    ansible 2.3.1.0
    config file = /home/devops/kargo/ansible.cfg
    configured module search path = [u'./library']
    python version = 2.7.12 (default, Nov 19 2016, 06:48:10) [GCC 5.4.0 20160609]

Kubespray version (commit) (git rev-parse --short HEAD):
02e0fb5

Network plugin used:
flannel

Command used to invoke ansible:
ansible-playbook -i contrib/azurerm/inventory -u devops -b -e "@inventory/group_vars/all.yml" -e "@inventory/group_vars/k8s-cluster.yml" upgrade-cluster.yml -e kube_version=v1.6.2

Output of ansible run:

....
TASK [upgrade/pre-upgrade : Cordon node] **************************************************************************************************************************
Sunday 16 July 2017  19:56:41 +0000 (0:00:00.125)       0:33:59.417 ***********
changed: [minion-2 -> None]

TASK [upgrade/pre-upgrade : Drain node] ***************************************************************************************************************************
Sunday 16 July 2017  19:56:42 +0000 (0:00:00.945)       0:34:00.363 ***********
fatal: [minion-2 -> None]: FAILED! => {"changed": true, "cmd": ["/usr/local/bin/kubectl", "drain", "--force", "--ignore-daemonsets", "--grace-period", "30", "--timeout", "40s", "--delete-local-data", "minion-2"], "delta": "0:00:43.120196", "end": "2017-07-16 19:57:26.411965", "failed": true, "rc": 1, "start": "2017-07-16 19:56:43.291769",
"stderr": "WARNING: Deleting pods not managed by ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet: flannel-minion-2, kube-proxy-minion-2, nginx-proxy-minion-2; Deleting pods with local storage: rook-api-2485995279-7j6vw; Ignoring DaemonSet-managed pods: rook-ceph-osd-09mhn\nWARNING: Deleting pods not managed by ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet: flannel-minion-2, kube-proxy-minion-2, nginx-proxy-minion-2; Deleting pods with local storage: rook-api-2485995279-7j6vw; Ignoring DaemonSet-managed pods: rook-ceph-osd-09mhn\nThere are pending pods when an error occurred: Drain did not complete within 40s\npod/wordpress-2634894193-f10kn\npod/kube-dns-2117142060-4ghsl\npod/rook-api-2485995279-7j6vw\npod/rook-ceph-mon2-x9xzn\nerror: Drain did not complete within 40s",
"stderr_lines": ["WARNING: Deleting pods not managed by ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet: flannel-minion-2, kube-proxy-minion-2, nginx-proxy-minion-2; Deleting pods with local storage: rook-api-2485995279-7j6vw; Ignoring DaemonSet-managed pods: rook-ceph-osd-09mhn", "WARNING: Deleting pods not managed by ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet: flannel-minion-2, kube-proxy-minion-2, nginx-proxy-minion-2; Deleting pods with local storage: rook-api-2485995279-7j6vw; Ignoring DaemonSet-managed pods: rook-ceph-osd-09mhn", "There are pending pods when an error occurred: Drain did not complete within 40s", "pod/wordpress-2634894193-f10kn", "pod/kube-dns-2117142060-4ghsl", "pod/rook-api-2485995279-7j6vw", "pod/rook-ceph-mon2-x9xzn", "error: Drain did not complete within 40s"],
"stdout": "node \"minion-2\" already cordoned", "stdout_lines": ["node \"minion-2\" already cordoned"]}
        to retry, use: --limit @/home/devops/kargo/upgrade-cluster.retry

Anything else do we need to know:

I would like to extend drain_timeout and drain_grace_period, as proposed in PR #1454.
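For anyone hitting the same failure: assuming the variables keep the names drain_grace_period and drain_timeout from that PR, they could be raised in the inventory group vars (the values below are only illustrative, not defaults):

    # inventory/group_vars/k8s-cluster.yml -- illustrative values
    drain_grace_period: 300   # seconds passed to kubectl drain --grace-period
    drain_timeout: 360s       # duration passed to kubectl drain --timeout

or overridden on the command line with -e drain_grace_period=300 -e drain_timeout=360s.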

ykfq commented Mar 29, 2019

Raising the timeouts for draining nodes makes no sense to me; I just ignored the error and continued running.
Add ignore_errors: yes under the task.
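Roughly like this, in the Drain node task of the upgrade/pre-upgrade role (a sketch only; the real task's command line and delegation may differ):

    - name: Drain node
      command: >-
        {{ bin_dir }}/kubectl drain
        --force
        --ignore-daemonsets
        --grace-period {{ drain_grace_period }}
        --timeout {{ drain_timeout }}
        --delete-local-data {{ inventory_hostname }}
      delegate_to: "{{ groups['kube-master'][0] }}"
      ignore_errors: yes   # keep upgrading even if the drain times out

With ignore_errors: yes the playbook keeps going when the drain times out, which is acceptable if you don't mind some pods being stopped without a clean eviction.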
