I'm using Kured on Azure with an ACS Engine-generated cluster, and I can see that nodes are being drained and refilled, but it looks like they are not being rebooted.
For example, a reboot-required flag was set at 23:43 on April 13th for node k8s-agents-27478824-4:
$ ls -al
...
-rw-r--r-- 1 root root 0 Apr 13 23:43 reboot-required
...
And I can see Kured triggering, draining and refilling nodes with pods:
$ kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
cassandra cassandra-cassandra-0 1/1 Running 1 3d 10.30.0.37 k8s-agents-27478824-3
cassandra cassandra-cassandra-1 0/1 Pending 0 6s
...
Sadly, this happens EVERY hour without fail. Digging into it, the cause appears to be that the nodes are never actually rebooted:
$ last reboot
reboot system boot 4.13.0-1011-azur Fri Apr 13 23:17 still running
reboot system boot 4.13.0-1011-azur Sun Apr 8 19:21 still running
(Note that the last reboot time is before the timestamp of the reboot-required file.)
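The same comparison can be done directly on the node. The sketch below is a hypothetical diagnostic, not part of Kured itself; it assumes the Ubuntu/Debian default sentinel path and a Linux /proc filesystem, deriving the boot time from /proc/uptime and comparing it with the sentinel's mtime:

```shell
# Hypothetical diagnostic: did this node boot *after* the reboot-required
# flag appeared? Compares the sentinel's modification time (Ubuntu's
# default path) against the boot time derived from /proc/uptime.
sentinel=/var/run/reboot-required
boot_epoch=$(( $(date +%s) - $(cut -d. -f1 /proc/uptime) ))
if [ ! -f "$sentinel" ]; then
  echo "no reboot required"
elif [ "$boot_epoch" -gt "$(stat -c %Y "$sentinel")" ]; then
  echo "rebooted since flag was set"
else
  echo "reboot still pending"
fi
```

In the situation above this would print "reboot still pending": the boot recorded by `last reboot` (23:17) predates the flag (23:43).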
Is there something I need to do with Kured in order to tell it how to reboot nodes etc.? Or is this a bug?
Using the latest image version master-b27aaa1 resolves the issue. As described in #14, this comes from the kubectl version drift between server and client; master-b27aaa1 bundles kubectl version 1.9.6.
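Kubernetes only supports a kubectl client within one minor version of the API server, which is why the bundled 1.9.6 client fixes the drain behaviour here. A small helper (hypothetical, not part of Kured) to flag that kind of skew from two version strings:

```shell
# Hypothetical skew check: compare the minor numbers of a client and a
# server version string, and fail when they differ by more than one
# minor release (the officially supported skew for kubectl).
skew_ok() {
  c_minor=$(echo "$1" | cut -d. -f2)
  s_minor=$(echo "$2" | cut -d. -f2)
  diff=$(( c_minor - s_minor ))
  [ "${diff#-}" -le 1 ]   # ${diff#-} strips a leading minus: absolute value
}
skew_ok "1.9.6" "1.9.7" && echo "skew OK" || echo "skew too large"
```

Running `kubectl version` against the cluster and feeding the reported client and server versions to a check like this would have surfaced the mismatch before the hourly drain loop started.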