Skip to content
This repository has been archived by the owner on Feb 20, 2023. It is now read-only.

Commit

Permalink
Configure Graceful Node Shutdown and lengthen max inhibitor delay
Browse files Browse the repository at this point in the history
* Configure Kubelet Graceful Node Shutdown to detect system shutdown
events and stop running containers gracefully when possible
* Allow up to 30s for critical pods to gracefully shutdown
* Allow up to 15s for regular pods to gracefully shutdown
* Node will be marked as NotReady promptly, instead of having to
wait for health checks
* Kubelet uses systemd inhibitor locks to delay shutdown for a limited
number of seconds
* Raise the default max inhibitor time from 5s to 45s

Verify systemd inhibitor locks are present:

```
sudo systemd-inhibit --list
WHO     UID USER PID  COMM    WHAT     WHY                                        MODE
kubelet 0   root 4581 kubelet shutdown Kubelet needs time to handle node shutdown delay
```

Tail journal logs and then shutdown a node via systemctl reboot
or via the cloud console to watch container shutdown

Rel:

* https://kubernetes.io/blog/2021/04/21/graceful-node-shutdown-beta/
* https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/
* kubernetes/kubernetes#107043
* coreos/fedora-coreos-tracker#821
* https://www.freedesktop.org/software/systemd/man/systemd-inhibit.html
* https://github.com/kubernetes/kubernetes/blob/release-1.24/pkg/kubelet/nodeshutdown/nodeshutdown_manager_linux.go
* https://github.com/godbus/dbus/blob/master/conn.go
  • Loading branch information
dghubble committed Aug 28, 2022
1 parent 832bd22 commit 2509640
Show file tree
Hide file tree
Showing 2 changed files with 14 additions and 0 deletions.
7 changes: 7 additions & 0 deletions butane/controller.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -148,6 +148,8 @@ storage:
featureGates:
LocalStorageCapacityIsolationFSQuotaMonitoring: false
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
Expand Down Expand Up @@ -188,6 +190,11 @@ storage:
echo "Retry applying manifests"
sleep 5
done
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
mode: 0644
contents:
Expand Down
7 changes: 7 additions & 0 deletions workers/butane/worker.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -116,10 +116,17 @@ storage:
featureGates:
LocalStorageCapacityIsolationFSQuotaMonitoring: false
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
volumePluginDir: /var/lib/kubelet/volumeplugins
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
mode: 0644
contents:
Expand Down

0 comments on commit 2509640

Please sign in to comment.