Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update graceful node shutdown docs for beta #26963

Merged

Conversation

bobbypage
Copy link
Member

@bobbypage bobbypage commented Mar 9, 2021

/sig node

Enhancement Issue - kubernetes/enhancements#2000
Promotion of graceful node shutdown to beta - kubernetes/kubernetes#99735

@k8s-ci-robot k8s-ci-robot added this to the 1.21 milestone Mar 9, 2021
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 9, 2021
@k8sio-netlify-preview-bot
Copy link
Collaborator

k8sio-netlify-preview-bot commented Mar 9, 2021

Deploy preview for kubernetes-io-vnext-staging processing.

Building with commit b222a13

https://app.netlify.com/sites/kubernetes-io-vnext-staging/deploys/605d33764045290008b53a37

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. language/en Issues or PRs related to English language labels Mar 9, 2021
@k8s-ci-robot k8s-ci-robot added the sig/docs Categorizes an issue or PR as relevant to SIG Docs. label Mar 9, 2021
@reylejano
Copy link
Member

/assign @mvortizr

@yvespp
Copy link

yvespp commented Mar 9, 2021

@bobbypage how do I know if this is working?
I did a shutdown -h now on the node but in the kubelet log I don't see that the kubelet is stopping pods beforehand. I even fails to connect to containerd:

Mar 09 14:06:17 kubelab-worker-e488b83d3ace kubelet[13965]: I0309 14:06:17.737104   13965 operation_generator.go:594] MountVolume.MountDevice succeeded for volume "pvc-...
Mar 09 14:07:14 kubelab-worker-e488b83d3ace kubelet[13965]: W0309 14:07:14.006744   13965 clientconn.go:1223] grpc: addrConn.createTransport failed to connect to {/run/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial unix /run/containerd/containerd.sock:>
Mar 09 14:07:14 kubelab-worker-e488b83d3ace kubelet[13965]: W0309 14:07:14.006778   13965 clientconn.go:1223] grpc: addrConn.createTransport failed to connect to {/run/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial unix /run/containerd/containerd.sock:>
Mar 09 14:07:14 kubelab-worker-e488b83d3ace systemd[1]: Stopping kubelet: The Kubernetes Node Agent...
Mar 09 14:07:14 kubelab-worker-e488b83d3ace kubelet[13965]: I0309 14:07:14.010731   13965 dynamic_cafile_content.go:182] Shutting down client-ca-bundle::/etc/kubernetes/pki/ca.crt
Mar 09 14:07:14 kubelab-worker-e488b83d3ace systemd[1]: kubelet.service: Succeeded.
Mar 09 14:07:14 kubelab-worker-e488b83d3ace systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
-- Reboot --
Mar 09 14:08:08 kubelab-worker-e488b83d3ace systemd[1]: Started kubelet: The Kubernetes Node Agent.

Also I can see that the containers get killed by systemd (I have set KillMode=mixed for containerd):

Mar 09 14:07:14 kubelab-worker-e488b83d3ace systemd[1]: containerd.service: Killing process 10820 (java) with signal SIGKILL.
Mar 09 14:07:14 kubelab-worker-e488b83d3ace systemd[1]: containerd.service: Killing process 10832 (java) with signal SIGKILL.
Mar 09 14:07:14 kubelab-worker-e488b83d3ace systemd[1]: containerd.service: Killing process 10835 (java) with signal SIGKILL.
Mar 09 14:07:14 kubelab-worker-e488b83d3ace systemd[1]: containerd.service: Killing process 10900 (java) with signal SIGKILL.

This is with version 1.20.4 with this config:

featureGates:
  GracefulNodeShutdown: true
shutdownGracePeriod: 30s
shutdownGracePeriodCriticalPods: 10s

Also tried setting --feature-gates=GracefulNodeShutdown=true in the kubelet flags.

I can see that the inhibitor is registered:

root@kubelab-worker-e488b83d3ace:~# systemd-inhibit --list
WHO     UID USER PID COMM    WHAT     WHY                                        MODE
kubelet 0   root 945 kubelet shutdown Kubelet needs time to handle node shutdown delay

1 inhibitors listed.

I can open a separate issue if you want to...

@yvespp
Copy link

yvespp commented Mar 9, 2021

Ok, if I do a systemctl poweroff instead of shutdown -h now it works.
However "Shut Down Guest OS" via VMware talks to the open-vm-tools which in turn executes a shutdown -h now: https://github.com/vmware/open-vm-tools/blob/master/open-vm-tools/lib/system/systemLinux.c#L325
I'll check if shutdown -h now can execute a different systemctl command...

This is on Ubuntu 20.04.2

@bobbypage
Copy link
Member Author

Hi @yvespp

Good to hear you got it working.

systemd inhibitors don't delay the shutdown if you execute shutdown -h now as root xref. systemd will respect the inhibit delay if the shutdown is triggered by a power press, what systemd calls HandlePowerKey (which can be result of an ACPI shutdown event detected by logind from reading /dev/input/...). If you want to test it via cli, I also found doing a scheduled shutdown, i.e. shutdown -h +1, will trigger the inhibit delay.

@yvespp
Copy link

yvespp commented Mar 10, 2021

@bobbypage VMWare seems to have no way to shut down the vm via power button/ACPI events. It either shuts down the vm via vm-tools or basically just pulls the power.
What worked is replacing /usr/sbin/shutdown whit a custom script and then use "Shut Down Guest OS" via VMware. Script:

#!/bin/bash
exec systemctl poweroff

I looks like there are some features in future versions of systemd that might improve the situation:

However I noticed that the pods on the node get deleted but they start immediately again and then get killed by systemd when to os shuts down... This can't be the intention, right?
Non daemon set pods the the status.reason: Shutdown but that doesn't seem to stop Kubernetes to start the pod on the same node again...

@bobbypage
Copy link
Member Author

@yvespp thanks for trying it out. Do you want maybe open a separate github issue so we can debug further and keep this PR focused on the doc changes? Thanks!

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Mar 16, 2021
@bobbypage bobbypage marked this pull request as ready for review March 16, 2021 05:08
@bobbypage bobbypage changed the title [WIP] Update graceful node shutdown docs for beta Update graceful node shutdown docs for beta Mar 16, 2021
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 16, 2021
@bobbypage
Copy link
Member Author

Updated the PR with update to the docs, this should be ready to review now.

@reylejano
Copy link
Member

@kubernetes/sig-node-pr-reviews Can you provide a review?

Copy link
Contributor

@sftim sftim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some feedback. I'd be willing to merge without seeing that addressed, but I'd definitely prefer to have those changes considered.

Kubelet ensures that pods follow the normal [pod termination process](/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination) during the node shutdown.

When the `GracefulNodeShutdown` feature gate is enabled, kubelet uses [systemd inhibitor locks](https://www.freedesktop.org/wiki/Software/systemd/inhibit/) to delay the node shutdown with a given duration. During a shutdown kubelet terminates pods in two phases:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still mention that feature gate, but tell readers that it is enabled by default. Not every cluster opts in to every beta feature.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated with a note regarding feature gate, please take a look.

Comment on lines 361 to 362
Note that by default, both `ShutdownGracePeriod` and `ShutdownGracePeriodCriticalPods` are set to zero, thus disabling Graceful Node Shutdown functionality.
To enable the feature, the two kubelet config settings should be configured appropriately and set to non-zero values.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please consider moving this much earlier in the page - it looks important!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, moved this section earlier.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 18, 2021
@sftim
Copy link
Contributor

sftim commented Mar 23, 2021

@bobbypage the PR description mentions that it is a:

Placeholder PR for Graceful Node Shutdown Beta docs

If this is ready for review, I recommend you reword that so it's clear.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Mar 25, 2021
@bobbypage
Copy link
Member Author

Thanks @sftim this is ready for review, I updated the PR description and addressed your comments.

@tengqm
Copy link
Contributor

tengqm commented Mar 25, 2021

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tengqm

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 25, 2021
@reylejano
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 25, 2021
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 25, 2021
@reylejano
Copy link
Member

/lgtm cancel

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 25, 2021
@reylejano
Copy link
Member

@bobbypage Please rebase

@sftim
Copy link
Contributor

sftim commented Mar 26, 2021

LGTM if rebased the “obvious” way.

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 26, 2021
@bobbypage
Copy link
Member Author

bobbypage commented Mar 26, 2021

Rebased, thanks for the review! PTAL again, @sftim @reylejano

@reylejano
Copy link
Member

Hi @caesarxuchao @dchen1107 @SergeyKanzhelev @karan @mrunalp @kubernetes/sig-node-pr-reviews , please provide a technical review (tech lgtm) for this PR by March 31 to get this into the release. Thank you!

@sftim
Copy link
Contributor

sftim commented Mar 26, 2021

I'm happy to LGTM this. We can revert if there's been some kind of grave error here!

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 26, 2021
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 92cf8c32e8807a45156e2ddf470bae55efed9790

@k8s-ci-robot k8s-ci-robot merged commit 6abcc1c into kubernetes:dev-1.21 Mar 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. language/en Issues or PRs related to English language lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/docs Categorizes an issue or PR as relevant to SIG Docs. sig/node Categorizes an issue or PR as relevant to SIG Node. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants