Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

“Safely Drain A Node” does not explain how to handle DaemonSet pods #39816

Open
liuwh08 opened this issue Mar 6, 2023 · 8 comments
Open
Labels
language/en Issues or PRs related to English language lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. priority/backlog Higher priority than priority/awaiting-more-evidence. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/docs Categorizes an issue or PR as relevant to SIG Docs.

Comments

@liuwh08
Copy link

liuwh08 commented Mar 6, 2023

The draining guide of draining explicitly ignores Daemonsets pods by

kubectl drain --ignore-daemonsets <node name>

Wonder why Daemonsets are safer to skip draining comparing to other pods from Deployment or Statefulsets (or maybe regular pods). There should be no difference compatability-wise about those pods to survive a drop-in kubelet MINOR upgrade.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Mar 6, 2023
@liuwh08
Copy link
Author

liuwh08 commented Mar 6, 2023

cc @liggitt

@utkarsh-singh1
Copy link
Contributor

utkarsh-singh1 commented Mar 7, 2023

@k8s-ci-robot k8s-ci-robot added language/en Issues or PRs related to English language sig/docs Categorizes an issue or PR as relevant to SIG Docs. labels Mar 7, 2023
@sftim
Copy link
Contributor

sftim commented Mar 8, 2023

If there are daemon set-managed pods, drain will not proceed without --ignore-daemonsets, and regardless it will not delete any daemon set-managed pods, because those pods would be immediately replaced by the daemon set controller, which ignores unschedulable markings.

from linked page https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#drain

Does that explain it?

@liggitt
Copy link
Member

liggitt commented Mar 8, 2023

DaemonSet pods are not safer to skip draining, they are just more difficult to drain effectively.

In the skew/upgrade doc, it indicates pods must be removed from a node before upgrading, because kubelet does not test or guarantee compatibility of on-disk structures between minor versions. That guidance applies to all types of pods, including daemonset pods. From an upgrade perspective, it doesn't especially care whether you use the kubectl drain command or not, just that the end result is there are no pods assigned to the node prior to upgrade.

In the page that describes running the drain command, it walks through steps to remove pods, but apparently does not provide instructions for effectively removing daemonset pods. That's the issue to fix (either by documenting effective steps or by improving the drain command or daemonset controller or both)

@sftim
Copy link
Contributor

sftim commented Mar 8, 2023

I think the kubectl drain docs imply that the subcommand doesn't drain daemonsets. Part of the fix could include an improvement to how we explain the subcommand.

The key question that Safely Drain A Node must answer is: How do we stop DaemonSets from replacing Pods after we remove them, and before we have finished maintenance?

We can document this, but it might be a bit of a pain: I think we'd need to look at labelling nodes and changing the scheduling rules for each DaemonSet.

@sftim
Copy link
Contributor

sftim commented Mar 8, 2023

/triage accepted
/sig cluster-lifecycle

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 8, 2023
@sftim
Copy link
Contributor

sftim commented Mar 8, 2023

/priority backlog

IMO: not quite important-longterm, but close

/lifecycle frozen

/retitle “Safely Drain A Node” does not explain how to handle DaemonSet pods

@k8s-ci-robot k8s-ci-robot changed the title Safely Drain a Node - About skipping Daemonsets “Safely Drain A Node” does not explain how to handle DaemonSet pods Mar 8, 2023
@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/backlog Higher priority than priority/awaiting-more-evidence. labels Mar 8, 2023
@k8s-triage-robot
Copy link

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. and removed triage/accepted Indicates an issue or PR is ready to be actively worked on. labels Mar 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
language/en Issues or PRs related to English language lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. priority/backlog Higher priority than priority/awaiting-more-evidence. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/docs Categorizes an issue or PR as relevant to SIG Docs.
Projects
None yet
Development

No branches or pull requests

6 participants