Pods that are terminating but force deleted must still be accounted for in admission #104824
/sig node
/triage accepted
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules.
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/lifecycle rotten
/remove-lifecycle rotten
/lifecycle stale
/remove-lifecycle stale
/lifecycle stale
/lifecycle rotten
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
/lifecycle frozen
This is an issue that can occur at any time, and batch workloads are the most likely to trigger it. Leaving it open for now until we decide whether static pods need to hold exclusive resources during graceful termination and can block admission (force deleting a regular pod probably means that you don't want to block admitting the replacement pod, but you still need to hold actual admission until the old pod is terminated).
There could also be security issues with force terminating a pod that is holding an exclusive resource, if the newly assigned device doesn't guarantee a reset on acquisition: when the terminating pod is responsible for returning something to the pool clean, and the acquirer naively assumes termination always runs to completion, the acquirer could see data left behind by the previous user. Not a serious issue, but a likely source of reuse bugs for most users.
This issue has not been updated in over 1 year, and should be re-triaged.
For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/
/remove-triage accepted
A pod can be created, allocated unique resources on a node (such as a volume, an isolated CPU, or a device), and then force deleted. Force deletion, however, happens independently of the state of the Kubelet - the pod could still be running or might not have completed termination (termination takes a non-zero amount of time from request to completion). A pod that is force deleted should still be visible to admission checks and other Kubelet loops until the pod is actually terminated.
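As a rough illustration of the undercounting this causes (all types and names below are invented for the sketch, not kubelet code), compare two views of the same node after a force delete:

```go
package main

import "fmt"

// pod models only the state relevant to the race described above.
// These fields are illustrative, not kubelet API.
type pod struct {
	name       string
	cpusInUse  int
	terminated bool // true once the kubelet has fully stopped the pod
}

// cpusInUse sums the exclusive CPUs still held by non-terminated pods.
func cpusInUse(pods map[string]*pod) int {
	total := 0
	for _, p := range pods {
		if !p.terminated {
			total += p.cpusInUse
		}
	}
	return total
}

func main() {
	// The API server's view: the pod object exists until deleted.
	apiPods := map[string]*pod{
		"batch-1": {name: "batch-1", cpusInUse: 2},
	}
	// The kubelet pod workers' view: the pod is tracked until
	// termination actually completes.
	workerPods := map[string]*pod{
		"batch-1": {name: "batch-1", cpusInUse: 2},
	}

	// Force delete: the object vanishes from the API immediately,
	// but the containers are still running and still hold their CPUs.
	delete(apiPods, "batch-1")

	// Admission that only consults the API view sees zero CPUs in use,
	// while the node is actually still committed to 2 CPUs.
	fmt.Println("CPUs in use (API view):", cpusInUse(apiPods))
	fmt.Println("CPUs in use (worker view):", cpusInUse(workerPods))
}
```

A new pod requesting those 2 CPUs would pass admission against the API view and then conflict with the still-running containers.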
There are two main admission loops - hard admission (which rejects the pod from the node immediately) and soft admission (which happens each time the pod is synced and might result in the pod being stopped temporarily) - which both need to take into account resources that are allocated but no longer in the list of "all pods".
Depending on the admission plugin, there may be different requirements about what information is necessary. For instance, if an admission plugin allocates resources only for running pods, then the plugin needs only pods that should or could have running containers (desired pods + terminating pods). If the admission plugin allocates resources for volumes or the entire pod, then the plugin needs to know (desired pods + non-terminal pods).
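The distinction between the two pod sets can be sketched as follows; the struct and function names are invented for illustration and are not the kubelet's actual types:

```go
package main

import "fmt"

// podState is an illustrative model, not a kubelet type.
type podState struct {
	name              string
	desired           bool // still present in the kubelet's desired config
	containersRunning bool // containers could still be running
	terminal          bool // fully terminal: containers stopped, resources released
}

// couldHaveRunningContainers is the set a plugin needs if it allocates
// resources only for running containers (desired pods + terminating pods).
func couldHaveRunningContainers(pods []podState) []podState {
	var out []podState
	for _, p := range pods {
		if p.desired || p.containersRunning {
			out = append(out, p)
		}
	}
	return out
}

// nonTerminal is the set a plugin needs if it allocates resources for
// volumes or the entire pod (desired pods + non-terminal pods).
func nonTerminal(pods []podState) []podState {
	var out []podState
	for _, p := range pods {
		if p.desired || !p.terminal {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	pods := []podState{
		// Force deleted, containers stopped, volumes not yet unmounted:
		// matters to pod-scoped plugins, not to container-scoped ones.
		{name: "unmounting", desired: false, containersRunning: false, terminal: false},
		// Force deleted, containers still running: matters to both.
		{name: "stopping", desired: false, containersRunning: true, terminal: false},
	}
	fmt.Println("could have running containers:", len(couldHaveRunningContainers(pods)))
	fmt.Println("non-terminal:", len(nonTerminal(pods)))
}
```

The "unmounting" pod is the case that separates the two sets: its containers are gone, so a CPU-manager-style plugin can ignore it, but its volume is still held, so a pod-scoped plugin must not.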
Currently, both use cases are handled identically, which means admission plugins often make direct calls to Kubelet manager functions, which can result in inconsistencies. Neither case includes pods that are force deleted, even though those pods may have running containers or be non-terminal.
Probable change
The current list of "all admitted pods" is served by GetActivePods() on the kubelet. This list is the set of observed desired configuration pods (every pod passed to HandlePodUpdates is added to the pod manager, regardless of admission status), which is filtered by:
However, this list needs to change to the union of all pods known to the pod manager and all pods known to the pod workers (which includes pods that have been force deleted), filtered by:
The OtherPods entry on the pod admission list would then correctly show pods still consuming resources. An admission plugin that depends only on running pods could then make calls to the pod worker via an appropriate method (or we could split OtherPods).
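The union of the pod manager's and pod workers' views described above could be sketched roughly like this; unionPods and the field names are invented for the sketch, not kubelet methods:

```go
package main

import "fmt"

// trackedPod is an illustrative stand-in for a pod known to the kubelet.
type trackedPod struct {
	uid  string
	name string
}

// unionPods merges the pod manager's view (observed desired config) with
// the pod workers' view (everything still terminating, including pods that
// were force deleted and so are absent from the manager), deduplicated by UID.
func unionPods(managerPods, workerPods []trackedPod) []trackedPod {
	seen := map[string]bool{}
	var out []trackedPod
	all := append(append([]trackedPod{}, managerPods...), workerPods...)
	for _, p := range all {
		if !seen[p.uid] {
			seen[p.uid] = true
			out = append(out, p)
		}
	}
	return out
}

func main() {
	manager := []trackedPod{{uid: "a", name: "web"}}
	workers := []trackedPod{
		{uid: "a", name: "web"},           // tracked by both views
		{uid: "b", name: "force-deleted"}, // gone from the manager, still terminating
	}
	for _, p := range unionPods(manager, workers) {
		fmt.Println(p.name)
	}
}
```

The force-deleted pod survives into the merged list solely because the pod workers still track it, which is exactly the property the admission filters above need.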