Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kubelet restart cause running pod restart with UnexpectedAdmissionError when pods have initContainers and external devices like GPU #124345

Open
zwpaper opened this issue Apr 17, 2024 · 11 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/node Categorizes an issue or PR as relevant to SIG Node. triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@zwpaper
Copy link
Member

zwpaper commented Apr 17, 2024

What happened?

When restarting kubelet, it will restart the running pods with UnexpectedAdmissionError when pods' initContainers and containers both use external devices like GPU

What did you expect to happen?

Restart kubelet should not cause running pods to restart

How can we reproduce it (as minimally and precisely as possible)?

  1. create some pods with nvidia.com/gpu and initContainers which also use GPUs
  2. wait until initContainers exit
  3. restart kubelet

Anything else we need to know?

there was an issue and a fix for running pods with devices, but it looks like the initContainers is not counted as should skip containers.

the fix has cherry-picked to v1.25.16

Related Issues:

Kubernetes version

$ kubectl version
# 1.25.16

Cloud provider

OS version

# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

@zwpaper zwpaper added the kind/bug Categorizes issue or PR as related to a bug. label Apr 17, 2024
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 17, 2024
@k8s-ci-robot
Copy link
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@zwpaper
Copy link
Member Author

zwpaper commented Apr 17, 2024

/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Apr 17, 2024
@ffromani
Copy link
Contributor

/cc

@ffromani
Copy link
Contributor

my 2c: completed containers, aka containers terminated succesfully and not deleted, like init containers or containers belonging to Jobs, should NOT be restarted or even retry admission when kubelet is restarted. The reason is, well, these containers already completed succesfully.

@ffromani
Copy link
Contributor

quoting https://kubernetes.io/docs/concepts/workloads/pods/init-containers/#detailed-behavior:

A Pod cannot be Ready until all init containers have succeeded. The ports on an init container are not aggregated under a Service. A Pod that is initializing is in the Pending state but should have a condition Initialized set to false.

If the Pod [restarts](https://kubernetes.io/docs/concepts/workloads/pods/init-containers/#pod-restart-reasons), or is restarted, all init containers must execute again.

and

Because init containers can be restarted, retried, or re-executed, init container code should be idempotent. In particular, code that writes to files on EmptyDirs should be prepared for the possibility that an output file already exists.

which ties to #123980

@SergeyKanzhelev SergeyKanzhelev added this to Triage in SIG Node Bugs Apr 17, 2024
@ffromani
Copy link
Contributor

xref: #117955

@zwpaper
Copy link
Member Author

zwpaper commented Apr 20, 2024

my 2c: completed containers, aka containers terminated succesfully and not deleted, like init containers or containers belonging to Jobs, should NOT be restarted or even retry admission when kubelet is restarted. The reason is, well, these containers already completed succesfully.

yes! that's why I created this issue, and this issue is more focus on the init-containers, the completed init-containers cause running pods to be restarted.

this fixed doesn't help because the pod is still running: a2ca66d

I was thinking about, can we add a check here to skip allocateDevice if containers is expected stopped, looks like my pods was killed here:

https://github.com/ffromani/kubernetes/blob/7e3638982acebe901b34f3d9bab4f4e4c5d703c9/pkg/kubelet/cm/devicemanager/manager.go#L811

@swatisehgal
Copy link
Contributor

/cc

@AnishShah
Copy link
Contributor

@zwpaper can you try this on a newer k8s version? 1.25 is out of support. Can you also share a pod spec to reproduce this error? Why is the pod failing with an admission error if the device is still available?

@SergeyKanzhelev
Copy link
Member

/cc

@AnishShah AnishShah moved this from Triage to Needs Information in SIG Node Bugs Apr 24, 2024
@SergeyKanzhelev
Copy link
Member

/triage needs-information

@k8s-ci-robot k8s-ci-robot added the triage/needs-information Indicates an issue needs more information in order to work on it. label Apr 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/node Categorizes an issue or PR as relevant to SIG Node. triage/needs-information Indicates an issue needs more information in order to work on it.
Projects
SIG Node Bugs
Needs Information
Development

No branches or pull requests

6 participants