Awkward pod status message for admission errors due to node-pressure condition #112605

Closed
mimowo opened this issue Sep 20, 2022 · 5 comments · Fixed by #112644
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/backlog Higher priority than priority/awaiting-more-evidence. sig/node Categorizes an issue or PR as relevant to SIG Node. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@mimowo
Contributor

mimowo commented Sep 20, 2022

What happened?

When Pod admission fails due to a node-pressure condition, the Pod's status.message looks like this:
"Pod The node had condition: [DiskPressure]."
The "Pod The node..." prefix reads awkwardly.

What did you expect to happen?

A message like: "The node had condition: [DiskPressure].".

How can we reproduce it (as minimally and precisely as possible)?

  1. Run a Job with a Pod on a single-node cluster while the node is under disk pressure (the node is then tainted with node.kubernetes.io/disk-pressure:NoSchedule). The Pod gets stuck in the Pending phase.
  2. Remove the taint from the node with kubectl: kubectl taint node $nodeName node.kubernetes.io/disk-pressure:NoSchedule-
  3. The Pod is scheduled, but admission by the kubelet fails because the taint is re-added shortly after its manual removal.

Then, the status of the Pod contains the awkward message.

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
1.25

Cloud provider

Reproducible on kind

OS version

# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

@mimowo mimowo added the kind/bug Categorizes issue or PR as related to a bug. label Sep 20, 2022
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Sep 20, 2022
@SergeyKanzhelev
Member

/sig node
/triage accepted
/priority backlog

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/backlog Higher priority than priority/awaiting-more-evidence. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Sep 20, 2022
@SergeyKanzhelev SergeyKanzhelev added this to Triage in SIG Node Bugs Sep 20, 2022
@SergeyKanzhelev SergeyKanzhelev moved this from Triage to Triaged in SIG Node Bugs Sep 20, 2022
@vitorfhc
Contributor

The first part (Pod The node had condition: [DiskPressure]) is returned here: eviction_manager.go.

I wonder where the "Pod" is being prepended. This is my first attempt at contributing, so I am a little bit lost.

@mimowo
Contributor Author

mimowo commented Sep 21, 2022

> The first part (Pod The node had condition: [DiskPressure]) is returned here: eviction_manager.go.
>
> I wonder where the "Pod" is being prepended. This is my first attempt at contributing, so I am a little bit lost.

I think this line is a likely candidate for adding the prefix (but I haven't checked):

Message: "Pod " + message})

I would suggest you try to reproduce the issue locally first, then experiment with modifying the code.

@vitorfhc
Contributor

/assign

@vitorfhc
Contributor

@mimowo thanks, I've managed to reproduce the issue and used Delve to confirm which code is responsible.

I opened a PR changing the message. If it's accepted, the message will become "Pod rejected: The node had condition: [DiskPressure]."
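
For readers following along, here is a minimal Go sketch of how the awkward message arises and how the prefix proposed above reads instead. It is not the kubelet code: `nodeConditionMessage` and `statusMessage` are hypothetical helpers, and the format string is an assumption inferred from the message quoted in this issue. Only the `"Pod " + message` concatenation comes from the thread.

```go
package main

import "fmt"

// nodeConditionMessage mimics the admission message quoted in this issue.
// Assumption: the format string is inferred from the quoted output,
// not copied from eviction_manager.go.
func nodeConditionMessage(conditions []string) string {
	return fmt.Sprintf("The node had condition: %v.", conditions)
}

// statusMessage is a hypothetical helper that prepends a prefix to the
// admission message, in the same way as `Message: "Pod " + message`.
func statusMessage(prefix, message string) string {
	return prefix + message
}

func main() {
	msg := nodeConditionMessage([]string{"DiskPressure"})

	// Current behaviour: the bare "Pod " prefix runs into the sentence.
	fmt.Println(statusMessage("Pod ", msg))

	// Prefix proposed in this thread.
	fmt.Println(statusMessage("Pod rejected: ", msg))
}
```

Because the admission message is already a complete sentence, any prefix has to be a self-contained phrase (or be dropped entirely, as the issue description originally suggested) for the combined text to read naturally.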

SIG Node Bugs automation moved this from Triaged to Done Sep 27, 2022

4 participants