kubelet refuses to admit critical static pod #80203
Comments
workaround: |
/sig node |
|
Thanks for the reminder. I checked, and PodPriority=true is enabled for the kubelet, controller, apiserver, and scheduler;
also we can see
It seems that the creation steps for a static pod are
So if the node is under resource pressure, a static pod fails at step 2 and never gets a chance to have its Priority value populated. For non-static pods, everything works fine because step 3 happens before step 2. |
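The ordering problem described in this comment can be sketched as a small simulation. This is only an illustration under stated assumptions: the `pod` type and the functions `resolvePriority`, `admit`, `startStaticPod`, and `startAPIPod` are hypothetical stand-ins, not kubelet code. Step 2 is taken to be kubelet admission and step 3 the point where the priority gets populated, as the discussion in this thread implies.

```go
package main

import (
	"errors"
	"fmt"
)

// Threshold at or above which a pod counts as system critical.
const systemCriticalPriority int32 = 2000000000

// pod is a hypothetical, trimmed-down stand-in for a pod object.
type pod struct {
	Name     string
	Priority *int32 // nil until the priority class has been resolved
}

// resolvePriority stands in for the priority being populated when the
// pod object is created (here: the value of system-node-critical).
func resolvePriority(p *pod) {
	v := int32(2000001000)
	p.Priority = &v
}

// admit stands in for kubelet admission on a node under resource
// pressure: only pods that already carry a critical priority pass.
func admit(p *pod) error {
	if p.Priority != nil && *p.Priority >= systemCriticalPriority {
		return nil
	}
	return errors.New("rejected: node is under resource pressure")
}

// startStaticPod admits first; the priority would only be populated
// afterwards, so a rejection means it is never populated.
func startStaticPod(p *pod) error {
	if err := admit(p); err != nil {
		return err
	}
	resolvePriority(p)
	return nil
}

// startAPIPod has its priority populated before the kubelet admits it.
func startAPIPod(p *pod) error {
	resolvePriority(p)
	return admit(p)
}

func main() {
	fmt.Println("static pod:", startStaticPod(&pod{Name: "static"}))
	fmt.Println("regular pod:", startAPIPod(&pod{Name: "regular"}))
}
```

In this sketch the static pod is rejected while the regular pod is admitted, even though both are meant to use the same priority class.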
Priority is populated when a pod object is created, but as you noted, the problem is that a pod object (mirror pod) for a static pod is created after the actual pod is created on the kubelet. One option to solve this problem is to add logic to Kubelet to resolve PriorityClass for static pods. @dchen1107 Who would be able to help with this issue? |
The issue is a regression caused by #79554. In this case, we should revert the PR until kube-proxy has properly moved to a DaemonSet. |
I chatted with @dchen1107 offline. She and SIG node will discuss the solution to add code to Kubelet to resolve PriorityClassName to the numeric value of priority for static pods. In the meantime, we are reverting #79554 until this issue is resolved. |
I was thinking about the following change to CreateMirrorPod of pkg/kubelet/pod/mirror_client.go
|
Hi @tedyu, thanks for hacking on that, but I don't think this change can resolve the issue. As stated in my previous comment #80203 (comment), the priority must be assigned before the kubelet admit phase, that is, before the Admit handlers are called at step 2, but your change happens at step 3. Besides, the critical pod check is based on the priority, not the priority class name. One more question: why do you assign the Priority value to PriorityClassName? They are two different things: the priority is a numeric value that indicates the pod's priority, and PriorityClassName is the name of the priority class we choose to populate that value from. To avoid confusion and keep the pod spec API consistent, we'd better not mix them. Furthermore, @bsalamat, I personally believe a better solution for this issue is not only focusing on |
Thanks for the feedback. I restored it to the first form. I will keep what you said in mind. |
We are going to discuss this issue at today's SIG Node meeting. @yuchengwu can you join us today? |
Static pods and mirror pods are error-prone, and SIG Node has long wanted to deprecate the feature or limit its usage to self-hosted Kubernetes. Meanwhile, we understand that many K8s production environments already depend on static pods for some critical pods, from before DaemonSets existed and for other practical reasons. Assuming all static pods are critical and should be admitted and scheduled by the kubelet even when the node is under resource pressure, I proposed granting static pods the critical priority class and the highest priority number. Again, we discourage casual users from using static pods; if you have to use them, please be cautious with resource capacity planning on the node. We discussed this at today's SIG Node meeting, and @bsalamat attended too. I proposed the above, and the community agreed. |
Thanks, @dchen1107! As Dawn wrote above, the plan is to consider all static pods critical, regardless of their priority classes. After this change is made to the kubelet, we can merge #80342 again to remove the critical pod annotation. |
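A minimal sketch of what "consider all static pods critical" could look like follows. It is not the actual kubelet change. The `pod` type and both functions are hypothetical, and the sketch assumes that a pod's origin can be read from a config-source annotation whose value is "api" for pods that come from the API server.

```go
package main

import "fmt"

const (
	// Assumed annotation recording where the kubelet got the pod from.
	configSourceAnnotationKey = "kubernetes.io/config.source"
	// Assumed annotation value for pods received from the API server.
	apiserverSource = "api"
	// Threshold at or above which a priority counts as system critical.
	systemCriticalPriority int32 = 2000000000
)

// pod is a hypothetical, trimmed-down stand-in for a pod object.
type pod struct {
	Annotations map[string]string
	Priority    *int32
}

// isStaticPod reports whether the pod came from a source other than
// the API server (for example a manifest file on the node).
func isStaticPod(p pod) bool {
	source, ok := p.Annotations[configSourceAnnotationKey]
	return ok && source != apiserverSource
}

// isCriticalPod treats every static pod as critical, regardless of its
// priority class, and falls back to the numeric priority otherwise.
func isCriticalPod(p pod) bool {
	if isStaticPod(p) {
		return true
	}
	return p.Priority != nil && *p.Priority >= systemCriticalPriority
}

func main() {
	static := pod{Annotations: map[string]string{configSourceAnnotationKey: "file"}}
	regular := pod{Annotations: map[string]string{configSourceAnnotationKey: apiserverSource}}
	fmt.Println(isCriticalPod(static), isCriticalPod(regular))
}
```

With a check of this shape, a static pod no longer depends on its Priority field being populated before admission.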
Thanks @dchen1107, I'd have been happy to join the meeting, though it was already over.
Well, this is our situation.
I am fine with this change. One question: does that mean a static pod's .spec.priorityClassName will only be configured to |
@dixudx: Closing this issue. In response to this:
What happened:
In our cluster, we have configured our system pods as critical static pods by applying the system-node-critical priorityClassName. But these pods cannot get admitted by the kubelet when the node is under memory pressure.
What you expected to happen:
A critical static pod should always be scheduled and admitted, even when the node is under resource pressure.
How to reproduce it (as minimally and precisely as possible):
--pod-manifest-path argument
Anything else we need to know?:
By design, a critical pod should be admitted even when the node is under pressure: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/eviction/eviction_manager.go#L138-L140
So why did the kubelet reject the pod? I looked into the IsCriticalPod func:
kubernetes/pkg/kubelet/types/pod_update.go
Lines 142 to 147 in c30f024
Apparently the kubelet decides whether a pod is critical based on .pod.Spec.Priority. As I remember, .pod.Spec.Priority is populated automatically if .pod.Spec.PriorityClassName is specified. But when is it populated: before or after the kubelet admit phase? If it is the former, then everything makes sense; if it is the latter, then the issue may be caused by something else, e.g. incorrect configuration.
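To make the dependency on the numeric priority concrete, here is a simplified sketch of a priority-based critical check. It is not the code from pod_update.go: `podSpec` and `isCriticalPod` are hypothetical stand-ins, and the threshold is assumed to be 2000000000, the value of the built-in system-cluster-critical class.

```go
package main

import "fmt"

// Threshold at or above which a priority counts as system critical.
const systemCriticalPriority int32 = 2000000000

// podSpec is a hypothetical, trimmed-down stand-in for a pod spec.
type podSpec struct {
	PriorityClassName string
	Priority          *int32 // nil until the class name is resolved
}

// isCriticalPod looks only at the numeric priority; the class name on
// its own has no effect on the result.
func isCriticalPod(spec podSpec) bool {
	return spec.Priority != nil && *spec.Priority >= systemCriticalPriority
}

func main() {
	// Class name set, but the numeric priority was never populated.
	unresolved := podSpec{PriorityClassName: "system-node-critical"}
	fmt.Println(isCriticalPod(unresolved))

	// Same class name with its numeric value populated.
	value := int32(2000001000)
	resolved := podSpec{PriorityClassName: "system-node-critical", Priority: &value}
	fmt.Println(isCriticalPod(resolved))
}
```

A pod that carries only the class name is not treated as critical here, which is why the timing of the priority population matters.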
To confirm that, I added several lines to print the pod's Priority field and found that Priority was actually not populated before admission, which was surprising. Since our product's K8s differs from master, I used minikube to start a fresh cluster to see whether this had been fixed. The steps are
Evidently, the kubelet failed to admit the critical pod, probably because the static pod's priority value is not set at the right point, which needs to be earlier than the kubelet admit phase.
Environment:
kubectl version
cat /etc/os-release
uname -a