Scheduler sometimes preempts unnecessary pods #70622
@bsalamat this is the issue I talked to you about earlier in Slack. I can reproduce it on 1.11 every time. On 1.12 it's not as easy to reproduce, but I do see the issue there as well. So I believe it's a bug that shows up in rare (racy) conditions.
I remember that you were able to reproduce this in your cluster that you brought up with kubeadm, but I was not able to reproduce it in a cluster brought up with "kube-up.sh". Have you tried bringing a cluster up with kube-up.sh to see if you can reproduce it?
@bsalamat by "kube-up.sh" do you mean "hack/local-up-cluster.sh"?
No, I mean kube-up.sh.
@bsalamat I don't have a paid GCE/AWS account lol, so I've never played with that. Right now I can easily reproduce it on v1.11.3 with hack/local-up-cluster.sh, and with kubeadm v1.11.x and kubeadm v1.12.x.
Hmm, it's odd that the issue is not reproducible when the cluster is not created by kubeadm.
What happened:
Sometimes the scheduler doesn't preempt pods along the exactly correct path: it preempts pods that should not be touched. The good news is that the final state is accurate; the pods that should be preempted are eventually preempted.
What you expected to happen:
The internal preemption process should also be exactly correct, so that no unnecessary preemptions are produced along the way.
How to reproduce it (as minimally and precisely as possible):
Step 0: To start from a clean environment in which no pods occupy CPU, I edited all existing workloads to remove their CPU requests/limits, so the node shows 0 CPU usage:
Step 1: Create 4 priority classes
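The exact PriorityClass definitions were not preserved in this report; a minimal sketch of four classes with increasing values (the names and values here are assumptions) could look like this:

```yaml
# Hypothetical priority classes; the original names/values were not included in the report.
# On v1.11/v1.12 clusters PriorityClass is served from the scheduling.k8s.io/v1beta1 API.
apiVersion: scheduling.k8s.io/v1beta1
kind: PriorityClass
metadata:
  name: priority-1
value: 1000
---
apiVersion: scheduling.k8s.io/v1beta1
kind: PriorityClass
metadata:
  name: priority-2
value: 2000
---
apiVersion: scheduling.k8s.io/v1beta1
kind: PriorityClass
metadata:
  name: priority-3
value: 3000
---
apiVersion: scheduling.k8s.io/v1beta1
kind: PriorityClass
metadata:
  name: priority-4
value: 4000
```

Applying this file once creates all four classes; the deployments below then reference them via priorityClassName.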
Step 2: Create priority{1,2,3}.yaml
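The contents of priority{1,2,3}.yaml were not included above. A sketch of what priority1.yaml might look like, assuming a pause container and per-pod CPU requests (replica counts and request sizes are illustrative and may not add up to exactly the 7800m mentioned below); priority2.yaml and priority3.yaml would differ only in the deployment name, labels, and priorityClassName:

```yaml
# Hypothetical deploy1 manifest; the original report's exact values were not preserved.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy1
spec:
  replicas: 3
  selector:
    matchLabels:
      app: deploy1
  template:
    metadata:
      labels:
        app: deploy1
    spec:
      priorityClassName: priority-1   # assumed class name from the sketch above
      containers:
      - name: pause
        image: k8s.gcr.io/pause:3.1
        resources:
          requests:
            cpu: 800m
```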
At this point, deploy1, deploy2, and deploy3 occupy 7800m CPU in total:
Step 3: Create a high priority deployment4 to see how preemption works
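The deployment4 manifest is also not shown above. A sketch, assuming it uses the highest of the four priority classes and requests more CPU than remains free, so the scheduler must preempt lower-priority pods:

```yaml
# Hypothetical deploy4 manifest; CPU request and replica count are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy4
spec:
  replicas: 3
  selector:
    matchLabels:
      app: deploy4
  template:
    metadata:
      labels:
        app: deploy4
    spec:
      priorityClassName: priority-4   # assumed highest-priority class
      containers:
      - name: pause
        image: k8s.gcr.io/pause:3.1
        resources:
          requests:
            cpu: 2000m
```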
The expected result is that pods in deploy1 and deploy2 become pending, pods in deploy3 are NOT touched, and finally pods in deploy3 and deploy4 are running.
But it turns out that's not the case; see the detailed log.
Anything else we need to know?:
It's easy to reproduce in a multi-node environment (kubeadm), but not as easy in a single-node environment (hack/local-up-cluster.sh).
Environment:
- Kubernetes version (use kubectl version): v1.11.3, v1.12.1, and master branch
- Kernel (e.g. uname -a):

/kind bug
/sig scheduling