
Pods do not get evicted while logs say "evicting pods from node" #39

Closed
concaf opened this issue Nov 23, 2017 · 6 comments

Comments

@concaf
Contributor

concaf commented Nov 23, 2017

So, if I understood correctly,

  • any node below the percentages in nodeResourceUtilizationThresholds.thresholds is considered underutilized
  • any node above the percentages in nodeResourceUtilizationThresholds.targetThresholds is considered overutilized
  • any node between the two sets of thresholds is considered appropriately utilized by the descheduler and is not taken into consideration (see the example sketched right below)
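
For example (the percentages below are picked arbitrarily, just to illustrate my reading), a policy like this should mark nodes below 20% as underutilized, nodes above 50% as overutilized, and leave everything in between alone:

apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  "LowNodeUtilization":
     enabled: true
     params:
       nodeResourceUtilizationThresholds:
         thresholds:  # nodes below these percentages would be underutilized
           "cpu" : 20
           "memory": 20
           "pods": 20
         targetThresholds: # nodes above these percentages would be overutilized
           "cpu" : 50
           "memory": 50
           "pods": 50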

If this is correct, the following happens -

I have 4 nodes, 1 master node and 3 worker nodes -

$ kubectl get nodes
NAME                           STATUS                     ROLES     AGE       VERSION
kubernetes-master              Ready,SchedulingDisabled   <none>    6h        v1.10.0-alpha.0.456+f85649c6cd2032-dirty
kubernetes-minion-group-1vp4   Ready                      <none>    6h        v1.10.0-alpha.0.456+f85649c6cd2032-dirty
kubernetes-minion-group-frgx   Ready                      <none>    6h        v1.10.0-alpha.0.456+f85649c6cd2032-dirty
kubernetes-minion-group-k7c7   Ready                      <none>    6h        v1.10.0-alpha.0.456+f85649c6cd2032-dirty

I tainted and then uncordoned node kubernetes-minion-group-1vp4, so there are no pods or other Kubernetes resources running on that node -

$ kubectl get all -o wide | grep kubernetes-minion-group-1vp4
$

and the allocated resources on this node are -

Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ------------  ----------  ---------------  -------------
  200m (10%)    0 (0%)      200Mi (2%)       300Mi (4%)

while on the other 2 worker nodes the allocated resources are -

--
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ------------  ----------  ---------------  -------------
  1896m (94%)   446m (22%)  1133952Ki (15%)  1441152Ki (19%)
--
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ------------  ----------  ---------------  -------------
  1840m (92%)   300m (15%)  1130Mi (15%)     1540Mi (21%)

So with the right DeschedulerPolicy, pods should have been descheduled from the nodes that are overutilized and scheduled onto the fresh node.

I wrote the following DeschedulerPolicy -

apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  "LowNodeUtilization":
     enabled: true
     params:
       nodeResourceUtilizationThresholds:
         thresholds:  # any node below the following percentages is considered underutilized
           "cpu" : 40
           "memory": 40
           "pods": 40
         targetThresholds: # any node above the following percentages is considered overutilized
           "cpu" : 30
           "memory": 2
           "pods": 1

I run the descheduler as follows -

$ _output/bin/descheduler --kubeconfig-file /var/run/kubernetes/admin.kubeconfig --policy-config-file examples/policy.yaml  -v 5             
I1123 22:12:27.298937    9381 reflector.go:198] Starting reflector *v1.Node (1h0m0s) from github.com/kubernetes-incubator/descheduler/pkg/descheduler/node/node.go:83
I1123 22:12:27.299080    9381 node.go:50] node lister returned empty list, now fetch directly
I1123 22:12:27.299230    9381 reflector.go:236] Listing and watching *v1.Node from github.com/kubernetes-incubator/descheduler/pkg/descheduler/node/node.go:83
I1123 22:12:31.596854    9381 lownodeutilization.go:115] Node "kubernetes-master" usage: api.ResourceThresholds{"cpu":95, "memory":11.575631035804197, "pods":8.181818181818182}
I1123 22:12:31.597019    9381 lownodeutilization.go:115] Node "kubernetes-minion-group-1vp4" usage: api.ResourceThresholds{"memory":2.764226588836412, "pods":1.8181818181818181, "cpu":10}
I1123 22:12:31.597508    9381 lownodeutilization.go:115] Node "kubernetes-minion-group-frgx" usage: api.ResourceThresholds{"cpu":94.8, "memory":15.305177094063607, "pods":16.363636363636363}
I1123 22:12:31.597910    9381 lownodeutilization.go:115] Node "kubernetes-minion-group-k7c7" usage: api.ResourceThresholds{"cpu":92, "memory":15.617880226925726, "pods":14.545454545454545}
I1123 22:12:31.597955    9381 lownodeutilization.go:163] evicting pods from node "kubernetes-minion-group-frgx" with usage: api.ResourceThresholds{"cpu":94.8, "memory":15.305177094063607, "pods":16.363636363636363}
I1123 22:12:31.597993    9381 lownodeutilization.go:163] evicting pods from node "kubernetes-minion-group-k7c7" with usage: api.ResourceThresholds{"cpu":92, "memory":15.617880226925726, "pods":14.545454545454545}
I1123 22:12:31.598017    9381 lownodeutilization.go:163] evicting pods from node "kubernetes-master" with usage: api.ResourceThresholds{"cpu":95, "memory":11.575631035804197, "pods":8.181818181818182}
$

It seems like the descheduler made the decision to evict pods from the overutilized nodes, but when I check the cluster, nothing on the old nodes was terminated and nothing popped up on the fresh node -

$ kubectl get all -o wide | grep kubernetes-minion-group-1vp4
$

What am I doing wrong? :(

@concaf
Contributor Author

concaf commented Nov 25, 2017

Interesting, this works fine for v1.8.4 -

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"8+", GitVersion:"v1.8.4-dirty", GitCommit:"9befc2b8928a9426501d3bf62f72849d5cbcd5a3", GitTreeState:"dirty", BuildDate:"2017-11-25T12:04:44Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"8+", GitVersion:"v1.8.4-dirty", GitCommit:"9befc2b8928a9426501d3bf62f72849d5cbcd5a3", GitTreeState:"dirty", BuildDate:"2017-11-25T11:54:10Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

@aveshagarwal
Contributor

So, if I understood correctly,

  • any node below the percentages in nodeResourceUtilizationThresholds.thresholds is considered underutilized
  • any node above the percentages in nodeResourceUtilizationThresholds.targetThresholds is considered overutilized
  • any node between the two sets of thresholds is considered appropriately utilized by the descheduler and is not taken into consideration

Your understanding is correct.

@aveshagarwal
Contributor

         thresholds:  # any node below the following percentages is considered underutilized
           "cpu" : 40
           "memory": 40
           "pods": 40
         targetThresholds: # any node above the following percentages is considered overutilized
           "cpu" : 30
           "memory": 2
           "pods": 1

This is weird/incorrect as the idea is to have targetThresholds >= thresholds.
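
For instance, keeping your thresholds at 40 and picking something higher for targetThresholds (say 70, an arbitrary number, just a sketch), the intended shape is:

         thresholds:  # lower bound: nodes below these percentages are underutilized
           "cpu" : 40
           "memory": 40
           "pods": 40
         targetThresholds: # upper bound: nodes above these percentages are candidates for pod eviction
           "cpu" : 70
           "memory": 70
           "pods": 70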

@concaf
Contributor Author

concaf commented Nov 28, 2017

This is weird/incorrect as the idea is to have targetThresholds >= thresholds.

@aveshagarwal yep, I was just playing around with the policy file.


This did not work for me when I deployed a cluster from Kubernetes master the other day, but when I switched to v1.8.4, the descheduler did evict the pods. Is this expected behavior? Which Kubernetes versions does the descheduler support today?

@ravisantoshgudimetla
Contributor

@containscafeine Are you still facing this issue? If not, can you please close this?

@ravisantoshgudimetla ravisantoshgudimetla added the kind/bug Categorizes issue or PR as related to a bug. label Jan 3, 2018
@ravisantoshgudimetla ravisantoshgudimetla added this to the release-0.4 milestone Jan 3, 2018
@ravisantoshgudimetla
Contributor

@containscafeine - Closing this as there has been no update. Feel free to reopen if you are still facing this issue.

damemi pushed a commit to damemi/descheduler that referenced this issue Sep 9, 2020