
Nodes with scheduling disabled should not be taken into consideration for LowNodeUtilization #42

Closed
concaf opened this issue Nov 25, 2017 · 2 comments · Fixed by #45

@concaf (Contributor) commented Nov 25, 2017

I have the following nodes:

$ kubectl get nodes
NAME                           STATUS                     ROLES     AGE       VERSION
kubernetes-master              Ready,SchedulingDisabled   <none>    56m       v1.8.4-dirty
kubernetes-minion-group-5rrh   Ready                      <none>    56m       v1.8.4-dirty
kubernetes-minion-group-fb8c   Ready                      <none>    56m       v1.8.4-dirty
kubernetes-minion-group-t1r3   Ready,SchedulingDisabled   <none>    56m       v1.8.4-dirty
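
For context, the SchedulingDisabled status above means spec.unschedulable is set to true on the node, which is what a cordon does; the exact command used is not shown here, but it would be something like:

$ kubectl cordon kubernetes-minion-group-t1r3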

The worker node kubernetes-minion-group-t1r3 was cordoned and marked as unschedulable; however, it still fulfilled the criteria for an underutilized node according to the following policy file:

apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  "LowNodeUtilization":
     enabled: true
     params:
       nodeResourceUtilizationThresholds:
         thresholds:  # any node below the following percentages is considered underutilized
           "cpu" : 40
           "memory": 40
           "pods": 40
         targetThresholds: # any node above the following percentages is considered overutilized
           "cpu" : 30
           "memory": 2
           "pods": 1

When I ran the descheduler, kubernetes-minion-group-t1r3 (the cordoned node) was taken into account and marked as underutilized. Multiple pods were then evicted from other nodes in the hope that the scheduler would place them on kubernetes-minion-group-t1r3, but that never happened because the node was cordoned.

Does it make sense to not take a cordoned node into consideration while looking for underutilized nodes?
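
For illustration, the kind of check I have in mind is roughly the following. This is only a rough sketch using the upstream v1.Node type; it is not the descheduler's actual code, and the package and helper names are made up:

package nodeutil

import (
	v1 "k8s.io/api/core/v1"
)

// filterSchedulableNodes drops cordoned nodes (spec.unschedulable == true) so
// they are never treated as underutilized targets for rescheduling.
// Hypothetical helper, shown only to illustrate the suggestion above.
func filterSchedulableNodes(nodes []*v1.Node) []*v1.Node {
	schedulable := make([]*v1.Node, 0, len(nodes))
	for _, node := range nodes {
		if node.Spec.Unschedulable {
			// The scheduler will never place evicted pods on a cordoned node,
			// so evicting pods from other nodes "for" it is pointless.
			continue
		}
		schedulable = append(schedulable, node)
	}
	return schedulable
}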

I ran the descheduler as follows:

$ _output/bin/descheduler --kubeconfig-file /var/run/kubernetes/admin.kubeconfig --policy-config-file examples/custom.yaml -v 5 
I1125 18:58:46.014381    2813 reflector.go:198] Starting reflector *v1.Node (1h0m0s) from github.com/kubernetes-incubator/descheduler/pkg/descheduler/node/node.go:83
I1125 18:58:46.016167    2813 node.go:50] node lister returned empty list, now fetch directly
I1125 18:58:46.017010    2813 reflector.go:236] Listing and watching *v1.Node from github.com/kubernetes-incubator/descheduler/pkg/descheduler/node/node.go:83
I1125 18:58:47.834184    2813 lownodeutilization.go:115] Node "kubernetes-master" usage: api.ResourceThresholds{"cpu":95, "memory":11.575631035804197, "pods":8.181818181818182}
I1125 18:58:47.834986    2813 lownodeutilization.go:115] Node "kubernetes-minion-group-5rrh" usage: api.ResourceThresholds{"cpu":90.5, "memory":6.932161992316314, "pods":17.272727272727273}
I1125 18:58:47.835701    2813 lownodeutilization.go:115] Node "kubernetes-minion-group-fb8c" usage: api.ResourceThresholds{"cpu":96.5, "memory":14.0975556030657, "pods":17.272727272727273}
I1125 18:58:47.835783    2813 lownodeutilization.go:115] Node "kubernetes-minion-group-t1r3" usage: api.ResourceThresholds{"cpu":10, "memory":2.764226588836412, "pods":1.8181818181818181}
I1125 18:58:47.835819    2813 lownodeutilization.go:163] evicting pods from node "kubernetes-minion-group-fb8c" with usage: api.ResourceThresholds{"cpu":96.5, "memory":14.0975556030657, "pods":17.272727272727273}
I1125 18:58:48.096681    2813 lownodeutilization.go:194] Evicted pod: "database-6f97f65956-6pxp5" (<nil>)
I1125 18:58:48.098323    2813 lownodeutilization.go:208] updated node usage: api.ResourceThresholds{"cpu":91.5, "memory":14.0975556030657, "pods":16.363636363636363}
I1125 18:58:48.361411    2813 lownodeutilization.go:194] Evicted pod: "wordpress-57f4bb46bf-g27k6" (<nil>)
I1125 18:58:48.361522    2813 lownodeutilization.go:208] updated node usage: api.ResourceThresholds{"cpu":86.5, "memory":14.0975556030657, "pods":15.454545454545455}
I1125 18:58:48.623304    2813 lownodeutilization.go:194] Evicted pod: "wordpress-57f4bb46bf-m62cm" (<nil>)
I1125 18:58:48.623330    2813 lownodeutilization.go:208] updated node usage: api.ResourceThresholds{"cpu":81.5, "memory":14.0975556030657, "pods":14.545454545454547}
I1125 18:58:48.894712    2813 lownodeutilization.go:194] Evicted pod: "wordpress-57f4bb46bf-mblx7" (<nil>)
I1125 18:58:48.894832    2813 lownodeutilization.go:208] updated node usage: api.ResourceThresholds{"cpu":76.5, "memory":14.0975556030657, "pods":13.636363636363638}
I1125 18:58:48.894991    2813 lownodeutilization.go:163] evicting pods from node "kubernetes-master" with usage: api.ResourceThresholds{"cpu":95, "memory":11.575631035804197, "pods":8.181818181818182}
I1125 18:58:48.895063    2813 lownodeutilization.go:163] evicting pods from node "kubernetes-minion-group-5rrh" with usage: api.ResourceThresholds{"cpu":90.5, "memory":6.932161992316314, "pods":17.272727272727273}
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"8+", GitVersion:"v1.8.4-dirty", GitCommit:"9befc2b8928a9426501d3bf62f72849d5cbcd5a3", GitTreeState:"dirty", BuildDate:"2017-11-25T12:04:44Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"8+", GitVersion:"v1.8.4-dirty", GitCommit:"9befc2b8928a9426501d3bf62f72849d5cbcd5a3", GitTreeState:"dirty", BuildDate:"2017-11-25T11:54:10Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
@concaf (Contributor, Author) commented Nov 25, 2017

@aveshagarwal I found some commented-out code that fixes this issue (#43).
Was there any reason that the related code was commented out? Would you rather have the logic somewhere else?

@aveshagarwal (Contributor) commented:
> Does it make sense to not take a cordoned node into consideration while looking for underutilized nodes?

Yes, I agree; that's what I explained in #43 (comment).
