Description
Observed Behavior:
The following node sees consistent disruption of the client-production pod (output from `kubectl describe node`):
```
Non-terminated Pods:          (13 in total)
  Namespace          Name                                   CPU Requests  CPU Limits  Memory Requests  Memory Limits   Age
  ---------          ----                                   ------------  ----------  ---------------  -------------   ---
  client-production  client-pod-production-cc75cfdcf-6n989  60750m (94%)  0 (0%)      238216Mi (95%)   237780Mi (94%)  7m24s
  kube-system        daemonset1-pod-k2rgd                   0 (0%)        0 (0%)      0 (0%)           0 (0%)          48m
  kube-system        daemonset2-pod-xvgdb                   200m (0%)     0 (0%)      835Mi (0%)       835Mi (0%)      48m
  kube-system        daemonset3-pod-dwvlf                   165m (0%)     0 (0%)      612Mi (0%)       512Mi (0%)      48m
  kube-system        daemonset4-pod-zqwkq                   200m (0%)     0 (0%)      464Mi (0%)       464Mi (0%)      48m
  kube-system        daemonset5-pod-qgqtd                   25m (0%)      0 (0%)      0 (0%)           0 (0%)          48m
  kube-system        daemonset6-pod-dgtgl                   30m (0%)      0 (0%)      120Mi (0%)       768Mi (0%)      48m
  kube-system        daemonset7-pod-gfdbj                   10m (0%)      0 (0%)      128Mi (0%)       128Mi (0%)      48m
  kube-system        static-pod1-node-name                  200m (0%)     200m (0%)   64Mi (0%)        64Mi (0%)       48m
  kube-system        static-pod2-node-name                  0 (0%)        0 (0%)      0 (0%)           0 (0%)          48m
  kube-system        daemonset8-pod-cwsz2                   100m (0%)     0 (0%)      64Mi (0%)        64Mi (0%)       48m
  kube-system        daemonset9-pod-mgtlm                   300m (0%)     0 (0%)      128Mi (0%)       128Mi (0%)      48m
  kube-system        daemonset0-pod-wp4jg                   0 (0%)        0 (0%)      0 (0%)           0 (0%)          48m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests        Limits
  --------           --------        ------
  cpu                61980m (96%)    200m (0%)
  memory             240631Mi (96%)  240743Mi (96%)
  ephemeral-storage  83072Mi (36%)   76928Mi (34%)
  hugepages-1Gi      0 (0%)          0 (0%)
  hugepages-2Mi      0 (0%)          0 (0%)
```
The pod is consistently evicted from every node (AWS m6i.16xlarge) it lands on in the cluster. It runs in a nodepool restricted to on-demand instances of three types, listed here in ascending price: c6i.16xlarge, m6i.16xlarge, and r6i.16xlarge. The client-production pod cannot fit on a c6i.16xlarge, so it should settle on an m6i.16xlarge, the next cheapest instance type.
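For context, here is a minimal NodePool sketch approximating the pool described above (the metadata name, EC2NodeClass reference, and disruption block are assumptions for illustration, not our exact manifest):

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: client-production          # hypothetical name
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default              # assumption; our real EC2NodeClass differs
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["c6i.16xlarge", "m6i.16xlarge", "r6i.16xlarge"]
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized   # assumption: consolidation enabled, matching the behavior we observe
```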
A second workload in this cluster uses reserved instances in a separate nodepool. During the day it scales to roughly double its reserved capacity by spilling onto on-demand instances (e.g., 200 total nodes: 100 reserved, 100 on-demand). This workload is disrupted throughout the day as on-demand capacity is consolidated back onto reserved capacity after it scales back down to its minimum replicas.
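To illustrate the shape of that second setup (purely an assumption about how such a reserved-then-on-demand pool can be modeled, not our actual manifests), one common pattern is two weighted NodePools, where the higher-weight pool is capped at the reserved capacity and a lower-weight pool absorbs the overflow:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: workload2-reserved         # hypothetical name
spec:
  weight: 100                      # preferred pool, sized to the reserved capacity
  limits:
    cpu: "6400"                    # assumption: roughly 100 x 64-vCPU nodes
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default              # assumption
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
---
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: workload2-overflow         # hypothetical name
spec:
  weight: 10                       # fallback once the reserved pool hits its limit
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default              # assumption
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
```

The idea is that, as the workload scales back down, consolidation removes the overflow nodes first, which would produce the kind of daily churn described above.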
There are effectively no other workloads in this cluster that differentiate it from our other clusters.
Expected Behavior:
A node with high allocation efficiency should not be disrupted when no cheaper node can be found to replace it.
Reproduction Steps (Please include YAML):
See above. We will continue debugging on our end and will follow up with YAML once we have a minimal reproduction.
Versions:
- Chart Version: v1.5.0
- Kubernetes Version (`kubectl version`): v1.29.15