Karpenter not respecting daemonsets resources #2751
Can you link to those issues?
Sometimes when this happens it's because the daemonset being created doesn't have a high priority, so it won't cause an eviction on an existing node that has no more capacity. The daemonset controller will still create a pod for that daemonset on the existing node, but it will fail to schedule. Does this issue happen when you have given them a high priority?
I have the same issue as OP; I'm on a new EKS cluster. I have two nodes from the initial managed node group and a third node created by Karpenter. I had a daemonset that was failing to schedule on two nodes, one of which was the node created by Karpenter. Adding the system-node-critical priority to the daemonset caused the pods to schedule; however, it ended up evicting the Karpenter pods that had been running on the managed node group. After removing the priority, one Karpenter pod was able to reschedule, and the daemonset pod that had been running on that node was no longer able to schedule (because it was displaced by a pod with no priority class). So, I think you're talking about a different issue.
We check the schedulability of daemonsets and include it in simulations. Sometimes, daemonsets with specific scheduling constraints leave us unable to determine whether they will schedule, so they don't get included in the node sizing decision. However, if you use a high priority on your daemonset, some of your workload pods will fail to schedule instead, and will simply get caught/healed in the next provisioning loop (and potentially consolidated later).
Ah, you're right. The priority class name works fine for me. My secondary issue was actually that I was adding another daemonset, but my managed node group had nodes that were too small to accommodate anything else.
Hello @spring1843, sure, here are the ones I found related to this issue:
Regarding the priority class, yeah, all of our DaemonSets have either system-node-critical or system-cluster-critical, but personally I feel that it shouldn't matter: if a Pod, whether from a DaemonSet or a Deployment, is meant to run on a node, Karpenter should proactively respect that and not wait for kube to evict some other Pods and then wait for Karpenter to spawn another node. Thanks for the follow-up btw 👍
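For reference, giving a DaemonSet a high priority as discussed above just means setting priorityClassName in its pod template. A minimal sketch, with hypothetical names and resource values:

```yaml
# Fragment of a DaemonSet pod template (illustrative only)
spec:
  template:
    spec:
      # Built-in priority class; lets the DaemonSet pod preempt
      # lower-priority pods on a node that is already full
      priorityClassName: system-node-critical
      containers:
        - name: agent            # hypothetical container name
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
```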
Are you using VPA on your daemonsets? We do calculate the sum of all of the daemonset resources to ensure that there is enough space. If a DaemonSet is modified after that, it could lead to pods not scheduling.
Running 0.19.0, still an issue.
if I change the
Hello @tzneal, no, we don't use VPA at all, so the resources are pretty fixed.
Interesting. Looks like our scheduler doesn't think that the daemonset can schedule. I see your provisioner has
Does your daemonset tolerate this? Can you share the pod spec?
@ellistarn yes, all of our daemonsets have:
From the snippet you shared, you'd need
@ellistarn yes, sorry, my mistake, I changed it to "key_name" in the original post, but the actual name we use is "dedicated".
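To make the taint/toleration exchange above concrete, here is a minimal sketch with assumed values (the actual specs aren't shown in this thread): the provisioner taints its nodes with the "dedicated" key, and the DaemonSet pod template carries a matching toleration so it can land on those nodes.

```yaml
# Provisioner fragment (karpenter.sh/v1alpha5, values assumed)
spec:
  taints:
    - key: dedicated
      value: some-team        # hypothetical value
      effect: NoSchedule
---
# Matching toleration in the DaemonSet pod template (values assumed)
tolerations:
  - key: dedicated
    operator: Equal
    value: some-team
    effect: NoSchedule
```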
Can you share your AWSNodeTemplate? It's possible that you're setting things in userdata that can impact scheduling calculations.
Labeled for closure due to inactivity in 10 days.
Hi, I'm using Karpenter 0.22.1 and I still have this issue. Wasn't it supposed to be fixed in PR #1155? Best regards, Werner.
Yes, it's fixed as far as we are aware. The only daemonset resource issue I'm aware of is that we don't currently support a LimitRange supplying a default resource for daemonsets correctly, but that's in progress. If you are still experiencing a problem, please file a new issue with logs and daemonset/pod specs so we can investigate.
I second this notion, and I still experience issues where daemonset pods don't have room on a node due to insufficient CPU.
Hello, I am using Karpenter v0.23.0 and having the same problem with a daemonset not being deployed.
Hello, same here with fluentd on v0.27.3
Version
Karpenter Version: v0.18.1
Kubernetes Version: v1.23.9 (EKS)
Expected Behavior
When scaling up, Karpenter should calculate the "reserved" CPU needed by all DaemonSets across all namespaces and, from there, calculate the number of instances needed for the unscheduled pods. The expected end result is that on every new node all DaemonSets are always up and running without any resource issues.
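As a rough illustration of the sizing math being described (all numbers are made up, not taken from this cluster):

```yaml
# Hypothetical sizing example:
#   DaemonSet requests per node : 3 DaemonSets x 100m CPU = 300m CPU
#   Pending workload pods       : 4 pods x 500m CPU       = 2000m CPU
#   A candidate node must therefore offer at least 2300m of
#   allocatable CPU (plus kube-reserved/system overhead), otherwise
#   either the DaemonSet pods or the workload pods will fail to
#   schedule with "Insufficient cpu".
```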
Actual Behavior
From time to time we have scale-ups where many pods are created at the same time; it can go from 15 to 4k Pods. In these situations, some Pods from DaemonSets won't get scheduled, failing with the error "0/X nodes are available: 1 Insufficient cpu, X node(s) didn't match Pod's node affinity/selector.".
We have tried adding a priorityClass and nothing has really improved. All of the DaemonSets have proper resource definitions. Moreover, the error appears at random, so there is no deterministic way to narrow down other causes.
It's worth mentioning that I found some older, similar issues that were technically solved in earlier versions. Also, this happens with Provisioners regardless of whether the "consolidation" feature is enabled.
Steps to Reproduce the Problem
Install multiple DaemonSets like:
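The original manifest isn't reproduced above; a minimal, hypothetical DaemonSet of the kind described, with fixed resource requests (all names, images, and values are illustrative):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: example-agent                    # hypothetical name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: example-agent
  template:
    metadata:
      labels:
        app: example-agent
    spec:
      priorityClassName: system-node-critical
      containers:
        - name: agent
          image: registry.k8s.io/pause:3.9   # placeholder image
          resources:
            requests:
              cpu: 200m
              memory: 256Mi
            limits:
              cpu: 200m
              memory: 256Mi
```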
Install Karpenter and then try to autoscale hundreds or thousands of Pods multiple times; the error should appear.
Resource Specs and Logs
There are no warnings or errors in the controller's logs.