New Node creation stuck in a loop #1573
Can you list your daemon sets and what their resource requests are?
Hey @vasu-git, we are working on a fix which should allow Karpenter to handle this scenario in future releases. Thanks for reporting the issue!
Thanks @dewjam
@dewjam hey
Karpenter v0.5.3
Hey @infestonn,
@infestonn As an example, using the default values Karpenter expects an
Yes, I think that's right. Not sure how to find a workaround yet.
@infestonn You could set
Because of kubernetes/kubernetes#102382, it is extremely inadvisable to set resource requests on high-priority daemonsets. Otherwise, a daemonset update could cause critical workloads to be (unnecessarily) preempted. This is why
Were you able to reproduce the behavior in kubernetes/kubernetes#102382? It seems like setting

Is your recommendation to not define resource "limits" on critical DaemonSets as well? (If you define resource limits but not requests, then requests default to limits.)
What does "what Karpenter expects" mean? Does the provisioner API have any parameter for that?
Sorry for the confusion @infestonn. Karpenter calculates

When determining the best instance type for a workload, Karpenter assumes the instance will be launched with

We are discussing exposing these params in the Karpenter Provisioner spec, though I don't have any timeline I can share.
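The overhead accounting discussed above can be made concrete with a small sketch. As an illustrative assumption (not Karpenter's actual code), the reservation formula below follows the EKS bootstrap defaults: kube-reserved memory of 255 MiB plus 11 MiB per schedulable pod, and a 100 MiB hard-eviction threshold. The exact values depend on the AMI's kubelet configuration, so verify against your nodes' kubelet flags.

```python
# Hedged sketch: why a node's schedulable ("allocatable") memory is
# noticeably smaller than its raw instance capacity.
# Formulas mirror EKS bootstrap defaults as an assumption, not a spec.

def kube_reserved_memory_mib(max_pods: int) -> int:
    # Assumed EKS-style default: 255 MiB base + 11 MiB per pod slot.
    return 255 + 11 * max_pods

def allocatable_memory_mib(capacity_mib: int, max_pods: int,
                           eviction_threshold_mib: int = 100) -> int:
    # allocatable = capacity - kube-reserved - hard-eviction threshold
    return (capacity_mib
            - kube_reserved_memory_mib(max_pods)
            - eviction_threshold_mib)

# t3a.medium: ~4 GiB of capacity and a max-pods limit of 17 with the
# default ENI configuration (both values are assumptions here).
print(allocatable_memory_mib(4096, 17))  # prints 3554, well below 4096
```

A pod requesting close to the instance's full capacity therefore fits "on paper" but never becomes schedulable once the kubelet registers its allocatable resources, which is the mismatch being described in this thread.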
Thank you for the clarification.
Hello @vasu-git , I'm going to go ahead and close out this issue, but please feel free to reopen if you have any questions in the meantime. Thanks for reporting this issue! |
Version
Karpenter: v0.7.3
Kubernetes: v1.21.5-eks-bc4871b
Below are log snippets from the ebs-csi-controller, Karpenter, and Kubernetes events in the pod's namespace (captured while t3a.medium instances were being provisioned).
Ebs-csi-controller logs
Karpenter logs
Kubernetes Events
Expected Behavior
Karpenter should provision a node that is large enough for the unschedulable pod.
Actual Behavior
Node creation gets stuck in an infinite loop because the node being created is not large enough for the unschedulable pod.
Just FYI: other pod/node creation works fine. I only see this issue when I try to deploy this particular pod.
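The suspected failure mode can be sketched as follows. This is an illustration, not Karpenter source: if an autoscaler selects an instance type by raw capacity while the scheduler checks the pod against kubelet allocatable, a pod whose request fits capacity but not allocatable triggers an endless provision-and-fail cycle. The instance names, sizes, and overhead figure below are hypothetical.

```python
# Hedged illustration of the provisioning loop described in this issue.
# All numbers and type names are made up for demonstration.

OVERHEAD_MIB = 542  # assumed kube-reserved + eviction threshold

INSTANCE_TYPES = {"small-2gib": 2048, "medium-4gib": 4096}

def pick_instance(pod_request_mib, use_allocatable):
    """Return the smallest instance type the pod fits on, or None."""
    for name, capacity in sorted(INSTANCE_TYPES.items(),
                                 key=lambda kv: kv[1]):
        schedulable = capacity - OVERHEAD_MIB if use_allocatable else capacity
        if pod_request_mib <= schedulable:
            return name
    return None

pod = 3800  # MiB: fits medium's raw capacity, but not its allocatable

# Capacity-based choice picks a node the scheduler then rejects,
# leaving the pod pending and triggering another provisioning round:
choice = pick_instance(pod, use_allocatable=False)       # "medium-4gib"
fits = pod <= INSTANCE_TYPES[choice] - OVERHEAD_MIB      # False -> loop

# Accounting for allocatable up front breaks the cycle (here, by
# correctly reporting that no listed type can host the pod at all):
safe_choice = pick_instance(pod, use_allocatable=True)   # None
```

In other words, each newly launched node looks big enough until the kubelet subtracts its reservations, at which point the pod stays pending and provisioning repeats.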