Daemonset-driven consolidation #731
This is the current expected behavior. Karpenter provisions enough capacity for the pods and daemonsets that exist at the time of scheduling. When you add a new daemonset, for this to work properly, Karpenter would need to replace any existing nodes that the daemonset won't fit on.
We typically recommend that you set a high priority for daemonsets to cover this use case. Scaling up a daemonset will then trigger eviction of existing pods, which feeds back into Karpenter's provisioning algorithm.
I'll update the FAQ to cover this.
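As a minimal sketch of that recommendation (the class name, priority value, and DaemonSet shown here are illustrative assumptions, not taken from this thread), a high-priority PriorityClass referenced by a DaemonSet might look like:

```yaml
# Illustrative only: the name and value are assumptions; pick a value above
# your regular workloads but below system-cluster-critical (2000000000).
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: daemonset-high-priority
value: 1000000
globalDefault: false
preemptionPolicy: PreemptLowerPriority
description: Lets new DaemonSet pods preempt regular pods so Karpenter re-provisions.
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: example-agent
spec:
  selector:
    matchLabels:
      app: example-agent
  template:
    metadata:
      labels:
        app: example-agent
    spec:
      # The preemption this triggers is what feeds back into Karpenter.
      priorityClassName: daemonset-high-priority
      containers:
        - name: agent
          image: example.com/agent:latest  # hypothetical image
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
```

With the high priority set, the scheduler evicts lower-priority pods to make room for the new DaemonSet pods, and Karpenter then provisions capacity for the evicted pods.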
I forgot to mention that I tried setting the priorityClassName field of the DaemonSet to system-node-critical and also once to system-cluster-critical. In both cases all pods were scheduled, but both Karpenter controllers were evicted. I will try to avoid this by changing the pod disruption budget in the values file of Karpenter's Helm chart.
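A sketch of that workaround as a standalone manifest (the namespace and label selector here are assumptions; match them to your actual Karpenter deployment, or use the Helm chart's own PDB values if your chart version exposes them):

```yaml
# Assumed namespace and labels; check the labels on your Karpenter pods first.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: karpenter
  namespace: karpenter
spec:
  # Never allow both controller replicas to be evicted at the same time.
  maxUnavailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: karpenter
```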
I could avoid the eviction of the Karpenter controllers, which have priority
This trick doesn't always work. I removed Prometheus and installed the AWS CloudWatch agent as a DaemonSet. It also has a priority of 1000000000. One of the four pods can't be scheduled, but no node is added.
Here are some files that reflect the new situation. According to the generated
This feature is very necessary; Karpenter should adjust automatically when a new daemonset is introduced. We should NOT have to set priority classes on every single resource in the cluster. The correct solution would be: if a node can't fit a newly installed daemonset pod due to CPU/RAM, a bigger node should be provisioned automatically, large enough to hold all the pods that were housed on the old node as well as the daemonset pod.
I have been using the following Kyverno policy to make sure DaemonSets get a priority class:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-priority-class
  annotations:
    policies.kyverno.io/title: Add priority class for DaemonSets to help Karpenter.
    policies.kyverno.io/subject: Pod
    policies.kyverno.io/minversion: 1.6.0
    policies.kyverno.io/description: Add priority class for DaemonSets to help Karpenter.
spec:
  rules:
    - name: add-priority-class-context
      match:
        any:
          - resources:
              kinds:
                - DaemonSet
      mutate:
        patchStrategicMerge:
          spec:
            template:
              spec:
                priorityClassName: system-node-critical
```
Nice @wdonne, Kyverno might be interested in taking that upstream. I think it would be useful for any autoscaler. See
Hi @tzneal, thanks for the tip. I have created a pull request: kyverno/policies#631. If it gets merged I will create another one called "set-karpenter-non-cpu-limits". It relates to a best practice when using consolidation mode. I have a third one that sets the annotation
This is NOT a solution. It will not solve 99% of cases; it simply prioritizes daemonsets over non-system-critical items. Karpenter team, I consider your product incomplete as is. Please pay attention to this comment. Community, please upvote so AWS understands that it charges money for an incomplete product. Let's not give them the idea that this ticket is somehow optional.
Couldn't have said it better myself. Can't believe this bug has been open for nearly a year :(
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
/remove-lifecycle stale
As this is an open source project: code contributions are welcome. If nobody writes the code, it doesn't get merged. |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
/remove-lifecycle stale
/remove-lifecycle stale
Any update?
Requesting AWS to release a formal update in which Karpenter truly lives up to "Just-in-time Nodes for Any Kubernetes Cluster", even for this case. Depending on a plethora of workarounds that don't cover all cases calls into question the operational and production readiness of what is now considered a stable product and marketed as the new default.
We are now many versions on and this is still an issue. Any updates?
Version
Karpenter Version: v0.22.1
Kubernetes Version: v1.24.8
Hi,
I have set up Karpenter with the following cluster configuration:
This is the provisioner:
Karpenter has currently provisioned three spot instances. When installing Prometheus with Helm chart version 19.3.1, two of the five node exporters can't be scheduled. The message is: "0/5 nodes are available: 1 Too many pods. preemption: 0/5 nodes are available: 5 No preemption victims found for incoming pod.". The Karpenter controllers didn't output any log entries.
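The "Too many pods" part of that message suggests the nodes hit their per-node pod limit rather than running out of CPU or memory. As a hedged sketch (not taken from this thread, and the field should be verified against your Karpenter version), the v1alpha5 Provisioner API lets you raise that limit via the kubelet configuration:

```yaml
# Illustrative Provisioner fragment; the name, requirements, and maxPods
# value are assumptions, not the reporter's actual configuration.
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  # Assumption: raising maxPods would let the node exporter pods fit on
  # the existing nodes instead of requiring a new one.
  kubeletConfiguration:
    maxPods: 110
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot"]
```

Note that a higher maxPods does not by itself explain why no new node was provisioned for the unschedulable DaemonSet pods, which is the core issue reported here.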
This is the values file for the chart:
This is the live manifest of the DaemonSet of the Prometheus node exporter:
This is the live manifest of one of the pods that can't be scheduled:
I also did a test with the node selector "karpenter.sh/capacity-type: on-demand". Then one of the spot instances is deleted, but no new instance is created. The DaemonSet also doesn't create any pods.
PR aws/karpenter-provider-aws#1155 should have fixed the issue of DaemonSets not being part of the scaling decision, but perhaps this is a special case? The node exporter wants a pod on each node because it wants to tap telemetry.
Best regards,
Werner.
Expected Behavior
An extra node to be provisioned.
Actual Behavior
No extra node is provisioned while two DaemonSet pods can't be scheduled.
Steps to Reproduce the Problem
I did this when there were already three Karpenter nodes, but I think you can just install Prometheus because the nodes are not full.
Resource Specs and Logs
karpenter-6d57cdbbd6-dqgcj.log
karpenter-6d57cdbbd6-lsv9f.log