2023-05-06T15:45:53.971Z ERROR controller.provisioner Could not schedule pod, incompatible with provisioner "default", incompatible requirements, key karpenter.sh/provisioner-name, karpenter.sh/provisioner-name DoesNotExist not in karpenter.sh/provisioner-name In [default]; incompatible with provisioner "gpu", did not tolerate app=app1-service:NoSchedule {"commit": "f60dacd", "pod": "karpenter/karpenter-57c5f67dd6-7g9fd"}
2023-05-06T15:47:57.971Z DEBUG controller.provisioner relaxing soft constraints for pod since it previously failed to schedule, removing: spec.topologySpreadConstraints = {"maxSkew":1,"topologyKey":"topology.kubernetes.io/zone","whenUnsatisfiable":"ScheduleAnyway","labelSelector":{"matchLabels":{"app.kubernetes.io/instance":"karpenter","app.kubernetes.io/name":"karpenter"}}} {"commit": "f60dacd", "pod": "karpenter/karpenter-57c5f67dd6-7g9fd"}
2023-05-06T15:47:57.982Z ERROR controller.provisioner Could not schedule pod, incompatible with provisioner "gpu", did not tolerate app=app1-service:NoSchedule; incompatible with provisioner "default", incompatible requirements, key karpenter.sh/provisioner-name, karpenter.sh/provisioner-name DoesNotExist not in karpenter.sh/provisioner-name In [default] {"commit": "f60dacd", "pod": "karpenter/karpenter-57c5f67dd6-7g9fd"}
2023-05-06T15:47:57.982Z INFO controller.provisioner found provisionable pod(s) {"commit": "f60dacd", "pods": 2}
2023-05-06T15:47:57.982Z INFO controller.provisioner serviced new node(s) to fit pod(s) {"commit": "f60dacd", "nodes": 1, "pods": 1}
2023-05-06T15:47:57.989Z INFO controller.provisioner launching node with 1 pods requesting {"cpu":"125m","pods":"3"} from types g4dn.4xlarge {"commit": "f60dacd", "provisioner": "gpu"}
2023-05-06T15:47:59.662Z DEBUG controller.provisioner.cloudprovider created launch template {"commit": "f60dacd", "provisioner": "gpu", "launch-template-name": "Karpenter-dev-cluster-123456", "launch-template-id": "lt-0f3aef76435882c8d"}
2023-05-06T15:48:02.821Z INFO controller.provisioner.cloudprovider launched new instance {"commit": "f60dacd", "provisioner": "gpu", "launched-instance": "i-02c2ad45690ea2d4c", "hostname": "ip-10-111-22-33.ec2.internal", "type": "g4dn.4xlarge", "zone": "us-east-1a", "capacity-type": "on-demand"}
2023-05-06T15:48:17.886Z DEBUG controller.provisioner relaxing soft constraints for pod since it previously failed to schedule, removing: spec.topologySpreadConstraints = {"maxSkew":1,"topologyKey":"topology.kubernetes.io/zone","whenUnsatisfiable":"ScheduleAnyway","labelSelector":{"matchLabels":{"app.kubernetes.io/instance":"karpenter","app.kubernetes.io/name":"karpenter"}}} {"commit": "f60dacd", "pod": "karpenter/karpenter-57c5f67dd6-7g9fd"}
2023-05-06T15:48:17.886Z ERROR controller.provisioner Could not schedule pod, incompatible with provisioner "default", incompatible requirements, key karpenter.sh/provisioner-name, karpenter.sh/provisioner-name DoesNotExist not in karpenter.sh/provisioner-name In [default]; incompatible with provisioner "gpu", did not tolerate app=app1-service:NoSchedule {"commit": "f60dacd", "pod": "karpenter/karpenter-57c5f67dd6-7g9fd"}
2023-05-06T15:48:20.000Z DEBUG controller.deprovisioning relaxing soft constraints for pod since it previously failed to schedule, removing: spec.topologySpreadConstraints = {"maxSkew":1,"topologyKey":"topology.kubernetes.io/zone","whenUnsatisfiable":"ScheduleAnyway","labelSelector":{"matchLabels":{"app.kubernetes.io/instance":"karpenter","app.kubernetes.io/name":"karpenter"}}} {"commit": "f60dacd", "pod": "karpenter/karpenter-57c5f67dd6-7g9fd"}
2023-05-06T15:48:40.837Z DEBUG controller.deprovisioning relaxing soft constraints for pod since it previously failed to schedule, removing: spec.topologySpreadConstraints = {"maxSkew":1,"topologyKey":"topology.kubernetes.io/zone","whenUnsatisfiable":"ScheduleAnyway","labelSelector":{"matchLabels":{"app.kubernetes.io/instance":"karpenter","app.kubernetes.io/name":"karpenter"}}} {"commit": "f60dacd", "pod": "karpenter/karpenter-57c5f67dd6-7g9fd"}
2023-05-06T15:49:01.586Z DEBUG controller.provisioner relaxing soft constraints for pod since it previously failed to schedule, removing: spec.topologySpreadConstraints = {"maxSkew":1,"topologyKey":"topology.kubernetes.io/zone","whenUnsatisfiable":"ScheduleAnyway","labelSelector":{"matchLabels":{"app.kubernetes.io/instance":"karpenter","app.kubernetes.io/name":"karpenter"}}} {"commit": "f60dacd", "pod": "karpenter/karpenter-57c5f67dd6-7g9fd"}
2023-05-06T15:49:01.587Z ERROR controller.provisioner Could not schedule pod, incompatible with provisioner "default", incompatible requirements, key karpenter.sh/provisioner-name, karpenter.sh/provisioner-name DoesNotExist not in karpenter.sh/provisioner-name In [default]; incompatible with provisioner "gpu", did not tolerate app=app1-service:NoSchedule {"commit": "f60dacd", "pod": "karpenter/karpenter-57c5f67dd6-7g9fd"}
2023-05-06T15:49:52.987Z DEBUG controller.deprovisioning relaxing soft constraints for pod since it previously failed to schedule, removing: spec.topologySpreadConstraints = {"maxSkew":1,"topologyKey":"topology.kubernetes.io/zone","whenUnsatisfiable":"ScheduleAnyway","labelSelector":{"matchLabels":{"app.kubernetes.io/instance":"karpenter","app.kubernetes.io/name":"karpenter"}}} {"commit": "f60dacd", "pod": "karpenter/karpenter-57c5f67dd6-7g9fd"}
2023-05-06T15:50:28.991Z DEBUG controller.aws deleted launch template {"commit": "f60dacd"}
2023-05-06T15:50:29.096Z DEBUG controller.aws deleted launch template {"commit": "f60dacd"}
2023-05-06T15:50:30.099Z INFO controller.inflightchecks Inflight check failed for node, Expected resource "nvidia.com/gpu" didn't register on the node {"commit": "f60dacd", "node": "ip-10-111-22-33.ec2.internal"}
2023-05-06T15:50:31.541Z DEBUG controller.provisioner relaxing soft constraints for pod since it previously failed to schedule, removing: spec.topologySpreadConstraints = {"maxSkew":1,"topologyKey":"topology.kubernetes.io/zone","whenUnsatisfiable":"ScheduleAnyway","labelSelector":{"matchLabels":{"app.kubernetes.io/instance":"karpenter","app.kubernetes.io/name":"karpenter"}}} {"commit": "f60dacd", "pod": "karpenter/karpenter-57c5f67dd6-7g9fd"}
2023-05-06T15:50:31.541Z ERROR controller.provisioner Could not schedule pod, incompatible with provisioner "default", incompatible requirements, key karpenter.sh/provisioner-name, karpenter.sh/provisioner-name DoesNotExist not in karpenter.sh/provisioner-name In [default]; incompatible with provisioner "gpu", did not tolerate app=app1-service:NoSchedule {"commit": "f60dacd", "pod": "karpenter/karpenter-57c5f67dd6-7g9fd"}
2023-05-06T15:50:44.504Z DEBUG controller.deprovisioning relaxing soft constraints for pod since it previously failed to schedule, removing: spec.topologySpreadConstraints = {"maxSkew":1,"topologyKey":"topology.kubernetes.io/zone","whenUnsatisfiable":"ScheduleAnyway","labelSelector":{"matchLabels":{"app.kubernetes.io/instance":"karpenter","app.kubernetes.io/name":"karpenter"}}} {"commit": "f60dacd", "pod": "karpenter/karpenter-57c5f67dd6-7g9fd"}
Community Note
Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or "me too" comments; they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment
Thanks @jonathan-innis, the karpenter.sh/initialized label is not showing up on the node. I also added the below to the deployment, but the pod still didn't get scheduled onto the node, and the node still doesn't have the label.
@iamtito You'll need to install the NVIDIA device plugin (https://github.com/NVIDIA/k8s-device-plugin) as a DaemonSet on the cluster so that the nvidia.com/gpu resource gets registered on GPU nodes and pods can schedule to them. This will also resolve the karpenter.sh/initialized issue, since Karpenter isn't considering the node initialized because that resource is missing.
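One detail worth checking: the logs show the "gpu" provisioner applies an app=app1-service:NoSchedule taint, so the device-plugin DaemonSet must tolerate that taint or its pods will never land on those nodes and nvidia.com/gpu will never register. A sketch of the tolerations to add to the DaemonSet pod spec (the app=app1-service key/value is taken from the logs; verify it against your actual taint definition):

```yaml
# Tolerations for the nvidia-device-plugin DaemonSet pod spec
tolerations:
  # Tolerate the custom taint the "gpu" provisioner applies (per the logs)
  - key: app
    operator: Equal
    value: app1-service
    effect: NoSchedule
  # Standard GPU taint toleration shipped with the device plugin
  - key: nvidia.com/gpu
    operator: Exists
    effect: NoSchedule
```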
Alternatively, if you use the Bottlerocket AMIFamily, the image has built-in support for the plugin, so you don't need to install the DaemonSet separately.
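A minimal sketch of what that looks like in an AWSNodeTemplate (field names follow the karpenter.k8s.aws/v1alpha1 API used by v0.20; the name and discovery tags are assumptions for illustration):

```yaml
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: gpu
spec:
  # Bottlerocket images bundle the NVIDIA device plugin
  amiFamily: Bottlerocket
  subnetSelector:
    karpenter.sh/discovery: dev-cluster   # hypothetical discovery tag
  securityGroupSelector:
    karpenter.sh/discovery: dev-cluster
```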
Version
Karpenter Version: v0.20.0
Kubernetes Version: v1.23
Expected Behavior
When I scaled the deployment down to 0, the node should have been deprovisioned and deleted, but that's not happening.
Actual Behavior
Scaling down the deployment does not bring down the node
Steps to Reproduce the Problem
Create a provisioner using this config
I deployed this configuration:
and use this manifest to deploy it:
app1-service.yaml
Deploy it via
kubectl apply -n app -f app1-service.yaml
Scale-up should provision the AMI from the customAMISelector
Scale-down should deprovision and delete the node
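For scale-to-zero to actually remove the node, the Provisioner needs an empty-node deprovisioning mechanism: in v0.20 that is either ttlSecondsAfterEmpty or consolidation (they are mutually exclusive). A minimal sketch of a "gpu" Provisioner with empty-node TTL enabled (the TTL value and providerRef name are assumptions; the taint matches the one in the logs):

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: gpu
spec:
  # Delete nodes 30s after the last non-daemon pod leaves
  ttlSecondsAfterEmpty: 30
  taints:
    - key: app
      value: app1-service
      effect: NoSchedule
  providerRef:
    name: gpu   # hypothetical AWSNodeTemplate name
```

Note that if neither ttlSecondsAfterEmpty nor consolidation.enabled is set, Karpenter will never deprovision the empty node, which matches the behavior described above.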
Resource Specs and Logs
Below is the current config spec that was deployed to the cluster:
Logs: