---
title: "Getting Started with Karpenter"
linkTitle: "Getting Started with Karpenter"
weight: 10
description: >
  Set up a cluster and add Karpenter
---
Karpenter automatically provisions new nodes in response to unschedulable pods. Karpenter does this by observing events within the Kubernetes cluster, and then sending commands to the underlying cloud provider.
This guide shows how to get started with Karpenter by creating a Kubernetes cluster and installing Karpenter. To use Karpenter, you must be running a supported Kubernetes cluster on a supported cloud provider. Currently, only EKS on AWS is supported.
This guide uses `eksctl` to create the cluster.
It should take less than 1 hour to complete, and cost less than $0.25.
Follow the clean-up instructions to reduce any charges.
Karpenter is installed in clusters with a Helm chart.
Karpenter requires cloud provider permissions to provision nodes. On AWS, use IAM Roles for Service Accounts (IRSA), which permits Karpenter (within the cluster) to make privileged requests to AWS (as the cloud provider) via a ServiceAccount.
Install these tools before proceeding:
- AWS CLI
- `kubectl` - the Kubernetes CLI
- `eksctl` - the CLI for AWS EKS
- `helm` - the package manager for Kubernetes
Configure the AWS CLI with a user that has sufficient privileges to create an EKS cluster. Verify that the CLI can authenticate properly by running `aws sts get-caller-identity`.
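Before going further, it can help to confirm that the tools listed above are actually on your `PATH`. The following is a minimal sketch, not part of the official scripts; the helper name `missing_tools` is hypothetical and only uses the shell built-in `command -v`.

```bash
# Print each of the given commands that is not found on PATH, one per line.
# missing_tools is a hypothetical helper introduced for this illustration.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the four tools this guide relies on.
missing="$(missing_tools aws kubectl eksctl helm)"
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All required tools found"
fi
```

If nothing is reported missing, running `aws sts get-caller-identity` is the next check.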
After setting up the tools, set the Karpenter and Kubernetes versions:
export KARPENTER_NAMESPACE=karpenter
export KARPENTER_VERSION=v0.32.7
export K8S_VERSION=1.28
Then set the following environment variables:
{{% script file="./content/en/{VERSION}/getting-started/getting-started-with-karpenter/scripts/step01-config.sh" language="bash"%}}
{{% alert title="Warning" color="warning" %}} If you open a new shell to run steps in this procedure, you need to set some or all of the environment variables again. To remind yourself of these values, type:
echo $KARPENTER_NAMESPACE $KARPENTER_VERSION $K8S_VERSION $CLUSTER_NAME $AWS_DEFAULT_REGION $AWS_ACCOUNT_ID $TEMPOUT
{{% /alert %}}
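As a complement to the `echo` command in the warning above, the sketch below reports which of those variables are unset or empty in the current shell. The helper name `unset_vars` is hypothetical; it relies on `eval` for indirect lookup so that it also works in plain POSIX shells.

```bash
# Print the names of the given variables that are unset or empty, one per line.
# unset_vars is a hypothetical helper introduced for this illustration.
unset_vars() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}

# List any variables from this guide that still need to be set in this shell.
unset_vars KARPENTER_NAMESPACE KARPENTER_VERSION K8S_VERSION \
  CLUSTER_NAME AWS_DEFAULT_REGION AWS_ACCOUNT_ID TEMPOUT
```

Empty output means every variable has a value; any name printed needs to be exported again before continuing.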
Create a basic cluster with `eksctl`.
The following cluster configuration will:
- Use CloudFormation to set up the infrastructure needed by the EKS cluster. See [CloudFormation]({{< relref "../../reference/cloudformation/" >}}) for a complete description of what `cloudformation.yaml` does for Karpenter.
- Create a Kubernetes service account and AWS IAM Role, and associate them using IRSA to let Karpenter launch instances.
- Add the Karpenter node role to the aws-auth configmap to allow nodes to connect.
- Use AWS EKS managed node groups for the kube-system and karpenter namespaces. Uncomment fargateProfiles settings (and comment out managedNodeGroups settings) to use Fargate for both namespaces instead.
- Set the `KARPENTER_IAM_ROLE_ARN` variable.
- Create a role to allow spot instances.
- Run Helm to install Karpenter.
{{% script file="./content/en/{VERSION}/getting-started/getting-started-with-karpenter/scripts/step02-create-cluster.sh" language="bash"%}}
{{% script file="./content/en/{VERSION}/getting-started/getting-started-with-karpenter/scripts/step06-add-spot-role.sh" language="bash"%}}
{{% alert title="Windows Support Notice" color="warning" %}} In order to run Windows workloads, Windows support should be enabled in your EKS Cluster. See Enabling Windows support to learn more. {{% /alert %}}
{{% script file="./content/en/{VERSION}/getting-started/getting-started-with-karpenter/scripts/step08-apply-helm-chart.sh" language="bash"%}}
{{% alert title="Warning" color="warning" %}} Karpenter creates a mapping between CloudProvider machines and CustomResources in the cluster for capacity tracking. To ensure this mapping is consistent, Karpenter utilizes the following tag keys:
karpenter.sh/managed-by
karpenter.sh/nodepool
kubernetes.io/cluster/${CLUSTER_NAME}
Because Karpenter takes this dependency, any user that has the ability to Create/Delete these tags on CloudProvider machines will have the ability to orchestrate Karpenter to Create/Delete CloudProvider machines as a side effect. We recommend that you enforce tag-based IAM policies on these tags against any EC2 instance resource (`i-*`) for any users that might have CreateTags/DeleteTags permissions but should not have RunInstances/TerminateInstances permissions.
{{% /alert %}}
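One way to express the tag-based restriction described in the warning above is an explicit deny on tag changes that touch Karpenter's tag keys. The script below only prints a candidate policy document; it is a sketch under assumptions, not an official Karpenter policy. The statement ID, the helper name `render_tag_guard_policy`, the `my-cluster` default, and the hard-coded `aws` partition are all illustrative, and the policy should be reviewed against your own IAM setup before use.

```bash
# Placeholder default so the sketch runs outside the guide's environment.
CLUSTER_NAME="${CLUSTER_NAME:-my-cluster}"

# Print an IAM policy that denies creating or deleting Karpenter's tag keys
# on EC2 instances. Attach it only to principals that may tag resources but
# should not be able to launch or terminate instances.
render_tag_guard_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyKarpenterTagChanges",
      "Effect": "Deny",
      "Action": ["ec2:CreateTags", "ec2:DeleteTags"],
      "Resource": "arn:aws:ec2:*:*:instance/*",
      "Condition": {
        "ForAnyValue:StringEquals": {
          "aws:TagKeys": [
            "karpenter.sh/managed-by",
            "karpenter.sh/nodepool",
            "kubernetes.io/cluster/${CLUSTER_NAME}"
          ]
        }
      }
    }
  ]
}
EOF
}

render_tag_guard_policy
```

The output can be saved to a file and attached as a managed or inline policy for the affected users or roles.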
A single Karpenter NodePool is capable of handling many different pod shapes. Karpenter makes scheduling and provisioning decisions based on pod attributes such as labels and affinity. In other words, Karpenter eliminates the need to manage many different node groups.
Create a default NodePool using the command below. This NodePool uses `securityGroupSelectorTerms` and `subnetSelectorTerms` to discover resources used to launch nodes. We applied the tag `karpenter.sh/discovery` in the `eksctl` command above. Depending on how these resources are shared between clusters, you may need to use different tagging schemes.
Setting `consolidationPolicy` to `WhenUnderutilized` in the `disruption` block configures Karpenter to reduce cluster cost by removing and replacing nodes. As a result, consolidation will terminate any empty nodes on the cluster. This behavior can be disabled by setting `consolidateAfter` to `Never`, telling Karpenter that it should never consolidate nodes. Review the [NodePool API docs]({{<ref "../../concepts/nodepools" >}}) for more information.
Note: This NodePool will create capacity as long as the sum of all created capacity is less than the specified limit.
{{% script file="./content/en/{VERSION}/getting-started/getting-started-with-karpenter/scripts/step12-add-nodepool.sh" language="bash"%}}
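To make the fields discussed above concrete, the following sketch prints a small NodePool and EC2NodeClass pair using the `v1beta1` APIs that ship with this Karpenter version. It is an illustration, not a copy of the script above: the helper name `render_default_nodepool`, the `my-cluster` default, the `KarpenterNodeRole-${CLUSTER_NAME}` role name, the `on-demand` requirement, and the CPU limit of 1000 are assumptions chosen for the example.

```bash
# Placeholder default so the sketch runs outside the guide's environment.
CLUSTER_NAME="${CLUSTER_NAME:-my-cluster}"

# Print a NodePool with a consolidation policy and a capacity limit, plus an
# EC2NodeClass that discovers subnets and security groups by tag.
render_default_nodepool() {
  cat <<EOF
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
  limits:
    cpu: 1000            # stop creating capacity once 1000 vCPUs are provisioned
  disruption:
    consolidationPolicy: WhenUnderutilized
---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2
  role: "KarpenterNodeRole-${CLUSTER_NAME}"   # assumed node role name
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
EOF
}

render_default_nodepool
```

Piping the output to `kubectl apply -f -` would create both objects; printing it first makes it easy to review the selectors and limits.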
Karpenter is now active and ready to begin provisioning nodes.
This deployment uses the pause image and starts with zero replicas.
{{% script file="./content/en/{VERSION}/getting-started/getting-started-with-karpenter/scripts/step13-automatic-node-provisioning.sh" language="bash"%}}
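For readers who want to see the shape of such a workload, here is a minimal sketch of a zero-replica Deployment built on a pause image. The Deployment name `inflate`, the image tag, the helper name `render_inflate_deployment`, and the 1 CPU request are assumptions for illustration and may differ from what the script above applies.

```bash
# Print a Deployment that starts with zero replicas; each replica requests
# one CPU, so scaling it up creates pods that need new capacity.
render_inflate_deployment() {
  cat <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 0
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      terminationGracePeriodSeconds: 0
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7   # assumed tag
          resources:
            requests:
              cpu: 1
EOF
}

render_inflate_deployment
```

After applying the manifest with `kubectl apply -f -`, a command such as `kubectl scale deployment inflate --replicas 5` leaves pods unschedulable until Karpenter provisions nodes for them.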
Now, delete the deployment. After a short amount of time, Karpenter should terminate the empty nodes due to consolidation.
{{% script file="./content/en/{VERSION}/getting-started/getting-started-with-karpenter/scripts/step14-deprovisioning.sh" language="bash"%}}
If you delete a node with kubectl, Karpenter will gracefully cordon, drain, and shut down the corresponding instance. Under the hood, Karpenter adds a finalizer to the node object, which blocks deletion until all pods are drained and the instance is terminated. Keep in mind that this only works for nodes provisioned by Karpenter.
{{% script file="./content/en/{VERSION}/getting-started/getting-started-with-karpenter/scripts/step16-delete-node.sh" language="bash"%}}
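To check whether a given node carries Karpenter's finalizer, you can inspect the node object's `metadata.finalizers`. The sketch below assumes the finalizer is named `karpenter.sh/termination` and uses a hypothetical helper, `has_termination_finalizer`, that reads node JSON on stdin; the sample node object is made up for the demonstration.

```bash
# Succeed if the node JSON on stdin lists the Karpenter termination finalizer.
# The finalizer name is an assumption; verify it against your cluster.
has_termination_finalizer() {
  grep -qF '"karpenter.sh/termination"'
}

# Hypothetical, trimmed node object used only to exercise the check.
sample_node='{"metadata":{"name":"example-node","finalizers":["karpenter.sh/termination"]}}'

if printf '%s\n' "$sample_node" | has_termination_finalizer; then
  echo "finalizer present: deletion waits for drain and instance termination"
else
  echo "finalizer absent: node was likely not provisioned by Karpenter"
fi
```

Against a live cluster, the same check could be run as `kubectl get node "$NODE_NAME" -o json | has_termination_finalizer`, where `NODE_NAME` is a variable you set yourself.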
To avoid additional charges, remove the demo infrastructure from your AWS account.
{{% script file="./content/en/{VERSION}/getting-started/getting-started-with-karpenter/scripts/step17-cleanup.sh" language="bash"%}}
This section describes optional ways to configure Karpenter to enhance its capabilities. In particular, the following commands deploy a Prometheus and Grafana stack that is suitable for this guide but does not include persistent storage or other configurations that would be necessary for monitoring a production deployment of Karpenter. This deployment includes two Karpenter dashboards that are automatically onboarded to Grafana. They provide a variety of visualization examples on Karpenter metrics.
{{% script file="./content/en/{VERSION}/getting-started/getting-started-with-karpenter/scripts/step09-add-prometheus-grafana.sh" language="bash"%}}
The Grafana instance may be accessed using port forwarding.
{{% script file="./content/en/{VERSION}/getting-started/getting-started-with-karpenter/scripts/step10-add-grafana-port-forward.sh" language="bash"%}}
The new stack has only one user, `admin`, and the password is stored in a secret. The following command will retrieve the password.
{{% script file="./content/en/{VERSION}/getting-started/getting-started-with-karpenter/scripts/step11-grafana-get-password.sh" language="bash"%}}
This section covers advanced techniques for installing Karpenter, such as running Karpenter on a cluster without public internet access or ensuring that Karpenter avoids getting throttled by other components in your cluster.
You can optionally install Karpenter on a private cluster using the `eksctl` installation by setting `privateCluster.enabled` to true in your ClusterConfig and by setting `--set settings.isolatedVPC=true` when installing the `karpenter` Helm chart.
privateCluster:
  enabled: true
Private clusters have no outbound access to the internet. For Karpenter to reach the services it depends on, you need to enable specific VPC private endpoints. The following endpoints are required to run Karpenter in a private cluster:
com.amazonaws.<region>.ec2
com.amazonaws.<region>.ecr.api
com.amazonaws.<region>.ecr.dkr
com.amazonaws.<region>.s3 - For pulling container images
com.amazonaws.<region>.sts - For IAM roles for service accounts
com.amazonaws.<region>.ssm - For resolving default AMIs
com.amazonaws.<region>.sqs - For accessing SQS if using interruption handling
com.amazonaws.<region>.eks - For Karpenter to discover the cluster endpoint
If these endpoints are not yet available in your VPC, you can add each one by running:
aws ec2 create-vpc-endpoint --vpc-id ${VPC_ID} --service-name ${SERVICE_NAME} --vpc-endpoint-type Interface --subnet-ids ${SUBNET_IDS} --security-group-ids ${SECURITY_GROUP_IDS}
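Because the command above has to be repeated once per service, a small loop can generate all of the invocations. The sketch below is a dry run: it only prints the commands, using `echo`, so nothing is created. The helper name `endpoint_service_names`, the `us-west-2` fallback region, and the `*-PLACEHOLDER` defaults are illustrative assumptions. It uses the `Interface` endpoint type for every service to match the command above; adjust the type if your environment uses a different endpoint type for a service.

```bash
# Print the VPC endpoint service names listed in this guide for one region.
# endpoint_service_names is a hypothetical helper for this illustration.
endpoint_service_names() {
  region="$1"
  for service in ec2 ecr.api ecr.dkr s3 sts ssm sqs eks; do
    printf 'com.amazonaws.%s.%s\n' "$region" "$service"
  done
}

# Placeholder defaults so the dry run works without real resource IDs.
REGION="${AWS_DEFAULT_REGION:-us-west-2}"
VPC_ID="${VPC_ID:-vpc-PLACEHOLDER}"
SUBNET_IDS="${SUBNET_IDS:-subnet-PLACEHOLDER}"
SECURITY_GROUP_IDS="${SECURITY_GROUP_IDS:-sg-PLACEHOLDER}"

for SERVICE_NAME in $(endpoint_service_names "$REGION"); do
  # echo keeps this a dry run; remove it to actually create the endpoints.
  echo aws ec2 create-vpc-endpoint \
    --vpc-id "$VPC_ID" \
    --service-name "$SERVICE_NAME" \
    --vpc-endpoint-type Interface \
    --subnet-ids $SUBNET_IDS \
    --security-group-ids $SECURITY_GROUP_IDS
done
```

Reviewing the printed commands before removing `echo` helps catch a wrong region or VPC ID early.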
{{% alert title="Note" color="primary" %}}
Karpenter (controller and webhook deployment) container images must be in or copied to Amazon ECR private or to another private registry accessible from inside the VPC. If these are not available from within the VPC, or from networks peered with the VPC, you will get image pull errors when Kubernetes tries to pull these images from ECR public.
{{% /alert %}}
{{% alert title="Note" color="primary" %}}
There is currently no VPC private endpoint for the Price List Query API. As a result, pricing data can go stale over time. By default, Karpenter ships a static price list that is updated when each binary is released.
Failed requests for pricing data will result in error messages like the following:
ERROR controller.aws.pricing updating on-demand pricing, RequestError: send request failed
caused by: Post "https://api.pricing.us-east-1.amazonaws.com/": dial tcp 52.94.231.236:443: i/o timeout; RequestError: send request failed
caused by: Post "https://api.pricing.us-east-1.amazonaws.com/": dial tcp 52.94.231.236:443: i/o timeout, using existing pricing data from 2022-08-17T00:19:52Z {"commit": "4b5f953"}
{{% /alert %}}
Kubernetes uses FlowSchemas and PriorityLevelConfigurations to map calls to the API server into buckets which determine each user agent's throttling limits.
By default, Karpenter is placed in the `workload-low` PriorityLevelConfiguration for all API server requests. This means that other components that make a high number of requests to the API server may affect Karpenter's ability to make requests.
To ensure that Karpenter is unaffected by these other, lower-priority components, we can place Karpenter into a higher-priority PriorityLevelConfiguration using a custom FlowSchema.
{{% script file="./content/en/{VERSION}/getting-started/getting-started-with-karpenter/scripts/step15-apply-flowschemas.sh" language="bash"%}}
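As an illustration of what such a FlowSchema can look like, the sketch below prints one that routes requests from Karpenter's ServiceAccount to the built-in `workload-high` priority level. It is not a copy of the script above. It assumes the `flowcontrol.apiserver.k8s.io/v1beta3` API (served by the Kubernetes version used in this guide), a ServiceAccount named `karpenter`, a hypothetical FlowSchema name `karpenter-workload`, and a `matchingPrecedence` of 1000 chosen so that it is evaluated before the default service account FlowSchema.

```bash
# Placeholder default so the sketch runs outside the guide's environment.
KARPENTER_NAMESPACE="${KARPENTER_NAMESPACE:-karpenter}"

# Print a FlowSchema that maps all requests from the Karpenter ServiceAccount
# to the workload-high PriorityLevelConfiguration.
render_karpenter_flowschema() {
  cat <<EOF
apiVersion: flowcontrol.apiserver.k8s.io/v1beta3
kind: FlowSchema
metadata:
  name: karpenter-workload
spec:
  distinguisherMethod:
    type: ByUser
  matchingPrecedence: 1000
  priorityLevelConfiguration:
    name: workload-high
  rules:
    - subjects:
        - kind: ServiceAccount
          serviceAccount:
            name: karpenter
            namespace: "${KARPENTER_NAMESPACE}"
      resourceRules:
        - apiGroups: ["*"]
          resources: ["*"]
          verbs: ["*"]
          namespaces: ["*"]
          clusterScope: true
      nonResourceRules:
        - nonResourceURLs: ["*"]
          verbs: ["*"]
EOF
}

render_karpenter_flowschema
```

Once applied with `kubectl apply -f -`, `kubectl get flowschemas` lists the new entry alongside the built-in ones, which makes it easy to confirm its precedence and priority level.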