Horizontal Pod Autoscaling (HPA) automatically scales a workload to match demand by increasing or decreasing the number of Pods. A correctly configured autoscaler helps prevent unexpected cost spikes, so Kubernetes engineers can spend their time on reducing costs and scaling their Pods rather than on verifying configuration. On top of that, this policy also verifies the usage of resources and resource metrics such as CPU.
This policy enforces the following checks on HPA resources:
- Ensure target cpu utilization is set
- Ensure max replicas is set and valid
- Ensure min replicas is set and valid
- Ensure scale target ref is configured properly
The field targetCPUUtilizationPercentage defines the CPU utilization target at which the Pods are scaled. CPU utilization is the average CPU usage of all Pods in a deployment divided by the requested CPU of the deployment. If the average CPU utilization is higher than the target, the number of Pod replicas is adjusted.
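The arithmetic behind that definition can be sketched as follows. This is only an illustration: the function names are hypothetical, every Pod is assumed to request the same amount of CPU, and the replica formula is the basic ratio given in the Kubernetes HPA documentation, ignoring tolerance, stabilization, and the minReplicas/maxReplicas clamp.

```python
import math


def cpu_utilization_percent(usage_millicores, request_millicores):
    """Average CPU usage across Pods as a percentage of the per-Pod CPU request."""
    average_usage = sum(usage_millicores) / len(usage_millicores)
    return 100 * average_usage / request_millicores


def desired_replicas(current_replicas, utilization, target):
    """Basic HPA ratio: ceil(current replicas * current utilization / target)."""
    return math.ceil(current_replicas * utilization / target)


# Three Pods that each request 500m CPU and currently use 300m, 400m and 500m.
usage = [300, 400, 500]
utilization = cpu_utilization_percent(usage, 500)
print(utilization)                                    # 80.0
print(desired_replicas(len(usage), utilization, 50))  # 5
```

With a target of 50, an average utilization of 80 percent is above the target, so the autoscaler would aim for more replicas (5 instead of 3 in this sketch).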
This policy fails if the targetCPUUtilizationPercentage key is missing from the spec section:
kind: HorizontalPodAutoscaler
spec:
  maxReplicas: 10
  minReplicas: 1
Or a value outside the range 1 to 100 is used:
kind: HorizontalPodAutoscaler
spec:
  targetCPUUtilizationPercentage: 200
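The two failure cases for targetCPUUtilizationPercentage can be expressed as a small check. This is a minimal sketch, not the policy's actual implementation: the function name check_target_cpu is hypothetical, and the manifest is assumed to be already parsed into a plain dictionary.

```python
def check_target_cpu(manifest):
    """Return a list of violations for the targetCPUUtilizationPercentage rule."""
    spec = manifest.get("spec") or {}
    value = spec.get("targetCPUUtilizationPercentage")
    if value is None:
        return ["spec.targetCPUUtilizationPercentage is missing"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return ["spec.targetCPUUtilizationPercentage must be an integer"]
    if not 1 <= value <= 100:
        return [f"spec.targetCPUUtilizationPercentage is out of range: {value}"]
    return []


# The two failing manifests shown above, as parsed dictionaries.
missing = {"kind": "HorizontalPodAutoscaler",
           "spec": {"maxReplicas": 10, "minReplicas": 1}}
too_high = {"kind": "HorizontalPodAutoscaler",
            "spec": {"targetCPUUtilizationPercentage": 200}}

print(check_target_cpu(missing))
print(check_target_cpu(too_high))
```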
The field maxReplicas is vital because it sets the maximum number of Pod replicas for the autoscaler. This policy requires a value between 1 and 10.
This policy fails if the maxReplicas key is missing from the spec section:
kind: HorizontalPodAutoscaler
spec:
  minReplicas: 5
  targetCPUUtilizationPercentage: 40
Or a value outside the range 1 to 10 is used:
kind: HorizontalPodAutoscaler
spec:
  maxReplicas: 12
  minReplicas: 2
The field minReplicas defines the minimum number of replicas of a resource. As a best practice, it should be at least 2, so 2 is the lowest value this policy accepts for minReplicas.
This policy fails if minReplicas is missing from the spec section:
kind: HorizontalPodAutoscaler
spec:
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
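The replica rules for maxReplicas and minReplicas can be sketched together. This is an illustration under stated assumptions, not the policy's actual implementation: check_replica_bounds is a hypothetical name, the manifest is assumed to be a parsed dictionary, and the bounds (1 to 10 for maxReplicas, at least 2 for minReplicas) are the ones stated in this document.

```python
def check_replica_bounds(manifest):
    """Return a list of violations for the maxReplicas and minReplicas rules."""
    spec = manifest.get("spec") or {}
    # (key, lowest allowed value, highest allowed value or None for no upper bound)
    rules = [("maxReplicas", 1, 10), ("minReplicas", 2, None)]
    violations = []
    for key, low, high in rules:
        value = spec.get(key)
        if value is None:
            violations.append(f"spec.{key} is missing")
        elif isinstance(value, bool) or not isinstance(value, int):
            violations.append(f"spec.{key} must be an integer")
        elif value < low or (high is not None and value > high):
            violations.append(f"spec.{key} is out of range: {value}")
    return violations


# The failing manifests from the maxReplicas and minReplicas sections.
no_max = {"spec": {"minReplicas": 5, "targetCPUUtilizationPercentage": 40}}
max_too_high = {"spec": {"maxReplicas": 12, "minReplicas": 2}}
no_min = {"spec": {"maxReplicas": 10, "targetCPUUtilizationPercentage": 50}}

for manifest in (no_max, max_too_high, no_min):
    print(check_replica_bounds(manifest))
```

A manifest with maxReplicas: 10 and minReplicas: 2 produces an empty list, meaning both replica rules pass in this sketch.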
Nishant Verma \ theowlet