Configurable Scaling for the HPA (#18157)

* Add configurable scale behavior.

* Added documentation about configurable scaling in the HPA

Signed-off-by: Arjun Naik <arjun.rn@gmail.com>

Co-authored-by: Joseph Burnett <josephburnett79@gmail.com>
2 people authored and k8s-ci-robot committed Jan 14, 2020
1 parent c192154 commit 5dbfaafe1ac8875e09ea4ef05390ebc47ad290cb
Showing with 164 additions and 9 deletions.
  1. +164 −9 content/en/docs/tasks/run-application/horizontal-pod-autoscale.md
@@ -3,6 +3,7 @@ reviewers:
- fgrzadkowski
- jszczepkowski
- directxman12
- josephburnett
title: Horizontal Pod Autoscaler
feature:
title: Horizontal scaling
@@ -162,11 +163,15 @@ can be fetched, scaling is skipped. This means that the HPA is still capable
of scaling up if one or more metrics give a `desiredReplicas` greater than
the current value.

Finally, just before HPA scales the target, the scale recommendation is recorded. The
controller considers all recommendations within a configurable window choosing the
highest recommendation from within that window. This value can be configured using the `--horizontal-pod-autoscaler-downscale-stabilization` flag, which defaults to 5 minutes.
This means that scaledowns will occur gradually, smoothing out the impact of rapidly
fluctuating metric values.
Finally, just before HPA scales the target, the scale recommendation is
recorded. The controller considers all recommendations within a configurable
window choosing the highest recommendation from within that window. This value
can be configured using the
`--horizontal-pod-autoscaler-downscale-stabilization` flag or the HPA object
behavior `behavior.scaleDown.stabilizationWindowSeconds` (see [Support for
configurable scaling behavior](#support-for-configurable-scaling-behavior)),
which defaults to 5 minutes. This means that scaledowns will occur gradually,
smoothing out the impact of rapidly fluctuating metric values.

## API Object

@@ -213,10 +218,7 @@ When managing the scale of a group of replicas using the Horizontal Pod Autoscal
it is possible that the number of replicas keeps fluctuating frequently due to the
dynamic nature of the metrics evaluated. This is sometimes referred to as *thrashing*.

Starting from v1.6, a cluster operator can mitigate this problem by tuning
the global HPA settings exposed as flags for the `kube-controller-manager` component:

Starting from v1.12, a new algorithmic update removes the need for the
Starting from v1.12, a new algorithmic update removes the need for an
upscale delay.

- `--horizontal-pod-autoscaler-downscale-stabilization`: The value for this option is a
@@ -232,6 +234,11 @@ the delay value is set too short, the scale of the replicas set may keep thrashi
usual.
{{< /note >}}

Starting from v1.17 the downscale stabilization window can be set on a per-HPA
basis by setting the `behavior.scaleDown.stabilizationWindowSeconds` field in
the v2beta2 API. See [Support for configurable scaling
behavior](#support-for-configurable-scaling-behavior).

## Support for multiple metrics

Kubernetes 1.6 adds support for scaling based on multiple metrics. You can use the `autoscaling/v2beta2` API
@@ -282,6 +289,154 @@ and [external.metrics.k8s.io](https://github.com/kubernetes/community/blob/maste
For examples of how to use them see [the walkthrough for using custom metrics](/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/#autoscaling-on-multiple-metrics-and-custom-metrics)
and [the walkthrough for using external metrics](/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/#autoscaling-on-metrics-not-related-to-kubernetes-objects).

## Support for configurable scaling behavior

Starting from
[v1.17](https://github.com/kubernetes/enhancements/blob/master/keps/sig-autoscaling/20190307-configurable-scale-velocity-for-hpa.md)
the `v2beta2` API allows scaling behavior to be configured through the HPA
`behavior` field. Behaviors are specified separately for scaling up and down in
the `scaleUp` and `scaleDown` sections under the `behavior` field. A stabilization
window can be specified for either direction to prevent the number of replicas
in the scaling target from flapping. Similarly, specifying scaling policies
controls the rate of change of replicas while scaling.

### Scaling Policies

One or more scaling policies can be specified in the `behavior` section of the spec.
When multiple policies are specified, the policy which allows the highest amount of
change is selected by default. The following example shows this behavior
while scaling down:

```yaml
behavior:
  scaleDown:
    policies:
    - type: Pods
      value: 4
      periodSeconds: 60
    - type: Percent
      value: 10
      periodSeconds: 60
```

When the number of pods is more than 40 the second policy will be used for scaling down.
For instance, if there are 80 replicas and the target has to be scaled down to 10 replicas,
then 8 replicas will be removed during the first step. In the next iteration, when the number
of replicas is 72, 10% of the pods is 7.2, but the number is rounded up to 8. On each loop of
the autoscaler controller the number of pods to be changed is recalculated based on the number
of current replicas. When the number of replicas falls below 40 the first policy _(Pods)_ is applied
and 4 replicas will be removed at a time.
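
The walk-through above can be sketched in a few lines of Python. This is a simplified illustration that follows the description in the text, not the controller's implementation: it assumes one scaling step per `periodSeconds`, rounds the percentage up as described, and uses hypothetical function names (`scale_down_step`, `simulate`).

```python
def scale_down_step(current, target, pods_limit=4, percent_limit=10):
    """Return the replica count after one period, using the larger allowance."""
    # Percent allowance, rounded up as the text describes (integer ceiling).
    by_percent = (current * percent_limit + 99) // 100
    # By default the policy that allows the largest change is selected.
    allowed = max(pods_limit, by_percent)
    # Never go below the replica count the metrics ask for.
    return max(target, current - allowed)


def simulate(current, target):
    """Collect the replica count after each step until the target is reached."""
    history = [current]
    while current > target:
        current = scale_down_step(current, target)
        history.append(current)
    return history


print(simulate(80, 10))
```

In this sketch the percentage policy drives the early steps (80, 72, 64, ...) and the fixed 4-pod policy takes over once 10% of the replicas is no longer larger than 4.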

`periodSeconds` indicates the length of time in the past for which the policy must hold true.
The first policy allows at most 4 replicas to be scaled down in one minute. The second policy
allows at most 10% of the current replicas to be scaled down in one minute.

The policy selection can be changed by specifying the `selectPolicy` field for a scaling
direction. Setting the value to `Min` selects the policy which allows the
smallest change in the replica count. Setting the value to `Disabled` completely disables
scaling in that direction.
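
As a rough illustration of how `selectPolicy` changes the outcome, the following Python sketch computes the allowed change for a given replica count. The function name `allowed_change` and the dictionary layout are assumptions made for this example, and the percentage is rounded up as in the scale-down example in the text.

```python
def allowed_change(current, policies, select_policy="Max"):
    """Return how many replicas may be changed in one period."""
    if select_policy == "Disabled":
        return 0  # scaling in this direction is turned off
    changes = []
    for policy in policies:
        if policy["type"] == "Pods":
            changes.append(policy["value"])
        else:  # "Percent": share of the current replicas, rounded up
            changes.append((current * policy["value"] + 99) // 100)
    return max(changes) if select_policy == "Max" else min(changes)


policies = [
    {"type": "Pods", "value": 4},
    {"type": "Percent", "value": 10},
]
print(allowed_change(80, policies, "Max"))       # larger of 4 pods and 10%
print(allowed_change(80, policies, "Min"))       # smaller of the two
print(allowed_change(80, policies, "Disabled"))  # no change allowed
```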

### Stabilization Window

The stabilization window is used to restrict the flapping of replicas when the metrics
used for scaling keep fluctuating. The autoscaling algorithm uses the stabilization window
to take previously computed desired states into account before it scales. In
the following example the stabilization window is specified for `scaleDown`.

```yaml
scaleDown:
  stabilizationWindowSeconds: 300
```

When the metrics indicate that the target should be scaled down, the algorithm looks
into previously computed desired states and uses the highest value from the specified
interval. In the above example all desired states from the past 5 minutes will be considered.
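
A minimal sketch of this selection, assuming the recommendations are kept as `(timestamp, desired_replicas)` pairs and that the newest recommendation is part of the list. The function name and data layout are hypothetical and only illustrate the "highest value within the window" rule for scaling down:

```python
def stabilized_recommendation(history, now, window_seconds=300):
    """Pick the highest desired replica count recorded within the window.

    history: list of (timestamp_seconds, desired_replicas) pairs,
             including the most recent recommendation.
    """
    recent = [replicas for ts, replicas in history
              if now - ts <= window_seconds]
    return max(recent)


# Recommendations recorded at 0s, 100s, 350s and 400s.
history = [(0, 10), (100, 8), (350, 6), (400, 5)]
print(stabilized_recommendation(history, now=400))                     # 5-minute window
print(stabilized_recommendation(history, now=400, window_seconds=60))  # 1-minute window
```

With the 5-minute window the older recommendation of 8 replicas still holds the target up, while the 1-minute window only sees the two most recent values.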

### Default Behavior

Not all fields have to be specified to use custom scaling. Only the values which need to be
customized have to be set. These custom values are merged with default values. The default values
match the existing behavior in the HPA algorithm.

```yaml
behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
  scaleUp:
    stabilizationWindowSeconds: 0
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
    - type: Pods
      value: 4
      periodSeconds: 15
    selectPolicy: Max
```
For scaling down the stabilization window is _300_ seconds (or the value of the
`--horizontal-pod-autoscaler-downscale-stabilization` flag if provided). There is only a single policy
for scaling down, which allows 100% of the currently running replicas to be removed. This
means the scaling target can be scaled down to the minimum allowed replicas.
For scaling up there is no stabilization window. When the metrics indicate that the target should be
scaled up, the target is scaled up immediately. There are 2 policies: 4 pods or 100% of the currently
running replicas, whichever is greater, will be added every 15 seconds until the HPA reaches its steady state.
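
The default scale-up limits can be illustrated with a short Python sketch. It is a simplification that assumes one step per 15-second period and uses hypothetical function names. It only shows how the two default policies combine under `selectPolicy: Max`:

```python
def default_scale_up_limit(current):
    """Upper bound after one period: add 4 pods or 100%, whichever is greater."""
    return current + max(4, current)


def scale_up_steps(current, desired):
    """Replica counts per period until the desired count is reached."""
    steps = [current]
    while current < desired:
        current = min(desired, default_scale_up_limit(current))
        steps.append(current)
    return steps


print(scale_up_steps(1, 50))
```

Starting from a single replica, the 4-pod policy dominates the first step and doubling takes over once there are more than 4 replicas.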

### Example: change downscale stabilization window

To provide a custom downscale stabilization window of 1 minute, the following
behavior would be added to the HPA:

```yaml
behavior:
  scaleDown:
    stabilizationWindowSeconds: 60
```

### Example: limit scale down rate

To limit the rate at which pods are removed by the HPA to 10% per minute, the
following behavior would be added to the HPA:

```yaml
behavior:
  scaleDown:
    policies:
    - type: Percent
      value: 10
      periodSeconds: 60
```

To allow a final drop of 5 pods, another policy can be added together with a
selection strategy of maximum, so that the larger of the two limits applies:

```yaml
behavior:
  scaleDown:
    policies:
    - type: Percent
      value: 10
      periodSeconds: 60
    - type: Pods
      value: 5
      periodSeconds: 60
    selectPolicy: Max
```

### Example: disable scale down

The `selectPolicy` value of `Disabled` turns off scaling in the given direction.
So to prevent downscaling, the following policy would be used:

```yaml
behavior:
  scaleDown:
    selectPolicy: Disabled
```

{{% /capture %}}

{{% capture whatsnext %}}
