Commit 199e8ae

committed
[WIP] Add load balancing docs
1 parent d2b1805 commit 199e8ae

File tree

4 files changed

+95
-47
lines changed

docs/serving/autoscaling/target-burst-capacity.md

Lines changed: 0 additions & 47 deletions
This file was deleted.
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
1+
You can configure load balancing on Knative to use either:
2+
3+
- An ingress gateway, such as Istio or Kourier.
4+
- The Knative activator in the request path acting as a load balancer.
5+
6+
For more information about load balancing using an ingress gateway, see the [Serving API](../../reference/serving-api.md) documentation.
7+
8+
This guide explains how you can configure load balancing for your Knative system using the activator.
9+
10+
## About the activator
11+
12+
Knative assigns a subset of activators to each revision, depending on the revision size: the more pods a revision has, the more activators are assigned to it. Activators scale horizontally, so a deployment may run multiple activators.
13+
14+
In general, the system performs best if the number of revision pods is larger than the number of activators, and the pods can be divided evenly among the activators.
15+
16+
The activator load balancing algorithm works as follows:
17+
- If concurrency is unlimited, the request is sent to a random pod.
18+
- If concurrency is set to a limited value, the activator will send the request to the first pod that has capacity.
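Which of the two cases applies depends on the concurrency limit of the revision. As a minimal sketch, assuming the limit is set through the `containerConcurrency` field of the revision template (where `0` means unlimited), the two cases could be configured as follows. The Service names and the container image are placeholders for illustration and are not part of this guide.

```yaml
# Unlimited concurrency: the activator sends each request to a random pod.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: example-unlimited  # hypothetical name
  namespace: default
spec:
  template:
    spec:
      containerConcurrency: 0  # assumed: 0 means no concurrency limit
      containers:
        - image: gcr.io/knative-samples/helloworld-go  # placeholder image
---
# Limited concurrency: the activator sends each request to the first pod
# that still has capacity.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: example-limited  # hypothetical name
  namespace: default
spec:
  template:
    spec:
      containerConcurrency: 10  # illustrative limit per pod
      containers:
        - image: gcr.io/knative-samples/helloworld-go  # placeholder image
```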
19+
20+
### Prerequisites
21+
22+
- Ensure that there is no ingress gateway enabled.
23+
- Ensure that individual pod addressability is enabled.
24+
25+
### Configuring target burst capacity
26+
27+
Target burst capacity mainly determines whether the activator stays in the request path outside of scale-from-zero scenarios.
28+
29+
Target burst capacity can be configured using a combination of the following parameters:
30+
31+
* Setting the targeted concurrency limits for the revision. For more information, see the documentation on [concurrency](../../serving/autoscaling/concurrency.md).
32+
* Setting the target utilization parameters. For more information, see the documentation on [target utilization](../../serving/autoscaling/concurrency.md#target-utilization).
33+
* Setting the target burst capacity [per revision](./target-burst-capacity.md).
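As a rough illustration of how the three settings listed above might appear together on one revision, consider the following sketch. Only the `autoscaling.knative.dev/targetBurstCapacity` annotation is taken from this documentation; the `containerConcurrency` field and the `autoscaling.knative.dev/targetUtilizationPercentage` annotation are assumptions based on the linked concurrency documentation, and the Service name, image, and values are placeholders. Check the linked pages for the exact keys supported by your Knative version.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: example-burst  # hypothetical name
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # Assumed key: share of the concurrency limit the autoscaler targets.
        autoscaling.knative.dev/targetUtilizationPercentage: "70"
        # Size of traffic burst to absorb without the activator buffering.
        autoscaling.knative.dev/targetBurstCapacity: "100"
    spec:
      # Assumed field: hard concurrency limit for each pod.
      containerConcurrency: 50
      containers:
        - image: gcr.io/knative-samples/helloworld-go  # placeholder image
```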
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
1+
---
2+
title: "Load balancing"
3+
weight: 30
4+
type: "docs"
5+
---
6+
7+
{{% readfile file="README.md" %}}
Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
1+
---
2+
title: "Configuring target burst capacity"
3+
linkTitle: "Configuring target burst capacity"
4+
weight: 50
5+
type: "docs"
6+
aliases:
7+
- /docs/serving/autoscaling/target-burst-capacity
8+
---
9+
10+
_Target burst capacity_ determines the size of traffic burst a Knative application can handle without buffering.
11+
If a traffic burst is too large for an application to handle without buffering, the activator will be placed in the request path to protect the revision and optimize request load balancing.
12+
The activator can also quickly spin up additional pods for capacity, and throttle how quickly requests are sent to pods.
13+
14+
You can configure target burst capacity per revision by setting the `autoscaling.knative.dev/targetBurstCapacity` annotation on the revision template, as shown in the following example:
15+
16+
* **Global key:** No global key.
17+
* **Per-revision annotation key:** `autoscaling.knative.dev/targetBurstCapacity`
18+
* **Possible values:** float
19+
* **Default:** `70`
20+
21+
**Note:** If the activator is in the request path, it fully loads each replica up to its `containerConcurrency` limit. Target utilization is currently applied only at the revision level.
22+
<!-- TODO: clarify what this note means-->
23+
24+
**Example:**
25+
{{< tabs name="targetBurstCapacity" default="Per Revision" >}}
26+
{{% tab name="Per Revision" %}}
27+
```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: s3
  namespace: default
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "2"
        autoscaling.knative.dev/targetBurstCapacity: "70"
```
41+
{{< /tab >}}
42+
{{< /tabs >}}
43+
44+
- If `autoscaling.knative.dev/targetBurstCapacity` is set to `0`, the activator is added to the request path only during scale-from-zero scenarios, and ingress gateway load balancing is applied.
45+
46+
**Note:** Ingress gateway load balancing requires additional configuration. For more information about load balancing using an ingress gateway, see the [Serving API](../../reference/serving-api.md) documentation.
47+
48+
- If `autoscaling.knative.dev/targetBurstCapacity` is set to `-1`, the activator is always in the request path, regardless of the revision size.
49+
50+
- If `autoscaling.knative.dev/targetBurstCapacity` is set to any other value, the activator may be in the request path, depending on the revision scale and load.
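The two special values described above could be set as in the following sketch. The Service names and the container image are placeholders chosen for illustration. Annotation values must be quoted, because Kubernetes annotations are strings.

```yaml
# Activator only in the request path when scaling from zero.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: example-tbc-zero  # hypothetical name
  namespace: default
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/targetBurstCapacity: "0"
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go  # placeholder image
---
# Activator always in the request path, regardless of the revision size.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: example-tbc-always  # hypothetical name
  namespace: default
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/targetBurstCapacity: "-1"
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go  # placeholder image
```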
51+
52+
<!--Target burst capacity can alternatively be configured globally, by configuring the following settings together:
53+
54+
* Setting the targeted concurrency limits for the revision. For more information, see the documentation on [concurrency](../../serving/autoscaling/concurrency.md).
55+
* Setting the target utilization parameters. For more information, see the documentation on [target utilization](../../serving/autoscaling/concurrency.md#target-utilization).-->
