Add load balancing docs #2559
Merged
11 commits
b48c4f2
[WIP] Add load balancing docs
abrennan89 b2740d2
feedback from vagababov
abrennan89 d4fb947
fix link
abrennan89 6eaf219
review feedback and removed README
abrennan89 0567e59
minor tweaks and improvements
abrennan89 24b6dad
minor tweaks, formatting
abrennan89 4d17dc6
fixing links
abrennan89 4f74d71
removed note
abrennan89 4458457
fix link, default values
abrennan89 cc53e66
review updates
abrennan89 64bf341
review updates
abrennan89
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,33 @@ | ||
| --- | ||
| title: "Load balancing" | ||
| weight: 30 | ||
| type: "docs" | ||
| --- | ||
|
|
||
| You can turn on Knative load balancing by placing the _Activator service_ in the request path to act as a load balancer. | ||
|
|
||
| **NOTE:** To do this, you must first ensure that individual pod addressability is enabled. | ||
|
|
||
| ## Activator pod selection | ||
|
|
||
| Activator pods are scaled horizontally, so there may be multiple Activators in a deployment. In general, the system performs best when the number of revision pods is larger than the number of Activator pods and is evenly divisible by it. | ||
| <!--TODO(#2472): Add better documentation about what the activator is; explain the components of load balancing; maybe add a diagram--> | ||
|
|
||
| Knative assigns a subset of Activators to each revision, depending on the revision size. The more pods a revision has, the more Activators are assigned to it. | ||
|
|
||
| The Activator load balancing algorithm works as follows: | ||
|
|
||
| - If concurrency is unlimited, the request is sent to the better of two random choices. | ||
| - If concurrency is set to a value less than or equal to 3, the Activator sends the request to the first pod that has capacity. Otherwise, requests are balanced in a round-robin fashion, with respect to container concurrency. | ||
|
|
||
| For more information, see the documentation on [concurrency](../../serving/autoscaling/concurrency). | ||
|
|
||
| ## Configuring target burst capacity | ||
|
|
||
| Target burst capacity is mainly responsible for determining whether the Activator is in the request path outside of scale-from-zero scenarios. | ||
|
|
||
| Target burst capacity can be configured using a combination of the following parameters: | ||
|
|
||
| - Setting the targeted concurrency limits for the revision. See [concurrency](../../serving/autoscaling/concurrency). | ||
| - Setting the target utilization parameters. See [target utilization](../../serving/autoscaling/concurrency#target-utilization). | ||
| - Setting the target burst capacity. You can configure target burst capacity per revision using the `autoscaling.knative.dev/targetBurstCapacity` annotation, or globally using the `target-burst-capacity` key in the `config-autoscaler` ConfigMap. See [Setting the target burst capacity](./target-burst-capacity#setting-the-target-burst-capacity). | ||
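The pod selection rules listed under "Activator pod selection" can be sketched roughly as follows. This is an illustrative Python sketch only, not the Activator's actual implementation: the `Pod` class, the policy labels, and the function names are hypothetical, and it assumes that a concurrency value of `0` stands for "unlimited" and that "the better of two random choices" means the less loaded of two sampled pods.

```python
import random
from dataclasses import dataclass


@dataclass
class Pod:
    """Hypothetical stand-in for a revision pod tracked by an Activator."""
    name: str
    in_flight: int = 0  # requests currently being handled by this pod


def choose_policy(container_concurrency: int) -> str:
    """Map a revision's concurrency limit to the rule described in the doc.

    Assumption: 0 stands for unlimited concurrency. The returned labels
    are made up for this sketch.
    """
    if container_concurrency == 0:
        return "random-choice-2"
    if container_concurrency <= 3:
        return "first-available"
    return "round-robin"


def random_choice_2(pods, rng=random):
    """Sample two distinct pods and keep the less loaded one."""
    if len(pods) == 1:
        return pods[0]
    a, b = rng.sample(pods, 2)
    return a if a.in_flight <= b.in_flight else b


def first_available(pods, container_concurrency):
    """Return the first pod with spare capacity, or None if all are full."""
    for pod in pods:
        if pod.in_flight < container_concurrency:
            return pod
    return None


def round_robin(pods, container_concurrency, start=0):
    """Walk the pods in order from `start`, skipping any pod at capacity.

    Returns (pod, next_start), or (None, start) if every pod is full.
    """
    n = len(pods)
    for offset in range(n):
        idx = (start + offset) % n
        if pods[idx].in_flight < container_concurrency:
            return pods[idx], (idx + 1) % n
    return None, start
```

A caller would first call `choose_policy` with the revision's concurrency limit and then dispatch to the matching function. Returning `None` when every pod is full is a simplification chosen for this sketch.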
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,76 @@ | ||
| --- | ||
| title: "Configuring target burst capacity" | ||
| linkTitle: "Configuring target burst capacity" | ||
| weight: 50 | ||
| type: "docs" | ||
| aliases: | ||
| - /docs/serving/autoscaling/target-burst-capacity | ||
| --- | ||
|
|
||
| _Target burst capacity_ is a [global and per-revision](../../serving/autoscaling/autoscaling-concepts.md) integer setting that determines the size of traffic burst a Knative application can handle without buffering. | ||
| If a traffic burst is too large for the application to handle, the _Activator_ service will be placed in the request path to protect the revision and optimize request load balancing. | ||
|
|
||
| The Activator service is responsible for receiving and buffering requests for inactive revisions, or for revisions where a traffic burst is larger than the limits of what can be handled without buffering for that revision. It can also quickly spin up additional pods for capacity, and throttle how quickly requests are sent to pods. | ||
|
|
||
| Target burst capacity can be configured using a combination of the following parameters: | ||
|
|
||
| - Setting the targeted concurrency limits for the revision. See [concurrency](../../serving/autoscaling/concurrency). | ||
| - Setting the target utilization parameters. See [target utilization](../../serving/autoscaling/concurrency#target-utilization). | ||
| - Setting the target burst capacity. You can configure target burst capacity per revision using the `autoscaling.knative.dev/targetBurstCapacity` annotation, or globally using the `target-burst-capacity` key in the `config-autoscaler` ConfigMap. See [Setting the target burst capacity](#setting-the-target-burst-capacity). | ||
|
|
||
| ## Setting the target burst capacity | ||
|
|
||
| - **Global key:** `target-burst-capacity` | ||
| - **Per-revision annotation key:** `autoscaling.knative.dev/targetBurstCapacity` | ||
| - **Possible values:** float (`0` means the Activator is only in path when scaled to 0, `-1` means the Activator is always in path) | ||
| - **Default:** `200` | ||
|
|
||
| **Example:** | ||
| {{< tabs name="targetBurstCapacity" default="Per Revision" >}} | ||
| {{% tab name="Per Revision" %}} | ||
| ```yaml | ||
| apiVersion: serving.knative.dev/v1 | ||
| kind: Service | ||
| metadata: | ||
|   annotations: | ||
|   name: <service_name> | ||
|   namespace: default | ||
| spec: | ||
|   template: | ||
|     metadata: | ||
|       annotations: | ||
|         autoscaling.knative.dev/targetBurstCapacity: "200" | ||
| ``` | ||
| {{< /tab >}} | ||
| {{% tab name="Global (ConfigMap)" %}} | ||
| ```yaml | ||
| apiVersion: v1 | ||
| kind: ConfigMap | ||
| metadata: | ||
|   name: config-autoscaler | ||
|   namespace: knative-serving | ||
| data: | ||
|   target-burst-capacity: "200" | ||
| ``` | ||
| {{< /tab >}} | ||
| {{% tab name="Global (Operator)" %}} | ||
| ```yaml | ||
| apiVersion: operator.knative.dev/v1alpha1 | ||
| kind: KnativeServing | ||
| metadata: | ||
|   name: knative-serving | ||
| spec: | ||
|   config: | ||
|     autoscaler: | ||
|       target-burst-capacity: "200" | ||
| ``` | ||
| {{< /tab >}} | ||
| {{< /tabs >}} | ||
|
|
||
| - If `autoscaling.knative.dev/targetBurstCapacity` is set to `0`, the Activator is only added to the request path during scale-from-zero scenarios, and ingress load balancing is applied. | ||
|
|
||
| **NOTE:** Ingress gateway load balancing requires additional configuration. For more information about load balancing using an ingress gateway, see the [Serving API](../../reference/api/serving-api) documentation. | ||
|
|
||
| - If `autoscaling.knative.dev/targetBurstCapacity` is set to `-1`, the Activator is always in the request path, regardless of the revision size. | ||
|
|
||
| - If `autoscaling.knative.dev/targetBurstCapacity` is set to another integer, the Activator may be in the path, depending on the revision scale and load. | ||
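The three cases above can be summarized in a small sketch. This is only an illustration of the documented semantics, not Knative code: the function name and the returned labels are hypothetical, the default of `"200"` is taken from the setting's documented default, and rejecting negative values other than `-1` is an assumption of this sketch.

```python
def activator_placement(raw_value: str = "200") -> str:
    """Describe when the Activator is in the request path.

    `raw_value` is the quoted string used in the annotation or ConfigMap,
    for example "0", "-1" or "200".
    """
    value = float(raw_value)
    if value == 0:
        # Activator is only in the path during scale from zero.
        return "scale-from-zero only"
    if value == -1:
        # Activator is always in the path, regardless of revision size.
        return "always"
    if value < 0:
        # Assumption: negative values other than -1 are not meaningful.
        raise ValueError(f"unsupported target burst capacity: {raw_value}")
    # Any other value: depends on the revision scale and load.
    return "depends on scale and load"
```

For example, `activator_placement("-1")` returns `"always"`, matching the case where the Activator stays in the request path permanently.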
For this to work, ...Since you can put Activator in the request path, just the load balancing will be whatever underlying transport is (mesh provider most likely).
So do we actually need this note at all?
If a user needs to do something here maybe we should add that procedure or link to it to make it clear?