Commit
Showing 6 changed files with 266 additions and 203 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,55 @@ | ||
# Autoscaling | ||
|
||
JCloud is built on Kubernetes, and one of the biggest benefits of Kubernetes is the ability to autoscale your applications | ||
on demand. JCloud offers autoscaling as well, so you can run [serverless](https://en.wikipedia.org/wiki/Serverless_computing) deployments cost-effectively without the headache of choosing the [right number of replicas](https://docs.jina.ai/how-to/scale-out/#scale-out-your-executor) yourself. | ||
|
||
### Configurations | ||
Autoscaling can be configured on a per-Executor basis using the `autoscale` argument under `jcloud` in your Flow YAML, for example: | ||
|
||
```yaml | ||
jtype: Flow | ||
executors: | ||
  - name: executor1 | ||
    uses: jinahub+docker://Executor1 | ||
    jcloud: | ||
      autoscale: | ||
        min: 1 | ||
        max: 2 | ||
        metric: rps | ||
        target: 50 | ||
``` | ||
|
||
The configuration options are explained in detail below: | ||
|
||
```{note} | ||
JCloud autoscaling is built on [Knative](https://knative.dev/docs/), which supports these configurations directly. For more information, see [Knative Autoscaling](https://knative.dev/docs/serving/autoscaling/). | ||
``` | ||
|
||
|
||
| Name | Default | Allowed | Description | | ||
|--------|-------------|---------------------------|------------------------------------------------------------------| | ||
| min | 1 | int | Minimum number of replicas (0 means serverless) | | ||
| max | 2 | int | Maximum number of replicas (up to 5) | | ||
| metric | concurrency | `concurrency` / `rps` | Metric for scaling | | ||
| target | 100 | int | Target concurrency/rps value above which replicas are scaled out | | ||
|
||
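
To make the interplay of `min`, `max`, and `target` concrete, here is a minimal Python sketch of a simplified scaling rule. It assumes, following Knative's semantics, that `target` is a per-replica value, and it deliberately ignores Knative's averaging windows, target utilization percentage, and panic mode. The function name `desired_replicas` is hypothetical and is not part of JCloud or Knative.

```python
import math


def desired_replicas(observed: float, target: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Simplified model (an assumption, not JCloud's actual algorithm).

    `observed` is the total load across all replicas, measured in the
    configured metric (concurrent requests or requests per second).
    """
    if target <= 0:
        raise ValueError("target must be positive")
    if min_replicas > max_replicas:
        raise ValueError("min must not exceed max")
    # Replicas needed so that each one handles at most `target` load.
    wanted = math.ceil(observed / target)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))


# With the settings from the YAML above (min=1, max=2, metric=rps, target=50):
low = desired_replicas(30, 50, 1, 2)    # light load stays at the minimum
high = desired_replicas(120, 50, 1, 2)  # heavy load is capped at the maximum
```

Under this model, a load of 120 rps would ideally need three replicas at a target of 50, but `max: 2` caps the deployment at two.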
|
||
You can also use the `jinahub+serverless` protocol in an Executor's `uses` field in the Flow YAML to opt in to autoscaling, for example: | ||
|
||
```yaml | ||
jtype: Flow | ||
executors: | ||
  - name: executor1 | ||
    uses: jinahub+serverless://Executor1 | ||
``` | ||
|
||
The example above uses the following defaults: | ||
|
||
| min | max | metric | target | | ||
|-----|-----|-------------|--------| | ||
| 0 | 2 | concurrency | 100 | | ||
|
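
The two tables of defaults can be read as a small resolution rule. The Python sketch below illustrates one way to express it. It is not JCloud's implementation: the names `AutoscaleSettings` and `resolve_autoscale` are hypothetical, and it assumes that autoscaling is opt-in and that explicit `autoscale` values override the protocol defaults.

```python
from dataclasses import dataclass
from typing import Optional

SERVERLESS_PREFIX = "jinahub+serverless://"
ALLOWED_METRICS = ("concurrency", "rps")
MAX_REPLICAS_LIMIT = 5  # "up to 5" in the configuration table
TABLE_DEFAULTS = {"min": 1, "max": 2, "metric": "concurrency", "target": 100}


@dataclass(frozen=True)
class AutoscaleSettings:
    min: int
    max: int
    metric: str
    target: int


def resolve_autoscale(uses: str,
                      autoscale: Optional[dict] = None) -> Optional[AutoscaleSettings]:
    """Merge documented defaults with explicit values (hypothetical helper)."""
    serverless = uses.startswith(SERVERLESS_PREFIX)
    if autoscale is None and not serverless:
        return None  # assumption: no autoscaling unless the Executor opts in
    merged = dict(TABLE_DEFAULTS)
    if serverless:
        merged["min"] = 0  # serverless default: scale to zero
    merged.update(autoscale or {})  # assumption: explicit values win
    settings = AutoscaleSettings(**merged)
    if settings.metric not in ALLOWED_METRICS:
        raise ValueError(f"metric must be one of {ALLOWED_METRICS}")
    if not 0 <= settings.min <= settings.max <= MAX_REPLICAS_LIMIT:
        raise ValueError("expected 0 <= min <= max <= 5")
    if settings.target <= 0:
        raise ValueError("target must be positive")
    return settings


serverless_defaults = resolve_autoscale("jinahub+serverless://Executor1")
```

Here `serverless_defaults` holds `min=0`, `max=2`, `metric="concurrency"`, and `target=100`, matching the table above.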
||
### Serverless | ||
|
||
Once a Flow is deployed to JCloud with autoscaling configured, serving works exactly as before. The only difference you may notice is that the first requests can take a few extra seconds, | ||
because JCloud may need to scale the deployment up behind the scenes. From then on, JCloud handles the scaling and you only need to worry about the code! |