docs: update
zac-li committed Aug 11, 2022
1 parent 476415a commit 80d4bbb
Showing 1 changed file with 26 additions and 24 deletions.
50 changes: 26 additions & 24 deletions docs/fundamentals/jcloud/autoscale.md
@@ -3,51 +3,53 @@
JCloud offers demand-based autoscaling out of the box, thanks to its underlying Kubernetes architecture. This means you can maintain [serverless](https://en.wikipedia.org/wiki/Serverless_computing) deployments cost-effectively, without the headache of choosing the [right number of replicas](https://docs.jina.ai/how-to/scale-out/#scale-out-your-executor)!

## Configurations
Autoscaling configurations can be specified on a per Executor basis using the `autoscale` argument in your Flow YAML, such as:

Autoscaling can be enabled by using the `jinahub+serverless` protocol for the Executor's `uses` in the Flow YAML, such as:

```yaml
jtype: Flow
executors:
  - name: executor1
    uses: jinahub+docker://Executor1
    jcloud:
      autoscale:
        min: 1
        max: 2
        metric: rps
        target: 50
    uses: jinahub+serverless://Executor1
```

Below are the configurations explained in detail:
JCloud Autoscaling leverages [Knative](https://knative.dev/docs/) behind the scenes, and `jinahub+serverless` uses a set of Knative configurations as defaults:

```{note}
JCloud Autoscaling leverages [Knative](https://knative.dev/docs/), where the configurations are directly supported. For more information, please visit [Knative Autoscaling](https://knative.dev/docs/serving/autoscaling/).
For more information about the Knative Autoscaling configurations, please visit [Knative Autoscaling](https://knative.dev/docs/serving/autoscaling/).
```

| Name | Value | Description |
|--------|-------------|-------------------------------------------------|
| min | 0 | Minimum number of replicas (0 means serverless) |
| max | 2 | Maximum number of replicas |
| metric | concurrency | Metric for scaling |
| target | 100 | Target number after which replicas autoscale |

| Name | Default | Allowed | Description |
|--------|-------------|---------------------------|------------------------------------------------------------------|
| min | 1 | int | Minimum number of replicas (0 means serverless) |
| max | 2 | int | Maximum number of replicas (up to 5) |
| metric | concurrency | `concurrency` / `rps` | Metric for scaling |
| target | 100 | int | Target number for concurrency/rps after which replicas autoscale |
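To make the role of `target` concrete, here is a rough Python sketch of how a target-based autoscaler could pick a replica count: divide the observed metric by the per-replica target, round up, and clamp to the `min`/`max` bounds. This is a simplified model for illustration only. The function name `estimate_replicas` is hypothetical, and Knative's actual autoscaler averages the metric over time windows and applies further rules not modeled here.

```python
import math


def estimate_replicas(observed: float, target: float,
                      min_replicas: int = 1, max_replicas: int = 2) -> int:
    """Estimate a replica count from an observed metric (hypothetical helper).

    observed: total concurrency or requests per second across the Executor.
    target:   the per-replica value the autoscaler tries to maintain.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    # Replicas needed so that each one handles at most `target`.
    wanted = math.ceil(observed / target)
    # Never go below `min` or above `max`.
    return max(min_replicas, min(max_replicas, wanted))


# 120 rps with target 50 would want 3 replicas, but max=2 caps it.
capped = estimate_replicas(120, 50, min_replicas=1, max_replicas=2)
# With min=0 and no traffic, the estimate drops to zero (serverless).
idle = estimate_replicas(0, 100, min_replicas=0, max_replicas=2)
```

Under this model, lowering `target` makes scaling kick in earlier, while `max` bounds the cost regardless of load.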

## Serverless

We also support using the `jinahub+serverless` protocol for the Executor's `uses` in the Flow YAML to indicate enrollment in Autoscaling, such as:
If `jinahub+serverless` doesn't meet your requirements, you can further customize the Autoscaling configurations by using the `autoscale` argument on a per-Executor basis in the Flow YAML, such as:

```yaml
jtype: Flow
executors:
  - name: executor1
    uses: jinahub+serverless://Executor1
    uses: jinahub+docker://Executor1
    jcloud:
      autoscale:
        min: 1
        max: 2
        metric: rps
        target: 50
```

The example above will take the following defaults:
Below are the defaults and requirements for the configurations:

| min | max | metric | target |
|-----|-----|-------------|--------|
| 0 | 2 | concurrency | 100 |
| Name | Default | Allowed |
|--------|-------------|--------------------------|
| min | 1 | int |
| max | 2 | int, up to 5 |
| metric | concurrency | `concurrency` / `rps` |
| target | 100 | int |
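The defaults and allowed values in the table above can be checked locally before deploying. The sketch below is a hypothetical helper, not part of JCloud or Jina: the class name `AutoscaleConfig` and its field names are invented for illustration (`min_replicas` and `max_replicas` stand for the YAML keys `min` and `max`). The checks that `min` is non-negative, that `min` does not exceed `max`, and that `target` is positive are assumptions; the table itself only states the types, the `max` limit of 5, and the two allowed metrics.

```python
from dataclasses import dataclass

ALLOWED_METRICS = ("concurrency", "rps")  # from the "Allowed" column
MAX_REPLICA_LIMIT = 5                     # "int, up to 5"


@dataclass
class AutoscaleConfig:
    """Hypothetical mirror of the `autoscale` block with the documented defaults."""
    min_replicas: int = 1
    max_replicas: int = 2
    metric: str = "concurrency"
    target: int = 100

    def validate(self) -> None:
        # All three numeric settings are documented as ints.
        for name in ("min_replicas", "max_replicas", "target"):
            value = getattr(self, name)
            if isinstance(value, bool) or not isinstance(value, int):
                raise ValueError(f"{name} must be an int, got {value!r}")
        if self.max_replicas > MAX_REPLICA_LIMIT:
            raise ValueError(f"max may be at most {MAX_REPLICA_LIMIT}")
        if self.metric not in ALLOWED_METRICS:
            raise ValueError(f"metric must be one of {ALLOWED_METRICS}")
        # Assumed sanity checks, not stated in the table.
        if self.min_replicas < 0 or self.min_replicas > self.max_replicas:
            raise ValueError("min must satisfy 0 <= min <= max")
        if self.target <= 0:
            raise ValueError("target must be positive")


# The values from the YAML example pass validation.
example = AutoscaleConfig(min_replicas=1, max_replicas=2, metric="rps", target=50)
example.validate()
```

Running a check like this before deployment surfaces a typo such as `metric: qps` early, rather than at deployment time.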

After deploying to JCloud with Autoscaling configured, serving the Flow works just the same. The only difference you will probably notice is that the initial requests may take a few extra seconds,
since JCloud may need to scale the deployments behind the scenes. Let JCloud handle the scaling from now on, so you only need to worry about the code!
