docs(jcloud): autoscale docs
zac-li committed Aug 11, 2022
1 parent d6d4c30 commit 979cd7d
Showing 6 changed files with 266 additions and 203 deletions.
1 change: 1 addition & 0 deletions docs/fundamentals/jcloud/advanced.md
@@ -42,6 +42,7 @@ executors:
capacity: on-demand
```

(external-executors)=
## External executors

You can also expose only the Executors by setting `expose_gateway` to `False`. Read more about {ref}`External Executors <external-executors>`
55 changes: 55 additions & 0 deletions docs/fundamentals/jcloud/autoscale.md
@@ -0,0 +1,55 @@
# Autoscaling

JCloud is built on Kubernetes, and one of the biggest benefits of Kubernetes is the ability to autoscale your applications
on demand. JCloud offers autoscaling as well, so you can run [serverless](https://en.wikipedia.org/wiki/Serverless_computing) deployments cost-effectively without having to pick the [right number of replicas](https://docs.jina.ai/how-to/scale-out/#scale-out-your-executor) yourself.

### Configurations
Autoscaling can be configured per Executor using the `autoscale` argument in your Flow YAML, for example:

```yaml
jtype: Flow
executors:
  - name: executor1
    uses: jinahub+docker://Executor1
    jcloud:
      autoscale:
        min: 1
        max: 2
        metric: rps
        target: 50
```

The configuration options are explained in detail below:

```{note}
JCloud autoscaling is built on [Knative](https://knative.dev/docs/), which supports these configurations directly. For more information, see [Knative Autoscaling](https://knative.dev/docs/serving/autoscaling/).
```


| Name | Default | Allowed | Description |
|--------|-------------|---------------------------|------------------------------------------------------------------|
| min | 1 | int | Minimum number of replicas (0 means serverless) |
| max | 2 | int | Maximum number of replicas (up to 5) |
| metric | concurrency | `concurrency` / `rps` | Metric for scaling |
| target | 100 | int | Target number for concurrency/rps after which replicas autoscale |
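
To make the interplay of these four options concrete, here is a minimal Python sketch. It is not JCloud's or Knative's implementation: `validate_autoscale` and `desired_replicas` are hypothetical helpers, the `min <= max` and positive-`target` checks are assumptions, and the replica formula is a simplified reading of "scale so that each replica handles roughly `target`". The real autoscaler also takes time windows and other factors into account.

```python
import math

# Values taken from the table above.
DEFAULTS = {"min": 1, "max": 2, "metric": "concurrency", "target": 100}
ALLOWED_METRICS = {"concurrency", "rps"}
MAX_REPLICAS_LIMIT = 5  # "up to 5" per the table


def validate_autoscale(cfg):
    """Merge user settings over the table defaults and check the documented limits.

    Hypothetical helper: the min <= max and target > 0 checks are assumptions.
    """
    merged = {**DEFAULTS, **cfg}
    if merged["metric"] not in ALLOWED_METRICS:
        raise ValueError(f"metric must be one of {sorted(ALLOWED_METRICS)}")
    if not 0 <= merged["min"] <= merged["max"] <= MAX_REPLICAS_LIMIT:
        raise ValueError("expected 0 <= min <= max <= 5")
    if merged["target"] <= 0:
        raise ValueError("target must be a positive integer")
    return merged


def desired_replicas(cfg, observed):
    """Simplified estimate: ceil(observed / target), clamped to [min, max]."""
    wanted = math.ceil(observed / cfg["target"])
    return max(cfg["min"], min(cfg["max"], wanted))


# The configuration from the YAML example: min=1, max=2, metric=rps, target=50.
cfg = validate_autoscale({"min": 1, "max": 2, "metric": "rps", "target": 50})
low_load = desired_replicas(cfg, 30)    # below target, stays at min
high_load = desired_replicas(cfg, 120)  # would want 3 replicas, capped at max
```

Under these assumptions, 30 requests per second keeps a single replica, while 120 requests per second would call for three replicas but is capped at `max: 2`.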


You can also use the `jinahub+serverless` protocol in an Executor's `uses` field in the Flow YAML to opt in to autoscaling, for example:

```yaml
jtype: Flow
executors:
  - name: executor1
    uses: jinahub+serverless://Executor1
```

The example above uses the following defaults:

| min | max | metric | target |
|-----|-----|-------------|--------|
| 0 | 2 | concurrency | 100 |
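
The two sets of defaults can be summarized in a short Python sketch. `effective_autoscale` is a hypothetical helper written for illustration, not part of JCloud. It assumes that an explicit `autoscale` block overrides the protocol defaults and that an Executor with neither the `jinahub+serverless` protocol nor an `autoscale` block is not autoscaled; the text above confirms neither point.

```python
# Defaults from the "Configurations" table.
BASE_DEFAULTS = {"min": 1, "max": 2, "metric": "concurrency", "target": 100}
SERVERLESS_PREFIX = "jinahub+serverless://"


def effective_autoscale(uses, autoscale=None):
    """Return the autoscale settings an Executor would end up with.

    Hypothetical helper. Assumes explicit settings win over protocol defaults
    and that autoscaling is off when neither mechanism is used.
    """
    if uses.startswith(SERVERLESS_PREFIX):
        # The serverless protocol lowers the minimum to 0 replicas.
        defaults = {**BASE_DEFAULTS, "min": 0}
    elif autoscale is not None:
        defaults = dict(BASE_DEFAULTS)
    else:
        return None  # assumption: no autoscaling requested
    return {**defaults, **(autoscale or {})}


serverless = effective_autoscale("jinahub+serverless://Executor1")
explicit = effective_autoscale(
    "jinahub+docker://Executor1", {"metric": "rps", "target": 50}
)
plain = effective_autoscale("jinahub+docker://Executor1")
```

Here `serverless` matches the defaults table directly above, while `explicit` keeps `min: 1` and `max: 2` from the base defaults and overrides only `metric` and `target`.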

### Serverless

After a Flow is deployed to JCloud with autoscaling configured, you serve it exactly as before. The only difference you are likely to notice is that the initial requests may take a few extra seconds,
since JCloud may need to scale the deployment up behind the scenes. From then on, JCloud handles the scaling and you only need to worry about your code.
199 changes: 0 additions & 199 deletions docs/fundamentals/jcloud/basic.md

This file was deleted.
