docs(jcloud): autoscale docs #5056

Merged 2 commits on Aug 11, 2022

1 change: 1 addition & 0 deletions docs/fundamentals/jcloud/advanced.md
@@ -42,6 +42,7 @@ executors:
capacity: on-demand
```

(external-executors)=
## External executors

You can also expose only the Executors by setting `expose_gateway` to `False`. Read more about {ref}`External Executors <external-executors>`.
56 changes: 56 additions & 0 deletions docs/fundamentals/jcloud/autoscale.md
@@ -0,0 +1,56 @@
# Autoscaling

JCloud offers demand-based autoscaling, built on its underlying Kubernetes architecture. This lets you run [serverless](https://en.wikipedia.org/wiki/Serverless_computing) deployments cost-effectively, without having to work out the [right number of replicas](https://docs.jina.ai/how-to/scale-out/#scale-out-your-executor) yourself.

## Serverless

Autoscaling can be enabled by using the `jinahub+serverless` protocol for the Executor's `uses` in the Flow YAML, such as:

```yaml
jtype: Flow
executors:
  - name: executor1
    uses: jinahub+serverless://Executor1
```

JCloud Autoscaling leverages [Knative](https://knative.dev/docs/) behind the scenes, and `jinahub+serverless` uses a set of Knative configurations as defaults:

```{note}
For more information about the Knative Autoscaling configurations, please visit [Knative Autoscaling](https://knative.dev/docs/serving/autoscaling/).
```

| Name | Value | Description |
|--------|-------------|-------------------------------------------------|
| min | 0 | Minimum number of replicas (0 means serverless) |
| max | 2 | Maximum number of replicas |
| metric | concurrency | Metric for scaling |
| target | 100 | Target number after which replicas autoscale |
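
To make the interaction between these four values concrete, here is a minimal Python sketch. It assumes a simplified model in which the desired replica count is the observed metric divided by `target`, rounded up and clamped to `[min, max]`. The function name `desired_replicas` is hypothetical, and Knative's actual autoscaler is more involved (for example, it averages the metric over time windows), so treat this as an illustration rather than the real algorithm.

```python
import math


def desired_replicas(observed: float, target: float = 100,
                     min_replicas: int = 0, max_replicas: int = 2) -> int:
    """Simplified model (an assumption, not Knative's real algorithm).

    `observed` is the current total value of the scaling metric, e.g. the
    number of in-flight requests when the metric is `concurrency`.
    Defaults mirror the `jinahub+serverless` table above.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    # Replicas needed so that each one handles at most `target`.
    wanted = math.ceil(observed / target)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    for load in (0, 1, 100, 101, 1000):
        print(load, desired_replicas(load))
```

Under this model, no traffic means zero replicas (the serverless case), and the count never exceeds `max` however high the load goes.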

## Configurations

If the `jinahub+serverless` defaults don't meet your requirements, you can customize the autoscaling configuration per Executor by using the `autoscale` argument under `jcloud` in the Flow YAML, such as:

```yaml
jtype: Flow
executors:
  - name: executor1
    uses: jinahub+docker://Executor1
    jcloud:
      autoscale:
        min: 1
        max: 2
        metric: rps
        target: 50
```

Below are the defaults and requirements for the configurations:

| Name | Default | Allowed |
|--------|-------------|--------------------------|
| min | 1 | int |
| max | 2 | int, up to 5 |
| metric | concurrency | `concurrency` / `rps` |
| target | 100 | int |
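
If you generate Flow YAML programmatically, it can help to check an `autoscale` block against this table before deploying. The sketch below is one possible way to do that in plain Python. The helper `resolve_autoscale` is hypothetical and not part of JCloud, and the `min <= max` check is an assumption that the table does not state.

```python
# Defaults and limits taken from the table above.
DEFAULTS = {"min": 1, "max": 2, "metric": "concurrency", "target": 100}
ALLOWED_METRICS = ("concurrency", "rps")
MAX_REPLICAS_LIMIT = 5


def resolve_autoscale(config=None):
    """Merge user settings with the defaults and validate the result.

    Hypothetical helper for illustration; JCloud performs its own validation.
    """
    merged = {**DEFAULTS, **(config or {})}
    unknown = set(merged) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown autoscale keys: {sorted(unknown)}")
    for key in ("min", "max", "target"):
        value = merged[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{key} must be an int, got {value!r}")
    if merged["max"] > MAX_REPLICAS_LIMIT:
        raise ValueError(f"max must be at most {MAX_REPLICAS_LIMIT}")
    # Assumption: min should not exceed max (not stated in the table).
    if merged["min"] > merged["max"]:
        raise ValueError("min must not exceed max")
    if merged["metric"] not in ALLOWED_METRICS:
        raise ValueError(f"metric must be one of {ALLOWED_METRICS}")
    return merged


if __name__ == "__main__":
    print(resolve_autoscale({"metric": "rps", "target": 50}))
```

Omitted keys fall back to the defaults, so passing only `metric` and `target` yields a complete configuration.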

After you deploy to JCloud with an autoscaling configuration, serving the Flow works the same as before. The only difference you are likely to notice is that the initial requests may take a few extra seconds, since JCloud may need to scale the deployment behind the scenes. From then on, JCloud handles the scaling and you only need to worry about the code.
199 changes: 0 additions & 199 deletions docs/fundamentals/jcloud/basic.md

This file was deleted.