From ec3a700c632dba97c01529c0049c383d68636aa8 Mon Sep 17 00:00:00 2001
From: felix-wang <35718120+numb3r3@users.noreply.github.com>
Date: Mon, 18 Apr 2022 15:15:21 +0800
Subject: [PATCH] chore: update docs (#684)

* chore: update docs

* chore: update changelog
---
 docs/changelog/index.md    | 12 ++++++++++++
 docs/user-guides/server.md |  6 +++---
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/docs/changelog/index.md b/docs/changelog/index.md
index 32520555a..c35d18b81 100644
--- a/docs/changelog/index.md
+++ b/docs/changelog/index.md
@@ -4,3 +4,15 @@ CLIP-as-service follows semantic versioning. However, before the project reach 1
 
 This chapter only tracks the most important breaking changes and explain the rationale behind them.
 
+## 0.2.0: improve the service scalability with replicas
+
+This change is mainly intended to improve the inference performance with replicas.
+
+Here is the short benchmark summary of the improvement (`replicas=4`):
+
+| batch_size | before | after |
+|-------------|--------|---------|
+| 1 | 23.74 | 18.89 |
+| 8 | 58.88 | 30.38 |
+| 16 | 14.96 | 91.86 |
+| 32 | 14.78 | 101.75 |
diff --git a/docs/user-guides/server.md b/docs/user-guides/server.md
index 2b8c9e8a6..49220f346 100644
--- a/docs/user-guides/server.md
+++ b/docs/user-guides/server.md
@@ -184,9 +184,9 @@ There are also runtime-specific parameters listed below:
 
 ````{tab} ONNX
 
-| Parameter | Description |
-|-----------|---------------------------------------------------------------------------------------------------|
-| `providers` | [ONNX runtime provides](https://onnxruntime.ai/docs/execution-providers/), default is auto-detect |
+| Parameter | Description |
+|-----------|--------------------------------------------------------------------------------------------------------------------------------|
+| `device` | `cuda` or `cpu`. Default is `None` means auto-detect.
 
 ````
 