
Add request distributor server #903

Merged · merged 13 commits into InternLM:main on Jan 12, 2024
Conversation

AllentDan (Collaborator)

No description provided.

@AllentDan AllentDan changed the title Add proxy Add proxy server Jan 5, 2024
lmdeploy/constants.py (outdated review thread)
lvhan028 (Collaborator) commented Jan 11, 2024

Running the command:

```shell
python3 lmdeploy/serve/proxy/proxy.py --server-port 33338
```

got the error message:

ImportError: cannot import name 'Doc' from 'typing_extensions' (/usr/local/lib/python3.8/dist-packages/typing_extensions.py)
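
This usually means the installed typing_extensions predates the release that added `Doc`; upgrading the package may resolve it (a suggested fix based on the error alone, not one verified in this thread):

```shell
pip install --upgrade typing_extensions
```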

lvhan028 (Collaborator)

`__init__.py` is missing in the proxy folder
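
A minimal fix, assuming the package layout used by the run command above:

```shell
# create the missing package marker so lmdeploy.serve.proxy is importable
touch lmdeploy/serve/proxy/__init__.py
```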

@@ -0,0 +1,39 @@
# Copyright (c) OpenMMLab. All rights reserved.

Would it be better to put constants.py in lmdeploy/serve/proxy?

Conflicts:
	docs/en/serving/restful_api.md
	docs/zh_cn/serving/restful_api.md
@@ -0,0 +1,39 @@
## Proxy

Suggested H1 title: "Request Distributor Server"

@@ -162,3 +162,7 @@ lmdeploy serve gradio api_server_url --server_name ${gradio_ui_ip} --server_port
5. If you need to adjust other default session parameters, such as the content of fields like `system`, you can pass the initialization parameters of the [dialogue template](https://github.com/InternLM/lmdeploy/blob/main/lmdeploy/model.py) in directly. For example, for the internlm-chat-7b model, you can set the `--meta_instruction` parameter when starting the `api_server` (see the first example after this list).

6. Regarding stop words, we only support characters that encode to a single index. Moreover, multiple indexes may decode to results containing the stop word; in such cases, if the number of these indexes is too large, we only use the index produced by the tokenizer. If you want to use a stop word that encodes to multiple indexes, consider performing string matching on the streaming client side and breaking out of the streaming loop once a match is found (see the second example after this list).
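
To illustrate point 5, a minimal sketch; the `--meta_instruction` flag name comes from the text above, while the model path and instruction text are assumptions for illustration:

```shell
lmdeploy serve api_server ./internlm-chat-7b-workspace --meta_instruction "You are a concise assistant."
```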

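A minimal sketch of the client-side stop-word matching suggested in point 6; the chunk iterable is a stand-in for a real streaming client, not part of the lmdeploy API:

```python
def collect_until_stop(chunks, stop_word: str) -> str:
    """Accumulate streamed text and stop once `stop_word` appears.

    The stop word may arrive split across chunk boundaries, so we match
    against the accumulated text rather than each chunk alone.
    """
    text = ''
    for chunk in chunks:
        text += chunk
        idx = text.find(stop_word)
        if idx != -1:
            return text[:idx]  # trim the stop word and break the stream loop
    return text


# A fake stream where the stop word spans two chunks:
print(collect_until_stop(['Hello, wor', 'ld<|e', 'nd|> ignored'], '<|end|>'))
# -> 'Hello, world'
```
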
### multiple services

Suggested heading: "request distribution service"

@@ -156,3 +156,7 @@ lmdeploy serve gradio api_server_url --server_name ${gradio_ui_ip} --server_port
5. If you need to adjust other default session parameters, such as the content of fields like `system`, you can pass the initialization parameters of the [dialogue template](https://github.com/InternLM/lmdeploy/blob/main/lmdeploy/model.py) in directly. For example, for the internlm-chat-7b model, you can set the `--meta_instruction` parameter when starting the `api_server`.

6. Regarding stop words, we only support characters that encode to a single index. Moreover, multiple indexes may decode to results containing the stop word; in such cases, if there are too many such indexes, we only use the index produced by the tokenizer. If you want a stop word that encodes to multiple indexes, consider performing string matching on the streaming client side and breaking out of the streaming loop once a match succeeds.

### Multiple parallel services

Suggested heading: "Multi-machine parallel service"

@AllentDan AllentDan changed the title Add proxy server Add request distributor server Jan 11, 2024
Start the proxy service:

```shell
python lmdeploy/serve/proxy/proxy.py --server_name {server_name} --server_port {server_port} --strategy "min_expected_latency"
```

Could we use the following form instead?

```shell
python3 -m lmdeploy.serve.proxy --server-name {server_name} --server-port 
```
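
For `python3 -m lmdeploy.serve.proxy` to work, the package would also need a `__main__.py` beside `proxy.py`. A minimal sketch, assuming `proxy.py` exposes a CLI entry point; the name `main` is a hypothetical assumption, not confirmed by this PR:

```python
# lmdeploy/serve/proxy/__main__.py
# Lets `python3 -m lmdeploy.serve.proxy` delegate to the proxy CLI.
from lmdeploy.serve.proxy.proxy import main  # hypothetical entry-point name

if __name__ == '__main__':
    main()
```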

@lvhan028 lvhan028 merged commit bfbaeef into InternLM:main Jan 12, 2024
4 of 8 checks passed