Support Optional Configurations for Embedding models #1661

Open
zhangever opened this issue Jun 17, 2024 · 3 comments

@zhangever

Is your feature request related to a problem? Please describe

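I deployed the BCE embedding model with 3 replicas spread across three RTX 3090 GPUs, but throughput is nowhere near 3x that of a single replica. In practice, throughput is the same as with a single GPU.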
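
To put a number on the gap described above, one option is to compute scaling efficiency: the measured speedup divided by the replica count. The sketch below is only an illustration. The function name `scaling_efficiency` and the throughput figures are hypothetical and are not measurements from this report.

```python
def scaling_efficiency(single_replica_rps: float,
                       multi_replica_rps: float,
                       replicas: int) -> float:
    """Return the fraction of ideal linear speedup actually achieved.

    1.0 means perfect linear scaling; 1/replicas means no gain at all.
    """
    if single_replica_rps <= 0 or replicas <= 0:
        raise ValueError("throughput and replica count must be positive")
    speedup = multi_replica_rps / single_replica_rps
    return speedup / replicas


if __name__ == "__main__":
    # Hypothetical requests-per-second figures, chosen to mirror the
    # "same throughput as a single GPU" situation reported here.
    efficiency = scaling_efficiency(100.0, 100.0, 3)
    print(f"scaling efficiency: {efficiency:.2f}")
```

An efficiency close to 1/3 with three replicas would suggest that the bottleneck sits in front of the GPUs rather than in the model itself.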

Describe the solution you'd like

I would like optional configuration settings to be added, for example a request limit.

Describe alternatives you've considered

N/A

Additional context

N/A

@XprobeBot XprobeBot added this to the v0.12.2 milestone Jun 17, 2024
@codingl2k1
Contributor

When you run 3 replicas on 3 GPUs, are any processes pegged at 100% CPU?

@zhangever
Author

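When you run 3 replicas on 3 GPUs, are any processes pegged at 100% CPU?

Only the load-test requests were running at the time; there was no other traffic.
GPU utilization is low: each card is busy only intermittently, and even at peak it stays below 10%.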

@codingl2k1
Contributor

I mean: is CPU usage at 100% for any process?

@XprobeBot XprobeBot modified the milestones: v0.12.2, v0.12.4 Jun 28, 2024