
Support concurrent output of conversation tokens #1380

Open
5 tasks done
beijingtl opened this issue Apr 26, 2024 · 0 comments
Labels
enhancement New feature or request

Comments


beijingtl commented Apr 26, 2024

Routine checks

  • I have confirmed there is no similar existing issue
  • I have confirmed I have upgraded to the latest version
  • I have read the project README in full and confirmed the current version cannot meet this need
  • I understand and am willing to follow up on this issue, helping with testing and providing feedback
  • I understand and accept the above, and I understand the maintainers' time is limited; issues that do not follow the rules may be ignored or closed directly

Feature description

When external API calls reach one-api at the same moment, and one-api contains two identically defined channels (same model name, but backed by two separate large-model Docker services), one-api still emits tokens "sequentially". For example, only after request 1 has finished streaming its tokens does the token output for request 2 begin.

Based on "load balancing" logic, when a single one-api API is called (same name, URL, and authorization), could load distribution be added automatically so that the token streams of both requests are output concurrently?

Use case

When high-concurrency requests hit the same one-api instance, multiple large-language-model instances can be configured underneath it to raise concurrency capacity. But this requires one-api to support "concurrent" token output for the same API.

@beijingtl beijingtl added the enhancement New feature or request label Apr 26, 2024