
[Feature]: integrated model controller panel support? #4226

Open
leiwen83 opened this issue Apr 20, 2024 · 0 comments · May be fixed by #4861

Comments

@leiwen83
Contributor

leiwen83 commented Apr 20, 2024

🚀 The feature, motivation and pitch

In production scenarios, registering multiple models is a needed feature: it enables auto-scaling, rolling model updates, and centralized service dispatch behind a fixed URL. Previously we used fastchat with vllm, and it served our purpose well.

But vllm is now expanding rapidly in its LLM support (images/video, etc.), and its engine args keep growing to cover various needs; the OpenAI interface provided by fastchat seems unable to keep pace with the changes on the vllm side.

So shall we consider hosting some functionality like fastchat's controller, where model workers are loosely coupled with the controller and can dynamically register with and leave the controller's backend, while the controller chooses the best route for a given prompt request?
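
For illustration, here is a minimal sketch of the kind of loosely coupled controller described above, roughly in the style of fastchat's controller/worker protocol. Everything in it is an assumption for the sake of the sketch: the endpoint names (`/register_worker`, `/heartbeat`, `/get_worker`), the `WorkerInfo` fields, and the least-loaded routing policy are hypothetical, not an existing vllm or fastchat API.

```python
# Hypothetical controller sketch: workers register themselves, send
# heartbeats, and the controller picks a live worker per model at dispatch
# time. Not vllm's actual API; endpoint names and fields are illustrative.
import time
from typing import Dict, List, Tuple

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

HEARTBEAT_TTL = 60  # seconds of silence before a worker is considered dead

app = FastAPI()


class WorkerInfo(BaseModel):
    worker_url: str        # e.g. "http://10.0.0.5:8000" (hypothetical)
    models: List[str]      # model names this worker serves
    queue_length: int = 0  # self-reported load, used for routing


# worker_url -> (latest WorkerInfo, last heartbeat timestamp)
workers: Dict[str, Tuple[WorkerInfo, float]] = {}


@app.post("/register_worker")
def register_worker(info: WorkerInfo):
    # A worker calls this on startup; calling it again simply re-registers.
    workers[info.worker_url] = (info, time.time())
    return {"status": "ok"}


@app.post("/heartbeat")
def heartbeat(info: WorkerInfo):
    # Workers refresh their entry periodically; stale entries age out below.
    workers[info.worker_url] = (info, time.time())
    return {"status": "ok"}


@app.get("/get_worker")
def get_worker(model: str):
    # Route to the least-loaded live worker that serves the requested model.
    now = time.time()
    live = [
        info for info, ts in workers.values()
        if now - ts < HEARTBEAT_TTL and model in info.models
    ]
    if not live:
        raise HTTPException(status_code=404, detail=f"no worker for {model}")
    best = min(live, key=lambda w: w.queue_length)
    return {"worker_url": best.worker_url}
```

In this sketch a worker would POST its `WorkerInfo` to `/register_worker` on startup and to `/heartbeat` every few seconds, while an OpenAI-compatible frontend calls `/get_worker` to pick a backend and proxies the request to it. Workers that stop heartbeating simply age out, which gives the dynamic leave/register behavior described above.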

Alternatives

I'm not sure whether any other OpenAI API server handles this loosely coupled controller/worker mode well while also staying in sync with vllm's rapidly changing API.

Additional context

No response
