🚀 The feature, motivation and pitch
In production scenarios, registering multiple models is a necessary feature: it supports auto-scaling, rolling model updates, and centralized dispatch behind a fixed URL. Previously we used FastChat with vLLM, and it served our purpose well.
But vLLM has been expanding rapidly, adding support for images, video, and other modalities, and its engine arguments keep growing to cover various needs; FastChat's OpenAI-compatible interface seems unable to keep pace with these changes on the vLLM side.
So could we consider hosting something like FastChat's controller feature, where model workers are loosely coupled with the controller, can dynamically join and leave the controller's backend, and the controller chooses the best route for each prompt request?
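To make the proposal concrete, here is a minimal sketch of the controller/worker registration pattern being described. This is a hypothetical illustration (the class, method names, and heartbeat policy are assumptions for this sketch, not FastChat's or vLLM's actual API): workers register themselves and send periodic heartbeats, and the controller routes each request to a live worker for the requested model.

```python
import random
import time

class Controller:
    """Sketch of a FastChat-style controller (hypothetical API, not an
    actual FastChat or vLLM implementation)."""

    def __init__(self, heartbeat_ttl: float = 90.0):
        # Workers that miss heartbeats for this long are considered dead.
        self.heartbeat_ttl = heartbeat_ttl
        # model name -> {worker URL -> timestamp of last heartbeat}
        self.workers: dict[str, dict[str, float]] = {}

    def register(self, model: str, worker_url: str) -> None:
        # A worker announces itself; repeated calls act as heartbeats,
        # so workers can join (and rejoin) dynamically.
        self.workers.setdefault(model, {})[worker_url] = time.monotonic()

    def deregister(self, model: str, worker_url: str) -> None:
        # A worker leaves gracefully, e.g. before a model update.
        self.workers.get(model, {}).pop(worker_url, None)

    def route(self, model: str) -> str:
        # Drop workers whose heartbeat expired, then pick one at random.
        # A real controller could use load-aware policies instead, such
        # as shortest-queue routing.
        now = time.monotonic()
        alive = [url for url, ts in self.workers.get(model, {}).items()
                 if now - ts < self.heartbeat_ttl]
        if not alive:
            raise LookupError(f"no live worker for model {model!r}")
        return random.choice(alive)
```

A fixed-URL API server would then call `route(model)` for each incoming request and proxy it to the returned worker, so clients never need to know which vLLM instances are currently up.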
Alternatives
I'm not sure whether another OpenAI-compatible API server already handles this loosely coupled controller/worker mode well while also keeping pace with vLLM's rapidly changing API.
Additional context
No response