Open
Labels
- feature: Categorizes issue or PR as related to a new feature.
- needs-priority: Indicates a PR lacks a label and requires one.
- needs-triage: Indicates an issue or PR lacks a label and requires one.
Description
What would you like to be added:
Right now, llmaz is mostly designed for large language models. However, some users may also need to serve traditional models and would prefer a single solution for both. Let's wait for feedback.
References:
- Kserve: https://kserve.github.io/website/latest/modelserving/v1beta1/serving_runtime/
- Seldon: https://docs.seldon.io/projects/seldon-core/en/latest/nav/config/servers.html
The solution is quite similar in both projects: we either implement a serving runtime for each kind of model, just as we do for vLLM, or reuse official ones such as TorchServe.
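To make the idea concrete, here is a minimal sketch of how a runtime could be picked per model kind. It is only an illustration: `ServingRuntime`, `RUNTIMES`, `select_runtime`, and the format names `"llm"` and `"pytorch"` are all hypothetical and do not correspond to any existing llmaz, KServe, or Seldon API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServingRuntime:
    """Hypothetical registry entry; not an actual llmaz API type."""
    name: str
    model_formats: frozenset


# Illustrative registry. The format names are assumptions made for this example.
RUNTIMES = [
    ServingRuntime("vllm", frozenset({"llm"})),
    ServingRuntime("torchserve", frozenset({"pytorch"})),
]


def select_runtime(model_format, runtimes=RUNTIMES):
    """Return the first runtime that declares support for model_format."""
    for runtime in runtimes:
        if model_format in runtime.model_formats:
            return runtime
    raise LookupError(f"no serving runtime registered for format {model_format!r}")
```

With a registry like this, supporting a new kind of model would mostly mean adding one more entry, whether it points at a custom runtime or at an official server.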
Why is this needed:
Completion requirements:
This enhancement requires the following artifacts:
- Design doc
- API change
- Docs update
The artifacts should be linked in subsequent comments.