Refactor the vllm specific code to become model server agnostic #383
/assign @BenjaminBraunDev
@ahg-g: GitHub didn't allow me to assign the following users: BenjaminBraunDev. Note that only kubernetes-sigs members with read permissions, repo collaborators, and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can have at most 10 assignees at a time. (In response to the `/assign` command above.)
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Commenting to allow assignment!
/assign @BenjaminBraunDev
Made a PR for this: #461
/close
@liu-cong: Closing this issue. (In response to the `/close` command above.)
Currently the EPP hardcodes its metric scraping to vLLM's metric names. To support other model servers, we should make this logic configurable via flags, allowing integration with any model server that adheres to the model server protocol defined in https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/docs/proposals/003-endpoint-picker-protocol/README.md