Fixing try it out curl #929
base: main
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sathvikpurushotham

The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files.

Approvers can indicate their approval by writing /approve in a comment.
Welcome @sathvikpurushotham!
Hi @sathvikpurushotham. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
✅ Deploy Preview for gateway-api-inference-extension ready!
PORT=80

curl -i ${IP}:${PORT}/v1/completions -H 'Content-Type: application/json' -d '{
  "model": "food-review",
Can we use model food-review-1? And then we just need to make sure all deployments have an adapter with that name.
Yeah, I think it would be a good idea to use the food-review-1 model. I will update this PR.
Doesn't this just break CPU deployments?

A more meta question: do we want/need to support CPU deployments now that we have the sim server? They served a similar purpose.
@kfswain we rolled back to food-review.

(But to answer your question, the CPU deployment has some inconsistencies and currently has both food-review and food-review-1. There is a separate open issue to handle the inconsistencies.)
@sathvikpurushotham this LGTM, you just need to have the CLA authorization set up.

/lgtm
/ok-to-test
@nirrozenbaum, thank you. I am in the process of requesting a CCLA from my organisation (IBM). Will get the CLA authorisation set up asap.
site-src/guides/index.md (Outdated)
PORT=80

curl -i ${IP}:${PORT}/v1/completions -H 'Content-Type: application/json' -d '{
  "model": "food-review-1",
PTAL at the InferenceModel manifest used by the quickstart guide. The curl request should target the food-review model. The EPP will change the model name to food-review-1 based on the food-review InferenceModel configuration.
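For reference, the mapping described above lives in the quickstart's InferenceModel resource: the public model name that clients send is rewritten by the EPP to a target adapter. A sketch of that shape follows; the API version, pool name, and weight are assumptions based on the v1alpha2 API, not copied from this PR:

```
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferenceModel
metadata:
  name: food-review
spec:
  modelName: food-review          # the name clients put in the curl request body
  poolRef:
    name: vllm-llama3-8b-instruct # assumed InferencePool name
  targetModels:
  - name: food-review-1           # adapter the EPP rewrites the request to
    weight: 100
```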
@danehans I wasn't looking at the InferenceModel resources, so I suggested food-review-1. You're right 👍🏻
@sathvikpurushotham have you tested this change (with #929 (comment)) by following the quickstart guide and using the CPU-based vLLM deployment option?
@sathvikpurushotham PTAL at #933 if you're interested in helping with a similar fix.
New changes are detected. LGTM label has been removed.
@danehans @nirrozenbaum I have updated the model name back to food-review.
@danehans No, I have been struggling a bit to set up my local system to run this, so I haven't been able to test it.
/test pull-gateway-api-inference-extension-test-e2e-main
Issue link: #907

As mentioned in the issue, the quickstart guide currently has separate tabs to test the GPU and CPU-based deployment options, but both deployment options should use the same model name.

After the fix, the Try it out section has only one curl command that can be used. The tab sections have also been removed.
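For context, the single unified request looks roughly like this. Only the endpoint path and the food-review model name come from this PR; the gateway address placeholder, prompt text, and sampling parameters below are illustrative assumptions:

```shell
# Assumes IP and PORT point at the deployed inference gateway
# (values here are placeholders, not from the PR).
IP=<gateway-ip>
PORT=80

curl -i ${IP}:${PORT}/v1/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "food-review",
    "prompt": "Write a short review of a restaurant",
    "max_tokens": 100,
    "temperature": 0
  }'
```

The same command now works for both the GPU and CPU deployment options, since the EPP resolves food-review to the adapter actually served by each deployment.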