[FSTORE-1820] Documentation for REST API model deployments #504
Conversation
The URL follows this format:

```text
http://<ISTIO_GATEWAY_IP>/v1/models/<DEPLOYMENT_NAME>:predict
```
This is not 100% correct. The URL format depends on the model server. For example, `/v1/models/<DEPLOYMENT_NAME>:predict` works for Python deployments, but not for TensorFlow or LLMs.

I would suggest something like:

`http://<ISTIO_GATEWAY_IP>/<RESOURCE_PATH>`, where `RESOURCE_PATH` depends on the model server (e.g., vLLM, TensorFlow Serving, KServe sklearnserver).
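For illustration, here is a minimal sketch of calling a Python (sklearn-style) deployment through the Istio gateway, assuming the v1 `:predict` path discussed above. The gateway IP, deployment name, and `Host` header value are hypothetical placeholders, and the resource path would differ for other model servers.

```python
# Minimal sketch: request to a Python (sklearn-style) deployment via the
# Istio gateway, using the KServe v1 ":predict" path discussed above.
# ISTIO_GATEWAY_IP, DEPLOYMENT_NAME, and the Host header value are
# hypothetical placeholders; other model servers (vLLM, TensorFlow
# Serving) expose different resource paths.
import requests

ISTIO_GATEWAY_IP = "192.168.1.10"  # hypothetical gateway address
DEPLOYMENT_NAME = "mymodel"        # hypothetical deployment name

response = requests.post(
    f"http://{ISTIO_GATEWAY_IP}/v1/models/{DEPLOYMENT_NAME}:predict",
    json={"instances": [[1.0, 2.0, 3.0, 4.0]]},  # v1 protocol request body
    # Istio typically routes on the Host header; this value is a guess at
    # the deployment's virtual host, not a documented format.
    headers={"Host": f"{DEPLOYMENT_NAME}.example.hopsworks.ai"},
    timeout=10,
)
response.raise_for_status()
print(response.json())
```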
I updated it based on your suggestion.
## Example Response

The model returns predictions in a JSON object. You can find more information [here](https://kserve.github.io/website/docs/concepts/architecture/data-plane/v1-protocol#response-format).
The model server responses also depend on the model server implementation :)

The `{ "predictions": [] }` format applies to sklearn/xgboost deployments, but TensorFlow Serving or vLLM returns a different format than the one specified in the link.
Aah yes, I see your point. I removed the example I had and updated the text to say that the response format also depends on the model server.
I could not find a link to their Model Serving page, so I pointed to the KServe docs and mentioned that readers can refer there for more information about specific model servers.
Co-authored-by: Javier de la Rúa Martínez <javierdlrm@outlook.com>
REST API - Hopsworks Documentation.pdf