@manu-sj manu-sj commented Aug 28, 2025

@manu-sj manu-sj requested review from SirOibaf and javierdlrm August 28, 2025 12:24

The URL follows this format:
```text
http://<ISTIO_GATEWAY_IP>/v1/models/<DEPLOYMENT_NAME>:predict
```

Contributor

This is not 100% correct. The URL format depends on the model server. For example, `/v1/models/<DEPLOYMENT_NAME>:predict` works for Python deployments, but not for TensorFlow or LLMs.

I would suggest something like:
http://<ISTIO_GATEWAY_IP>/<RESOURCE_PATH>, where RESOURCE_PATH depends on the model server (e.g., vLLM, TensorFlow Serving, KServe sklearnserver).
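
To illustrate the suggested format, here is a minimal Python sketch that joins the gateway address with a server-specific resource path. The helper name `build_inference_url`, the gateway IP, the deployment name, and the entries in `EXAMPLE_RESOURCE_PATHS` are assumptions for illustration only. The actual path should be taken from the documentation of the model server in use.

```python
# Illustrative resource paths only; the real path depends on the model server
# and on how the deployment is exposed, so check the server's documentation.
EXAMPLE_RESOURCE_PATHS = {
    # KServe V1 protocol path, as quoted in the docs under review
    "kserve-v1": "v1/models/{name}:predict",
    # Assumption: OpenAI-compatible route commonly exposed by vLLM
    "vllm-openai": "v1/chat/completions",
}


def build_inference_url(gateway_ip: str, resource_path: str) -> str:
    """Join the Istio gateway address and a server-specific resource path."""
    return f"http://{gateway_ip.rstrip('/')}/{resource_path.lstrip('/')}"


url = build_inference_url(
    "10.0.0.1",  # hypothetical gateway IP
    EXAMPLE_RESOURCE_PATHS["kserve-v1"].format(name="my-model"),  # hypothetical name
)
print(url)
```

Keeping the path as a separate argument means the same helper works for any model server, which matches the `<RESOURCE_PATH>` wording proposed above.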

Contributor Author

I updated it based on your suggestion.


## Example Response

The model returns predictions in a JSON object. You can find more information [here](https://kserve.github.io/website/docs/concepts/architecture/data-plane/v1-protocol#response-format).

Contributor

The model server responses also depend on the model server implementation :)
The { "predictions": [] } format applies to sklearn/xgboost deployments, but TensorFlow Serving or vLLM returns a different format than the one specified in the link.
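
A client that talks to more than one model server could handle this defensively. The sketch below is only an assumption about how that might look: `extract_predictions` is a hypothetical helper that returns the `predictions` list when the response carries one and otherwise hands back the decoded payload unchanged, leaving server-specific formats to the caller.

```python
import json


def extract_predictions(body: str):
    """Return the `predictions` list if present, else the decoded payload unchanged."""
    payload = json.loads(body)
    if isinstance(payload, dict) and "predictions" in payload:
        # Shape described above for sklearn/xgboost deployments
        return payload["predictions"]
    # Any other server-specific shape is passed through untouched
    return payload


print(extract_predictions('{"predictions": [1, 0]}'))
# Hypothetical non-"predictions" payload, used only to show the pass-through
print(extract_predictions('{"choices": []}'))
```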

Contributor Author

Ah yes, I see your point. I removed the example I had and updated the text to say that the response also depends on the model server.

I could not find a link to their model serving page, so I pointed to the KServe docs and mentioned that readers can refer there for more information about any model server.

manu-sj and others added 4 commits August 28, 2025 15:47
Co-authored-by: Javier de la Rúa Martínez <javierdlrm@outlook.com>
Co-authored-by: Javier de la Rúa Martínez <javierdlrm@outlook.com>
Co-authored-by: Javier de la Rúa Martínez <javierdlrm@outlook.com>
manu-sj and others added 2 commits August 28, 2025 17:20
Co-authored-by: Javier de la Rúa Martínez <javierdlrm@outlook.com>
@SirOibaf SirOibaf merged commit 7c0c041 into logicalclocks:main Sep 2, 2025
1 check passed
manu-sj added a commit to manu-sj/logicalclocks.github.io that referenced this pull request Sep 2, 2025
…ocks#504)

Co-authored-by: Javier de la Rúa Martínez <javierdlrm@outlook.com>
SirOibaf pushed a commit that referenced this pull request Sep 2, 2025
Co-authored-by: Javier de la Rúa Martínez <javierdlrm@outlook.com>
3 participants