
Native Support for KServe Open Inference REST/gRPC Protocol #2373

Closed
yuzisun opened this issue May 31, 2023 · 4 comments · Fixed by #2609
Labels
kubernetes triaged Issue has been reviewed and triaged

Comments


yuzisun commented May 31, 2023

🚀 The feature

KServe has now rebranded the v2 inference protocol as the Open Inference Protocol (OIP) specification. Can we implement OIP natively in TorchServe, as other model servers such as Triton, MLServer, OpenVINO Model Server, and AMD Inference Server do?

Motivation, pitch

Currently TorchServe places the KServe Python server in front of the TorchServe Netty server to adapt to the KServe v1/v2 REST protocols. However, I think this extra layer provides minimal value and causes numerous maintenance and performance issues. The KServe Python SDK is primarily designed for native Python inference runtimes, where users want to implement arbitrary inference code with pre/post processing. TorchServe already provides comparable custom handlers. There is therefore no good reason to keep both and route all KServe inference requests through KServe Python server -> Netty -> TorchServe Python worker.

Alternatives

No response

Additional context

  • Can we remove the KServe Python wrapper altogether?
  • It appears we can send KServe inference requests directly to the Netty server, since v1/v2 REST requests are already handled there.
  • For gRPC, I believe we would need to implement the OIP gRPC specification natively.
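To make the "send requests directly to the Netty server" idea concrete, here is a minimal sketch of what an OIP (v2) REST inference request looks like. The URL path and JSON body layout follow the Open Inference Protocol specification; the host/port (`localhost:8080`), model name (`mnist`), and input tensor name (`input-0`) are placeholder assumptions, not values from this issue.

```python
import json


def build_oip_infer_request(host, model_name, data, shape, datatype="FP32"):
    """Build an OIP/v2 REST inference request as (url, json_body).

    Path and body layout follow the Open Inference Protocol spec;
    host, model name, and tensor name are placeholders.
    """
    url = f"http://{host}/v2/models/{model_name}/infer"
    body = {
        "inputs": [
            {
                "name": "input-0",     # tensor name the model expects (assumption)
                "shape": shape,        # e.g. [1, 2]
                "datatype": datatype,  # OIP dtype string: FP32, INT64, BYTES, ...
                "data": data,          # flattened tensor contents
            }
        ]
    }
    return url, json.dumps(body)


url, body = build_oip_infer_request("localhost:8080", "mnist", [0.1, 0.2], [1, 2])
print(url)  # http://localhost:8080/v2/models/mnist/infer
```

If TorchServe served this endpoint natively, a client could POST this body with `Content-Type: application/json` straight to the frontend, with no KServe Python wrapper in between.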
@yuzisun yuzisun changed the title Native Support for KServe Open Inference Protocol for REST/gRPC Native Support for KServe Open Inference REST/gRPC Protocol May 31, 2023
@msaroufim msaroufim added kubernetes triaged Issue has been reviewed and triaged labels May 31, 2023

yuzisun commented Jun 14, 2023

@msaroufim Can we set up a call with you to discuss this?


msaroufim commented Jun 14, 2023

Hi @yuzisun, sure! I'd be happy to. I just forwarded this to the core team; let me get back to you with a few times that work.

Might be easiest to email me, it's marksaroufim@meta.com

@gavrissh

Quick suggestion: during my experiments I noticed that with the KServe v1/v2 REST protocol, TorchServe cannot use the dynamic batching done by the Netty server. This causes a performance gap compared to using raw inputs.
Batching support would be ideal. Thanks!


yuzisun commented Jun 14, 2023

Makes sense. The goal of this issue is to remove the KServe wrapper entirely and implement OIP natively in TorchServe. @gavrishp I think it is possible to simply disable the KServe wrapper and send requests directly to the Netty server using the KServe v1/v2 REST protocol; can you try that?
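As a concrete sketch of that experiment: a KServe v1 REST request is a POST to `/v1/models/{name}:predict` with an `{"instances": [...]}` body, per the KServe v1 data-plane spec. The host/port (TorchServe's inference port, typically 8080) and model name below are placeholder assumptions for illustration.

```python
import json


def build_v1_predict_request(host, model_name, instances):
    """Build a KServe v1 REST ':predict' request as (url, json_body).

    Path and body layout follow the KServe v1 data-plane spec;
    host/port and model name are placeholders.
    """
    url = f"http://{host}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances})
    return url, body


# Pointed directly at TorchServe's Netty frontend (no KServe Python
# wrapper in between), sending it would look roughly like:
#   req = urllib.request.Request(url, body.encode(),
#                                headers={"Content-Type": "application/json"})
#   urllib.request.urlopen(req)
url, body = build_v1_predict_request("localhost:8080", "mnist", [{"data": [0, 1, 2]}])
print(url)  # http://localhost:8080/v1/models/mnist:predict
```

Whether the Netty frontend accepts this payload as-is is exactly what the suggested experiment would verify.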
