[feature request] Support different serving servers #14
There are now several inference servers, such as TensorRT Inference Server, GraphPipe, and TensorFlow Serving. Different users may want to use different servers, so I think we should support multiple serving servers.
BTW, some servers support serving multiple models and multiple frameworks. For example, GraphPipe supports TensorFlow, PyTorch, and Caffe. Maybe we should also investigate how to support co-serving multiple models in one serving CRD.
Thanks for the feature request! Given our data model (a high-level "Tensorflow" spec), do you think it would make the most sense to swap between different TensorFlow serving technologies via annotations?
I think this is an anti-pattern in KFServing. A KFServing Service is a "unit" of model serving. That said, the idea of hosting multiple models in a single model server is extremely interesting and something I've been working on in my spare time. I've been thinking we should control this via implementation, not interface, and allow users to specify an annotation like "enable-multitenancy".
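To make the annotation-based approach concrete, a spec could look roughly like the sketch below. This is only an illustration of the idea under discussion; the annotation keys (`serving-server`, `enable-multitenancy`) and field names here are hypothetical, not an actual KFServing API:

```yaml
apiVersion: serving.kubeflow.org/v1alpha1
kind: KFService
metadata:
  name: my-model
  annotations:
    # Hypothetical: pick the backing model server implementation
    # without changing the high-level "tensorflow" spec below.
    serving.kubeflow.org/serving-server: "tensorrt-inference-server"
    # Hypothetical: opt in to co-hosting multiple models in one server,
    # controlled at the implementation level rather than the interface.
    serving.kubeflow.org/enable-multitenancy: "true"
spec:
  default:
    tensorflow:
      modelUri: "gs://my-bucket/my-model"
```

The interface (the `tensorflow` spec) stays stable, while annotations select the implementation behind it.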
I'll open a separate issue for this, since I think it's a huge topic.