Follow up work on ServingRuntime support - PR1 #1924 #1926
Conversation
Force-pushed from df33be2 to 288373c
Hey, thanks for this! I did a quick pass, and it seems the main thing is to get the e2e tests passing. I found some issues which may help with that.

Yes @pvaneck, I am thinking the same.
Force-pushed from bb2e69a to 966c4f1
pkg/controller/v1beta1/inferenceservice/components/predictor.go (Outdated)
```go
// Update image tag if GPU is enabled or runtime version is provided
func UpdateImageTag(container *v1.Container, runtimeVersion string) {
	image := container.Image
	if utils.IsGPUEnabled(container.Resources) &&
```
Hmm, I'm wondering if we should have this GPU block. It might have the unintended side effect of appending `-gpu` to image tags that might not need it, since, in this case, any InferenceService with a `resources.limits` object containing `nvidia.com/gpu` will automatically have `-gpu` appended.

I'm thinking maybe just keep it simple here, and rely on the user passing in the correct `runtimeVersion` or using a runtime that already specifies a GPU-enabled image. WDYT?
Agreed. For example, Triton does not differentiate between CPU and GPU images, while TFServing and TorchServe do. So I think the logic should be the following:
- if `runtimeVersion` is specified, the user should pass in the right GPU image version, like `tfserving:1.14-gpu`
- if `runtimeVersion` is not specified, we can add an optional field for `gpuImage` on `ServingRuntime`?
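To illustrate, here is a minimal sketch of the proposed simplification: the image tag is overridden only when the user explicitly supplies a `runtimeVersion`, and no `-gpu` suffix is inferred from resource limits. The `Container` struct here is a hypothetical stand-in for `v1.Container` from k8s.io/api; this is not the actual KServe implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// Container is a hypothetical stand-in for v1.Container from k8s.io/api.
type Container struct {
	Image string
}

// UpdateImageTag replaces the image tag only when the user has explicitly
// provided a runtimeVersion. No "-gpu" suffix is ever appended automatically;
// GPU images come either from the user's runtimeVersion or from a runtime
// that already specifies a GPU-enabled image.
func UpdateImageTag(container *Container, runtimeVersion string) {
	if runtimeVersion == "" {
		return // keep the image from the ServingRuntime as-is
	}
	// Strip any existing ":tag" suffix, then append the requested version.
	if idx := strings.LastIndex(container.Image, ":"); idx != -1 {
		container.Image = container.Image[:idx]
	}
	container.Image = container.Image + ":" + runtimeVersion
}

func main() {
	c := &Container{Image: "tensorflow/serving:latest"}
	UpdateImageTag(c, "1.14-gpu") // user passes the GPU tag explicitly
	fmt.Println(c.Image)          // tensorflow/serving:1.14-gpu
}
```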
config/runtimes/kserve-mlserver.yaml (Outdated)
```yaml
spec:
  supportedModelTypes:
  - name: sklearn
    version: "2"
```
The sklearn version should be "0" and the xgboost version should be "1". @adriangonz, what are the other frameworks MLServer supports?
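With those corrections applied, the fragment would look roughly like this (a sketch only, using the versions suggested in this comment; the xgboost entry is added here for illustration and may not match the final merged file):

```yaml
spec:
  supportedModelTypes:
  - name: sklearn
    version: "0"
  - name: xgboost
    version: "1"
```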
We could add MLflow and LightGBM to that list, although I'm assuming that'd require changes in the code as well?
@adriangonz the purpose of introducing ServingRuntime is that we no longer need code changes to add new frameworks; we only need to create a new ServingRuntime CR and install it along with KServe.
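As an illustration of that workflow, adding support for a new framework would amount to shipping a new runtime resource like the sketch below. The runtime name, framework name, and image are hypothetical placeholders; field names follow the `supportedModelTypes` shape quoted in this review, which may differ from the final CRD schema.

```yaml
# Hypothetical example: supporting a new framework with no code change,
# just a new ClusterServingRuntime CR installed alongside KServe.
apiVersion: serving.kserve.io/v1alpha1
kind: ClusterServingRuntime
metadata:
  name: example-runtime
spec:
  supportedModelTypes:
  - name: exampleframework
    version: "1"
  containers:
  - name: kserve-container
    image: example/serving-runtime:latest
```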
Force-pushed from 966c4f1 to 057f56f
@pvaneck @yuzisun I have a few questions.
Just some version comments for the ClusterServingRuntimes.
Force-pushed from caff958 to b25d0e0
Thanks @Suresh-Nakkeran. I think this is pretty much ready to merge. A new Developer Certificate of Origin (DCO) check was recently added; can you follow the steps here to get that check passing?
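For reference, the DCO check requires a `Signed-off-by` trailer on every commit. A common way to fix this retroactively (assuming the branch tracks `origin/master`; adjust the base branch to taste) is:

```shell
# Sign off the most recent commit in place
git commit --amend --signoff --no-edit

# Or add a Signed-off-by trailer to every commit on the branch
# (this rewrites history, so a force-push is needed afterwards)
git rebase --signoff origin/master
git push --force-with-lease
```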
- update image tag if GPU enabled or runtime version provided
- update mlserver, tensorflow image versions

Signed-off-by: Suresh Nakkeran <suresh.n@ideas2it.com>
Force-pushed from b25d0e0 to 0345163
/lgtm
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
@Suresh-Nakkeran Awesome work! I added a commit to fix the protocol version. /lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: Suresh-Nakkeran, yuzisun. The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files. Approvers can indicate their approval by writing `/approve` in a comment.
Fixes #1924

Below are the features added as part of this PR: