Follow up work on ServingRuntime support - PR1 #1924 #1926
Conversation
Force-pushed from df33be2 to 288373c
Hey, thanks for this! I did a quick pass, and it seems the main thing is to get the e2e tests passing. I found some issues which may help with that.

Yes @pvaneck, I am thinking the same.
Force-pushed from bb2e69a to 966c4f1
pkg/controller/v1beta1/inferenceservice/components/predictor.go (Outdated)
```go
// Update image tag if GPU is enabled or runtime version is provided
func UpdateImageTag(container *v1.Container, runtimeVersion string) {
	image := container.Image
	if utils.IsGPUEnabled(container.Resources) &&
```
Hmm, I'm wondering if we should have this GPU block. It might have the unintended side effect of appending `-gpu` to image tags that might not need it, since, in this case, any InferenceService with a `resources.limits` object containing `nvidia.com/gpu` will automatically have `-gpu` appended.

I'm thinking maybe just keep it simple here, and rely on the user passing in the correct `runtimeVersion` or using a runtime that already specifies a GPU-enabled image. WDYT?
Agreed. For example, Triton does not differentiate between CPU and GPU images, while TFServing and TorchServe do. So I think the logic should be the following:
- if `runtimeVersion` is specified, the user should pass in the right GPU image version, like `tfserving:1.14-gpu`
- if `runtimeVersion` is not specified, we can add an optional field for `gpuImage` on `ServingRuntime`?
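To illustrate, here is a minimal sketch of the proposed simplification: the image tag is overridden only when the user explicitly supplies a `runtimeVersion`, and no `-gpu` suffix is inferred from resource limits. The `Container` struct here is a hypothetical stand-in for `v1.Container` from k8s.io/api; this is not the actual KServe implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// Container is a hypothetical stand-in for v1.Container from k8s.io/api.
type Container struct {
	Image string
}

// UpdateImageTag replaces the image tag only when the user has explicitly
// provided a runtimeVersion. No "-gpu" suffix is ever appended automatically;
// GPU images come either from the user's runtimeVersion or from a runtime
// that already specifies a GPU-enabled image.
func UpdateImageTag(container *Container, runtimeVersion string) {
	if runtimeVersion == "" {
		return // keep the image from the ServingRuntime as-is
	}
	// Strip any existing ":tag" suffix, then append the requested version.
	if idx := strings.LastIndex(container.Image, ":"); idx != -1 {
		container.Image = container.Image[:idx]
	}
	container.Image = container.Image + ":" + runtimeVersion
}

func main() {
	c := &Container{Image: "tensorflow/serving:latest"}
	UpdateImageTag(c, "1.14-gpu") // user passes the GPU tag explicitly
	fmt.Println(c.Image)          // tensorflow/serving:1.14-gpu
}
```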
config/runtimes/kserve-mlserver.yaml (Outdated)
```yaml
spec:
  supportedModelTypes:
  - name: sklearn
    version: "2"
```
The sklearn version should be "0" and the xgboost version should be "1". @adriangonz, what are the other frameworks MLServer supports?
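With those corrections applied, the fragment would look roughly like this (a sketch only, using the versions suggested in this comment; the xgboost entry is added here for illustration and may not match the final merged file):

```yaml
spec:
  supportedModelTypes:
  - name: sklearn
    version: "0"
  - name: xgboost
    version: "1"
```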
We could add MLflow and LightGBM to that list, although I'm assuming that'd require changes in the code as well?
@adriangonz the purpose of introducing ServingRuntime is that we no longer need code changes to add new frameworks; we only need to create a new ServingRuntime CR and install it along with KServe.
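As an illustration of that workflow, adding support for a new framework would amount to shipping a new runtime resource like the sketch below. The runtime name, framework name, and image are hypothetical placeholders; field names follow the `supportedModelTypes` shape quoted in this review, which may differ from the final CRD schema.

```yaml
# Hypothetical example: supporting a new framework with no code change,
# just a new ClusterServingRuntime CR installed alongside KServe.
apiVersion: serving.kserve.io/v1alpha1
kind: ClusterServingRuntime
metadata:
  name: example-runtime
spec:
  supportedModelTypes:
  - name: exampleframework
    version: "1"
  containers:
  - name: kserve-container
    image: example/serving-runtime:latest
```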
Force-pushed from 966c4f1 to 057f56f
@pvaneck @yuzisun I have a few questions.
Just some version comments for the ClusterServingRuntimes.
Force-pushed from caff958 to b25d0e0
Thanks @Suresh-Nakkeran. I think this is pretty much ready to merge. A new Developer Certificate of Origin (DCO) check was recently added; can you follow the steps here to get that check passing?
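For reference, the DCO check requires a `Signed-off-by` trailer on every commit. A common way to fix this retroactively (assuming the branch tracks `origin/master`; adjust the base branch to taste) is:

```shell
# Sign off the most recent commit in place
git commit --amend --signoff --no-edit

# Or add a Signed-off-by trailer to every commit on the branch
# (this rewrites history, so a force-push is needed afterwards)
git rebase --signoff origin/master
git push --force-with-lease
```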
- update image tag if GPU enabled or runtime version provided
- update mlserver, tensorflow image versions

Signed-off-by: Suresh Nakkeran <suresh.n@ideas2it.com>
Force-pushed from b25d0e0 to 0345163
/lgtm
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
@Suresh-Nakkeran Awesome work! I added a commit to fix the protocol version. /lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: Suresh-Nakkeran, yuzisun. The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files. Approvers can indicate their approval by writing `/approve` in a comment.
Fixes #1924

Below are the features added as part of this PR: