
Add support for MLServer in the SKLearn predictor #1155

Merged · 37 commits · Nov 6, 2020
Conversation

@adriangonz (Contributor) commented Oct 23, 2020

What this PR does / why we need it:

Adds support for MLServer (and the V2 dataplane) in the v1beta1 version of the SKLearn predictor. Note that this PR will still default all v1beta1 SKLearn InferenceServices to use the V1 dataplane. To enable the V2 protocol, there is a new protocolVersion field in the predictor spec which can be set to v2.

This PR also adds an example worth checking under ./docs/samples/v1beta1/sklearn showcasing how the integration works.
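For illustration, a minimal v1beta1 spec opting into the V2 protocol via the new protocolVersion field could look like the following sketch (the service name is hypothetical; the storageUri is the iris example model referenced later in this thread):

```yaml
apiVersion: serving.kubeflow.org/v1beta1
kind: InferenceService
metadata:
  name: sklearn-irisv2   # hypothetical name
spec:
  predictor:
    sklearn:
      protocolVersion: v2
      storageUri: gs://kfserving-examples/sklearn/iris
```

Leaving protocolVersion unset keeps the current V1 dataplane behaviour, so existing v1beta1 SKLearn services are unaffected.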

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):

Works towards #1111 (it doesn't fix it yet, as it still needs support for XGBoost).

Special notes for your reviewer:

The aim of this PR is to kickstart the discussion on how we want to shape the integration between KFServing and MLServer. This has already surfaced some questions, e.g.:

  • Do we want to maintain backwards compatibility in the data plane with previous versions? Yes, for now.
  • Moving to MMS, do we want to restrict each MLServer pod to a single framework? In other words, do we need an mlserver predictor which supports multiple frameworks?

As such, it would be great to get people's thoughts on this proposal and which aspects they would change.

That's also why it only focuses on SKLearn for now. Once we are happy with how it looks, it should be pretty straightforward to extend to XGBoost.

  1. Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

Release note:

Added support for the V2 dataplane in the SKLearn predictor when using the `v1beta1` version of the `InferenceService` CRDs.

@kubeflow-bot commented

This change is Reviewable

@k8s-ci-robot (Contributor) commented

Hi @adriangonz. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

if extensions.ContainerConcurrency != nil {
	arguments = append(arguments, fmt.Sprintf("%s=%s", constants.ArgumentWorkers,
		strconv.FormatInt(*extensions.ContainerConcurrency, 10)))
}
k.Container.Env = append(
@yuzisun (Member) commented Oct 25, 2020

Does MLServer support command arguments? If we can use the same set of arguments, that helps maintain backwards compatibility; otherwise we can do an if/else check based on the image name.

"image": "gcr.io/kfserving/sklearnserver",
"defaultImageVersion": "v0.4.1",
"image": "docker.io/seldonio/mlserver",
"defaultImageVersion": "0.1.2",
Member commented

Let's still keep the default to the old v1 version for a few releases, and then start defaulting to mlserver once users migrate away from v1. With v1beta1, the user can specify the image and version on the InferenceService spec itself if they want to use mlserver.

@yuzisun (Member) commented Oct 26, 2020

Thanks @adriangonz, this is a great start!

  • I think for the initial v1beta1 release we should still default to kfserver, but we can implement it in a way that both kfserver and mlserver work with the sklearn spec, e.g. if the user specifies the mlserver image and version on the InferenceService spec like the following, then it creates mlserver to load the sklearn model:
sklearn:
    image: seldon.io/mlserver
    runtimeVersion: v0.1.2
    storageUri: gs://kfserving-examples/sklearn/iris

Does the mlserver model repo work the same way as the current kfserver?

  • For MMS, I think we can punt on supporting multiple frameworks until a use case comes up.

@adriangonz (Contributor, Author) commented
Thanks for your comments @yuzisun! To make it more explicit for the end user, do you think it would make sense to have a protocol flag in PredictorExtensionSpec? We could potentially leverage the same flag in Triton to enable/disable the V2 protocol (although I'm not sure if Triton allows that in recent images).

@yuzisun (Member) commented Oct 26, 2020, replying to the above

I think that's a good idea; making it explicit on the spec means the user knows which protocol the model server supports.

  • tensorflow defaults to v1, no v2 support
  • sklearn/xgboost defaults to v1 for now, but the user can specify protocol: v2 to use mlserver
  • triton defaults to v2 (@deadeyegoodwin, Triton no longer supports v1, correct?)
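For context on what opting into v2 changes for clients: the V2 dataplane exposes a POST /v2/models/<model-name>/infer endpoint taking a tensor-oriented JSON body. A minimal sketch of such a request for the iris model (tensor name and values are illustrative, not from this PR) might be:

```json
{
  "inputs": [
    {
      "name": "input-0",
      "shape": [1, 4],
      "datatype": "FP32",
      "data": [6.8, 2.8, 4.8, 1.4]
    }
  ]
}
```

This differs from the V1 `{"instances": [...]}` payload, which is why backwards compatibility and a per-predictor protocol flag are being discussed here.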

@deadeyegoodwin (Contributor) commented Oct 26, 2020 via email

@yuzisun (Member) commented Nov 2, 2020

/ok-to-test

@yuzisun (Member) commented Nov 4, 2020

/retest


@adriangonz (Contributor, Author) commented

Hey @yuzisun , I just amended the overlay to add the new image. I've also added a new integration test for the SKLearn predictor with the V2 protocol.

@yuzisun (Member) commented Nov 4, 2020

/retest

@yuzisun (Member) commented Nov 5, 2020

This is really awesome work! Thanks @adriangonz!
/lgtm

@cliveseldon If this looks good to you, can you approve?

@ukclivecox (Contributor) commented

Indeed. Great addition.
/approve

@ukclivecox (Contributor) commented

/approve

@k8s-ci-robot (Contributor) commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adriangonz, cliveseldon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


7 participants