
Add support for MLServer in the SKLearn predictor #1155

Merged · 37 commits · Nov 6, 2020
Conversation

@adriangonz (Contributor) commented Oct 23, 2020

What this PR does / why we need it:

Adds support for MLServer (and the V2 dataplane) in the v1beta1 version of the SKLearn predictor. Note that this PR will still default all v1beta1 SKLearn InferenceServices to use the V1 dataplane. To enable the V2 protocol, there is a new protocolVersion field in the predictor spec which can be set to v2.

This PR also adds an example worth checking under ./docs/samples/v1beta1/sklearn showcasing how the integration works.
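For illustration, a minimal v1beta1 spec opting into the V2 protocol via the new protocolVersion field could look like the following sketch (the service name is hypothetical; the storageUri is the iris example model referenced later in this thread):

```yaml
apiVersion: serving.kubeflow.org/v1beta1
kind: InferenceService
metadata:
  name: sklearn-irisv2   # hypothetical name
spec:
  predictor:
    sklearn:
      protocolVersion: v2
      storageUri: gs://kfserving-examples/sklearn/iris
```

Leaving protocolVersion unset keeps the current V1 dataplane behaviour, so existing v1beta1 SKLearn services are unaffected.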

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):

Works towards #1111 (it doesn't fix it yet, as it still needs support for XGBoost).

Special notes for your reviewer:

The aim of this PR is to kickstart the discussion on how we want to shape the integration between KFServing and MLServer. This has already surfaced some questions, e.g.:

  • Do we want to maintain backwards compatibility in the data plane with previous versions? Yes, for now.
  • Moving to MMS, do we want to restrict each MLServer pod to a single framework? In other words, do we need an mlserver predictor which supports multiple frameworks?

As such, it would be great to get people's thoughts on this proposal and which aspects they would change.

That's also why it only focuses on SKLearn for now. Once we are happy with how it looks, it should be pretty straightforward to extend to XGBoost.

  1. Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

Release note:

Added support for the V2 dataplane in the SKLearn predictor when using the `v1beta1` version of the `InferenceService` CRDs.

@kubeflow-bot commented

This change is Reviewable

@k8s-ci-robot (Contributor) commented

Hi @adriangonz. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

if extensions.ContainerConcurrency != nil {
	arguments = append(arguments, fmt.Sprintf("%s=%s", constants.ArgumentWorkers,
		strconv.FormatInt(*extensions.ContainerConcurrency, 10)))
}
k.Container.Env = append(
@yuzisun (Member) commented Oct 25, 2020

Does MLServer support command arguments? If we can use the same set of arguments, that helps maintain backwards compatibility; otherwise we can do an if/else check based on the image name.

"image": "gcr.io/kfserving/sklearnserver",
"defaultImageVersion": "v0.4.1",
"image": "docker.io/seldonio/mlserver",
"defaultImageVersion": "0.1.2",
Member commented

Let's still keep the default to the old v1 version for a few releases, and then start defaulting to mlserver once users migrate away from v1. With v1beta1, the user can specify the image and version on the InferenceService spec itself if they want to use mlserver.

@yuzisun (Member) commented Oct 26, 2020

Thanks @adriangonz, this is a great start!

  • I think for the initial v1beta1 release we should still default to kfserver, but we can implement it in a way that both kfserver and mlserver work with the sklearn spec, e.g. if the user specifies the mlserver image and version on the InferenceService spec like the following, then it creates mlserver to load the sklearn model:
sklearn:
    image: seldon.io/mlserver
    runtimeVersion: v0.1.2
    storageUri: gs://kfserving-examples/sklearn/iris

Does the mlserver model repo work the same way as the current kfserver?

  • For MMS, I think we can punt on supporting multiple frameworks until a use case comes up.

@adriangonz (Contributor, Author) commented
Thanks for your comments @yuzisun! To make it more explicit for the end user, do you think it would make sense to have a protocol flag in PredictorExtensionSpec? We could potentially leverage the same flag in Triton to enable/disable the V2 protocol (although I'm not sure if Triton allows that in recent images).

@yuzisun (Member) commented Oct 26, 2020, replying to the above

I think that's a good idea; making it explicit on the spec means the user knows which protocol the model server supports.

  • tensorflow defaults to v1, no v2 support
  • sklearn/xgboost defaults to v1 for now, but the user can specify protocol: v2 to use mlserver
  • triton defaults to v2 (@deadeyegoodwin, Triton no longer supports v1, correct?)
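For context on what opting into v2 changes for clients: the V2 dataplane exposes a POST /v2/models/<model-name>/infer endpoint taking a tensor-oriented JSON body. A minimal sketch of such a request for the iris model (tensor name and values are illustrative, not from this PR) might be:

```json
{
  "inputs": [
    {
      "name": "input-0",
      "shape": [1, 4],
      "datatype": "FP32",
      "data": [6.8, 2.8, 4.8, 1.4]
    }
  ]
}
```

This differs from the V1 `{"instances": [...]}` payload, which is why backwards compatibility and a per-predictor protocol flag are being discussed here.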

@deadeyegoodwin (Contributor) commented Oct 26, 2020 via email

@yuzisun (Member) commented Nov 2, 2020

/ok-to-test

@yuzisun (Member) commented Nov 4, 2020

/retest


@adriangonz (Contributor, Author) commented

Hey @yuzisun , I just amended the overlay to add the new image. I've also added a new integration test for the SKLearn predictor with the V2 protocol.

@yuzisun (Member) commented Nov 4, 2020

/retest

@yuzisun (Member) commented Nov 5, 2020

This is really awesome work! Thanks @adriangonz!
/lgtm

@cliveseldon If this looks good to you, can you approve?

@ukclivecox (Contributor) commented

Indeed. Great addition.
/approve

@ukclivecox (Contributor) commented

/approve

@k8s-ci-robot (Contributor) commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adriangonz, cliveseldon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


7 participants