
Follow up work on ServingRuntime support - PR1 #1924 (#1926)

Merged: 14 commits, Dec 11, 2021

Conversation

@Suresh-Nakkeran (Contributor) commented Nov 26, 2021:

fixes #1924

The following features are added as part of this PR:

  • Template variable support for passing the InferenceService name to the runtime container (a minimal sketch follows this list).
  • More default ClusterServingRuntimes.
  • Logic in the InferenceService mutator to convert the old schema into the new schema.
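For illustration, a minimal sketch of a ClusterServingRuntime using the template variable; the field names follow this PR's era and the image tag is illustrative, so this is not the PR's exact YAML:

```yaml
apiVersion: serving.kserve.io/v1alpha1
kind: ClusterServingRuntime
metadata:
  name: kserve-sklearnserver
spec:
  supportedModelTypes:
    - name: sklearn
      version: "1"
  containers:
    - name: kserve-container
      image: kserve/sklearnserver:latest
      args:
        # {{.Name}} is resolved by the controller to the InferenceService
        # name, so the runtime container serves the model under that name.
        - --model_name={{.Name}}
        - --model_dir=/mnt/models
        - --http_port=8080
```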

@pvaneck (Member) left a comment:

Hey, thanks for this! I did a quick pass, and it seems the main thing is to get the e2e tests passing. I found some issues which may help with that.

@Suresh-Nakkeran (Contributor, Author) replied:

Yes @pvaneck, I am thinking the same.

On this hunk (the quoted diff is truncated at the condition):

```go
// Update image tag if GPU is enabled or runtime version is provided
func UpdateImageTag(container *v1.Container, runtimeVersion string) {
	image := container.Image
	if utils.IsGPUEnabled(container.Resources) && ...
```
A Member left a comment:
Hmm, I'm wondering if we should have this GPU block. It might have the unintended side effect of appending -gpu to image tags that don't need it, since any InferenceService with a resources.limits object containing nvidia.com/gpu will automatically have -gpu appended.

I'm thinking maybe just keep it simple here, and rely on the user passing in the correct runtimeVersion or using a runtime that already specifies a GPU-enabled image. WDYT?
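To make the concern concrete, a sketch of the kind of spec that would trigger the automatic suffix (new-schema field names assumed, names and tags illustrative):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: flower-sample
spec:
  predictor:
    model:
      modelFormat:
        name: tensorflow
      resources:
        limits:
          # With the GPU block in UpdateImageTag, this limit alone appends
          # "-gpu" to the resolved image tag (e.g. tensorflow/serving:2.6.2
          # -> tensorflow/serving:2.6.2-gpu), even for runtimes whose images
          # have no GPU variant.
          nvidia.com/gpu: "1"
```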

@yuzisun (Member) commented Dec 4, 2021:
Agreed. For example, Triton does not differentiate between CPU and GPU images, while TFServing and TorchServe do. So I think the logic should be the following:

  1. If runtimeVersion is specified, the user should pass in the right GPU image version, like tfserving:1.14-gpu.
  2. If runtimeVersion is not specified, we can add an optional field for gpuImage on the ServingRuntime (sketched below)?
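A sketch of what option 2 might look like on a ServingRuntime; note that gpuImage is only a proposal in this thread, not an implemented field, and the images shown are illustrative:

```yaml
spec:
  supportedModelTypes:
    - name: tensorflow
      version: "2"
  containers:
    - name: kserve-container
      # Default CPU image.
      image: tensorflow/serving:2.6.2
      # Hypothetical optional field (option 2 above): would be used in place
      # of `image` when the InferenceService requests nvidia.com/gpu resources.
      # gpuImage: tensorflow/serving:2.6.2-gpu
```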

On this hunk in the MLServer runtime config:

```yaml
spec:
  supportedModelTypes:
    - name: sklearn
      version: "2"
```
A Member commented:
sklearn version should be "0" and xgboost version should be "1".
@adriangonz, which other frameworks does MLServer support?
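With those corrections applied, the hunk would read as follows (the xgboost entry is shown for comparison; the full format list depends on the answer below):

```yaml
spec:
  supportedModelTypes:
    - name: sklearn
      version: "0"
    - name: xgboost
      version: "1"
```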

@adriangonz (Contributor) replied:
We could add MLflow and LightGBM to that list, although I'm assuming that would require changes in the code as well?

@yuzisun (Member) replied:
@adriangonz, the purpose of introducing ServingRuntime is that we no longer need a code change to add new frameworks; we only need to create a new ServingRuntime CR and install it along with KServe (sketched below).
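As a sketch of that workflow, adding MLServer's MLflow support would just mean shipping another runtime CR like the one below, with no controller changes. The name, version, and image tag are assumptions for illustration; MLSERVER_MODEL_IMPLEMENTATION is MLServer's standard environment variable for selecting a runtime class:

```yaml
apiVersion: serving.kserve.io/v1alpha1
kind: ClusterServingRuntime
metadata:
  name: kserve-mlserver-mlflow
spec:
  supportedModelTypes:
    - name: mlflow
      version: "1"
  containers:
    - name: kserve-container
      image: seldonio/mlserver:0.5.3
      env:
        # Select MLServer's MLflow runtime implementation for this container.
        - name: MLSERVER_MODEL_IMPLEMENTATION
          value: mlserver_mlflow.MLflowRuntime
```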

@Suresh-Nakkeran (Contributor, Author) commented:
@pvaneck @yuzisun I have a few questions.

  1. In pkg/controller/v1beta1/inferenceservice/components/predictor.go (line 105), when no explicit runtime is provided, we get the list of supported serving runtime specs, which is not in the same order every time since we are not sorting it. This causes an issue when we deploy an ISVC with the new schema using just the framework name and version.
    Do we need to address this issue?

  2. Are we not allowed to filter ServingRuntimes based on protocol version (V1, V2)?

@pvaneck (Member) commented Dec 7, 2021:

  1. Hmm, I didn't realize the ordering was non-deterministic. I'm assuming you are getting a different serving runtime each time you deploy a given model? If we want consistency in which SR is selected, then sorting would do the trick. Although maybe a random selection could be considered a feature, not a bug? @njhill, how do you think this case should be handled?

  2. I don't believe we have any mechanism for filtering v1 vs. v2; the ServingRuntimes are agnostic to that detail. And I believe v1 will eventually be phased out.

@pvaneck (Member) left a comment:

Just some version comments for the ClusterServingRuntimes.

@Suresh-Nakkeran (Contributor, Author) commented:
@pvaneck @yuzisun I updated the supported model versions. Can you please review and let me know if I missed any?

@pvaneck (Member) commented Dec 9, 2021:

Thanks @Suresh-Nakkeran. I think this is pretty much ready to merge. A new Developer Certificate of Origin (DCO) check was recently added. Can you follow the steps here to get that check passing?

Signed-off-by: Suresh Nakkeran <suresh.n@ideas2it.com> (repeated on each commit)

 - update image tag if gpu enabled or runtime version provided
 - update mlserver, tensorflow image versions
@pvaneck (Member) commented Dec 10, 2021:

/lgtm

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
@yuzisun (Member) commented Dec 11, 2021:

@Suresh-Nakkeran Awesome work!

I added a commit to fix the protocol version.

/lgtm
/approve

@kserve-oss-bot (Collaborator) commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Suresh-Nakkeran, yuzisun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
