Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix validation for custom storageUri #3134

Merged

Conversation

greenmoon55
Copy link
Contributor

@greenmoon55 greenmoon55 commented Sep 20, 2023

What this PR does / why we need it:
We introduced storage container CRD in #3060 which allows users to use any custom storage uri and customize the container spec.
For more information, please take a look at the design doc and the default StorageContainer CR.

However, we reject storage uris which are supported in a storage container CR because we have another validation in the validatingwebhook, which rejects uris not supported by the default storage initializer image.

This PR fixes the validation to allow custom uris that are supported by a storage container CR, and moves the validation logic from validatingwebhook to the controller.

Alternatives considered

  1. Move the validation to storage initailizer injector (pod mutator) [WIP] Move storage uri check from validator to pod mutator #3131. It was abandoned because the error only appears in the logs which is confusing to users.
  2. Fetch storage container CRs in validating webhook. This seems promising but there are some issues.
    • Creating an isvc might take longer (not sure)
    • It could be a bigger change because we need to pass the kubebuilder client to many places (search Validate()) including tests.
    • It is a good idea to validate the isvc based on another CR (storage container) in the validating webhook?

Breaking change

Some invalid Azure uris will pass the validation if they have the default StorageContainer CR installed because the last line allows all http(s) uris.

Examples:

https://blob.core.windows.net/triton/simple_string/
https://foo.blob.core.windows.net/"

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #3121

Type of changes
Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Feature/Issue validation/testing:

Please describe the tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.

Test 1 Custom storage uri + custom StorageContainer CR

apiVersion: "serving.kserve.io/v1alpha1"
kind: ClusterStorageContainer
metadata:
  name: abc
spec:
  container:
    name: storage-initializer
    image: kserve/storage-initializer:latest
    resources:
      requests:
        memory: 100Mi
        cpu: 100m
      limits:
        memory: 1Gi
        cpu: "1"
  supportedUriFormats:
    - prefix: abc://
aiVersion: "serving.kserve.io/v1beta1"
kind: InferenceService
metadata: 
    name: sklearn-iris
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: "abc://kfserving-examples/models/sklearn/1.0/model"

Inference service created and no error in the controller

Test 2 Custom storage uri + default StorageContainer CR + custom StorageContainer CR

Same result as above.

Test 3 Custom storage uri + no StorageContainer CR

➜  ~ k get isvc
NAME           URL   READY   PREV   LATEST   PREVROLLEDOUTREVISION   LATESTREADYREVISION   AGE
sklearn-iris                                                                               25s
➜  ~ k describe isvc sklearn-iris
Name:         sklearn-iris
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  serving.kserve.io/v1beta1
Kind:         InferenceService
Metadata:
  Creation Timestamp:  2023-09-22T00:49:23Z
  Finalizers:
    inferenceservice.finalizers
  Generation:        1
  Resource Version:  167831
  UID:               fc9100e6-4816-45d1-a41a-9bf9da26bbf6
Spec:
  Predictor:
    Model:
      Model Format:
        Name:  sklearn
      Name:    
      Resources:
      Storage Uri:  abc://kfserving-examples/models/sklearn/1.0/model
Events:
  Type     Reason         Age                 From                Message
  ----     ------         ----                ----                -------
  Warning  InternalError  10s (x13 over 31s)  v1beta1Controllers  StorageURI not supported: storageUri, must be one of: [gs://, s3://, pvc://, file://, https://, http://, hdfs://, webhdfs://] or match https://{}.blob.core.windows.net/{}/{} or be an absolute or relative local path. StorageUri [abc://kfserving-examples/models/sklearn/1.0/model] is not supported.

Test 4 Supported uri + no StorageContainer CR

apiVersion: "serving.kserve.io/v1beta1"
kind: InferenceService
metadata: 
    name: sklearn-iris
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: "gs://kfserving-examples/models/sklearn/1.0/model"
➜  ~ k get isvc
NAME           URL                                       READY   PREV   LATEST   PREVROLLEDOUTREVISION   LATESTREADYREVISION            AGE
sklearn-iris   http://sklearn-iris.default.example.com   True           100                              sklearn-iris-predictor-00001   59s

Test 5 Supported uri + default StorageContainer CR

Same result as above

  • Logs

Special notes for your reviewer:

  1. Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

Checklist:

  • Have you added unit/e2e tests that prove your fix is effective or that this feature works?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?

Release note:

Fix validation for custom storageUri. Now users to use any custom storageUri defined in storage container CRs.

Signed-off-by: Jin Dong <jdong183@bloomberg.net>
Signed-off-by: Jin Dong <jdong183@bloomberg.net>
Signed-off-by: Jin Dong <jdong183@bloomberg.net>
func TestAzureBlobNoAccountFails(t *testing.T) {
g := gomega.NewGomegaWithT(t)
isvc := makeTestInferenceService()
isvc.Spec.Predictor.Tensorflow.StorageURI = proto.String("https://blob.core.windows.net/triton/simple_string/")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This uri is actually supported in the new CR because "https" is a valid prefix. Just want to note that this is a behavior change.

func TestAzureBlobNoContainerFails(t *testing.T) {
g := gomega.NewGomegaWithT(t)
isvc := makeTestInferenceService()
isvc.Spec.Predictor.Tensorflow.StorageURI = proto.String("https://foo.blob.core.windows.net/")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Comment on lines +341 to +343
if err != nil {
return err
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens if the storage container CR does not exist?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comments. If no CR exists, it will be exactly the same as the current behavior.

Signed-off-by: Jin Dong <jdong183@bloomberg.net>
@greenmoon55 greenmoon55 changed the title [WIP] Fix storage initializer validation Fix validation for custom storageUri Sep 22, 2023
@greenmoon55 greenmoon55 marked this pull request as ready for review September 22, 2023 02:14
@yuzisun
Copy link
Member

yuzisun commented Sep 22, 2023

Thanks @greenmoon55 !

/lgtm
/approve

@kserve-oss-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: greenmoon55, yuzisun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kserve-oss-bot kserve-oss-bot merged commit a6af401 into kserve:master Sep 22, 2023
58 checks passed
Iamlovingit pushed a commit to Iamlovingit/kserve that referenced this pull request Oct 1, 2023
* Move storage uri check from validator to pod mutator

Signed-off-by: Jin Dong <jdong183@bloomberg.net>

* Move storage uri validation to controller

Signed-off-by: Jin Dong <jdong183@bloomberg.net>

* Fix test

Signed-off-by: Jin Dong <jdong183@bloomberg.net>

* Fix tests, add comments, and remove model transition status

Signed-off-by: Jin Dong <jdong183@bloomberg.net>

---------

Signed-off-by: Jin Dong <jdong183@bloomberg.net>
Signed-off-by: iamlovingit <freecode666@gmail.com>
ddelange added a commit to ddelange/kserve that referenced this pull request Oct 2, 2023
…mp-ray

* 'bump-ray' of https://github.com/ddelange/kserve:
  Bumping version for 0.11.1 (kserve#3141)
  Fix validation for custom storageUri (kserve#3134)
  Fix: error response handling for splitter and switch nodes (kserve#3116)
ddelange added a commit to ddelange/kserve that referenced this pull request Oct 2, 2023
* 'master' of https://github.com/kserve/kserve:
  Unpack archive files for hdfs (kserve#3093)
  Upgrade istio Api and migrate to v1beta1 Api version (kserve#3150)
  Increase pytest workers for kourier e2e test (kserve#3151)
  chore: Add design doc template links to feature request template (kserve#3155)
  Make storage initializer image configurable (kserve#3145)
  Bumping version for 0.11.1 (kserve#3141)
  Fix validation for custom storageUri (kserve#3134)
  Fix: error response handling for splitter and switch nodes (kserve#3116)
ddelange added a commit to ddelange/kserve that referenced this pull request Oct 2, 2023
* 'master' of https://github.com/kserve/kserve:
  Unpack archive files for hdfs (kserve#3093)
  Upgrade istio Api and migrate to v1beta1 Api version (kserve#3150)
  Increase pytest workers for kourier e2e test (kserve#3151)
  chore: Add design doc template links to feature request template (kserve#3155)
  Make storage initializer image configurable (kserve#3145)
  Bumping version for 0.11.1 (kserve#3141)
  Fix validation for custom storageUri (kserve#3134)
  Fix: error response handling for splitter and switch nodes (kserve#3116)
@greenmoon55 greenmoon55 deleted the fix-storage-initializer-validation branch June 7, 2024 21:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ability to add custom storage initializer prefixes
4 participants