Improve kserve protocol version handling #2957

Merged

Conversation

@sgaist (Contributor) commented Feb 22, 2024

Description

This PR ensures that a default value for the KServe protocol is passed forward to the ModelServer class, and that the class itself uses a default value if instantiated directly.

With the current implementation, if the PROTOCOL_VERSION environment variable is not set, the protocol property of the ModelServer instance will be unset as well, which triggers an exception when data is returned after processing, because the base class checks the protocol's value.

Fixes #2956
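
A minimal sketch of the idea, assuming kserve's PredictorProtocol enum and its REST v1 default (the exact names in the repository's wrapper may differ):

```python
import os

from kserve.model import PredictorProtocol

# Fall back to the v1 protocol instead of propagating None when the
# environment variable is absent.
protocol = os.environ.get("PROTOCOL_VERSION", PredictorProtocol.REST_V1.value)
```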

Type of change


  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Feature/Issue validation/testing

A new test has been added to ensure that (see the sketch after this list):

  • If the PROTOCOL_VERSION environment variable is not set, a default is used
  • If the PROTOCOL_VERSION environment variable is set to a valid value, that value is used
  • If the PROTOCOL_VERSION environment variable is set to an invalid value, the wrapper fails to start
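
A condensed sketch of those three cases; the resolve_protocol helper is a hypothetical stand-in for the wrapper's startup path, not the repository's exact test code:

```python
import os

import pytest

from kserve.model import PredictorProtocol


def resolve_protocol(monkeypatch, value=None):
    # Hypothetical helper: set or clear PROTOCOL_VERSION, then resolve it the
    # way the fixed wrapper does, defaulting to the v1 protocol.
    if value is None:
        monkeypatch.delenv("PROTOCOL_VERSION", raising=False)
    else:
        monkeypatch.setenv("PROTOCOL_VERSION", value)
    return PredictorProtocol(
        os.environ.get("PROTOCOL_VERSION", PredictorProtocol.REST_V1.value)
    )


def test_unset_env_uses_default(monkeypatch):
    assert resolve_protocol(monkeypatch) is PredictorProtocol.REST_V1


def test_valid_env_is_used(monkeypatch):
    assert resolve_protocol(monkeypatch, "v2") is PredictorProtocol.REST_V2


def test_invalid_env_fails(monkeypatch):
    with pytest.raises(ValueError):
        resolve_protocol(monkeypatch, "bogus")
```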

Checklist:

  • Did you have fun?
  • Have you added tests that prove your fix is effective or that this feature works?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?

@sgaist force-pushed the improve_kserve_protocol_version_handling branch from d10ac9f to 93ffaa8 on February 25, 2024 21:41
@sgaist (Contributor, Author) commented Feb 26, 2024

I currently don't know why the changed-file linting failed: I installed the pre-commit hooks and, just to be sure, manually ran them on the files I changed, and they were all deemed clean.

@sgaist force-pushed the improve_kserve_protocol_version_handling branch from 93ffaa8 to 3992bc4 on February 26, 2024 13:31
@agunapal (Collaborator) commented

@sgaist Can you please mention in the PR what the current issue is and what this PR is addressing?

@sgaist force-pushed the improve_kserve_protocol_version_handling branch from 5675c03 to effb601 on February 27, 2024 13:09
@sgaist (Contributor, Author) commented Feb 27, 2024

@agunapal Description fixed!

Sorry for the wrong content; I had it open in parallel to the ticket and thought I had updated the description before saving.

@sgaist (Contributor, Author) commented Feb 27, 2024

@agunapal I just realized that for the test to run properly, kserve must be installed, and in my experience the kserve[storage] variant. While adding it to developer.txt, I hit an incompatible requirement: serve uses 4.25.1 while kserve requires <4.0.0.

What would you recommend going forward? Isolate the test and create a dedicated flow for it?

@agunapal (Collaborator) commented

> @agunapal I just realized that for the test to run properly, kserve must be installed, and in my experience the kserve[storage] variant. While adding it to developer.txt, I hit an incompatible requirement: serve uses 4.25.1 while kserve requires <4.0.0.
>
> What would you recommend going forward? Isolate the test and create a dedicated flow for it?

Yes, I'm not sure if we want to do this.

Can you please check if you can add a test here https://github.com/pytorch/serve/blob/master/kubernetes/kserve/tests/scripts/test_mnist.sh

@sgaist (Contributor, Author) commented Feb 28, 2024

I think the overhead is not worth it.

After having fiddled a bit more with test_mnist.sh, I can suggest three solutions:

  1. Set the KServe version to v0.11.0 (the last release before KServe started systematically defining the PROTOCOL_VERSION environment variable) in kserve_cpu_tests.yml.
  2. Modify test_mnist.sh to deploy KServe v0.11.0 after the current set of tests and re-run mnist_v1_cpu.
  3. Make a matrix of the KServe versions to test against.

Option one has the downside of using an outdated version of KServe.
Option two makes the tests last a bit longer but does not preclude updating the KServe version used for all other tests.
Option three multiplies the test duration by the number of KServe versions tested but confirms that everything works properly on each of them. It also means that test_mnist.sh does not need to be modified.

The options are listed in ascending order of resource/time impact.

Which one would you like to have?

@agunapal (Collaborator) commented

> I think the overhead is not worth it.
>
> After having fiddled a bit more with test_mnist.sh, I can suggest three solutions:
>
>   1. Set the KServe version to v0.11.0 (the last release before KServe started systematically defining the PROTOCOL_VERSION environment variable) in kserve_cpu_tests.yml.
>   2. Modify test_mnist.sh to deploy KServe v0.11.0 after the current set of tests and re-run mnist_v1_cpu.
>   3. Make a matrix of the KServe versions to test against.
>
> Option one has the downside of using an outdated version of KServe. Option two makes the tests last a bit longer but does not preclude updating the KServe version used for all other tests. Option three multiplies the test duration by the number of KServe versions tested but confirms that everything works properly on each of them. It also means that test_mnist.sh does not need to be modified.
>
> The options are listed in ascending order of resource/time impact.
>
> Which one would you like to have?

Thanks for looking into this.
If this is already implemented in the latest version, I think you can manually run a test with the 0.11.0 version of KServe and attach the logs.

FYI, we are making a release soon, so we will review this after the release.

Also curious: if this is resolved in the latest version of KServe, can't you use that?

@sgaist force-pushed the improve_kserve_protocol_version_handling branch from 378e1c3 to 15336a2 on February 29, 2024 09:17
@sgaist (Contributor, Author) commented Feb 29, 2024

KServe 0.11.0 run for protocol v2 prior to this PR (key excerpts from `kubectl describe pod`):

Check status of the pod
NAME                                                             READY   STATUS    RESTARTS   AGE
torchserve-mnist-v2-predictor-00001-deployment-9569d6f8b-8w6pb   1/2     Running   0          2m

Image: index.docker.io/pytorch/torchserve-kfs-nightly@sha256:5ec33cf3085b56b5eedb6b11c67faa5b7e669019573a26459eb49a2394531be2
Args: torchserve --start --model-store=/mnt/models/model-store --ts-config=/mnt/models/config/config.properties
Environment: TS_SERVICE_ENVELOPE: kservev2  PORT: 8080
Events:
  Warning  Unhealthy  2s (x2 over 4s)  kubelet  Readiness probe failed: Get "http://10.244.0.17:8012/": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

Wait for inference service to be ready
Wait for ports to be in forwarding
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Make inference request
Handling connection for 8080
✘ Test failed: Prediction: {"error":"ValueError:NoneisnotavalidPredictorProtocol"}, expected {"model_name":"mnist","model_version":"1.0","id":"d3b15cad-50a2-4eaf-80ce-8b0a428bd298","parameters":null,"outputs":[{"name":"input-0","shape":[1],"datatype":"INT64","parameters":null,"data":[1]}]}.
Delete cluster

KServe 0.11.0 run for the default protocol prior to this PR (key excerpts from `kubectl describe pod`):

MNIST KServe V1 test begin
Deploying the cluster
inferenceservice.serving.kserve.io/torchserve created
Waiting for pod to come up...
Check status of the pod
NAME                                                    READY   STATUS    RESTARTS   AGE
torchserve-predictor-00001-deployment-c8b8c4595-mxtpf   1/2     Running   0          115s

Image: index.docker.io/pytorch/torchserve-kfs-nightly@sha256:5ec33cf3085b56b5eedb6b11c67faa5b7e669019573a26459eb49a2394531be2
Args: torchserve --start --model-store=/mnt/models/model-store --ts-config=/mnt/models/config/config.properties
Environment: TS_SERVICE_ENVELOPE: kserve  PORT: 8080

Wait for inference service to be ready
Wait for ports to be in forwarding
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Make inference request
Handling connection for 8080
✘ Test failed: Prediction: {"error":"ValueError:NoneisnotavalidPredictorProtocol"}, expected {"predictions":[2]}.

I did two runs to show that having the protocolVersion entry in the InferenceService definition does not change the failure.
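
The squashed error string in both runs matches what Python's enum value lookup raises when the protocol is unset; a minimal reproduction, assuming kserve's PredictorProtocol enum:

```python
from kserve.model import PredictorProtocol

# Looking up the enum with an unset protocol raises exactly this error:
# ValueError: None is not a valid PredictorProtocol
PredictorProtocol(None)
```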

While I can update the version of KServe, I think many people creating models to be used by TorchServe do not control the underlying infrastructure they will run on, so fixing the wrapper will have a broader impact.

@agunapal (Collaborator) left a comment

LGTM

@agunapal added this pull request to the merge queue Feb 29, 2024
Merged via the queue into pytorch:master with commit c207cd2 Feb 29, 2024
15 checks passed
muthuraj-i2i pushed a commit to muthuraj-i2i/serve that referenced this pull request Mar 1, 2024
* fix(kserve): ensure there's a default protocol configured

The current implementation retrieves the protocol to use from the
"PROTOCOL_VERSION" environment variable; however, there is no default value,
which triggers an error when served through KServe, as the base class
does a protocol check in the predict method that fails with None.

The default protocol uses the same value as the base class.

* fix(kserve): ensure the protocol version configured is a valid value

* feat(kserve): make configuration file path configurable

This makes the wrapper easier to test, besides making it
possible to change where the file should be looked for.

* test: add KServe wrapper test

* test(kserve_wrapper): add protobuf code generation

* fix(kserve): add None handling to the wrapper

In case None is passed, keep the default value set in the base __init__.

This makes TorchserveModel behave in the same fashion as the base class.

* refactor(test): rewrite wrapper test to be more complete

It now validates that the protocol version is passed through properly,
as well as the failure when an invalid protocol is given.

* test: remove kserve pytest tests

The overhead required to run the tests outweighs its benefits.

The thing to keep in mind is that the fix allows running models
through KServe deployments prior to 0.11.1.
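
A minimal sketch of the None handling those commits describe, assuming kserve's Model base class, which sets a v1 protocol default in its own __init__ (the repository's actual signature may differ):

```python
from typing import Optional

from kserve import Model


class TorchserveModel(Model):
    # Hypothetical sketch: names mirror the discussion, not the exact repo code.
    def __init__(self, name: str, protocol: Optional[str] = None):
        super().__init__(name)  # kserve's Model.__init__ sets a v1 protocol default
        if protocol is not None:
            # Keep the inherited default when no protocol was supplied.
            self.protocol = protocol
```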
@sgaist deleted the improve_kserve_protocol_version_handling branch March 1, 2024 14:34