
Open Inference Protocol with nightly build not working #2951

Open
harshita-meena opened this issue Feb 21, 2024 · 14 comments
Assignee: agunapal
Labels: OIP, triaged (Issue has been reviewed and triaged)

Comments

@harshita-meena (Contributor) commented Feb 21, 2024

🐛 Describe the bug

While trying to run load tests with the latest merged changes for the v2 Open Inference Protocol, I noticed that the MNIST example fails in the preprocessing step. https://github.com/pytorch/serve/pull/2609/files

Error logs

The server side showed the following error:

[Screenshot: server-side error log, 2024-02-20, 4:46 PM]

Installation instructions

ARG VERSION=latest-cpu
ARG IMAGE_NAME=pytorch/torchserve-nightly

FROM ${IMAGE_NAME}:${VERSION}

USER root

RUN apt-get -y update
RUN apt-get install -y curl vim

# Installation steps to download model from GCP
# Downloading gcloud package
RUN curl https://dl.google.com/dl/cloudsdk/release/google-cloud-sdk.tar.gz > /tmp/google-cloud-sdk.tar.gz

# Installing the package
RUN mkdir -p /usr/local/gcloud \
  && tar -C /usr/local/gcloud -xvf /tmp/google-cloud-sdk.tar.gz \
  && /usr/local/gcloud/google-cloud-sdk/install.sh

# Adding the package path to local
ENV PATH $PATH:/usr/local/gcloud/google-cloud-sdk/bin

ENV TS_OPEN_INFERENCE_PROTOCOL oip

RUN pip install protobuf googleapis-common-protos grpcio loguru

COPY config.properties /home/model-server/config.properties
COPY mnist.mar /home/model-server/model-store/

Copied the model from gs://kfserving-examples/models/torchserve/image_classifier/v2/model-store/mnist.mar.
Built the Docker image with docker build -f Dockerfile -t metadata . and brought it up locally.
Ran the ghz load-test tool with:

ghz  --proto serve/frontend/server/src/main/resources/proto/open_inference_grpc.proto   --call org.pytorch.serve.grpc.openinference.GRPCInferenceService/ModelInfer --duration 300s --rps 1 --insecure localhost:79 -D ./serve/kubernetes/kserve/kf_request_json/v2/mnist/mnist_v2_tensor_grpc.json

Model Packaging

Used an existing packaged model mnist.mar at gs://kfserving-examples/models/torchserve/image_classifier/v2

config.properties

inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
metrics_address=http://0.0.0.0:8082
enable_metrics_api=true
model_metrics_auto_detect=true
metrics_mode=prometheus
number_of_netty_threads=32
job_queue_size=1000
enable_envvars_config=true
model_store=/home/model-server/model-store
load_models=mnist.mar
workflow_store=/home/model-server/wf-store

Versions


Environment headers

Torchserve branch:

**Warning: torchserve not installed ..
torch-model-archiver==0.9.0

Python version: 3.7 (64-bit runtime)
Python executable: /Users/hmeena/development/ml-platform-control-planes/venv/bin/python

Versions of relevant python libraries:
numpy==1.21.6
requests==2.31.0
requests-oauthlib==1.3.1
torch-model-archiver==0.9.0
wheel==0.41.0
**Warning: torch not present ..
**Warning: torchtext not present ..
**Warning: torchvision not present ..
**Warning: torchaudio not present ..

Java Version:

OS: Mac OSX 11.7.8 (x86_64)
GCC version: N/A
Clang version: 12.0.0 (clang-1200.0.32.29)
CMake version: version 3.23.2

Versions of npm installed packages:
**Warning: newman, newman-reporter-html markdown-link-check not installed...

Repro instructions

Same as the installation instructions above.

Possible Solution

I am unsure how well OIP currently works with TorchServe. I tried a small ranker example and it fails in the post-processing step, where the worker crashes completely; it is not able to send the response as a ModelInferResponse.

@agunapal (Collaborator) commented:

Hi @harshita-meena,
Thanks for reporting the issue.

Do you mind trying this script?
https://github.com/pytorch/serve/blob/master/kubernetes/kserve/tests/scripts/test_mnist.sh

We are running this nightly.
https://github.com/pytorch/serve/actions/workflows/kserve_cpu_tests.yml

@harshita-meena (Contributor, Author) commented Feb 21, 2024

I am currently trying to set up the tests, but they will probably not fail: the images used in the OIP KServe YAMLs are custom ones (HTTP and gRPC), and neither refers to the nightly images, even though they are part of test_mnist.sh at lines 189 and 215.

@harshita-meena (Contributor, Author) commented Feb 21, 2024

I am still struggling to get the tests running; it would help if you could approve the workflow for this PR. The primary reason I am trying to get this working is that I want to use the Open Inference Protocol for a non-KServe deployment. Everything works until the worker dies after the post-processing step. I was relying heavily on this because OIP provides a great generic metadata/inference API. If this doesn't work, I will use inference.proto instead.

[Screenshot: Feb 21, 2024, 10:08:16 AM]
[Screenshot: Feb 21, 2024, 10:08:30 AM]

@agunapal (Collaborator) commented:

Hi @harshita-meena, thanks for the details. I am checking with KServe regarding this and will update.

@agunapal added the triaged label Feb 21, 2024
@agunapal self-assigned this Feb 21, 2024
@harshita-meena (Contributor, Author) commented Feb 21, 2024

You can reproduce the worker-died issue if you build the Dockerfile included in this issue with the config.properties above and create a new mnist.mar with a slightly modified handler for OIP-specific requests (zip attached):

torch-model-archiver --model-name mnist --version 1.0 --serialized-file mnist_cnn.pt --model-file mnist.py --handler mnist_handler.py -r requirements.txt

mnist.zip

ghz --proto serve/frontend/server/src/main/resources/proto/open_inference_grpc.proto --call org.pytorch.serve.grpc.openinference.GRPCInferenceService/ModelInfer --duration 300s --rps 1 --insecure localhost:79 -D ./serve/kubernetes/kserve/kf_request_json/v2/mnist/mnist_v2_tensor_grpc.json

@agunapal (Collaborator) commented:

Hi @harshita-meena, thanks! This is a new feature and there might be bugs. I will update once I repro it.

@harshita-meena (Contributor, Author) commented:

Hi @agunapal, I was wondering if you have identified the reason for the issue or had a chance to discuss it with KServe.

@harshita-meena (Contributor, Author) commented:

I figured out the solution and will reply with it soon. Thank you!

@agunapal (Collaborator) commented Mar 7, 2024

Hi @harshita-meena, I was able to repro the issue with the steps you shared. Please feel free to send a PR if you have identified the problem.

@harshita-meena (Contributor, Author) commented:

It comes down to how OIP expects the response: I was sending only a dict or only a list, but if I send a list of dicts containing the parameters specific to the OIP response, the prediction succeeds.

    def postprocess(self, data):
        """The post process of MNIST converts the predicted output response to a label.

        Args:
            data (list): The predicted output from the inference, with probabilities, is passed
            to the post-process function.
        Returns:
            list: A list containing a single dictionary in the Open Inference Protocol response format.
        """

        return [{
            "model_name":"mnist",
            "model_version":"N/A",
            "id":"N/A",
            "outputs":[{"name":"output-0","datatype":"FLOAT64","shape":["1"],"data":[data.argmax(1).flatten().tolist()]}]
        }]

mnist_handler_oip.py.zip
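
The same idea can be factored into a small helper. This is only a rough sketch (the helper name and default arguments are mine, not part of TorchServe), built around the field layout used in the handler above:

# Hypothetical helper, not part of TorchServe: wraps a prediction tensor in the
# list-of-dicts layout that the OIP gRPC frontend accepted for me.
import torch

def to_oip_response(preds, model_name="mnist", model_version="N/A", request_id="N/A"):
    """Build an Open-Inference-Protocol-style response for a batch of predictions."""
    labels = preds.argmax(1).flatten().tolist()  # predicted class per batch item
    return [{
        "model_name": model_name,
        "model_version": model_version,
        "id": request_id,
        "outputs": [{
            "name": "output-0",
            "datatype": "INT64",
            "shape": [str(len(labels))],  # shapes were strings in the payload I sent
            "data": labels,
        }],
    }]

# Example: a fake batch of MNIST logits with the highest score at class 7.
logits = torch.zeros(1, 10)
logits[0, 7] = 1.0
print(to_oip_response(logits))
# [{'model_name': 'mnist', 'model_version': 'N/A', 'id': 'N/A',
#   'outputs': [{'name': 'output-0', 'datatype': 'INT64', 'shape': ['1'], 'data': [7]}]}]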

@harshita-meena (Contributor, Author) commented Mar 7, 2024

Just saw your message, @agunapal. If it helps, I can submit a PR with only the handler specific to OIP.

@agunapal (Collaborator) commented Mar 7, 2024

Hi @harshita-meena, the error you posted originally and the one I see are in pre-processing, so how is this related to post-processing?

Also, I'm wondering how we address this backward-compatibility-breaking change for post-processing.

@harshita-meena (Contributor, Author) commented Mar 7, 2024

Apologies if my stream of errors confused you about the actual issue.

My main goal is to get inference working over gRPC using the Open Inference Protocol in a basic deployment that does not use KServe.

The pre-processing error occurred because, when I first opened this issue, I was using the old handler, which extracted the request according to the old inference.proto rather than the Open Inference Protocol. The second error I posted came after I had resolved pre-processing but could not yet figure out the post-processing step. Yesterday I finally worked out what the post-processing step should look like.
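
As a rough sketch of the direction the pre-processing fix takes: the exact payload shape the handler receives from the OIP gRPC frontend is an assumption here, not something confirmed in this thread, so verify it against what the worker actually logs before relying on it.

# Sketch only (handler method). Assumption: each row in `data` holds the OIP/v2
# request body as a dict with an "inputs" list of {name, shape, datatype, data};
# the real payload may differ.
import torch

def preprocess(self, data):
    tensors = []
    for row in data:
        body = row.get("body", row)  # assumed: body may or may not be nested under "body"
        for inp in body["inputs"]:
            shape = [int(dim) for dim in inp["shape"]]
            tensors.append(torch.tensor(inp["data"], dtype=torch.float32).reshape(shape))
    return torch.cat(tensors, dim=0) if len(tensors) > 1 else tensors[0]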

Overall, if your question is how we can prevent the worker from crashing, that would have to be handled in the OIP server logic in GRPCJob.java.

Posting my findings from yesterday:
The issue was in post-processing, before a response is sent, assuming the prediction is for a batch of 1 and the model-predicted value is 0.

If I send the response as a list of dictionaries, the parsing logic goes through and the OIP response can be processed: [{"model_name":....."outputs":[{"name":"output-0","datatype":"INT64","shape":["1"],"data":[0]}]
We have to send each and every parameter that is part of the OIP response (even the ones that are optional in the protocol); otherwise, the PyTorch worker will crash during parsing.
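
For completeness, a fully populated version of that response for the batch-of-1 / predicted-value-0 case (field values taken from the handler above) would look like:

[{
    "model_name": "mnist",
    "model_version": "N/A",
    "id": "N/A",
    "outputs": [{
        "name": "output-0",
        "datatype": "INT64",
        "shape": ["1"],
        "data": [0]
    }]
}]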

@agunapal (Collaborator) commented Mar 7, 2024

Thanks for the detailed findings. cc @lxning
