
KServe wrapper default configuration is faulty #2956

Closed
sgaist opened this issue Feb 22, 2024 · 5 comments · Fixed by #2957
Labels
triaged Issue has been reviewed and triaged

Comments

@sgaist
Contributor

sgaist commented Feb 22, 2024

🐛 Describe the bug

The TorchserveModel class is not initialized correctly when called from __main__.py.

The class initializes the protocol property with whatever is passed as a parameter.

The issue lies in the handling of the PROTOCOL_VERSION environment variable: its value defaults to None when the variable is unset, so the protocol property ends up as None as well. In the base class implementation, the default value is v1 when no configuration is passed.

This becomes an issue when the response is generated, because the protocol version is validated at that point.

Error logs

File "/path/to/miniconda3/envs/pytorch-serve-dev/lib/python3.9/site-packages/kserve/model.py", line 121, in __call__
    response = (await self.predict(payload, headers)) if inspect.iscoroutinefunction(self.predict) \
  File "/path/to/miniconda3/envs/pytorch-serve-dev/lib/python3.9/site-packages/kserve/model.py", line 297, in predict
    return InferResponse.from_rest(self.name, res) if is_v2(PredictorProtocol(self.protocol)) else res
  File "/path/to/miniconda3/envs/pytorch-serve-dev/lib/python3.9/enum.py", line 384, in __call__
    return cls.__new__(cls, value)
  File "/path/to/miniconda3/envs/pytorch-serve-dev/lib/python3.9/enum.py", line 702, in __new__
    raise ve_exc
ValueError: None is not a valid PredictorProtocol
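The failure can be reproduced without kserve installed. The enum below is a stand-in that mirrors the PredictorProtocol values referenced in the traceback (illustration only, not the kserve class itself):

```python
import os
from enum import Enum


class PredictorProtocol(Enum):
    # Stand-in mirroring kserve's PredictorProtocol values (illustration only)
    REST_V1 = "v1"
    REST_V2 = "v2"


# Make the reproduction deterministic: ensure the variable is unset
os.environ.pop("PROTOCOL_VERSION", None)

# os.environ.get returns None for an unset variable, and that None
# ends up stored on the model as-is instead of falling back to "v1"
protocol = os.environ.get("PROTOCOL_VERSION")

try:
    PredictorProtocol(protocol)
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

This is the same ValueError raised by the enum lookup in `kserve/model.py` above.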

Installation instructions

The installation procedure does not matter: the issue manifests both in the Docker container and with a source or package installation.

Model Packaging

from abc import ABC
from ts.torch_handler.base_handler import BaseHandler


class TestTBHandler(BaseHandler, ABC):
    """Minimal handler returning a canned response, for reproduction only."""

    def __init__(self):
        super().__init__()

    def initialize(self, context):
        # Skip model loading and mark the handler as ready immediately
        self.initialized = True

    def preprocess(self, requests):
        print("preprocess", requests)
        return []

    def inference(self, samples, *args, **kwargs):
        # Return a fixed payload regardless of input
        return [{"test": 42}]

    def postprocess(self, data):
        print("postprocess", data)
        return [data]
torch-model-archiver --model-name test --version 1.0 --export-path model-store --handler handler.py -f

config.properties

inference_address=http://127.0.0.1:8080
management_address=http://127.0.0.1:8081
metrics_address=http://127.0.0.1:8082
install_py_dep_per_model=false
number_of_netty_threads=1
job_queue_size=1
service_envelope=kservev2
model_store=./model-store
load_models=test=test.mar
models={\
  "test": {\
    "1.0": {\
        "defaultVersion": true,\
        "marName": "test.mar",\
        "minWorkers": 1,\
        "maxWorkers": 5,\
        "batchSize": 1,\
        "maxBatchDelay": 100,\
        "responseTimeout": 120\
    }\
  }\
}

Versions


Environment headers

Torchserve branch: HEAD

**Warning: torchserve not installed ..
**Warning: torch-model-archiver not installed ..

Python version: 3.9 (64-bit runtime)
Python executable: ~/miniconda3/envs/pytorchserve-dev/bin/python

Versions of relevant python libraries:
numpy==1.26.4
psutil==5.9.8
requests==2.31.0
requests-oauthlib==1.3.1
wheel==0.41.2
**Warning: torch not present ..
**Warning: torchtext not present ..
**Warning: torchvision not present ..
**Warning: torchaudio not present ..

Java Version:

OS: Mac OSX 13.6.4 (arm64)
GCC version: N/A
Clang version: 15.0.0 (clang-1500.1.0.2.5)
CMake version: version 3.28.3

Versions of npm installed packages:
**Warning: newman, newman-reporter-html markdown-link-check not installed...

Repro instructions

create environment with serve installed
create handler.py from sample code
create config.properties from example

torch-model-archiver --model-name test --version 1.0 --export-path model-store --handler handler.py -f
torchserve --start --model-store model-store  --ts-config config.properties --foreground

create environment with kserve installed

pip install 'kserve[storage]'

clone this repo
create /mnt/models/config/config.properties with:

inference_address=http://127.0.0.1:9080
management_address=http://127.0.0.1:9081
metrics_address=http://127.0.0.1:9082
grpc_inference_port=7070
grpc_management_port=7071
model_store=file:///
model_snapshot={"name": "startup.cfg","modelCount": 1,"models": {"test": {"1.0": {"defaultVersion": true, "marName": "test.mar", "minWorkers": 1, "maxWorkers": 5, "batchSize": 5, "maxBatchDelay": 200, "responseTimeout": 60}}}}

start kserve wrapper:

python3 kserve_wrapper/__main__.py

create payload data.json:

{
    "inputs": [
        {
            "name": "uuid", 
            "shape": -1,
            "datatype": "BYTES",
            "data": ["fakedata"]
        }
    ]
}

send request

curl -X POST http://127.0.0.1:9080/v1/models/test:predict -T data.json

Possible Solution

There are several:

  • Set a default value for the retrieval of the PROTOCOL_VERSION environment variable
  • Check if the protocol is None in the __init__ function and set it to v1
  • Check if the protocol is None in the __init__ function and do not set it so it keeps the default value of the base class

In the spirit of failing early, adding a check for the validity of the value would likely also be worthwhile, as PROTOCOL_VERSION might contain an invalid value.
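The first option combined with the fail-early validation could be sketched as follows. Note that `resolve_protocol` and `DEFAULT_PROTOCOL` are hypothetical names for illustration, and the enum is a stand-in for kserve's PredictorProtocol, not the wrapper's actual API:

```python
import os
from enum import Enum


class PredictorProtocol(Enum):
    # Stand-in for kserve's PredictorProtocol (illustration only)
    REST_V1 = "v1"
    REST_V2 = "v2"
    GRPC_V2 = "grpc-v2"


# "v1" matches the base class default when no configuration is passed
DEFAULT_PROTOCOL = PredictorProtocol.REST_V1.value


def resolve_protocol() -> str:
    """Hypothetical helper: read PROTOCOL_VERSION with a default, fail early."""
    protocol = os.environ.get("PROTOCOL_VERSION", DEFAULT_PROTOCOL)
    try:
        return PredictorProtocol(protocol).value
    except ValueError:
        valid = [p.value for p in PredictorProtocol]
        raise ValueError(
            f"Invalid PROTOCOL_VERSION {protocol!r}; expected one of {valid}"
        ) from None
```

With this, an unset variable yields "v1", and an invalid value fails at startup with an actionable message instead of surfacing later during prediction.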

@agunapal
Collaborator

Hi @sgaist Thanks for the details!
We do have this nightly test https://github.com/pytorch/serve/actions/runs/7999859721

If you can add a test to this, it would be easier to understand and would expedite your PR

@sgaist
Contributor Author

sgaist commented Feb 25, 2024

Hi @agunapal,

Thanks for the pointer !

I ran the tests, and while checking how they went, I saw that the PROTOCOL_VERSION environment variable was set properly by the KServe controller (that was the next thing I wanted to fix). Looking at the KServe sources, the fix dates back to September last year.

Seeing that, I verified what I have on my current machine and, surprisingly, it is an older version of the KServe controller that has been deployed, v0.9.0 to be exact. I don't know why that version was deployed, as I followed their documentation to deploy the whole system.

This means that the current version of KServe does the right thing when creating the deployment from an InferenceService, hence my patch would not be strictly required in that context. However, I still think it serves its purpose of bringing the wrapper's behaviour in line with the base class implementation.

While writing the test, I saw some things that could be improved, such as the configuration default handling. I am wondering whether such improvements should be part of a follow-up patch or integrated in this one?

@agunapal
Collaborator

Hi @sgaist The nightly tests that we run are on nightly images, hence you don't see the issue, I guess. We will be making a new release soon.

@agunapal agunapal added the triaged Issue has been reviewed and triaged label Feb 26, 2024
@agunapal
Collaborator

Hi @sgaist We welcome any improvements to the code. You can send a PR for one of the issues you want to fix, include a test and we can take it from there

@sgaist
Contributor Author

sgaist commented Feb 26, 2024

Good, then I'll open a ticket about the improvements I have in mind for the wrapper, on top of what I have done in this one, with a follow-up patch.
