Whisper model deployment with OVMS #2066

Open · Aditya-Scalers opened this issue Sep 26, 2023 · 2 comments

Aditya-Scalers commented Sep 26, 2023

Following the reference implementation of Whisper with OpenVINO for subtitle generation, I was able to create the whisper_encoder and whisper_decoder XML and BIN files. Using whisper_encoder and whisper_decoder as separate models with OVMS, I was able to start the Docker container.

New status: ( "state": "AVAILABLE", "error_code": "OK" )
[2023-09-26 13:14:16.673][1][serving][info][model.cpp:88] Updating default version for model: whisper, from: 0
[2023-09-26 13:14:16.673][1][serving][info][model.cpp:98] Updated default version for model: whisper, to: 1
[2023-09-26 13:14:16.673][66][modelmanager][info][modelmanager.cpp:1069] Started model manager thread
[2023-09-26 13:14:16.673][1][serving][info][servablemanagermodule.cpp:45] ServableManagerModule started
[2023-09-26 13:14:16.673][67][modelmanager][info][modelmanager.cpp:1088] Started cleaner thread
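
For reference, serving the two models together is typically described by a multi-model config.json along these lines (a minimal sketch; the names and paths are assumptions, and each base_path must contain a numeric version subdirectory such as 1/ holding the .xml and .bin files):

{
    "model_config_list": [
        { "config": { "name": "whisper_encoder", "base_path": "/models/whisper_encoder" } },
        { "config": { "name": "whisper_decoder", "base_path": "/models/whisper_decoder" } }
    ]
}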

I am not able to perform inference on these models. Any help would be appreciated.

Client code:

from ovmsclient import make_grpc_client

# Connect to the OVMS gRPC endpoint
client = make_grpc_client("localhost:9000")

# Read the raw audio file as bytes
with open("output.bin", "rb") as binary_file:
    binary_data = binary_file.read()

# Send the bytes as a single input named "binary_data"
data_dict = {
    "binary_data": binary_data
}
results = client.predict(inputs=data_dict, model_name="whisper")

When I send a request from the client with the binary contents of an audio file, I get this error.

Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ovmsclient/tfs_compat/grpc/serving_client.py", line 47, in predict
    raw_response = self.prediction_service_stub.Predict(request.raw_request, timeout)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/grpc/_channel.py", line 1161, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/grpc/_channel.py", line 1004, in _end_unary_response_blocking
    raise _InactiveRpcError(state)  # pytype: disable=not-instantiable
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.INVALID_ARGUMENT
        details = "Invalid number of inputs - Expected: 26; Actual: 1"
        debug_error_string = "UNKNOWN:Error received from peer ipv6:%5B::1%5D:9000 {grpc_message:"Invalid number of inputs - Expected: 26; Actual: 1", grpc_status:3, created_time:"2023-09-27T07:31:16.658787286+00:00"}"
>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/cloudvedge/client.py", line 12, in <module>
    results = client.predict(inputs=data_dict, model_name="whisper")
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ovmsclient/tfs_compat/grpc/serving_client.py", line 49, in predict
    raise_from_grpc(grpc_error)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/ovmsclient/tfs_compat/base/errors.py", line 66, in raise_from_grpc
    raise (error_class(details))
ovmsclient.tfs_compat.base.errors.InvalidInputError: Error occurred during handling the request: Invalid number of inputs - Expected: 26; Actual: 1
@atobiszei (Collaborator)

Hi,
you are getting this error because OVMS expects more inputs than you provide. I assume you are using some kind of wrapper around the OV model that hides the fact that OV uses many more inputs than one to perform inference.
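
A quick way to see exactly which inputs the served model declares is to query its metadata; a minimal sketch with ovmsclient (the model name is taken from your example):

from ovmsclient import make_grpc_client

client = make_grpc_client("localhost:9000")

# The returned dict describes the served model's "inputs" and
# "outputs", including each input's name, shape and dtype.
metadata = client.get_model_metadata(model_name="whisper")
for name, spec in metadata["inputs"].items():
    print(name, spec)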

We have plans to add support for Python code execution inside OVMS, which could ease integration in cases where you have existing Python wrapping code.

One thing I noticed as well is that you tried to use a binary audio file - right now OVMS only supports images as binary inputs.
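
In practice that means doing the audio preprocessing on the client side (for example, computing the log-mel spectrogram) and sending each declared input as a named numpy array rather than raw bytes. A minimal sketch; the input name "input_features", the feature file, and the target model name are assumptions - check the metadata for the real names:

import numpy as np
from ovmsclient import make_grpc_client

client = make_grpc_client("localhost:9000")

# Hypothetical precomputed log-mel spectrogram; Whisper encoders
# are commonly exported with shape (1, 80, 3000).
input_features = np.load("mel_features.npy").astype(np.float32)

# Input names must match the served model's metadata exactly;
# "input_features" is an assumed name, not confirmed in this issue.
results = client.predict(
    inputs={"input_features": input_features},
    model_name="whisper_encoder",
)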

nhha1602 commented Nov 17, 2023

Hi,

Referring to this link, I also exported the Whisper encoder & decoder models to IR format. Then I tested them successfully with some classes from this link.

As a next step, how can I load these IR-format models into OVMS and then run inference against them from a client?

Please advise.
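
For reference, a typical way to load IR models with the OVMS Docker image looks like this (a sketch; mount paths, ports, and the model name are assumptions, and the model directory must contain a numeric version subfolder such as 1/ with the .xml and .bin files):

# Assumed layout: models/whisper_encoder/1/model.xml and model.bin
docker run -d --rm -p 9000:9000 \
    -v $(pwd)/models/whisper_encoder:/models/whisper_encoder \
    openvino/model_server:latest \
    --model_name whisper_encoder \
    --model_path /models/whisper_encoder \
    --port 9000

Inference can then be run with ovmsclient, as in the client code earlier in this thread.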

@atobiszei atobiszei added the stale label Dec 6, 2023