Hello,
Our current model stack consists of a set of models built with TensorRT plus the Whisper ASR model.
I'd like to use Triton Inference Server to host all of these models. Since Whisper can be converted with TensorRT-LLM, I tried to host everything on a Triton server using the tensorrtllm backend, but I'm seeing this error:
E1220 20:46:48.038714 1 model_lifecycle.cc:621] failed to load 'fer_2' version 1: Invalid argument: unable to find 'libtriton_tensorrt.so' or 'tensorrt/model.py' for model 'fer_2', in /opt/tritonserver/backends/tensorrt
I think this suggests that the Triton tensorrtllm backend (or the container it ships in) does not support plain TensorRT models. Is that the case?
If so, what should I do? What would you recommend for a large model stack that mixes LLM and non-LLM models?
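For context, my understanding is that each model in the repository can pin its own backend in its config.pbtxt, roughly like the sketch below. The directory names other than fer_2 are placeholders, and I'm assuming a single container image that ships both the tensorrt and tensorrtllm backends, which is exactly the part I'm unsure about:

```
models/
├── fer_2/                 # existing TensorRT engine
│   ├── 1/
│   │   └── model.plan
│   └── config.pbtxt
└── whisper/               # Whisper engine built with TensorRT-LLM
    ├── 1/
    └── config.pbtxt

# fer_2/config.pbtxt — plain TensorRT engine
name: "fer_2"
backend: "tensorrt"        # needs libtriton_tensorrt.so under /opt/tritonserver/backends/tensorrt

# whisper/config.pbtxt — engine served through the TensorRT-LLM backend
name: "whisper"
backend: "tensorrtllm"
```

Is a mixed repository like this supposed to work on one server instance, or should the two backend families be served from separate containers?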
Thank you.