
[Question] Taking a long time to initiate multiple ONNX sessions using the same ONNX model, with TensorrtExecutionProvider #929

Open
dinushazoomi opened this issue Jul 29, 2023 · 0 comments


dinushazoomi commented Jul 29, 2023

Description

It is taking a long time (~30 minutes) to initiate multiple ONNX Runtime sessions using the same ONNX model, but this does not happen when the sessions use different models. Any idea why this happens?

The reason I am doing this is that inference is very fast with the TensorrtExecutionProvider.

Steps to reproduce

Here is my code

import onnxruntime as ort

# Prefer TensorRT, fall back to CUDA, then CPU
providers = ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']

ort_sess_1 = ort.InferenceSession("./model_data/mobilenetv2_x1_0-fp16.onnx", providers=providers)
ort_sess_2 = ort.InferenceSession("./model_data/mobilenetv2_x1_0-fp16.onnx", providers=providers)
ort_sess_3 = ort.InferenceSession("./model_data/mobilenetv2_x1_0-fp16.onnx", providers=providers)
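For context, the TensorRT execution provider builds a TensorRT engine when a session is created, which would explain a long startup per session. Below is a minimal sketch of enabling the TensorRT EP's engine cache so later sessions can reuse a serialized engine instead of rebuilding it. The trt_engine_cache_enable / trt_engine_cache_path option names come from the ONNX Runtime TensorRT EP documentation; the cache directory "./trt_cache" is an arbitrary choice here, and whether the 1.12.1 Python wheel accepts per-provider option dicts is an assumption (the ORT_TENSORRT_ENGINE_CACHE_ENABLE environment variable is the older equivalent).

import onnxruntime as ort

# Sketch: provider options for the TensorRT EP (names per the ORT TensorRT EP docs).
trt_options = {
    "trt_engine_cache_enable": True,         # serialize built engines to disk
    "trt_engine_cache_path": "./trt_cache",  # hypothetical cache directory
}

# A provider can be given as a (name, options) tuple in the ORT Python API.
providers = [
    ("TensorrtExecutionProvider", trt_options),
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]

# The first session pays the engine-build cost; subsequent sessions should
# load the cached engine if the model and build settings are unchanged.
sess = ort.InferenceSession("./model_data/mobilenetv2_x1_0-fp16.onnx", providers=providers)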

Environment

I am running this on a Jetson Orin.

TensorRT Version: 8.5.2.2
ONNX Runtime Version: 1.12.1

Another side question: does ONNX Runtime use trtexec to serialize a TensorRT engine file and then use that engine in the ONNX Runtime session?

Any help on this matter would be highly appreciated.
Thanks
