How to create a TensorRT runtime for a two-input, one-output model #3929
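For context, a minimal sketch of running a two-input, one-output engine with the TensorRT 8.x Python API plus PyCUDA. This is not code from the thread; the engine file name, input shapes, and dtypes are placeholders for the fusion model:

```python
import numpy as np
import tensorrt as trt
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context
import pycuda.driver as cuda

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Deserialize a prebuilt engine (path is a placeholder).
with open("fusion_fp16.engine", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Host-side dummy inputs; shapes/dtypes must match the engine's input bindings.
inputs_host = [
    np.random.rand(1, 3, 480, 640).astype(np.float32),  # e.g. visible image
    np.random.rand(1, 3, 480, 640).astype(np.float32),  # e.g. infrared image
]

# Allocate one device buffer per binding (two inputs + one output).
bindings, device_bufs, output_host = [], [], None
for i in range(engine.num_bindings):
    shape = tuple(context.get_binding_shape(i))
    dtype = trt.nptype(engine.get_binding_dtype(i))
    dbuf = cuda.mem_alloc(trt.volume(shape) * np.dtype(dtype).itemsize)
    device_bufs.append(dbuf)
    bindings.append(int(dbuf))
    if not engine.binding_is_input(i):
        output_host = np.empty(shape, dtype=dtype)

# Copy both inputs to the device, run inference, copy the single output back.
stream = cuda.Stream()
in_idx = 0
for i in range(engine.num_bindings):
    if engine.binding_is_input(i):
        cuda.memcpy_htod_async(device_bufs[i],
                               np.ascontiguousarray(inputs_host[in_idx]), stream)
        in_idx += 1
context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
for i in range(engine.num_bindings):
    if not engine.binding_is_input(i):
        cuda.memcpy_dtoh_async(output_host, device_bufs[i], stream)
stream.synchronize()
```

Because the loop allocates a buffer for every binding in engine order, the same pattern works unchanged for any number of inputs.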
Comments
Thank you so much. I have one more question: I built the engine file with half precision and I am getting about 8.5 FPS on a Jetson Orin Nano. How can I optimize it further? I am thinking of building an INT8 engine, but I am not sure how to calibrate and build it successfully. Do you have any relevant documents/examples? My image fusion model has two inputs.
@lix19937 Could I deploy with DeepStream or a Python runtime? I am confused; I don't know which will improve efficiency.
You can use FP16, then use PTQ (INT8) to improve inference speed. For more optimization methods, see https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#optimize-performance
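PTQ requires feeding representative data through a calibrator at build time. A minimal sketch of an entropy calibrator for a two-input model, assuming the TensorRT 8.x Python API; the class name, cache path, and batch layout are assumptions, and real calibration images should replace the placeholder batches:

```python
import numpy as np
import tensorrt as trt
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context
import pycuda.driver as cuda

class FusionCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds calibration batches for BOTH inputs of the fusion model."""

    def __init__(self, batches, cache_file="calib.cache"):
        # `batches` is a list of (input_a, input_b) numpy array pairs.
        trt.IInt8EntropyCalibrator2.__init__(self)
        self.iterator = iter(batches)
        self.cache_file = cache_file
        first_a, first_b = batches[0]
        self.dev_a = cuda.mem_alloc(first_a.nbytes)
        self.dev_b = cuda.mem_alloc(first_b.nbytes)
        self.batch_size = first_a.shape[0]

    def get_batch_size(self):
        return self.batch_size

    def get_batch(self, names):
        try:
            a, b = next(self.iterator)
        except StopIteration:
            return None  # no more data: calibration is finished
        cuda.memcpy_htod(self.dev_a, np.ascontiguousarray(a))
        cuda.memcpy_htod(self.dev_b, np.ascontiguousarray(b))
        # Pointer order must match the order of the input tensors in `names`.
        return [int(self.dev_a), int(self.dev_b)]

    def read_calibration_cache(self):
        try:
            with open(self.cache_file, "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None  # no cache yet: TensorRT will run calibration

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)
```

The calibrator is then attached to the builder config, with FP16 typically kept enabled as a fallback for layers that do not quantize well:

```python
# config = builder.create_builder_config()
# config.set_flag(trt.BuilderFlag.INT8)
# config.set_flag(trt.BuilderFlag.FP16)  # fallback precision
# config.int8_calibrator = FusionCalibrator(calibration_batches)
```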
@lix19937 Thank you again. Any guidelines on building an NVIDIA DeepStream based video analytics system?
@lix19937 Thank you so much :)
Description
Environment
TensorRT Version: 8.5
CUDA Version: 11.4
CUDNN Version: 8.6
Operating System:
Python Version (if applicable): 3.8.10
PyTorch Version (if applicable): 2.1.0a0+41361538.nv23.6
My implementation