Hello,
I am currently testing this project, as I am interested in using the exported TensorRT model on a Jetson device with C++.
First, I want to thank you for this work, which is the only one I found that does not use the OpenCV DNN module.
The problem I have found is that, when I export the model using the --half flag, the model outputs are not retrieved correctly and, therefore, their processing fails.
This flag converts the TensorRT engine to FP16, almost halving the inference time.
However, I think there is a problem regarding how the data is copied back from the GPU.
Specifically, the problem appears when the detection results are retrieved in Yolov5-instance-seg-tensorrt/main2_trt_infer.cpp (line 201 at commit bda2aca).
For the FP32 model (the default one, without the --half flag), the retrieved data matches the expected values, as in the following screenshot:
However, when the FP16 model is used, the retrieved values come back as NaN, as in the following capture:
I have tested both models using Python and both yield correct results, so the error does not happen when exporting the model to FP16.
I believe it has to be something related to the data copy; however, the detection engine from this repo makes no distinction between the two kinds of models.
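If the --half engine exposes its output binding in FP16, the bytes copied back from the GPU are half-precision values, and reading them as 32-bit floats would produce garbage or NaN. As a quick experiment, the output could be copied into a uint16_t buffer and decoded on the host. Below is a minimal sketch of that decoding (plain IEEE 754 binary16, no TensorRT or CUDA headers; the function names are my own, not part of this repo):

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Decode one IEEE 754 binary16 (half) value into a 32-bit float.
float half_to_float(uint16_t h) {
    const uint32_t sign = (h >> 15) & 0x1;
    const uint32_t exp  = (h >> 10) & 0x1F;
    const uint32_t frac = h & 0x3FF;
    float value;
    if (exp == 0) {
        // Zero or subnormal: frac * 2^-24
        value = std::ldexp(static_cast<float>(frac), -24);
    } else if (exp == 0x1F) {
        // Infinity or NaN
        value = frac ? std::nanf("") : INFINITY;
    } else {
        // Normal: (1024 + frac) * 2^(exp - 25) == (1 + frac/1024) * 2^(exp - 15)
        value = std::ldexp(static_cast<float>(frac | 0x400),
                           static_cast<int>(exp) - 25);
    }
    return sign ? -value : value;
}

// Convert a whole output buffer that was copied back as raw 16-bit words.
std::vector<float> half_buffer_to_float(const std::vector<uint16_t>& src) {
    std::vector<float> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[i] = half_to_float(src[i]);
    }
    return dst;
}
```

If the decoded values look sane while the direct float copy is NaN, that would confirm the binding is FP16. (With cuda_fp16.h available, __half2float performs the same conversion.)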
I will keep working on this to see if I can find a solution, but any help is welcome.
Thanks in advance.
I have explored the topic a little further today, but I am still not able to make the model work with any engine exported using the --half flag in the original yolov5 repo.
However, testing with trtexec there is no problem with the exported engine, which performs inference in 3 ms, compared with 5 ms for the default model.
I have tried both the original repo and the conversion script you made.
Therefore, I do not understand what the problem is, as there is no way to tell the engine loader that the expected data is in float16.
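For what it's worth, the engine itself records the precision of each binding, so the loader does not need to be told externally: in the TensorRT C++ API, ICudaEngine::getBindingDataType(i) reports whether a binding is DataType::kFLOAT or DataType::kHALF, and the host buffer can be sized accordingly. A tiny sketch of just the sizing logic (the enum is a local stand-in I added so this compiles without the TensorRT headers):

```cpp
#include <cstddef>

// Local stand-in for the relevant nvinfer1::DataType values, so this
// sketch compiles without the TensorRT headers.
enum class BindingType { kFLOAT, kHALF };

// Host-buffer size in bytes for a binding with `elementCount` elements.
std::size_t bindingBufferBytes(BindingType type, std::size_t elementCount) {
    const std::size_t elemSize = (type == BindingType::kHALF) ? 2 : 4;
    return elementCount * elemSize;
}
```

Allocating and cudaMemcpy-ing with the FP32 element size while the binding is actually FP16 would read past the device buffer or interpret pairs of halves as single floats, which could explain the NaN values.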