Inference with --half exported model not working #2

Open
adrigrillo opened this issue Nov 28, 2022 · 1 comment

Comments

@adrigrillo

Hello,

I am currently testing this project because I am interested in using the exported TensorRT model on a Jetson device with C++.
First, I want to thank you for this work; it is the only one I have found that does not rely on the OpenCV DNN module.

The problem I have found is that, when I export the model using the --half flag, the model outputs are not retrieved correctly and, as a result, their post-processing fails.
This flag converts the TensorRT engine to FP16, cutting the inference time almost in half.
However, I think there is a problem with how the data is copied back from the GPU.

Specifically, when the detection results are retrieved in

For the FP32 model (the default, exported without the --half flag), the retrieved data contains the expected values, as in the following screenshots:

(Screenshots: Screenshot_20221128_142122, Screenshot_20221128_142502)

However, when the FP16 model is used, the retrieved values come back as NaN, as in the following captures:

(Screenshots: Screenshot_20221128_142329, Screenshot_20221128_142404)

I have tested both models from Python and both yield correct results, so the problem is not in the FP16 export itself.
I believe it has to be something related to the data copy.
However, the detection engine from this repo makes no distinction between the two kinds of models; a sketch of what I have in mind is below.
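
In case it helps, here is a minimal sketch of the kind of handling I mean (the function name, arguments and surrounding setup are hypothetical, not this repo's actual code): query the output binding's data type from the deserialized engine and, when it is FP16, copy the raw half-precision bytes and widen them on the host instead of reinterpreting them as float.

```cpp
// Hypothetical sketch, not this repo's code: interpret the device output buffer
// according to the binding's data type instead of always assuming FP32.
#include <NvInfer.h>
#include <cuda_fp16.h>
#include <cuda_runtime.h>
#include <vector>

std::vector<float> copyOutputAsFloat(const nvinfer1::ICudaEngine& engine,
                                     int outputIndex,        // binding index of the output tensor
                                     const void* deviceOut,  // device pointer filled by the enqueue call
                                     size_t elemCount)       // number of elements in the output
{
    std::vector<float> host(elemCount);
    if (engine.getBindingDataType(outputIndex) == nvinfer1::DataType::kHALF) {
        // FP16 engine: copy the half-precision bytes as-is, then convert on the host.
        std::vector<__half> tmp(elemCount);
        cudaMemcpy(tmp.data(), deviceOut, elemCount * sizeof(__half), cudaMemcpyDeviceToHost);
        for (size_t i = 0; i < elemCount; ++i) {
            host[i] = __half2float(tmp[i]);
        }
    } else {
        // FP32 engine: the bytes map directly onto the float buffer.
        cudaMemcpy(host.data(), deviceOut, elemCount * sizeof(float), cudaMemcpyDeviceToHost);
    }
    return host;
}
```

If something like this is already done and the NaN values persist, the device-side allocation size might also need to use sizeof(__half) instead of sizeof(float) for the FP16 bindings.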

I will keep working on this to see if I can find a solution, but any help is welcome.
Thanks in advance.

@adrigrillo (Author)

I have explored the topic a bit further today, but I am still not able to make the model work with any engine exported using the --half flag in the original yolov5 repo.

However, testing with trtexec shows no problem with the exported engine: inference runs in 3 ms, compared with 5 ms for the default model.
I have tried both the original repo and the conversion script you made.

Therefore, I do not understand what the problem is, since there is no way to tell the engine loader that the expected data is in float16.
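
That said, maybe it does not need to be indicated externally at all: if the loader deserializes the engine into an nvinfer1::ICudaEngine, the precision of each binding can be read back from the engine itself. A small sketch of what I mean (assuming access to the deserialized engine; this is not the repo's loader code):

```cpp
// Hypothetical sketch: after deserializing, print each binding's name,
// direction and data type so the loader can size and interpret its buffers.
#include <NvInfer.h>
#include <cstdio>

void printBindings(const nvinfer1::ICudaEngine& engine)
{
    for (int i = 0; i < engine.getNbBindings(); ++i) {
        std::printf("%-20s %-6s dtype=%d\n",
                    engine.getBindingName(i),
                    engine.bindingIsInput(i) ? "input" : "output",
                    static_cast<int>(engine.getBindingDataType(i)));
    }
}
```

If the output binding of the --half engine reports nvinfer1::DataType::kHALF, that would confirm that the host side needs a half-to-float conversion before the post-processing step.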
