2 questions about calibrations #38
Hi, I tried the model on your image with FP32/FP16/INT8, and the results are reasonable. I uploaded them here. The command and the model I am using (FP16 as example): Can you try with the model I am using and see if it is due to the model file or TensorRT? And please pull the newest commit from our repo, because the newest code will automatically convert YOLACT weights.
I will try it now.
We do not need to train with FP16. Simply train a full-precision model and you can convert the trained model to FP16/INT8 using TensorRT.
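For reference, a minimal sketch of what that post-training conversion looks like with torch2trt (yolact_edge wraps this for you; `MyTrainedModel` and `calib_dataset` below are placeholders, not names from this repo):

```python
import torch
from torch2trt import torch2trt

# Placeholder: any trained full-precision network in eval mode.
model = MyTrainedModel().cuda().eval()
x = torch.randn(1, 3, 400, 400).cuda()  # dummy input at the deployment resolution

# FP16: only a builder flag, no calibration data required.
model_fp16 = torch2trt(model, [x], fp16_mode=True)

# INT8: additionally needs a calibration dataset to estimate activation ranges.
model_int8 = torch2trt(model, [x], int8_mode=True,
                       int8_calib_dataset=calib_dataset)
```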
Can you try removing all TensorRT cache with
This is what I did: `[01/25 11:22:15 yolact.eval]: Loading model... /home/ws/images/imgs_in/cars.jpg -> /home/ws/imgs_out/cars.png` but the results are the same...
The line through the first lines is because of the '~' in the Linux paths (markdown strikethrough).
Btw, this is the config:
So you basically trained your own model on this dataset with an image size of 400x400? Would you mind sharing the trained model with me by email so that I can test it? We haven't seen a model with such a huge performance difference between FP16 and FP32.
Yes, I can share it with you. Thanks
liuhaotian.cn at gmail |
OK, I am not at work right now (it's 21:00 our time).
@haotian-liu I sent it now. Thanks
My collaborator and I will take a look later this week and will let you know of any updates, thanks.
Thanks!
Hi @haotian-liu |
@sdimantsd Hi, we found that it is due to the TensorRT conversion of the prediction module/FPN: when we disable these two conversions and only use the backbone/protonet conversions, everything works fine. Could you try this on your model/dataset? We also found that native PyTorch FP16 conversion works fine. We have decided to contact the upstream TensorRT and torch2trt maintainers for more information and help.
Thanks!
Setting these two options to False in the config disables TensorRT for the FPN (and similarly for other modules):
{
    'torch2trt_fpn': False,
    'torch2trt_fpn_int8': False,
}
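In case it's useful, a hypothetical sketch of how those flags would be set on a custom config, following the yolact-style `Config.copy` pattern (the base config name `yolact_edge_config` here is an assumption; use whichever config you train with):

```python
# Hypothetical: start from a base config and override only the FPN flags,
# so TensorRT still converts the backbone/protonet but skips the FPN.
my_custom_config = yolact_edge_config.copy({
    'torch2trt_fpn': False,
    'torch2trt_fpn_int8': False,
})
```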
Hi, I am closing this issue and merging the discussion of TensorRT conversion issues after training on a custom dataset into issue #47, as it is quite hard for me to track so many open issues. Hope you understand, thanks.
I figured out the cause and applied a fix; details of the solution are explained in #47. Please take a look and see if that resolves the issue.
OK.
@sdimantsd Hello, I would like to ask: do you run inference on a Jetson Nano? The backbone is ResNet-101 and the image size is 400, right? How much memory did you use during conversion to TensorRT and during inference? I couldn't run inference on a Jetson Nano 2GB; the process was killed. My backbone is MobileNetV2, and I tried image sizes of 320, 160, and 80. I'm considering switching to a Jetson Nano 4GB. Also, will the pred_scales affect the result? I see that you have made changes there. Thank you.
@chingi071 You can try setting cfg.torch2trt_max_calibration_images to a lower value (e.g. 5). If it still OOMs, you might use TensorRT FP16 instead with --use_fp16_tensorrt.
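A small sketch of that low-memory setup, assuming the yolact-style global `cfg` object (the import path is a guess; the attribute name is quoted from this thread):

```python
from data.config import cfg  # assumed import path for the global config

# Fewer calibration images means less memory held during INT8 calibration.
cfg.torch2trt_max_calibration_images = 5

# If calibration still OOMs, skip INT8 altogether and pass --use_fp16_tensorrt
# to the eval script: FP16 conversion does not run a calibration pass at all.
```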
@haotian-liu Hello, I tried setting cfg.torch2trt_max_calibration_images to a smaller value (1 and 5) and used --use_fp16_tensorrt, but it was still killed... My current resolution is 320; do I need to set a smaller one?
What if you only use the PyTorch version? We haven't tested our method on the Jetson Nano, so I cannot provide much advice.
I am using the PyTorch version. I will try a Jetson Nano 4GB to see if it can run inference, thank you.
@chingi071 Not sure if I was being clear; I mean run with --disable_tensorrt.
Oh, I misunderstood. I haven't tried --disable_tensorrt. I will try it, thank you very much!
Hi @chingi071 |
@sdimantsd Thank you very much! This is very useful information for me.
Is it possible that FP16 doesn't do calibration?
It looks like it from the code (https://github.com/haotian-liu/yolact_edge/blob/662d760f8b2d8b4409d385aaf172e155aaa3a3d8/utils/tensorrt.py#L38).
Thanks
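For context, here is a generic TensorRT builder sketch (not the yolact_edge code at that link) showing why FP16 would not involve calibration: in TensorRT, calibration exists only to pick INT8 quantization ranges, while FP16 is just a builder flag.

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
config = builder.create_builder_config()

# FP16: a single builder flag; no calibration data is ever consumed.
config.set_flag(trt.BuilderFlag.FP16)

# INT8 would additionally require a calibrator to derive activation ranges:
# config.set_flag(trt.BuilderFlag.INT8)
# config.int8_calibrator = my_calibrator  # e.g. an IInt8EntropyCalibrator2 subclass
```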