Zero mAP and no detections on custom dataset #61

Closed
ghost opened this issue Feb 23, 2021 · 14 comments
ghost commented Feb 23, 2021

Dear Haotian Liu,
I'm currently trying yolact_edge trained on a custom COCO-like dataset with a single class ('person'). Unlike COCO, the image resolution is 512.

When I run evaluation with TensorRT conversion, I get the following error during the protonet conversion:

[02/23 15:26:23 yolact.eval]: Converting protonet to TensorRT...
Traceback (most recent call last):
  File "eval.py", line 1241, in <module>
    convert_to_tensorrt(net, cfg, args, transform=BaseTransform())
  File "/home/oidpsv/yolact_edge/utils/tensorrt.py", line 156, in convert_to_tensorrt
    net.to_tensorrt_protonet(cfg.torch2trt_protonet_int8, calibration_dataset=calibration_protonet_dataset, batch_size=args.trt_batch_size)
  File "/home/oidpsv/yolact_edge/yolact.py", line 1565, in to_tensorrt_protonet
    self.trt_load_if("proto_net", trt_fn, [x], int8_mode, batch_size=batch_size)
  File "/home/oidpsv/yolact_edge/yolact.py", line 1534, in trt_load_if
    module = trt_fn(module, trt_fn_params)
  File "/opt/conda/lib/python3.6/site-packages/torch2trt-0.1.0-py3.6-linux-x86_64.egg/torch2trt/torch2trt.py", line 555, in torch2trto
    engine = builder.build_cuda_engine(network)
  File "/opt/conda/lib/python3.6/site-packages/torch2trt-0.1.0-py3.6-linux-x86_64.egg/torch2trt/calibration.py", line 51, in get_batch
    buffer[i].copy_(tensor)
RuntimeError: The size of tensor a (69) must match the size of tensor b (64) at non-singleton dimension 2

This can be fixed by changing line 1563 in yolact.py from x = torch.ones((1, 256, 69, 69)).cuda() to x = torch.ones((1, 256, 64, 64)).cuda().

The problem is that, with this change, further evaluation with TensorRT conversion gives zero mAP, and processing images produces empty results (no masks or boxes). Could you please help me? Many thanks.
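
For reference, the 69 vs. 64 mismatch seems to come from the protonet taking the P3 feature map as input, whose spatial size scales with the configured image size (roughly three stride-2 stages, i.e. about size/8 with rounding up). A minimal sketch of that arithmetic, with the number of downsampling stages as an assumption:

import math

# Rough sketch: P3 (the protonet input) is ~1/8 of the input resolution,
# via three stride-2 stages that round up. The stage count is an assumption.
def p3_size(img_size, num_stride2_stages=3):
    size = img_size
    for _ in range(num_stride2_stages):
        size = math.ceil(size / 2)
    return size

print(p3_size(550))  # 69 -> the default (1, 256, 69, 69) calibration tensor
print(p3_size(512))  # 64 -> the shape calibration actually sees at resolution 512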

@haotian-liu (Collaborator)

What is the mAP without TensorRT? Also, did you set the classes to ('person') or ('person',)? The first one is incorrect.
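
The trailing comma matters because ('person') is just a parenthesized string in Python, while ('person',) is a one-element tuple; any code that iterates over the class list would otherwise see individual characters. A quick illustration:

classes_wrong = ('person')   # no trailing comma: this is the string 'person'
classes_right = ('person',)  # trailing comma: a 1-element tuple

print(len(classes_wrong), list(classes_wrong))  # 6 ['p', 'e', 'r', 's', 'o', 'n']
print(len(classes_right), list(classes_right))  # 1 ['person']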

ghost commented Feb 24, 2021

It is in the range of 70-80 without TensorRT. Also, the classes are set as ('person',) and the label map is {1: 1}.
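
For anyone hitting this later, here is a minimal sketch of how such a single-class dataset is typically declared in a YOLACT-style config; the dataset name and paths below are placeholders, not taken from this thread:

my_person_dataset = dataset_base.copy({
    'name': 'My Person Dataset',               # placeholder name
    'train_images': 'path/to/train/images',
    'train_info': 'path/to/train/annotations.json',
    'valid_images': 'path/to/valid/images',
    'valid_info': 'path/to/valid/annotations.json',
    'class_names': ('person',),                # trailing comma: 1-element tuple
    'label_map': {1: 1},                       # COCO category id -> contiguous class id
})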

@haotian-liu (Collaborator)

If you evaluate with FP16 TensorRT, does the AP improve?

ghost commented Feb 24, 2021

Unfortunately it is the same with FP16: zero mAP.

@haotian-liu (Collaborator)

Please try the latest code base in the haotian-dev branch.


XYunaaa commented Mar 1, 2021

Hi @oidpsv, I also encountered this problem. Have you found a solution yet?

@haotian-liu (Collaborator)

@XYunaaa Please see the newest code base and use the safe mode as explained here.


Jason-Lee0 commented Mar 16, 2021

@haotian-liu Sorry to bother you.
I have encountered a similar issue, but in my case I replaced the backbone with ReXNetV1 (https://github.com/clovaai/rexnet/blob/master/rexnetv1.py).

The model evaluates normally when I disable TensorRT, but when I convert it with the TensorRT command, an error appears.

First, I added the following code to the TRT modules:

        elif cfg.backbone.name == "MobileNetV2":
            x = [
                torch.randn(1, 32, 69, 69).cuda(),
                torch.randn(1, 64, 35, 35).cuda(),
                torch.randn(1, 160, 18, 18).cuda(),
            ]
        elif cfg.backbone.name == "MobileNetV3":
            x = [
                torch.randn(1, 24, 69, 69).cuda(),
                torch.randn(1, 40, 35, 35).cuda(),
                torch.randn(1, 40, 18, 18).cuda(),
            ]
        elif cfg.backbone.name == "Rexnet":
            x = [
                torch.randn(1, 16, 69, 69).cuda(),
                torch.randn(1, 27, 35, 35).cuda(),
                torch.randn(1, 50, 18, 18).cuda(),
            ]

        if cfg.backbone.name == "ResNet50" or cfg.backbone.name == "ResNet101":
            x = [
                torch.randn(1, 256, 69, 69).cuda(),
                torch.randn(1, 256, 35, 35).cuda(),
                torch.randn(1, 256, 18, 18).cuda(),
            ]
        elif cfg.backbone.name == "MobileNetV2":
            x = [
                torch.randn(1, 256, 69, 69).cuda(),
                torch.randn(1, 256, 35, 35).cuda(),
                torch.randn(1, 256, 18, 18).cuda(),
            ]
        elif cfg.backbone.name == "MobileNetV3":
            x = [
                torch.randn(1, 256, 69, 69).cuda(),
                torch.randn(1, 256, 35, 35).cuda(),
                torch.randn(1, 256, 18, 18).cuda(),
            ]
        elif cfg.backbone.name == "Rexnet":
            x = [
                torch.randn(1, 256, 69, 69).cuda(),
                torch.randn(1, 256, 35, 35).cuda(),
                torch.randn(1, 256, 18, 18).cuda(),
            ]

        if cfg.backbone.name == "ResNet50" or cfg.backbone.name == "ResNet101":
            x = torch.randn(1, 512, 69, 69).cuda()
        elif cfg.backbone.name == "MobileNetV2":
            x = torch.randn(1, 32, 69, 69).cuda()
        elif cfg.backbone.name == "MobileNetV3":
            x = torch.randn(1, 24, 69, 69).cuda()
        elif cfg.backbone.name == "Rexnet":
            x = torch.randn(1, 16, 69, 69).cuda()

The settings for MobileNetV3 and ReXNet follow the format of the existing code.

When I use the model with the ReXNet backbone, the following error appears.

[error screenshot]

I also changed line 1512:

x = torch.ones((1, 256, 69, 69)).cuda() -> x = torch.ones((1, 256, 112, 112)).cuda()

The error no longer appears, but I still get zero mAP, so I don't think that line is the problem.

I think the problem may be caused by my torch.randn settings.

Could you tell me the correct way to set the torch.randn shapes? How do I determine the right parameters for each TRT module?

Thanks for reading~

Have a nice day.

BTW, I have already used the --use_tensorrt_safe_mode flag; the error still occurs.

@haotian-liu (Collaborator)

@ntut108318099 You should use torch.randn(n, c, h, w), where (n, c, h, w) is the expected shape of the tensor for all images passing through that module.

@Jason-Lee0

@haotian-liu Thanks for your reply.

Sorry, could you tell me where I can find the expected shape of the tensor for all images passing through each module? Which code can I use to see it?

Thanks for your help.

@haotian-liu (Collaborator)

@ntut108318099

Each module, e.g. FPN_phase_1, has some inputs, e.g. def forward(self, x1, x2, x3). There we put x = [torch.randn(*x1.shape), torch.randn(*x2.shape), torch.randn(*x3.shape)], replacing the shapes with the actual numbers.
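
One way to find those numbers (a debugging sketch, not code from the repo; the attribute names below are illustrative) is to register forward hooks on the modules you plan to convert and run a single image through the model:

import torch

def log_input_shapes(name):
    # Print the shapes of all tensor inputs reaching a module's forward().
    def hook(module, inputs, output):
        shapes = [tuple(t.shape) for t in inputs if isinstance(t, torch.Tensor)]
        print(name, shapes)
    return hook

# Illustrative usage, assuming `net` is the loaded Yolact model:
# net.fpn_phase_1.register_forward_hook(log_input_shapes("fpn_phase_1"))
# net.proto_net.register_forward_hook(log_input_shapes("proto_net"))
# net(torch.randn(1, 3, cfg.max_size, cfg.max_size).cuda())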

@Jason-Lee0

@haotian-liu

Following your advice, I converted the TRT modules successfully.
But it shows another error: shape '[1, 25575, 81]' is invalid for input of size 1559088.

[error screenshot]

Have you ever seen this error before?
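
For what it's worth, the numbers in that error look like a prior-count mismatch: 1559088 = 19248 × 81, and 19248 is the prior count for the default 550-input pyramid (69², 35², 18², 9², 5² locations × 3 anchors), while 25575 would correspond to an 80/40/20/10/5 pyramid. A quick check of that arithmetic (the anchor layout is an assumption about the default config):

def num_priors(feature_sizes, anchors_per_loc=3):
    # Total anchor boxes over a feature pyramid.
    return sum(s * s for s in feature_sizes) * anchors_per_loc

num_classes = 81  # 80 COCO classes + background

print(num_priors([69, 35, 18, 9, 5]))                # 19248
print(num_priors([69, 35, 18, 9, 5]) * num_classes)  # 1559088 -> the tensor size in the error
print(num_priors([80, 40, 20, 10, 5]))               # 25575   -> the prior count in '[1, 25575, 81]'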

@Jason-Lee0

@haotian-liu Hi, sorry to bother you.

I have fixed the problem.

It was caused by PredictionModuleTRTWrapper; I didn't notice that some parameters there also needed to be set.

The result looks great!
Thanks for your support.

Have a nice day.

@haotian-liu (Collaborator)

@ntut108318099 Glad you have the problem solved!
