Zero recall value while evaluating on LMO dataset #93

supriya-gdptl · 2022-12-04T06:45:39Z

I tried to evaluate the GDR-Net model on the LMO dataset using the pretrained models you shared on OneDrive.
I used the following command to run the evaluation:

python core/gdrn_modeling/main_gdrn.py --config-file configs/gdrn/lmo/a6_cPnP_AugAAETrunc_BG0.5_lmo_real_pbr0.1_40e.py \
    --num-gpus 1 \
    --eval-only \
    --opts MODEL.WEIGHTS=output/gdrn/lmo/a6_cPnP_AugAAETrunc_BG0.5_lmo_real_pbr0.1_40e/gdrn_lmo_real_pbr.pth

However, it is showing zero recall values. Please see the screenshot below.
Could you please help?

Thank you,
Supriya


wangg12 · 2022-12-04T13:28:10Z

Maybe you should check your full running log to see where the problem is.
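One possible way to go through a long log is sketched below. It is only an illustration: `scan_log` is a hypothetical helper, not part of GDR-Net or detectron2, and the keyword list is an assumption about what a problematic line might contain. It collects every checkpoint path that follows "Loading from" (the wording used by the fvcore checkpointer in the log shown in this thread) and every line that matches one of the keywords.

```python
import re

# Hypothetical helper, not part of GDR-Net: pull out the checkpoint paths that
# were actually loaded and any lines that look like warnings or errors.
def scan_log(lines):
    loaded = []      # paths that follow "Loading from"
    suspicious = []  # lines matching the assumed keyword list
    for line in lines:
        match = re.search(r"Loading from (\S+)", line)
        if match:
            loaded.append(match.group(1))
        if re.search(r"warn|error|not found|skip|mismatch", line, re.IGNORECASE):
            suspicious.append(line.rstrip())
    return loaded, suspicious
```

Called as `scan_log(open("log.txt", encoding="utf-8"))` (the file name is a placeholder), it makes it easy to compare the path that was loaded with the one passed via `MODEL.WEIGHTS`.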

supriya-gdptl · 2022-12-04T20:59:20Z

The features from the backbone are a tensor of zeros (line 121 in GDRN.py). Because of this, all subsequent steps also output zero tensors.

features = 
tensor([[[[0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          ...,
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.]]]], device='cuda:0')
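To narrow down where the zeros first appear, forward hooks can record which layers produce an all-zero output. This is a minimal sketch on a toy network, not the GDR-Net backbone; `find_zero_outputs` and `toy` are names made up for illustration.

```python
import torch
import torch.nn as nn

def find_zero_outputs(model, example_input):
    """Return the names of leaf modules whose output is entirely zero, in call order."""
    zero_modules = []
    handles = []

    def make_hook(name):
        def hook(module, inputs, output):
            # Only plain tensor outputs are checked in this sketch.
            if isinstance(output, torch.Tensor) and torch.count_nonzero(output) == 0:
                zero_modules.append(name)
        return hook

    for name, module in model.named_modules():
        if len(list(module.children())) == 0:  # leaf modules only
            handles.append(module.register_forward_hook(make_hook(name)))
    try:
        with torch.no_grad():
            model(example_input)
    finally:
        for handle in handles:
            handle.remove()  # always detach the hooks again
    return zero_modules

# Toy stand-in for a backbone whose first convolution has zeroed weights.
toy = nn.Sequential(nn.Conv2d(3, 4, 3, bias=False), nn.ReLU())
with torch.no_grad():
    toy[0].weight.zero_()
print(find_zero_outputs(toy, torch.randn(1, 3, 8, 8)))
```

If the very first layer already shows up in the list, the weights themselves are the likely suspect rather than the input data.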

The log says all weights (backbone, pnp_net and rot_head) from the checkpoint are loaded correctly, yet the output of the backbone is still a zero tensor.
Have you encountered this error before? Do you know what might be causing it?
See the log below:

20221204_124725|fvcore.common.checkpoint@152: [Checkpointer] Loading from D:/research/data/gdrnet_data/gdrn/lmo/a6_cPnP_AugAAETrunc_BG0.5_lmo_real_blender_160e/model_final_wo_optim.pth ...
20221204_124728|d2.checkpoint.c2_model_loading@324: Following weights matched with model:
| Names in Model                        | Names in Checkpoint                                                                       | Shapes                         |
|:--------------------------------------|:------------------------------------------------------------------------------------------|:-------------------------------|
| backbone.bn1.*                        | backbone.bn1.{bias,num_batches_tracked,running_mean,running_var,weight}                   | (64,) () (64,) (64,) (64,)     |
| backbone.conv1.weight                 | backbone.conv1.weight                                                                     | (64, 3, 7, 7)                  |
| backbone.layer1.0.bn1.*               | backbone.layer1.0.bn1.{bias,num_batches_tracked,running_mean,running_var,weight}          | (64,) () (64,) (64,) (64,)     |
| backbone.layer1.0.bn2.*               | backbone.layer1.0.bn2.{bias,num_batches_tracked,running_mean,running_var,weight}          | (64,) () (64,) (64,) (64,)     |
| backbone.layer1.0.conv1.weight        | backbone.layer1.0.conv1.weight                                                            | (64, 64, 3, 3)                 |
| backbone.layer1.0.conv2.weight        | backbone.layer1.0.conv2.weight                                                            | (64, 64, 3, 3)                 |
| backbone.layer1.1.bn1.*               | backbone.layer1.1.bn1.{bias,num_batches_tracked,running_mean,running_var,weight}          | (64,) () (64,) (64,) (64,)     |
| backbone.layer1.1.bn2.*               | backbone.layer1.1.bn2.{bias,num_batches_tracked,running_mean,running_var,weight}          | (64,) () (64,) (64,) (64,)     |
| backbone.layer1.1.conv1.weight        | backbone.layer1.1.conv1.weight                                                            | (64, 64, 3, 3)                 |
....
| pnp_net.fc1.*                         | pnp_net.fc1.{bias,weight}                                                                 | (1024,) (1024,8192)            |
| pnp_net.fc2.*                         | pnp_net.fc2.{bias,weight}                                                                 | (256,) (256,1024)              |
| pnp_net.fc_r.*                        | pnp_net.fc_r.{bias,weight}                                                                | (6,) (6,256)                   |
| pnp_net.fc_t.*                        | pnp_net.fc_t.{bias,weight}                                                                | (3,) (3,256)                   |
.....
| rot_head_net.features.0.weight        | rot_head_net.features.0.weight                                                            | (512, 256, 3, 3)               |
| rot_head_net.features.1.*             | rot_head_net.features.1.{bias,num_batches_tracked,running_mean,running_var,weight}        | (256,) () (256,) (256,) (256,) |
| rot_head_net.features.10.weight       | rot_head_net.features.10.weight                                                           | (256, 256, 3, 3)               |

Thank you,
Supriya

supriya-gdptl · 2022-12-05T02:01:59Z

Hi @wangg12,

Could you please tell me which version of detectron2 you used?
The detectron2 website link that you shared in README.md (link) is for detectron2 version 0.6.

Circled in red in the image below

wangg12 · 2022-12-05T03:33:13Z

Yes, but I installed from source. It seems you were running on Windows; could you run the code on Ubuntu?

supriya-gdptl · 2022-12-05T05:33:06Z

Thank you for the suggestion @wangg12.

I figured out the issue.
The features from the backbone were zero because the weights of the backbone were zero.
The checkpoint was loaded correctly, but for some unknown reason line 550 in gdrn_evaluator.py was resetting the weights to zero.

I resolved this issue by loading the checkpoint again after line 550.
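The idea of this workaround can be sketched on a toy model. This is not the actual code in gdrn_evaluator.py: the explicit zeroing loop only stands in for whatever line 550 does, `zero_weights` is a hypothetical helper, and the in-memory `snapshot` stands in for the checkpoint file (in the real script the reload would go through the checkpoint-loading call the script already uses).

```python
import copy
import torch
import torch.nn as nn

def zero_weights(model):
    """Names of '*.weight' tensors in the state dict that are entirely zero."""
    return [
        name
        for name, tensor in model.state_dict().items()
        if name.endswith("weight") and torch.count_nonzero(tensor) == 0
    ]

toy = nn.Sequential(nn.Conv2d(3, 4, 3), nn.BatchNorm2d(4))
# Deep copy, because state_dict() tensors share storage with the live parameters.
snapshot = copy.deepcopy(toy.state_dict())

# Stand-in for the unidentified step that reset the weights.
with torch.no_grad():
    for param in toy.parameters():
        param.zero_()
print(zero_weights(toy))       # the zeroed weights are reported here

toy.load_state_dict(snapshot)  # "load the checkpoint again"
print(zero_weights(toy))       # nothing should be reported after the reload
```

A check like `zero_weights` just before inference confirms whether the reload actually took effect.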
I got the following result.

Could you please tell me what each metric in the first column stands for, i.e. ad_2, rete_2, re_2, te_2, proj_2, re and te?

Thank you,
Supriya

wangg12 · 2022-12-05T06:20:24Z

You can find what those metrics mean here: https://github.com/THU-DA-6D-Pose-Group/GDR-Net/blob/main/core/gdrn_modeling/gdrn_custom_evaluator.py#L772

wangg12 closed this as completed Dec 5, 2022
