
About test.py #3

Open
BioDPJ opened this issue Mar 17, 2022 · 2 comments

Comments

@BioDPJ

BioDPJ commented Mar 17, 2022

Dr. Jiang,
Sorry to bother you.
I ran the command "CUDA_VISIBLE_DEVICES=0 python -u cad_recognition/test.py --data_dir data/FloorPlansGraph5_iter --pretrained_model log/run182_2_best.pth" with the code referencing "opt.arch" and "opt.graph" commented out.
However, both before and after commenting that code out, I still got these errors:
"size mismatch for cls_net.fusion_block.0.weight: copying a param with shape torch.Size([1024, 128]) from checkpoint, the shape in current model is torch.Size([1024, 448]).
size mismatch for cls_net.fusion_block_super.0.weight: copying a param with shape torch.Size([1024, 128]) from checkpoint, the shape in current model is torch.Size([1024, 448]).
size mismatch for prediction_cls.0.0.weight: copying a param with shape torch.Size([512, 2304]) from checkpoint, the shape in current model is torch.Size([512, 2944])."
It is really confusing, since the model was saved by "def save_checkpoint()" yet its parameters do not match when the checkpoint is loaded.
Could you please help resolve this issue?
Thanks a lot; looking forward to your response.

Best regards,
VivianBB.
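One quick way to locate such mismatches before calling load_state_dict is to compare the parameter shapes recorded in the checkpoint against those of the freshly built model. A minimal sketch, assuming nothing about the repository's code; the helper name is hypothetical, the shape values are copied from the error message above, and plain tuples stand in for torch.Size (with real PyTorch objects the dicts would be built as {k: tuple(v.shape) for k, v in state_dict.items()}):

```python
def find_shape_mismatches(ckpt_shapes, model_shapes):
    """Return {param_name: (ckpt_shape, model_shape)} for every parameter
    present in both dicts whose shapes disagree."""
    return {
        name: (ckpt_shapes[name], model_shapes[name])
        for name in ckpt_shapes.keys() & model_shapes.keys()
        if ckpt_shapes[name] != model_shapes[name]
    }

# Shapes taken from the error message in this issue.
ckpt = {
    "cls_net.fusion_block.0.weight": (1024, 128),
    "prediction_cls.0.0.weight": (512, 2304),
}
model = {
    "cls_net.fusion_block.0.weight": (1024, 448),
    "prediction_cls.0.0.weight": (512, 2944),
}

for name, (c, m) in sorted(find_shape_mismatches(ckpt, model).items()):
    print(f"{name}: checkpoint {c} vs model {m}")
```

Every parameter that appears in this report was built with a different width at test time than at training time, which points to a configuration (architecture-argument) difference rather than a corrupted checkpoint.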

@xinyangj
Contributor

Hi, this is probably because the architecture used for training does not match the network architecture in test.py (i.e. the --arch arg). Please make sure they are the same.

@BioDPJ
Author

BioDPJ commented Mar 22, 2022

Thanks for your time, Dr. Jiang. I need to correct my earlier comment.
I did try passing args such as "--arch centernet3cc_rpn_gp_iter2 --graph bezier_cc_bb_iter". The testing command only runs once I also add the arg "--n_blocks_out 2", but the odd thing is that the test loss and loss_cls are over 684 (totally weird) even though top accuracy is over 90. The model seems to predict non-objects instead.
The loss curve in TensorBoard looks well-behaved during training.
So there are actually two issues with this command:
a. why do we have to add the "--n_blocks_out" arg at test time, and b. why are the testing losses still so high no matter which parameters I try?

PS: the full testing command with n_blocks: "CUDA_VISIBLE_DEVICES=0 python -u cad_recognition/test.py --data_dir data/FloorPlansGraph5_iter --phase test --n_blocks_out 2 --arch centernet3cc_rpn_gp_iter2 --graph bezier_cc_bb_iter --pretrained_model log/sem_seg_sparse-res-attr_edge-n2-C64-k16-drop0.0-lr0.00025_B4_20220314-202805_17e0fdb1-ddaf-4cfa-b79f-39a4dcb26998/checkpoint/run182_2_best.pth"
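On question a., this behaviour is consistent with how such flags usually work: any argument that changes the number of blocks also changes layer widths, so the model built at test time only fits the checkpoint if the same value is passed. A toy illustration, assuming a hypothetical width formula that is NOT the repository's actual architecture:

```python
def fusion_input_width(n_blocks, channels_per_block=64):
    # Hypothetical formula: if features from each block are concatenated,
    # the fusion layer's input width scales with the block count.
    return n_blocks * channels_per_block

def checkpoint_fits(train_n_blocks, test_n_blocks):
    # The saved weight matrix only fits if training and testing built the
    # fusion layer with the same input width.
    return fusion_input_width(train_n_blocks) == fusion_input_width(test_n_blocks)

print(checkpoint_fits(2, 2))  # True  - shapes line up, checkpoint loads
print(checkpoint_fits(2, 3))  # False - size mismatch at load time
```

This is why omitting "--n_blocks_out 2" reproduces the size-mismatch errors from the original report: the test-time default builds a different width than the one the checkpoint was trained with.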
