Size mismatch error when loading the pretrained model you provide #15

Closed
HXYNODE opened this issue Apr 11, 2021 · 5 comments

Comments


HXYNODE commented Apr 11, 2021

Awesome work!
When I try to reproduce the results reported in this repository (i.e. a CIDEr score of 97.8 on the MSVD dataset), running evaluate.py with your pretrained file results/msvd_model/msvd_best_cider.pth fails with size mismatch errors across the whole CapModel.
For example:
Runtime error: Error(s) in loading state_dictionary for CapModel:
size mismatch for encoder.bi_lstm1.weight_it_l0: copying a parameters with shape torch.Size([2048,1000]) from checkpoint, the shape in current model is torch.Size([5200,1000]).
size mismatch ……
size mismatch ……
It seems that you modified the model without updating msvd_best_cider.pth.
If so, please let me know.
I would appreciate it if you could provide the new version of the .pth file so that I can reproduce the results reported in this repository.
By the way, why were the final, higher results not published in the paper?
Thanks!

Owner

tgc1997 commented Apr 12, 2021

Sorry, I cannot reproduce your errors. You can check the tensors' sizes step by step (5200 is a strange number). Training of the model is not very stable, so the final MSVD result in the paper is an average of three models' results.
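
One way to do the step-by-step size check suggested here is to compare the parameter shapes stored in the checkpoint with those of the freshly built model before loading. The sketch below is a minimal, framework-free illustration: `find_shape_mismatches` is a hypothetical helper, not part of this repository, and the example shapes are copied from the error message quoted earlier in this thread.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[str]:
    """Return one readable line per shared parameter whose shape differs.

    Hypothetical helper for illustration; both arguments map a parameter
    name to its shape as a plain tuple.
    """
    report = []
    for name in sorted(checkpoint_shapes.keys() & model_shapes.keys()):
        saved, current = checkpoint_shapes[name], model_shapes[name]
        if saved != current:
            report.append(f"{name}: checkpoint {saved} vs model {current}")
    return report


# Shapes copied from the error message reported in this issue.
checkpoint_shapes = {"encoder.bi_lstm1.weight_it_l0": (2048, 1000)}
model_shapes = {"encoder.bi_lstm1.weight_it_l0": (5200, 1000)}

for line in find_shape_mismatches(checkpoint_shapes, model_shapes):
    print(line)
```

With PyTorch, the two dictionaries could be built with something like `{k: tuple(v.shape) for k, v in state_dict.items()}`, once for the loaded checkpoint and once for `model.state_dict()`, assuming the .pth file stores a plain state dict.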

Author

HXYNODE commented Apr 13, 2021

Thanks for your tips.
I tried to reproduce the project again on another machine, but the same error is reported.
The error I got before:
[Screenshot from 2021-04-13 13-37-40]
I tried to debug evaluate.py following your suggestion.
The strange number 5200 shows up as follows:
[Screenshot from 2021-04-13 13-33-43]
The errors may be caused by the parameter sizes inside bi_lstm.
Could you post a screenshot like my second one, taken when you run evaluate.py in debug mode?
It confuses me a lot, and I really want to find the reason.
I still think the core problem is that the network structure is incompatible with the MSVD checkpoint file.
Thank you again for your kind and generous help!

Owner

tgc1997 commented Apr 13, 2021

Make sure you didn't change the size parameter in utils.opt.py, and run the following command:
python evaluate.py --dataset=msvd --model=RMN --result_dir=results/msvd_model --use_loc --use_rel --use_func --hidden_size=512 --att_size=1024 --test_batch_size=2 --beam_size=2 --eval_metric=CIDEr

I also tried printing the sizes as you did, but I didn't find anything wrong:
[Image: weight_size]

Owner

tgc1997 commented Apr 13, 2021

Note that the hidden size for msvd is 512 as mentioned in the paper.
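
The link between the hidden size and the checkpoint shape in the error can be checked with a little arithmetic. In PyTorch's `nn.LSTM` (without projection), the first-layer input-to-hidden weight stacks the four gates, so it has `4 * hidden_size` rows. The sketch below assumes the mismatched parameter is such a weight and that its LSTM hidden size equals the `--hidden_size` flag; the helper names are hypothetical.

```python
def lstm_input_weight_shape(input_size: int, hidden_size: int) -> tuple:
    """Shape of a first-layer LSTM input-to-hidden weight in PyTorch.

    The four gates (input, forget, cell, output) are stacked along the
    first dimension, giving (4 * hidden_size, input_size).
    """
    return (4 * hidden_size, input_size)


def hidden_size_from_rows(rows: int) -> int:
    """Invert the relation above: recover hidden_size from the row count."""
    if rows % 4 != 0:
        raise ValueError(f"{rows} is not a multiple of 4")
    return rows // 4


# The checkpoint tensor in the error has 2048 rows.
print(hidden_size_from_rows(2048))

# Row counts for the two hidden sizes discussed in this thread.
for hidden_size in (512, 1000):
    print(hidden_size, lstm_input_weight_shape(1000, hidden_size))
```

Under these assumptions, the 2048 rows stored in the checkpoint correspond to a hidden size of 512, so a model built with any other value cannot load it.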

Author

HXYNODE commented Apr 13, 2021

Oops... I got it wrong and you found it!
The hidden size in my run command was still 1000 (the MSR-VTT value); I had not replaced it with 512 for MSVD.
You're so cool and nice.
Thanks a lot for your timely help and reply! :)
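
A mix-up like this can be caught before the checkpoint is loaded by checking the flag against the value expected for the dataset. The sketch below is a hypothetical guard, not code from this repository; the table uses only the two values mentioned in this thread (512 for MSVD, 1000 for MSR-VTT), and the dataset key `msr-vtt` is an assumed spelling.

```python
# Hidden sizes mentioned in this thread; the dictionary keys are assumed names.
EXPECTED_HIDDEN_SIZE = {"msvd": 512, "msr-vtt": 1000}


def check_hidden_size(dataset: str, hidden_size: int) -> None:
    """Raise early if hidden_size does not match the value expected for dataset."""
    expected = EXPECTED_HIDDEN_SIZE.get(dataset)
    if expected is not None and hidden_size != expected:
        raise ValueError(
            f"--hidden_size={hidden_size} does not match the value "
            f"expected for {dataset} ({expected})"
        )


check_hidden_size("msvd", 512)  # passes silently
try:
    check_hidden_size("msvd", 1000)  # the setting that caused this issue
except ValueError as err:
    print(err)
```

Unknown dataset names are deliberately let through, so the guard only fires for the combinations it knows about.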

HXYNODE closed this as completed Apr 13, 2021