Flickr30k Finetune results does not match the provided checkpoint #13
Comments
The fine-tuning results can be unstable due to augmentations. Also, we trained the IR/TR fine-tuning models only once.
I tried training for more epochs, but that ended up overfitting, with increasing validation loss. Would you mind providing the checkpoint for 100k steps as well?
@JACKHAHA363
Thanks @dandelin!
Were you able to solve this issue, @JACKHAHA363? I have similar issues on both Flickr30k and COCO retrieval.
Hi, bro.
Hi. Thanks for your reply. But I find that shuffle in the IR/TR image loader (code) is already False. In fact, shuffle is not explicitly set to False, but the default value in torch.utils.data.DataLoader is False. Besides, shuffle in dist_sampler is also False.
I have no idea what causes the unstable results.
T_T ...
…------------------ Original Message ------------------
From: "Wonjae ***@***.***>;
Sent: Thursday, December 30, 2021, 10:36 PM
To: ***@***.***>;
Cc: "by ***@***.***>; ***@***.***>;
Subject: Re: [dandelin/ViLT] Flickr30k Finetune results does not match the provided checkpoint (#13)
Hi @byougert
Interpolating the position embedding dynamically for every image batch can cause subtle fluctuations during batch evaluation. When computing IR/TR, we pass a batch of images to vit.visual_embed (code), and inside the vit.visual_embed method, we dynamically calculate the maximum width and height over the batch of images and interpolate pos_embed up to those dimensions (code).
I think setting the shuffle=False argument in the IR/TR image loader (code) would stop the evaluation from fluctuating by fixing the order of the image batches.
It seems I somehow deleted the argument while polishing the code for public release. Please set the argument and let me know if it fixes the issue.
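To illustrate the mechanism described above, here is a minimal pure-Python sketch. It is not the ViLT code: it assumes a patch size of 32, made-up image sizes, and a simplified rule in which the position embedding for a batch is interpolated to the largest patch grid in that batch. The helper names (patch_grid, batch_target_grid, target_grid_per_image) are hypothetical. It only shows that the same image can receive a different interpolation target when its batch mates change, while a fixed order reproduces the same targets.

```python
PATCH = 32  # assumed patch size, for illustration only


def patch_grid(height, width, patch=PATCH):
    """Number of whole patches along each axis of a single image."""
    return height // patch, width // patch


def batch_target_grid(sizes, patch=PATCH):
    """Largest patch grid in the batch (simplified interpolation target)."""
    grids = [patch_grid(h, w, patch) for h, w in sizes]
    return max(g[0] for g in grids), max(g[1] for g in grids)


def target_grid_per_image(sizes, order, batch_size):
    """Map each image index to the target grid of the batch it lands in."""
    targets = {}
    for start in range(0, len(order), batch_size):
        batch = order[start:start + batch_size]
        grid = batch_target_grid([sizes[i] for i in batch])
        for i in batch:
            targets[i] = grid
    return targets


# Hypothetical (height, width) image sizes in pixels.
sizes = [(384, 576), (384, 512), (320, 608), (384, 384), (352, 640), (384, 544)]

# Fixed sequential order, as with shuffle=False.
fixed = target_grid_per_image(sizes, list(range(len(sizes))), batch_size=2)
# The same images grouped into different batches.
reordered = target_grid_per_image(sizes, [0, 3, 1, 4, 2, 5], batch_size=2)

# Images whose interpolation target depends on the batch composition.
changed = [i for i in fixed if fixed[i] != reordered[i]]
print(changed)
```

In this toy setup, half of the images get a different target grid after regrouping, which is the kind of batch-dependent variation the comment refers to. Rerunning with the same order always yields the same mapping.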
Hi @byougert, oops, you got the mail. I deleted the comment right after posting it, as I noticed I had put shuffle=False in … Though after a quick investigation, I found the true reason. I guess the score from …
Hi, bro.
Hi, @dandelin
Hi authors,
I took the provided pretrained 200k checkpoint and fine-tuned it on Flickr30k. The resulting IR and TR scores are 64.5 and 81.7. The TR score is lower than the one in the paper. My fine-tuning command is
I also tested the given vilt_irtr_f30k.ckpt, and the results are good, with IR=65.3 and TR=83.5. Can I ask what the process of getting vilt_irtr_f30k.ckpt was?