question about ITM loss #53
Comments
Thanks for your question! First, the negative samples are sampled from the mini-batch.
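For context, here is a minimal sketch of what in-batch negative pairing for ITM can look like. It is an illustration only, not the repo's exact code: the tensor names (`image_feats`, `text_feats`) are hypothetical, and the simple roll-based mismatching is an assumption (the actual code may instead pick similarity-weighted hard negatives within the batch).

```python
# Illustrative sketch only: building ITM pairs with in-batch negatives.
# `image_feats` / `text_feats` are hypothetical [bs, dim] embeddings from one mini-batch.
import torch

def build_itm_batch(image_feats: torch.Tensor, text_feats: torch.Tensor):
    bs = image_feats.size(0)
    # Positive pairs: each image with its own caption.
    # Negative pairs: each image with another sample's caption from the same mini-batch
    # (here via a simple roll; a real implementation may sample harder negatives instead).
    neg_text = torch.roll(text_feats, shifts=1, dims=0)

    pair_img = torch.cat([image_feats, image_feats], dim=0)
    pair_txt = torch.cat([text_feats, neg_text], dim=0)
    # Fixed label order: the first bs pairs are matched (1), the remaining bs are mismatched (0).
    itm_labels = torch.cat([torch.ones(bs, dtype=torch.long),
                            torch.zeros(bs, dtype=torch.long)])
    return pair_img, pair_txt, itm_labels
```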
Thanks so much for your reply! And what is the magnitude of the ITM loss after pretraining?
It is around 0.11-0.13.
Thanks so much!
Hi @LiJunnan1992, I'm trying to collect the negative samples from the whole global batch, i.e., not limited to the same mini-batch. Could you please tell me whether such a sampling strategy would break the ITM loss under the current label order? Thanks in advance!
Yes, you can do it. Please check out our BLIP code for hard negative mining across all GPUs:
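As a rough sketch of the idea (not the exact BLIP implementation), mining negatives from the full global batch typically means gathering features from every GPU and then sampling mismatched pairs from the gathered set. The function names, shapes, and the softmax-weighted sampling below are assumptions made for illustration; it presumes a standard `torch.distributed` setup.

```python
# Rough sketch: gather features from all GPUs so negatives can be mined
# from the global batch rather than the local mini-batch.
import torch
import torch.distributed as dist

@torch.no_grad()
def concat_all_gather(tensor: torch.Tensor) -> torch.Tensor:
    """Gather a tensor from every process and concatenate along dim 0."""
    gathered = [torch.zeros_like(tensor) for _ in range(dist.get_world_size())]
    dist.all_gather(gathered, tensor)
    return torch.cat(gathered, dim=0)

def mine_hard_negative_texts(sim_i2t: torch.Tensor, local_bs: int, rank: int) -> torch.Tensor:
    """For each local image, draw one negative text index from the global batch.

    sim_i2t: [local_bs, global_bs] image-to-text similarity scores.
    """
    weights = torch.softmax(sim_i2t, dim=1).clone()
    # Mask out each image's true caption so the positive pair cannot be drawn as a negative.
    pos_idx = torch.arange(local_bs, device=sim_i2t.device) + rank * local_bs
    weights[torch.arange(local_bs), pos_idx] = 0
    return torch.multinomial(weights, 1).squeeze(1)
```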
Hi @LiJunnan1992, thanks for your reply! We will follow the BLIP work as well. We want to build on the promising ALBEF and BLIP, but the dataset has become an obstacle: the SBU Captions dataset is inaccessible. Could you please share a copy with us? Thanks! We will do our best to move forward and appreciate your enthusiastic help!
Hi,
Thanks for the great work.
After reading the code for calculating ITM loss, I have a question below:
The ITM labels for the positive and negative samples are in a "fixed" order instead of being shuffled. I'm wondering whether this order could be an issue for the ITM loss to work correctly. In some other VLP models such as ViLT, the ITM loss is calculated on shuffled positive-negative batches, as detailed at https://github.com/dandelin/ViLT/blob/762fd3975c180db6fc88f577cf39549983fa373a/vilt/modules/objectives.py#L207 (a short sketch of the fixed-order case follows this message).
Thanks in advance for your kind reply.
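On the ordering question above, a small sketch: with the usual mean-reduced cross-entropy, keeping the positive/negative pairs in a fixed order versus shuffling them yields the same ITM loss, as long as the labels are permuted together with the pairs. The tensors below are hypothetical stand-ins for the ITM head's logits, not the repo's actual outputs.

```python
# Sketch: mean cross-entropy is permutation-invariant, so a fixed pos/neg label
# order gives the same loss as a shuffled one (labels permuted with the pairs).
import torch
import torch.nn.functional as F

bs = 4
logits = torch.randn(3 * bs, 2)                      # e.g. bs positives + 2*bs negatives
labels = torch.cat([torch.ones(bs, dtype=torch.long),
                    torch.zeros(2 * bs, dtype=torch.long)])

perm = torch.randperm(3 * bs)                        # shuffle pairs and labels together
loss_fixed = F.cross_entropy(logits, labels)
loss_shuffled = F.cross_entropy(logits[perm], labels[perm])
assert torch.allclose(loss_fixed, loss_shuffled)
```

This equivalence holds provided nothing downstream is order-dependent, e.g., no statistics are computed across the concatenated positive/negative batch.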