Questions about Visual Grounding checkpoint and visualization #50

Closed
zzzzzigzag opened this issue Feb 10, 2022 · 5 comments

Comments

@zzzzzigzag

Thank you so much for your outstanding work! I'm having problems with the Visual Grounding task:

  1. How did you get refcoco.pth?
    During fine-tuning with the provided procedure (command below), a 3.3G checkpoint_best.pth file is generated, as large as the pretrained model. However, the refcoco.pth checkpoint you've released is only 800M. Could you explain how you managed to shrink the model size?
    I tried both distill: True and distill: False in the config file, which made no difference to the final size.
    python -m torch.distributed.launch --nproc_per_node=8 --use_env Grounding.py \
        --config ./configs/Grounding.yaml \
        --output_dir output/RefCOCO \
        --gradcam_mode itm \
        --block_num 8 \
        --checkpoint [Pretrained checkpoint, size 3.3G]

  2. How to evaluate refcoco.pth?
    Setting distill: False in the config file does not work for me;
    python -m torch.distributed.launch --nproc_per_node=8 --use_env Grounding.py \
        --config ./configs/Grounding.yaml \
        --output_dir output/RefCOCO_albefpth \
        --gradcam_mode itm \
        --block_num 8 \
        --evaluate \
        --checkpoint refcoco.pth
    It raises the following KeyError:
    Traceback (most recent call last):
      File "Grounding.py", line 295, in <module>
        main(args, config)
      File "Grounding.py", line 187, in main
        state_dict = checkpoint['model']
    KeyError: 'model'

  3. How to visualize the 3.3G checkpoint_best.pth file generated by fine-tuning?
    During fine-tuning, the [val, test_A, test_B] metrics printed out look fine. However, visualization.ipynb only works with refcoco.pth, not with the 3.3G checkpoint_best.pth generated by fine-tuning: the heat map is a total mess, not as expected. There seems to be a gap between checkpoint_best.pth and refcoco.pth.

@LiJunnan1992
Contributor

Hi, thanks for your question!

refcoco.pth contains the model's state_dict from checkpoint['model'], with the momentum model's state_dict removed to reduce file size. Hence, in order to load refcoco.pth, you can directly use state_dict = checkpoint, and set distill=False.
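For reference, here is a rough sketch of what the two checkpoint layouts look like when loaded with torch.load (illustrative only; the paths and the exact set of keys in checkpoint_best.pth are assumptions, not quoted from the repo):

import torch

# Fine-tuning output (~3.3G): a dict that wraps the weights together with the
# training state, so the model's state_dict lives under the 'model' key.
full = torch.load('output/RefCOCO/checkpoint_best.pth', map_location='cpu')
print(sorted(full.keys()))   # expect something like ['config', 'epoch', 'model', 'optimizer', ...]

# Released refcoco.pth (~800M): the state_dict itself, with the momentum (*_m)
# parameters already removed, so there is no 'model' key to index into.
slim = torch.load('refcoco.pth', map_location='cpu')
state_dict = slim            # use directly, with distill=False in the config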

@zzzzzigzag
Author

Thank you for your kind reply! I am now clear on two of the above questions:

For the 2nd question, I changed state_dict = checkpoint['model'] to state_dict = checkpoint at line 187 and deleted https://github.com/salesforce/ALBEF/blob/main/Grounding.py#L188-L191 , and then I got [val, test_A, test_B] metrics that match the paper.
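(For anyone hitting the same KeyError: an equivalent tweak that handles both layouts without hard-editing the file would be something along these lines, assuming the surrounding loading code in Grounding.py stays as in the repo, with model being the model instance built earlier in main. A sketch, not the exact repo code:)

checkpoint = torch.load(args.checkpoint, map_location='cpu')
# pick the state_dict whether the file is a full training checkpoint
# (with a 'model' key) or a slimmed state_dict like refcoco.pth
state_dict = checkpoint['model'] if 'model' in checkpoint else checkpoint
msg = model.load_state_dict(state_dict, strict=False)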

For the 3rd question, in the "5. Load model and tokenizer" part of visualization.ipynb, I made the following change and the visualization results look good:

"""for 3.3G not distilled models: """
msg = model.load_state_dict(checkpoint['model'],strict=False)
"""for distilled models: """
#msg = model.load_state_dict(checkpoint,strict=False)

However, I am still confused about how you shrank the model size. What should I modify in Grounding.py to get an 800M checkpoint file? If I just go with the original code, the checkpoint size is 3.3G.

@LiJunnan1992
Contributor

I used another script to delete the momentum model's parameters from checkpoint['model'] to shrink the size.
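Roughly something like the following sketch, assuming the momentum parameters are exactly the keys of the *_m modules (e.g. visual_encoder_m.*, text_encoder_m.*); the output filename is just an example, and you should check the key names against your own checkpoint:

import torch

ckpt = torch.load('output/RefCOCO/checkpoint_best.pth', map_location='cpu')
state_dict = ckpt['model']

# drop every parameter that belongs to a momentum (*_m) module
slim = {k: v for k, v in state_dict.items() if '_m.' not in k}

# save the bare state_dict (no 'model' wrapper), so it can be loaded with
# state_dict = checkpoint and distill=False, like refcoco.pth
torch.save(slim, 'refcoco_slim.pth')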

@zzzzzigzag
Author

Thank you, I can export the 800M .pth checkpoint file now.

@yuese1234

Hi, I also hit the problem that the heat map is a total mess when I use checkpoint_best.pth. I followed the steps you gave (msg = model.load_state_dict(checkpoint['model'], strict=False)), but the problem is still not solved. Can you share whether there is any additional processing? Also, can you share how you exported the 800M file? Thank you very much!
