Some questions for the paper #11

Closed
wanng-ide opened this issue Jan 19, 2022 · 17 comments
@wanng-ide

What is the difference between the scores in Table 5 and Table 8?
Table 5 reports 77.19 on the VQAv2 test-dev set, while Table 8 reports 77.68 on the same set.

@zdou0830
Owner

See #6. Thanks!

@wanng-ide
Author

@zdou0830 Thanks!

I have another question about the code.

I tried fine-tuning your pretrained model on VQA v2 with the default settings. However, the val score is only around 72.55.

It should be more than 80.

Could you share your experiment settings for fine-tuning VQA v2 and for the pretraining tasks?

@zdou0830
Owner

The command should be

```
python run.py with data_root=$DATA_DIR num_gpus=8 num_nodes=1 task_finetune_vqa_clip_bert per_gpu_batchsize=4 clip16 text_roberta image_size=576 clip_randaug load_path=meter_clip16_288_roberta_pretrain.ckpt
```

Hope this thread also helps with #7.

@wanng-ide
Author

@zdou0830

This setting is the same as mine ...

@wanng-ide
Author

If I use your finetuned model to run the test-only task on VQA v2, the result is 77.66, so that model works.

@wanng-ide
Author

(screenshot attached)

@zdou0830
Owner

You can try testing the last checkpoint and submitting the resulting JSON file to EvalAI.
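
(Side note, not from this thread: the VQA v2 track on EvalAI expects a JSON list of {"question_id": int, "answer": str} entries. A quick sanity check of the generated file before submitting could look like the sketch below; the filename vqa_submit.json is a hypothetical placeholder, not the name METER necessarily writes.)

```python
import json

# Hypothetical output filename; use whatever the test-only run actually writes.
with open("vqa_submit.json") as f:
    preds = json.load(f)

# EvalAI's VQA v2 track expects a list of {"question_id": int, "answer": str}.
assert isinstance(preds, list) and len(preds) > 0
for p in preds[:5]:
    assert set(p) == {"question_id", "answer"}
    print(p["question_id"], "->", p["answer"])
print(f"{len(preds)} predictions total")
```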

@wanng-ide
Author

The result from EvalAI is 71.53.
Should I fine-tune the model for more steps?

@zdou0830
Owner

The VQA dataset can be downloaded here: https://drive.google.com/file/d/1qT7YWHpLg-fAL43daKlOsYx2EbbQk--d/view?usp=sharing.

The training command is

```
python run.py with data_root=$DATA_DIR num_gpus=8 num_nodes=1 task_finetune_vqa_clip_bert per_gpu_batchsize=4 clip16 text_roberta image_size=576 clip_randaug load_path=meter_clip16_288_roberta_pretrain.ckpt
```

The testing command is

```
python run.py with data_root=$DATA_DIR num_gpus=8 num_nodes=1 test_only=True task_finetune_vqa_clip_bert per_gpu_batchsize=4 clip16 text_roberta image_size=576 load_path=last.ckpt
```

The provided VQA-finetuned checkpoint was trained this way, so if you follow these steps correctly, you should be able to get a score of ~77.6 on test-dev. I didn't look at the dev scores, and the number of training epochs was set to 10 as in config.py. For reference, the fine-tuning took about 2 days on 8 V100s.
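
One thing worth double-checking when comparing runs on different node counts is the effective batch size. In ViLT-style codebases (which METER builds on), the config usually specifies a target batch_size and the gradient accumulation steps are derived from per_gpu_batchsize, num_gpus, and num_nodes. The sketch below assumes that scheme and uses a hypothetical batch_size of 512, so treat the numbers as illustrative rather than METER's actual config.

```python
# Sketch of the usual ViLT-style derivation of gradient accumulation steps.
# batch_size=512 is a hypothetical target, not read from METER's config.py.
def grad_accum_steps(batch_size, per_gpu_batchsize, num_gpus, num_nodes):
    # The trainer accumulates gradients until
    # per_gpu_batchsize * num_gpus * num_nodes * steps reaches the target batch_size.
    return max(batch_size // (per_gpu_batchsize * num_gpus * num_nodes), 1)

for num_nodes in (1, 2):
    steps = grad_accum_steps(512, per_gpu_batchsize=4, num_gpus=8, num_nodes=num_nodes)
    effective = 4 * 8 * num_nodes * steps
    print(f"{num_nodes} node(s): accumulate {steps} steps -> effective batch {effective}")
```

If the effective batch size comes out identical for one and two nodes, the gap reported later in this thread is more likely a launcher or communication issue than a hyperparameter difference.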

@wanng-ide
Author

OK, I will give it a try! Thank you for your patience.

@wanng-ide
Author

I found a problem.
If I fine-tune the pretrained model with only one node, the result is better than with two nodes (by around 5% on VQA v2 val).

That might be the reason.

Could you share your pretraining log?
I will pretrain the model on two nodes, and I want to understand the difference between one node and two nodes.

@zdou0830
Owner

I didn't save the logs, but I did pre-train the models with 1/2/4 nodes and there were no significant differences, so I'd suggest you debug your multi-node training settings.
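
(A generic way to debug this, not part of METER: run a bare torch.distributed all-reduce check with the same launcher and node allocation used for training. If the world size or the reduced value comes out wrong, the problem is in the cluster setup rather than in the training code. The script name ddp_check.py and the torchrun invocation in the comment are placeholders.)

```python
import os
import socket
import torch
import torch.distributed as dist

# Generic multi-node sanity check, unrelated to METER's own training code.
# Launch with the same launcher/allocation used for training, e.g.:
#   torchrun --nnodes=2 --nproc_per_node=8 \
#            --rdzv_backend=c10d --rdzv_endpoint=$MASTER_ADDR:29500 ddp_check.py
dist.init_process_group(backend="nccl" if torch.cuda.is_available() else "gloo")
rank, world_size = dist.get_rank(), dist.get_world_size()

device = "cpu"
if torch.cuda.is_available():
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    torch.cuda.set_device(local_rank)
    device = f"cuda:{local_rank}"

# Each rank contributes 1; after all_reduce every rank should print world_size.
x = torch.ones(1, device=device)
dist.all_reduce(x, op=dist.ReduceOp.SUM)
print(f"rank {rank}/{world_size} on {socket.gethostname()}: all_reduce -> {x.item()}")

dist.destroy_process_group()
```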

@jiyt17

jiyt17 commented Jan 22, 2022

I also ran into the same problem, which probably results from the multi-node training settings. I use Slurm for multi-node training. Could you share your training bash file, if you also use Slurm?

@zdou0830
Owner

I didn't use Slurm, but I uploaded the running file for distributed training on Microsoft machines (https://github.com/zdou0830/METER/blob/main/azure_distributed_run.py). Not sure if this is helpful.
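
(For the Slurm case specifically, the usual pattern is to translate Slurm's environment variables into the MASTER_ADDR / MASTER_PORT / NODE_RANK variables that PyTorch Lightning-style launchers read before starting run.py. The sketch below is that generic plumbing, not METER's launcher; the port number and the srun invocation in the final comment are assumptions.)

```python
import os
import subprocess

# Generic Slurm-to-DDP env plumbing; this is not METER's own launcher.
# Slurm exposes the allocated nodes and this process's node index:
nodelist = os.environ["SLURM_JOB_NODELIST"]    # e.g. "node[01-02]"
node_rank = os.environ.get("SLURM_NODEID", "0")

# Expand the compressed nodelist and use the first host as the rendezvous point.
master = subprocess.check_output(
    ["scontrol", "show", "hostnames", nodelist], text=True
).splitlines()[0]

os.environ["MASTER_ADDR"] = master
os.environ.setdefault("MASTER_PORT", "29500")  # arbitrary free port (assumption)
os.environ["NODE_RANK"] = node_rank
print(f"MASTER_ADDR={master} NODE_RANK={node_rank}")

# The training command from this thread would then be launched from here,
# one task per node, e.g. via: srun --ntasks-per-node=1 python run.py with ...
```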

@jiyt17

jiyt17 commented Jan 23, 2022

OK, thank you!

@mactavish91

> I also ran into the same problem, which probably results from the multi-node training settings. I use Slurm for multi-node training. Could you share your training bash file, if you also use Slurm?

@jiyt17 Hello, have you solved the problems you encountered before?

@mactavish91

> @zdou0830 Thanks!
>
> I have another question about the code.
>
> I tried fine-tuning your pretrained model on VQA v2 with the default settings. However, the val score is only around 72.55.
>
> It should be more than 80.
>
> Could you share your experiment settings for fine-tuning VQA v2 and for the pretraining tasks?

@wanng-ide Hello, have you solved the problems you encountered before?
