GQA

| Backbone | Test-dev | Test-std | URL | Size |
|---|---|---|---|---|
| ResNet-101 | 62.48 | 61.99 | model | 3GB |
| EfficientNet-B5 | 62.95 | 62.45 | model | 2.7GB |

Data preparation

The config for this dataset can be found in configs/gqa.json and is also shown below:

{
  "combine_datasets": ["gqa"],
  "combine_datasets_val": ["gqa"],
  "vg_img_path": "",
  "gqa_ann_path": "mdetr_annotations/",
  "gqa_split_type": "balanced"
}
  • Download the images from GQA images and set vg_img_path to the folder containing them.
  • Download our pre-processed annotations, which are converted to COCO format (the same zip contains the MDETR annotations for all datasets), from Pre-processed annotations, and set gqa_ann_path to the folder containing them.
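
Before launching a run, it can help to confirm that both paths have actually been filled in. The following is a minimal sketch and not part of the repository: check_gqa_config is a hypothetical helper, and it assumes only the keys shown in the config above.

```python
import os


def check_gqa_config(cfg):
    """Return a list of human-readable problems found in a GQA config dict.

    Hypothetical helper for illustration; it only knows the keys shown
    in configs/gqa.json above.
    """
    problems = []
    # Both paths must be set and must point to existing directories.
    for key in ("vg_img_path", "gqa_ann_path"):
        path = cfg.get(key, "")
        if not path:
            problems.append(f"{key} is empty")
        elif not os.path.isdir(path):
            problems.append(f"{key} does not point to a directory: {path}")
    # GQA has two split types: "all" and "balanced".
    if cfg.get("gqa_split_type") not in ("all", "balanced"):
        problems.append('gqa_split_type must be "all" or "balanced"')
    return problems
```

Passing it the dict loaded from configs/gqa.json returns an empty list when the config looks usable. With the default config shown above, it would report that vg_img_path is empty.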

Script to reproduce results

Model weights (these can also be loaded directly from their URLs):

  1. gqa_resnet101_checkpoint.pth
  2. gqa_EB5_checkpoint.pth
  3. pretrained_resnet101_checkpoint.pth

GQA has two split types, "all" and "balanced". Select one by setting gqa_split_type in configs/gqa.json.
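
If you switch between the two splits often, the edit can be scripted. This is a rough sketch under the assumption that the config is a flat JSON file like the one shown earlier; set_gqa_split_type is a hypothetical helper, not something the repository provides.

```python
import json


def set_gqa_split_type(config_path, split_type):
    """Rewrite gqa_split_type in a JSON config file and return the new config.

    Hypothetical helper for illustration only.
    """
    if split_type not in ("all", "balanced"):
        raise ValueError(f"unknown split type: {split_type!r}")
    with open(config_path) as f:
        cfg = json.load(f)
    cfg["gqa_split_type"] = split_type
    # Write the whole config back, keeping the other keys untouched.
    with open(config_path, "w") as f:
        json.dump(cfg, f, indent=2)
        f.write("\n")
    return cfg
```

For example, set_gqa_split_type("configs/gqa.json", "all") would prepare the config for the finetuning run on the "all" split.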

To run evaluation on testdev balanced:

python run_with_submitit.py --dataset_config configs/gqa.json --ngpus 1 --nodes 2 --ema --eval --do_qa --split_qa_heads --no_contrastive_align_loss --resume https://zenodo.org/record/4721981/files/gqa_resnet101_checkpoint.pth

To run on a single node with 2 GPUs:

python -m torch.distributed.launch --nproc_per_node=2 --use_env main.py --dataset_config configs/gqa.json --ema --eval --do_qa --split_qa_heads --no_contrastive_align_loss --resume https://zenodo.org/record/4721981/files/gqa_resnet101_checkpoint.pth

To run finetuning on the "all" split (this was run on 4 nodes of 8 GPUs each, effective batch size 128):

  1. Change the configs/gqa.json to have gqa_split_type as "all"
python run_with_submitit.py --dataset_config configs/gqa.json --ngpus 8 --ema --epochs 125 --epoch_chunks 25 --do_qa --split_qa_heads --lr_drop 150 --load https://zenodo.org/record/4721981/files/pretrained_resnet101_checkpoint.pth --nodes 4 --batch_size 4 --no_aux_loss --qa_loss_coef 25 --lr 1.4e-4 --lr_backbone 1.4e-5 --text_encoder_lr 7e-5
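
The effective batch size of 128 follows from the flags in the command above (--nodes 4, --ngpus 8, --batch_size 4), assuming --batch_size is the per-GPU batch size, which is consistent with the stated total. A small sketch of the arithmetic, useful when adapting the run to a different number of GPUs (the function name is hypothetical):

```python
def effective_batch_size(nodes, gpus_per_node, batch_size_per_gpu):
    """Total number of samples per optimization step across all GPUs.

    Assumes --batch_size is applied per GPU (an assumption, but one that
    reproduces the effective batch size of 128 stated in the text).
    """
    return nodes * gpus_per_node * batch_size_per_gpu


# Flags from the finetuning command: --nodes 4 --ngpus 8 --batch_size 4
print(effective_batch_size(nodes=4, gpus_per_node=8, batch_size_per_gpu=4))  # prints 128
```

Under the same assumption, a single node with 8 GPUs would need a per-GPU batch size of 16 to match this total.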

To run on a single node with 8 GPUs:

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --dataset_config configs/gqa.json --ema --epochs 125 --epoch_chunks 25 --lr_drop 150 --do_qa --split_qa_heads --load https://zenodo.org/record/4721981/files/pretrained_resnet101_checkpoint.pth --no_aux_loss --qa_loss_coef 25

To dump predictions that can be submitted to the EvalAI server, use this instead:

  1. Change the configs/gqa.json to have gqa_split_type as "all"
  2. --split can be testdev or submission to generate the prediction file that is uploaded to the GQA EvalAI server.
python run_with_submitit_gqa_eval.py --do_qa --eval --resume "https://zenodo.org/record/4721981/files/gqa_resnet101_checkpoint.pth?download=1" --split_qa_heads --ngpus 1 --nodes 4 --ema --split testdev --dataset_config configs/gqa.json
  3. The resulting predictions will be saved in the experiment's output directory as testdev_predictions.json or submission_predictions.json accordingly.
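
The naming rule for the prediction file can be expressed as a small sketch, for instance to locate the file before uploading it. The helper name and the output_dir argument are hypothetical; only the two file names come from the text above.

```python
import os


def predictions_path(output_dir, split):
    """Return where the prediction file for a given --split is expected.

    Hypothetical helper: output_dir stands for the experiment's output
    directory, whose actual location depends on how the job was launched.
    """
    if split not in ("testdev", "submission"):
        raise ValueError(f"--split must be testdev or submission, got {split!r}")
    return os.path.join(output_dir, f"{split}_predictions.json")
```

Calling predictions_path(output_dir, "submission") gives the path of submission_predictions.json inside that directory.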

You can also run this on a single node with 4 GPUs:

python -m torch.distributed.launch --nproc_per_node=4 --use_env scripts/eval_gqa.py --do_qa --eval --resume "https://zenodo.org/record/4721981/files/gqa_resnet101_checkpoint.pth?download=1" --split_qa_heads --ema --split testdev --dataset_config configs/gqa.json