Description

This is the official repo for the ICCV 2023 paper Toward Unsupervised Realistic Visual Question Answering, which encourages the model to answer answerable questions and reject unanswerable ones (see figure below). See the paper for details.

Clone this repo

git clone https://github.com/chihhuiho/RGQA.git

Create environment

You may need to install a different PyTorch and torchvision version depending on your device.

conda env create -f environment.yml
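
The environment name is set inside environment.yml, and this README does not state it. A minimal sketch for reading it, assuming the file has a top-level `name:` key as conda environment files usually do (`env_name` is a hypothetical helper, not part of this repo):

```shell
# Hypothetical helper: print the value of the first top-level "name:" key
# in a conda environment file (defaults to ./environment.yml).
# Assumes the value is unquoted and has no trailing whitespace.
env_name() {
    sed -n 's/^name:[[:space:]]*//p' "${1:-environment.yml}" | head -n 1
}
```

With that in place, `conda activate "$(env_name environment.yml)"` would activate the environment created above.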

Data Preparation

Download the BUTD-related data, GQA (link) and the proposed RGQA dataset (link).

cd data
sh download_rgqa.sh

Follow the data preparation steps from the LXMERT repo and download the rest of the features. The data folder should look like this:

|-- data
    |-- butd  
    |-- download_rgqa.sh  
    |-- gqa  
    |-- lxmert  
    |-- mscoco_imgfeat  
    |-- nlvr2  
    |-- nlvr2_imgfeat  
    |-- vg_gqa_imgfeat  
    |-- vqa
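
Before training, it can help to confirm that every entry in the layout above is present. A minimal sketch, using only the names listed above (`check_data_layout` is a hypothetical helper, not part of this repo, and it checks only that each entry exists, not its contents):

```shell
# Hypothetical helper: report any entry from the layout above that is
# missing under the given data root (defaults to ./data).
# Returns 0 if everything is present, 1 otherwise.
check_data_layout() {
    root="${1:-data}"
    missing=0
    for entry in butd download_rgqa.sh gqa lxmert mscoco_imgfeat \
                 nlvr2 nlvr2_imgfeat vg_gqa_imgfeat vqa; do
        if [ ! -e "$root/$entry" ]; then
            echo "missing: $root/$entry"
            missing=1
        fi
    done
    return "$missing"
}
```

Run from the repo root, `check_data_layout data && echo "data layout looks complete"` prints the confirmation only when nothing is missing.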

Compute Metric

To encourage future research on RVQA, we provide a script that evaluates models on the proposed dataset using the proposed metrics (e.g. AUAF, FF95 and FACC). As long as the model predictions are organized like the provided example, the script can be used to compute model performance. The steps are below.

Enter the compute_accfpr folder

cd compute_accfpr

The RGQA dataset follows the format of example.json, and the model predictions should follow the format of example_predict.json.

Compute the metric

python compute_accfpr.py

Training and evaluating different backbones

Below are the scripts for 3 backbones: lxmert, butd and uniter. Simply replace "BACKBONE" with the desired backbone in the following commands.

Training

Finetune vanilla GQA with BACKBONE

sh scripts/BACKBONE/train/vanilla.sh 0

Finetune BACKBONE with random pairing Pseudo UQ (RP) on GQA

sh scripts/BACKBONE/train/rp.sh 0

Finetune BACKBONE with hard Pseudo UQ (RP) on GQA

sh scripts/BACKBONE/train/rp_with_hard_uq.sh 0

Finetune BACKBONE with mixup RoI on GQA

sh scripts/BACKBONE/train/mix.sh 0
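
The BACKBONE substitution in the commands above can be sketched as a small loop. This is a dry run that only prints the command for each backbone; `train_cmd` is a hypothetical helper, and the script name and GPU id 0 are taken from the vanilla finetuning command above:

```shell
# Hypothetical helper: print the vanilla finetuning command for one backbone.
# $1 is the backbone (lxmert, butd or uniter), $2 the GPU id (default 0).
train_cmd() {
    backbone="$1"
    gpu="${2:-0}"
    echo "sh scripts/$backbone/train/vanilla.sh $gpu"
}

# Dry run: show the command for each of the three backbones.
for b in lxmert butd uniter; do
    train_cmd "$b" 0
done
```

Swapping `vanilla.sh` for `rp.sh`, `rp_with_hard_uq.sh` or `mix.sh` gives the other training variants listed above.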

Download pretrained weight

The pretrained weights for the different RVQA approaches can be downloaded with the following commands (about 8GB).

cd snap/gqa
sh download_rgqa_ckpt.sh
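
Since the download is about 8GB, it may be worth checking free disk space first. A minimal sketch, assuming a POSIX `df` (`has_free_space` is a hypothetical helper, not part of this repo):

```shell
# Hypothetical helper: succeed if the filesystem holding directory $2
# (default: current directory) has at least $1 GB available.
has_free_space() {
    need_gb="$1"
    dir="${2:-.}"
    # df -P prints one header line and one data line; column 4 is the
    # available space in 1024-byte blocks when -k is given.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $(( need_gb * 1024 * 1024 )) ]
}
```

For example, `has_free_space 8 snap/gqa || echo "not enough space for the checkpoints"` warns before the download starts.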

Testing with different RVQA approaches

BACKBONE with FRCNN

sh scripts/BACKBONE/test/frcnn.sh $GPUID
sh scripts/BACKBONE/test/frcnn.sh 0

The result is located in snap/gqa/BACKBONE/test/frcnn and it should be identical to snap/gqa/pretrain/BACKBONE/frcnn

BACKBONE with MSP

sh scripts/BACKBONE/test/msp.sh $GPUID
sh scripts/BACKBONE/test/msp.sh 0

The result is located in snap/gqa/BACKBONE/test/msp and it should be identical to snap/gqa/pretrain/BACKBONE/msp

BACKBONE with ODIN

sh scripts/BACKBONE/test/odin.sh $GPUID
sh scripts/BACKBONE/test/odin.sh 0

The result is located in snap/gqa/BACKBONE/test/odin and it should be identical to snap/gqa/pretrain/BACKBONE/odin

BACKBONE with Maha

sh scripts/BACKBONE/test/maha.sh $GPUID
sh scripts/BACKBONE/test/maha.sh 0

The result is located in snap/gqa/BACKBONE/test/maha and it should be identical to snap/gqa/pretrain/BACKBONE/maha

BACKBONE with Energy

sh scripts/BACKBONE/test/energy.sh $GPUID
sh scripts/BACKBONE/test/energy.sh 0

The result is located in snap/gqa/BACKBONE/test/energy and it should be identical to snap/gqa/pretrain/BACKBONE/energy

BACKBONE with Q-C

sh scripts/BACKBONE/test/qc.sh $GPUID
sh scripts/BACKBONE/test/qc.sh 0

The result is located in snap/gqa/BACKBONE/test/qc and it should be identical to snap/gqa/pretrain/BACKBONE/qc

BACKBONE with resample

sh scripts/BACKBONE/test/resample.sh $GPUID
sh scripts/BACKBONE/test/resample.sh 0

The result is located in snap/gqa/BACKBONE/test/resampling and it should be identical to snap/gqa/pretrain/BACKBONE/resampling

BACKBONE with RP with only hardUQ

sh scripts/BACKBONE/test/rp_with_harduq.sh $GPUID
sh scripts/BACKBONE/test/rp_with_harduq.sh 0

The result is located in snap/gqa/BACKBONE/test/RP_with_hard_uq and it should be identical to snap/gqa/pretrain/BACKBONE/RP_with_hard_uq

BACKBONE with RP

sh scripts/BACKBONE/test/rp.sh $GPUID
sh scripts/BACKBONE/test/rp.sh 0

The result is located in snap/gqa/BACKBONE/test/RP and it should be identical to snap/gqa/pretrain/BACKBONE/RP

BACKBONE with Mixup

sh scripts/BACKBONE/test/mixup.sh $GPUID
sh scripts/BACKBONE/test/mixup.sh 0

The result is located in snap/gqa/BACKBONE/test/mixup and it should be identical to snap/gqa/pretrain/BACKBONE/mixup

BACKBONE with Ensemble

sh scripts/BACKBONE/test/ensemble.sh $GPUID
sh scripts/BACKBONE/test/ensemble.sh 0

The result is located in snap/gqa/BACKBONE/test/ensemble and it should be identical to snap/gqa/pretrain/BACKBONE/ensemble

Test all RVQA approaches with BACKBONE

sh scripts/BACKBONE/test/test_all.sh $GPUID
sh scripts/BACKBONE/test/test_all.sh 0
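
Each test step above says the result should be identical to the reference under snap/gqa/pretrain. A minimal sketch for checking this with `diff -r`, assuming a byte-for-byte comparison is meaningful for these outputs (this README does not say which files they contain; `compare_result` is a hypothetical helper, not part of this repo):

```shell
# Hypothetical helper: compare a test result with its pretrained reference.
# $1 backbone, $2 result folder name (e.g. msp), $3 snapshot root (default snap/gqa).
compare_result() {
    backbone="$1"
    method="$2"
    root="${3:-snap/gqa}"
    out="$root/$backbone/test/$method"
    ref="$root/pretrain/$backbone/$method"
    # diff -r exits 0 only when both paths exist and their contents match.
    if diff -r "$out" "$ref" >/dev/null 2>&1; then
        echo "match: $method"
    else
        echo "differs or missing: $method"
        return 1
    fi
}

# Example use, run from the repo root after test_all.sh:
#   for m in frcnn msp odin maha energy qc resampling RP_with_hard_uq RP mixup ensemble; do
#       compare_result lxmert "$m"
#   done
```

If the outputs contain run-specific content such as timestamps, a byte-level diff will report a difference even when the metrics agree, so treat a mismatch as a prompt to inspect the files rather than as a failure.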

Acknowledgement

The repo uses the code and checkpoint from
