Adversarial Training of LXMERT

This repository contains the PyTorch code of VILLA (NeurIPS 2020 Spotlight) that supports adversarial training (finetuning) of LXMERT on VQA, GQA, and NLVR2. Most of the code in this repo are copied/modified from LXMERT.

For details on UNITER adversarial pre-training and finetuning, please refer to the main VILLA repo.

Results

This repo can be used to reproduce the following results.

Method	VQA (test-dev)	VQA (test-std)	GQA (test-dev)	GQA (test-std)	NLVR2 (dev)	NLVR2 (test-P)
LXMERT	72.50%	72.52%	59.92%	60.28%	74.72%	74.75%
LXMERT-AdvTrain	73.02%	73.18%	60.98%	61.12%	75.98%	75.73%

Requirements

We provide Docker image for easier reproduction. Please install the following:

Our scripts require the user to have the docker group membership so that docker commands can be run without sudo. We only support Linux with NVIDIA GPUs. We test on Ubuntu 18.04 and V100 cards. We use mixed-precision training hence GPUs with Tensor Cores are recommended.

VQA

NOTE: Please follow the official LXMERT repo to download the pre-trained checkpoint and get all the VQA features and processed data ready.

Organize the downloaded VQA data with the following folder structure:

├── finetune 
├── img_db
│   ├── train2014_obj36.tsv
│   └── val2014_obj36.tsv
├── pretrained
│   └── model_LXRT.pth
└── txt_db
    ├── minival.json
    ├── nominival.json
    └── train.json

Launch the Docker container for running the experiments.
```
# docker image should be automatically pulled
source launch_container.sh $PATH_TO_STORAGE/txt_db $PATH_TO_STORAGE/img_db \
    $PATH_TO_STORAGE/finetune $PATH_TO_STORAGE/pretrained
```
The launch script respects $CUDA_VISIBLE_DEVICES environment variable. Note that the source code is mounted into the container under /src instead of built into the image so that user modification will be reflected without re-building the image. (Data folders are mounted into the container separately for flexibility on folder structures.)

Run finetuning for the VQA task.

# verify on a small training set
bash run/vqa_finetune.bash 0 vqa_lxr955_tiny --tiny
bash run/vqa_adv_finetune.bash 0 vqa_lxr955_adv_tiny --tiny

# standard finetuning
bash run/vqa_finetune.bash 0 vqa_lxr955

# adversarial finetuning
bash run/vqa_adv_finetune.bash 0 vqa_lxr955_adv

Run inference for the VQA task and then evaluate.

# local validation
bash run/vqa_test.bash 0 vqa_lxr955_results --test minival --load /storage/vqa_lxr955/BEST
bash run/vqa_test.bash 0 vqa_lxr955_adv_results --test minival --load /storage/vqa_lxr955_adv/BEST

# submission to VQA test server
# you need download the test data first
bash run/vqa_test.bash 0 vqa_lxr955_results --test test --load /storage/vqa_lxr955/BEST
bash run/vqa_test.bash 0 vqa_lxr955_adv_results --test test --load /storage/vqa_lxr955_adv/BEST

GQA

NOTE: Please follow the official LXMERT repo to download the GQA processed data and features.

Organize the downloaded GQA data with the following folder structure:

├── finetune 
├── img_db
│   ├── gqa_testdev_obj36.tsv
│   └── vg_gqa_obj36.tsv
├── pretrained
│   └── model_LXRT.pth
└── txt_db
    ├── testdev.json
    ├── train.json
    └── valid.json

Launch the Docker container for running the experiments.

# docker image should be automatically pulled
source launch_container.sh $PATH_TO_STORAGE/txt_db $PATH_TO_STORAGE/img_db \
    $PATH_TO_STORAGE/finetune $PATH_TO_STORAGE/pretrained

Run finetuning for the GQA task.

# verify on a small training set
bash run/gqa_finetune.bash 0 gqa_lxr955_tiny --tiny
bash run/gqa_adv_finetune.bash 0 gqa_lxr955_adv_tiny --tiny

# standard finetuning
bash run/gqa_finetune.bash 0 gqa_lxr955

# adversarial finetuning
bash run/gqa_adv_finetune.bash 0 gqa_lxr955_adv

Run inference for the GQA task and then evaluate.

# local validation
bash run/gqa_test.bash 0 gqa_lxr955_results --load /storage/gqa_lxr955/BEST --test testdev --batchSize 1024
bash run/gqa_test.bash 0 gqa_lxr955_adv_results --load /storage/gqa_lxr955_adv/BEST --test testdev --batchSize 1024

# submission to GQA test server
# you need download the test data first
bash run/gqa_test.bash 0 gqa_lxr955_results --load /storage/gqa_lxr955/BEST --test submit --batchSize 1024
bash run/gqa_test.bash 0 gqa_lxr955_adv_results --load /storage/gqa_lxr955_adv/BEST --test submit --batchSize 1024

NLVR2

NOTE: Please follow the official LXMERT repo to download the NLVR2 processed data and features.

Organize the downloaded NLVR2 data with the following folder structure:

├── finetune 
├── img_db
│   ├── train_obj36.tsv
│   └── valid_obj36.tsv
├── pretrained
│   └── model_LXRT.pth
└── txt_db
    ├── test.json
    ├── train.json
    └── valid.json

Launch the Docker container for running the experiments.

# docker image should be automatically pulled
source launch_container.sh $PATH_TO_STORAGE/txt_db $PATH_TO_STORAGE/img_db \
    $PATH_TO_STORAGE/finetune $PATH_TO_STORAGE/pretrained

Run finetuning for the NLVR2 task.

# verify on a small training set
bash run/nlvr2_finetune.bash 0 nlvr2_lxr955_tiny --tiny
bash run/nlvr2_adv_finetune.bash 0 nlvr2_lxr955_adv_tiny --tiny

# standard finetuning
bash run/nlvr2_finetune.bash 0 nlvr2_lxr955

# adversarial finetuning
bash run/nlvr2_adv_finetune.bash 0 nlvr2_lxr955_adv

Run inference for the VQA task and then evaluate.

# inference on public test split
bash run/nlvr2_test.bash 0 nlvr2_lxr955_results --load /storage/nlvr2_lxr955/BEST --test test --batchSize 1024
bash run/nlvr2_test.bash 0 nlvr2_lxr955_adv_results --load /storage/nlvr2_lxr955_adv/BEST --test test --batchSize 1024

Citation

If you find this code useful for your research, please consider citing:

@inproceedings{gan2020large,
  title={Large-Scale Adversarial Training for Vision-and-Language Representation Learning},
  author={Gan, Zhe and Chen, Yen-Chun and Li, Linjie and Zhu, Chen and Cheng, Yu and Liu, Jingjing},
  booktitle={NeurIPS},
  year={2020}
}

@inproceedings{tan2019lxmert,
  title={LXMERT: Learning Cross-Modality Encoder Representations from Transformers},
  author={Tan, Hao and Bansal, Mohit},
  booktitle={EMNLP},
  year={2019}
}

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
data		data
run		run
src		src
.gitignore		.gitignore
.gitmodules		.gitmodules
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
launch_container.sh		launch_container.sh
requirements.txt		requirements.txt
villa_framework.png		villa_framework.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

data

data

run

run

src

src

.gitignore

.gitignore

.gitmodules

.gitmodules

Dockerfile

Dockerfile

LICENSE

LICENSE

README.md

README.md

launch_container.sh

launch_container.sh

requirements.txt

requirements.txt

villa_framework.png

villa_framework.png

Repository files navigation

Adversarial Training of LXMERT

Results

Requirements

VQA

GQA

NLVR2

Citation

License

About

Releases

Packages

Languages

License

zhegan27/LXMERT-AdvTrain

Folders and files

Latest commit

History

Repository files navigation

Adversarial Training of LXMERT

Results

Requirements

VQA

GQA

NLVR2

Citation

License

About

Topics

Resources

License

Stars

Watchers

Forks

Languages