Skip to content
master
Switch branches/tags
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Quantifying and Alleviating the Language Prior Problem in Visual Question Answering.

This the official implementation for our SIGIR 19 paper. We tested our idea on three different works. Since the code is easy to implement and expand, we only provide the score regularization based on the Strong baseline paper. This repository is built upon the code provided by @Yan Zhang. Thanks for him generously sharing the code.

Get repo:

git clone https://github.com/guoyang9/vqa-prior.git --recursive

Prerequisites

* python==3.6.8
* numpy==1.16.2
* pytorch==1.0.1
* torchvision==0.2.2
* nltk==3.4
* bcolz==1.2.1
* tqdm==4.31.1

Dataset

First of all, make all the data in the right position according to the config.py.

  • The VQA dataset can be downloaded at the official website. This repository only implemented the model on the VQA 1.0 dataset.
  • The pre-trained Glove features can be found on glove website.

Preprocessing

  1. Preprocess grid-based image features: preprocess the image feature, including extracting pre-trained image faetures.
    python preprocess/preprocess-images.py
    
  2. Preprocess the vocabulary: filtering top 3000 answers.
    python preprocess/preprocess-vocab.py
    
  3. Preprocess question type: counting answers under each question type.
    python preprocess/preprocess-qt.py
    

Model Training

python main.py --name=vqa-prior --gpu=0

Model Test only

python main.py --test --name=vqa-prior --gpu=0

Citation

If you plan to use this code as part of your published research, we'd appreciate it if you could cite our paper:

@Inproceedings{prior,
  author    = {Yangyang Guo, Zhiyong Cheng, Liqiang Nie, Yibing Liu, Yinglong Wang and Mohan Kankanhalli},
  title     = {Quantifying and Alleviating the Language Prior Problem in Visual Question Answering},
  booktitle = {SIGIR},
  year      = {2019},
}

About

Implementation for our SIGIR 2019 paper --- Quantifying and Alleviating the Language Prior Problem in Visual Question Answering.

Topics

Resources

License

Releases

No releases published

Packages

No packages published

Languages