simple-vqa

Learns an MLP for VQA

This code implements the VQA MLP baseline from Revisiting Visual Question Answering Baselines.

Some numbers on VQA

Features/Methods	VQA Val Accuracy	VQA Test-dev Accuracy
MCBP	-	66.4
Baseline - MLP	-	64.9
Imagenet - MLP	63.62	65.9

This README is a work in progress.

Installation

The MLP is implemented in Torch, and depends on the following packages: torch/nn, torch/nngraph, torch/cutorch, torch/cunn, torch/image, torch/tds, lua-cjson, nninit, torch-word-emb, torch-hdf5, torchx

After installing torch, you can install / update these dependencies by running the following:

luarocks install nn
luarocks install nngraph
luarocks install image
luarocks install tds

luarocks install cutorch
luarocks install cunn

luarocks install lua-cjson
luarocks install nninit
luarocks install torch-word-emb
luarocks install torchx

Install torch-hdf5 by following instructions here

Running trained models

Download this repo

git clone --recursive https://github.com/arunmallya/simple-vqa.git

Data Dependencies

Create a data/ folder and symlink or place the following datasets: vqa -> VQA dataset root, coco -> COCO dataset root (coco is needed only if you plan to extract and use your own features, not required if using cached features below).
Download the Word2Vec model file from here. This is needed to encode sentences into vectors. Place the .bin file in the data/models folder.
Download cached resnet-152 imagenet features for the VQA dataset splits and place them in data/feats: features
Download VQA lite annotations and place them in data/vqa/Annotations/. These are required because the original VQA annotations do not fit in the 2GB memory limit of LuaJIT.
Download MLP models trained on the VQA train set and place them in checkpoint/: models
At this point, your data folder should have models/, feats/, coco/ and vqa/ folders.
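
The layout described above can be set up and checked with a short shell sketch. This is only an illustration, not part of the repository: the function names setup_data_layout and check_data_layout are hypothetical, and the dataset roots passed to them are placeholders for wherever your VQA and COCO copies live. The downloads themselves (Word2Vec, features, annotations, models) still have to be placed by hand.

```shell
#!/bin/sh
# Sketch only: these helpers are hypothetical and not shipped with simple-vqa.

# Create data/models, data/feats and checkpoint/ under the repo root and
# symlink the dataset roots into data/. Pass absolute paths for the dataset
# roots, because a relative symlink target is resolved from data/ itself.
setup_data_layout() {
    root="$1"        # repository root (will contain data/ and checkpoint/)
    vqa_root="$2"    # existing VQA dataset root (placeholder path)
    coco_root="$3"   # existing COCO dataset root; may be empty if unused

    mkdir -p "$root/data/models" "$root/data/feats" "$root/checkpoint"

    # Only link when nothing is there yet, so existing folders are untouched.
    if [ ! -e "$root/data/vqa" ] && [ ! -L "$root/data/vqa" ]; then
        ln -s "$vqa_root" "$root/data/vqa"
    fi
    if [ -n "$coco_root" ] && [ ! -e "$root/data/coco" ] && [ ! -L "$root/data/coco" ]; then
        ln -s "$coco_root" "$root/data/coco"
    fi
}

# Report which of the four expected data/ folders are missing.
# Returns 0 when models/, feats/, coco/ and vqa/ all exist, 1 otherwise.
check_data_layout() {
    root="$1"
    missing=0
    for d in models feats coco vqa; do
        if [ ! -e "$root/data/$d" ]; then
            echo "missing: data/$d"
            missing=1
        fi
    done
    return "$missing"
}
```

From the repository root, something like `setup_data_layout . /path/to/VQA /path/to/COCO` followed by `check_data_layout .` would then confirm the folders are in place. If you rely only on the cached features, data/coco is expected to be reported as missing.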

Run Eval

For example, to evaluate the model trained on the VQA train set with ImageNet features on the VQA val set:

th eval.lua -eval_split val \
-eval_checkpoint_path checkpoint/MLP-imagenet-train.t7

In general, the command is:

th eval.lua -eval_split (train/val/test-dev/test-final) \
-eval_checkpoint_path <model-path>

This writes the results to checkpoint/ as a .json file. For the test-dev and test-final splits, it also writes a results.zip file, which can be uploaded to CodaLab for evaluation.
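
To evaluate one checkpoint on several splits in a row, the general command can be wrapped in a small loop. The sketch below is an assumption-laden convenience, not something the repository provides: run_eval_splits and the DRY_RUN variable are hypothetical names, and only the -eval_split and -eval_checkpoint_path flags come from the commands shown above.

```shell
#!/bin/sh
# Sketch only: run_eval_splits and DRY_RUN are hypothetical, not part of simple-vqa.

# Usage: run_eval_splits <checkpoint-path> <split> [<split> ...]
# Runs eval.lua once per split, stopping at the first failure.
# With DRY_RUN=1 the commands are printed instead of executed.
run_eval_splits() {
    checkpoint="$1"
    shift
    for split in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo th eval.lua -eval_split "$split" -eval_checkpoint_path "$checkpoint"
        else
            th eval.lua -eval_split "$split" -eval_checkpoint_path "$checkpoint" || return 1
        fi
    done
}
```

For instance, `run_eval_splits checkpoint/MLP-imagenet-train.t7 val test-dev` would evaluate on val and then on test-dev. Setting DRY_RUN=1 first is a cheap way to review the commands before committing GPU time.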

Training MLP from scratch

th train.lua -im_feat_types imagenet -im_feat_dims 2048
