This is a re-implementation of *Learning to Count Objects in Natural Images for Visual Question Answering*. The code is based on the Counting component for VQA.
To train the two models, run both of the following commands:

```
python vqa-counter.py
python vqa-baseline.py
```
Both models are trained on both the easy and the hard task.
Model weights and test accuracies are logged to `resultacc.txt`, `resultdata.txt`, and `resultacc-baseline.txt` respectively.
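The result files are plain text; the following is a minimal sketch of parsing such a log, assuming one accuracy value per line (the exact format written by the training scripts may differ):

```python
def read_accuracies(path):
    """Parse a result log, assuming one floating-point accuracy per line."""
    with open(path) as f:
        return [float(line) for line in f if line.strip()]

# hypothetical example: write a small log and read it back
with open("resultacc-demo.txt", "w") as f:
    f.write("0.52\n0.61\n0.68\n")

print(read_accuracies("resultacc-demo.txt"))
```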
Run the following command to plot the results:

```
python plot.py
```
Alternatively, since pretrained model weights and evaluation accuracies are already stored in the `.txt` files, you can run `plot.py` directly without training.
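As an illustration of the kind of accuracy curve `plot.py` produces, here is a hedged matplotlib sketch with made-up accuracy values (the real script reads its values from the `.txt` files):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so no display is needed
import matplotlib.pyplot as plt

# hypothetical values standing in for those read from resultacc.txt
accuracies = [0.52, 0.61, 0.68, 0.71]

plt.plot(range(1, len(accuracies) + 1), accuracies, marker="o")
plt.xlabel("epoch")
plt.ylabel("test accuracy")
plt.title("Counter model accuracy (illustrative values)")
plt.savefig("accuracy-demo.png")
```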
To check the difference in weight dimensions, run:

```
python plot_allacc.py
```
Note that if a required file does not exist, re-run `vqa-counter.py` with the parameter for the missing file.
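A small sketch of the kind of pre-flight check you can do before plotting, assuming the result filenames listed above (the actual parameters accepted by `vqa-counter.py` are not shown in this README):

```python
import os

# filenames taken from this README
RESULT_FILES = ["resultacc.txt", "resultdata.txt", "resultacc-baseline.txt"]

def missing_results(files=RESULT_FILES):
    """Return the result files that have not been generated yet."""
    return [f for f in files if not os.path.exists(f)]

missing = missing_results()
if missing:
    print("Missing result files:", missing)
    # regenerate them by re-running vqa-counter.py with the
    # corresponding parameter (parameter names are assumed; see README)
```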
This code was confirmed to run with the following environment:
- Python 3.6.3
- torch 1.0.1
- torchvision 0.2.1
- torchbearer 0.3.0
- numpy 1.14.5
- CUDA 10.0