Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering
Instructions for training the "SMem-VQA Two-Hop" model:
- Download the provided caffe folder and install caffe following the instructions in http://caffe.berkeleyvision.org/installation.html.
- Run ./example/data/get_image.sh to download MSCOCO image data.
- Run ./train/train_mm.sh to train the model.
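
The download and training steps above can be chained in a small wrapper, sketched here under a few assumptions: it is run from the repository root, Caffe is already installed, and the two scripts are either executable or bash-compatible. The names `run_step` and `train_smem_vqa` are hypothetical helpers introduced for illustration and are not part of the repository.

```shell
#!/bin/sh
# Sketch only: run_step and train_smem_vqa are hypothetical helpers,
# not part of the repository. Script paths are the ones named in the steps above.

# Run one script, failing early with a message if it is missing.
run_step() {
    script="$1"
    if [ ! -f "$script" ]; then
        echo "missing: $script (are you in the repository root?)" >&2
        return 1
    fi
    echo "running: $script"
    if [ -x "$script" ]; then
        "$script"           # executable: honor its own shebang
    else
        bash "$script"      # assumption: the script is bash-compatible
    fi
}

# Download the MSCOCO images, then train; stop if the download fails.
train_smem_vqa() {
    run_step ./example/data/get_image.sh && run_step ./train/train_mm.sh
}
```

After sourcing this file from the repository root, calling `train_smem_vqa` runs both steps in order and stops if the image download does not succeed.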
@article{xu2015ask,
  title={Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering},
  author={Xu, Huijuan and Saenko, Kate},
  journal={arXiv preprint arXiv:1511.05234},
  year={2015}
}