Grounding Referring Expressions in Images by Variational Context

This repository contains the code for the following paper:

  • Hanwang Zhang, Yulei Niu, Shih-Fu Chang, Grounding Referring Expressions in Images by Variational Context. In CVPR, 2018. (PDF)
@inproceedings{zhang2018grounding,
  title={Grounding Referring Expressions in Images by Variational Context},
  author={Zhang, Hanwang and Niu, Yulei and Chang, Shih-Fu},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2018}
}

Note: part of this repository is built upon cmn, speaker_listener_reinforcer and refer.

Requirements and Dependencies

# Make sure to clone with --recursive
git clone --recursive https://github.com/yuleiniu/vc.git

The --recursive flag also clones the refer and cmn submodule repositories.
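
If the repository was already cloned without --recursive, the submodules can still be fetched afterwards with a standard git command:

  git submodule update --init --recursive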

  • Install the remaining dependencies by simply running:
  pip install -r requirements.txt
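
Since the refer preprocessing below relies on tools implemented in Python 2 (see the Preprocessing section), it is worth checking the interpreter before installing:

  python --version   # the refer/prepro steps expect a 2.7.x interpreter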

Preprocessing

  • Download the weights of the Faster R-CNN VGG-16 network, converted from the original Caffe model:
  ./data/models/download_vgg_params.sh
  • Download the GloVe matrix for word embedding:
  ./data/word_embedding/download_embed_matrix.sh
  • Rebuild the NMS library and the ROIPooling operation following cmn. Simply run:
  ./submodule/cmn.sh
  • Preprocess the referring-expression data following speaker_listener_reinforcer and refer (both implemented in Python 2), and save the results into data/raw. Simply run:
  ./submodule/refer.sh
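
For convenience, the four preprocessing steps above can also be chained in order (these are the same commands, nothing new):

  ./data/models/download_vgg_params.sh
  ./data/word_embedding/download_embed_matrix.sh
  ./submodule/cmn.sh
  ./submodule/refer.sh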

Extract features

  • To extract region features for RefCOCO/RefCOCO+/RefCOCOg, run (a one-shot loop over all three datasets is sketched after the commands):
  python prepare_data.py --dataset refcoco  #(for RefCOCO)
  python prepare_data.py --dataset refcoco+ #(for RefCOCO+)
  python prepare_data.py --dataset refcocog #(for RefCOCOg)
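
Equivalently, a plain shell loop runs the extraction for all three datasets in one pass:

  for ds in refcoco refcoco+ refcocog; do
    python prepare_data.py --dataset $ds
  done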

Train

  • To train the model under the supervised setting, run:
  python train.py --dataset refcoco  #(for RefCOCO)
  python train.py --dataset refcoco+ #(for RefCOCO+)
  python train.py --dataset refcocog #(for RefCOCOg)
  • To train the model under the unsupervised setting, run:
  python train.py --dataset refcoco  --supervised False --max_iter 80000 --lr_decay_step 20000 --snapshot_start 20000 #(for RefCOCO)
  python train.py --dataset refcoco+ --supervised False --max_iter 80000 --lr_decay_step 20000 --snapshot_start 20000 #(for RefCOCO+)
  python train.py --dataset refcocog --supervised False --max_iter 80000 --lr_decay_step 20000 --snapshot_start 20000 #(for RefCOCOg)

Evaluation

  • To test the model, run:
  python test.py --dataset refcoco  --checkpoint /path/to/checkpoint #(for RefCOCO)
  python test.py --dataset refcoco+ --checkpoint /path/to/checkpoint #(for RefCOCO+)
  python test.py --dataset refcocog --checkpoint /path/to/checkpoint #(for RefCOCOg)