Attribute Phrases

This repository contains the dataset and the TensorFlow training code used in the paper:

Jong-Chyi Su*, Chenyun Wu*, Huaizu Jiang, Subhransu Maji, "Reasoning about Fine-grained Attribute Phrases using Reference Games", International Conference on Computer Vision (ICCV), 2017

@inproceedings{su2017reasoning,
    Author = {Jong-Chyi Su and Chenyun Wu and Huaizu Jiang and Subhransu Maji},
    Title = {Reasoning about Fine-grained Attribute Phrases using Reference Games},
    Booktitle = {International Conference on Computer Vision (ICCV)},
    Year = {2017}
}

[Project page] [Paper]

Dataset

Each annotation consists of one pair of images and five pairs of corresponding attribute phrases, for example:

Image 1		Image 2

commercial plane	vs	private plane
large plane	vs	small plane
white and grey	vs	white with blue and red stripes
twin engines	vs	single engine
more windows on body	vs	less windows on body
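
One way to picture a single annotation is as a small record holding two image identifiers and five (phrase for image 1, phrase for image 2) tuples. The sketch below is illustrative only: the `Annotation` type, its field names, the `phrases_for` helper, and the image file names are hypothetical and are not the schema of the JSON files shipped with the dataset.

```python
from collections import namedtuple

# Hypothetical in-memory record; NOT the schema of dataset/visdiff_SET.json.
Annotation = namedtuple("Annotation", ["image1", "image2", "phrase_pairs"])

example = Annotation(
    image1="image_1.jpg",  # placeholder file name
    image2="image_2.jpg",  # placeholder file name
    phrase_pairs=[
        ("commercial plane", "private plane"),
        ("large plane", "small plane"),
        ("white and grey", "white with blue and red stripes"),
        ("twin engines", "single engine"),
        ("more windows on body", "less windows on body"),
    ],
)


def phrases_for(annotation, side):
    """Return the phrases describing image 1 (side=0) or image 2 (side=1)."""
    return [pair[side] for pair in annotation.phrase_pairs]
```

Each phrase only makes sense relative to its counterpart ("large" vs "small"), which is why the phrases are kept as pairs rather than as two independent lists.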

Stats about the dataset

Training set: 4700 pairs
Val set: 2350 pairs
Test set: 2350 pairs
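
As a quick sanity check on these numbers, the sketch below derives the split proportions and the number of phrase pairs per split, using the five phrase pairs per image pair stated earlier. The helper name `split_summary` is made up for this example, and treating trainval as the union of train and val is an assumption based on its name.

```python
# Number of image pairs per split, as listed in the README.
SPLITS = {"train": 4700, "val": 2350, "test": 2350}
PHRASE_PAIRS_PER_IMAGE_PAIR = 5


def split_summary(splits, per_pair):
    """Map split name -> (fraction of all image pairs, number of phrase pairs)."""
    total = sum(splits.values())
    return {name: (n / float(total), n * per_pair) for name, n in splits.items()}


summary = split_summary(SPLITS, PHRASE_PAIRS_PER_IMAGE_PAIR)

# Assumption: trainval is simply train + val.
trainval_pairs = SPLITS["train"] + SPLITS["val"]
```

Under these assumptions the split is 50% / 25% / 25%, with 23500 phrase pairs for training and 11750 each for validation and testing.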

Requirements

Python 2.7
TensorFlow v1.0+

Download Dataset

User descriptions are included in dataset/visdiff_SET.json, where SET is one of train, val, test, or trainval.
Download the images from the OID dataset (http://www.robots.ox.ac.uk/~vgg/data/oid).
Move the images from oid-aircraft-beta-1/data/images/aeroplane/*.jpg to the folder dataset/images/.
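
The move step can also be scripted. The following is a minimal sketch, assuming the OID archive has been unpacked in the current directory; `move_images` is a helper written for this example and is not part of the repository.

```python
import glob
import os
import shutil


def move_images(src_dir, dst_dir):
    """Move every .jpg file from src_dir into dst_dir; return the number moved."""
    if not os.path.isdir(dst_dir):
        os.makedirs(dst_dir)
    moved = 0
    for path in sorted(glob.glob(os.path.join(src_dir, "*.jpg"))):
        shutil.move(path, os.path.join(dst_dir, os.path.basename(path)))
        moved += 1
    return moved


# Example call with the paths from the README:
# move_images("oid-aircraft-beta-1/data/images/aeroplane", "dataset/images")
```

Only files ending in .jpg are touched; anything else in the source folder is left in place.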

Download ImageNet Pre-trained Model

Place the pre-trained model (e.g. vgg_16.ckpt) in models/checkpoints/.

Extract image feature to numpy file to accelerate training

Go to utils/ and run:

python get_feature.py --dataset train

The NumPy file will be saved in img_feat/vgg_16/train.npy.

Train Listener Model

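Each listener is trained in two steps:
Step 1: train with the image features fixed (--train_img_model 0).
Step 2: fine-tune the image features (--train_img_model 1), starting from the checkpoint saved in step 1.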

SL (Simple Listener)

python train_listener.py --mode train --log_dir result/SL --pairwise 0 --train_img_model 0 --max_steps 2000 --batch_size 128
python train_listener.py --mode train --log_dir result/SL --pairwise 0 --train_img_model 1 --max_steps 7500 --load_model_path model-fixed-2000 --learn_rate 0.00001

SLr (Simple Listener trained w/o contrastive data)

python train_listener.py --mode train --log_dir result/SLr --pairwise 0 --ran_neg_sample 1 --train_img_model 0 --max_steps 5000 --batch_size 128
python train_listener.py --mode train --log_dir result/SLr --pairwise 0 --ran_neg_sample 1 --train_img_model 1 --max_steps 10000 --load_model_path model-fixed-5000 --learn_rate 0.00001

DL (Discerning Listener)

python train_listener.py --mode train --log_dir result/DL --pairwise 1 --train_img_model 0 --max_steps 2000 --max_sent_length 17 --batch_size 128
python train_listener.py --mode train --log_dir result/DL --pairwise 1 --train_img_model 1 --max_steps 7000 --load_model_path model-fixed-2000 --max_sent_length 17 --learn_rate 0.00001
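
The six training commands share one structure and differ only in a few flags and step counts. The sketch below regenerates them from a small table so the pattern is easier to see. The `LISTENERS` table and the `train_commands` helper are written for this example and are not part of the repository; all flag values are copied from the commands above.

```python
# Settings copied from the README commands: flags placed before and after the
# step-specific flags, plus max_steps for step 1 and step 2.
LISTENERS = {
    "SL": {"pre": ["--pairwise", "0"], "post": [], "steps": (2000, 7500)},
    "SLr": {"pre": ["--pairwise", "0", "--ran_neg_sample", "1"],
            "post": [], "steps": (5000, 10000)},
    "DL": {"pre": ["--pairwise", "1"],
           "post": ["--max_sent_length", "17"], "steps": (2000, 7000)},
}


def train_commands(name):
    """Return the (step 1, step 2) argument lists for one listener model."""
    cfg = LISTENERS[name]
    fixed_steps, finetune_steps = cfg["steps"]
    base = ["python", "train_listener.py", "--mode", "train",
            "--log_dir", "result/" + name] + cfg["pre"]
    # Step 1: image features fixed.
    step1 = (base + ["--train_img_model", "0", "--max_steps", str(fixed_steps)]
             + cfg["post"] + ["--batch_size", "128"])
    # Step 2: fine-tune, resuming from the step 1 checkpoint.
    step2 = (base + ["--train_img_model", "1", "--max_steps", str(finetune_steps),
                     "--load_model_path", "model-fixed-%d" % fixed_steps]
             + cfg["post"] + ["--learn_rate", "0.00001"])
    return step1, step2


for model in ("SL", "SLr", "DL"):
    for args in train_commands(model):
        print(" ".join(args))  # pass `args` to subprocess.check_call to run it
```

Note that step 2 always loads the checkpoint named after the step 1 step count (for example model-fixed-2000 after 2000 steps).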

Evaluate Listener Model

SL

python train_listener.py --mode eval --log_dir result/SL --pairwise 0 --train_img_model 0 --load_model_path model-fixed-2000 --dataset val
python train_listener.py --mode eval --log_dir result/SL --pairwise 0 --train_img_model 1 --load_model_path model-finetune-7500 --dataset val

SLr

python train_listener.py --mode eval --log_dir result/SLr --pairwise 0 --train_img_model 0 --load_model_path model-fixed-5000 --dataset val
python train_listener.py --mode eval --log_dir result/SLr --pairwise 0 --train_img_model 1 --load_model_path model-finetune-10000 --dataset val

DL

python train_listener.py --mode eval --log_dir result/DL --pairwise 1 --train_img_model 0 --load_model_path model-fixed-2000 --dataset val
python train_listener.py --mode eval --log_dir result/DL --pairwise 1 --train_img_model 1 --load_model_path model-finetune-7000 --dataset val
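
The checkpoint names in these commands appear to follow a pattern: model-fixed-N for the fixed-feature step and model-finetune-N for the fine-tuning step, where N is the max_steps value used in training. The sketch below encodes that inferred pattern; `checkpoint_name` and `eval_command` are helpers written for this example, not functions from the repository.

```python
def checkpoint_name(train_img_model, max_steps):
    """Checkpoint name as it appears in the README commands (pattern inferred)."""
    stage = "finetune" if train_img_model else "fixed"
    return "model-%s-%d" % (stage, max_steps)


def eval_command(name, pairwise, train_img_model, max_steps, dataset="val"):
    """Build the evaluation command line for one trained listener."""
    return ("python train_listener.py --mode eval --log_dir result/%s "
            "--pairwise %d --train_img_model %d --load_model_path %s "
            "--dataset %s"
            % (name, pairwise, train_img_model,
               checkpoint_name(train_img_model, max_steps), dataset))


print(eval_command("DL", 1, 1, 7000))
```

If a checkpoint was trained with a different number of steps, only the `max_steps` argument needs to change.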

Train Speaker Model

Example: python train_speaker.py --speaker_mode=S --img_model=vgg_16 --train_img_model=1 --experiment_path=result/speaker/temp
Options:
- --speaker_mode: S or DS
- --img_model: alexnet, inception_v3, or vgg_16
- --train_img_model: Fine-tune image model or not (0 as False, 1 as True)
- --experiment_path: where to output and save the trained model
- --load_model_dir: path to the pre-trained model. If not set, train from scratch
- --load_model_name: model name (model-%steps) in load_model_dir
- See more options in train_speaker.py
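
To make the option semantics concrete, here is a sketch of how these flags could be parsed with Python's argparse. It is not the parsing code of train_speaker.py, which may use a different flag library; the default values shown are assumptions chosen for the example.

```python
import argparse


def build_parser():
    """Parser mirroring the documented options (defaults are assumptions)."""
    parser = argparse.ArgumentParser(description="Sketch of the speaker options")
    parser.add_argument("--speaker_mode", choices=["S", "DS"], default="S")
    parser.add_argument("--img_model", default="vgg_16",
                        choices=["alexnet", "inception_v3", "vgg_16"])
    # 0 keeps the image model fixed, 1 fine-tunes it.
    parser.add_argument("--train_img_model", type=int, choices=[0, 1], default=0)
    parser.add_argument("--experiment_path", default="result/speaker/temp")
    # Leaving load_model_dir unset means training from scratch.
    parser.add_argument("--load_model_dir", default=None)
    parser.add_argument("--load_model_name", default=None)
    return parser


args = build_parser().parse_args([
    "--speaker_mode=S", "--img_model=vgg_16",
    "--train_img_model=1", "--experiment_path=result/speaker/temp",
])
train_from_scratch = args.load_model_dir is None
```

With the example arguments, no pre-trained speaker is loaded, so `train_from_scratch` is true.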

Use Speaker to Generate Attribute Phrases

Example: python inference_pairwise.py --input_path=result/speaker/temp --model_step=model-5000 --dataset_name=val
Options:
- --input_path: path to the trained speaker model that you want to use
- --model_step: model name (model-%steps) in input_path
- --dataset_name: which sub-dataset to use (train / val / test)
- See more options in inference_pairwise.py

Discerning Speaker Model

Here we use the listener model to re-rank the attribute phrases generated by the speaker model. To run this step, you need a trained listener model and the phrases generated by a speaker model.

Example: python rerank.py --listener_path=result/SL --listener_model=model-fixed-2000 --speaker_result_path=result/speaker/temp/infer_annotations_val_model-5000_case0_beam10_sent10.json --infer_dataset=val
Options:
- --listener_path: path to the listener model used for reranking
- --listener_model: model name (model-%steps) in listener_path
- --speaker_result_path: the file that saves the phrases generated by a speaker model
- --infer_dataset: which dataset to work on (train / val / test)
- See more options in rerank.py
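
The core idea of re-ranking can be sketched in a few lines: score each candidate phrase and sort from highest to lowest. This is a simplified illustration, not the logic of rerank.py. The `rerank` helper is hypothetical, and the scores are made-up numbers standing in for what a listener model would assign given the image pair.

```python
def rerank(phrases, score_fn):
    """Sort candidate phrases from highest to lowest score."""
    return sorted(phrases, key=score_fn, reverse=True)


# Made-up scores for illustration only; a real listener would compute a score
# for each phrase from the two images.
toy_scores = {
    "white and grey": 0.2,
    "twin engines": 0.9,
    "large plane": 0.6,
}

ranked = rerank(list(toy_scores), toy_scores.get)
print(ranked)
```

The speaker proposes candidates and the listener decides their order, so a phrase the speaker generated with low confidence can still end up first if the listener finds it discriminative.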

Generate Set-wise Attribute Phrases

In inference_setwise.py, set speaker_path to the path of the trained speaker model you want to use.
Then run: python inference_setwise.py

Authors

Please contact jcsu@cs.umass.edu if you have any questions.

Jong-Chyi Su (UMass Amherst)
Chenyun Wu (UMass Amherst)
Huaizu Jiang (UMass Amherst)
