Cross-Modal Retrieval Experiments

Training on MSCOCO

sh scripts/train_coco.sh

Training on Flickr

sh scripts/train_flickr.sh

Check out the command-line options for details of the experiment configuration. Supported loss functions: triplet, order-violation, and cosine similarity. dnn_library.py is the interface for using any other base feature extractor.
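To make the triplet option concrete, here is a minimal NumPy sketch of a bidirectional triplet ranking loss over a batch of matched image/caption embeddings. It is an illustration only, not the code in model.py: the function names, the margin of 0.2, the use of cosine similarity as the score, and the sum over all in-batch negatives are assumptions.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def triplet_ranking_loss(image_emb, text_emb, margin=0.2):
    """Hinge-based triplet loss; row i of each input is a matched pair.

    Hypothetical helper: margin and in-batch negative sampling are assumptions.
    """
    im = l2_normalize(np.asarray(image_emb, dtype=np.float64))
    tx = l2_normalize(np.asarray(text_emb, dtype=np.float64))
    scores = im @ tx.T          # scores[i, j]: cosine similarity of image i and caption j
    pos = np.diag(scores)       # similarity of each matched pair

    # Image as anchor: every non-matching caption is a negative.
    cost_i2t = np.maximum(0.0, margin + scores - pos[:, None])
    # Caption as anchor: every non-matching image is a negative.
    cost_t2i = np.maximum(0.0, margin + scores - pos[None, :])

    # Matched pairs are not negatives, so zero out the diagonal.
    np.fill_diagonal(cost_i2t, 0.0)
    np.fill_diagonal(cost_t2i, 0.0)
    return float(cost_i2t.sum() + cost_t2i.sum())
```

The loss is zero when every matched pair scores higher than all mismatched pairs by at least the margin, and grows as mismatched pairs approach or overtake the matched ones.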

Evaluation on MSCOCO

sh scripts/eval_coco.sh <path_to_checkpoint>

<path_to_checkpoint> - Path to the trained model checkpoint (ckpt). Check the default command-line options for evaluation and modify them as needed.

Evaluation on Flickr

sh scripts/eval_flickr.sh <path_to_checkpoint>

Generate Data

The data folder contains scripts for generating TF records for the flowers dataset. Check the command-line arguments in the scripts to set the paths.

To generate train and text files for flowers

python process_flowers_6k.py

To generate TF-records for flowers

python flowers_data_loader.py

coco_data_loader.py is the base class for reading COCO data. Data readers and writers are included, along with padded batching, preprocessing, and data augmentation.
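Padded batching means bringing variable-length captions to a common length so they can be stacked into one array. The sketch below shows the idea in plain NumPy; it is not taken from coco_data_loader.py, and the helper name pad_batch and the padding id 0 are assumptions.

```python
import numpy as np


def pad_batch(sequences, pad_id=0):
    """Pad token-id lists to the longest length in the batch.

    Hypothetical helper: returns the padded int32 array and the true lengths,
    which a sequence encoder can use to ignore the padding positions.
    """
    lengths = np.array([len(s) for s in sequences], dtype=np.int32)
    max_len = int(lengths.max()) if len(sequences) else 0
    batch = np.full((len(sequences), max_len), pad_id, dtype=np.int32)
    for row, seq in enumerate(sequences):
        batch[row, :len(seq)] = seq  # copy the real tokens, leave the rest padded
    return batch, lengths


batch, lengths = pad_batch([[5, 8, 2], [7]])
```

Keeping the original lengths alongside the padded array lets the LSTM encoder stop at the last real token of each caption.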

To generate TF-records for MSCOCO

python coco_data_loader.py --num 10000

To generate CNN features for Flickr

python extract_image_features.py --dataset flickr --data_path /shared/kgcoe-research/mil/peri/flickr_data/ --root_path /shared/kgcoe-research/mil/Flickr30k/flickr30k_images/flickr30k_imagebackup2/ --save_path /shared/kgcoe-research/mil/peri/flickr_data/

To generate TF records with precomputed resnet_v1_152 features for flickr captions

python extract_image_features.py --record_path /shared/kgcoe-research/mil/Flickr30k/flickr_new_train_feat.tfrecord

To generate TF records with precomputed resnet_v1_152 features for MSCOCO

python extract_image_features.py --dataset mscoco --root_path /shared/kgcoe-research/mil/video_project/mscoco_skipthoughts/images/ --data_path /shared/kgcoe-research/mil/peri/mscoco_data/train.ids --save_path /shared/kgcoe-research/mil/peri/mscoco_data/ --caps_path /shared/kgcoe-research/mil/peri/mscoco_data/train_caps.txt

Args:

--num : Number of examples to put in the TF record. If it is not specified, the entire dataset is used. Do not specify it unless you are trying to overfit on a smaller dataset.
--phase : The training phase is set by default; it picks the training images plus some validation images of MSCOCO.

More command-line options for setting the data paths can be found in the script coco_data_loader.py. A sample in the TF record is of the form (image, caption).
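The behaviour of --num and --phase described above can be sketched with argparse. This is a hypothetical stand-in, not the parser in coco_data_loader.py: the default string "train", the help texts, and the helper select_examples are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the two documented options."""
    parser = argparse.ArgumentParser(description="Sketch of TF-record writer options")
    parser.add_argument("--num", type=int, default=None,
                        help="Number of examples to write; omit to use the whole dataset")
    parser.add_argument("--phase", type=str, default="train",
                        help="Dataset split to process (assumed default: train)")
    return parser


def select_examples(pairs, num=None):
    """Return the first `num` (image, caption) pairs, or all pairs if num is None."""
    pairs = list(pairs)
    return pairs if num is None else pairs[:num]


args = build_parser().parse_args(["--num", "2"])
samples = [("a.jpg", "a dog"), ("b.jpg", "a cat"), ("c.jpg", "a bird")]
subset = select_examples(samples, args.num)
```

Leaving --num unset keeps every pair, which matches the advice to set it only when deliberately overfitting on a small subset.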

Notes

model.py - Base model class for LSTM encoder, feature extractor, embedding layers and loss function
dnn_library.py - Dictionary of base feature extractor networks
Checkpoints and summaries can be found at:

/shared/kgcoe-research/mil/peri/mscoco_data/
/shared/kgcoe-research/mil/Flickr30k
