This repository contains the code for our paper Dual-Path Convolutional Image-Text Embedding. Thank you for your kind attention.
What's New: We updated the paper to a second version, adding more explanation of the mechanism behind the proposed instance loss.
I have included my copy of MatConvNet in this repo, so you do not need to download it again. You just need to uncomment and modify some lines in gpu_compile.m and run it in MATLAB. Try it! (The code does not support cuDNN 6.0. You can either turn off Enablecudnn or use cuDNN 5.1.)
If compilation fails, you may refer to http://www.vlfeat.org/matconvnet/install/
- Extract word2vec weights. Follow the instructions in ./word2vector_matlab.
- Preprocess the dataset. Follow the instructions in ./dataset. You can choose one dataset to run. The three datasets need different preprocessing; I have written instructions for Flickr30k, MSCOCO and CUHK-PEDES.
- Download the model pre-trained on ImageNet and put it into './data'.
(bash) wget http://www.vlfeat.org/matconvnet/models/imagenet-resnet-50-dag.mat
Alternatively, you may try VGG16 or VGG19.
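The download step can be wrapped so that it creates './data' and skips the download when the file is already there. This is only a minimal sketch: `fetch_model`, `MODEL_URL` and `DATA_DIR` are illustrative names that are not part of this repository, and it assumes you run it from the repository root with `wget` installed.

```bash
#!/usr/bin/env bash
# Sketch: fetch the ImageNet-pretrained ResNet-50 into ./data unless it is already there.
# fetch_model, MODEL_URL and DATA_DIR are hypothetical names used for illustration only.

MODEL_URL="http://www.vlfeat.org/matconvnet/models/imagenet-resnet-50-dag.mat"
DATA_DIR="./data"

fetch_model() {
  local target
  target="$DATA_DIR/$(basename "$MODEL_URL")"
  mkdir -p "$DATA_DIR"
  if [ -s "$target" ]; then
    # A non-empty file with the expected name exists, so do not download again.
    echo "already present: $target"
  else
    # -O writes to the given path; remove the partial file if the download fails.
    wget -O "$target" "$MODEL_URL" || { rm -f "$target"; return 1; }
    echo "downloaded: $target"
  fi
}
```

Calling `fetch_model` once should leave the model at ./data/imagenet-resnet-50-dag.mat. If you use VGG16 or VGG19 instead, `MODEL_URL` would need to point to that model.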
Your split may differ from mine. (Sorry, this is my fault: I used a random split.) As a backup, this is the dictionary archive used in the paper.
You may download the three trained models from Google Drive.
- For Flickr30k, run train_flickr_word2_1_pool.m for Stage I training. Run train_flickr_word_Rankloss_shift_hard for Stage II training.
- For MSCOCO, run train_coco_word2_1_pool.m for Stage I training. Run train_coco_Rankloss_shift_hard.m for Stage II training.
- For CUHK-PEDES, run train_cuhk_word2_1_pool.m for Stage I training. Run train_cuhk_word_Rankloss_shift for Stage II training.
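If you prefer to launch both stages from a terminal rather than the MATLAB desktop, something like the following might work. It is a sketch under assumptions: `train_two_stages` and `DRY_RUN` are illustrative names that are not part of this repository, the `matlab` executable is assumed to be on your PATH, and the training scripts are assumed to be runnable from the current directory. The script names are the ones listed above, with any `.m` extension dropped so they can be called as MATLAB commands.

```bash
#!/usr/bin/env bash
# Sketch: run Stage I, then Stage II, for one dataset in a non-interactive MATLAB session.
# train_two_stages and DRY_RUN are hypothetical names used for illustration only.

train_two_stages() {
  local stage1 stage2 script matlab_cmd
  case "$1" in
    flickr30k)  stage1=train_flickr_word2_1_pool; stage2=train_flickr_word_Rankloss_shift_hard ;;
    mscoco)     stage1=train_coco_word2_1_pool;   stage2=train_coco_Rankloss_shift_hard ;;
    cuhk-pedes) stage1=train_cuhk_word2_1_pool;   stage2=train_cuhk_word_Rankloss_shift ;;
    *) echo "unknown dataset: $1" >&2; return 1 ;;
  esac
  for script in "$stage1" "$stage2"; do
    # Make MATLAB exit with a non-zero status if the script raises an error.
    matlab_cmd="try, $script; catch err, disp(getReport(err)); exit(1); end; exit(0)"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Only print the command that would be executed.
      echo "matlab -nodisplay -nosplash -r \"$matlab_cmd\""
    else
      # Stop before Stage II if Stage I did not finish cleanly.
      matlab -nodisplay -nosplash -r "$matlab_cmd" || return 1
    fi
  done
}
```

For example, `DRY_RUN=1 train_two_stages flickr30k` prints the two commands without starting MATLAB, and `train_two_stages flickr30k` runs them in order.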
Select one model and have fun!
- For Flickr30k, run test/extract_pic_feature_word2_plus_52.m to extract the features from the images and text. Note that you need to change the model path in the code.
- For MSCOCO, run test_coco/extract_pic_feature_word2_plus.m to extract the features from the images and text. Note that you need to change the model path in the code.
- For CUHK-PEDES, run test_cuhk/extract_pic_feature_word2_plus_52.m to extract the features from the images and text. Note that you need to change the model path in the code.
- Get word2vec weight
- Data Preparation (Flickr30k)
- Train on Flickr30k
- Test on Flickr30k
- Data Preparation (MSCOCO)
- Train on MSCOCO
- Test on MSCOCO
- Data Preparation (CUHK-PEDES)
- Train on CUHK-PEDES
- Test on CUHK-PEDES
- Run the code on another machine