Captioned image retrieval
We explore the task of retrieving similar captioned images from a dataset, given a previously unseen captioned image.
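To make the task concrete, here is a minimal sketch of the retrieval step in Python. It is not the code in this repository. It assumes each captioned image has already been reduced to one numeric feature vector (for example, image features concatenated with caption features), and it ranks dataset items by cosine similarity to the query. The names `cosine_similarity` and `retrieve_similar`, and the toy vectors, are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_similar(query, dataset, k=3):
    """Return the indices of the k dataset vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), idx) for idx, vec in enumerate(dataset)]
    # Highest similarity first; ties are broken by the lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


# Toy feature vectors standing in for real image + caption features (assumption).
dataset = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query = [0.8, 0.2, 0.0]

top_matches = retrieve_similar(query, dataset, k=2)
print(top_matches)
```

In practice the feature vectors would come from whichever pipeline is in use (baseline, LDA, or CNN); the ranking step stays the same regardless of how the features are produced.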
Note: the source code in SpatialPyramid includes some bug fixes, so it is not identical to the original source code from UIUC.
To run baseline
- Unzip the Flickr 8k dataset to a
- Run: close all; clear all; baseline;
To run crawling
- Add the required search tags into the searchTags array in line 13 of
Note: The crawled Imgur dataset (DataM) consists of 32K images and ~110K captions and is ~16 GB in size. It can be provided on request.
To run LDA
See readme in lda/
To run CNN
See readme in caffenet/
The dataset and associated captions can be found at: http://pages.cs.wisc.edu/~ms/CS766-ComputerVision/captioned-image-retrieval/