danieljf24/cmrf

cmrf

The cmrf package provides a Python implementation of our winning solution [1] for the MSR-Bing Image Retrieval Challenge, held in conjunction with ACM Multimedia 2015. It offers:

  • Six individual methods (Image2text, Text2image, PSI, DeViSE, ConSE and Parzen window),
  • Learning of optimized weights for relevance fusion,
  • Cross-platform support (Linux, macOS, Windows).

Dependency

Description

To compute cross-media relevance, an image and a query have to be represented in a common space, because they come from two distinct modalities. The package implements six individual methods for cross-media relevance computation and a late-fusion method that combines their relevance scores.

Individual training methods

  • PSI: uses mini-batch stochastic gradient descent to minimize the margin ranking loss of the PSI model.
  • DeViSE: uses mini-batch stochastic gradient descent to minimize the margin ranking loss of the DeViSE model.
  • The other methods have no training process.
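To make the training objective concrete, here is a minimal pure-Python sketch of mini-batch SGD on a margin ranking loss. It is not the package's code: the bilinear score `q^T W v`, the function names, the learning rate, the margin and the toy triplets are all assumptions chosen for illustration, and the real PSI and DeViSE models are parameterized differently.

```python
import random

def score(W, q, v):
    """Bilinear relevance q^T W v (an assumed, simplified scoring function)."""
    return sum(q[i] * W[i][j] * v[j] for i in range(len(q)) for j in range(len(v)))

def hinge(W, triplet, margin=1.0):
    """Margin ranking loss for one (query, relevant image, irrelevant image) triplet."""
    q, pos, neg = triplet
    return max(0.0, margin - score(W, q, pos) + score(W, q, neg))

def sgd_step(W, batch, lr=0.1, margin=1.0):
    """Apply one mini-batch update in place; return the mean loss before the update."""
    rows, cols = len(W), len(W[0])
    grad = [[0.0] * cols for _ in range(rows)]
    total = 0.0
    for q, pos, neg in batch:
        loss = hinge(W, (q, pos, neg), margin)
        total += loss
        if loss > 0.0:  # only margin-violating triplets contribute a gradient
            for i in range(rows):
                for j in range(cols):
                    grad[i][j] += q[i] * (neg[j] - pos[j])
    for i in range(rows):
        for j in range(cols):
            W[i][j] -= lr * grad[i][j] / len(batch)
    return total / len(batch)

def train(triplets, dim_q, dim_v, epochs=50, batch_size=2, seed=0):
    """Shuffle the triplets each epoch and update W one mini-batch at a time."""
    rng = random.Random(seed)
    W = [[0.0] * dim_v for _ in range(dim_q)]
    data = list(triplets)
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            sgd_step(W, data[start:start + batch_size])
    return W

# Hypothetical toy data: 2-dim query vectors, 3-dim image vectors.
triplets = [
    ([1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]),
    ([0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]),
    ([1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]),
    ([0.0, 1.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]),
]
W = train(triplets, dim_q=2, dim_v=3)
final_loss = sum(hinge(W, t) for t in triplets) / len(triplets)
```

After training on this toy set, every relevant image outscores the irrelevant one by at least the margin, so the mean loss drops to zero.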

Individual test methods

  • Image2text: projects the image into the Bag-of-Words space.
  • Text2image: projects the query into the visual feature space.
  • PSI: projects the image and the Bag-of-Words representation of the query into a learned latent space.
  • DeViSE: projects the image and the word2vec representation of the query into a learned latent space.
  • ConSE: projects the image and the query into a learned word2vec space.
  • Parzen window: an extreme case of Text2image.
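Once an image and a query live in the same space, a relevance score can be read off with a similarity measure. The sketch below assumes cosine similarity and uses hypothetical names (`cosine`, `rank_images`) and made-up vectors; the similarity each method in the package actually uses may differ.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_images(query_vec, image_vecs):
    """Return (image_id, relevance) pairs sorted from most to least relevant."""
    scored = [(img_id, cosine(query_vec, vec)) for img_id, vec in image_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical vectors that have already been projected into a common space.
query = [1.0, 0.0, 1.0]
images = {
    "img_a": [1.0, 0.0, 1.0],
    "img_b": [0.0, 1.0, 0.0],
    "img_c": [1.0, 1.0, 0.0],
}
ranking = rank_images(query, images)
```

Here `ranking` lists `img_a` first, then `img_c`, then `img_b`, since `img_a` points in the same direction as the query and `img_b` is orthogonal to it.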

Relevance fusion
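As a rough illustration of late fusion with learned weights, the sketch below assumes a weighted sum of min-max normalized scores and a coarse coordinate search that maximizes pairwise ranking accuracy on labeled pairs. The normalization, the search procedure, the function names and the toy scores are all assumptions made for this example, not the package's actual optimizer.

```python
def min_max(scores):
    """Rescale scores to [0, 1] so that different methods are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def fuse(score_lists, weights):
    """Weighted sum of normalized scores; score_lists[k][i] is method k's score for pair i."""
    normed = [min_max(scores) for scores in score_lists]
    n = len(normed[0])
    return [sum(w * normed[k][i] for k, w in enumerate(weights)) for i in range(n)]

def pairwise_accuracy(fused, labels):
    """Fraction of (relevant, irrelevant) pairs that the fused score orders correctly."""
    good = total = 0
    for i, label_i in enumerate(labels):
        for j, label_j in enumerate(labels):
            if label_i > label_j:
                total += 1
                good += fused[i] > fused[j]
    return good / total if total else 0.0

def learn_weights(score_lists, labels, grid=(0.0, 0.25, 0.5, 0.75, 1.0), sweeps=3):
    """Coordinate search: change one weight at a time, keep changes that improve accuracy."""
    # Start from the first method alone (an arbitrary choice for this sketch).
    weights = [1.0] + [0.0] * (len(score_lists) - 1)
    best = pairwise_accuracy(fuse(score_lists, weights), labels)
    for _ in range(sweeps):
        for k in range(len(weights)):
            for cand in grid:
                trial = weights[:k] + [cand] + weights[k + 1:]
                acc = pairwise_accuracy(fuse(score_lists, trial), labels)
                if acc > best:
                    best, weights = acc, trial
    return weights, best

# Hypothetical scores of two methods for four query-image pairs (1 = relevant).
method_a = [0.9, 0.2, 0.3, 0.1]
method_b = [0.4, 0.8, 0.1, 0.5]
labels = [1, 1, 0, 0]

weights, acc = learn_weights([method_a, method_b], labels)
acc_a_only = pairwise_accuracy(fuse([method_a, method_b], [1.0, 0.0]), labels)
acc_b_only = pairwise_accuracy(fuse([method_a, method_b], [0.0, 1.0]), labels)
```

In this toy case each method on its own misorders one of the four relevant/irrelevant pairs, while the fused score with the learned weights orders all of them correctly.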

Get Started

Run doit_all.sh to check that everything is in place. If it runs successfully, the cross-media relevance of all query-image pairs is written to result/final.result.txt, and the intermediate results also appear in the result folder.

Note

  • If you have not installed Theano, you can run doit_4.sh instead, which covers only Image2text, Text2image, ConSE and Parzen window.
  • As a demonstration, we run only 20 queries. To run all 1000 queries from the Dev set, rename 'qid.text.all.txt' in /rootpath/msr2013dev/Annotations/ to 'qid.text.txt'. This will take a while.
  • If you would like to use your own dataset, we recommend organizing it in the same fixed structure as our data, which minimizes your coding effort.
  • The package does not include any visual feature extractors. Features need to be pre-computed and converted to the required binary format using txt2bin.py.

Reference

[1] Jianfeng Dong, Xirong Li, Shuai Liao, Jieping Xu, Duanqing Xu, Xiaoyong Du. Image Retrieval by Cross-Media Relevance Fusion. ACM Multimedia 2015 (Multimedia Grand Challenge Session)
