ImageCaptionRetrieval

Description

Pytorch code for training an image-caption retrieval model on the MS-COCO dataset. Based on the paper ORDER-EMBEDDINGS OF IMAGES AND LANGUAGE. The model can be trained with 2 approaches:

Symmetric Distance(Cosine loss)
Ordered Distance

Furthermore, the code is also capable of fine-tuning the model using the VIST dataset.

Details

Preprocessing(done by preproc.py): The first step is to extract out normalized 10-CROP VGG19 features from all the images. These features will be used for all further processing.
Training(done by train.py): Loads the image-encoder and the sentence-encoder and trains using triplet loss alongwith Cosine Loss or orderEmbedding loss. No negative mining is being done in this code. The hyperparameters have been set according to the Research Paper.
Testing(done by eval.py): Finds the mean retrieval rank against the test dataset.

Check utils.py for utility function definitions and model.py for encoder architecture.

How to Run:

bar@foo$:~/ImageCaptionRetrieval/Symmetric embedding python3 preproc.py
bar@foo$:~/ImageCaptionRetrieval/Symmetric embedding python3 train.py
bar@foo$:~/ImageCaptionRetrieval/Symmetric embedding python3 eval.py

Similarly, the code can be run in Order embedding directory and FineTune directory.

Performance

A mean rank of 80 out of 5000 images for a given caption can be easily achieved by training this model.

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
FineTune		FineTune
Order embedding		Order embedding
Symmetric embedding		Symmetric embedding
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ImageCaptionRetrieval

Description

Details

How to Run:

Performance

Datasets

Built Using

Author

About

Releases

Packages

Languages

VAIBHAV-2303/ImageCaptionRetrieval

Folders and files

Latest commit

History

Repository files navigation

ImageCaptionRetrieval

Description

Details

How to Run:

Performance

Datasets

Built Using

Author

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages