INTRA-MODAL CONSTRAINT LOSS FOR IMAGE-TEXT RETRIEVAL
This is the code for our ICIP 2022 paper "Intra-Modal Constraint Loss for Image-Text Retrieval" (IMC). The following dependencies are required:
- Python3
- PyTorch
- NumPy
- TensorBoard
- pycocotools
- torchvision
- torchtext
- matplotlib
- nltk
Download the dataset files (MS-COCO and Flickr30K) into /data.
Train a new model on MS-COCO:
python train.py --data_path "$DATA_PATH" --data_name coco --logger_name runs/coco_imc --max_violation \
  --num_epochs 30 --rnn_type LSTM --wordemb glove --use_bidirectional --cnn_type resnet152 --use_restval --il_measure l1
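For context on the `--max_violation` flag above: it enables the VSE++-style max-of-hinges triplet ranking loss that IMC builds on (the IMC loss additionally adds intra-modal constraints; see the paper for the exact formulation). Below is a minimal, hypothetical NumPy sketch of that cross-modal ranking term only, not the repository's actual implementation; `contrastive_loss` and its `margin` default are illustrative names:

```python
import numpy as np

def contrastive_loss(scores, margin=0.2, max_violation=True):
    """Hinge-based triplet ranking loss over an image-text score matrix.

    scores[i, j] is the similarity of image i and caption j;
    the diagonal holds the positive (matching) pairs.
    """
    n = scores.shape[0]
    diag = scores.diagonal().reshape(n, 1)

    # Hinge cost for retrieving captions given an image (compare each row
    # to that image's positive caption) ...
    cost_s = np.clip(margin + scores - diag, 0, None)
    # ... and for retrieving images given a caption (compare each column
    # to that caption's positive image).
    cost_im = np.clip(margin + scores - diag.T, 0, None)

    # Positive pairs on the diagonal incur no cost.
    np.fill_diagonal(cost_s, 0)
    np.fill_diagonal(cost_im, 0)

    if max_violation:
        # Keep only the hardest negative per anchor (the VSE++ variant
        # that --max_violation switches on).
        cost_s = cost_s.max(axis=1)
        cost_im = cost_im.max(axis=0)
    return cost_s.sum() + cost_im.sum()
```

With a perfectly separated score matrix (positives far above all negatives by at least the margin) the loss is zero; otherwise each anchor contributes the hinge cost of its hardest negative.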
You can download our pre-trained models coco_imc and f30k_imc into $RUN_PATH. To evaluate a trained model, e.g. on Flickr30K:
python -c "\
from vocab import Vocabulary
import evaluation
evaluation.evalrank('$RUN_PATH/f30k_imc/model_best.pth.tar', data_path='$DATA_PATH', split='test')"
To do 5-fold cross-validation on MS-COCO, pass fold5=True to evalrank with a model trained on MS-COCO:
python -c "\
from vocab import Vocabulary
import evaluation
evaluation.evalrank('$RUN_PATH/coco_imc/model_best.pth.tar', data_path='$DATA_PATH', split='testall', fold5=True)"
Our code is based on VSE++. We thank the authors for releasing their code.
If you find this code useful, please cite:
@INPROCEEDINGS{9897195,
  author={Chen, Jianan and Zhang, Lu and Wang, Qiong and Bai, Cong and Kpalma, Kidiyo},
  booktitle={2022 IEEE International Conference on Image Processing (ICIP)},
  title={Intra-Modal Constraint Loss for Image-Text Retrieval},
  year={2022},
  pages={4023-4027},
  doi={10.1109/ICIP46576.2022.9897195}}