The code of the paper "Cross-Modal Graph Matching Network for Image-Text Retrieval" in ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) .

Cross-Modal Graph Matching Network for Image-Text Retrieval (CGMN)

PyTorch code for CGMN, described in the paper "Cross-Modal Graph Matching Network for Image-Text Retrieval". The paper has been accepted by ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM). The code is built on top of VSE++.

Partial data can be obtained here, and pretrained models are available for Flickr30K and MS-COCO.

The IOU.npy file can be generated by running getiou.py on the _bbx.npy file.
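
For orientation, the kind of computation behind an IoU file can be sketched as follows. This is not the repository's getiou.py: the function name `pairwise_iou` is hypothetical, and the sketch assumes each box is stored as `[x1, y1, x2, y2]` corner coordinates, which the README does not state.

```python
import numpy as np


def pairwise_iou(boxes):
    """Return the (N, N) intersection-over-union matrix for N boxes.

    Assumption: each row of `boxes` is [x1, y1, x2, y2] with x2 >= x1, y2 >= y1.
    """
    boxes = np.asarray(boxes, dtype=np.float64)
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)

    # Corners of the intersection rectangle for every pair (i, j).
    ix1 = np.maximum(x1[:, None], x1[None, :])
    iy1 = np.maximum(y1[:, None], y1[None, :])
    ix2 = np.minimum(x2[:, None], x2[None, :])
    iy2 = np.minimum(y2[:, None], y2[None, :])

    # Negative widths or heights mean the boxes do not overlap.
    inter = np.clip(ix2 - ix1, 0, None) * np.clip(iy2 - iy1, 0, None)
    union = areas[:, None] + areas[None, :] - inter

    # Guard against division by zero for degenerate (zero-area) boxes.
    return np.divide(inter, union, out=np.zeros_like(inter), where=union > 0)
```

Applied to the boxes of one image, this gives a symmetric matrix with ones on the diagonal for non-degenerate boxes. The actual array layout expected in IOU.npy should be checked against getiou.py.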

Requirements

We recommend the following dependencies.

import nltk
nltk.download()
> d punkt
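
In scripts or on a headless machine, the same tokenizer data can be fetched without the interactive prompt. The following is a minimal sketch; the helper name `ensure_punkt` is hypothetical, while `nltk.data.find` and `nltk.download` are standard NLTK calls.

```python
def ensure_punkt():
    """Download the NLTK 'punkt' tokenizer data only if it is missing."""
    import nltk  # imported lazily so this helper can be defined before NLTK is used

    try:
        # Raises LookupError when the resource is not on NLTK's data path.
        nltk.data.find("tokenizers/punkt")
    except LookupError:
        nltk.download("punkt", quiet=True)
```

Calling `ensure_punkt()` once before training or evaluation should be enough; later calls return without downloading.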

Evaluate pre-trained models

Modify model_path and data_path in evaluation_models.py, then run the script:

python evaluation_models.py

Training new models

Run train.py:

For MSCOCO:

python train.py --data_path $DATA_PATH --data_name coco_precomp --logger_name runs/coco_VSRN --max_violation

For Flickr30K:

python train.py --data_path $DATA_PATH --data_name f30k_precomp --logger_name runs/flickr_CGMN --max_violation --lr_update 10  --max_len 60
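
The `--max_violation` flag comes from the VSE++ codebase this repository builds on, where it switches the ranking loss from summing over all negatives in a batch to using only the hardest negative. The NumPy sketch below illustrates that difference. It is not the repository's loss code: the function name, the margin value, and the assumption that CGMN uses the flag in the same way as VSE++ are all illustrative.

```python
import numpy as np


def ranking_loss(scores, margin=0.2, max_violation=False):
    """Hinge-based triplet ranking loss over a batch similarity matrix.

    scores[i, j] is the similarity of image i and caption j; matching
    pairs sit on the diagonal. The margin of 0.2 is an assumed default.
    """
    scores = np.asarray(scores, dtype=np.float64)
    positives = np.diag(scores)

    # Image as query: each caption in row i is compared with the true caption of image i.
    cost_captions = np.clip(margin + scores - positives[:, None], 0, None)
    # Caption as query: each image in column j is compared with the true image of caption j.
    cost_images = np.clip(margin + scores - positives[None, :], 0, None)

    # A positive pair is never its own negative.
    diagonal = np.eye(scores.shape[0], dtype=bool)
    cost_captions[diagonal] = 0.0
    cost_images[diagonal] = 0.0

    if max_violation:
        # Keep only the hardest negative per query.
        return cost_captions.max(axis=1).sum() + cost_images.max(axis=0).sum()
    return cost_captions.sum() + cost_images.sum()
```

With hardest-negative mining the loss is never larger than the summed version, since each query contributes only its largest violation.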

Reference

If you find this code useful, please cite the following paper:

@article{Cheng2022CGMN,
author = {Cheng, Yuhao and Zhu, Xiaoguang and Qian, Jiuchao and Wen, Fei and Liu, Peilin},
title = {Cross-Modal Graph Matching Network for Image-Text Retrieval},
year = {2022},
issue_date = {November 2022},
volume = {18},
number = {4},
issn = {1551-6857},
url = {https://doi.org/10.1145/3499027},
doi = {10.1145/3499027},
journal = {ACM Trans. Multimedia Comput. Commun. Appl.},
month = {mar},
articleno = {95},
numpages = {23},
keywords = {Image-text retrieval, relation reasoning, cross-modal matching, graph matching}
}

License

Apache License 2.0
