SEMScene: Semantic-Consistency Enhanced Multi-Level Scene Graph Matching for Image-Text Retrieval

PyTorch implementation of the SEMScene model, a scene-graph-based image-text retrieval method. The paper, "SEMScene: Semantic-Consistency Enhanced Multi-Level Scene Graph Matching for Image-Text Retrieval", has been accepted by ACM Transactions on Multimedia Computing, Communications, and Applications (ACM TOMM) and is currently awaiting typesetting for publication.

Requirements

All packages used by the model are listed in requirements.txt. The Python version is 3.11.4.

Data

In addition to the basic data already uploaded to this repository, the model needs two further inputs: the adjacency matrix built from the connections of predicates, and the sentence triplets extracted with SceneGraphParser. Both can be obtained here: flickr30k and mscoco. Please download them and place them in the data_flickr30k/data and data_mscoco/data folders, respectively. Alternatively, you can extract them yourself by editing the paths to the original files in extract_pred_adj.py and sng_parser_process.ipynb and then running them. The original files can be downloaded from here. After extracting the sentence triplets, please apply stemming to them.
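
The stemming step is not spelled out here, so the following is only a minimal sketch of one way to do it. It assumes the triplets are held as (subject, predicate, object) string tuples, which may differ from the format produced by sng_parser_process.ipynb. The names stem_triplets and toy_stem are hypothetical, and toy_stem is a deliberately crude stand-in rather than a real stemming algorithm.

```python
def stem_triplets(triplets, stem):
    """Lower-case and stem every word of each (subject, predicate, object) triplet.

    `stem` is any callable mapping one word to its stem.
    """
    stemmed = []
    for triplet in triplets:
        stemmed.append(
            tuple(" ".join(stem(word) for word in part.lower().split())
                  for part in triplet)
        )
    return stemmed


def toy_stem(word):
    # Toy stand-in for illustration only; NOT the Porter algorithm.
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


if __name__ == "__main__":
    example = [("Dogs", "playing with", "balls")]  # made-up example triplet
    print(stem_triplets(example, toy_stem))
```

In practice a proper stemmer would be passed instead of toy_stem, for example `nltk.stem.PorterStemmer().stem` if NLTK is installed. Which stemmer matches the provided data files is not stated here, so treat that choice as an assumption.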

The visual features of objects and predicates are also needed. Following LGSGM, we use EfficientNet-b5 to extract these features; you can find them here: flickr30k_visual and mscoco_visual. The files storing all extracted visual features of Flickr30k and MSCOCO are provided by LGSGM, many thanks. Please download them and place them in data_flickr30k and data_mscoco, respectively.
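
Before training, it can help to confirm that the downloads ended up in the folders named above. The sketch below only checks that each expected folder exists and is not empty. It does not check individual file names, since those are not listed here, and the function name missing_data_dirs is hypothetical.

```python
from pathlib import Path


def missing_data_dirs(repo_root, expected=("data_flickr30k/data", "data_mscoco/data")):
    """Return the expected data folders that are absent or empty under repo_root."""
    root = Path(repo_root)
    missing = []
    for rel in expected:
        folder = root / rel
        # A folder counts as missing if it does not exist or contains nothing.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    problems = missing_data_dirs(".")
    if problems:
        for rel in problems:
            print(f"Missing or empty: {rel}")
    else:
        print("All expected data folders are present.")
```

Run it from the repository root. If you only work with one dataset, pass a shorter `expected` tuple.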

Training new models from scratch

Please modify the hyper-parameters in SEMScene/Configuration.py according to their corresponding comments, and run:

python SEMScene/SEMScene.py

Pre-trained model and Evaluation

Because of limited Google Drive space, we have temporarily uploaded the pretrained models for Flickr30K; they can be downloaded from flickr30k_pretrained_model. To evaluate, set the checkpoint path in row 24 of SEMScene/Configuration.py (info_dict['checkpoint'] = None), delete the statement trainer.train() in row 935 of SEMScene/SEMScene.py, and then run SEMScene/SEMScene.py.
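
As a rough illustration of the configuration change, the snippet below mimics the relevant entry with a stand-in dictionary. The checkpoint location is a hypothetical placeholder and should be replaced with wherever you saved the download. The existence check is an extra convenience and not part of the repository.

```python
from pathlib import Path

# Stand-in for the dictionary defined in SEMScene/Configuration.py.
info_dict = {"checkpoint": None}

# Hypothetical location of the downloaded checkpoint; use your own path.
checkpoint_path = Path("checkpoints") / "flickr30k_pretrained_model"

# This mirrors the edit in row 24: replace None with the checkpoint path.
info_dict["checkpoint"] = str(checkpoint_path)

# Optional sanity check before launching the evaluation run.
if not checkpoint_path.exists():
    print(f"Warning: checkpoint not found at {checkpoint_path}")
```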

Contact

For any issue or comment, you can directly email the authors at lyk208d80@gmail.com or xiangyuan@stu.pku.edu.cn.

Reference

If you find our work helpful to your research, please cite our work as:

@article{10.1145/3664816,
author = {Liu, Yuankun and Yuan, Xiang and Li, Haochen and Tan, Zhijie and Huang, Jinsong and Xiao, Jingjie and Li, Weiping and Mo, Tong},
title = {SEMScene: Semantic-Consistency Enhanced Multi-Level Scene Graph Matching for Image-Text Retrieval},
year = {2024},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
issn = {1551-6857},
url = {https://doi.org/10.1145/3664816},
doi = {10.1145/3664816},
note = {Just Accepted},
journal = {ACM Trans. Multimedia Comput. Commun. Appl.},
month = {May}
}
