Jiayu Song, Yuxuan Hu, Lei Zhu, Chengyuan Zhang, Jian Zhang, and Shichao Zhang. 2024. Soft Contrastive Cross-Modal Retrieval. Applied Sciences 14, no. 5: 1944. https://doi.org/10.3390/app14051944
Cross-modal retrieval plays a key role in the natural language processing area, aiming to retrieve one modality from another efficiently. Although most existing cross-modal retrieval methods achieve remarkable performance, their embedding boundaries become sharp and tortuous as model complexity grows. Moreover, most existing methods obtain their strong results on datasets free of errors or noise, an idealized setting that leaves the trained models lacking robustness. To address these problems, we propose a novel approach, Soft Contrastive Cross-Modal Retrieval (SCCMR), which integrates a deep cross-modal model with soft contrastive learning and smooth-label cross-entropy learning to improve common-subspace embedding and enhance the generalizability and robustness of the model. To confirm the performance and effectiveness of SCCMR, we conduct extensive experiments against 12 state-of-the-art methods on two multimodal datasets, using image-text retrieval as a showcase. The experimental results show that our proposed method outperforms the baselines.
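The two training ingredients named above can be sketched in a few lines. The snippet below is a minimal illustration, not the repository's actual implementation: it assumes batch-level contrastive targets (each image matched to its paired text, as in common contrastive retrieval setups) and applies label smoothing to soften those targets before a cross-entropy over image-to-text similarities. Function names and the smoothing parameter `eps` are illustrative choices.

```python
import numpy as np

def smooth_labels(labels, num_classes, eps=0.1):
    """One-hot targets with label smoothing: the true class gets 1 - eps,
    the remaining mass eps is spread evenly over the other classes."""
    y = np.full((len(labels), num_classes), eps / (num_classes - 1))
    y[np.arange(len(labels)), labels] = 1.0 - eps
    return y

def soft_contrastive_loss(img_emb, txt_emb, eps=0.1, temperature=0.07):
    """Cross-entropy between softened pairing targets and the softmax over
    image-to-text cosine similarities within the batch."""
    # L2-normalize so the dot product is cosine similarity
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature          # (B, B) similarity matrix
    n = logits.shape[0]
    targets = smooth_labels(np.arange(n), n, eps)  # diagonal = matched pairs
    # numerically stable log-softmax over each row
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -(targets * log_probs).sum(axis=1).mean()
```

Softening the targets keeps the model from pushing matched pairs to an extreme margin, which is the intuition behind the smoother, more robust embedding boundaries described above.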
This code is based on DSCMR-master. Many thanks to its authors for open-sourcing their code.
python main.py
The full experimental procedure and results are available in sccmr.ipynb.
If you find this work useful for your research, please consider citing:
@Article{app14051944,
AUTHOR = {Song, Jiayu and Hu, Yuxuan and Zhu, Lei and Zhang, Chengyuan and Zhang, Jian and Zhang, Shichao},
TITLE = {Soft Contrastive Cross-Modal Retrieval},
JOURNAL = {Applied Sciences},
VOLUME = {14},
YEAR = {2024},
NUMBER = {5},
ARTICLE-NUMBER = {1944},
URL = {https://www.mdpi.com/2076-3417/14/5/1944},
ISSN = {2076-3417},
DOI = {10.3390/app14051944}
}