Implementation of our AAAI 2022 paper, "Show Your Faith: Cross-Modal Confidence-Aware Network for Image-Text Matching".
The vocab is available here: vocab.
ims_bbx.npy, ims_size.npy, precaps_stan.txt for Flickr30K: f30k_precomp.
ims_bbx.npy, ims_size.npy, precaps_stan.txt for MSCOCO: coco_precomp.
ims_dir_selfadj{4, 8, 12}.npy can be created by running adj.py in the ./context_extractor/ directory.
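Before training, it may help to confirm that the downloaded and generated files listed above are all in place. The snippet below is a minimal sketch, not part of the released code: the helper name `missing_files` is hypothetical, and it assumes all six files for one dataset sit directly in a single directory, which may not match the layout the training scripts expect.

```python
from pathlib import Path

# File names taken from the list above (downloaded per dataset).
DOWNLOADED = ["ims_bbx.npy", "ims_size.npy", "precaps_stan.txt"]

# ims_dir_selfadj{4, 8, 12}.npy expands to three files produced by adj.py.
GENERATED = [f"ims_dir_selfadj{k}.npy" for k in (4, 8, 12)]


def missing_files(data_dir):
    """Return the expected file names not found directly under data_dir.

    Assumption: every file lives in the same directory. Adjust if the
    repository stores the adjacency files elsewhere.
    """
    root = Path(data_dir)
    return [name for name in DOWNLOADED + GENERATED
            if not (root / name).is_file()]
```

For example, `missing_files("data/f30k_precomp")` (the path is illustrative) returns an empty list when everything is present, and otherwise the names that still need to be downloaded or generated.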
The model trained on the Flickr30K dataset is available here: checkpoint_f30k_c499.3.pth.
If you have any questions, please contact huatianzhang@mail.ustc.edu.cn for a prompt reply. Thanks.