
Deep Potential Semantic-aware Hashing for Cross-modal Retrieval

This paper has been accepted for publication in Engineering Applications of Artificial Intelligence (EAAI).

Training

Processing the datasets

Before training, you need to download the original data: COCO (the 2017 train and val images plus annotations), NUS-WIDE (Google Drive), and MIRFlickr-25K (Baidu, extraction code: u9e1, or Google Drive; includes mirflickr25k and mirflickr25k_annotations_v080). Then run the "data/make_XXX.py" scripts to generate the .mat files.

After all .mat files have been generated, the dataset directory should look like this (a loading sanity check follows the tree):

dataset
├── base.py
├── __init__.py
├── dataloader.py
├── coco
│   ├── caption.mat 
│   ├── index.mat
│   └── label.mat 
├── flickr25k
│   ├── caption.mat
│   ├── index.mat
│   └── label.mat
└── nuswide
    ├── caption.txt  # Notice! It is a txt file!
    ├── index.mat 
    └── label.mat
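
As a quick sanity check, the sketch below (a minimal example, not part of the repo; the variable names inside the .mat files are assumptions, so inspect the loaded dicts for the real keys) verifies that the generated files can be read:

```python
import scipy.io as sio

# Load the generated .mat files for one dataset; loadmat returns a dict
# mapping variable names to arrays (names depend on data/make_XXX.py).
index = sio.loadmat("dataset/coco/index.mat")
label = sio.loadmat("dataset/coco/label.mat")
caption = sio.loadmat("dataset/coco/caption.mat")
print(index.keys(), label.keys(), caption.keys())

# nuswide stores its captions as a plain .txt file, one entry per line.
with open("dataset/nuswide/caption.txt", encoding="utf-8") as f:
    captions = f.read().splitlines()
print(f"{len(captions)} nuswide captions")
```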

Download the CLIP pretrained model

The pretrained model URLs are listed around line 30 of CLIP/clip/clip.py. This code is based on the "ViT-B/32" model.

Copy ViT-B-32.pt to this directory.
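
A minimal loading sketch, assuming the vendored CLIP package follows the OpenAI CLIP API (where clip.load accepts either a model name or a local checkpoint path; the import path below is an assumption):

```python
import torch
from CLIP.clip import clip  # the repo's vendored copy of OpenAI CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"
# Passing the local checkpoint path avoids re-downloading the weights.
model, preprocess = clip.load("ViT-B-32.pt", device=device)

text = clip.tokenize(["a photo of a dog"]).to(device)
with torch.no_grad():
    text_features = model.encode_text(text)
print(text_features.shape)  # ViT-B/32 produces 512-dimensional features
```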

Start

After the datasets have been prepared, run the following command to train:

python main.py

Citation

@article{wu2026deep,
  title={Deep Potential Semantic-aware Hashing for Cross-modal Retrieval},
  author={Wu, Lei and Qin, Qibing and Dai, Jiangyan and Huang, Lei and Zhang, Wenfeng},
  journal={Engineering Applications of Artificial Intelligence},
  volume={169},
  pages={114155},
  year={2026},
  publisher={Elsevier}
}
