WSDHQ: Weakly Supervised Deep Hyperspherical Quantization for Image Retrieval

1. Introduction

This repository provides the code for our paper at AAAI 2021:

Weakly Supervised Deep Hyperspherical Quantization for Image Retrieval. Jinpeng Wang, Bin Chen, Qiang Zhang, Zaiqiao Meng, Shangsong Liang, Shu-Tao Xia. [link].

We propose WSDHQ, a weakly supervised deep quantization approach for image retrieval. Instead of requiring ground-truth labels, WSDHQ uses the informal tags provided by amateur users to guide quantization learning, which reduces the reliance on manual annotation and makes industrial deployment more feasible. WSDHQ includes a correlation-based tag processing mechanism that enhances the weak semantics of such noisy tags. It also learns quantized representations on the hypersphere manifold, where we design a novel adaptive cosine margin loss for embedding learning and a supervised cosine quantization loss for quantization. Experiments on the Flickr25K and NUS-WIDE datasets demonstrate the superiority of WSDHQ.
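To make "quantized representations on the hypersphere" more concrete, here is a minimal, generic sketch of assigning an embedding to its most similar codeword by cosine similarity. It is not the WSDHQ loss or training procedure; the function names and the toy codebook are hypothetical and chosen only for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so that it lies on the hypersphere."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in vec]


def cosine_quantize(embedding, codebook):
    """Return (index, unit codeword) of the codeword most similar to the embedding.

    Both the embedding and the codewords are L2-normalized first, so the
    dot product below equals the cosine similarity.
    """
    unit = l2_normalize(embedding)
    unit_codebook = [l2_normalize(c) for c in codebook]
    sims = [sum(a * b for a, b in zip(unit, c)) for c in unit_codebook]
    best = max(range(len(sims)), key=sims.__getitem__)
    return best, unit_codebook[best]
```

With a codebook of 256 codewords, the returned index fits in 8 bits, which is how a single codebook can yield an 8-bit code in this kind of scheme.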

The following sections walk you through using this repository step by step. 🤗

2. Preparation

git clone https://github.com/gimpong/AAAI21-WSDHQ.git
cd AAAI21-WSDHQ/
tar -xvzf data.tar.gz
rm -f data.tar.gz

2.1 Requirements

python 3.7.8
numpy 1.19.1
scikit-learn 0.23.1
h5py 2.10.0
python-opencv 3.4.2
tqdm 4.51.0
tensorflow 1.15.0
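
A quick way to compare an environment against the versions listed above is sketched below. The mapping from package names to import names (for example scikit-learn to sklearn and the OpenCV package to cv2) is an assumption, and the helper names are hypothetical.

```python
import importlib

# Pinned versions from the list above, keyed by import name (assumed mapping).
PINNED = {
    "numpy": "1.19.1",
    "sklearn": "0.23.1",
    "h5py": "2.10.0",
    "cv2": "3.4.2",
    "tqdm": "4.51.0",
    "tensorflow": "1.15.0",
}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def find_mismatches(pinned, lookup=installed_version):
    """Return {module: (expected, found)} for every module whose version differs."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

Calling find_mismatches(PINNED) returns an empty dictionary when every listed package is installed at exactly the pinned version. A mismatch does not necessarily mean the code will fail; it only flags a difference from the tested setup.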

2.2 Download and organize the image datasets and pre-trained models

Before running the code, make sure everything is in place. The working directory is expected to be organized as follows:

AAAI21-WSDHQ/
    data/
        flickr25k/
            tags/
                FinalTagEmbs.txt
                TagIdMergeMap.pkl
            common_tags.txt
            database_img.txt
            database_label.txt
            train_img.txt
            train_tag.txt
            test_img.txt
            test_label.txt
        nus-wide/
            tags/
                FinalTagEmbs.txt
                TagIdMergeMap.pkl
            TagList1k.txt
            database_img.txt
            database_label.txt
            train_img.txt
            train_tag.txt
            test_img.txt
            test_label.txt
    datasets/
        GoogleNews-vectors-negative300.bin.gz
        flickr25k/
            mirflickr/
                im1.jpg
                im2.jpg
                ...
        nus-wide/
            Flickr/
                actor/
                    0001_2124494179.jpg
                    0002_174174086.jpg
                    ...
                administrative_assistant/
                    ...
                ...
    scripts/
        run0001.sh
        run0002.sh
        ...
        tag_processing.sh
    train.py
    validation.py
    net.py
    net_val.py
    util.py
    dataset.py
    alexnet.npy

Notes

The data/ folder contains the data splits for the Flickr25K and NUS-WIDE datasets. The raw images of the two datasets must be downloaded separately and placed in datasets/flickr25k/ and datasets/nus-wide/, respectively. We provide copies of these image datasets, which you can download via Google Drive or Baidu Wangpan (Web Drive, password: ocmv).
The pre-trained AlexNet (alexnet.npy) and Word2Vec (GoogleNews-vectors-negative300.bin.gz) files can be downloaded from Baidu Wangpan (Web Drive, password: ocmv).
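
Before training, it can help to confirm that the expected files are where the layout above places them. The sketch below checks a representative subset of paths; the nesting is read from the layout in this section, and the helper name missing_paths is hypothetical.

```python
import os

# A representative subset of the layout described above, relative to AAAI21-WSDHQ/.
EXPECTED_PATHS = [
    "data/flickr25k/tags/FinalTagEmbs.txt",
    "data/flickr25k/tags/TagIdMergeMap.pkl",
    "data/flickr25k/train_img.txt",
    "data/nus-wide/tags/FinalTagEmbs.txt",
    "data/nus-wide/tags/TagIdMergeMap.pkl",
    "data/nus-wide/train_img.txt",
    "datasets/GoogleNews-vectors-negative300.bin.gz",
    "datasets/flickr25k/mirflickr",
    "datasets/nus-wide/Flickr",
    "alexnet.npy",
]


def missing_paths(root, expected=EXPECTED_PATHS):
    """Return the expected paths that do not exist under root, in list order."""
    return [
        p for p in expected
        if not os.path.exists(os.path.join(root, *p.split("/")))
    ]
```

Running missing_paths(".") from the repository root lists anything still to be downloaded or extracted; an empty list means the checked subset is complete.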

3. Enhance the weak semantic information of tags via preprocessing (Optional)

We provide enhanced tag embeddings in this repository; see data/flickr25k/tags/ and data/nus-wide/tags/. To reproduce these files, remove them and run:

cd scripts/
# '0' is the id of GPU
bash tag_processing.sh 0

4. Train and then evaluate

To facilitate reproducibility, we provide a script with the configuration for each experiment. The scripts are located in the scripts/ folder. For example, to train and evaluate an 8-bit WSDHQ model on the Flickr25K dataset, run:

cd scripts/
# '0' is the id of GPU
bash run0001.sh 0

The script run0001.sh contains the following commands:

#!/bin/bash

cd ..

##8 bits
#                     dataset  lr      iter  lambda    subspace_num  loss   notes  gpu
python train.py       flickr   0.0003  800   0.0001    1             WSDQH  0001   $1
#                     dataset  model_weight                                                                 gpu
python validation.py  flickr   ./checkpoints/flickr_WSDQH_nbits=8_adaMargin_gamma=1_lambda=0.0001_0001.npy  $1

cd -
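
To run several experiments back to back, the per-script invocation shown earlier can be wrapped in a small driver. This is a sketch only: build_commands and run_all are hypothetical helpers, and the script names follow the runXXXX.sh pattern used in this repository.

```python
import subprocess


def build_commands(first_id, last_id, gpu_id):
    """Return one ['bash', 'runXXXX.sh', gpu] command per script id in the range."""
    return [
        ["bash", "run{:04d}.sh".format(i), str(gpu_id)]
        for i in range(first_id, last_id + 1)
    ]


def run_all(commands, scripts_dir="scripts"):
    """Run the commands sequentially from inside scripts_dir.

    The scripts start with `cd ..`, so they are launched from scripts/ just
    like the manual invocation. check=True stops at the first failing script.
    """
    for cmd in commands:
        subprocess.run(cmd, cwd=scripts_dir, check=True)
```

For example, run_all(build_commands(1, 4, 0)) called from the repository root would run run0001.sh through run0004.sh on GPU 0, one after another.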

After running a script, a series of files will be saved under logs/ and checkpoints/. Take run0001.sh as an example:

AAAI21-WSDHQ/
    logs/
        flickr_WSDQH_nbits=8_adaMargin_gamma=1_lambda=0.0001_0001.log
    checkpoints/
        flickr_WSDQH_nbits=8_adaMargin_gamma=1_lambda=0.0001_0001.npy
        flickr_WSDQH_nbits=8_adaMargin_gamma=1_lambda=0.0001_0001_retrieval.h5
    ...
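
The output file names follow a regular pattern, so they can be derived from the run configuration. The sketch below infers that pattern from the run0001.sh example alone; the fixed adaMargin_gamma=1 segment and the helper name output_paths are assumptions, and the actual naming logic lives in the training code.

```python
def output_paths(dataset, loss, nbits, lam, notes):
    """Build the log, checkpoint, and retrieval file paths for one run.

    lam and notes are passed as strings so that values such as '0.0001'
    and '0001' appear exactly as they do in the file names.
    """
    stem = "{}_{}_nbits={}_adaMargin_gamma=1_lambda={}_{}".format(
        dataset, loss, nbits, lam, notes
    )
    return {
        "log": "logs/" + stem + ".log",
        "checkpoint": "checkpoints/" + stem + ".npy",
        "retrieval": "checkpoints/" + stem + "_retrieval.h5",
    }
```

For run0001.sh, output_paths("flickr", "WSDQH", 8, "0.0001", "0001") reproduces the three file names listed above.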

The following table reports the results of running the scripts on a GTX 1080 Ti. We have also uploaded the logs and checkpoint information for reference; they can be downloaded from Baidu Wangpan (Web Drive, password: ocmv).

Note that some values may deviate slightly from the results reported in our original paper. These deviations are caused by the randomness of TensorFlow and by differences in software and hardware.

Script	Dataset	Code Length (bits)	MAP	Log
run0001.sh	Flickr25K	8	0.766	flickr_WSDQH_nbits=8_adaMargin_gamma=1_lambda=0.0001_0001.log
run0002.sh	Flickr25K	16	0.755	flickr_WSDQH_nbits=16_adaMargin_gamma=1_lambda=0.0001_0002.log
run0003.sh	Flickr25K	24	0.765	flickr_WSDQH_nbits=24_adaMargin_gamma=1_lambda=0.0001_0003.log
run0004.sh	Flickr25K	32	0.767	flickr_WSDQH_nbits=32_adaMargin_gamma=1_lambda=0.0001_0004.log
run0005.sh	NUS-WIDE	8	0.717	nuswide_WSDQH_nbits=8_adaMargin_gamma=1_lambda=0.0001_0005.log
run0006.sh	NUS-WIDE	16	0.727	nuswide_WSDQH_nbits=16_adaMargin_gamma=1_lambda=0.0001_0006.log
run0007.sh	NUS-WIDE	24	0.730	nuswide_WSDQH_nbits=24_adaMargin_gamma=1_lambda=0.0001_0007.log
run0008.sh	NUS-WIDE	32	0.729	nuswide_WSDQH_nbits=32_adaMargin_gamma=1_lambda=0.0001_0008.log
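
For readers unfamiliar with the MAP column, the sketch below computes the textbook mean average precision from ranked relevance flags. It is an illustration only: the evaluation in validation.py may use a ranking cutoff or a different normalization, and the function names here are hypothetical.

```python
def average_precision(relevance):
    """Average precision for one query.

    relevance is the ranked result list as 0/1 flags (1 = relevant).
    Precision is accumulated at each rank where a relevant item appears,
    then averaged over the number of relevant items retrieved.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(ranked_relevance_lists):
    """Mean of the per-query average precision values."""
    aps = [average_precision(r) for r in ranked_relevance_lists]
    return sum(aps) / len(aps) if aps else 0.0
```

A query whose results are ranked relevant, irrelevant, relevant has an average precision of (1/1 + 2/3) / 2, so placing relevant images earlier in the ranking raises the score.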

5. References

If you find this code useful or use the toolkit in your work, please consider citing:

@inproceedings{wang2021wsdhq,
  title={Weakly Supervised Deep Hyperspherical Quantization for Image Retrieval},
  author={Wang, Jinpeng and Chen, Bin and Zhang, Qiang and Meng, Zaiqiao and Liang, Shangsong and Xia, Shutao},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={35},
  number={4},
  pages={2755--2763},
  year={2021}
}

6. Acknowledgements

We use DeepHash as the code base in our implementation.

7. Contact

If you have any questions, you can raise an issue or email Jinpeng Wang (wjp20@mails.tsinghua.edu.cn). We will reply as soon as we can.
