
PAU

Introduction

The official implementation of the NeurIPS 2023 paper "Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval". It is built on top of CLIP4Clip and X-CLIP.

Update

2024.3.9: The image-text retrieval code has been updated. You can find it in the Image-Text-Retrieval/ directory.

Requirement

We recommend the following dependencies.

pip install -r requirements.txt
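
For example, a clean setup might look like this (the environment name and Python tooling are illustrative; any recent Python environment should work):

python -m venv pau_env
source pau_env/bin/activate
pip install -r requirements.txt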

Dataset Preparation

We follow the same splits provided by CLIP4Clip. You can follow its data preparation guide.

MSRVTT

The official data and video can be found here.

You can download the splits and captions by:

wget https://github.com/ArrowLuo/CLIP4Clip/releases/download/v0.0/msrvtt_data.zip
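
After downloading, you can unpack the archive into your data directory, for example (the target layout should follow the CLIP4Clip data preparation guide; ${DATA_PATH} is your dataset root):

unzip msrvtt_data.zip -d ${DATA_PATH}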

MSVD

The raw videos can be found here.

You can download the splits and captions by:

wget https://github.com/ArrowLuo/CLIP4Clip/releases/download/v0.0/msvd_data.zip

DiDeMo

The raw videos can be found here. The splits can be found here.

Checkpoint

We provide trained model files for evaluation. You can download the model trained on MSRVTT here, the model trained on MSVD here, and the model trained on DiDeMo here.

Training

Please set ${DATA_PATH} to the path of your dataset and ${SAVE_PATH} to the directory where checkpoints should be saved.

Tip: ${do_rerank_learn} indicates whether to automatically learn the model's beta parameter after each training run, which takes additional time. You can remove it if you wish to speed up the validation process. A sketch of the variables to edit is shown below.
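
As a minimal sketch, the variables near the top of scripts/run_msrvtt.sh might be edited as follows (paths are placeholders and the surrounding script contents are omitted; adjust them to your setup):

DATA_PATH=/path/to/msrvtt_data    # root directory of the prepared dataset
SAVE_PATH=/path/to/checkpoints    # directory where trained checkpoints are written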

MSR-VTT

sh scripts/run_msrvtt.sh

MSVD

sh scripts/run_msvd.sh

DiDeMo

sh scripts/run_didemo.sh

Rerank Beta Learning

If you want to obtain the best re-ranking beta parameters (this may take more time), please set ${DATA_PATH} to the path of your dataset and ${SAVE_PATH} to the directory where checkpoints should be saved.

You can freely construct a beta learning set, but it is preferable that its data has not been used during model training. By default, we use the validation set as the beta learning set. A sketch of the variables to edit is shown below.
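
As a minimal sketch, the variables to edit in scripts/run_msrvtt_learn.sh might look like this (paths are placeholders; the actual script contents may differ):

DATA_PATH=/path/to/msrvtt_data    # dataset root; the validation split serves as the beta learning set by default
SAVE_PATH=/path/to/checkpoints    # directory where the learned beta values are saved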

MSR-VTT

sh scripts/run_msrvtt_learn.sh

MSVD

sh scripts/run_msvd_learn.sh

DiDeMo

sh scripts/run_didemo_learn.sh

Evaluation

Please set ${DATA_PATH} to the path of your dataset, ${SAVE_PATH} to the directory where outputs should be saved, and ${MODEL_PATH} to the checkpoint to be loaded. ${rerank_coe_v} and ${rerank_coe_t} are the re-rank parameters ($\beta_1$, $\beta_2$) obtained in the beta learning process. A sketch of the variables to edit is shown below.
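
As a minimal sketch, the variables to edit in an evaluation script such as scripts/run_msrvtt_eval.sh might look like this (all values are placeholders; the actual script contents and the best beta values may differ):

DATA_PATH=/path/to/msrvtt_data       # root directory of the prepared dataset
SAVE_PATH=/path/to/outputs           # directory where evaluation outputs are saved
MODEL_PATH=/path/to/trained_model    # checkpoint to be loaded
rerank_coe_v=0.1                     # beta_1 obtained from beta learning (illustrative value)
rerank_coe_t=0.1                     # beta_2 obtained from beta learning (illustrative value)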

MSR-VTT

sh scripts/run_msrvtt_eval.sh

MSVD

sh scripts/run_msvd_eval.sh

DiDeMo

sh scripts/run_didemo_eval.sh

Reference

If you found this code useful, please cite the following paper:

@inproceedings{PAU,
  author    = {Hao Li and
               Jingkuan Song and
               Lianli Gao and
               Xiaosu Zhu and
               Heng Tao Shen},
  title     = {Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval},
  booktitle = {NeurIPS},
  year      = {2023}
}
