Harnessing the Power of MLLMs for Transferable Text-to-Image Person ReID (CVPR 2024)

Requirements

  • pytorch 1.9.0
  • torchvision 0.10.0
  • prettytable
  • easydict

1. Construct LUPerson-MLLM

  • Download the LUPerson images from here.
  • Use MLLMs to annotate the LUPerson images; take Qwen as an example. The code for image captioning is provided in the captions folder, where you will find 46 templates along with static and dynamic instructions (a minimal sketch of the annotation loop is shown after this list). You can download all the descriptions for the final LUPerson-MLLM from here.
  • Place the generated descriptions in the captions folder.
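
The snippet below is a minimal sketch of that annotation loop, assuming Qwen-VL-Chat loaded through HuggingFace transformers; the template file, image folder, and output file names are illustrative placeholders rather than the exact paths used by the provided captioning code.

import json
import random
from pathlib import Path

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load Qwen-VL-Chat as described on its model card.
MODEL_ID = "Qwen/Qwen-VL-Chat"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, device_map="cuda", trust_remote_code=True
).eval()

# Hypothetical paths: the 46 instruction templates and the LUPerson image folder.
templates = json.load(open("captions/templates.json"))
image_dir = Path("LUPerson/images")

annotations = {}
for image_path in sorted(image_dir.glob("*.jpg")):
    instruction = random.choice(templates)  # pick one template per image
    query = tokenizer.from_list_format([
        {"image": str(image_path)},
        {"text": instruction},
    ])
    description, _ = model.chat(tokenizer, query=query, history=None)
    annotations[image_path.name] = description

# Save the descriptions so they can be placed in the captions folder.
json.dump(annotations, open("captions/LUPerson_qwen_captions.json", "w"), indent=2)

Each image is paired with one randomly chosen template here; the resulting JSON is what gets placed in the captions folder in the last step above.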

2. Prepare Downstream Datasets

Download the CUHK-PEDES dataset from here, the ICFG-PEDES dataset from here, and the RSTPReid dataset from here.
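
As a quick sanity check after downloading, the snippet below reads a CUHK-PEDES-style annotation file; the file name and field names follow the public CUHK-PEDES release and should be verified against the dataloaders in this repo.

import json

# Assumed layout: the CUHK-PEDES release ships a reid_raw.json annotation file.
with open("CUHK-PEDES/reid_raw.json") as f:
    annotations = json.load(f)

sample = annotations[0]
print(sample["file_path"])  # relative path to the pedestrian image
print(sample["id"])         # person identity label
print(sample["captions"])   # list of textual descriptions for this image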

3. Pretrain the Model (direct transfer setting)

To pretrain the model, simply run sh run.sh. After training completes, the script reports the model's performance under the direct transfer setting.

4. Fine-tune the Pretrained Model on Downstream Datasets (fine-tune setting)

To fine-tune the pretrained model, simply run sh finetune.sh --finetune checkpoint.pth. After training completes, the script reports the model's performance under the fine-tune setting.

Acknowledgments

This repo borrows partially from IRRA.

Citation

@inproceedings{tan2024harnessing,
  title={Harnessing the Power of MLLMs for Transferable Text-to-Image Person ReID},
  author={Tan, Wentao and Ding, Changxing and Jiang, Jiayu and Wang, Fei and Zhan, Yibing and Tao, Dapeng},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2024},
}

Contact

Email: ftwentaotan@mail.scut.edu.cn or 731584671@qq.com

If possible, I would of course prefer that you contact me in Chinese!
