This paper was accepted at the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (CVPR 2023).
This repository contains the source code of the PyTorch implementation of FashionSAP.
We will add more details about the project ...
- The dependencies are listed in `requirements.txt`.
-
  - Download the raw file and extract it into the path `data_root`.
  - Set `data_root` and `split` in `prepare_dataset.py`, then run it to generate the assistance file.
-
  - Download the raw file and extract it into the path `data_root`.
  - Place the `captions` and `images` directories from the raw file in `data_root`. In addition, we merge all the train files into a single `cap.train.json` file in `captions`, and do the same for `val`.
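
The merge of the caption files described above could be sketched as follows. This is a minimal sketch under stated assumptions, not the script used in this repository: it assumes the per-category files in `captions` are named `cap.<category>.<split>.json` and that each one holds a JSON list. The function name `merge_caption_files` is hypothetical.

```python
import json
from pathlib import Path


def merge_caption_files(captions_dir, split):
    """Merge per-category caption files for one split into cap.<split>.json.

    Assumption (not confirmed by the repository): per-category files are
    named cap.<category>.<split>.json and each holds a JSON list.
    """
    captions_dir = Path(captions_dir)
    target = captions_dir / f"cap.{split}.json"
    merged = []
    # Sort the inputs so the merged order is deterministic.
    for path in sorted(captions_dir.glob(f"cap.*.{split}.json")):
        # Guard against reading a previously written merged file.
        if path == target:
            continue
        with path.open(encoding="utf-8") as f:
            merged.extend(json.load(f))
    with target.open("w", encoding="utf-8") as f:
        json.dump(merged, f)
    return target


# Hypothetical usage; replace data_root with your own path:
# merge_caption_files("data_root/captions", "train")
# merge_caption_files("data_root/captions", "val")
```

Calling the function once per split keeps the train and val merges independent of each other.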
- We define three downstream task names, referred to as `downstream_name`:
  - `retrieval`: covers two downstream tasks, text-to-image retrieval and image-to-text retrieval.
  - `catereg`: fashion-domain category recognition and subcategory recognition.
  - `tgir`: text-guided image retrieval, also called text-modified image retrieval.
- Run `bash run_pretrain.sh` to start the pre-training stage.
- Run `bash run_{downstream_name}.sh` to train and evaluate a downstream task.
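
The script name for each downstream task follows directly from the pattern `run_{downstream_name}.sh`. The snippet below is a small sketch of that mapping. The helper names `script_for` and `run_downstream` are hypothetical and not part of this repository, and the sketch assumes the scripts sit in the current working directory.

```python
import subprocess

# Downstream names as defined in this README.
DOWNSTREAM_NAMES = ("retrieval", "catereg", "tgir")


def script_for(downstream_name):
    """Return the script file name for a downstream task name."""
    if downstream_name not in DOWNSTREAM_NAMES:
        raise ValueError(f"unknown downstream_name: {downstream_name!r}")
    return f"run_{downstream_name}.sh"


def run_downstream(downstream_name):
    """Run the downstream script with bash from the current directory."""
    # check=True raises CalledProcessError if the script exits with a
    # non-zero status.
    subprocess.run(["bash", script_for(downstream_name)], check=True)
```

Validating the name first means a typo fails with a clear error rather than a missing-file message from bash.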
- Our pre-trained model can be downloaded from Google Drive.
If you find this code useful for your research, please cite:
@inproceedings{FashionSAP,
title={FashionSAP: Symbols and Attributes Prompt for Fine-grained Fashion Vision-Language Pre-training},
author={Han, Yunpeng and Zhang, Lisai and Chen, Qingcai and Chen, Zhijian and Li, Zhonghua and Yang, Jianxin and Cao, Zhao},
year={2023},
booktitle={CVPR}
}
Some utility code is adapted from the ALBEF project.