- Update 2023.03.25
- Align before Fuse: Vision and Language Representation Learning with Momentum Distillation : [paper] [code], NeurIPS 2021
- FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks : [paper] [code], CVPR 2023
- Masked Vision-language Transformer in Fashion : [paper] [code], Machine Intelligence Research 2022
- FaD-VLP: Fashion Vision-and-Language Pre-training towards Unified Retrieval and Captioning : [paper], EMNLP 2022
- FashionViL: Fashion-Focused Vision-and-Language Representation Learning : [paper] [code], ECCV 2022
- Kaleido-BERT: Vision-Language Pre-training on Fashion Domain : [paper] [code], CVPR 2021
- FashionBERT: Text and Image Matching with Adaptive Loss for Cross-modal Retrieval : [paper], SIGIR 2020
- Contrastive language and vision learning of general fashion concepts : [paper]