GitHub - chullhwan-song/Reading-Paper-VL

Vision and Language Pre-training

Align before Fuse: Vision and Language Representation Learning with Momentum Distillation : [paper] [code], NeurIPS 2021

FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks : [paper] [code], CVPR 2023
Masked Vision-language Transformer in Fashion : [paper] [code], Machine Intelligence Research 2022
FaD-VLP: Fashion Vision-and-Language Pre-training towards Unified Retrieval and Captioning : [paper], EMNLP 2022
FashionViL: Fashion-Focused Vision-and-Language Representation Learning : [paper] [code], ECCV 2022
Kaleido-BERT: Vision-Language Pre-training on Fashion Domain : [paper] [code], CVPR 2021
FashionBERT: Text and Image Matching with Adaptive Loss for Cross-modal Retrieval : [paper], SIGIR 2020
