DreamLIP: Language-Image Pre-training with Long Captions
Kecheng Zheng, Yifei Zhang, Wei Wu, Fan Lu, Shuailei Ma, Xin Jin, Wei Chen, Yujun Shen
Project Page | Paper | Data
- [2024/03/27] Long captions (LLaVA-1.5, InstructBLIP, and ShareGPT4V) of CC3M are released here!
- 🔥 Exploring how language-image pre-training could benefit from long captions.
- 🔥 Strong improvements on semantic segmentation, image-text retrieval, and image understanding in MLLMs.
- 🔥 DreamLIP trained with 30M image-text pairs achieves performance on par with, or even better than, CLIP trained with 400M pairs (a sketch of the core idea follows below).
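For intuition, here is a minimal, hypothetical sketch of contrastive pre-training with sub-captions sampled from long captions. It is not the official DreamLIP implementation: the sentence-splitting heuristic, the loss formulation, and all function names are illustrative assumptions.

```python
import random

import torch
import torch.nn.functional as F


def sample_subcaption(long_caption: str) -> str:
    """Pick one sentence from a long caption as a training target.

    Splitting on '.' is a simplification; the actual pipeline may
    segment sub-captions differently.
    """
    sentences = [s.strip() for s in long_caption.split(".") if s.strip()]
    return random.choice(sentences) if sentences else long_caption


def contrastive_loss(image_feats: torch.Tensor,
                     text_feats: torch.Tensor,
                     logit_scale: torch.Tensor) -> torch.Tensor:
    """Symmetric InfoNCE over a batch of image / sub-caption pairs.

    image_feats: (B, D) L2-normalized image embeddings.
    text_feats:  (B, D) L2-normalized embeddings of one sub-caption
                 sampled per image from its long caption.
    """
    logits = logit_scale * image_feats @ text_feats.t()          # (B, B) similarity matrix
    labels = torch.arange(logits.size(0), device=logits.device)  # diagonal entries are positives
    return (F.cross_entropy(logits, labels) +
            F.cross_entropy(logits.t(), labels)) / 2
```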
- [x] Release long captions of CC3M.
- [ ] Release long captions of CC12M, YFCC15M, Laion20M, and COYO4M.
- [ ] Upload pretrained weights of ViT-B/16 and ViT-B/32 trained on CC3M, CC12M, YFCC15M, and merged-30M (a loading sketch is given below).
- [ ] Release evaluation code.
- [ ] Release training code.
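Once the checkpoints are uploaded, loading one will presumably look like the following. This is a sketch assuming the released weights are OpenCLIP-compatible; the file name is a placeholder, not a released artifact.

```python
import torch
import open_clip

# Sketch, assuming an OpenCLIP-compatible ViT-B/16 checkpoint;
# "dreamlip_vitb16_cc3m.pt" is a placeholder file name.
model, _, preprocess = open_clip.create_model_and_transforms("ViT-B-16")
state = torch.load("dreamlip_vitb16_cc3m.pt", map_location="cpu")
# Some checkpoints nest the weights under a "state_dict" key.
model.load_state_dict(state.get("state_dict", state))
model.eval()
```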
| Dataset | Raw | InstructBLIP | LLaVA-1.5 | ShareGPT4V | ALL |
|---|---|---|---|---|---|
| CC3M | TODO | TODO | TODO | TODO | Link |
| CC12M | TODO | TODO | TODO | TODO | TODO |
| YFCC15M | TODO | TODO | TODO | TODO | TODO |
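A downloaded caption file might be read as below. The schema is only a guess (a JSON mapping from image key to per-captioner texts); check the released files for the actual format and keys.

```python
import json

# Hypothetical schema: {image_id: {"raw": ..., "instructblip": ...,
# "llava": ..., "sharegpt4v": ...}}. The released files may differ.
with open("cc3m_long_captions.json") as f:
    records = json.load(f)

# Preview the first few entries and their captions from each source.
for image_id, caps in list(records.items())[:3]:
    print(image_id)
    for source, caption in caps.items():
        print(f"  {source}: {caption[:80]}...")
```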
TODO
@article{DreamLIP,
  title={DreamLIP: Language-Image Pre-training with Long Captions},
  author={Zheng, Kecheng and Zhang, Yifei and Wu, Wei and Lu, Fan and Ma, Shuailei and Jin, Xin and Chen, Wei and Shen, Yujun},
  journal={arXiv preprint arXiv:2403.17007},
  year={2024}
}
We thank the authors of InstructBLIP, ShareGPT4V, and LLaVA for releasing their pretrained models and code.