Datasets.md

  • An overview of multi-modal datasets proposed for large-scale pre-training.

  • Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks, [Paper] [Github]

  • LAION-5B: An open large-scale dataset for training next generation image-text models, [Paper] [Project]

  • COYO-700M: Image-Text Pair Dataset [Code]

| No. | Dataset | Year | Scale | Modality | Language | Available | URL |
|-----|---------|------|-------|----------|----------|-----------|-----|
| 01 | SBU Captions | 2011 | 1M | image-text | English | Yes | [Link] |
| 02 | Flickr30k | 2014 | 145K | image-text | English | Yes | [Link] |
| 03 | COCO | 2014 | 567K | image-text | English | Yes | [Link] |
| 04 | Visual Genome | 2017 | 5.4M | image-text | English | Yes | [Link] |
| 05 | VQA v2.0 | 2017 | 1.1M | image-text | English | Yes | [Link] |
| 06 | FashionGen | 2018 | 300K | image-text | English | Yes | [Link] |
| 07 | CC3M | 2018 | 3M | image-text | English | Yes | [Link] |
| 08 | GQA | 2019 | 1M | image-text | English | Yes | [Link] |
| 09 | LAIT | 2020 | 10M | image-text | English | No | - |
| 10 | CC12M | 2021 | 12M | image-text | English | Yes | [Link] |
| 11 | AltText | 2021 | 1.8B | image-text | English | No | - |
| 12 | TVQA | 2018 | 21,793 | video-text | English | Yes | [Link] |
| 13 | HT100M | 2019 | 136M | video-text | English | Yes | [Link] |
| 14 | WebVid2M | 2021 | 2.5M | video-text | English | Yes | [Link] |
| 15 | YFCC-100M | 2015 | 100M | image-text | English | Yes | [Link] |
| 16 | LAION-400M | 2021 | 400M | image-text | English | Yes | [Link] |
| 17 | RedCaps | 2021 | 12M | image-text | English | Yes | [Link] |
| 18 | Wukong | 2022 | 100M | image-text | Chinese | Yes | [Link] |
| 19 | CxC | 2021 | 24K | image-text | English | Yes | [Link] |
| 20 | Product1M | 2021 | 1M | image-text | Chinese | Yes | [Link] |
| 21 | WIT | 2021 | 37.5M | image-text | Multi-lingual | Yes | [Link] |
| 22 | JFT-300M | 2017 | 300M | image-text | English | No | - |
| 23 | JFT-3B | 2021 | 3000M | image-text | English | No | - |
| 24 | IG-3.5B-17k | 2018 | 3.5B | image-text | English | No | - |
| 25 | M6-Corpus | 2021 | 60M | image, image-text | Chinese | No | - |
| 26 | M5Product | 2021 | 6M | image, text, table, video, audio | English | Yes | [Link] |
| 27 | Localized Narratives | 2020 | 849K | image, audio, text, mouse trace | English | Yes | [Link] |
| 28 | RUC-CAS-WenLan | 2021 | 30M | image-text | Chinese | No | - |
| 29 | WuDaoMM | 2022 | 600M | image-text | Chinese | Yes | [Link] |
| 30 | MEP-3M | 2021 | 3M | image-text | Chinese | Yes | [Link] |
| 31 | WSCD | 2021 | 650M | image-text | Chinese | No | - |
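
The Scale column mixes suffixed values (K, M, B) with plain counts such as 21,793, so rows cannot be compared by size until the entries are normalized. Below is a minimal sketch of one way to do that, assuming the suffixes mean thousand, million, and billion. The helper name `parse_scale` and the handful of sample rows are illustrative only and are not part of this repository.

```python
import re

# Multipliers for the suffixes used in the Scale column (assumed meanings).
_SUFFIX = {"": 1, "K": 10**3, "M": 10**6, "B": 10**9}


def parse_scale(text: str) -> int:
    """Convert a Scale entry such as '145K', '1.8B' or '21,793' to an integer."""
    cleaned = text.strip().replace(",", "")
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KkMmBb]?)", cleaned)
    if match is None:
        raise ValueError(f"unrecognized scale: {text!r}")
    number, suffix = match.groups()
    # round() guards against floating-point noise in values like 5.4M.
    return round(float(number) * _SUFFIX[suffix.upper()])


# A few (dataset, scale) pairs copied from the table.
rows = [
    ("SBU Captions", "1M"),
    ("Flickr30k", "145K"),
    ("TVQA", "21,793"),
    ("AltText", "1.8B"),
]

# Order the sample rows from largest to smallest after normalizing.
ranked = sorted(rows, key=lambda row: parse_scale(row[1]), reverse=True)

for name, scale in ranked:
    print(f"{name}: {parse_scale(scale):,}")
```

The same function can be applied to every row once the table is loaded, for example to filter datasets above a given size. Note that the counts are not directly comparable across rows, since the unit (images, captions, clips, or pairs) varies by dataset.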