This repository serves as a mirror of the GenImage dataset, originally introduced in:
Mingjian Zhu, Hanting Chen, Qiangyu Yan, Xudong Huang, Guanyu Lin, Wei Li, Zhijun Tu, Hailin Hu, Jie Hu, and Yunhe Wang. GenImage: A Million-Scale Benchmark for Detecting AI-Generated Image. In the 37th Conference on Neural Information Processing Systems (NeurIPS), 2023.
It is intended to provide a redundant and accessible copy for research and non-commercial use.
⚠️ Note: This is not the original source. Proper citation to the original dataset is provided above.
Due to the large size of the GenImage dataset, we redistribute it in separate parts via Kaggle.
- Each part corresponds to a specific generative model.
- This structure allows easier access and download of the dataset.
The dataset is divided into multiple subsets, with each subset hosted separately on Kaggle:
- ADM: GenImage-ADM
- BigGAN: GenImage-BigGAN
- glide: GenImage-glide
- Midjourney:
  Note: The Midjourney subset is split into three parts due to its large size. Please download all parts and merge them before extraction:

      cat part_aa part_ab part_ac > midjourney.zip

- stable_diffusion_v_1_4: GenImage-stable_diffusion_v_1_4
- stable_diffusion_v_1_5: GenImage-stable_diffusion_v_1_5
- VQDM: GenImage-VQDM
- wukong: GenImage-wukong
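
For systems where `cat` is not available (for example, a plain Windows shell), the Midjourney merge step can be reproduced in Python. The following is a minimal sketch, not part of the original instructions: the function names `merge_parts` and `extract_archive` are hypothetical helpers introduced here, and the part names are taken from the `cat` command above. It assumes the parts are plain byte-wise splits of one zip file, which is what the `cat` command implies.

```python
import shutil
import zipfile
from pathlib import Path


def merge_parts(part_paths, out_path):
    """Concatenate the parts, in the order given, into a single file."""
    out_path = Path(out_path)
    with out_path.open("wb") as merged:
        for part in part_paths:
            with Path(part).open("rb") as chunk:
                # Copy in blocks so a large part never has to fit in memory.
                shutil.copyfileobj(chunk, merged, length=16 * 1024 * 1024)
    return out_path


def extract_archive(zip_path, dest_dir):
    """Do a basic sanity check on the merged file, then extract it."""
    # is_zipfile only looks for a valid end-of-archive record, so it catches
    # a missing final part but is not a full integrity check.
    if not zipfile.is_zipfile(zip_path):
        raise ValueError(f"{zip_path} does not look like a zip archive")
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
```

A typical call would be `merge_parts(["part_aa", "part_ab", "part_ac"], "midjourney.zip")` followed by `extract_archive("midjourney.zip", "midjourney")`, where the output directory name is only an example. The order of the parts matters, exactly as it does for `cat`.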
We also provide a separate validation subset of the GenImage dataset for researchers who only need to evaluate models. It is particularly useful for cross-domain testing, training-free methods, and evaluation-only research that does not require the full dataset.
- GenImage - Validation Set: GenImage-Validation
This repository mirrors the GenImage dataset.
The dataset is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
- Non-commercial use only
- Attribution required
- Share-alike applies to derivatives
All rights and credits belong to the original authors of the GenImage dataset.
This repository does not claim ownership and is intended solely for redistribution and accessibility purposes.