This repository is an official implementation of the ICPR 2024 paper "FIDAVL: Fake Image Detection and Attribution using Vision-Language Model".
☀️ If you find this work useful for your research, please kindly star our repo and cite our paper! ☀️
We are working hard on the following items:
✅ Release arXiv paper
- Release training scripts
- Release inference scripts
- Release checkpoints
In this paper, we study the problem of detecting and attributing synthetic images (e.g., images generated by GANs and diffusion models).
We introduce FIDAVL, a novel and efficient multitask method designed to detect fake images and attribute them to their respective source models. Leveraging a vision-language approach, FIDAVL exploits the synergy between vision and language, together with a soft prompt-tuning strategy, to accurately detect generated images and assign them to their originating generators.
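Since the inference scripts are not yet released, here is a minimal sketch of how a multitask vision-language answer could be mapped to detection and attribution labels. The answer template, the `parse_answer` helper, and the generator list are illustrative assumptions, not the paper's exact protocol:

```python
# Hedged sketch (not the official FIDAVL code): FIDAVL frames detection and
# attribution as a multitask vision-language problem, so its output can be
# read as a natural-language answer. The parser below is an illustrative
# assumption for turning such an answer into structured labels.

# Generator names taken from the training/evaluation sets described above.
KNOWN_GENERATORS = [
    "ldm", "stable diffusion v1.4", "glide",
    "progan", "stylegan", "diff-projectedgan",
]

def parse_answer(answer: str):
    """Map a free-form answer such as 'Yes, this image is fake, generated
    by GLIDE' to (is_fake, source). Returns source=None for real images or
    unrecognized generators."""
    text = answer.lower()
    is_fake = "fake" in text and "not fake" not in text
    source = None
    if is_fake:
        for name in KNOWN_GENERATORS:
            if name in text:
                source = name
                break
    return is_fake, source
```

This keeps the two tasks (detection and attribution) in a single textual output, which is the spirit of the multitask vision-language formulation.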
To train FIDAVL, we adopt images generated by LDM, Stable Diffusion v1.4, GLIDE, ProGAN, StyleGAN, and Diff-ProjectedGAN, restricted to a single class (bedroom), following "Towards the Detection of Diffusion Model Deepfakes". The original download link can be found here.
To evaluate FIDAVL, we consider the synthetic images from both GANs and diffusion models (DMs).
- GANs dataset: we test on 5 types of GANs, including ProGAN, StyleGAN, Diff-ProjectedGAN, Diff-StyleGAN2, and ProjectedGAN.
- DMs dataset: we test on 7 types of SOTA DMs, including LDM, ADM, DDPM, IDDPM, PNDM, Stable Diffusion v1.4, and GLIDE.
This project is built on the open-source repository AntifakePrompt. Thanks to its authors for their well-organized code!
@article{keita2024fidavl,
  title={FIDAVL: Fake Image Detection and Attribution using Vision-Language Model},
  author={Keita, Mamadou and Hamidouche, Wassim and Eutamene, Hessen Bougueffa and Taleb-Ahmed, Abdelmalik and Hadid, Abdenour},
  journal={arXiv preprint arXiv:2409.03109},
  year={2024}
}