# Neural Networks Final Project
### Reimplementation of the study: <br> ***"DE-FAKE: Detection and Attribution of Fake Images Generated by Text-to-Image Generation Models"* <br> by Zeyang Sha, Zheng Li, Ning Yu, Yang Zhang**

**Name**: *Laura Papi*

**Matricola**: *1760732*

# Project Description

The study cited above addresses the growing concern about the misuse of AI-generated images and argues that a tool is needed to detect these fake images and attribute them to their source.<br>
In particular, it points out the lack of research on images generated from a text prompt.
<br>

<br>
The study therefore proposes methods to answer the following three research questions (RQ):

- **RQ1**. Detection of images generated by text-to-image generation models

- **RQ2**. Attribution of the fake images to their source model

- **RQ3**. Analysis of how likely different text prompts are to generate authentic images

<br><br>
The following sections contain examples from my implementation of the described methods.<br><br>
The complete implementation of the models can be found in the source directory of the GitHub repository: __[Source Code](http://url)__


## RQ1. Detection of images generated by text-to-image generation models

The study proposes two detector models:

1. **Image-only detector**<br>A binary classifier that decides whether an input image is fake or real.

2. **Hybrid detector**<br>A binary classifier that decides whether an image is fake or real based on the input image and its corresponding text prompt.
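
As a rough illustration of how the two detectors differ in their inputs, here is a minimal sketch. Everything in it is an assumption made for the example: the feature vectors stand in for the output of some image or text encoder, the linear scorer stands in for the trained classifier, and concatenation is just one plausible way to combine image and prompt. The label convention (1 = real, 0 = fake) follows the dataset section.

```python
import math
from typing import Sequence


def _probability_real(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Logistic score of a linear model; a stand-in for any trained classifier."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    score = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-score))


def image_only_predict(image_features: Sequence[float],
                       weights: Sequence[float], bias: float = 0.0) -> int:
    """Return 1 (real) or 0 (fake) using the image features alone."""
    return int(_probability_real(image_features, weights, bias) >= 0.5)


def hybrid_predict(image_features: Sequence[float], text_features: Sequence[float],
                   weights: Sequence[float], bias: float = 0.0) -> int:
    """Return 1 (real) or 0 (fake) using image and prompt features together.

    Assumption: the two feature vectors are simply concatenated.
    """
    combined = list(image_features) + list(text_features)
    return int(_probability_real(combined, weights, bias) >= 0.5)


# Toy values, chosen only to exercise both functions.
image_only_label = image_only_predict([1.0, 2.0], [0.5, -0.5])
hybrid_label = hybrid_predict([1.0, 2.0], [3.0], [0.5, -0.5, 1.0])
```

The only structural difference is the extra text input: the hybrid variant sees a longer feature vector, so its classifier needs one weight per image feature plus one per text feature.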


### 1. Image-only detector

#### 1.1 Dataset
Each dataset consists of a set of N real images (labeled 1) and a set of N corresponding generated fake images (labeled 0).

##### 1.1.1 Data Collection
The training data is collected and generated in the following steps **(i)** and **(ii)**.

- **(i)** Real images are fetched from the MSCOCO dataset, together with their captions.

In [None]:
import sys
import os
#add the directory containing the scripts needed for this section to the module search path
sys.path.insert(10, '/home/parwal/Documents/GitHub/De-Fake_nn_final_project/src/imageonly_detector')
#TODO: figure out which module needs this path entry and move it to the right place

from src.imageonly_detector.MSCOCO_data_collection import fetchImagesFromMSCOCO

#SD+MSCOCO
fetchImagesFromMSCOCO("data/MSCOCO_for_SD/images", "data/MSCOCO_for_SD", 100)

#LD+MSCOCO --------------------------------------------------------------------------
fetchImagesFromMSCOCO("data/MSCOCO_for_LD/images", "data/MSCOCO_for_LD", 50)

#GLIDE+MSCOCO -----------------------------------------------------------------------
fetchImagesFromMSCOCO("data/MSCOCO_for_GLIDE/images", "data/MSCOCO_for_GLIDE", 50)

#DALL-E2+MSCOCO ---------------------------------------------------------------------
#TODO
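
The internals of `fetchImagesFromMSCOCO` are not shown here, so the following is only a sketch of the selection step it presumably performs: picking N images that have a caption. It works directly on a dictionary in the MSCOCO captions annotation layout (`images` entries with `id` and `file_name`, `annotations` entries with `image_id` and `caption`). The function name and the "first caption wins" rule are assumptions made for the example.

```python
from typing import Dict, List, Tuple


def select_images_with_captions(coco: dict, n: int) -> List[Tuple[str, str]]:
    """Return up to n (file_name, caption) pairs, keeping one caption per image."""
    # Remember the first caption seen for every image id.
    first_caption: Dict[int, str] = {}
    for annotation in coco["annotations"]:
        first_caption.setdefault(annotation["image_id"], annotation["caption"].strip())

    pairs: List[Tuple[str, str]] = []
    for image in coco["images"]:
        if len(pairs) >= n:
            break
        caption = first_caption.get(image["id"])
        if caption is not None:  # skip images without any caption
            pairs.append((image["file_name"], caption))
    return pairs


# Tiny hand-written sample in the same layout (not real MSCOCO data).
sample = {
    "images": [
        {"id": 1, "file_name": "a.jpg"},
        {"id": 2, "file_name": "b.jpg"},
        {"id": 3, "file_name": "c.jpg"},
    ],
    "annotations": [
        {"image_id": 2, "caption": "A dog on a beach. "},
        {"image_id": 1, "caption": "A red bus."},
        {"image_id": 1, "caption": "A bus parked on a street."},
    ],
}
pairs = select_images_with_captions(sample, 2)
```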

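- **(ii)** The captions of the MSCOCO images are used as input prompts for the text-to-image generators: Stable Diffusion (SD) for the training data, plus Latent Diffusion (LD) and GLIDE for the additional evaluation sets.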

In [None]:
#SD+MSCOCO --------------------------------------------------------------------------
#use the stable-diffusion API to generate 100 fake images from the 100 captions collected before
#note: I changed the directories inside the script before running it
%run src/imageonly_detector/SD_MSCOCO_data_generation.py

#LD+MSCOCO --------------------------------------------------------------------------
#reset the current directory to the De-Fake project root, otherwise the script to run is not found
#this is needed because LD_MSCOCO_data_generation.py changes the directory to the latent-diffusion one
os.chdir("/home/parwal/Documents/GitHub/De-Fake_nn_final_project")
%run src/imageonly_detector/LD_MSCOCO_data_generation.py

#GLIDE+MSCOCO -----------------------------------------------------------------------
#I HAVE NEVER TRIED RUNNING THIS, since doing so regenerates the model (3 GB)
#try running it at the very end, to be safe
%run src/imageonly_detector/GLIDE_MSCOCO_data_generation.ipynb

#DALL-E2+MSCOCO ---------------------------------------------------------------------
#TODO (only available through the paid API, to be evaluated)
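
The comments in the cell above note that one of the generation scripts changes the working directory, which then has to be reset by hand with `os.chdir`. One way to make this less fragile is a small context manager that always restores the previous directory. This is a sketch, not part of the project code, and the name `working_directory` is made up for the example (Python 3.11 and later ship an equivalent as `contextlib.chdir`).

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily switch to `path`, then return to the previous directory."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Runs even if the wrapped code raises or changes directory itself.
        os.chdir(previous)


# Demonstration with a throwaway directory.
start = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    with working_directory(tmp):
        inside = os.path.realpath(os.getcwd())
    expected_inside = os.path.realpath(tmp)
after = os.getcwd()
```

Because the restore happens in the `finally` clause, the notebook ends up back in the project root even when the wrapped step fails halfway through.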

##### 1.1.2 Dataset Construction
The collected and generated data is then arranged in the following structure, so that it can be used in the training and evaluation steps:<br><br>
train/<br>
    ├── class_0/<br>
    │   ├── ...<br>
    │   └── all the fake images<br>
    ├── class_1/<br>
    │   ├── ...<br>
    │   └── all the real images<br>
val/<br>
    ├── class_0/<br>
    │   ├── ...<br>
    │   └── all the fake images<br>
    ├── class_1/<br>
    │   ├── ...<br>
    │   └── all the real images<br>

In [None]:
#transform the collected data in the previously described structure

from src.imageonly_detector.format_dataset import formatIntoDataset, formatIntoTrainTest

#SD+MSCOCO
#this function generates a pair of datasets (train and val), starting from data from the Stable Diffusion generation
#the data generated from SD contains 100 images, this original dataset is split in half (50 for train, 50 for test)
formatIntoTrainTest("data/MSCOCO_for_SD/images", "data/SD+MSCOCO/images", "data/imageonly_detector_data")
print("ok SD")

#LD+MSCOCO --------------------------------------------------------------------------
formatIntoDataset("data/MSCOCO_for_LD/images", "../latent-diffusion/outputs/txt2img-samples", "data/imageonly_detector_data/val_LD")
print("ok LD")

#GLIDE+MSCOCO -----------------------------------------------------------------------
formatIntoDataset("data/MSCOCO_for_GLIDE/images", "data/GLIDE+MSCOCO/images", "data/imageonly_detector_data/val_GLIDE")
print("ok GLIDE")

#DALL-E2+MSCOCO ---------------------------------------------------------------------
#TODO (only available through the paid API, to be evaluated)
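
The code of `formatIntoTrainTest` is not reproduced in this notebook, so the following sketch only illustrates how a directory layout like the one described above could be produced. The function name `build_split`, the sorted-order split and the `train_fraction` parameter are assumptions; the real script may shuffle or pair the files differently.

```python
import shutil
import tempfile
from pathlib import Path


def build_split(real_dir, fake_dir, out_dir, train_fraction=0.5):
    """Copy real images into class_1 and fake images into class_0, under train/ and val/."""
    out = Path(out_dir)
    counts = {}
    for label, source in (("class_1", Path(real_dir)), ("class_0", Path(fake_dir))):
        files = sorted(p for p in source.iterdir() if p.is_file())
        cut = int(len(files) * train_fraction)
        for split, chunk in (("train", files[:cut]), ("val", files[cut:])):
            target = out / split / label
            target.mkdir(parents=True, exist_ok=True)
            for file in chunk:
                shutil.copy2(file, target / file.name)
            counts[(split, label)] = len(chunk)
    return counts


# Demonstration on empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("real", "fake"):
        (root / name).mkdir()
        for i in range(4):
            (root / name / f"{i}.png").write_bytes(b"")
    counts = build_split(root / "real", root / "fake", root / "dataset")
    layout = sorted(
        p.relative_to(root / "dataset").as_posix()
        for p in (root / "dataset").rglob("*.png")
    )
```

With the directory names `class_0` and `class_1`, a loader that assigns labels by sorted folder name (as torchvision's `ImageFolder` does) maps fake images to 0 and real images to 1, matching the labeling used in this project.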

#### 1.2 Detector

The model is defined and trained by the script executed in the following code block.

In [13]:
#this function trains the model and tests it at every epoch
#both the test and train datasets are generated using SD
%run src/imageonly_detector/train.py

train dataset size: 100
test dataset size: 98
Epoch 0/49
----------
train Loss: 0.6691 Acc: 0.6200
val Loss: 0.5134 Acc: 0.6837

Epoch 1/49
----------
train Loss: 0.4941 Acc: 0.7400
val Loss: 0.6277 Acc: 0.6939

Epoch 2/49
----------
train Loss: 0.3981 Acc: 0.8000
val Loss: 0.2936 Acc: 0.8571

Epoch 3/49
----------
train Loss: 0.3812 Acc: 0.8300
val Loss: 0.3056 Acc: 0.8776

Epoch 4/49
----------
train Loss: 0.6900 Acc: 0.7600
val Loss: 0.4617 Acc: 0.7755

Epoch 5/49
----------
train Loss: 0.4494 Acc: 0.8100
val Loss: 0.4298 Acc: 0.8163

Epoch 6/49
----------
train Loss: 0.4004 Acc: 0.8500
val Loss: 0.7959 Acc: 0.7449

Epoch 7/49
----------
train Loss: 0.4696 Acc: 0.8400
val Loss: 0.3766 Acc: 0.8571

Epoch 8/49
----------
train Loss: 0.5092 Acc: 0.8000
val Loss: 0.2665 Acc: 0.8878

Epoch 9/49
----------
train Loss: 0.2969 Acc: 0.8600
val Loss: 0.2440 Acc: 0.9286

Epoch 10/49
----------
train Loss: 0.2016 Acc: 0.9200
val Loss: 0.2715 Acc: 0.9184

Epoch 11/49
----------
train Loss: 0.286
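
Validation accuracy fluctuates noticeably from epoch to epoch in the log above, so it can be useful to extract the best epoch programmatically. The sketch below parses text in the same format as the printed output; the helper names are made up for the example, and incomplete lines (such as a cut-off final line) are simply ignored.

```python
import re

EPOCH_RE = re.compile(r"^Epoch (\d+)/\d+$")
METRIC_RE = re.compile(r"^(train|val) Loss: ([\d.]+) Acc: ([\d.]+)$")


def parse_training_log(text):
    """Return {epoch: {"train": (loss, acc), "val": (loss, acc)}} from the printed log."""
    history = {}
    epoch = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        epoch_match = EPOCH_RE.match(line)
        if epoch_match:
            epoch = int(epoch_match.group(1))
            continue
        metric_match = METRIC_RE.match(line)
        if metric_match and epoch is not None:
            phase, loss, acc = metric_match.groups()
            history.setdefault(epoch, {})[phase] = (float(loss), float(acc))
    return history


def best_val_epoch(history):
    """Return (epoch, accuracy) with the highest validation accuracy, earliest epoch on ties."""
    candidates = [(m["val"][1], -e) for e, m in history.items() if "val" in m]
    if not candidates:
        return None
    accuracy, negative_epoch = max(candidates)
    return -negative_epoch, accuracy


# A few lines copied from the output above, including the cut-off last line.
log_excerpt = """Epoch 8/49
----------
train Loss: 0.5092 Acc: 0.8000
val Loss: 0.2665 Acc: 0.8878

Epoch 9/49
----------
train Loss: 0.2969 Acc: 0.8600
val Loss: 0.2440 Acc: 0.9286

Epoch 10/49
----------
train Loss: 0.2016 Acc: 0.9200
val Loss: 0.2715 Acc: 0.9184

Epoch 11/49
----------
train Loss: 0.286
"""
history = parse_training_log(log_excerpt)
best = best_val_epoch(history)
```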

### 2. Hybrid detector

#### 2.1 Dataset

The dataset is built in exactly the same way as the dataset for the image-only detector.
The following cells build:
- one training dataset (using images generated by SD)
- three evaluation datasets (using images generated by SD, LD and GLIDE, respectively)
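
Since the hybrid detector also takes the text prompt as input, each training example has to carry a caption next to the image and its label. A minimal sketch of such records follows, assuming that a fake image is paired with the same caption that was used to generate it; the `Sample` class, the function name and the file paths are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Sample:
    image_path: str
    caption: str
    label: int  # 1 = real, 0 = fake


def build_hybrid_samples(triples: Iterable[Tuple[str, str, str]]) -> List[Sample]:
    """Expand (real_image, fake_image, caption) triples into labeled samples."""
    samples: List[Sample] = []
    for real_path, fake_path, caption in triples:
        samples.append(Sample(real_path, caption, 1))  # original MSCOCO image
        samples.append(Sample(fake_path, caption, 0))  # image generated from the caption
    return samples


# Hypothetical file names, used only to show the resulting records.
samples = build_hybrid_samples([
    ("real/0001.jpg", "fake/0001.png", "A red bus parked on a street."),
    ("real/0002.jpg", "fake/0002.png", "A dog running on a beach."),
])
```
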

In [14]:
import sys
import os

# N.B.
# before running this block you need to erase all the content of the following directories:
# data/MSCOCO_for_SD
# data/MSCOCO_for_LD
# data/MSCOCO_for_GLIDE
# data/SD+MSCOCO
# data/GLIDE+MSCOCO
# latent-diffusion/outputs/txt2img-samples

# ------------------- COLLECT REAL IMAGES FROM MSCOCO -------------------- #
#add the directory containing the scripts needed for this section to the module search path
sys.path.insert(10, '/home/parwal/Documents/GitHub/De-Fake_nn_final_project/src/imageonly_detector')
#TODO: figure out which module needs this path entry and move it to the right place

from src.imageonly_detector.MSCOCO_data_collection import fetchImagesFromMSCOCO

#SD+MSCOCO
fetchImagesFromMSCOCO("data/MSCOCO_for_SD/images", "data/MSCOCO_for_SD", 100)

#LD+MSCOCO --------------------------------------------------------------------------
fetchImagesFromMSCOCO("data/MSCOCO_for_LD/images", "data/MSCOCO_for_LD", 50)

#GLIDE+MSCOCO -----------------------------------------------------------------------
fetchImagesFromMSCOCO("data/MSCOCO_for_GLIDE/images", "data/MSCOCO_for_GLIDE", 50)

#DALL-E2+MSCOCO ---------------------------------------------------------------------
#TODO

loading annotations into memory...
Done (t=13.79s)
creating index...
index created!
loading annotations into memory...
Done (t=0.73s)
creating index...
index created!
loading annotations into memory...
Done (t=12.11s)
creating index...
index created!
loading annotations into memory...
Done (t=0.73s)
creating index...
index created!
loading annotations into memory...
Done (t=12.86s)
creating index...
index created!
loading annotations into memory...
Done (t=0.78s)
creating index...
index created!
loading annotations into memory...
Done (t=13.52s)
creating index...
index created!
loading annotations into memory...
Done (t=0.72s)
creating index...
index created!
Your request activated the API's safety filters and could not be processed.Please modify the prompt and try again.
Current prompt (detected invalid)s: A kid playing with a bat and ball on a beach.
genero l'immagine 0/50
Loading model from models/ldm/text2img-large/model.ckpt
LatentDiffusion: Running in eps-prediction mode
Diffusio

In [None]:
import os
# ------------------- GENERATE FAKE IMAGES USING SD, LD, GLIDE -------------------- #
#SD+MSCOCO --------------------------------------------------------------------------
#use stable-diffusion API to generate 100 fake images from the 100 captions collected before
%run src/imageonly_detector/SD_MSCOCO_data_generation.py

#LD+MSCOCO --------------------------------------------------------------------------
# N.B.
# before running this command, add the file src/imageonly_detector/txt2img_batch.py to the directory latent-diffusion/scripts/
#reset the current directory to the De-Fake project root, otherwise the script to run is not found
#this is needed because LD_MSCOCO_data_generation.py changes the directory to the latent-diffusion one
os.chdir("/home/parwal/Documents/GitHub/De-Fake_nn_final_project")
%run src/imageonly_detector/LD_MSCOCO_data_generation_batch.py

#GLIDE+MSCOCO -----------------------------------------------------------------------
#I HAVE NEVER TRIED RUNNING THIS, since doing so regenerates the model (3 GB)
#try running it at the very end, to be safe
%run src/imageonly_detector/GLIDE_MSCOCO_data_generation.ipynb #TODO

#DALL-E2+MSCOCO ---------------------------------------------------------------------
#TODO (only available through the paid API, to be evaluated)

In [2]:
# ------------------- FORMAT THE DATA INTO THE STRUCTURE NEEDED FOR TRAINING/TESTING -------------------- #
import os
os.chdir("/home/parwal/Documents/GitHub/De-Fake_nn_final_project")

#transform the collected data in the previously described structure
from src.imageonly_detector.format_dataset import formatIntoDataset, formatIntoTrainTest


#SD+MSCOCO --------------------------------------------------------------------------
#this function generates a pair of datasets (train and val), starting from data from the Stable Diffusion generation
#the data generated from SD contains 100 images, this original dataset is split in half (50 for train, 50 for test)
formatIntoTrainTest("data/MSCOCO_for_SD/images", "data/SD+MSCOCO/images", "data/hybrid_detector_data")
print("ok SD")

#LD+MSCOCO --------------------------------------------------------------------------
formatIntoDataset("data/MSCOCO_for_LD/images", "../latent-diffusion/outputs/txt2img-samples", "data/hybrid_detector_data/val_LD")
print("ok LD")

#GLIDE+MSCOCO -----------------------------------------------------------------------
formatIntoDataset("data/MSCOCO_for_GLIDE/images", "data/GLIDE+MSCOCO/images", "data/hybrid_detector_data/val_GLIDE") #TODO
print("ok GLIDE")

#DALL-E2+MSCOCO ---------------------------------------------------------------------
#TODO (only available through the paid API, to be evaluated)

[]


#### 2.2 Detector

## RQ2. Attribution of the fake images to their source model

The study proposes two attributor models:

1. **Image-only attributor**<br>A multi-class classifier that assigns each input image to its source generation model.

2. **Hybrid attributor**<br>A multi-class classifier that assigns each input image to its source generation model, based on the input image and its corresponding text prompt.
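
To make the multi-class setting concrete, the sketch below turns a vector of per-class scores into a source-model name by taking the highest score. The class list is an assumption based on the generators used earlier in this notebook (SD, LD, GLIDE); the actual attributors may use a different or larger set of classes, and the scores would come from a trained model rather than being written by hand.

```python
from typing import Sequence

# Assumed class order; index i of the score vector corresponds to SOURCES[i].
SOURCES = ("SD", "LD", "GLIDE")


def attribute(scores: Sequence[float], sources: Sequence[str] = SOURCES) -> str:
    """Return the source model whose score is highest (first one on ties)."""
    if len(scores) != len(sources):
        raise ValueError("expected one score per source model")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return sources[best_index]


# Hand-written scores, only to show the mapping from index to model name.
predicted_source = attribute([0.1, 2.3, -0.5])
```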


### 1. Image-only attributor

### 2. Hybrid attributor

## RQ3. Analysis of how likely different text prompts are to generate authentic images

### 1. Semantic Analysis

### 2. Structure Analysis

## Conclusions