# Neural Networks Final Project
### Reimplementation of the study: <br> ***"DE-FAKE: Detection and Attribution of Fake Images Generated by Text-to-Image Generation Models"* <br> by Zeyang Sha, Zheng Li, Ning Yu, Yang Zhang**

**Name**: *Laura Papi*

**Matricola**: *1760732*

# Project Description

The above-cited study addresses the growing concerns about the possible misuse of AI-generated images, and argues that a tool is needed to detect these fake images and attribute them to their source.<br>
In particular, it points out the lack of research on the specific case of images generated from a text prompt.
<br>

<br>
The study therefore proposes methods to answer the following three research questions (RQ):

- **RQ1**. Detection of images generated by text-to-image generation models

- **RQ2**. Attribution of the fake images to their source model

- **RQ3**. Analysis of how likely different text prompts are to generate authentic images

<br><br>
The following sections contain examples of my implementation of the described methods.<br><br>
The complete implementation of the models can be found in the source directory of the GitHub repository __[Source Code](http://url)__


## RQ1. Detection of images generated by text-to-image generation models

The study proposes two detector models:

1. **Image-only detector**<br>binary classifier that decides whether an input image is fake or real.

2. **Hybrid detector**<br>binary classifier that is able to tell if an image is fake or real, based on the input image and its corresponding text prompt.


### 1. Image-only detector

#### 1.1 Training Dataset
The training dataset consists of a set of N real images (labeled 1) and a set of N corresponding generated fake images (labeled 0).

##### 1.1.1 Data Collection
The data used for training is collected and generated as described in the following steps **(i)** and **(ii)**:

- **(i)** Real images are fetched from the MSCOCO dataset, together with their captions.

In [None]:
import sys
#import the path to the scripts needed for this section
sys.path.insert(10, '/home/parwal/Documents/GitHub/De-Fake_nn_final_project/src/imageonly_detector')
#TODO figure out which code needs this import and move it to the right place


from src.imageonly_detector.MSCOCO_data_collection import fetchImagesFromMSCOCO

#fetch 200 real images from MSCOCO, each with its corresponding caption
fetchImagesFromMSCOCO("data/MSCOCO/images", "data/MSCOCO", 50)
#TODO test whether it also works when called this way
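
Step **(i)** pairs each real image with a caption. The following is a minimal sketch of that pairing, not the project's `fetchImagesFromMSCOCO`. It assumes the captions annotation file has already been parsed into a dictionary with an `images` list (`id`, `file_name`, `coco_url`) and an `annotations` list (`image_id`, `caption`). The function name `pair_images_with_captions`, the choice to keep only the first caption per image, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def pair_images_with_captions(coco, limit):
    """Return up to `limit` records, one per captioned image."""
    # Group every caption under the id of the image it describes.
    captions_by_image = defaultdict(list)
    for ann in coco["annotations"]:
        captions_by_image[ann["image_id"]].append(ann["caption"].strip())

    records = []
    for image in coco["images"]:
        if len(records) >= limit:
            break
        captions = captions_by_image.get(image["id"])
        if not captions:
            continue  # no caption available for this image
        records.append({
            "file_name": image["file_name"],
            "url": image.get("coco_url"),
            # Keep a single caption so each real image yields one prompt.
            "caption": captions[0],
        })
    return records


# Toy stand-in for a parsed captions annotation file (values are made up).
toy_coco = {
    "images": [
        {"id": 1, "file_name": "000000000001.jpg", "coco_url": None},
        {"id": 2, "file_name": "000000000002.jpg", "coco_url": None},
        {"id": 3, "file_name": "000000000003.jpg", "coco_url": None},
    ],
    "annotations": [
        {"image_id": 1, "caption": "A dog running on a beach. "},
        {"image_id": 1, "caption": "A brown dog near the sea."},
        {"image_id": 3, "caption": "Two people riding bicycles."},
    ],
}

pairs = pair_images_with_captions(toy_coco, limit=2)
for pair in pairs:
    print(pair["file_name"], "->", pair["caption"])
```

Keeping exactly one caption per image means that every real image later corresponds to exactly one generated image, which keeps the two classes balanced.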

- **(ii)** The captions from the MSCOCO images are used as input to the Stable Diffusion (SD) text-to-image generator.

In [None]:
#use stable-diffusion API to generate 200 fake images from the 200 captions collected before
%run src/imageonly_detector/SD_MSCOCO_data_generation.py
#TODO decide whether to turn this into a function or keep it as a script run

##### 1.1.2 Data Construction
The collected and generated data is then arranged into the following structure, so that it can be used in the training step:<br><br>
train/<br>
    ├── class_0/<br>
    │   ├── ...<br>
    │   └── all the fake images<br>
    ├── class_1/<br>
    │   ├── ...<br>
    │   └── all the real images<br>

In [None]:
from src.imageonly_detector.format_dataset import formatIntoDataset

#function that arranges the collected data into the previously described structure
formatIntoDataset("data/MSCOCO/images", "data/SD+MSCOCO/images", "data/imageonly_detector_data/train")
#TODO test whether it works when called this way
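
The folder layout described above can be produced with a few lines of standard-library code. The sketch below is a hypothetical stand-in, not the project's `formatIntoDataset`. The name `build_binary_dataset`, the set of accepted file suffixes, and the decision to copy rather than move files are assumptions. The argument order (real images, fake images, output directory) mirrors the call shown above.

```python
import shutil
from pathlib import Path

# Assumption: only these suffixes are treated as images.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_binary_dataset(real_dir, fake_dir, out_dir):
    """Copy fake images to out_dir/class_0 and real images to out_dir/class_1."""
    counts = {}
    for label, source in (("class_0", fake_dir), ("class_1", real_dir)):
        target = Path(out_dir) / label
        target.mkdir(parents=True, exist_ok=True)
        copied = 0
        # Sorted iteration keeps the copy order reproducible.
        for path in sorted(Path(source).iterdir()):
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                shutil.copy2(path, target / path.name)
                copied += 1
        counts[label] = copied
    return counts
```

With this naming, a loader that assigns class indices from the alphabetically sorted folder names (as torchvision's `ImageFolder` does) maps `class_0` to label 0 (fake) and `class_1` to label 1 (real), matching the labels defined in **1.1**.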

#### 1.2 Detector Construction

#### 1.3 Testing Dataset
The testing dataset has the same structure as the training dataset. To evaluate how well the model generalizes, however, its fake images are produced by text-to-image generation models other than the one used for training.<br>
In particular, the image-only classifier is tested on data generated by Latent Diffusion (LD), GLIDE and DALL-E2.

##### 1.3.1 Data Collection
As in **1.1.1**, the data is collected in the following two steps **(i)** and **(ii)**:

- **(i)** Real images are fetched from the MSCOCO dataset, together with their captions.

In [None]:
import sys
import os
print(sys.path)

from src.imageonly_detector.MSCOCO_data_collection import fetchImagesFromMSCOCO

#LD+MSCOCO --------------------------------------------------------------------------
fetchImagesFromMSCOCO("data/MSCOCO_for_LD/images", "data/MSCOCO_for_LD", 50)

#GLIDE+MSCOCO -----------------------------------------------------------------------
fetchImagesFromMSCOCO("data/MSCOCO_for_GLIDE/images", "data/MSCOCO_for_GLIDE", 50)

#DALL-E2+MSCOCO ---------------------------------------------------------------------
fetchImagesFromMSCOCO("data/MSCOCO_for_DALLE2/images", "data/MSCOCO_for_DALLE2", 50)


- **(ii)** The captions from the MSCOCO images are used as input to LD, GLIDE and DALL-E2 text-to-image generators.

In [None]:
#LD+MSCOCO --------------------------------------------------------------------------
#reset the current directory to the De-Fake project root, otherwise the script to run cannot be found
#this is needed because LD_MSCOCO_data_generation.py changes the directory to the latent-diffusion one
os.chdir("/home/parwal/Documents/GitHub/De-Fake_nn_final_project")
%run src/imageonly_detector/LD_MSCOCO_data_generation.py

#GLIDE+MSCOCO -----------------------------------------------------------------------
#TODO

#DALL-E2+MSCOCO ---------------------------------------------------------------------
#TODO
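
Because the LD generation script changes the working directory, the cell above resets it with a hard-coded absolute path. One possible alternative, sketched here under the assumption that restoring the directory after the script runs is all that is needed, is a small context manager. The name `preserve_cwd` is hypothetical and not part of the project.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def preserve_cwd():
    """Restore the current working directory when the block exits."""
    previous = os.getcwd()
    try:
        yield previous
    finally:
        # Runs even if the wrapped code raises an exception.
        os.chdir(previous)


# Demonstration: the block changes directory, as the generation script does,
# and the original directory is restored afterwards.
start = os.getcwd()
with preserve_cwd():
    os.chdir(tempfile.gettempdir())
print(os.getcwd() == start)
```

In the notebook, the script run could be wrapped in the same way, for example with `runpy.run_path("src/imageonly_detector/LD_MSCOCO_data_generation.py", run_name="__main__")` inside the `with` block, so that no machine-specific path is needed.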

##### 1.3.2 Dataset Construction
The data is structured as described in **1.1.2**.

In [None]:
#LD+MSCOCO --------------------------------------------------------------------------
formatIntoDataset("data/MSCOCO_for_LD/images", "../latent-diffusion/outputs/txt2img-samples", "data/imageonly_detector_data/val_LD") #TODO

#GLIDE+MSCOCO -----------------------------------------------------------------------
#TODO

#DALL-E2+MSCOCO ---------------------------------------------------------------------
#TODO
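
Once predictions are available for each test set, generalizability can be summarised with one score per generator. The sketch below assumes plain accuracy is the metric of interest and uses the label convention from **1.1** (real = 1, fake = 0). The function names and the toy labels and predictions are placeholders for illustration only; they are not results of this project.

```python
def accuracy(labels, predictions):
    """Fraction of predictions that match the ground-truth labels."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def accuracy_per_generator(results):
    """Map each test-set name to its accuracy.

    `results` maps a name to a (labels, predictions) pair.
    """
    return {name: accuracy(y, p) for name, (y, p) in results.items()}


# Toy placeholder values, NOT measured results.
toy_results = {
    "LD+MSCOCO": ([1, 1, 0, 0], [1, 0, 0, 0]),
    "GLIDE+MSCOCO": ([1, 0], [1, 1]),
}

for name, score in accuracy_per_generator(toy_results).items():
    print(f"{name}: {score:.2f}")
```

Reporting the score separately for each generator, rather than pooling all test images, shows whether a detector trained on SD images transfers equally well to LD, GLIDE and DALL-E2.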

## NOTE

In [2]:
#generate the training dataset
#fetch 200 real images from MSCOCO, each with its corresponding caption
%run src/imageonly_detector/MSCOCO_data_collection.py

#use stable-diffusion API to generate 200 fake images from the 200 captions collected before
%run src/imageonly_detector/SD_MSCOCO_data_generation.py

#generate the dataset needed for training
%run src/imageonly_detector/generate_train_dataset.py

#train the image-only detector on the previously generated dataset
%run src/imageonly_detector/train.py

#generate the datasets needed for evaluation
#LD+MSCOCO
#first fetch again some pictures from 

#GLIDE+MSCOCO

#DALL-E2+MSCOCO

#evaluate the image-only detector on the three previously generated datasets
%run src/imageonly_detector/evaluate.py

loading annotations into memory...
Done (t=14.11s)
creating index...
index created!


### 2. Hybrid detector

#### 2.1 Training Dataset

##### 2.1.1 Data Collection

##### 2.1.2 Dataset Construction

#### 2.2 Detector Construction

#### 2.3 Testing Dataset

##### 2.3.1 Data Collection

##### 2.3.2 Dataset Construction

## RQ2. Attribution of the fake images to their source model

The study proposes two attributor models:

1. **Image-only attributor**<br>multi-class classifier that assigns each input image to its source generation model.

2. **Hybrid attributor**<br>multi-class classifier that assigns each input image to its source generation model, based on the input image and its corresponding text prompt.


### 1. Image-only attributor

### 2. Hybrid attributor

## RQ3. Analysis of how likely different text prompts are to generate authentic images

### 1. Semantic Analysis

### 2. Structure Analysis

## Conclusions