# Neural Networks Final Project
### Reimplementation of the study: <br> ***"DE-FAKE: Detection and Attribution of Fake Images Generated by Text-to-Image Generation Models"* <br> by Zeyang Sha, Zheng Li, Ning Yu, Yang Zhang**

**Name**: *Laura Papi*

**Matricola**: *1760732*


# Project Description

The study cited above addresses growing concerns about the possible misuse of AI-generated images, and argues for a tool that can detect these fake images and attribute them to their source.<br>
In particular, it points out the lack of research on images generated from a text prompt.
<br>

<br>
Therefore, the study proposes methods to answer the following three research questions (RQs):

- **RQ1**. Detection of images generated by text-to-image generation models

- **RQ2**. Attribution of the fake images to their source model

- **RQ3**. Analysis of how likely different text prompts are to generate authentic images

<br>
This notebook contains the instructions to test the models that were implemented to answer these questions.<br>
The following sections explain how to download the pre-built datasets and the pre-trained weights for the models, in order to reproduce the performance results of this work.<br><br>


For a more detailed description of the work done see the complete notebook __[here](Notebook.ipynb)__.<br>
The complete notebook can be used to reproduce the entire project from scratch, from the creation of the datasets to the design and training of the models.

For further information, the complete code of this project can be found in the source directory of the GitHub repository __[Source Code](https://github.com/parwal-lp/De-Fake_nn_final_project/src)__


## Download the Datasets
Download the dataset from this __[link](https://drive.google.com/drive/folders/1Z2qrihz_gKY7R6dula-f0eKjjGxBba6u?usp=sharing)__.<br>
Then extract the "data" folder and place it at the root of this git repository.

## Download the pre-trained Weights
Download the trained weights for all the models __[here]()__.<br>
Then extract the "trained_models" folder and place it at the root of this git repository.
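
Before running the cells below, it can help to confirm that both folders were extracted to the right place. The following sketch checks the paths that the cells in this notebook refer to. The helper name `find_missing_paths` is hypothetical and not part of the project.

```python
from pathlib import Path

# Paths referenced by the cells in this notebook, relative to the repository root.
EXPECTED_PATHS = [
    "data/imageonly_detector_data",
    "data/hybrid_detector_data",
    "data/imageonly_attributor_data",
    "data/hybrid_attributor_data",
    "trained_models/imageonly_detector.pth",
    "trained_models/hybrid_detector.pth",
    "trained_models/hybrid_attributor.pth",
]


def find_missing_paths(repo_root=".", expected=EXPECTED_PATHS):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in expected if not (root / p).exists()]


missing = find_missing_paths()
if missing:
    print("Missing files or folders:")
    for path in missing:
        print(f"  {path}")
else:
    print("All expected files and folders are in place.")
```

Run it from the root of the repository. An empty result means the `data` and `trained_models` folders contain everything this notebook loads.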

## Run the Models

For each RQ, the study proposes two possible models as a solution:

1. **Image-only**<br>classifies the image based solely on the input image.

2. **Hybrid**<br>classifies the image based on the image together with its corresponding text prompt.

In [4]:
# Import all the necessary libraries and functions
# External libraries
import torch
import torchvision
import os

# Custom functions of this project
from src.imageonly_detector.model import eval_imageonly_detector
from src.imageonly_attributor.model import eval_imageonly_attributor
from src.hybrid_detector.hybrid_detector import TwoLayerPerceptron, eval_hybrid_detector
from src.hybrid_attributor.model import MultiClassTwoLayerPerceptron, eval_hybrid_attributor

from src.encoder import get_multiclass_dataset_loader, get_dataset_loader

### 1. Image-only Detector

Binary classifier that tells whether an image is real or fake,<br>where fake means that it was generated by a text-to-image generation model. In the cell below it is loaded as a torchvision ResNet-18.
<br><br>
The model is tested on real images fetched from MSCOCO and fake images generated by Stable Diffusion (SD), Latent Diffusion (LD) and GLIDE.<br>
Since the model was trained only on images generated by SD, we expect higher accuracy on the SD test set.
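
The evaluation cell below preprocesses every image with `Resize`, `CenterCrop`, `ToTensor` and `Normalize`. As a small worked example of the last step, the sketch below applies the same per-channel formula, `(value - mean) / std`, to a single RGB pixel whose values are already scaled to [0, 1] (which is what `ToTensor` produces for 8-bit images). The function name `normalize_pixel` is hypothetical and used only for illustration.

```python
# ImageNet channel statistics, as passed to Normalize in this notebook.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Normalize one RGB pixel with values already scaled to [0, 1]."""
    return tuple((value - m) / s for value, m, s in zip(rgb, MEAN, STD))


# A pixel equal to the channel means maps to zero in every channel.
print(normalize_pixel(MEAN))  # (0.0, 0.0, 0.0)

# A pure white pixel maps to roughly (2.25, 2.43, 2.64).
print(normalize_pixel((1.0, 1.0, 1.0)))
```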

In [2]:
# First create Dataloaders
print("Building the dataset...")
# The same preprocessing is applied to every split
eval_transform = torchvision.transforms.Compose([
    torchvision.transforms.Resize(256),
    torchvision.transforms.CenterCrop(224),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])
data_transforms = {
    'val': eval_transform,        # real images from MSCOCO and fake images generated by SD
    'val_LD': eval_transform,     # real images from MSCOCO and fake images generated by LD
    'val_GLIDE': eval_transform,  # real images from MSCOCO and fake images generated by GLIDE
}

data_dir = 'data/imageonly_detector_data'
image_datasets = {x: torchvision.datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x]) for x in ['val', 'val_LD', 'val_GLIDE']}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=4, shuffle=True, num_workers=4) for x in ['val', 'val_LD', 'val_GLIDE']}
dataset_sizes = {x: len(image_datasets[x]) for x in ['val', 'val_LD', 'val_GLIDE']}

# Then evaluate the model on the dataloaders above
print("Evaluation starts")
print("loading model with trained weights...")
imageonly_detector = torchvision.models.resnet18(weights='IMAGENET1K_V1')
imageonly_detector.load_state_dict(torch.load('trained_models/imageonly_detector.pth'))
eval_imageonly_detector(imageonly_detector, dataloaders, dataset_sizes)

Building the dataset...
Evaluation starts
loading model with trained weights...
Evaluation on SD -> Acc: 0.9388 - Loss: 0.2364
Evaluation on LD -> Acc: 0.6800 - Loss: 1.4511
Evaluation on GLIDE -> Acc: 0.7000 - Loss: 1.1783
Evaluation complete in 0m 2s
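
To compare the three test sets more directly, the printed lines can be parsed into numbers. The sketch below does this with a regular expression and reports the accuracy gap relative to SD. The helper name `parse_eval_log` is hypothetical, and the log text is copied from the output above.

```python
import re

# Matches lines such as "Evaluation on SD -> Acc: 0.9388 - Loss: 0.2364"
# and also the "--> Accuracy:" variant printed by other cells.
LINE_PATTERN = re.compile(
    r"Evaluation on (?P<name>\w+) -+> Acc(?:uracy)?: (?P<acc>[\d.]+) - Loss: (?P<loss>[\d.]+)"
)


def parse_eval_log(log_text):
    """Return {dataset_name: (accuracy, loss)} for every matching line."""
    results = {}
    for match in LINE_PATTERN.finditer(log_text):
        results[match["name"]] = (float(match["acc"]), float(match["loss"]))
    return results


log = """Evaluation on SD -> Acc: 0.9388 - Loss: 0.2364
Evaluation on LD -> Acc: 0.6800 - Loss: 1.4511
Evaluation on GLIDE -> Acc: 0.7000 - Loss: 1.1783"""

results = parse_eval_log(log)
sd_accuracy = results["SD"][0]
for name, (accuracy, _loss) in results.items():
    print(f"{name}: accuracy {accuracy:.4f}, gap to SD {sd_accuracy - accuracy:+.4f}")
```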


### 2. Hybrid Detector

This model is again a binary classifier, implemented as a two-layer perceptron, with the same purpose as the image-only detector.<br>
The difference is that the model takes as input not only the image but also its textual description, i.e. the text prompt that was used to generate it.
<br><br>
The model is again tested on real images and captions fetched from MSCOCO and fake images generated by Stable Diffusion (SD), Latent Diffusion (LD) and GLIDE.<br>
This model was also trained only on images generated by SD, so we again expect higher accuracy on the SD test set.
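
As a rough illustration of the hybrid input, the sketch below concatenates an image embedding and a text embedding and passes the result through a two-layer perceptron with the sizes used in the next cell (1024 inputs, 100 hidden units, 1 output). It assumes that the input is the image encoding concatenated with the caption encoding, as described for the hybrid attributor later in this notebook, with 512 values per modality. The ReLU hidden activation and sigmoid output are also assumptions, so the project's `TwoLayerPerceptron` may differ. The code is plain Python with random weights and placeholder embeddings, so it runs without PyTorch or CLIP.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for a dense layer (illustrative only)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Compute W x + b for one input vector."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def two_layer_perceptron(x, hidden_layer, output_layer):
    """Forward pass: dense -> ReLU -> dense -> sigmoid, returning a value in (0, 1)."""
    hidden = [max(0.0, h) for h in dense(x, hidden_layer)]
    logit = dense(hidden, output_layer)[0]
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)
# Placeholder embeddings; in the project these would come from the CLIP encoders.
image_embedding = [rng.gauss(0.0, 1.0) for _ in range(512)]
text_embedding = [rng.gauss(0.0, 1.0) for _ in range(512)]
hybrid_input = image_embedding + text_embedding  # concatenation -> 1024 values

hidden_layer = make_layer(1024, 100, rng)
output_layer = make_layer(100, 1, rng)
probability = two_layer_perceptron(hybrid_input, hidden_layer, output_layer)
print(len(hybrid_input), probability)
```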

In [4]:
# Load the pretrained weights on the model
print("loading model with trained weights...")
test_hybrid_detector = TwoLayerPerceptron(1024, 100, 1)
test_hybrid_detector.load_state_dict(torch.load('trained_models/hybrid_detector.pth'))

eval_dirs = {'SD': {
                'captions': "data/hybrid_detector_data/mscoco_captions.csv", 
                'real': "data/hybrid_detector_data/val/class_1", 
                'fake': "data/hybrid_detector_data/val/class_0"},
             'GLIDE': {
                 'captions': "data/hybrid_detector_data/val_GLIDE/mscoco_captions.csv",
                  'real': "data/hybrid_detector_data/val_GLIDE/class_1", 
                  'fake': "data/hybrid_detector_data/val_GLIDE/class_0"},
             'LD': {
                 'captions': "data/hybrid_detector_data/val_LD/mscoco_captions.csv", 
                 'real': "data/hybrid_detector_data/val_LD/class_1", 
                 'fake': "data/hybrid_detector_data/val_LD/class_0"}}

# Build the dataloaders and test the model on each of them
print("Evaluation starts")
for dataset_name in eval_dirs:
    eval_data_loader = get_dataset_loader(eval_dirs[dataset_name]['captions'], eval_dirs[dataset_name]['real'], eval_dirs[dataset_name]['fake'])
    loss, acc = eval_hybrid_detector(test_hybrid_detector, eval_data_loader)
    print(f'Evaluation on {dataset_name} --> Accuracy: {acc} - Loss: {loss}')

loading model with trained weights...
Evaluation starts
Evaluation on SD --> Accuracy: 0.8933334350585938 - Loss: 0.5943436622619629
Evaluation on GLIDE --> Accuracy: 0.7099999785423279 - Loss: 0.6429539918899536
Evaluation on LD --> Accuracy: 0.7099999785423279 - Loss: 0.6497436761856079
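
The evaluation function returns a loss and an accuracy. One common way to compute these for a binary classifier with a sigmoid output is binary cross-entropy together with a 0.5 decision threshold. Whether `eval_hybrid_detector` does exactly this is an assumption. The sketch below uses made-up probabilities and labels, and both function names are hypothetical.

```python
import math


def binary_cross_entropy(probabilities, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probabilities, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def binary_accuracy(probabilities, labels, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    correct = sum(int(p >= threshold) == y for p, y in zip(probabilities, labels))
    return correct / len(labels)


# Made-up example: four samples, two of them classified correctly.
probabilities = [0.9, 0.2, 0.6, 0.4]
labels = [1, 0, 0, 1]
print(binary_accuracy(probabilities, labels))  # 0.5
print(binary_cross_entropy(probabilities, labels))
```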


### 3. Image-only Attributor
Multiclass classifier that assigns each image to its original source (the text-to-image model that generated it). In the cell below it is loaded as a torchvision ResNet-18.
<br><br>
The model is trained and tested on real images fetched from MSCOCO and fake images generated by Stable Diffusion (SD), Latent Diffusion (LD) and GLIDE (so we have 4 classes in total).<br>
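
For a four-class attributor, a typical final step is to turn the network's four output scores into probabilities with a softmax and pick the most likely class. The sketch below shows that step on made-up scores. The class order and the function names are hypothetical and not taken from the project.

```python
import math

# Hypothetical class order, for illustration only.
CLASS_NAMES = ["real", "SD", "LD", "GLIDE"]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def attribute(scores):
    """Return the most likely class name and its probability."""
    probabilities = softmax(scores)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return CLASS_NAMES[best], probabilities[best]


name, probability = attribute([0.1, 2.3, 0.4, -1.0])  # made-up scores
print(name, round(probability, 3))
```

With four balanced classes, a classifier that guesses at random would be right about 25% of the time, which is a useful baseline when reading the attribution accuracy.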

In [2]:
# Build the Dataloaders
print("Building the dataset...")
data_transforms = {
    'test': torchvision.transforms.Compose([
        torchvision.transforms.Resize(256),
        torchvision.transforms.CenterCrop(224),
        torchvision.transforms.ToTensor(),
        torchvision.transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])
}

data_dir = 'data/imageonly_attributor_data'
image_datasets = {x: torchvision.datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x]) for x in ['test']}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=4, shuffle=True, num_workers=4) for x in ['test']}
dataset_sizes = {x: len(image_datasets[x]) for x in ['test']}


# Evaluate the model
print("Evaluation starts:")
print("loading model with trained weights...")
imageonly_attributor = torchvision.models.resnet18(weights='IMAGENET1K_V1')
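# NOTE: this loads the detector checkpoint (imageonly_detector.pth); check that this is the intended file for the attributor
imageonly_attributor.load_state_dict(torch.load('trained_models/imageonly_detector.pth'))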
eval_imageonly_attributor(imageonly_attributor, dataloaders, dataset_sizes)

Building the dataset...
Evaluation starts:
loading model with trained weights...
Evaluation results -> ACC: 0.3251 - LOSS: 11.5085


### 4. Hybrid Attributor
This model is also a multiclass classifier implemented through a two-layer perceptron, with the same goal as the image-only attributor, but in this case the model takes as input not only the image but also its caption (the text prompt used to generate it).
<br><br>
The model is again trained and tested on real images and captions fetched from MSCOCO and fake images generated by Stable Diffusion (SD), Latent Diffusion (LD) and GLIDE (so we have 4 classes in total).<br>
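
The cell below passes the class folder names as a Python set. Iteration order over a set of strings can differ between interpreter runs, because string hashing is randomized by default. If class indices are derived from that iteration order (an assumption, since `get_multiclass_dataset_loader` is not shown here), sorting the names first gives a reproducible mapping, similar to how torchvision's `ImageFolder` assigns indices by sorted folder name. A minimal sketch:

```python
# The class folder names used in this notebook.
class_names = {"class_real", "class_SD", "class_LD", "class_GLIDE"}

# Sorting gives the same order on every run, unlike iterating over the set directly.
ordered_classes = sorted(class_names)
class_to_index = {name: index for index, name in enumerate(ordered_classes)}
print(class_to_index)
```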

In [7]:
# Build the model
print('Building the model...')
hybrid_attributor = MultiClassTwoLayerPerceptron(1024, 100, 4)
hybrid_attributor.load_state_dict(torch.load('trained_models/hybrid_attributor.pth'))

# Build the dataset (each sample in the dataset is the encoding of an image concatenated to the encoding of its caption - encodings generated using the CLIP model)
print('Building the dataset...')
captions_file = "data/hybrid_attributor_data/test/mscoco_captions.csv"
dataset_dir = "data/hybrid_attributor_data/test"
classes = {"class_real", "class_SD", "class_LD", "class_GLIDE"}

dataloader = get_multiclass_dataset_loader(captions_file, dataset_dir, classes)

# Evaluate the model on the dataset just built
print('Evaluation starts:')
eval_hybrid_attributor(hybrid_attributor, dataloader)

Building the model...
Building the dataset...
Evaluation starts:
Evaluation results -> ACC: 0.8649999499320984 - LOSS: 0.4977986812591553
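
A single accuracy value does not show which sources are confused with each other. If the true and predicted labels are available, a per-class breakdown is a natural next step. The sketch below computes confusion counts and per-class accuracy. The labels are made up and the function names are hypothetical, because `eval_hybrid_attributor` is not shown to return predictions.

```python
from collections import Counter


def confusion_counts(true_labels, predicted_labels):
    """Count how often each (true, predicted) pair occurs."""
    return Counter(zip(true_labels, predicted_labels))


def per_class_accuracy(true_labels, predicted_labels):
    """Return {class: fraction of that class predicted correctly}."""
    counts = confusion_counts(true_labels, predicted_labels)
    totals = Counter(true_labels)
    return {cls: counts[(cls, cls)] / total for cls, total in totals.items()}


# Made-up labels for six samples.
true_labels = ["real", "real", "SD", "SD", "LD", "GLIDE"]
predicted_labels = ["real", "SD", "SD", "SD", "GLIDE", "GLIDE"]

print(per_class_accuracy(true_labels, predicted_labels))
print(confusion_counts(true_labels, predicted_labels)[("LD", "GLIDE")])  # 1
```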
