# Problem Statement & Motivation

The internet exposes people worldwide to an unprecedented volume of information from many online platforms. Because people tend to trust first impressions, it is essential to check the reliability of the sources we depend on. Misinformation is widespread online and threatens the integrity of the information people share. Our primary objective is therefore to determine the origin of the images we encounter, that is, whether or not they were generated by artificial intelligence.

# Dataset

Dataset: [Kaggle](https://www.kaggle.com/datasets/philosopher0808/real-vs-ai-generated-faces-dataset)

Our project uses a dataset from Kaggle containing approximately 120,000 facial images. Of these, around 70,000 are authentic images captured through conventional photography, and the remaining 51,000 or so are AI-generated. The images cover a wide range of demographics, including different ages, ethnic backgrounds, and genders. This diversity matters for the robustness and generalizability of our deep-learning models: training on a varied collection exposes the model to a broad spectrum of features and patterns, which should help it generalize to new, unseen images.


# Data Loading & Package Import

In [None]:
# render plots inline and suppress warnings
%matplotlib inline
import warnings
warnings.filterwarnings("ignore")

In [None]:
# import package
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader, random_split

In [None]:
enable_wandb = True
use_gpu = True

In [None]:
# check whether a CUDA GPU is available in this environment
gpu_available = torch.cuda.is_available()
gpu_available

True

In [None]:
# log into wandb account
if enable_wandb:
  !pip install wandb -qU
  import wandb
  wandb.login()


[34m[1mwandb[0m: Logging into wandb.ai. (Learn how to deploy a W&B server locally: https://wandb.me/wandb-server)
[34m[1mwandb[0m: You can find your API key in your browser here: https://wandb.ai/authorize
wandb: Paste an API key from your profile and hit enter, or press ctrl+c to quit:

 ··········


[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc


In [None]:
# mount Google Drive so the dataset archive can be unzipped
import pandas as pd
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
!unzip "/content/drive/MyDrive/Colab Notebooks/BA865_group_project/dataset.zip"

Streaming output truncated to the last 5000 lines.
  inflating: dataset/train/1/ebb4f0d76edbe1b4018327d779856c82b613141aa59542d09794cae63e724bd1.jpg
  inflating: dataset/train/1/SFHQ_pt1_00003927.jpg
  inflating: dataset/train/1/SFHQ_pt2_00003244.jpg
  inflating: dataset/train/1/SFHQ_pt3_00001050.jpg
  inflating: dataset/train/1/SFHQ_pt2_00081535.jpg
  inflating: dataset/train/1/0da062db44a837714d994217ee35dae47e67e7d2.jpg
  inflating: dataset/train/1/c5580f924a937a338f0b23fab3a572f2883c9ddfc35bc71bc0668688f68c2c27
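
Once the archive is unzipped, a quick count of files per class folder can confirm that the split sizes look as expected. The sketch below is only an illustration: `count_images_per_class` is a hypothetical helper, and it assumes the `dataset/<split>/<class>/` layout suggested by the unzip output, treating every non-hidden file in a class folder as an image.

```python
from pathlib import Path


def count_images_per_class(split_dir):
    """Return {class_folder_name: number_of_files} for one dataset split.

    Hypothetical helper: assumes one sub-folder per class and counts every
    non-hidden file inside it as an image.
    """
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                # Skip hidden entries (names starting with a dot)
                if f.is_file() and not f.name.startswith(".")
            )
    return counts
```

In the notebook this could be called as `count_images_per_class("/content/dataset/train")`, and likewise for the `validate` and `test` folders.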

# Data Augmentation

In [None]:
configs = {
    "data_augmentation": True,
    # True: start from pretrained weights.
    # False: the weights are initialized randomly instead of taken from the
    # pretrained model, which makes training slow, so pretrained is preferred.
    "pretrained": True,
    # True: use EfficientNet, which aims for higher performance with fewer
    # computational resources by scaling model size and complexity to balance
    # accuracy and efficiency.
    "efficientnet": True,
    "transferlearning": False,
}

In [None]:
if configs['data_augmentation']:
  transform = transforms.Compose([
      transforms.RandomHorizontalFlip(0.5),  # Flip horizontally with probability 0.5
      transforms.Resize(112),        # Resize the shorter side to 112 pixels, keeping the aspect ratio
      transforms.RandomCrop(112),    # Take a random 112x112 crop
      transforms.RandomRotation(45), # Rotate by a random angle in [-45, 45] degrees
      transforms.ColorJitter(),      # Color jitter; with the default arguments the image is left unchanged
      transforms.ToTensor(),         # Convert the image to a PyTorch tensor
      transforms.Normalize(mean=[0.485, 0.456, 0.406],  # Normalize each channel
                           std=[0.229, 0.224, 0.225])
  ])
else:
  transform = transforms.Compose([
      transforms.Resize(512),      # Resize the shorter side to 512 pixels, keeping the aspect ratio
      transforms.CenterCrop(512),  # Faces are usually centered, so a 512x512 center crop keeps most of the information
      transforms.ToTensor(),       # Convert the image to a PyTorch tensor
      transforms.Normalize(mean=[0.485, 0.456, 0.406],  # Normalize each channel
                           std=[0.229, 0.224, 0.225])
  ])


import torchvision
# Build the three splits; the transform passed here is replaced further down
train_dataset = torchvision.datasets.ImageFolder("/content/dataset/train", transform=transform)
val_dataset = torchvision.datasets.ImageFolder("/content/dataset/validate", transform=transform)
test_dataset = torchvision.datasets.ImageFolder("/content/dataset/test", transform=transform)

transform2 = transforms.Compose([
    transforms.Resize(224),      # Resize the shorter side to 224 pixels, keeping the aspect ratio
    transforms.CenterCrop(224),  # Center crop to 224x224 pixels
    transforms.ToTensor(),       # Convert the image to a PyTorch tensor
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])  # Normalize each channel
])
# Note: these assignments override `transform` on all three splits, so every
# split uses the deterministic 224x224 pipeline and the augmentation defined
# above is not applied.
train_dataset.transform = transform2
val_dataset.transform = transform2
test_dataset.transform = transform2
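
To make the effect of `Resize`, `CenterCrop`, and `Normalize` concrete, here is a small pure-Python sketch of the arithmetic involved. It does not call torchvision; `resized_shape` and `normalize_pixel` are hypothetical helpers that mirror the documented behavior, namely that `Resize` with an integer matches the shorter side to that size while keeping the aspect ratio, and that `Normalize` computes `(value - mean) / std` per channel on values that `ToTensor` has already scaled to [0, 1].

```python
def resized_shape(width, height, size):
    """(width, height) after Resize(size) with an integer size.

    The shorter side becomes `size`; the longer side is scaled by the same
    factor and truncated to an integer.
    """
    short, long = min(width, height), max(width, height)
    new_short, new_long = size, int(size * long / short)
    return (new_short, new_long) if width <= height else (new_long, new_short)


def normalize_pixel(rgb,
                    mean=(0.485, 0.456, 0.406),
                    std=(0.229, 0.224, 0.225)):
    """Per-channel (value - mean) / std for one RGB pixel with values in [0, 1]."""
    return tuple((c - m) / s for c, m, s in zip(rgb, mean, std))


# A 640x480 photo becomes 298x224 after Resize(224);
# CenterCrop(224) then cuts it down to 224x224.
print(resized_shape(640, 480, 224))

# A pixel equal to the channel means maps to zero in every channel.
print(normalize_pixel((0.485, 0.456, 0.406)))
```

This is why a `CenterCrop` (or `RandomCrop`) follows each `Resize` in the pipelines above: the resize alone only yields a square image when the input is already square.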

# Define Classes

## Accuracy

In [None]:
def get_accuracy(dataloader, model):

    # Running counts of correct and total predictions
    correct_predictions = 0
    total_predictions = 0

    with torch.no_grad():  # Disable gradient tracking; no weights or biases are updated during evaluation
        model.eval()  # Set model to evaluation mode

        for images, labels in dataloader:
            if gpu_available and use_gpu:
                images = images.cuda()
                labels = labels.cuda()

            outputs = model(images)  # Get predictions

            # Get predicted class (index of the largest logit)
            if configs["transferlearning"]:
                _, predicted = torch.max(outputs.logits, 1)
            else:
                _, predicted = torch.max(outputs, 1)

            correct_predictions += (predicted == labels).sum().item()
            # labels is a 1D tensor, so size(0) is the number of samples in
            # the batch, e.g. tensor([1, 0, 1]).size(0) is 3
            total_predictions += labels.size(0)
    acc = correct_predictions / total_predictions
    return acc
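
The core of the accuracy computation is an argmax over each row of logits followed by a comparison with the labels. The following pure-Python sketch illustrates that step on plain lists; `accuracy_from_logits` is a hypothetical helper written for illustration and is not part of the notebook.

```python
def accuracy_from_logits(logits, labels):
    """Fraction of rows whose largest logit sits at the labelled index."""
    correct = 0
    for row, label in zip(logits, labels):
        # Index of the largest value in the row (the predicted class)
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == label)
    return correct / len(labels)


# Three samples, two classes: predictions are 0, 1, 1 against labels 0, 1, 0,
# so two of the three are correct.
batch_logits = [[2.0, 0.5], [0.1, 1.2], [0.3, 0.9]]
batch_labels = [0, 1, 0]
print(accuracy_from_logits(batch_logits, batch_labels))
```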

## Loss

In [None]:
criterion = nn.CrossEntropyLoss()

In [None]:
def get_loss(loader):
  # nn.CrossEntropyLoss applies log-softmax internally, so the model's last
  # layer should output raw logits without an explicit softmax.
  # Note: this function evaluates the global `model`.

  with torch.no_grad():

    loss = 0
    for i, (images, labels) in enumerate(loader):  # Iterate over the batches
      # Step 1: Move data to cuda. Make sure the model is on cuda too!
      if gpu_available and use_gpu:
        images = images.cuda()
        labels = labels.cuda()

      # Step 2: Forward pass
      outputs = model(images)
      if configs["transferlearning"]:
        outputs = outputs.logits

      # Step 3: Accumulate the loss of this batch
      loss = loss + criterion(outputs, labels)
    return loss / len(loader)  # Average loss per batch
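
For a single sample, cross-entropy on raw logits is the negative log of the softmax probability assigned to the true class, which is why no separate softmax layer is needed before `nn.CrossEntropyLoss`. The sketch below works this out in plain Python; `cross_entropy` is a hypothetical helper for illustration, not the PyTorch implementation.

```python
import math


def cross_entropy(logits, label):
    """-log(softmax(logits)[label]), computed with the log-sum-exp trick."""
    m = max(logits)  # Subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


# Confident and correct: small loss.
print(cross_entropy([2.0, 0.0], 0))
# Confident and wrong: large loss.
print(cross_entropy([2.0, 0.0], 1))
# Equal logits over two classes: loss is log(2).
print(cross_entropy([0.0, 0.0], 0))
```

With its default settings `nn.CrossEntropyLoss` averages this quantity over the batch, so `get_loss` returns a mean of per-batch means. That equals the mean over all samples only when every batch has the same size.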

## Early Stopping

In [None]:
class EarlyStopper:
    def __init__(self, patience=1):
        self.patience = patience
        self.counter = 0
        self.min_validation_loss = float('inf')

    def early_stop(self, validation_loss):

        # If the new loss is lower than the old loss, reset the counter
        if validation_loss < self.min_validation_loss:
            self.min_validation_loss = validation_loss
            self.counter = 0

            # Keep track of the best model by saving its weights to disk.
            # Note: this saves the global `model`.
            torch.save(model.state_dict(), "./best_model.pt")

        # otherwise, increment the counter.
        elif validation_loss > self.min_validation_loss:
            self.counter += 1

            if self.counter >= self.patience: # terminate
                return True
        return False
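
To see how the patience counter behaves, the sketch below replays the same comparison logic on a list of validation losses, without any checkpointing. `PatienceCounter` and `stopping_epoch` are hypothetical names used only for this illustration, and the loss values are made up.

```python
class PatienceCounter:
    """Same stopping rule as EarlyStopper above, minus the model checkpoint."""

    def __init__(self, patience=1):
        self.patience = patience
        self.counter = 0
        self.min_validation_loss = float("inf")

    def early_stop(self, validation_loss):
        if validation_loss < self.min_validation_loss:
            # New best loss: remember it and reset the counter
            self.min_validation_loss = validation_loss
            self.counter = 0
        elif validation_loss > self.min_validation_loss:
            # Worse than the best so far: count one more bad epoch
            self.counter += 1
            if self.counter >= self.patience:
                return True
        return False


def stopping_epoch(losses, patience):
    """Index of the epoch at which training would stop, or None if it never does."""
    stopper = PatienceCounter(patience)
    for epoch, loss in enumerate(losses):
        if stopper.early_stop(loss):
            return epoch
    return None


# The best loss (0.70) occurs at epoch 1; epochs 2 and 3 are both worse,
# so with patience=2 training stops at epoch 3.
print(stopping_epoch([0.90, 0.70, 0.75, 0.72, 0.80], patience=2))
```

Note that the counter compares against the best loss seen so far, not the previous epoch, so an epoch that improves on its predecessor but not on the best still counts toward the patience limit.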