# AI vs. Human-Generated Images Detection


**Course**: INFO-6147  
**Student**: Harrison Kim, 1340629  
**Date**: August 15, 2025

## 0. Environment Setup

In [52]:
!pip install -q torch torchvision kagglehub pandas tqdm matplotlib scikit-learn numpy

In [53]:
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'Using device: {device}')

Using device: cuda


## 1. Dataset Selection

### 1.1 Choose a suitable image dataset for your project. For simplicity, you can consider any of the well-known datasets listed under Datasets in the [Torchvision 0.17 documentation](https://docs.pytorch.org/vision/stable/datasets.html).

- Selected dataset: [*`CIFAKE: Real and AI-Generated Synthetic Images`*](https://www.kaggle.com/datasets/birdy654/cifake-real-and-ai-generated-synthetic-images), hosted on Kaggle.

In [54]:
import kagglehub
import os
import pandas as pd

# Check if dataset already exists
expected_dir = os.path.expanduser("~/.cache/kagglehub/datasets/birdy654/cifake-real-and-ai-generated-synthetic-images/versions/3")
if not os.path.exists(expected_dir):
    path = kagglehub.dataset_download("birdy654/cifake-real-and-ai-generated-synthetic-images")
    print("Downloaded dataset to:", path)
else:
    path = expected_dir
    print("Dataset already exists at:", path)

# CIFAKE is laid out as <split>/<label>/: 'train' and 'test', each with 'REAL' and 'FAKE' subfolders
image_root = path
folders = ["train", "test"]
labels = ["REAL", "FAKE"]

rows = []
for split in folders:
    for label in labels:
        folder_path = os.path.join(image_root, split, label)
        if os.path.exists(folder_path):
            # Keep only the first half of the .jpg files in each folder to reduce
            # the dataset size. Sorting makes the selection reproducible, because
            # os.listdir() returns entries in arbitrary order.
            file_list = sorted(fname for fname in os.listdir(folder_path) if fname.lower().endswith('.jpg'))
            half_len = len(file_list) // 2
            for fname in file_list[:half_len]:
                rows.append({
                    "image_path": os.path.join(folder_path, fname),
                    "label": label,
                    "split": split
                })

# Check rows length and DataFrame columns
print('Number of rows:', len(rows))

# Create DataFrame
df = pd.DataFrame(rows)
print('DataFrame columns:', df.columns)

Downloaded dataset to: /kaggle/input/cifake-real-and-ai-generated-synthetic-images
Number of rows: 60000
DataFrame columns: Index(['image_path', 'label', 'split'], dtype='object')


### 1.2 Ensure that the dataset contains a reasonable number of classes and a sufficient number of images per class.

In [55]:
# Print the number of unique classes in the 'label' column
num_classes = df['label'].nunique()
print('Number of classes:', num_classes)
print('Label names:', df['label'].unique())


Number of classes: 2
Label names: ['REAL' 'FAKE']


### 1.3 If the dataset has a very large number of images, you can use a subset (e.g., 1000 images per class if the number of classes is 10 or fewer).

In [56]:
# Print the number of images per class
image_counts = df['label'].value_counts()
print('Number of images per class:')
print(image_counts)

Number of images per class:
label
REAL    30000
FAKE    30000
Name: count, dtype: int64
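
The notebook keeps half of the files in each folder. An alternative that matches the "N images per class" wording of 1.3 is a seeded per-class subsample. The following is a minimal sketch under stated assumptions: the helper `subsample_per_class` and the toy file names are hypothetical and are not part of the notebook.

```python
import random
from collections import defaultdict


def subsample_per_class(items, n_per_class, seed=42):
    """Return at most n_per_class (path, label) pairs for each label.

    `items` is an iterable of (path, label) pairs. The seed makes the
    selection reproducible across runs.
    """
    by_label = defaultdict(list)
    for path, label in items:
        by_label[label].append(path)

    rng = random.Random(seed)
    subset = []
    for label in sorted(by_label):
        paths = sorted(by_label[label])  # fixed starting order before shuffling
        rng.shuffle(paths)
        subset.extend((p, label) for p in paths[:n_per_class])
    return subset


# Toy stand-in for the real file list (hypothetical file names)
toy_items = [(f"{label}_{i}.jpg", label) for label in ("REAL", "FAKE") for i in range(10)]
toy_subset = subsample_per_class(toy_items, n_per_class=3)
print(len(toy_subset), "items selected")
```

Applied to the real file list, the same idea would give a balanced subset whose size is controlled directly by `n_per_class` rather than by the folder size.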


### 1.4 If the dataset has more than 20 classes, you can use a subset of the classes (e.g., only use 10 classes)

In [57]:
if num_classes > 20:
    print("Using a subset of classes due to high number of classes.")
else:
    print(f"Using all classes as we only have {num_classes} classes. ==> {df['label'].unique()}")

Using all classes as we only have 2 classes. ==> ['REAL' 'FAKE']


## 2. Data Preprocessing

### 2.1 Perform data preprocessing steps such as resizing images, normalizing pixel values, and splitting the dataset into training, validation, and test sets.

In [58]:
import torch
from torch.utils.data import Dataset
from PIL import Image

# Define ImageDataset class
class ImageDataset(Dataset):
    def __init__(self, dataframe, transform=None):
        self.dataframe = dataframe
        self.transform = transform
        self.image_paths = dataframe['image_path'].values
        self.labels = dataframe['label'].values
        self.label_to_idx = {label: idx for idx, label in enumerate(sorted(set(self.labels)))}

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        img_path = self.image_paths[idx]
        label = self.labels[idx]
        label_idx = self.label_to_idx[label]
        try:
            img = Image.open(img_path).convert('RGB')
            if self.transform:
                img = self.transform(img)
        except Exception as e:
            print(f'Error loading image {img_path}:', e)
            img = torch.zeros(3, 32, 32)
        return img, label_idx

# 1. Split df into train, validation and test datasets
from sklearn.model_selection import train_test_split

temp = df[df['split'] == 'train'].reset_index(drop=True)
train_df, val_df = train_test_split(
    temp,
    test_size=0.2,
    stratify=temp['label'],
    random_state=42
)
test_df = df[df['split'] == 'test'].reset_index(drop=True)

# 2. Define transform (resize, normalize)
from torchvision import transforms
transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])

# 3. Create ImageDataset for the validation and test splits
# (the training dataset is created in 2.2, where augmentation is added)
val_dataset = ImageDataset(val_df, transform=transform)
test_dataset = ImageDataset(test_df, transform=transform)

print('Train set size:', len(train_df))
print('Validation set size:', len(val_dataset))
print('Test set size:', len(test_dataset))

Train set size: 40000
Validation set size: 10000
Test set size: 10000
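
`transforms.ToTensor()` scales 8-bit pixel values to the range [0, 1], and `transforms.Normalize` then applies `(value - mean) / std` per channel. With `mean=0.5` and `std=0.5`, as in the transform above, this maps [0, 1] onto [-1, 1]. The sketch below only reproduces that arithmetic in plain Python; the helper names `normalize` and `denormalize` are illustrative, not torchvision functions.

```python
def normalize(value, mean=0.5, std=0.5):
    """Per-channel formula used by transforms.Normalize: (value - mean) / std."""
    return (value - mean) / std


def denormalize(value, mean=0.5, std=0.5):
    """Inverse mapping, useful before displaying a normalized image."""
    return value * std + mean


# Pixel values after ToTensor() lie in [0, 1]
for pixel in (0.0, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
```

The inverse helper is the step that would be needed before plotting a normalized tensor with matplotlib, which expects values in [0, 1].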


### 2.2 Apply data augmentation techniques to increase the diversity of the training data.

In [59]:
from torchvision import transforms

# Define data augmentation for training
train_transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(10),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])

# Re-create train_dataset with augmentation
train_dataset = ImageDataset(train_df, transform=train_transform)
print('Data augmentation applied to training dataset.')

Data augmentation applied to training dataset.


## 3. Model Selection and Architecture

- Select an appropriate deep learning architecture for image classification. You can start with a convolutional neural network (CNN).
- Define the architecture of your model, including the number of layers, activation functions, and any regularization techniques.

In [60]:
import torch.nn as nn

class Net(nn.Module):
    def __init__(self, num_classes):
        super(Net, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 4 * 4, 128),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(128, num_classes)
        )

    def forward(self, x):
        x = self.features(x)
        x = self.classifier(x)
        return x

model = Net(num_classes=num_classes)

print("CNN model with dropout(0.5) initialized.")


CNN model with dropout(0.5) initialized.
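
The `64 * 4 * 4` input size of the first `nn.Linear` layer follows from the layer settings: each 3x3 convolution with `padding=1` keeps the spatial size, and each `MaxPool2d(2)` halves it, so a 32x32 input shrinks to 4x4 after three blocks. The sketch below re-derives that number and the trainable parameter count in plain Python; the helper names are illustrative and no PyTorch call is made.

```python
def conv2d_out(size, kernel_size, padding=0, stride=1):
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel_size) // stride + 1


def maxpool2d_out(size, kernel_size):
    """Output size of max pooling whose stride equals the kernel size (no padding)."""
    return size // kernel_size


size = 32          # input height and width
channels = 3       # RGB input
param_count = 0

for out_channels in (16, 32, 64):
    size = conv2d_out(size, kernel_size=3, padding=1)
    # weights (in * out * 3 * 3) plus one bias per output channel
    param_count += channels * out_channels * 3 * 3 + out_channels
    size = maxpool2d_out(size, kernel_size=2)
    channels = out_channels

flat_features = channels * size * size
param_count += flat_features * 128 + 128   # Linear(64 * 4 * 4, 128)
param_count += 128 * 2 + 2                 # Linear(128, num_classes) with 2 classes

print("Flattened features:", flat_features)
print("Trainable parameters:", param_count)
```

If the input resolution in `transforms.Resize` were changed, `flat_features` would change as well, and the first `nn.Linear` layer would have to be updated to match.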


## 4. Model Training

In [61]:
# Initial hyperparameter values
BATCH_SIZE = 64
NUM_EPOCHS = 25
LR = 0.0001

### 4.1 Train your deep learning model using the training dataset.

#### 4.1.1 Load dataset into dataloaders

In [62]:
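from torch.utils.data import DataLoader

# Note: the names below are rebound from Dataset objects to DataLoaders.
# The underlying Dataset stays reachable through `.dataset`. Run this cell
# only once per session; otherwise a DataLoader gets wrapped in another DataLoader.
train_dataset = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)
val_dataset = DataLoader(val_dataset, batch_size=BATCH_SIZE, shuffle=False)
test_dataset = DataLoader(test_dataset, batch_size=BATCH_SIZE, shuffle=False)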
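
The number of batches per epoch follows directly from the dataset size and batch size. With 40,000 training images and `BATCH_SIZE = 64`, a loader that keeps the last partial batch (the `DataLoader` default, `drop_last=False`) yields 625 batches, which is the total shown in the training progress bars. A small sketch of that arithmetic, using a hypothetical helper `num_batches`:

```python
import math


def num_batches(num_samples, batch_size, drop_last=False):
    """Batches per epoch for a loader over num_samples items."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


print("Train batches:", num_batches(40000, 64))
print("Validation batches:", num_batches(10000, 64))
```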

#### 4.1.2 Train custom CNN model

In [None]:
# Training loop for custom CNN model
import torch.optim as optim
from tqdm import tqdm
import copy

model = model.to(device)
criterion = nn.CrossEntropyLoss()
criterion = criterion.to(device)
optimizer = optim.Adam(model.parameters(), lr=LR)

# Lists to store metrics
cnn_train_loss_list = []
cnn_train_acc_list = []
cnn_val_loss_list = []
cnn_val_acc_list = []

best_val_acc = 0.0
counter = 0
for epoch in range(NUM_EPOCHS):
    model.train()
    running_loss = 0.0
    correct = 0
    total = 0
    for images, labels in tqdm(train_dataset, desc=f"Epoch {epoch+1}/{NUM_EPOCHS}"):
        images = images.to(device)
        if not isinstance(labels, torch.Tensor):
            labels = torch.tensor(labels)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item() * images.size(0)
        _, predicted = torch.max(outputs, 1)
        correct += (predicted == labels).sum().item()
        total += labels.size(0)
    train_loss = running_loss / total
    train_acc = correct / total

    # Store metrics
    cnn_train_loss_list.append(train_loss)
    cnn_train_acc_list.append(train_acc)

    # Calculate and record validation loss
    model.eval()
    val_loss = 0.0
    val_correct = 0
    val_total = 0
    with torch.no_grad():
        for images, labels in val_dataset:
            images = images.to(device)
            if not isinstance(labels, torch.Tensor):
                labels = torch.tensor(labels)
            labels = labels.to(device)
            outputs = model(images)
            loss = criterion(outputs, labels)
            val_loss += loss.item() * images.size(0)
            _, predicted = torch.max(outputs, 1)
            val_correct += (predicted == labels).sum().item()
            val_total += labels.size(0)
    val_loss = val_loss / val_total
    val_acc = val_correct / val_total
    cnn_val_loss_list.append(val_loss)
    cnn_val_acc_list.append(val_acc)

    if val_acc > best_val_acc:
        best_val_acc = val_acc
        counter = 0
        best_model_state = copy.deepcopy(model.state_dict())
    else:
        counter += 1
        if counter >= 5:
            print("Early stopping triggered.")
            break

    print(f"Epoch {epoch+1}/{NUM_EPOCHS} | Train Loss: {train_loss:.4f} | Train Acc: {train_acc:.4f}")

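# Restore the weights from the epoch with the best validation accuracy
if best_val_acc > 0.0:
    model.load_state_dict(best_model_state)

print("\nTraining complete.")
print("cnn_train_loss_list:", cnn_train_loss_list)
print("cnn_train_acc_list:", cnn_train_acc_list)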

Epoch 1/25: 100%|██████████| 625/625 [00:52<00:00, 11.83it/s]


Epoch 1/25 | Train Loss: 0.5472 | Train Acc: 0.7139


Epoch 2/25: 100%|██████████| 625/625 [00:46<00:00, 13.56it/s]


Epoch 2/25 | Train Loss: 0.4396 | Train Acc: 0.7958


Epoch 3/25: 100%|██████████| 625/625 [00:46<00:00, 13.46it/s]


Epoch 3/25 | Train Loss: 0.3934 | Train Acc: 0.8250


Epoch 4/25: 100%|██████████| 625/625 [00:45<00:00, 13.87it/s]


Epoch 4/25 | Train Loss: 0.3571 | Train Acc: 0.8455


Epoch 5/25: 100%|██████████| 625/625 [00:45<00:00, 13.72it/s]


Epoch 5/25 | Train Loss: 0.3304 | Train Acc: 0.8576


Epoch 6/25: 100%|██████████| 625/625 [00:45<00:00, 13.87it/s]


Epoch 6/25 | Train Loss: 0.3103 | Train Acc: 0.8715


Epoch 7/25: 100%|██████████| 625/625 [00:46<00:00, 13.47it/s]


Epoch 7/25 | Train Loss: 0.2891 | Train Acc: 0.8811


Epoch 8/25: 100%|██████████| 625/625 [00:46<00:00, 13.50it/s]


Epoch 8/25 | Train Loss: 0.2765 | Train Acc: 0.8869


Epoch 9/25: 100%|██████████| 625/625 [00:45<00:00, 13.69it/s]


Epoch 9/25 | Train Loss: 0.2656 | Train Acc: 0.8927


Epoch 10/25:  79%|███████▉  | 496/625 [00:36<00:08, 14.44it/s]

### 4.2 Monitor training progress, including loss and accuracy, and consider using early stopping to prevent overfitting.

In [None]:
import matplotlib.pyplot as plt

plt.figure(figsize=(12,4))
plt.subplot(1,2,1)
plt.plot(cnn_train_loss_list, label='Train Loss')
plt.plot(cnn_val_loss_list, label='Val Loss')
plt.title('Loss (Train vs Validation)')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

plt.subplot(1,2,2)
plt.plot(cnn_train_acc_list, label='Train Acc')
plt.plot(cnn_val_acc_list, label='Val Acc')
plt.title('Accuracy (Train vs Validation)')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.tight_layout()
plt.show()
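
The early-stopping rule inside the training loop in 4.1.2 (reset a counter when validation accuracy improves, stop after 5 epochs without improvement) can be isolated into a small helper. This is a sketch, not the notebook's code: the class name `EarlyStopping` is hypothetical, and the accuracy values are made up purely to exercise the logic.

```python
class EarlyStopping:
    """Stop when the monitored score has not improved for `patience` epochs."""

    def __init__(self, patience=5):
        self.patience = patience
        self.best_score = 0.0
        self.counter = 0

    def step(self, score):
        """Record one epoch's score; return True when training should stop."""
        if score > self.best_score:
            self.best_score = score
            self.counter = 0
            return False
        self.counter += 1
        return self.counter >= self.patience


stopper = EarlyStopping(patience=5)
# Made-up validation accuracies, not results from this notebook
val_accuracies = [0.70, 0.80, 0.85, 0.84, 0.85, 0.83, 0.84, 0.82, 0.81]

stopped_at = None
for epoch, acc in enumerate(val_accuracies, start=1):
    if stopper.step(acc):
        stopped_at = epoch
        break

print("Stopped at epoch:", stopped_at, "| best score:", stopper.best_score)
```

Note that a tie with the best score does not reset the counter, which mirrors the strict `>` comparison used in the training loop.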

## 5. Hyperparameter Tuning

- Experiment with different hyperparameters (e.g., learning rate, batch size) to optimize the model's performance.
- Keep a record of the hyperparameters used and their impact on the model.

### 5.1 Experiment with different hyperparameters (e.g., learning rate, batch size) to optimize the model's performance.

In [None]:
# Hyperparameter search for learning rate and batch size
import copy
from tqdm import tqdm

search_learning_rates = [0.0005, 0.001, 0.005]
search_batch_sizes = [128, 192, 256]
num_epochs_hp = 5

results = []

for lr in search_learning_rates:
    for batch_size in search_batch_sizes:
        print(f"\nTraining with learning rate={lr}, batch size={batch_size}")
        # Re-create dataloaders with new batch size
        train_loader = DataLoader(train_dataset.dataset, batch_size=batch_size, shuffle=True)
        val_loader = DataLoader(val_dataset.dataset, batch_size=batch_size, shuffle=False)

        # Re-initialize model and optimizer
        model = Net(num_classes=num_classes).to(device)
        optimizer = optim.Adam(model.parameters(), lr=lr)

        best_val_acc = 0.0
        counter = 0
        for epoch in range(num_epochs_hp):
            model.train()
            running_loss = 0.0
            correct = 0
            total = 0
            for images, labels in tqdm(train_loader, desc=f"Epoch {epoch+1}/{num_epochs_hp} (train)"):
                images = images.to(device)
                if not isinstance(labels, torch.Tensor):
                    labels = torch.tensor(labels)
                labels = labels.to(device)
                optimizer.zero_grad()
                outputs = model(images)
                loss = criterion(outputs, labels)
                loss.backward()
                optimizer.step()
                running_loss += loss.item() * images.size(0)
                _, predicted = torch.max(outputs, 1)
                correct += (predicted == labels).sum().item()
                total += labels.size(0)
            train_loss = running_loss / total
            train_acc = correct / total

            # Validation
            model.eval()
            val_loss = 0.0
            val_correct = 0
            val_total = 0
            with torch.no_grad():
                for images, labels in tqdm(val_loader, desc="Validating"):
                    images = images.to(device)
                    if not isinstance(labels, torch.Tensor):
                        labels = torch.tensor(labels)
                    labels = labels.to(device)
                    outputs = model(images)
                    loss = criterion(outputs, labels)
                    val_loss += loss.item() * images.size(0)
                    _, predicted = torch.max(outputs, 1)
                    val_correct += (predicted == labels).sum().item()
                    val_total += labels.size(0)
            val_loss = val_loss / val_total
            val_acc = val_correct / val_total

            if val_acc > best_val_acc:
                best_val_acc = val_acc
                counter = 0
                best_model_state = copy.deepcopy(model.state_dict())
            else:
                counter += 1
                if counter >= 3:
                    print("Early stopping triggered.")
                    break

        print(f"\nBest validation accuracy: {best_val_acc:.4f}")
        results.append({
            'learning_rate': lr,
            'batch_size': batch_size,
            'best_val_acc': best_val_acc
        })

### 5.2 Keep a record of the hyperparameters used and their impact on the model.

In [None]:
import pandas as pd

# Create a DataFrame to summarize hyperparameter search results
results_df = pd.DataFrame(results)
print("Hyperparameter summary table:")
print(results_df)

# Optionally, display the best configuration
best_row = results_df.loc[results_df['best_val_acc'].idxmax()]
print(f"\nBest configuration:\nLearning rate: {best_row['learning_rate']}, Batch size: {best_row['batch_size']}, Best Val Acc: {best_row['best_val_acc']:.4f}")
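
The nested loops in 5.1 visit every combination of the three learning rates and three batch sizes, so the summary table has nine rows. The sketch below enumerates the same grid with `itertools.product` and picks the best record by `best_val_acc`. The scores are placeholders (0.0 and 1.0), not measured results, and `pick_best` is a hypothetical helper.

```python
from itertools import product

search_learning_rates = [0.0005, 0.001, 0.005]
search_batch_sizes = [128, 192, 256]

# Same visiting order as the nested for-loops: learning rate outer, batch size inner
grid = list(product(search_learning_rates, search_batch_sizes))
print("Configurations to train:", len(grid))


def pick_best(records):
    """Return the record with the highest validation accuracy."""
    return max(records, key=lambda record: record["best_val_acc"])


# Placeholder scores only, to show the selection step
toy_records = [
    {"learning_rate": lr, "batch_size": bs, "best_val_acc": 0.0}
    for lr, bs in grid
]
toy_records[4]["best_val_acc"] = 1.0

best = pick_best(toy_records)
print("Best configuration:", best["learning_rate"], best["batch_size"])
```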

## 6. Evaluation

- Evaluate your trained model using the validation dataset to assess its performance.
- Calculate relevant metrics such as accuracy, precision, recall, and F1-score.
- Visualize the model's predictions and misclassifications.

### 6.1 Evaluate your trained model using the validation dataset to assess its performance.

In [None]:
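# Evaluate CNN model on validation dataset
# Note: when the cells are run in order, `model` is the network from the last
# hyperparameter configuration trained in 5.1, not necessarily the best-scoring one.
model.eval()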
val_loss = 0.0
val_correct = 0
val_total = 0
cnn_true = []
cnn_pred = []
with torch.no_grad():
    for images, labels in tqdm(val_dataset, desc="Validating"):
        images = images.to(device)
        if not isinstance(labels, torch.Tensor):
            labels = torch.tensor(labels)
        labels = labels.to(device)
        outputs = model(images)
        loss = criterion(outputs, labels)
        val_loss += loss.item() * images.size(0)
        _, predicted = torch.max(outputs, 1)
        val_correct += (predicted == labels).sum().item()
        val_total += labels.size(0)
        cnn_true.extend(labels.cpu().numpy())
        cnn_pred.extend(predicted.cpu().numpy())
val_loss = val_loss / val_total
val_acc = val_correct / val_total
print(f"\nCNN Validation Loss: {val_loss:.4f} | CNN Validation Accuracy: {val_acc:.4f}")

### 6.2 Calculate relevant metrics such as accuracy, precision, recall, and F1-score.

In [None]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# CNN metrics
cnn_acc = accuracy_score(cnn_true, cnn_pred)
cnn_prec = precision_score(cnn_true, cnn_pred, average='macro')
cnn_rec = recall_score(cnn_true, cnn_pred, average='macro')
cnn_f1 = f1_score(cnn_true, cnn_pred, average='macro')
print('CNN metrics:')
print(f'Accuracy: {cnn_acc:.4f}, Precision: {cnn_prec:.4f}, Recall: {cnn_rec:.4f}, F1-score: {cnn_f1:.4f}')
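
With `average='macro'`, precision, recall, and F1 are computed separately for each class and then averaged with equal weight per class. The sketch below reproduces that calculation in plain Python on a five-sample toy example; `macro_metrics` is a hypothetical helper and the labels are illustrative, not model outputs.

```python
def macro_metrics(y_true, y_pred):
    """Return (precision, recall, f1), each averaged over classes with equal weight."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1_scores = [], [], []
    for cls in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1_scores) / n


# Toy labels: 0 = FAKE, 1 = REAL
toy_true = [0, 0, 1, 1, 1]
toy_pred = [0, 1, 1, 1, 0]
macro_p, macro_r, macro_f1 = macro_metrics(toy_true, toy_pred)
print(f"Precision: {macro_p:.4f}, Recall: {macro_r:.4f}, F1-score: {macro_f1:.4f}")
```

Because the validation split is stratified and the two classes are balanced, macro averaging and plain accuracy should tell a similar story here; the macro scores matter more when classes are imbalanced.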

## 8. Final Model Testing

- Test your final model on the held-out test dataset to assess its generalization to unseen data.

In [None]:
model.eval()
test_true = []
test_pred = []
with torch.no_grad():
    for images, labels in test_dataset:
        images = images.to(device)
        if not isinstance(labels, torch.Tensor):
            labels = torch.tensor(labels)
        labels = labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)
        test_true.extend(labels.cpu().numpy())
        test_pred.extend(predicted.cpu().numpy())

test_acc = accuracy_score(test_true, test_pred)
test_prec = precision_score(test_true, test_pred, average='macro')
test_rec = recall_score(test_true, test_pred, average='macro')
test_f1 = f1_score(test_true, test_pred, average='macro')
print(f'Test Accuracy: {test_acc:.4f}, Precision: {test_prec:.4f}, Recall: {test_rec:.4f}, F1-score: {test_f1:.4f}')
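
A confusion matrix complements the aggregate scores by showing which class is being mistaken for which. The sketch below builds one in plain Python from lists such as `test_true` and `test_pred`; the function name is illustrative and the toy labels are not model outputs. The index order assumes the mapping produced by `sorted(set(labels))` in `ImageDataset`, that is FAKE = 0 and REAL = 1.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, num_classes=2):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [
        [counts[(true_cls, pred_cls)] for pred_cls in range(num_classes)]
        for true_cls in range(num_classes)
    ]


class_names = ["FAKE", "REAL"]  # assumed index order: FAKE = 0, REAL = 1

# Toy labels for illustration only
toy_true = [0, 0, 1, 1, 1]
toy_pred = [0, 1, 1, 1, 0]
matrix = confusion_matrix(toy_true, toy_pred)

for name, row in zip(class_names, matrix):
    print(f"true {name}: {row}")
```

Off-diagonal entries count the misclassifications: the first row's second entry is the number of fake images predicted as real, and the second row's first entry is the number of real images predicted as fake.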

## 9. Documentation and Reporting

- Create a project report summarizing your dataset, model architecture, training process, evaluation results, and insights gained.
- Include visualizations and explanations to make your findings clear.

## 10. Presentation

- Prepare a brief presentation to showcase your project's key findings and outcomes.
- Share your experiences, challenges faced, and lessons learned during the project.

## 11. Conclusion

- Conclude your capstone project by summarizing your achievements and any future work or improvements that could be made to the model.
- Remember to maintain good coding practices and seek guidance or feedback from your instructor throughout the project.
- This capstone project will demonstrate your ability to apply deep learning techniques to real-world problems and showcase your skills to potential employers or collaborators.