# Kaggle Project

Deep learning models are widely used for image classification. The goal of this project is to compare state-of-the-art models on a subset of the ImageNet dataset.

Project link: https://www.kaggle.com/competitions/modia-ml-2024/overview

**Notice**: To validate the course, upload your report together with your code to Moodle. Do not copy other people’s reports or code.

**Due**: 27 June 2024.

## 1 Learn how to use PyTorch

Refer to the official tutorial at https://pytorch.org/tutorials/beginner/basics/intro.html.

## 2 Train state-of-the-art CNN models

The goal is to achieve good classification accuracy on the test dataset using PyTorch.

 - Implement one or two models from the following list:
    - LeNet Model [1]
    - AlexNet Model [2]
    - ResNet Model [3]

 - In your report, give a precise definition of each model you use, e.g. the number of layers and the type of each layer in the CNN.

 - Convert your images to black and white (grayscale).

 - Pre-process your images so that they all have the input size your model expects, e.g. by resizing them; you may also use data augmentation.

 - Train your model using mini-batch SGD. Specify the optimization method you use and report the total training time. To reduce the training time, you may use a GPU.

 - Tune your hyperparameters on a validation set to avoid over-fitting. Summarize your results in a table or figure.
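
The tuning and timing steps of this list can be sketched as follows. This is a minimal illustration under stated assumptions: the data are synthetic stand-ins for the real images, the helper `run_trial`, the placeholder linear model and the learning-rate grid are all hypothetical choices made for the example, not recommended settings.

```python
import time

import pandas as pd
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def run_trial(lr, train_loader, val_loader, num_epochs=3):
    """Hypothetical helper: train a placeholder model, return (validation accuracy, seconds)."""
    torch.manual_seed(0)  # same initialisation for every trial
    model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 10))
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    criterion = nn.CrossEntropyLoss()

    # Mini-batch SGD training, timed with a wall-clock counter
    start = time.perf_counter()
    for _ in range(num_epochs):
        model.train()
        for inputs, labels in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(inputs), labels)
            loss.backward()
            optimizer.step()
    elapsed = time.perf_counter() - start

    # Accuracy on the held-out validation split
    model.eval()
    correct = 0
    with torch.no_grad():
        for inputs, labels in val_loader:
            correct += (model(inputs).argmax(dim=1) == labels).sum().item()
    return correct / len(val_loader.dataset), elapsed


# Synthetic stand-in data: 1-channel 32x32 images with random labels
torch.manual_seed(0)
toy_images = torch.randn(200, 1, 32, 32)
toy_labels = torch.randint(0, 10, (200,))
toy_train_loader = DataLoader(TensorDataset(toy_images[:150], toy_labels[:150]), batch_size=64, shuffle=True)
toy_val_loader = DataLoader(TensorDataset(toy_images[150:], toy_labels[150:]), batch_size=64)

rows = []
for lr in [0.1, 0.01, 0.001]:  # placeholder grid
    val_acc, seconds = run_trial(lr, toy_train_loader, toy_val_loader)
    rows.append({'lr': lr, 'val_acc': val_acc, 'train_time_s': seconds})

results = pd.DataFrame(rows)
print(results)
```

Selecting the hyperparameters on the validation split, rather than on the test set, keeps the test accuracy an honest estimate of generalization.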

In [51]:
# Imports
import os

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from PIL import Image
from sklearn.model_selection import train_test_split

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision.transforms as transforms
from torch.optim.lr_scheduler import ReduceLROnPlateau
from torch.utils.data import DataLoader, Dataset
from torchvision import datasets
from torchvision.io import read_image
from torchvision.transforms import ToTensor

# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'Device: {device}')

Device: cuda


In [52]:
# Custom dataset class
class CustomImageDataset(Dataset):
    def __init__(self, annotations_file, img_dir, transform=None, mode='train'):
        super().__init__()
        self.img_dir = img_dir
        # Sort the files numerically by file name (e.g. '12.jpg' -> 12)
        file_list = os.listdir(img_dir)
        self.img_files = sorted(file_list, key=lambda x: int(os.path.splitext(x)[0]))
        self.transform = transform
        self.mode = mode

        if mode == 'test':
            self.img_labels = None
        else:
            # Assumption: the rows of the annotations file are in the same order as the sorted image files
            annotations = pd.read_csv(annotations_file)
            train_files, val_files, train_labels, val_labels = train_test_split(
                self.img_files, annotations['label'].tolist(), test_size=0.25, random_state=42)
            if mode == 'train':
                self.img_files = train_files
                self.img_labels = train_labels
            elif mode == 'val':
                self.img_files = val_files
                self.img_labels = val_labels
            else:
                raise ValueError(f'Unknown mode: {mode}')

    def __len__(self):
        # Use the file list so that the length is also defined in test mode (no labels)
        return len(self.img_files)

    def __getitem__(self, idx):
        img_name = self.img_files[idx]
        img_path = os.path.join(self.img_dir, img_name)
        image = Image.open(img_path).convert('L')  # Convert image to black and white
        if self.transform is not None:
            image = self.transform(image)
        if self.mode == 'test':
            return image
        else:
            # Files and labels were split together, so they share the same position idx
            label = self.img_labels[idx] - 1  # shift labels to start at 0
            label = torch.tensor(label, dtype=torch.long)
            return image, label


In [57]:
# Data
path_annotations = './train.csv'
train_dir = './train/'
test_dir = './test/'

# Image transformations
transform = transforms.Compose([
    transforms.Resize((32, 32)),  # Resize images to 32x32
    transforms.ToTensor()
])

# Data loaders
train_dataset = CustomImageDataset(path_annotations, train_dir, transform=transform, mode='train')
val_dataset = CustomImageDataset(path_annotations, train_dir, transform=transform, mode='val')
test_dataset = CustomImageDataset(None, test_dir, transform=transform, mode='test')

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=64, shuffle=False)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)
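
Since every image is resized to 32x32 here, it can help to check by hand which feature-map size reaches the first fully connected layer of a network. The sketch below applies the standard output-size formula $\lfloor (n + 2p - k)/s \rfloor + 1$ for convolution and pooling layers. The helpers `conv_out` and `lenet_feature_size` are hypothetical names introduced for illustration, and the layer settings (two 5x5 convolutions, each followed by 2x2 max pooling, ending in 16 channels) are assumed to be those of a LeNet-style network.

```python
def conv_out(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (floor formula)."""
    return (n + 2 * padding - kernel) // stride + 1


def lenet_feature_size(n, conv1_padding):
    """Side length of the feature maps after conv -> pool -> conv -> pool."""
    n = conv_out(n, kernel=5, padding=conv1_padding)  # first 5x5 convolution
    n = conv_out(n, kernel=2, stride=2)               # 2x2 max pooling
    n = conv_out(n, kernel=5)                         # second 5x5 convolution
    n = conv_out(n, kernel=2, stride=2)               # 2x2 max pooling
    return n


for padding in (0, 2):
    side = lenet_feature_size(32, conv1_padding=padding)
    print(f'conv1 padding={padding}: {side}x{side} maps, {16 * side * side} features')
```

With a 32x32 input, the flattened size is 16·5·5 = 400 when the first convolution uses no padding and 16·6·6 = 576 when it uses a padding of 2; the first linear layer has to match whichever setting is used.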

In [58]:
# LeNet model definition
class LeNet(nn.Module):
    def __init__(self, num_channels=3, num_classes=10):
        super().__init__()
        # Feature-map sizes for a 32x32 input:
        #   conv1 (5x5, no padding): 32 -> 28, max pool: 28 -> 14
        #   conv2 (5x5, no padding): 14 -> 10, max pool: 10 -> 5
        # With padding=2 in conv1 the final maps would be 6x6 (576 features) and would not match fc1.
        self.conv1 = nn.Conv2d(num_channels, 6, kernel_size=5, stride=1, padding=0, bias=False)
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5, stride=1, padding=0, bias=False)
        self.fc1 = nn.Linear(16*5*5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, num_classes)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2)
        x = torch.flatten(x, 1)  # (batch, 16*5*5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

# Model instantiation
model = LeNet(num_channels=1, num_classes=10).to(device)

In [59]:
def train_model(model, train_loader, val_loader, criterion, optimizer, scheduler, num_epochs=25):
    best_acc = 0.0

    for epoch in range(num_epochs):
        model.train()  # switch back to training mode after each validation pass
        running_loss = 0.0
        running_corrects = 0

        for inputs, labels in train_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            optimizer.zero_grad()

            outputs = model(inputs)
            _, preds = torch.max(outputs, 1)
            loss = criterion(outputs, labels)

            loss.backward()
            optimizer.step()

            running_loss += loss.item() * inputs.size(0)
            running_corrects += (preds == labels).sum().item()

        epoch_loss = running_loss / len(train_loader.dataset)
        epoch_acc = running_corrects / len(train_loader.dataset)

        print(f'Epoch {epoch}/{num_epochs - 1}, Loss: {epoch_loss:.4f}, Acc: {epoch_acc:.4f}')

        # Validation pass (no gradients needed)
        model.eval()
        val_corrects = 0

        with torch.no_grad():
            for inputs, labels in val_loader:
                inputs, labels = inputs.to(device), labels.to(device)
                outputs = model(inputs)
                _, preds = torch.max(outputs, 1)
                val_corrects += (preds == labels).sum().item()

        val_acc = val_corrects / len(val_loader.dataset)
        print(f'Validation Acc: {val_acc:.4f}')

        # Keep the weights with the best validation accuracy
        if val_acc > best_acc:
            best_acc = val_acc
            torch.save(model.state_dict(), 'best_model.pth')

        scheduler.step(val_acc)

    return model

# Loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
# The scheduler monitors the validation accuracy, which should increase, hence mode 'max'
scheduler = ReduceLROnPlateau(optimizer, 'max')

# Training the model
model = train_model(model, train_loader, val_loader, criterion, optimizer, scheduler, num_epochs=25)

In [60]:
def predict_test(model, test_loader):
    # The test set has no labels, so we return predictions instead of an accuracy
    model.load_state_dict(torch.load('best_model.pth', map_location=device))
    model.eval()
    all_preds = []

    with torch.no_grad():
        for inputs in test_loader:
            inputs = inputs.to(device)
            outputs = model(inputs)
            _, preds = torch.max(outputs, 1)
            all_preds.append(preds.cpu() + 1)  # undo the `- 1` label shift applied in the dataset

    return torch.cat(all_preds)

# Predicting the labels of the test set
test_preds = predict_test(model, test_loader)
print(f'Number of test predictions: {len(test_preds)}')

## 3 Machine learning and AI safety

### 3.1 Bias introduction in ImageNet

We try to understand how an image classification model (such as those trained in Section 2) can be biased towards a specific outcome. To simplify the analysis, we consider a binary classification problem (two classes).

We propose to implement the following setup:

#### 3.1.1 Biased Dataset Construction

We concatenate to each ImageNet image $x$ a new block of variables $\varepsilon$. As specified below, $\varepsilon$ is a noise-like image that is not computed from $x$. It nevertheless introduces a bias, because we choose the value of $\varepsilon$ to be strongly correlated with the label ($0$ or $1$) of the image.

More precisely, let $(\tilde{x}, y) = ([x, \varepsilon], y)$ be a random sample of the modified dataset.  
The value of $\varepsilon$ is determined from $y$ as follows. Let $p_0 \in [0, 1]$ and $p_1 \in [0, 1]$ be two probabilities.  
Given the label $y$, the bias variable $S$ is drawn as
$$
S \sim \mathrm{Bernoulli}(p_k) \quad \text{if } y = k, \; k \in \{0, 1\}.
$$

Then, we choose $\varepsilon$ according to $S$ as follows:
- $\varepsilon = 0$ if $S = 0$.
- $\varepsilon \sim \mathcal{N}(0, \sigma^2I)$ if $S = 1$.
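
The two-step sampling rule above can be sketched in PyTorch as follows. This is a minimal illustration, not a reference solution: the function name `sample_bias`, its arguments and the noise shape `(1, 32, 32)` used in the example are assumptions made for the sketch.

```python
import torch


def sample_bias(labels, p0, p1, sigma, shape, generator=None):
    """Draw S ~ Bernoulli(p_y) per sample, then eps = 0 if S = 0 and eps ~ N(0, sigma^2 I) if S = 1."""
    # Per-sample success probability: p1 where y = 1, p0 where y = 0
    probs = torch.where(labels == 1, torch.tensor(float(p1)), torch.tensor(float(p0)))
    s = torch.bernoulli(probs, generator=generator)
    # Gaussian noise for every sample, zeroed out wherever S = 0
    noise = sigma * torch.randn(len(labels), *shape, generator=generator)
    eps = noise * s.view(-1, *([1] * len(shape)))
    return s.long(), eps


labels = torch.tensor([0, 0, 1, 1])
s, eps = sample_bias(labels, p0=0.0, p1=1.0, sigma=1.0, shape=(1, 32, 32))
print(s.tolist(), tuple(eps.shape))
```

With $p_0 = 0$ and $p_1 = 1$ the draw is deterministic: $S$ equals the label, and only the samples with $y = 1$ receive a nonzero $\varepsilon$.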

In the extreme case where $p_0 = 0$ and $p_1 = 1$, one can ignore the original image $x$ and use only $\varepsilon$ to predict $y$, since $\varepsilon$ is nonzero (almost surely) exactly when $y = 1$. In that case, the predictions never need the image $x$ and can rely solely on the noise-like $\varepsilon$ that we added to the dataset.
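
Between this extreme case and the unbiased one, Bayes' rule quantifies how informative $S$ is about the label. The derivation below is a sketch that assumes a class prior $\pi = P(y = 1)$, a symbol that is not part of the setup above:

$$
P(y = 1 \mid S = 1) = \frac{\pi p_1}{\pi p_1 + (1 - \pi) p_0}, \qquad
P(y = 1 \mid S = 0) = \frac{\pi (1 - p_1)}{\pi (1 - p_1) + (1 - \pi)(1 - p_0)}.
$$

For $p_0 = 0$ and $p_1 = 1$ these posteriors equal $1$ and $0$, so $S$ determines $y$. For $0 < p_0 = p_1 < 1$ both reduce to $\pi$, so $S$ carries no information about the label.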

**Advice**: in PyTorch, we can build a biased dataset by simply adding the bias variable $\varepsilon^{\{i\}}$ as a dedicated channel to the original image $x^{\{i\}}$, for each image index $i$.

**Question 1**: Build a biased dataset from ImageNet using the approach described in Section 3.1.1.
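
One possible way to follow this advice is a wrapper dataset that concatenates $\varepsilon$ to each image along the channel dimension. The sketch below is only an illustration under stated assumptions: the class name `BiasedDataset` is hypothetical, the base dataset is assumed to return `(image, label)` pairs with images of shape `(C, H, W)` and labels in $\{0, 1\}$, and zero-valued toy images stand in for ImageNet.

```python
import torch
from torch.utils.data import Dataset, TensorDataset


class BiasedDataset(Dataset):
    """Hypothetical wrapper: appends a bias channel eps to every (image, label) pair."""

    def __init__(self, base, p0, p1, sigma=1.0, seed=0):
        self.base = base
        gen = torch.Generator().manual_seed(seed)
        # Draw S once per sample so that each image keeps the same eps across epochs.
        # Note: this loop reads every sample once and stores one extra channel per image.
        self.eps = []
        for i in range(len(base)):
            image, label = base[i]
            p = p1 if int(label) == 1 else p0
            s = torch.rand(1, generator=gen).item() < p  # S ~ Bernoulli(p)
            h, w = image.shape[-2:]
            noise = sigma * torch.randn(1, h, w, generator=gen)
            self.eps.append(noise if s else torch.zeros(1, h, w))

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        image, label = self.base[idx]
        # Concatenate along the channel dimension: (C, H, W) -> (C + 1, H, W)
        return torch.cat([image, self.eps[idx]], dim=0), label


# Toy stand-in for the real images: six grayscale 32x32 images with binary labels
toy_labels = torch.tensor([0, 1, 0, 1, 1, 0])
base = TensorDataset(torch.zeros(6, 1, 32, 32), toy_labels)
biased = BiasedDataset(base, p0=0.0, p1=1.0)
x, y = biased[1]
print(tuple(x.shape), int(y))
```

A model trained on such samples needs one more input channel than before, for example two channels for grayscale images.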

In [None]:
# Code