<a href="https://colab.research.google.com/github/ArindamRoy23/DSBA_T2-CS-Advanced_Deep_Learning/blob/master/TP4/TP4_Transfer_Learning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Practical session on Transfer Learning**
This Pratical session proposes to study several techniques for improving challenging context, in which few data and resources are available.

# Introduction

**Context :**

Assume we are in a context where few "gold" labeled data are available for training, say 

$$\mathcal{X}_{\text{train}} = \{(x_n,y_n)\}_{n\leq N_{\text{train}}}$$

where $N_{\text{train}}$ is small. 

A large test set $\mathcal{X}_{\text{test}}$ as well as a large amount of unlabeled data, $\mathcal{X}$, is available. We also assume that we have a limited computational budget (e.g., no GPUs).

**Instructions to follow :** 

For each question, write a commented *Code* or a complete answer as a *Markdown*. When the objective of a question is to report a CNN accuracy, please use the following format to report it, at the end of the question :

| Model | Number of  epochs  | Train accuracy | Test accuracy |
|------|------|------|------|
|   XXX  | XXX | XXX | XXX |

If applicable, please add the field corresponding to the  __Accuracy on Full Data__ as well as a link to the __Reference paper__ you used to report those numbers. (You do not need to train a CNN on the full CIFAR10 dataset!)

In your final report, please *keep the logs of each training procedure* you used. We will only run this jupyter if we have some doubts on your implementation. 

The total file sizes should be reasonable (feasible with 2MB only!). You will be asked to hand in the notebook, together with any necessary files required to run it if any.

You can use https://colab.research.google.com/ to run your experiments.

## Training set creation
__Question 1 (2 points) :__ Propose a dataloader to obtain a training loader that will only use the first 100 samples of the CIFAR-10 training set.

Additional information :  

*   CIFAR10 dataset : https://en.wikipedia.org/wiki/CIFAR-10
*   You can directly use the dataloader framework from Pytorch.
*   Alternatively you can modify the file : https://github.com/pytorch/vision/blob/master/torchvision/datasets/cifar.py

In [1]:
import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
                                        
trainset.data = trainset.data[:100]
trainset.targets = trainset.targets[:100]

trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data/cifar-10-python.tar.gz


  0%|          | 0/170498071 [00:00<?, ?it/s]

Extracting ./data/cifar-10-python.tar.gz to ./data


In [3]:
testset = torchvision.datasets.CIFAR10(root='./data', train = False,
                                        download=True, transform=transform)
       
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                          shuffle=True, num_workers=2)

Files already downloaded and verified


In [52]:
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainset.data = trainset.data[:100]
trainset.targets = trainset.targets[:100]
trainloader = torch.utils.data.DataLoader(trainset, batch_size=10,
                                          shuffle=True, num_workers=2)

valset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)

valset.data = valset.data[100:150]
valset.targets = valset.targets[100:150]

valloader = torch.utils.data.DataLoader(valset, batch_size=10,
                                          shuffle=True, num_workers=2)


testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=10,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

Files already downloaded and verified
Files already downloaded and verified
Files already downloaded and verified


* This is our dataset $\mathcal{X}_{\text{train}}$, it will be used until the end of this project. 

* The remaining samples correspond to $\mathcal{X}$. 

* The testing set $\mathcal{X}_{\text{test}}$ corresponds to the whole testing set of CIFAR-10.

## Testing procedure
__Question 2 (1.5 points):__ Explain why the evaluation of the training procedure is difficult. Propose several solutions.

# The Baseline

In this section, the goal is to train a CNN on $\mathcal{X}_{\text{train}}$ and compare its performance with reported numbers from the litterature. You will have to re-use and/or design a standard classification pipeline. You should optimize your pipeline to obtain the best performances (image size, data augmentation by flip, ...).

The key ingredients for training a CNN are the batch size, as well as the learning rate scheduler (i.e. how to decrease the learning rate as a function of the number of epochs). A possible scheduler is to start the learning rate at 0.1 and decreasing it every 30 epochs by 10. In case of divergence, reduce the learning rate. A potential batch size could be 10, yet this can be cross-validated.

You can get some baselines accuracies in this paper (obviously, it is a different context for those researchers who had access to GPUs!) : http://openaccess.thecvf.com/content_cvpr_2018/papers/Keshari_Learning_Structure_and_CVPR_2018_paper.pdf. 

In [53]:
classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

In [54]:
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


net = Net()

In [56]:
import torch
import torch.nn as nn
import torch.optim as optim
import torch.optim.lr_scheduler as lr_scheduler
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

In [61]:
# Define your optimizer and learning rate scheduler
optimizer = optim.Adam(net.parameters(), lr=0.001)
scheduler = lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
criterion = nn.CrossEntropyLoss()

# Early stopping parameters
patience = 10
best_accuracy = 0.0
counter = 0

# Training loop
for epoch in range(100):
    running_loss = 0.0
    for i, data in enumerate(trainloader):
        inputs, labels = data

        optimizer.zero_grad()

        # Move your data and model to GPU if available
        device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        inputs, labels = inputs.to(device), labels.to(device)
        net.to(device)

        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
    
    # Update the learning rate scheduler
    scheduler.step()

    # Print statistics
    print('[Epoch %d] loss: %.3f' % (epoch + 1, running_loss / len(trainloader)))

     # Validation testing
    correct = 0
    total = 0
    with torch.no_grad():
        for data in valloader:
            images, labels = data
            images, labels = images.to(device), labels.to(device)

            outputs = net(images)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    accuracy = 100 * correct / total
    print('[Epoch %d] validation accuracy: %.2f %%' % (epoch + 1, accuracy))

    # Early stopping
    if accuracy > best_accuracy:
        best_accuracy = accuracy
        counter = 0
    else:
        counter += 1
        if counter >= patience:
            print('Early stopping: validation accuracy did not improve for %d epochs.' % patience)
            break

print('Finished Training')



[Epoch 1] loss: 0.174
[Epoch 1] validation accuracy: 24.00 %
[Epoch 2] loss: 0.148
[Epoch 2] validation accuracy: 20.00 %
[Epoch 3] loss: 0.110
[Epoch 3] validation accuracy: 20.00 %
[Epoch 4] loss: 0.067
[Epoch 4] validation accuracy: 26.00 %
[Epoch 5] loss: 0.069
[Epoch 5] validation accuracy: 22.00 %
[Epoch 6] loss: 0.120
[Epoch 6] validation accuracy: 26.00 %
[Epoch 7] loss: 0.166
[Epoch 7] validation accuracy: 24.00 %
[Epoch 8] loss: 0.141
[Epoch 8] validation accuracy: 22.00 %
[Epoch 9] loss: 0.056
[Epoch 9] validation accuracy: 24.00 %
[Epoch 10] loss: 0.151
[Epoch 10] validation accuracy: 22.00 %
[Epoch 11] loss: 0.065
[Epoch 11] validation accuracy: 22.00 %
[Epoch 12] loss: 0.031
[Epoch 12] validation accuracy: 24.00 %
[Epoch 13] loss: 0.023
[Epoch 13] validation accuracy: 24.00 %
[Epoch 14] loss: 0.019
[Epoch 14] validation accuracy: 24.00 %
Early stopping: validation accuracy did not improve for 10 epochs.
Finished Training


In [48]:
correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))

Accuracy of the network on the 10000 test images: 20 %


In [49]:
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1


for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))


Accuracy of plane :  5 %
Accuracy of   car : 23 %
Accuracy of  bird : 30 %
Accuracy of   cat : 30 %
Accuracy of  deer : 24 %
Accuracy of   dog :  3 %
Accuracy of  frog : 12 %
Accuracy of horse : 22 %
Accuracy of  ship : 19 %
Accuracy of truck : 29 %


## ResNet architectures

__Question 3 (4 points) :__ Write a classification pipeline for $\mathcal{X}_{\text{train}}$, train from scratch and evaluate a *ResNet-18* architecture specific to CIFAR10 (details about the ImageNet model can be found here: https://arxiv.org/abs/1512.03385). Please report the accuracy obtained on the whole dataset as well as the reference paper/GitHub link.

*Hint :* You can re-use the following code : https://github.com/kuangliu/pytorch-cifar. During a training of 10 epochs, a batch size of 10 and a learning rate of 0.01, one obtains 40% accuracy on $\mathcal{X}_{\text{train}}$ (\~2 minutes) and 20% accuracy on $\mathcal{X}_{\text{test}}$ (\~5 minutes).

In [78]:
import torch
import torch.nn as nn
import torchvision.models as models

# Load pre-trained ResNet-18 model
resnet18 = models.resnet18(pretrained=False)

# Replace the last layer to have 10 outputs (one for each class in CIFAR-10)
num_features = resnet18.fc.in_features
resnet18.fc = nn.Linear(num_features, 10)

# # Freeze all layers except the last one
# for param in resnet18.parameters():
#     param.requires_grad = False
# resnet18.fc.requires_grad = True

# Print the model architecture
print(resnet18)

# Move the model to the device (GPU or CPU)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
resnet18.to(device)

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
  

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
  

In [79]:
net = resnet18

In [80]:
# Define your optimizer and learning rate scheduler
optimizer = optim.Adam(net.parameters(), lr=0.001)
scheduler = lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
criterion = nn.CrossEntropyLoss()

# Early stopping parameters
patience = 10
best_accuracy = 0.0
counter = 0

# Training loop
for epoch in range(100):
    running_loss = 0.0
    for i, data in enumerate(trainloader):
        inputs, labels = data

        optimizer.zero_grad()

        # Move your data and model to GPU if available
        device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        inputs, labels = inputs.to(device), labels.to(device)
        net.to(device)

        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
    
    # Update the learning rate scheduler
    scheduler.step()

    # Print statistics
    print('[Epoch %d] loss: %.3f' % (epoch + 1, running_loss / len(trainloader)))

     # Validation testing
    correct = 0
    total = 0
    with torch.no_grad():
        for data in valloader:
            images, labels = data
            images, labels = images.to(device), labels.to(device)

            outputs = net(images)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    accuracy = 100 * correct / total
    print('[Epoch %d] validation accuracy: %.2f %%' % (epoch + 1, accuracy))

    # Early stopping
    if accuracy > best_accuracy:
        best_accuracy = accuracy
        counter = 0
    else:
        counter += 1
        if counter >= patience:
            print('Early stopping: validation accuracy did not improve for %d epochs.' % patience)
            break

print('Finished Training')



[Epoch 1] loss: 2.649
[Epoch 1] validation accuracy: 20.00 %
[Epoch 2] loss: 1.917
[Epoch 2] validation accuracy: 22.00 %
[Epoch 3] loss: 1.143
[Epoch 3] validation accuracy: 22.00 %
[Epoch 4] loss: 0.909
[Epoch 4] validation accuracy: 30.00 %
[Epoch 5] loss: 0.625
[Epoch 5] validation accuracy: 26.00 %
[Epoch 6] loss: 0.349
[Epoch 6] validation accuracy: 28.00 %
[Epoch 7] loss: 0.307
[Epoch 7] validation accuracy: 20.00 %
[Epoch 8] loss: 0.239
[Epoch 8] validation accuracy: 22.00 %
[Epoch 9] loss: 0.268
[Epoch 9] validation accuracy: 26.00 %
[Epoch 10] loss: 0.520
[Epoch 10] validation accuracy: 22.00 %
[Epoch 11] loss: 0.305
[Epoch 11] validation accuracy: 18.00 %
[Epoch 12] loss: 0.190
[Epoch 12] validation accuracy: 30.00 %
[Epoch 13] loss: 0.207
[Epoch 13] validation accuracy: 30.00 %
[Epoch 14] loss: 0.058
[Epoch 14] validation accuracy: 26.00 %
Early stopping: validation accuracy did not improve for 10 epochs.
Finished Training


In [82]:
correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        images = images.to(device)
        labels = labels.to(device)
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))


Accuracy of the network on the 10000 test images: 20 %


In [84]:
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        images = images.to(device)
        labels = labels.to(device)
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1


for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))


Accuracy of plane :  8 %
Accuracy of   car : 44 %
Accuracy of  bird : 19 %
Accuracy of   cat : 18 %
Accuracy of  deer : 33 %
Accuracy of   dog :  8 %
Accuracy of  frog : 16 %
Accuracy of horse : 20 %
Accuracy of  ship : 17 %
Accuracy of truck : 27 %


# Transfer learning

We propose to use pre-trained models on a classification and generative task, in order to improve the results of our setting.

In [85]:
import torch
import torch.nn as nn
import torchvision.models as models

# Load pre-trained ResNet-18 model
resnet18 = models.resnet18(pretrained=True)

# Replace the last layer to have 10 outputs (one for each class in CIFAR-10)
num_features = resnet18.fc.in_features
resnet18.fc = nn.Linear(num_features, 10)

# # Freeze all layers except the last one
# for param in resnet18.parameters():
#     param.requires_grad = False
# resnet18.fc.requires_grad = True

# Print the model architecture
print(resnet18)

# Move the model to the device (GPU or CPU)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
resnet18.to(device)



ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
  

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
  

In [90]:
net = resnet18

In [87]:
# Define your optimizer and learning rate scheduler
optimizer = optim.Adam(net.parameters(), lr=0.001)
scheduler = lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
criterion = nn.CrossEntropyLoss()

# Early stopping parameters
patience = 10
best_accuracy = 0.0
counter = 0

# Training loop
for epoch in range(100):
    running_loss = 0.0
    for i, data in enumerate(trainloader):
        inputs, labels = data

        optimizer.zero_grad()

        # Move your data and model to GPU if available
        device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        inputs, labels = inputs.to(device), labels.to(device)
        net.to(device)

        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
    
    # Update the learning rate scheduler
    scheduler.step()

    # Print statistics
    print('[Epoch %d] loss: %.3f' % (epoch + 1, running_loss / len(trainloader)))

     # Validation testing
    correct = 0
    total = 0
    with torch.no_grad():
        for data in valloader:
            images, labels = data
            images, labels = images.to(device), labels.to(device)

            outputs = net(images)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    accuracy = 100 * correct / total
    print('[Epoch %d] validation accuracy: %.2f %%' % (epoch + 1, accuracy))

    # Early stopping
    if accuracy > best_accuracy:
        best_accuracy = accuracy
        counter = 0
    else:
        counter += 1
        if counter >= patience:
            print('Early stopping: validation accuracy did not improve for %d epochs.' % patience)
            break

print('Finished Training')



[Epoch 1] loss: 2.849
[Epoch 1] validation accuracy: 30.00 %
[Epoch 2] loss: 1.872
[Epoch 2] validation accuracy: 32.00 %
[Epoch 3] loss: 1.531
[Epoch 3] validation accuracy: 34.00 %
[Epoch 4] loss: 1.042
[Epoch 4] validation accuracy: 30.00 %
[Epoch 5] loss: 0.778
[Epoch 5] validation accuracy: 28.00 %
[Epoch 6] loss: 0.859
[Epoch 6] validation accuracy: 34.00 %
[Epoch 7] loss: 1.257
[Epoch 7] validation accuracy: 28.00 %
[Epoch 8] loss: 1.136
[Epoch 8] validation accuracy: 40.00 %
[Epoch 9] loss: 1.139
[Epoch 9] validation accuracy: 30.00 %
[Epoch 10] loss: 0.927
[Epoch 10] validation accuracy: 32.00 %
[Epoch 11] loss: 0.491
[Epoch 11] validation accuracy: 30.00 %
[Epoch 12] loss: 0.469
[Epoch 12] validation accuracy: 24.00 %
[Epoch 13] loss: 0.304
[Epoch 13] validation accuracy: 32.00 %
[Epoch 14] loss: 0.232
[Epoch 14] validation accuracy: 32.00 %
[Epoch 15] loss: 0.172
[Epoch 15] validation accuracy: 38.00 %
[Epoch 16] loss: 0.285
[Epoch 16] validation accuracy: 34.00 %
[Epoch 17]

In [88]:
correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        images = images.to(device)
        labels = labels.to(device)
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))


Accuracy of the network on the 10000 test images: 28 %


In [89]:
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        images = images.to(device)
        labels = labels.to(device)
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1


for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))


Accuracy of plane : 12 %
Accuracy of   car : 54 %
Accuracy of  bird : 31 %
Accuracy of   cat : 18 %
Accuracy of  deer : 36 %
Accuracy of   dog : 14 %
Accuracy of  frog : 25 %
Accuracy of horse : 36 %
Accuracy of  ship : 14 %
Accuracy of truck : 35 %


## ImageNet features

Now, we will use some pre-trained models on ImageNet and see how well they compare on CIFAR. A list is available on : https://pytorch.org/vision/stable/models.html.

__Question 4 (3 points):__ Pick a model from the list above, adapt it for CIFAR10 and retrain its final layer (or a block of layers, depending on the resources to which you have access to). Report its accuracy.

# Incorporating *a priori*
Geometrical *a priori* are appealing for image classification tasks. For now, we only consider linear transformations $\mathcal{T}$ of the inputs $x:\mathbb{S}^2\rightarrow\mathbb{R}$ where $\mathbb{S}$ is the support of an image, meaning that :

$$\forall u\in\mathbb{S}^2,\mathcal{T}(\lambda x+\mu y)(u)=\lambda \mathcal{T}(x)(u)+\mu \mathcal{T}(y)(u)\,.$$

For instance if an image had an infinite support, a translation $\mathcal{T}_a$ by $a$ would lead to :

$$\forall u, \mathcal{T}_a(x)(u)=x(u-a)\,.$$

Otherwise, one has to handle several boundary effects.

__Question 5 (1.5 points) :__ Explain the issues when dealing with translations, rotations, scaling effects, color changes on $32\times32$ images. Propose several ideas to tackle them.

## Data augmentations

__Question 6 (3 points):__ Propose a set of geometric transformation beyond translation, and incorporate them in your training pipeline. Train the model of the __Question 3__ with them and report the accuracies.

# Conclusions

__Question 7 (5 points) :__ Write a short report explaining the pros and the cons of each method that you implemented. 25% of the grade of this project will correspond to this question, thus, it should be done carefully. In particular, please add a plot that will summarize all your numerical results.

# Weak supervision

__Bonus \[open\] question (up to 3 points) :__ Pick a weakly supervised method that will potentially use $\mathcal{X}\cup\mathcal{X}_{\text{train}}$ to train a representation (a subset of $\mathcal{X}$ is also fine). Evaluate it and report the accuracies. You should be careful in the choice of your method, in order to avoid heavy computational effort.