<div class="alert alert-block alert-info">
<b>Deadline:</b> March 15, 2023 (Wednesday) 23:00
</div>

# Exercise 1. Convolutional neural networks. LeNet-5.

In this exercise, you will train a very simple convolutional neural network used for image classification tasks.

You may find it useful to look at this tutorial:
* [Neural Networks](https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html#sphx-glr-beginner-blitz-neural-networks-tutorial-py)

In [146]:
skip_training = False  # Set this flag to True before validation and submission

In [147]:
# During evaluation, this cell sets skip_training to True
# skip_training = True

import tools, warnings
warnings.showwarning = tools.customwarn

In [148]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

import torch
import torchvision
import torchvision.transforms as transforms

import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

import tools
import tests

In [149]:
# When running on your own computer, you can specify the data directory by:
# data_dir = tools.select_data_dir('/your/local/data/directory')
data_dir = tools.select_data_dir()

The data directory is ../data


In [150]:
# Select the device for training (use GPU if you have one)
#device = torch.device('cuda:0')
device = torch.device('cpu')

In [151]:
if skip_training:
    # The models are always evaluated on CPU
    device = torch.device("cpu")

## FashionMNIST dataset

Let us use the FashionMNIST dataset. It consists of 60,000 training images of 10 classes: 'T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot'.

In [152]:
import pandas as pd
import os
import torch

train_df = pd.DataFrame(columns=["img_name","label"])
img_name = os.listdir("Train/Astin")
lastIndex1 = len(img_name)
img_name += os.listdir("Train/Casiraghi")
lastIndex2 = len(img_name) 
img_name += os.listdir("Train/Clinton")   

train_df["img_name"] = img_name

for idx, i in enumerate(img_name[:lastIndex1]):
    train_df["img_name"][idx] = "Astin/" + train_df["img_name"][idx]
    train_df["label"][idx] = 0

for idx, i in enumerate(img_name[lastIndex1:lastIndex2], start=lastIndex1):
    train_df["img_name"][idx] = "Casiraghi/" + train_df["img_name"][idx]
    train_df["label"][idx] = 1
                            
for idx, i in enumerate(img_name[lastIndex2:], start=lastIndex2):
    train_df["img_name"][idx] = "Clinton/" + train_df["img_name"][idx]
    train_df["label"][idx] = 2

display(train_df)
train_df.to_csv (r'train_csv.csv', index = False, header=True)

You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy.
A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use `df.loc[row_indexer, "col"] = values` instead, to perform the assignment in a single step and ensure this keeps updating the original `df`.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

  train_df["img_name"][idx] = "Astin/" + train_df["img_name"][idx]
You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update 

Unnamed: 0,img_name,label
0,Astin/John_Astin_0001.jpg,0
1,Astin/John_Astin_0002.jpg,0
2,Astin/John_Astin_0003.jpg,0
3,Astin/John_Astin_0004.jpg,0
4,Astin/John_Astin_0005.jpg,0
...,...,...
176,Clinton/Virginia_Clinton_Kelley_0001.jpg,2
177,Clinton/Virginia_Clinton_Kelley_0002.jpg,2
178,Clinton/Virginia_Clinton_Kelley_0003.jpg,2
179,Clinton/Virginia_Clinton_Kelley_0004.jpg,2


In [153]:
train_df = pd.DataFrame(columns=["img_name","label"])
img_name = os.listdir("Test/Sean_Astin")
lastIndex1 = len(img_name)
img_name += os.listdir("Test/Charlotte_Casiraghi")
lastIndex2 = len(img_name) 
img_name += os.listdir("Test/Chelsea_Clinton")   

train_df["img_name"] = img_name

for idx, i in enumerate(img_name[:lastIndex1]):
    train_df["img_name"][idx] = "Astin/" + train_df["img_name"][idx]
    train_df["label"][idx] = 0

for idx, i in enumerate(img_name[lastIndex1:lastIndex2], start=lastIndex1):
    train_df["img_name"][idx] = "Casiraghi/" + train_df["img_name"][idx]
    train_df["label"][idx] = 1
                            
for idx, i in enumerate(img_name[lastIndex2:], start=lastIndex2):
    train_df["img_name"][idx] = "Clinton/" + train_df["img_name"][idx]
    train_df["label"][idx] = 2

display(train_df)
train_df.to_csv (r'test_csv.csv', index = False, header=True)

You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy.
A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use `df.loc[row_indexer, "col"] = values` instead, to perform the assignment in a single step and ensure this keeps updating the original `df`.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

  train_df["img_name"][idx] = "Astin/" + train_df["img_name"][idx]
You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update 

Unnamed: 0,img_name,label
0,Astin/Sean_Astin_0001.jpg,0
1,Astin/Sean_Astin_0004.jpg,0
2,Astin/Sean_Astin_0006.jpg,0
3,Astin/Sean_Astin_0007.jpg,0
4,Astin/Sean_Astin_0008.jpg,0
5,Astin/Sean_Astin_0009.jpg,0
6,Astin/Sean_Astin_0010.jpg,0
7,Astin/Sean_Astin_0011.jpg,0
8,Astin/Sean_Astin_0012.jpg,0
9,Astin/Sean_Astin_0013.jpg,0


In [173]:
from torch.utils.data import Dataset
import pandas as pd
import os
from PIL import Image
import torch
import torchvision.transforms as T

class FamiliesDataset(Dataset):
    def __init__(self, root_dir, annotation_file, transform=None):
        self.root_dir = root_dir
        self.annotations = pd.read_csv(annotation_file)
        self.transform = transform

    def __len__(self):
        return len(self.annotations)

    def __getitem__(self, index):
        img_id = self.annotations.iloc[index, 0]
        img = Image.open(os.path.join(self.root_dir, img_id)).convert("RGB")
        y_label = torch.tensor(self.annotations.iloc[index, 1])

        if self.transform is not None:
            img = self.transform(img)

        return (img, y_label)

In [174]:
from torch.utils.data import DataLoader
num_epochs = 10
learning_rate = 0.00001
train_CNN = False
batch_size = 32
shuffle = True
num_workers = 1
transform = transforms.Compose(
        [
            transforms.Resize([32,32]),
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
           
        ]
    )
train_set = FamiliesDataset("Train","train_csv.csv",transform=transform)
test_set = FamiliesDataset("Test","test_csv.csv",transform=transform)
#train_set, validation_set = torch.utils.data.random_split(dataset,[20000,5000])

train_loader = DataLoader(dataset=train_set, shuffle=shuffle, batch_size=batch_size)
test_loader = DataLoader(dataset=test_set, shuffle=False, batch_size=batch_size)

In [175]:
#transform = transforms.Compose([
#    transforms.ToTensor(),  # Transform to tensor
#    transforms.Normalize((0.5,), (0.5,))  # Scale images to [-1, 1]
#])

# trainset = torchvision.datasets.FashionMNIST(root=data_dir, train=True, download=True, transform=transform)
# testset = torchvision.datasets.FashionMNIST(root=data_dir, train=False, download=True, transform=transform)

#classes = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal',
#           'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

#trainloader = torch.utils.data.DataLoader(trainset, batch_size=32, shuffle=True)
#testloader = torch.utils.data.DataLoader(testset, batch_size=5, shuffle=False)

Let us visualize the data.

In [176]:
#images, labels = next(iter(trainloader))
#tests.plot_images(images[:8], n_rows=2)

# 1. Simple convolutional network

In the first exercise, your task is to create a convolutional neural network with the architecture inspired by the classical LeNet-5 [(LeCun et al., 1998)](http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf).

The architecture of the convolutional network that you need to create:
* 2d convolutional layer with:
    * one input channel
    * 6 output channels
    * kernel size 5 (no padding)
    * followed by ReLU
* Max-pooling layer with kernel size 2 and stride 2
* 2d convolutional layer with:
    * 16 output channels
    * kernel size 5 (no padding)
    * followed by ReLU
* Max-pooling layer with kernel size 2 and stride 2
* A fully-connected layer with:
    * 120 outputs
    * followed by ReLU
* A fully-connected layer with:
    * 84 outputs
    * followed by ReLU
* A fully-connected layer with 10 outputs and without nonlinearity.

In [177]:
class LeNet5(nn.Module):
    def __init__(self):
        super(LeNet5, self).__init__()
        # YOUR CODE HERE
        
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 3)

    def forward(self, x):
        """
        Args:
          x of shape (batch_size, 1, 28, 28): Input images.
        
        Returns:
          y of shape (batch_size, 10): Outputs of the network.
        """
        # YOUR CODE HERE
        
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)
      
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        
        
        x = torch.flatten(x, 1)
       
        
        x = F.relu(self.fc1(x))
        print("d")
        x = F.relu(self.fc2(x))
        print("e")
        x = self.fc3(x)
        
        return x
net = LeNet5()
print(net)

LeNet5(
  (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=3, bias=True)
)


In [178]:
def test_LeNet5_shapes():
    net = LeNet5()

    # Feed a batch of images from the training data to test the network
    with torch.no_grad():
        images, labels = next(iter(train_loader))
        print('Shape of the input tensor:', images.shape)

        y = net(images)
        assert y.shape == torch.Size([train_loader.batch_size, 3]), "Bad shape of y: y.shape={}".format(y.shape)

    print('Success')

test_LeNet5_shapes()


Shape of the input tensor: torch.Size([32, 3, 32, 32])
d
e
Success


In [179]:
def test_LeNet5():
    net = LeNet5()
    
    # get gradients for parameters in forward path
    net.zero_grad()
    x = torch.randn(1, 1, 28, 28)
    outputs = net(x)
    outputs[0,0].backward()
    
    parameter_shapes = sorted(tuple(p.shape) for p in net.parameters() if p.grad is not None)
    print(parameter_shapes)
    expected = [(6,), (6, 1, 5, 5), (10,), (10, 84), (16,), (16, 6, 5, 5), (84,), (84, 120), (120,), (120, 256)]
    assert parameter_shapes == expected, "Wrong number of training parameters."
    
    print('Success')

test_LeNet5()

RuntimeError: Given groups=1, weight of size [6, 3, 5, 5], expected input[1, 1, 28, 28] to have 3 channels, but got 1 channels instead

# Train the network

In [180]:
# This function computes the accuracy on the test dataset
def compute_accuracy(net, testloader):
    net.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for images, labels in testloader:
            images, labels = images.to(device), labels.to(device)
            outputs = net(images)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    return correct / total

### Training loop

Your task is to implement the training loop. The recommended hyperparameters:
* Stochastic Gradient Descent (SGD) optimizer with learning rate 0.001 and momentum 0.9.
* Cross-entropy loss. Note that we did not use softmax nonlinearity in the final layer of our network. Therefore, we need to use a loss function with log_softmax implemented, such as [nn.CrossEntropyLoss](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss).
* Number of epochs: 10. Please use mini-batches produces by `trainloader` defined above.

We recommend you to use function `compute_accuracy()` defined above to track the accuracy during training. The test accuracy should be above 0.87.

In [181]:
# Create network
net = LeNet5()

In [182]:
# Implement the training loop in this cell
if not skip_training:
    # YOUR CODE HERE

    net = net.to(device)
    
    optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    
    for i in range(10):
        
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            
            optimizer.zero_grad()

            output = net(images)
            
            loss = criterion(output, labels)
            loss.backward()
            
            optimizer.step()
        
        accuracy = compute_accuracy(net, train_loader)
        
        print(f'Epoch = {i+1}. Currenct accuracy = {accuracy*100:.2f}%')
        

d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
Epoch = 1. Currenct accuracy = 38.67%
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
Epoch = 2. Currenct accuracy = 38.67%
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
Epoch = 3. Currenct accuracy = 38.67%
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
Epoch = 4. Currenct accuracy = 38.67%
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
Epoch = 5. Currenct accuracy = 38.67%
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
Epoch = 6. Currenct accuracy = 38.67%
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
Epoch = 7. Currenct accuracy = 38.67%
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
Epoch = 8. Currenct accuracy = 38.67%
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
Epoch = 9. Currenct accuracy = 38.67%
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
d
e
Epoch = 10. Currenct accuracy = 38.67%


In [170]:
# Save the model to disk (the pth-files will be submitted automatically together with your notebook)
# Set confirm=False if you do not want to be asked for confirmation before saving.
if not skip_training:
    tools.save_model(net, '1_lenet5.pth', confirm=True)

Model not saved.


In [171]:
if skip_training:
    net = LeNet5()
    tools.load_model(net, '1_lenet5.pth', device)

In [172]:
# Display random images from the test set, the ground truth labels and the network's predictions
net.eval()
with torch.no_grad():
    images, labels = next(iter(testloader))
    tests.plot_images(images[:5], n_rows=1)
    
    # Compute predictions
    images = images.to(device)
    y = net(images)

print('Ground truth labels: ', ' '.join('%10s' % classes[labels[j]] for j in range(5)))
print('Predictions:         ', ' '.join('%10s' % classes[j] for j in y.argmax(dim=1)))

NameError: name 'testloader' is not defined

In [32]:
# Compute the accuracy on the test set
accuracy = compute_accuracy(net, testloader)
print('Accuracy of the network on the test images: %.3f' % accuracy)
assert accuracy > 0.85, "Poor accuracy {:.3f}".format(accuracy)
print('Success')

Accuracy of the network on the test images: 0.868
Success
