## Deep Learning: Introduction (2)

The goal of this second exercise is to build a Convolutional Network for classification in PyTorch (and to discover the joy of tensor dimensioning).

In [None]:
import numpy as np

# PyTorch:
import torch

# For visualization:
import matplotlib.pyplot as plt
%matplotlib inline

The functions below create a simple dataset of synthetic images. You do not need to understand them in detail. What matters is the size of the generated images: 72x72 pixels.

In [None]:
def generate_a_drawing(U, V, noise=0.0):
    figsize = 1.0
    # A 1x1 inch figure at 72 dpi renders to 72x72 pixels, whatever the default dpi is:
    fig = plt.figure(figsize=(figsize, figsize), dpi=72)
    ax = plt.subplot(111)
    plt.axis('Off')
    ax.set_xlim(0,figsize)
    ax.set_ylim(0,figsize)
    ax.fill(U, V, "k")
    fig.canvas.draw()
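    # Keep only the red channel of the rendered RGBA buffer (the drawing is grayscale).
    # buffer_rgba() replaces tostring_rgb(), which recent Matplotlib versions no longer provide:
    imdata = np.asarray(fig.canvas.buffer_rgba())[:, :, 0].astype(np.float32)
    imdata = imdata + noise * np.random.random(imdata.shape)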
    plt.close(fig)
    return imdata.reshape(72,72)

def generate_a_rectangle(noise=0.0):
    U = np.zeros(4)
    V = np.zeros(4)
    corners = np.random.random(4)
    top = max(corners[0], corners[1])
    bottom = min(corners[0], corners[1])
    left = min(corners[2], corners[3])
    right = max(corners[2], corners[3])
    U[0] = U[1] = top
    U[2] = U[3] = bottom
    V[0] = V[3] = left
    V[1] = V[2] = right
    return generate_a_drawing(U, V, noise)

def generate_a_disk(noise=0.0):
    center = np.random.random(2)
    radius = (0.3 + 0.7 * np.random.random()) / 2
    N = 50
    U = np.zeros(N)
    V = np.zeros(N)
    i = 0
    for t in np.linspace(0, 2*np.pi, N):
        U[i] = center[0] + np.cos(t) * radius
        V[i] = center[1] + np.sin(t) * radius
        i = i + 1
    return generate_a_drawing(U, V, noise)

def generate_a_triangle(noise=0.0):
    U = np.random.random(3)
    V = np.random.random(3)
    return generate_a_drawing(U, V, noise)

Checking the functions above:

In [None]:
im = generate_a_rectangle(10)
plt.imshow(im, cmap='gray')

In [None]:
im = generate_a_disk(10)
plt.imshow(im, cmap='gray')

In [None]:
im = generate_a_triangle(50)
plt.imshow(im, cmap='gray')

In PyTorch, we can create a Dataset class to store and access the samples of a dataset:

In [None]:
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms, utils

class ShapesDataset(Dataset):
    def __init__(self, nb_samples, noise=0.0, transform=None):
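        # Getting im_size from a generated sample:
        im_size = generate_a_rectangle().shape[0]
        self.X = np.zeros([nb_samples, im_size, im_size])
        # np.int was removed in NumPy 1.24; int64 is the label type expected by nll_loss:
        self.Y = np.zeros(nb_samples, dtype=np.int64)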
        print('Creating data:')
        for i in range(nb_samples):
            if i % 10 == 0:
                print(i)
            category = np.random.randint(3)
            if category == 0:
                self.X[i] = generate_a_rectangle(noise)
            elif category == 1: 
                self.X[i] = generate_a_disk(noise)
            else:
                self.X[i] = generate_a_triangle(noise)
            self.Y[i] = category
        # Normalizing the intensities to be between 0 and 1:
        self.X = (self.X + noise) / (255 + 2 * noise)
        # Transformation to apply to the samples 
        # (see below, we will use it to  transform the images to PyTorch tensors):
        self.transform = transform

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        # idx can be a set of indices stored into a PyTorch tensor:
        if torch.is_tensor(idx):
            idx = idx.tolist()
        im = self.X[idx]
        if self.transform:
            im = self.transform(im)
        return im, self.Y[idx]

Let's test the ShapesDataset class:

In [None]:
# Named `dataset` rather than `set` to avoid shadowing the Python built-in:
dataset = ShapesDataset(10)

In [None]:
im, category = dataset[0]
plt.imshow(im, cmap='gray')
print(category)
print(im.shape)

We now create a class for our Convolutional Network. It has a single convolutional layer, a pooling layer, and a fully connected layer:

In [None]:
import torch.nn as nn
import torch.nn.functional as func

class ConvNet(torch.nn.Module): 
    def __init__(self):
        super(ConvNet, self).__init__()
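        # Six 5×5 filters applied to the single-channel input image:
        self.conv1 = nn.Conv2d(1, 6, 5)
        # A 5×5 convolution without padding maps 72 to 72 - 4 = 68;
        # the 2×2 max pooling in forward() then halves it: 68 / 2 = 34.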
        self.fc1 = nn.Linear(34*34*6, 3)

    def forward(self, x):
        x = func.relu(self.conv1(x))
        x = func.max_pool2d(x, 2, 2)
        x = x.view(-1, 34*34*6)
        x = self.fc1(x)
        return func.log_softmax(x, dim=1)

- Make sure you understand the code.
- Understand where the values `34*34*6` and `3` come from in the last line of the `__init__` function.
- What does the `view` function do?
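
One way to check the dimensions is to push a dummy batch through the same operations and print the shape after each step. The sketch below assumes a stand-in batch of two random 72x72 images (not the real dataset) and rebuilds the convolution outside of `ConvNet`:

```python
import torch
import torch.nn as nn
import torch.nn.functional as func

# Stand-in batch (hypothetical data): 2 grayscale images of size 72x72
x = torch.rand(2, 1, 72, 72)

conv1 = nn.Conv2d(1, 6, 5)                      # same settings as in ConvNet
after_conv = func.relu(conv1(x))                # 72 - 5 + 1 = 68
after_pool = func.max_pool2d(after_conv, 2, 2)  # 68 / 2 = 34
flat = after_pool.view(-1, 34 * 34 * 6)         # one row per image

print(after_conv.shape)  # torch.Size([2, 6, 68, 68])
print(after_pool.shape)  # torch.Size([2, 6, 34, 34])
print(flat.shape)        # torch.Size([2, 6936])
```

The `-1` in `view` lets PyTorch infer the batch dimension from the total number of elements.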

We can use our `ShapesDataset` class to create a training set:

In [None]:
training_set = ShapesDataset(100, transform=transforms.ToTensor())

We can instantiate a 'data loader', which will be used during optimization to sample batches from the training set:

In [None]:
trainloader = torch.utils.data.DataLoader(training_set, batch_size=3, shuffle=True)
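
To see what a data loader yields, the sketch below iterates over a small stand-in dataset. The `TensorDataset` of random images is an assumption made to keep the example self-contained; it only mimics the shapes produced by `ShapesDataset` with `transforms.ToTensor()`.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for ShapesDataset: 10 random 1x72x72 images, labels in {0, 1, 2}
images = torch.rand(10, 1, 72, 72)
labels = torch.randint(0, 3, (10,))
loader = DataLoader(TensorDataset(images, labels), batch_size=3, shuffle=True)

# Each batch has shape (batch_size, channels, height, width)
batch_shapes = [tuple(batch_images.shape) for batch_images, _ in loader]
print(batch_shapes)
# [(3, 1, 72, 72), (3, 1, 72, 72), (3, 1, 72, 72), (1, 1, 72, 72)]
```

The last batch is smaller because 10 is not a multiple of 3. Passing `drop_last=True` to `DataLoader` would discard it.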

We can use the following function to train our network for one epoch:

In [None]:
def train_one_epoch(model, trainloader, optimizer, epoch): 
    for batch_id, (images, labels) in enumerate(trainloader):
        optimizer.zero_grad()
        # model(..) calls the forward function, which expects float values:
        predictions = model(images.float())
        # nll stands for negative log likelihood:
        loss = func.nll_loss(predictions, labels)
        # backward() computes the gradients of the loss with respect to the network
        # parameters; the optimizer then uses them. It is a method of the loss tensor
        # (PyTorch autograd), not of `torch.nn.Module`.
        loss.backward()
        # 1 optimization step: 
        optimizer.step()
        if batch_id % 100 == 0:
            print('loss: ', loss.item())
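
As a side note on the loss used above: applying `nll_loss` to `log_softmax` outputs should give the same value as `cross_entropy` applied to the raw scores. The sketch below checks this on made-up scores and labels (both are assumptions, not values from the training set):

```python
import torch
import torch.nn.functional as func

torch.manual_seed(0)
scores = torch.randn(4, 3)            # made-up raw scores: 4 samples, 3 classes
labels = torch.tensor([0, 2, 1, 2])   # made-up ground-truth categories

log_probs = func.log_softmax(scores, dim=1)
loss_a = func.nll_loss(log_probs, labels)
loss_b = func.cross_entropy(scores, labels)
# By hand: minus the mean log-probability assigned to the correct class
manual = -log_probs[torch.arange(4), labels].mean()

print(torch.allclose(loss_a, loss_b), torch.allclose(loss_a, manual))
```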

Let's instantiate a `ConvNet` and an Adam optimizer:

In [None]:
model = ConvNet()
optimizer = torch.optim.Adam(model.parameters())

Let's train for 5 epochs (the loss is printed once at the start of each epoch):

In [None]:
for epoch in range(0, 5):
    train_one_epoch(model, trainloader, optimizer, epoch)

loss:  0.40867510437965393
loss:  0.24257434904575348
loss:  0.0975199043750763
loss:  0.098531074821949
loss:  0.06292534619569778


We can apply our model to new data as detailed below.

In [None]:
# Creating a new sample:
new_set = ShapesDataset(1)
im, category = new_set[0]
plt.imshow(im, cmap='gray')
print(category)

Calling `model(im)` applies our network to the image `im` we just created by calling the `forward()` function of our `ConvNet` class. However, the `conv2d` function at the beginning of `forward()` expects a tensor with 4 dimensions:

`nbsamples x channels x height x width`

while `im` is currently a numpy array of dimensions:

`height x width`

We can first transform `im` into a PyTorch tensor:

In [None]:
im = torch.Tensor(im)
print(im)
print(im.size())

We can transform it into a 3D tensor of dimensions

`1 x height x width`

using the `unsqueeze` function:

In [None]:
# 0 means we want to add the dimension at the beginning:
im = im.unsqueeze(0)
print(im)
print(im.size())

We need one more dimension at the beginning of `im`, so we call `unsqueeze` one more time:

In [None]:
im = im.unsqueeze(0)
print(im)
print(im.size())
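
For reference, the two `unsqueeze` calls can also be written in other ways. The sketch below uses a random stand-in array instead of a generated shape and compares three spellings that should give the same result:

```python
import numpy as np
import torch

im_np = np.random.random((72, 72))   # stand-in for a generated image
t = torch.from_numpy(im_np).float()  # height x width

a = t.unsqueeze(0).unsqueeze(0)      # the two calls used in this notebook
b = t[None, None]                    # indexing with None adds leading dimensions
c = t.view(1, 1, 72, 72)             # explicit target shape

print(a.shape, b.shape, c.shape)
```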

`im` now has the correct dimensions, so we can apply our network to it:

In [None]:
model(im)

Can you interpret the output?
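
As a hint, here is one possible way to read such an output, sketched on a made-up vector rather than the real result of `model(im)`. It assumes the values are log-probabilities (as returned by `log_softmax`) and that the class order follows the category indices used in `ShapesDataset`:

```python
import torch

# Made-up network output for one image: log-probabilities over the 3 classes
log_probs = torch.log(torch.tensor([[0.7, 0.2, 0.1]]))

probs = log_probs.exp()                  # back to probabilities, each row sums to 1
predicted = probs.argmax(dim=1).item()   # index of the most likely class

class_names = ['rectangle', 'disk', 'triangle']  # categories 0, 1, 2
print(class_names[predicted])  # rectangle
```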

## Changing the architecture

- Add a second convolutional layer and a second pooling layer. Decide on the size and number of the filters, and on the size of the pooling regions. Note that this has an impact on the dimensions of the fully connected layer. Test your code by optimizing the new network.
- Add a second fully connected layer. Make sure the dimensions of the different layers are consistent with each other! Test your code by optimizing the new network.
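
When changing the architecture, it can help to compute the spatial size after each layer before writing the `nn.Linear` layer. The helper below is a hypothetical utility, not part of PyTorch. It assumes square inputs, no dilation, and the default rounding-down behaviour, and the filter sizes and counts are example choices only:

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (hypothetical helper)."""
    return (size + 2 * padding - kernel) // stride + 1

size = 72
size = conv_output_size(size, kernel=5)            # first 5x5 convolution: 68
size = conv_output_size(size, kernel=2, stride=2)  # first 2x2 pooling: 34
size = conv_output_size(size, kernel=5)            # example second 5x5 convolution: 30
size = conv_output_size(size, kernel=2, stride=2)  # example second 2x2 pooling: 15

nb_filters = 16  # example number of filters in the second convolution
print(size, size * size * nb_filters)  # 15 3600
```

With these example choices, the first fully connected layer would take `15*15*16 = 3600` inputs.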