# CNN - DO NOT TRY TO RUN THIS NOTEBOOK
This notebook contains a test CNN in PyTorch, written to get familiar with this development environment. It also acts as a template for later use.

In [22]:
%matplotlib inline
import numpy as np
import torch
from torch.autograd import Variable
import torch.optim as optim
import torch.nn.functional as F
import os,sys
import matplotlib.image as mpimg
import matplotlib.pyplot as plt

#seed for reproducible results
seed = 42
np.random.seed(seed)
torch.manual_seed(seed)

<torch._C.Generator at 0x28e384f91d0>

In [23]:
# Helper functions

def load_image(infilename):
    data = mpimg.imread(infilename)
    return data

def img_float_to_uint8(img):
    rimg = img - np.min(img)
    rimg = (rimg / np.max(rimg) * 255).round().astype(np.uint8)
    return rimg

# Concatenate an image and its groundtruth
def concatenate_images(img, gt_img):
    nChannels = len(gt_img.shape)
    w = gt_img.shape[0]
    h = gt_img.shape[1]
    if nChannels == 3:
        cimg = np.concatenate((img, gt_img), axis=1)
    else:
        gt_img_3c = np.zeros((w, h, 3), dtype=np.uint8)
        gt_img8 = img_float_to_uint8(gt_img)          
        gt_img_3c[:,:,0] = gt_img8
        gt_img_3c[:,:,1] = gt_img8
        gt_img_3c[:,:,2] = gt_img8
        img8 = img_float_to_uint8(img)
        cimg = np.concatenate((img8, gt_img_3c), axis=1)
    return cimg

def img_crop(im, w, h):
    list_patches = []
    imgwidth = im.shape[0]
    imgheight = im.shape[1]
    is_2d = len(im.shape) < 3
    for i in range(0,imgheight,h):
        for j in range(0,imgwidth,w):
            if is_2d:
                im_patch = im[j:j+w, i:i+h]
            else:
                im_patch = im[j:j+w, i:i+h, :]
            list_patches.append(im_patch)
    return list_patches
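
As a quick check of `img_crop`, the sketch below cuts a synthetic 400x400 RGB array into patches. The 16x16 patch size and the random input are assumptions made for illustration only, and the helper is repeated so the snippet runs on its own.

```python
import numpy as np

def img_crop(im, w, h):
    """Split an image into non-overlapping w x h patches (copy of the helper above)."""
    list_patches = []
    imgwidth = im.shape[0]
    imgheight = im.shape[1]
    is_2d = len(im.shape) < 3
    for i in range(0, imgheight, h):
        for j in range(0, imgwidth, w):
            if is_2d:
                im_patch = im[j:j+w, i:i+h]
            else:
                im_patch = im[j:j+w, i:i+h, :]
            list_patches.append(im_patch)
    return list_patches

# Synthetic stand-in for one 400x400 RGB image (assumption: not real data)
rng = np.random.default_rng(42)
fake_img = rng.random((400, 400, 3))

patches = img_crop(fake_img, 16, 16)  # 16x16 is an arbitrary patch size
n_patches = len(patches)
patch_shape = patches[0].shape
print(n_patches, patch_shape)
```

With these sizes the image splits into 25 patches per side, so the list holds 25 x 25 patches.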

## Pytorch module

This module contains the CNN named SimpleCNN.

In "init", the layers have to be initialized (for instance conv layers, maxpool layers and fully connected layers). ReLU takes no hyperparameters, so it does not need to be initialized.

In "forward", the structure of the CNN is laid out by taking an input tensor x and applying the layers in the correct order to this tensor. This function is the forward pass of the CNN and returns the computed x.

The backward pass is computed by PyTorch's autograd. For this to work, the tensors whose gradients we need (typically the weights) must record the operations applied to them. In older PyTorch versions this meant wrapping tensors in the type "Variable" (as imported above); since PyTorch 0.4, Variable has been merged into Tensor, so a plain tensor is enough. In both cases the option "requires_grad" must be set to True on such a tensor, otherwise its gradient will not be computed. Parameters of torch.nn layers have it set to True by default.
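
A minimal sketch of autograd on plain tensors, assuming PyTorch 0.4 or later, where `requires_grad` can be set directly on a tensor. The values of `w` and `x` are arbitrary and only serve to make the gradient easy to check by hand.

```python
import torch

# A weight tensor that autograd should track
w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
# Input data: no gradient is requested for it
x = torch.tensor([4.0, 5.0, 6.0])

y = (w * x).sum()  # forward pass: y = w . x
y.backward()       # backward pass: fills w.grad with dy/dw

grad_w = w.grad    # for this function, dy/dw equals x
print(grad_w)
```

Because `x` was created without `requires_grad=True`, its `grad` attribute stays empty after the backward pass.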

### Module structure

For this example module, we will take the following structure:
- Input image: 3 channels 400x400

Two convolution layers to extract spatial features (every parameter is fairly arbitrary for now)
- Convolution with kernel size 5, reduce to 1 channel (output 1 channel 396x396)
- ReLU
- MaxPool with kernel size 2 (output 198x198)
- Convolution with kernel size 5 (output 194x194)
- ReLU
- MaxPool with kernel size 2 (output 97x97)

A linear layer to get back a vector of 400x400 = 160'000 values (shape 1x160'000), where each value is the prediction for the corresponding pixel
- View -> resize tensor into vector
- Linear with 160'000 output features
- Sigmoid

Note that this structure is far from optimized: it takes a whole image as input and does not use mini-batches.
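
The layer sizes listed above can be checked with a few lines of arithmetic. This is a sketch that assumes no padding, stride 1 for the convolutions, and a pooling stride equal to the pooling kernel size (the PyTorch defaults for the layers used here); `conv_out` and `pool_out` are helper names introduced only for this example.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution on a square input."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel):
    """Output side length of max pooling with stride equal to the kernel size."""
    return size // kernel

side = 400
side = conv_out(side, 5)  # first convolution
side = pool_out(side, 2)  # first max pool
side = conv_out(side, 5)  # second convolution
side = pool_out(side, 2)  # second max pool

flat_features = side * side  # one channel, flattened for the linear layer
print(side, flat_features)
```

The flattened size is the number of input features that the linear layer must accept.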

In [29]:
class SimpleCNN(torch.nn.Module):
    
    #Our input x has shape (batch, 3, 400, 400)
    
    def __init__(self):
        super(SimpleCNN, self).__init__()
        
        self.conv1 = torch.nn.Conv2d(3, 1, kernel_size=5)
        self.pool1 = torch.nn.MaxPool2d(kernel_size=2)
        
        self.conv2 = torch.nn.Conv2d(1, 1, kernel_size=5)
        self.pool2 = torch.nn.MaxPool2d(kernel_size=2)
        
        self.fc2 = torch.nn.Linear(9409, 160000)
        
    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = self.pool1(x)
        
        x = F.relu(self.conv2(x))
        x = self.pool2(x)
        
        x = x.view(-1, 9409)
        
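        # Single linear layer followed by a sigmoid, one output per pixel.
        # (There is no fc1 layer, and torch.nn.Sigmoid is a module class,
        # so the function torch.sigmoid is used instead.)
        x = torch.sigmoid(self.fc2(x))
        return x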

## Training the model

In [30]:
# Loading a set of 15 training images
root_dir = "training/"

image_dir = root_dir + "images/"
files = os.listdir(image_dir)
n = min(15, len(files)) # Load at most 15 images
print("Loading " + str(n) + " images")
imgs = [load_image(image_dir + files[i]) for i in range(n)]
rimgs = img_float_to_uint8(imgs)
print(np.shape(rimgs))

Loading 15 images
(15, 400, 400, 3)


As an example, we will use 10 images for training and 5 images for testing.

In [31]:
train_set = rimgs[0:10]
test_set = rimgs[10:15]
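
The images are stored channels-last, as shown by the shape (15, 400, 400, 3), whereas `torch.nn.Conv2d` expects channels-first input of shape (N, C, H, W). A possible conversion is sketched below on synthetic data. The array `fake_rimgs` and the helper `to_nchw_float` are hypothetical names, and scaling to [0, 1] is an assumption rather than something this notebook prescribes.

```python
import numpy as np

# Synthetic stand-in with the same layout as `rimgs` (assumption: not real data)
rng = np.random.default_rng(42)
fake_rimgs = rng.integers(0, 256, size=(15, 400, 400, 3), dtype=np.uint8)

train_set = fake_rimgs[0:10]
test_set = fake_rimgs[10:15]

def to_nchw_float(batch):
    """Move channels first and scale uint8 pixels to float32 in [0, 1]."""
    return np.transpose(batch, (0, 3, 1, 2)).astype(np.float32) / 255.0

train_x = to_nchw_float(train_set)
test_x = to_nchw_float(test_set)
print(train_x.shape, test_x.shape)
# torch.from_numpy(train_x) would then give a tensor in the layout Conv2d accepts
```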

# THIS IS WHERE IT GETS FUNKY
__The fully connected layer is far too big (9409 x 160'000, about 1.5 billion weights, or roughly 6 GB in float32): your computer will likely run out of memory if you try to create an instance of this CNN. This really needs to be optimized to use mini-batches.__
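
The size of the fully connected layer can be estimated before trying to allocate it. The sketch below assumes float32 parameters (4 bytes each) and counts one gradient plus two Adam moment estimates per parameter during training. Activations and framework overhead are ignored, so the real footprint would be higher.

```python
in_features = 97 * 97     # values left after the second max pool
out_features = 400 * 400  # one prediction per pixel

weights = in_features * out_features
biases = out_features
n_params = weights + biases

bytes_per_float32 = 4
param_gb = n_params * bytes_per_float32 / 1e9
# Parameters + gradients + two Adam moment buffers: four copies in total
train_gb = 4 * param_gb
print(n_params, round(param_gb, 2), round(train_gb, 2))
```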

In [32]:
# We will optimize the (binary) cross-entropy loss using the Adam algorithm
net = SimpleCNN()
# BCELoss matches the per-pixel sigmoid output; CrossEntropyLoss expects raw class scores
loss = torch.nn.BCELoss()
optimizer = optim.Adam(net.parameters(), lr=0.01)

KeyboardInterrupt: 
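
For reference, a mini-batch training loop could look like the sketch below. It is not the model above: `tiny_net` is a hypothetical, much smaller stand-in that predicts one logit per pixel on 16x16 patches, and the patches and masks are random placeholders rather than the training images. `BCEWithLogitsLoss` is used because this stand-in outputs raw logits and the loss applies the sigmoid internally.

```python
import torch

torch.manual_seed(42)

# Hypothetical tiny stand-in for SimpleCNN: per-pixel logits, same spatial size
tiny_net = torch.nn.Sequential(
    torch.nn.Conv2d(3, 4, kernel_size=5, padding=2),
    torch.nn.ReLU(),
    torch.nn.Conv2d(4, 1, kernel_size=5, padding=2),
)
criterion = torch.nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(tiny_net.parameters(), lr=0.01)

# Random patches and binary masks (assumption: placeholders, not real data)
patches = torch.rand(32, 3, 16, 16)
masks = (torch.rand(32, 1, 16, 16) > 0.5).float()

batch_size = 8
losses = []
for epoch in range(3):
    for start in range(0, patches.shape[0], batch_size):
        xb = patches[start:start + batch_size]
        yb = masks[start:start + batch_size]
        optimizer.zero_grad()               # clear gradients from the previous step
        loss = criterion(tiny_net(xb), yb)  # forward pass and loss
        loss.backward()                     # backward pass via autograd
        optimizer.step()                    # Adam update
        losses.append(loss.item())

print(len(losses))
```

Each step only touches `batch_size` patches, so memory use is bounded by the batch size rather than by the full image.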