# Convolutional Neural Network - part A

In this notebook we will implement a Conv2D layer.

The goal of this lab is to:

* Implement and understand the basic aspects of convolutions

References:
* Largely based on http://cs231n.github.io/convolutional-networks/

# Setup

In [1]:
# Boilerplate code to get started

%load_ext autoreload
%autoreload 
%matplotlib inline

import json
import matplotlib as mpl
from src import fmnist_utils
from src.fmnist_utils import *

def plot(H):
    plt.title(max(H['test_acc']))
    plt.plot(H['acc'], label="acc")
    plt.plot(H['test_acc'], label="test_acc")
    plt.legend()

mpl.rcParams['lines.linewidth'] = 2
mpl.rcParams['figure.figsize'] = (7, 7)
mpl.rcParams['axes.titlesize'] = 12
mpl.rcParams['axes.labelsize'] = 12

(x_train, y_train), (x_test, y_test) = fmnist_utils.get_data()

Using Theano backend.


In [2]:
# https://github.com/MorvanZhou/PyTorch-Tutorial/blob/master/tutorial-contents-notebooks/401_CNN.ipynb

# Convolution layer


<img width=300 src="http://cs231n.github.io/assets/nn1/neural_net2.jpeg">
<img width=300 src="http://cs231n.github.io/assets/cnn/cnn.jpeg">

See animation at http://cs231n.github.io/convolutional-networks/, section "Convolution Demo".

To summarize, the Conv layer:

* Accepts a volume of size W1×H1×D1
* Requires four hyperparameters:
    - Number of filters K,
    - their spatial extent F,
    - the stride S,
    - the amount of zero padding P.
* Produces a volume of size W2×H2×D2 where:
    - W2=(W1−F+2P)/S+1
    - H2=(H1−F+2P)/S+1 (i.e. width and height are computed equally by symmetry)
    - D2=K
    
With parameter sharing, it introduces F⋅F⋅D1 weights per filter, for a total of (F⋅F⋅D1)⋅K weights and K biases.
In the output volume, the d-th depth slice (of size W2×H2) is the result of performing a valid convolution of the d-th filter over the input volume with a stride of S, and then offsetting it by the d-th bias.
A common setting of the hyperparameters is F=3, S=1, P=1. Common conventions and rules of thumb motivate these choices; see the "ConvNet Architectures" section of http://cs231n.github.io/convolutional-networks/.
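
The size and parameter-count formulas above can be checked with a few lines of plain Python. This is a minimal sketch: the helper names `conv_output_shape` and `conv_param_count` are hypothetical and not part of the lab code. It assumes floor division when (W1−F+2P) is not divisible by S, which is how PyTorch's `nn.Conv2d` rounds the output size.

```python
def conv_output_shape(w1, h1, f, s, p, k):
    """Return (W2, H2, D2) for a conv layer with K filters of size F.

    Assumption: non-divisible cases are rounded down (floor division).
    """
    w2 = (w1 - f + 2 * p) // s + 1
    h2 = (h1 - f + 2 * p) // s + 1
    return w2, h2, k


def conv_param_count(f, d1, k):
    """Return (number of weights, number of biases) with parameter sharing."""
    return f * f * d1 * k, k


# 28x28 single-channel input, 16 filters with F=3, S=1, P=1:
# the spatial size is preserved and only the depth changes.
print(conv_output_shape(28, 28, f=3, s=1, p=1, k=16))  # (28, 28, 16)
print(conv_param_count(f=3, d1=1, k=16))               # (144, 16)
```

With F=3, S=1, P=1 the width and height stay the same, which is one reason this setting is so common.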

# Whiteboard exercises

(Plus anything from the previous labs)

* (0.5) Explain the equations for W2, H2 and D2.
* (1.0) Compared to a dense layer with the same number of neurons, should we initialize the weights with a larger or a smaller variance? Explain the intuition behind the answer. Hint: consider the equation for a popular initialization scheme in DNNs, e.g. Glorot.
* (0.5) How does the output of a convolutional filter react to a small (e.g. 2px) shift of the whole image?
* (1.0) Are convolutional filters invariant to rotations of the input? If not, can you devise a simple strategy to encourage rotation invariance? Hint: think of an approach other than changing the architecture.

# Exercise: Implement forward pass of convolution layer

You cannot use convolutional primitives.

Hint: use an im2col-like approach (see http://cs231n.github.io/convolutional-networks/) and then apply a dense layer. Alternatively, just code everything as a nested loop. Both approaches are fine.
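
To illustrate the im2col idea from the hint, here is a toy NumPy sketch for a single 2D image and a single filter, with no padding, channels, batch or bias. The helper name `im2col_single` is hypothetical, and this is not the exercise solution: it only shows how sliding-window patches can be unrolled into rows so that the convolution becomes one matrix product. As in deep learning frameworks, the kernel is not flipped.

```python
import numpy as np


def im2col_single(image, f, stride):
    """Unroll every f x f patch of a 2D array into one row of a matrix."""
    h, w = image.shape
    out_h = (h - f) // stride + 1
    out_w = (w - f) // stride + 1
    cols = np.empty((out_h * out_w, f * f), dtype=image.dtype)
    row = 0
    for i in range(0, out_h * stride, stride):
        for j in range(0, out_w * stride, stride):
            # Flatten the patch whose top-left corner is (i, j)
            cols[row] = image[i:i + f, j:j + f].ravel()
            row += 1
    return cols, (out_h, out_w)


image = np.arange(16, dtype=np.float32).reshape(4, 4)
kernel = np.ones((2, 2), dtype=np.float32)

cols, (out_h, out_w) = im2col_single(image, f=2, stride=2)
# A single matrix-vector product replaces the sliding-window loop.
out = cols.dot(kernel.ravel()).reshape(out_h, out_w)
print(out)  # sums of the four 2x2 blocks: 10, 18, 42, 50
```

Extending this to the full layer means unrolling patches across all channels and stacking the K flattened filters into a weight matrix.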

In [4]:
def conv2d_forward(input, kernel, bias, padding, stride):
    """
    Params
    ------
    input: torch.FloatTensor, shape (n_examples, n_channels, width, height)
    kernel: torch.FloatTensor, shape (n_filters, n_channels, kernel_size, kernel_size)
    bias: torch.FloatTensor, shape (n_filters)
    padding: int
        Amount of zero padding added to each side of the input
    stride: int
        Step with which the filter moves over the input
    """
    # Dummy implementation sampling output with a correct shape
    N = input.shape[0]
    D = kernel.shape[0]
    W, H = input.shape[2], input.shape[3]
    F = kernel.shape[-1]
    S = stride
    P = padding
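    # Floor division keeps the output dimensions integral (true division yields
    # floats in Python 3) and matches how PyTorch rounds the output size
    out = np.random.uniform(size=(N, D, (W - F + 2 * P) // S + 1, (H - F + 2 * P) // S + 1))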
    return out

In [5]:
def pytorch_conv2d_foward(input, kernel, bias, padding, stride):
    # Reference implementation: forward the input through PyTorch's nn.Conv2d
    # with the given kernel and bias copied into the layer
    assert kernel.shape[-2] == kernel.shape[-1]
    kernel_size = kernel.shape[-1]
    n_filters = kernel.shape[0]
    n_channels = kernel.shape[1]
    m = nn.Conv2d(n_channels, n_filters, kernel_size=kernel_size, padding=padding, stride=stride)
    m.weight.data.copy_(kernel)
    m.bias.data.copy_(bias)
    output = m.forward(Variable(input))
    output = output.data.numpy()
    return output

# Tests

In [None]:
def test_conv2d(ex, w, b, P, S):
    out_student = conv2d_forward(ex, w, b, P, S)
    out = pytorch_conv2d_foward(ex, w, b, P, S)
    result = np.allclose(out, out_student, atol=1e-2)
    return result

In [16]:
results = {}

In [18]:
## Test 1
np.random.seed(777)
ex = x_train[0:40].view(40, 4, 14, 14)
w = torch.FloatTensor(np.random.uniform(size=(16, 4, 5, 5)))
b = torch.FloatTensor(np.random.uniform(size=(16,)))
results['1'] = test_conv2d(ex, w, b, 0, 1)

In [21]:
## Test 2
np.random.seed(778)
ex = x_train[0:40].view(40, 4, 14, 14)
w = torch.FloatTensor(np.random.uniform(size=(16, 4, 5, 5)))
b = torch.FloatTensor(np.random.uniform(size=(16,)))
results['2'] = test_conv2d(ex, w, b, 0, 4)

In [23]:
## Test 3
np.random.seed(779)
ex = x_train[0:40].view(40, 1, 28, 28)
w = torch.FloatTensor(np.random.uniform(size=(16, 1, 2, 2)))
b = torch.FloatTensor(np.random.uniform(size=(16,)))
results['3'] = test_conv2d(ex, w, b, 0, 4)

In [25]:
## Test 4
np.random.seed(780)
ex = x_train[0:40].view(40, 1, 28, 28)
w = torch.FloatTensor(np.random.uniform(size=(16, 1, 2, 2)))
b = torch.FloatTensor(np.random.uniform(size=(16,)))
results['4'] = test_conv2d(ex, w, b, 4, 4)

In [95]:
# Use a context manager so the file is flushed and closed
with open("9a_conv.json", "w") as f:
    json.dump(results, f)