# Learning Objectives

By the end of this lab, you will have

- Implemented Convolution, ReLU, and Max-Pooling layers
- Created a reusable convolutional block layer
- Verified the correctness of your implementations with gradient checking

Let's get started!

# Layer Interface

Recall that every layer you implement should conform to the following interface.

In [1]:
class Layer:
    def forward(self, inputs):
        raise NotImplementedError('Forward pass not implemented!')
        
    def backward(self, dout):
        raise NotImplementedError('Backward pass not implemented!')
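
As a minimal sketch of a layer that follows this interface, consider a hypothetical `Scale` layer (not part of the lab) that multiplies its input by a fixed constant `c`. The names `Scale` and `c` are made up for illustration; the layer caches nothing because its gradient does not depend on the input.

```python
import numpy as np

class Layer:
    def forward(self, inputs):
        raise NotImplementedError('Forward pass not implemented!')

    def backward(self, dout):
        raise NotImplementedError('Backward pass not implemented!')

class Scale(Layer):
    """Hypothetical example layer: out = c * inputs."""
    def __init__(self, c):
        self.c = c

    def forward(self, inputs):
        return self.c * inputs

    def backward(self, dout):
        # d(out)/d(inputs) = c, so the upstream gradient is scaled by c
        return self.c * dout

layer = Scale(3.0)
out = layer.forward(np.array([1.0, -2.0]))
dinputs = layer.backward(np.ones(2))
```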

# Max-Pooling Layer

Consider a 1D Max-Pooling layer described by the computational graph

![Simple Max Pooling Layer](images/Simple%20Max%20Pooling%20Layer.png)
where

$$
z = \max(\mathbf{h}).
$$

### Questions

- How many dimensions in $\nabla_h$ will be non-zero assuming there is a unique $h_i$ such that $h_i = \max(\mathbf{h})$?
- What if there are two values $h_i$ and $h_j$ such that $h_i = h_j = \max(\mathbf{h})$?

### Tasks

- Implement a 1D Max-Pooling layer

In [2]:
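import numpy as np

class MaxPool(Layer):
    def forward(self, h):
        z = np.max(h)
        # Cache the input and output for the backward pass
        self.cache = {'h': h, 'z': z}
        return z
    
    def backward(self, dz):
        h, z = self.cache['h'], self.cache['z']
        # Only entries equal to the max receive gradient (ties all receive dz)
        dh = np.zeros_like(h)
        dh[h == z] = dz
        return dh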
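
The implementation above sends the upstream gradient to every entry that ties for the maximum. A common alternative, sketched here as an assumption rather than the lab's required answer, routes the gradient to a single winner chosen by `np.argmax`, which returns the first maximal index. The function name `max_pool_backward_first` is hypothetical.

```python
import numpy as np

def max_pool_backward_first(h, dz):
    """Route dz to the first maximal entry of h; all other entries get 0."""
    dh = np.zeros_like(h, dtype=float)
    dh[np.argmax(h)] = dz
    return dh

h = np.array([1.0, 3.0, 3.0])          # two entries tie for the max
dh_first = max_pool_backward_first(h, 2.0)

# For comparison: the tie-sharing rule (every maximal entry receives dz)
dh_all = np.where(h == h.max(), 2.0, 0.0)
```

Both rules agree whenever the maximum is unique, so they differ only on inputs with exact ties.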

# ReLU Layer

Consider the ReLU layer described by the computational graph

![Simple ReLU Layer](images/Simple%20ReLU%20Layer.png)
where

$$
\mathbf{h}_{i} = 
\begin{cases} 
0 & \text{if } \mathbf{a}_{i} \leq 0 \\
\mathbf{a}_{i} & \text{otherwise}
\end{cases}.
$$

### Questions

- What will $\nabla_a$ be if $a_i < 0$ for all $i$?
- What if $a_i > 0$ for all $i$?

### Tasks

- Implement a 1D ReLU layer

In [3]:
import numpy as np

class ReLU(Layer):
    def forward(self, a):
        # Cache the input; the backward pass needs to know where a <= 0
        self.cache = {'a': a}
        h = np.maximum(a, 0)
        return h
    
    def backward(self, dh):
        a = self.cache['a']
        # Pass the gradient through where a > 0 and block it elsewhere
        da = dh * (a > 0)
        return da
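
To make the forward and backward behaviour concrete, here is a small standalone sketch using plain functions rather than the `Layer` class. The names `relu_forward` and `relu_backward` are illustrative, and this sketch takes the gradient at exactly $a_i = 0$ to be 0, which is a convention (any value in $[0, 1]$ is a valid subgradient there).

```python
import numpy as np

def relu_forward(a):
    # Element-wise max(a_i, 0)
    return np.maximum(a, 0)

def relu_backward(a, dh):
    # The upstream gradient passes through only where the input was positive
    return dh * (a > 0)

a = np.array([-1.0, 0.5, 2.0])
h = relu_forward(a)
da = relu_backward(a, np.array([10.0, 20.0, 30.0]))
```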

# Convolutional Layer

Consider a 1D convolution layer with a single filter $w$ described by the computational graph

![Simple Conv1D Layer](images/Simple%20Conv1D%20Layer.png)
where $a_i = w \cdot x_i$. Note that since $w \in \mathbb{R}$ is a scalar, we are performing a 1x1 convolution. Further assume a stride of 1.

### Questions

- How many elements in $\mathbf{a}$ does $x_i$ influence?
- How many elements in $\mathbf{a}$ does $w$ influence?

### Tasks

- Implement a 1D convolutional layer

In [4]:
class Conv1D(Layer):
    def forward(self, x, w):
        a = x * w
        self.cache = locals()
        return a
    
    def backward(self, da):
        x, w = self.cache['x'], self.cache['w']
        dx, dw = w*da, np.sum(x*da)
        return dx, dw
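
The backward pass follows from the chain rule. Writing $L$ for the downstream loss (a symbol introduced here for illustration) and $\frac{\partial L}{\partial a_i}$ for the entries of the upstream gradient `da`, each $x_i$ affects only $a_i$, while $w$ affects every $a_i$:

$$
\frac{\partial L}{\partial x_i} = \frac{\partial L}{\partial a_i} \frac{\partial a_i}{\partial x_i} = w \, \frac{\partial L}{\partial a_i},
\qquad
\frac{\partial L}{\partial w} = \sum_{i=1}^{N} \frac{\partial L}{\partial a_i} \frac{\partial a_i}{\partial w} = \sum_{i=1}^{N} x_i \, \frac{\partial L}{\partial a_i},
$$

where $N$ is the length of $\mathbf{x}$. These two expressions correspond to `w*da` and `np.sum(x*da)` in the code.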

# Convolutional Block

Consider a convolutional block layer described by the computational graph

$$
\underset{w \in \mathbb{R}}{\overset{\mathbf{x} \in \mathbb{R}^N}{\longrightarrow}}
\text{Conv}
\longrightarrow
\text{ReLU}
\longrightarrow
\text{Max Pool}
\overset{h \in \mathbb{R}}{\longrightarrow}
$$

### Tasks

- Implement a convolutional block layer in terms of Convolutional, ReLU, and Max Pool layers

In [5]:
class ConvBlock(Layer):
    def __init__(self):
        self.conv, self.relu, self.max_pool = Conv1D(), ReLU(), MaxPool()
        
    def forward(self, x, w):
        a = self.conv.forward(x, w)
        h = self.relu.forward(a)
        z = self.max_pool.forward(h)
        return z
    
    def backward(self, dz):
        dh = self.max_pool.backward(dz)
        da = self.relu.backward(dh)
        dx, dw = self.conv.backward(da)
        return dx, dw

## Check Your Implementation

An indispensable tool for checking your backpropagation code is *gradient checking*. Gradient checking works by

1. Running your backward pass to compute the gradients
2. Approximating the gradients with finite differences
3. Comparing the two results, passing if they are close and failing otherwise
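
The three steps can be sketched as follows. This is a minimal illustration using central differences, not the implementation of `lib.checking.gradient_check`, whose internals are not shown in this lab. The helper `numeric_gradient`, the step size `eps`, and the tolerance `1e-6` are all assumptions. The function being checked is a functional version of the convolutional block, $z = \max(\mathrm{ReLU}(w \mathbf{x}))$.

```python
import numpy as np

def block_forward(x, w):
    """z = max(relu(w * x)) for a vector x and scalar w."""
    return np.max(np.maximum(x * w, 0))

def block_backward(x, w, dz=1.0):
    """Analytic gradients of z with respect to x and w."""
    a = x * w
    h = np.maximum(a, 0)
    dh = np.where(h == h.max(), dz, 0.0)   # max-pool: gradient goes to the max
    da = dh * (a > 0)                      # ReLU: blocked where a <= 0
    return w * da, np.sum(x * da)          # conv: dx, dw

def numeric_gradient(f, x, eps=1e-6):
    """Central-difference approximation of df/dx for a 1D float array x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
x, w = rng.standard_normal(5), 1.5

# Step 1: analytic gradients from the backward pass
dx, dw = block_backward(x, w)
# Step 2: finite-difference approximations
dx_num = numeric_gradient(lambda v: block_forward(v, w), x)
dw_num = numeric_gradient(lambda v: block_forward(x, v[0]), np.array([w]))[0]
# Step 3: compare the two
passed = np.allclose(dx, dx_num, atol=1e-6) and np.isclose(dw, dw_num, atol=1e-6)
```

Central differences are used here because their truncation error shrinks quadratically with `eps`, whereas a one-sided difference shrinks only linearly. The check can still fail spuriously if an input lands within `eps` of a kink, such as a ReLU input near zero or two nearly tied maxima.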

### Tasks

- Run the following code cell to gradient check your convolutional block

### Explanation

- The code creates a vector $x$ of five random numbers and a random scalar filter $w$. It approximates $\nabla_x$ and $\nabla_w$ with finite differences and compares them against the values of $\nabla_x$ and $\nabla_w$ that your `ConvBlock.backward()` method returns.

In [6]:
import numpy as np
from lib.checking import gradient_check

x, w = np.random.randn(5), np.random.randn()
params = [x, w]

conv_block = ConvBlock()
gradient_check(conv_block.forward, conv_block.backward, *params)

Gradient check on param #0 PASSED with frobenius norm difference 2.6554869414496807e-12!
Gradient check on param #1 PASSED with frobenius norm difference 7.949196856316121e-14!


## Bonus Activities

- Implement a 1D convolution layer with support for multiple scalar filters
- Implement a 1D convolution layer with one filter which is a vector
- Implement a 1D convolution layer with a set of filters which are vectors
- Implement a 2D convolution layer
- Implement a max-pooling layer which supports local maxes (pooling over windows rather than over the whole input)
- Implement a 2D max-pooling layer
- Generalize your code to support minibatches
- Implement a simple trainer class with an SGD optimizer and optimize a CNN on MNIST