# Convolutional neural network

## Edge detection

- ex. 6x6 image * (convolve) 3x3 filter = 4x4 image
- vertical edge detection: use filters such as

$\left(\begin{array}{ccc}
  {1} & 0 & {-1}\\
  1  & 0 & -1 \\
  {1} & 0 & {-1}
\end{array}\right)$

$\left(\begin{array}{ccc}
  {1} & 0 & {-1}\\
  2  & 0 & -2 \\
  {1} & 0 & {-1}
\end{array}\right)$ (Sobel filter)

$\left(\begin{array}{ccc}
  {3} & 0 & {-3}\\
  10  & 0 & -10 \\
  {3} & 0 & {-3}
\end{array}\right)$ (Scharr filter)

- horizontal edge detection: use filters such as

$\left(\begin{array}{ccc}
  {1} & 1 & {1}\\
  0  & 0 & 0 \\
  {-1} & -1 & {-1}
\end{array}\right)$

- $n$ x $n * f$ x $f = (n-f+1)$ x $(n-f+1)$
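
The size rule above can be checked with a minimal NumPy sketch. The helper name `conv2d_valid`, the half-bright test image, and the choice of the common $[1, 0, -1]$ vertical filter are illustrative assumptions, not part of the notes. Like most deep learning libraries, the sketch computes cross-correlation (the filter is not flipped).

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with no padding and stride 1 (hypothetical helper)."""
    n_h, n_w = image.shape
    f_h, f_w = kernel.shape
    out = np.zeros((n_h - f_h + 1, n_w - f_w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Element-wise product of the current window and the filter, then sum
            out[i, j] = np.sum(image[i:i + f_h, j:j + f_w] * kernel)
    return out

# Assumed test image: 6x6, bright left half (10) and dark right half (0)
image = np.hstack([np.full((6, 3), 10.0), np.zeros((6, 3))])

# Assumed vertical edge filter: positive left column, negative right column
vertical_filter = np.array([[1, 0, -1],
                            [1, 0, -1],
                            [1, 0, -1]], dtype=float)

edges = conv2d_valid(image, vertical_filter)
print(edges.shape)  # (4, 4), i.e. (n - f + 1) x (n - f + 1)
print(edges)        # every row is [0, 30, 30, 0]
```

The nonzero band in the middle of the output marks the vertical edge between the bright and dark halves.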

## Padding

- avoids shrinking output and throwing away information from edges
- $n$ x $n * f$ x $f = (n+2p-f+1)$ x $(n+2p-f+1)$
- "valid": no padding
- "same": pad so that output size is the same as the input size
    - $p = \dfrac{f-1}{2}$ ($f$ is usually odd)
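
As a small sketch of "same" padding, the snippet below zero-pads a 6x6 input for a 3x3 filter using `np.pad`. The helper name `same_padding` and the example sizes are assumptions chosen for illustration.

```python
import numpy as np

def same_padding(f):
    """Padding p that keeps output size equal to input size (odd f, stride 1)."""
    return (f - 1) // 2

n, f = 6, 3                      # assumed example sizes
p = same_padding(f)

image = np.arange(n * n, dtype=float).reshape(n, n)
# Zero-pad p pixels on every side of both axes
padded = np.pad(image, pad_width=p, mode="constant", constant_values=0)

out_size = n + 2 * p - f + 1     # output size after convolving the padded image
print(p, padded.shape, out_size) # 1 (8, 8) 6
```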

## Strided convolutions

- with stride $s$: $n$ x $n * f$ x $f = \lfloor\dfrac{n+2p-f}{s}+1\rfloor$ x $\lfloor\dfrac{n+2p-f}{s}+1\rfloor$
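
The output size for a general stride $s$, rounded down as in the per-layer formula later in these notes, can be sketched as a one-line function. The name `conv_output_size` and the example values are hypothetical.

```python
def conv_output_size(n, f, p=0, s=1):
    """floor((n + 2p - f) / s) + 1, the output size along one dimension."""
    return (n + 2 * p - f) // s + 1

print(conv_output_size(7, 3, p=0, s=2))  # 3
print(conv_output_size(6, 3, p=0, s=2))  # 2 (the last partial window is dropped)
print(conv_output_size(6, 3, p=1, s=1))  # 6 ("same" padding)
```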

## Convolution over volume

- 6x6x3 volume (height x width x number of channels) * 3x3x3 volume (height x width x number of channels) = 4x4 (27 numbers are multiplied and summed up 16 times to produce 4x4)
- number of channels must match between input and filter
- multiple filters
    - 6x6x3 * 3x3x3 (two of these: one for vertical edge and the other for horizontal edge) = 4x4x2
    - $n$ x $n$ x $n_{c} * f$ x $f$ x $n_{c} = (n-f+1)$ x $(n-f+1)$ x $n_{c}^{'}$ where $n_{c}^{'}$ = number of filters
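
A minimal sketch of convolving a volume with several filters, assuming stride 1, no padding, and random values as stand-ins for real image data and learned filters. The function name `conv_volume` and the `(f, f, n_c, n_filters)` filter layout are illustrative choices.

```python
import numpy as np

def conv_volume(volume, filters):
    """volume: (n, n, n_c); filters: (f, f, n_c, n_filters).

    Returns an array of shape (n - f + 1, n - f + 1, n_filters).
    """
    n_h, n_w, n_c = volume.shape
    f, _, f_c, n_filters = filters.shape
    assert n_c == f_c, "channel counts of input and filter must match"
    out = np.zeros((n_h - f + 1, n_w - f + 1, n_filters))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = volume[i:i + f, j:j + f, :]  # f x f x n_c window
            for k in range(n_filters):
                # f * f * n_c products summed into a single number
                out[i, j, k] = np.sum(patch * filters[:, :, :, k])
    return out

rng = np.random.default_rng(0)
volume = rng.standard_normal((6, 6, 3))      # 6x6x3 input
filters = rng.standard_normal((3, 3, 3, 2))  # two 3x3x3 filters
out = conv_volume(volume, filters)
print(out.shape)  # (4, 4, 2)
```

Each filter produces one 4x4 output channel, so stacking two filters gives the 4x4x2 result.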
    
## One layer of a convolutional network

- if layer $l$ is a convolutional layer
    - $f^{[l]}$ = filter size
    - $p^{[l]}$ = padding
    - $s^{[l]}$ = stride
    - $n_{c}^{[l]}$ = number of filters
    - input: $n_{H}^{[l-1]}$ x $n_{W}^{[l-1]}$ x $n_{c}^{[l-1]}$
    - output: $n_{H}^{[l]}$ x $n_{W}^{[l]}$ x $n_{c}^{[l]}$
    - $n^{[l]} = \lfloor\dfrac{n^{[l-1]}+2p^{[l]}-f^{[l]}}{s^{[l]}} + 1\rfloor$
    - each filter: $f^{[l]}$ x $f^{[l]}$ x $n_{c}^{[l-1]}$
    - activation: $a^{[l]}$ => $n_{H}^{[l]}$ x $n_{W}^{[l]}$ x $n_{c}^{[l]}$ or $A^{[l]}$ => $m$ x $n_{H}^{[l]}$ x $n_{W}^{[l]}$ x $n_{c}^{[l]}$
    - weights: $f^{[l]}$ x $f^{[l]}$ x $n_{c}^{[l-1]}$ x $n_{c}^{[l]}$
    - bias: $n_{c}^{[l]}$ (represented as $(1,1,1,n_{c}^{[l]})$)
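
The shape bookkeeping above can be collected in a small helper. This is only a sketch: the function name `conv_layer_shapes`, the dictionary keys, and the 39x39x3 example input with ten 3x3 filters are assumptions made for illustration.

```python
import math

def conv_layer_shapes(n_h_prev, n_w_prev, n_c_prev, f, p, s, n_c, m=1):
    """Shapes for one convolutional layer, following the notation above."""
    n_h = (n_h_prev + 2 * p - f) // s + 1
    n_w = (n_w_prev + 2 * p - f) // s + 1
    return {
        "filter": (f, f, n_c_prev),         # one filter
        "weights": (f, f, n_c_prev, n_c),   # all filters stacked
        "bias": (1, 1, 1, n_c),             # one bias per filter
        "a": (n_h, n_w, n_c),               # activation for one example
        "A": (m, n_h, n_w, n_c),            # activations for m examples
    }

shapes = conv_layer_shapes(39, 39, 3, f=3, p=0, s=1, n_c=10, m=32)
n_params = math.prod(shapes["weights"]) + math.prod(shapes["bias"])
print(shapes["a"])  # (37, 37, 10)
print(n_params)     # 280 = 3*3*3*10 weights + 10 biases
```

The parameter count depends only on the filter size and channel counts, not on the height and width of the input.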
    
## Pooling layers

- max pooling: take the maximum value in each region
- $f$: filter size
- $s$: stride
- no parameters to learn
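
A minimal single-channel max pooling sketch, assuming no padding. The function name `max_pool2d` and the 4x4 example values are hypothetical.

```python
import numpy as np

def max_pool2d(x, f=2, s=2):
    """Max pooling over f x f regions with stride s (single channel, no padding)."""
    n_h, n_w = x.shape
    out_h = (n_h - f) // s + 1
    out_w = (n_w - f) // s + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Largest value inside the current f x f window
            out[i, j] = x[i * s:i * s + f, j * s:j * s + f].max()
    return out

x = np.array([[1, 3, 2, 1],
              [2, 9, 1, 1],
              [1, 3, 2, 3],
              [5, 6, 1, 2]], dtype=float)

pooled = max_pool2d(x, f=2, s=2)
print(pooled)  # rows: [9, 2] and [6, 3]
```

The function has only the hyperparameters `f` and `s`, which matches the point that pooling has nothing to learn.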

### Packages

```python
import numpy as np
import h5py
import matplotlib.pyplot as plt

# IPython magic: render plots inline in the notebook
%matplotlib inline
plt.rcParams['figure.figsize'] = (5.0, 4.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'
```