# Convolutional Neural Networks with NumPy and TensorFlow

In this notebook, we look at the implementation of CNNs using NumPy (Cython) vs. TensorFlow.

First we will implement a CNN with NumPy, then implement it with TensorFlow for comparison.

In [1]:
import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# for auto-reloading external modules
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2

import sys
if '../common' not in sys.path:
    sys.path.insert(0, '../common')

from gradient_check import eval_numerical_gradient, rel_error

# What are CNNs
Before implementing CNNs, let's recall their definition. CNNs are very similar to ordinary neural networks: they are made up of neurons that have learnable weights and biases.

The main difference is the introduction of the Convolutional Layer and the Pooling Layer. Let's look at a concrete example architecture:
<center>
[INPUT-CONV-RELU-MAX_POOL-FC]
</center>

In more detail:

* INPUT [HxWxD] holds the raw pixel values of the image, so D = 3 for an RGB image.
* CONV layer computes the output of neurons that are connected to local regions in the input; we will formulate the exact computation later.
* RELU layer applies the ReLU activation element-wise, i.e. max(x, 0).
* MAX_POOL layer performs a downsampling operation along the spatial dimensions.
* FC (fully-connected) layer computes the class scores.
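
As a small illustration of the RELU step, here is a minimal NumPy sketch. The function name `relu` is chosen here for illustration and is not taken from the notebook's own modules.

```python
import numpy as np

def relu(x):
    # Element-wise max(x, 0); the output has the same shape as the input.
    return np.maximum(x, 0)

x = np.array([[-2.0, 0.5],
              [3.0, -0.1]])
y = relu(x)
print(y)  # negative entries become 0, positive entries pass through unchanged
```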


## Convolutional Layer

The Conv Layer takes the following arguments:
* a volume $[W\times H\times D]$
* a list of $K$ filters, each of size $[WW\times HH\times D]$, with one bias per filter
* a stride $S$
* a padding $P$

The computation takes two steps:
* first, it pads the volume with zeros to produce a new input of size $[(W+2P)\times (H+2P)\times D]$
* then, it slides a window over the padded input; at each position the window is multiplied element-wise with each filter, the products are summed, and the filter's bias is added, which produces an output of size $[Wo \times Ho \times K]$

where
\begin{align*}
Wo &= (W - WW + 2P)/S + 1\\
Ho &= (H-HH+2P)/S + 1
\end{align*}
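
The formula can be checked with a few lines of Python. This is a sketch: the helper name `conv_output_size` is hypothetical, and treating a non-integer result as an error is one possible convention (some libraries round down instead).

```python
def conv_output_size(in_size, filter_size, stride, pad):
    """Output length along one spatial axis: (in - filter + 2*pad) / stride + 1."""
    span = in_size - filter_size + 2 * pad
    if span < 0 or span % stride != 0:
        # Assumption: reject settings where the filter does not tile the input evenly.
        raise ValueError("filter, stride and padding do not tile the input evenly")
    return span // stride + 1

print(conv_output_size(5, 3, stride=2, pad=1))   # 3
print(conv_output_size(32, 5, stride=1, pad=2))  # 32, this padding preserves the size
```

The same helper with `pad=0` gives the output size of a pooling layer.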

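The following demo (taken from [CS231n](http://cs231n.stanford.edu)) illustrates the Conv Layer for $W=H=5$ (so $7\times 7$ after padding), $D=3$, $WW=HH=3$, $S=2$, $P=1$, $K=2$, which gives a $3\times 3\times 2$ output.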

In [4]:
from IPython.display import IFrame
IFrame('./conv-demo/index.html', width=700, height=750)

## Max Pooling Layer
The max pooling layer is a downsampling operation that takes the following arguments:
* a volume $[W_1\times H_1\times D_1]$
* a filter of size $[WW\times HH]$
* a stride $S$

It produces a volume of size $[W_2\times H_2\times D_2]$ where
\begin{align*}
W_2 &= (W_1 - WW)/S + 1\\
H_2 &= (H_1 - HH)/S + 1\\
D_2 &= D_1
\end{align*}
It uses sliding windows as in the Conv Layer, but applies them to each depth slice independently and keeps only the max of each window. We can look at an example (taken from CS231n):

<img src="maxpool.jpeg" alt="Max Pool" style="width: 500px;"/>
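
A naive forward pass for max pooling can be sketched in NumPy as follows. The function name `max_pool_fwd_sketch` and the `(H, W, D)` array layout are assumptions made for this illustration; the sample values are just a small 4x4 input chosen for the demo.

```python
import numpy as np

def max_pool_fwd_sketch(x, pool_h, pool_w, stride):
    """Max pooling over a single volume x of shape (H, W, D)."""
    H, W, D = x.shape
    out_h = (H - pool_h) // stride + 1
    out_w = (W - pool_w) // stride + 1
    out = np.zeros((out_h, out_w, D), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            window = x[r:r + pool_h, c:c + pool_w, :]
            # Take the max over the two spatial axes, separately for each depth slice.
            out[i, j, :] = window.max(axis=(0, 1))
    return out

x = np.array([[1, 1, 2, 4],
              [5, 6, 7, 8],
              [3, 2, 1, 0],
              [1, 2, 3, 4]], dtype=float)[:, :, np.newaxis]  # shape (4, 4, 1)

pooled = max_pool_fwd_sketch(x, pool_h=2, pool_w=2, stride=2)
print(pooled[:, :, 0])  # 2x2 result: [[6, 8], [3, 4]]
```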

# Implement CNNs

## Using Numpy
In this part, we look at how to implement the Conv layer and the Max-Pool layer.

In [15]:
from layers import test_input, conv_fwd_naive

X, W, b = test_input()

out = conv_fwd_naive(X, W, b, 2, 1)

print(out[0, :, :, 0])
print(out[0, :, :, 1])

[[ -1.   4.   4.]
 [  2.  -1.  12.]
 [  2.   0.  -1.]]
[[-8. -7. -7.]
 [-5. -9. -8.]
 [ 2.  5.  0.]]
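
The `layers` module itself is not shown here. As a rough idea of what a naive forward pass could look like, here is a minimal sketch. It is not the notebook's `conv_fwd_naive`: the name `conv_fwd_sketch`, the `(N, H, W, D)` input layout (suggested by the `out[0, :, :, 0]` indexing above), the `(K, HH, WW, D)` filter layout, and the `(stride, pad)` argument order are all assumptions.

```python
import numpy as np

def conv_fwd_sketch(x, w, b, stride, pad):
    """Naive convolution forward pass (illustrative sketch).

    x: inputs,  shape (N, H, W, D)
    w: filters, shape (K, HH, WW, D)
    b: biases,  shape (K,)
    returns:    shape (N, Ho, Wo, K)
    """
    N, H, W, D = x.shape
    K, HH, WW, _ = w.shape
    Ho = (H - HH + 2 * pad) // stride + 1
    Wo = (W - WW + 2 * pad) // stride + 1
    # Step 1: zero-pad the two spatial axes only.
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad), (0, 0)), mode="constant")
    out = np.zeros((N, Ho, Wo, K))
    # Step 2: slide a window over the padded input and apply every filter.
    for n in range(N):
        for i in range(Ho):
            for j in range(Wo):
                r, c = i * stride, j * stride
                patch = xp[n, r:r + HH, c:c + WW, :]
                for k in range(K):
                    out[n, i, j, k] = np.sum(patch * w[k]) + b[k]
    return out

# All-ones input and filters make the result easy to verify by hand.
x = np.ones((1, 5, 5, 3))
w = np.ones((2, 3, 3, 3))
b = np.array([0.0, 1.0])
out_sketch = conv_fwd_sketch(x, w, b, stride=2, pad=1)
print(out_sketch.shape)        # (1, 3, 3, 2)
print(out_sketch[0, :, :, 0])  # 27 at the centre, smaller where the window overlaps the padding
```

The four nested loops make the sliding-window logic explicit at the cost of speed, which is the usual trade-off of a naive implementation.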
