# Convolutional Neural Networks with NumPy and TensorFlow

In this notebook, we compare implementations of CNN layers in NumPy (Cython) and TensorFlow.

We first implement the layers with NumPy, then implement them with TensorFlow for comparison.

In [17]:
import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# for auto-reloading external modules
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2

import sys
if '../common' not in sys.path:
    sys.path.insert(0, '../common')

from gradient_check import eval_numerical_gradient, rel_error

import tensorflow as tf
from time import time

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


# What are CNNs
Before implementing CNNs, let's recall their definition. CNNs are very similar to ordinary neural networks: they are made up of neurons that have learnable weights and biases.

The main difference is the introduction of the convolutional layer and the pooling layer. Let's look at a concrete example architecture:
<center>
[INPUT-CONV-RELU-MAX_POOL-FC]
</center>

In more detail:

* INPUT [HxWxD] holds the raw pixel values of the image, so D = 3 for an RGB image.
* CONV layer computes the output of neurons that are connected to local regions of the input; the exact formula is given below.
* RELU layer applies the ReLU activation, i.e. max(x, 0), element-wise.
* MAX_POOL layer performs a downsampling operation along the spatial dimensions.
* FC (fully-connected) layer computes the class scores.
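
As a small illustration of the RELU step in the list above, here is a minimal NumPy sketch. The function name `relu` is chosen here for illustration and is not part of the notebook's modules.

```python
import numpy as np

def relu(x):
    """Element-wise ReLU activation: max(x, 0)."""
    return np.maximum(x, 0)

x = np.array([[-2.0, 0.5], [3.0, -0.1]])
y = relu(x)
print(y)  # negative entries become 0, positive entries are unchanged
```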


## Convolutional Layer

The Conv layer takes the following arguments:
* a volume $[W\times H\times D]$
* a list of $K$ filters of size $[WW\times HH\times D]$
* a stride $S$
* a padding $P$

The computation takes 2 steps:
* first, the volume is padded with zeros to produce a new input of size $(W+2\times P)\times (H+2\times P)\times D$
* then, a window sliding over the padded input is multiplied element-wise with each filter and summed, and the filter's bias is added, which produces an output of size $[Wo \times Ho \times K]$

where
\begin{align*}
Wo &= (W - WW + 2P)/S + 1\\
Ho &= (H-HH+2P)/S + 1
\end{align*}
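
To make the formula concrete, the following sketch evaluates it along one axis. The helper name `conv_output_size` is hypothetical, and the use of floor division is an assumption for cases where the stride does not divide evenly (the formula above presumes it does). The second call uses the 31-wide input from the tests further down.

```python
def conv_output_size(in_size, filter_size, stride, pad):
    """Output size along one spatial axis: (in - filter + 2*pad) / stride + 1.

    Floor division is an assumption for the non-divisible case.
    """
    return (in_size - filter_size + 2 * pad) // stride + 1

small = conv_output_size(5, 3, stride=2, pad=1)
large = conv_output_size(31, 3, stride=2, pad=1)
print(small)  # 3
print(large)  # 16
```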

The following demo (taken from [CS231n](http://cs231n.stanford.edu)) illustrates the Conv layer for $W=H=5$ ($7$ after padding), $D=3$, $WW=HH=3$, $S=2$, $P=1$, $K=2$, which gives a $3\times 3\times 2$ output.

In [4]:
from IPython.display import IFrame
IFrame('./conv-demo/index.html', width=700, height=750)
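
The two steps listed above (zero padding, then a sliding window) can be sketched in plain NumPy as follows. This is only an illustrative sketch, not the notebook's `conv_fwd_naive`: the name `conv_fwd_sketch` is hypothetical, and the layouts (inputs as `(N, H, W, D)`, filters as `(HH, WW, D, K)`) are assumptions inferred from the TensorFlow cells later in the notebook.

```python
import numpy as np

def conv_fwd_sketch(X, W, b, stride, pad):
    """Naive convolution forward pass (hypothetical sketch).

    X: (N, H, W, D) inputs, W: (HH, WW, D, K) filters, b: (K,) biases.
    Returns an array of shape (N, Ho, Wo, K).
    """
    N, H, Wd, D = X.shape
    HH, WW, _, K = W.shape
    Ho = (H - HH + 2 * pad) // stride + 1
    Wo = (Wd - WW + 2 * pad) // stride + 1

    # Step 1: zero-pad the two spatial axes only.
    Xp = np.pad(X, ((0, 0), (pad, pad), (pad, pad), (0, 0)), mode='constant')

    out = np.zeros((N, Ho, Wo, K), dtype=np.result_type(X, W, b))
    for i in range(Ho):
        for j in range(Wo):
            # Step 2: cut out the window under the filter and contract it
            # with every filter over the (HH, WW, D) axes, then add the bias.
            r, c = i * stride, j * stride
            window = Xp[:, r:r + HH, c:c + WW, :]  # (N, HH, WW, D)
            out[:, i, j, :] = np.tensordot(window, W, axes=([1, 2, 3], [0, 1, 2])) + b
    return out

# All-ones input and filter: each output counts the non-padded cells in its window.
X_demo = np.ones((1, 5, 5, 1))
W_demo = np.ones((3, 3, 1, 1))
b_demo = np.zeros(1)
out_demo = conv_fwd_sketch(X_demo, W_demo, b_demo, stride=2, pad=1)
print(out_demo[0, :, :, 0])
```

With an all-ones input and filter, corner windows overlap 4 real cells, edge windows 6, and the centre window 9, so the printed map is a quick sanity check of the padding and stride handling.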

## Max Pooling Layer
The max pooling layer is a downsampling operation that takes the following arguments:
* a volume $[W_1\times H_1\times D_1]$
* a filter of size $[WW\times HH]$
* a stride $S$

It produces a volume of size $[W_2\times H_2\times D_2]$ where
\begin{align*}
W_2 &= (W_1 - WW)/S + 1\\
H_2 &= (H_1 - HH)/S + 1\\
D_2 &= D_1
\end{align*}
It uses a sliding window as in the Conv layer, but keeps only the maximum of each window, independently for each depth slice. We can look at an example (taken from CS231n):

<img src="maxpool.jpeg" alt="Max Pool" style="width: 500px;"/>
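
The pooling operation described above can be sketched in NumPy as follows. The name `maxpool_fwd_sketch` is hypothetical (it is not the notebook's `maxpool_fwd_naive`), the `(N, H, W, D)` layout is an assumption, and floor division is assumed when the window does not tile the input exactly.

```python
import numpy as np

def maxpool_fwd_sketch(X, pool_h, pool_w, stride):
    """Naive max-pool forward pass (hypothetical sketch).

    X: (N, H, W, D). Returns (N, Ho, Wo, D); the depth is unchanged.
    """
    N, H, W, D = X.shape
    Ho = (H - pool_h) // stride + 1
    Wo = (W - pool_w) // stride + 1

    out = np.zeros((N, Ho, Wo, D), dtype=X.dtype)
    for i in range(Ho):
        for j in range(Wo):
            r, c = i * stride, j * stride
            window = X[:, r:r + pool_h, c:c + pool_w, :]  # (N, pool_h, pool_w, D)
            # Keep the maximum over the two spatial axes of the window.
            out[:, i, j, :] = window.max(axis=(1, 2))
    return out

X_demo = np.arange(16, dtype=np.float32).reshape(1, 4, 4, 1)
pooled = maxpool_fwd_sketch(X_demo, 2, 2, 2)
print(pooled[0, :, :, 0])
# [[ 5.  7.]
#  [13. 15.]]
```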

# Implement CNNs

## Using NumPy
In this part, we look at how to implement the Conv layer and the Max-Pool layer. Note that we only implement a naive version with NumPy, since we mainly use it as a reference to test against TensorFlow.

In [2]:
from layers import conv_test_input, conv_fwd_naive

X, W, b = conv_test_input()

out = conv_fwd_naive(X, W, b, 2, 1)

print (out[0, :,:,0])
print (out[0, :,:,1])

[[ -1.   4.   4.]
 [  2.  -1.  12.]
 [  2.   0.  -1.]]
[[-8. -7. -7.]
 [-5. -9. -8.]
 [ 2.  5.  0.]]


We can see that **conv_fwd_naive** produces the same output as the animation above. For max-pool, we have:

In [3]:
from layers import maxpool_fwd_naive, maxpool_test_input

X = maxpool_test_input()
out = maxpool_fwd_naive(X, 2, 2, 2)

print (X[0, :, : , 0])
print (out[0, :, :, 0])

[[-0.30000001 -0.27789474 -0.25578949 -0.23368421]
 [-0.21157895 -0.18947369 -0.16736843 -0.14526317]
 [-0.1231579  -0.10105263 -0.07894737 -0.0568421 ]
 [-0.03473684 -0.01263158  0.00947368  0.03157895]]
[[-0.18947369 -0.14526317]
 [-0.01263158  0.03157895]]


Looking at the output, it is easy to verify that each entry is the maximum of the corresponding non-overlapping 2x2 block of the input. Now let's implement both layers with TensorFlow.

## Using TensorFlow
The implementation using TensorFlow is straightforward: we pad explicitly with [tf.pad](https://www.tensorflow.org/api_docs/python/tf/pad), then call [tf.nn.conv2d](https://www.tensorflow.org/api_docs/python/tf/nn/conv2d) with `'VALID'` padding and add the bias.

In [30]:
tf.reset_default_graph()

X, W, b = conv_test_input()
out = conv_fwd_naive(X, W, b, 2, 1)

vX = tf.placeholder(tf.float32, X.shape)
vW = tf.Variable(W, dtype = tf.float32)
vb = tf.Variable(b, dtype = tf.float32)

pad    = 1
stride = 2
paddings = [[0, 0], [pad, pad], [pad, pad], [0, 0]]

vXpad = tf.pad(vX, paddings, mode = 'CONSTANT', name = 'x_pad')
conv = tf.nn.conv2d(vXpad, vW, strides = [1, stride, stride, 1], padding = 'VALID') + vb


with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    out_tf = sess.run(conv, feed_dict = {vX : X})    
    print (out_tf[0, :, :, 0])
    print (out_tf[0, :, :, 1])
    
    print ('Rel-error numpy vs tf: {:e}'.format(rel_error(out, out_tf)))

[[ -1.   4.   4.]
 [  2.  -1.  12.]
 [  2.   0.  -1.]]
[[-8. -7. -7.]
 [-5. -9. -8.]
 [ 2.  5.  0.]]
Rel-error numpy vs tf: 0.000000e+00


For max-pool, we can use [tf.nn.max_pool](https://www.tensorflow.org/versions/master/api_docs/python/tf/nn/max_pool)

In [10]:
tf.reset_default_graph()

X = maxpool_test_input()
out = maxpool_fwd_naive(X, 2, 2, 2)

vX = tf.placeholder(tf.float32, X.shape)

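# 'VALID' (no padding) matches the pooling formula above; 'SAME' gives the
# same result here only because the 4x4 input is tiled exactly by 2x2 windows
maxpool = tf.nn.max_pool(vX, ksize = [1, 2, 2, 1], strides = [1, 2, 2, 1], padding='VALID')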

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    out_tf = sess.run(maxpool, feed_dict = {vX : X})    
    print (X[0, :, : , 0])
    print (out_tf[0, :, :, 0])
    
    print ('\nRel-error numpy vs tf: {:e}'.format(rel_error(out, out_tf)))

[[-0.30000001 -0.27789474 -0.25578949 -0.23368421]
 [-0.21157895 -0.18947369 -0.16736843 -0.14526317]
 [-0.1231579  -0.10105263 -0.07894737 -0.0568421 ]
 [-0.03473684 -0.01263158  0.00947368  0.03157895]]
[[-0.18947369 -0.14526317]
 [-0.01263158  0.03157895]]

Rel-error numpy vs tf: 0.000000e+00


## Testing

In this section, we run some tests on random inputs to verify that the NumPy implementation matches the TensorFlow one.


In [123]:
from tf_layers import conv_fwd_tf, maxpool_fwd_tf

X = np.random.randn(100, 31, 31, 3).astype(np.float32)
W = np.random.randn(3, 3, 3, 25).astype(np.float32)
b = np.random.randn(25).astype(np.float32)

stride = 2
pad = 1


out    = conv_fwd_naive(X, W, b, stride, pad)
out_tf = conv_fwd_tf(X, W, b, stride, pad)
print (out.shape)
print (out_tf.shape)
print ('Rel-error numpy vs tf: {:e}'.format(rel_error(out, out_tf)))
print ('Abs-error numpy vs tf: {:e}'.format(np.max(np.abs(out -out_tf))))

(100, 16, 16, 25)
(100, 16, 16, 25)
Rel-error numpy vs tf: 1.000000e+00
Abs-error numpy vs tf: 7.629395e-06
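
A relative error of 1.0 next to an absolute error of about 7.6e-06 looks contradictory at first. `rel_error` comes from a module that is not shown here; assuming it follows the common CS231n-style definition (maximum of `|x - y| / max(1e-8, |x| + |y|)`), the ratio reaches 1 whenever two compared entries have opposite signs, which can plausibly happen for outputs close to zero under float32 rounding. The sketch below, with the hypothetical name `rel_error_sketch`, reproduces that effect under this assumption.

```python
import numpy as np

def rel_error_sketch(x, y):
    """Maximum relative error, CS231n-style (assumed definition)."""
    return np.max(np.abs(x - y) / np.maximum(1e-8, np.abs(x) + np.abs(y)))

# Two arrays that agree except for a tiny entry with flipped sign.
a = np.array([1.0, 2.0, 1e-6])
c = np.array([1.0, 2.0, -1e-6])

rel = rel_error_sketch(a, c)
abs_err = np.max(np.abs(a - c))
print(rel)      # 1.0, driven entirely by the near-zero entry
print(abs_err)  # about 2e-06
```

Under this assumption, the absolute error is the more informative number for this test, and it indicates that the two implementations agree to float32 precision.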


In [121]:
X = np.random.randn(100, 31, 31, 3).astype(np.float32)
pH = 2
pW = 2
pS = 2

out    = maxpool_fwd_naive(X, pH, pW, pS)
out_tf = maxpool_fwd_tf(X, pH, pW, pS)
print (out.shape)
print (out_tf.shape)
print ('Rel-error numpy vs tf: {:e}'.format(rel_error(out, out_tf)))
print ('Abs-error numpy vs tf: {:e}'.format(np.max(np.abs(out -out_tf))))

(100, 15, 15, 3)
(100, 15, 15, 3)
Rel-error numpy vs tf: 0.000000e+00
Abs-error numpy vs tf: 0.000000e+00


# Conclusion

In this notebook, we covered two very frequently used layers: convolution and max-pool. We implemented both in NumPy and in TensorFlow and compared the outputs to make sure we understand them correctly.