# Convolutional Neural Networks

### Pablo Laiz Treceño

Convolutional Neural Networks (CNNs, or DCNNs for deep convolutional networks) are mostly used for image processing, but they can also be used with other kinds of input, such as audio. (A well-known example is WaveNet, an amazing paper from Google DeepMind.)

The traditional use of a CNN is to feed the network with images and have it classify them, for example into different objects or animals.

A convolution layer is a biologically inspired layer that performs, as its name implies, a convolution over the image.

Each convolution is performed by sliding a kernel over the image and, at each position, multiplying the kernel's elements by the pixels underneath them and summing the results.
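
The sliding operation described above can be sketched in plain Python. This is only an illustration under simple assumptions (no padding, stride 1, a single channel); the function name `convolve2d_valid`, the toy image and the hand-written kernel are all made up for this example. Like most deep learning libraries, the sketch does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (both lists of lists) without padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the kernel element-wise with the patch under it, then sum
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Toy 4x4 image with a vertical edge between the second and third columns
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-written kernel that responds to left-to-right increases in intensity
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d_valid(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The output is non-zero only where the kernel straddles the edge, which is the kind of boundary detector a convolution layer can learn on its own.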

The big difference from traditional convolution-based algorithms is that the kernel is no longer designed by hand: the neural network learns the kernel that produces the best output.

<img src="CNN.bmp">

Using convolutions, we obtain properties that fully connected layers do not give us:
- We keep the spatial properties of the images along the layers.
- We can find textures and boundaries in the images.

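But the main advantage is the reduction in the number of parameters that the network has to learn: because the same kernel is shared across every position, the parameter count of a layer depends on the kernel size (and on the number of input and output channels), not on the image size.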
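
To make the saving concrete, here is a small back-of-the-envelope sketch. The layer sizes are borrowed from the first layer of the MNIST network built later in this notebook (5x5 kernels, 1 input channel, 32 output channels, 28x28 images); the helper names `conv_params` and `dense_params` are hypothetical, and the fully connected layer is an imagined alternative that produces the same number of output values.

```python
def conv_params(kernel_h, kernel_w, in_channels, out_channels):
    # One kernel_h x kernel_w x in_channels kernel per output channel, plus one bias each
    return kernel_h * kernel_w * in_channels * out_channels + out_channels


def dense_params(n_inputs, n_outputs):
    # One weight per (input, output) pair, plus one bias per output
    return n_inputs * n_outputs + n_outputs


# Convolution layer: 5x5 kernels, 1 input channel, 32 feature maps
conv_count = conv_params(5, 5, 1, 32)

# Fully connected layer mapping a 28x28 image to 28x28x32 output values
dense_count = dense_params(28 * 28, 28 * 28 * 32)

print(conv_count)   # 832
print(dense_count)  # 19694080
```

The convolution layer needs 832 parameters regardless of the image size, while the fully connected alternative would need almost 20 million.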

When we work with huge images, we can insert **pooling layers** into the network.
This type of layer shrinks the image in a non-linear way, which decreases the number of parameters that the following layers need in order to train the network.

To apply pooling, we divide the image into small rectangles (normally 2x2) and compute the maximum, minimum, or mean of the pixels in each one.

<img src="pooling.bmp">
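
The pooling step can be sketched in a few lines of plain Python. This is a minimal illustration that assumes non-overlapping blocks and ignores any leftover rows or columns; the names `pool2d`, `reducer` and `mean` are hypothetical and exist only for this example.

```python
def pool2d(image, size=2, reducer=max):
    """Split `image` into non-overlapping size x size blocks and reduce each one."""
    pooled = []
    for i in range(0, len(image) - size + 1, size):
        row = []
        for j in range(0, len(image[0]) - size + 1, size):
            # Collect the pixels of the current block and reduce them to one value
            block = [image[i + di][j + dj] for di in range(size) for dj in range(size)]
            row.append(reducer(block))
        pooled.append(row)
    return pooled


def mean(values):
    return sum(values) / len(values)


image = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]

max_pooled = pool2d(image, reducer=max)
min_pooled = pool2d(image, reducer=min)
mean_pooled = pool2d(image, reducer=mean)

print(max_pooled)   # [[4, 2], [2, 8]]
print(min_pooled)   # [[1, 0], [0, 5]]
print(mean_pooled)  # [[2.5, 1.0], [0.75, 6.5]]
```

In each case the 4x4 image becomes 2x2, so the following layers have four times fewer values to process.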

The best-known architecture using convolutional and pooling (or subsampling) layers is LeNet, which classifies handwritten digits.

<img src="lenet.bmp">

In [None]:
import tensorflow as tf  # this notebook uses the TensorFlow 1.x graph API (placeholders, sessions)

In [None]:
# Load MNIST data with one-hot encoded labels
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

#### Build a Multilayer Convolutional Network

In [None]:
# Weight Initialization
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

In [None]:
# Convolution and Pooling
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

In [None]:
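# Graph

# Placeholders for the inputs (flattened 28x28 images) and the one-hot labels
x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])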

# 1st layer: 5x5 convolution, 1 input channel -> 32 feature maps, then 2x2 max pooling
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])

x_image = tf.reshape(x, [-1, 28, 28, 1])

h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# 2nd layer: 5x5 convolution, 32 -> 64 feature maps, then 2x2 max pooling
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# Fully connected layer: two 2x2 poolings reduce 28x28 to 7x7, with 64 channels
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# Dropout: keep_prob is fed at run time (0.5 for training, 1.0 for evaluation)
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# Output layer: 10 logits, one per digit class
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])

y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
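
The `7 * 7 * 64` in the fully connected layer follows from the layer shapes above. As a quick check, the sketch below traces the spatial size through the two convolution and pooling stages, relying on the fact that `padding='SAME'` yields `ceil(size / stride)` outputs per spatial dimension. The helper `same_output_size` is a hypothetical name used for illustration, not part of TensorFlow.

```python
import math


def same_output_size(size, stride):
    # With padding='SAME', the output has ceil(size / stride) entries per dimension
    return math.ceil(size / stride)


size = 28       # MNIST images are 28x28
trace = [size]
for _ in range(2):
    size = same_output_size(size, 1)  # conv2d with stride 1 keeps the size
    size = same_output_size(size, 2)  # max_pool_2x2 with stride 2 halves it
    trace.append(size)

flat_size = size * size * 64  # 64 feature maps after the second convolution

print(trace)      # [28, 14, 7]
print(flat_size)  # 3136
```

So each image reaches the fully connected layer as a vector of 3136 values, which is the first dimension of `W_fc1`.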

In [None]:
# Train and Evaluate the Model
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(20000):
        batch = mnist.train.next_batch(50)
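        if i % 100 == 0:
            # Report accuracy on the current batch with dropout disabled
            train_accuracy = accuracy.eval(feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
            print('step %d, training accuracy %g' % (i, train_accuracy))

        # One optimization step with dropout enabled (keep_prob=0.5)
        train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})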

    print('test accuracy %g' % accuracy.eval(feed_dict={ x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))

Example from **TensorFlow**.
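
For readers wondering what the loss in the training cell measures, the following plain-Python sketch shows the usual definition of softmax cross-entropy for a single example. It is an illustration of the formula only, not TensorFlow's implementation; the names `softmax` and `cross_entropy` and the example logits are made up here.

```python
import math


def softmax(logits):
    # Subtract the maximum before exponentiating for numerical stability
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(one_hot_label, logits):
    # Negative log-probability that the softmax assigns to the true class
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs))


logits = [2.0, 1.0, 0.1]  # raw scores for three hypothetical classes
label = [1, 0, 0]         # the first class is the correct one

loss_correct = cross_entropy(label, logits)
loss_wrong = cross_entropy([0, 0, 1], logits)
print(loss_correct < loss_wrong)  # True
```

The loss is small when the largest logit belongs to the correct class and grows as the network puts probability on the wrong classes, which is what the optimizer minimizes on average over each batch.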