<h1>Convolutional Neural Network on MNIST</h1>

<h2><b>What are we talking about?</b></h2>
<p>Convolutional networks were inspired by biological processes in that the connectivity pattern between neurons resembles the organization of the animal visual cortex. Individual cortical neurons respond to stimuli only in a restricted region of the visual field known as the receptive field. The receptive fields of different neurons partially overlap such that they cover the entire visual field.</p>
<p>They have applications in image and video recognition, recommender systems and natural language processing.</p>
<h2><b>Architecture of a CNN</b></h2>
<img src="img/cnn.png" alt="Architecture of a CNN" width="600" height="400" />

<h3><i>Convolutional layers</i></h3>
<p>Convolutional layers apply a convolution operation to the input, passing the result to the next layer.
The convolution emulates the response of an individual neuron to visual stimuli.</p>
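<p>As a rough illustration of the operation, here is a minimal NumPy sketch of a single-channel convolution without padding. The function name <code>conv2d_valid</code>, the toy image and the 2x2 filter are made up for this example; like most deep learning libraries, the sketch does not flip the kernel, so strictly speaking it computes a cross-correlation.</p>

```python
import numpy as np

def conv2d_valid(image, kernel, stride=1):
    """Slide `kernel` over `image` without padding and sum the elementwise products."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # The receptive field of output neuron (i, j)
            patch = image[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

# Hypothetical 4x4 "image" and 2x2 filter, chosen only for illustration
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])

feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (3, 3)
print(feature_map)        # every entry is -5.0 for this image and filter
```

<p>Every output value is computed with the same filter weights, which is why one filter yields one feature map.</p>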

<h3><i>Pooling layers</i></h3>
<p>Pooling layers combine the outputs of neuron clusters at one layer into a single neuron in the next layer.</p>
<p>Examples: <b>max pooling</b> (keep the largest value of each cluster) and <b>average pooling</b> (keep the mean of each cluster)</p>
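<p>Both variants can be sketched in a few lines of NumPy. The helper <code>pool2d</code> and the 4x4 feature map below are hypothetical and serve only to show how a 2x2 window with stride 2 halves each spatial dimension.</p>

```python
import numpy as np

def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Reduce each `size` x `size` window to one value (no padding)."""
    reduce_fn = np.max if mode == "max" else np.mean
    out_h = (feature_map.shape[0] - size) // stride + 1
    out_w = (feature_map.shape[1] - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = feature_map[i * stride:i * stride + size,
                                 j * stride:j * stride + size]
            out[i, j] = reduce_fn(window)
    return out

# Hypothetical feature map, chosen only for illustration
fmap = np.array([[1.0, 2.0, 5.0, 6.0],
                 [3.0, 4.0, 7.0, 8.0],
                 [9.0, 10.0, 13.0, 14.0],
                 [11.0, 12.0, 15.0, 16.0]])

max_pooled = pool2d(fmap, mode="max")
avg_pooled = pool2d(fmap, mode="avg")
print(max_pooled)  # [[ 4.  8.] [12. 16.]]
print(avg_pooled)  # [[ 2.5  6.5] [10.5 14.5]]
```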

<h2><b>References and further information</b></h2>
<ul>
    <li><a href="https://en.wikipedia.org/wiki/Convolutional_neural_network">Wikipedia: Convolutional neural network</a></li>
    <li><a href="https://deeplearning4j.org/convolutionalnetwork">Deeplearning4j: Convolutional network</a></li>
</ul>

In [7]:
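# Import dependencies
# Note: this notebook uses the TensorFlow 1.x graph API (tf.placeholder, tf.layers, tf.Session);
# the tensorflow.examples.tutorials module imported below is not available in TensorFlow 2.x.
import numpy as np
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data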

In [12]:
height = 28
width = 28
channels = 1
n_inputs = height * width

# Feature maps --> number of filters in the layer; each filter produces one feature map
# Kernel size --> height and width of each filter, i.e. of each receptive field
# Stride --> the distance between two consecutive receptive fields
# Padding --> "VALID" uses no zero padding; "SAME" adds zero padding where needed,
#             so that the output size is ceil(input size / stride)
conv1_fmaps = 32
conv1_ksize = 3
conv1_stride = 1
conv1_pad = "SAME"

conv2_fmaps = 64
conv2_ksize = 3
conv2_stride = 2
conv2_pad = "SAME"

pool3_fmaps = conv2_fmaps

n_fc1 = 64
n_outputs = 10

with tf.name_scope("inputs"):
    X = tf.placeholder(tf.float32, shape=[None, n_inputs], name="X")
    X_reshaped = tf.reshape(X, shape=[-1, height, width, channels])
    y = tf.placeholder(tf.int32, shape=[None], name="y")

# First convolutional layer
conv1 = tf.layers.conv2d(X_reshaped, filters=conv1_fmaps, kernel_size=conv1_ksize,
                         strides=conv1_stride, padding=conv1_pad,
                         activation=tf.nn.relu, name="conv1")

# Second convolutional layer
conv2 = tf.layers.conv2d(conv1, filters=conv2_fmaps, kernel_size=conv2_ksize,
                         strides=conv2_stride, padding=conv2_pad,
                         activation=tf.nn.relu, name="conv2")

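# Pooling: subsample the feature maps to reduce the computational load.
# Spatial size so far: 28x28 (input) -> 28x28 (conv1, stride 1) -> 14x14 (conv2, stride 2) -> 7x7 (pool3)
with tf.name_scope("pool3"):
    pool3 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="VALID")
    # Flatten each 7x7x64 volume into a vector for the fully connected layer
    pool3_flat = tf.reshape(pool3, shape=[-1, pool3_fmaps * 7 * 7])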

# Fully connected layer
with tf.name_scope("fc1"):
    fc1 = tf.layers.dense(pool3_flat, n_fc1, activation=tf.nn.relu, name="fc1")

with tf.name_scope("output"):
    logits = tf.layers.dense(fc1, n_outputs, name="output")
    Y_proba = tf.nn.softmax(logits, name="Y_proba")

with tf.name_scope("train"):
    xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=y)
    loss = tf.reduce_mean(xentropy)
    optimizer = tf.train.AdamOptimizer()
    training_op = optimizer.minimize(loss)

with tf.name_scope("eval"):
    correct = tf.nn.in_top_k(logits, y, 1)
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

with tf.name_scope("init_and_save"):
    init = tf.global_variables_initializer()
    saver = tf.train.Saver()
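
<p>The hard-coded <code>7 * 7</code> in the reshape follows from the strides and padding modes chosen above. A small sketch of that arithmetic in plain Python, assuming TensorFlow's usual output-size rules for "SAME" and "VALID" padding (the helper <code>out_size</code> is hypothetical, not a TensorFlow function):</p>

```python
import math

def out_size(n, ksize, stride, padding):
    """Output length along one spatial dimension for an input of length n."""
    if padding == "SAME":
        return math.ceil(n / stride)
    if padding == "VALID":
        return math.ceil((n - ksize + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

size_conv1 = out_size(28, ksize=3, stride=1, padding="SAME")           # 28
size_conv2 = out_size(size_conv1, ksize=3, stride=2, padding="SAME")   # 14
size_pool3 = out_size(size_conv2, ksize=2, stride=2, padding="VALID")  # 7

n_flat = 64 * size_pool3 * size_pool3  # 64 feature maps come out of conv2
print(size_conv1, size_conv2, size_pool3, n_flat)  # 28 14 7 3136
```

<p>If a stride, kernel size or padding mode is changed, the flattened size has to be recomputed in the same way.</p>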

In [8]:
mnist = input_data.read_data_sets("/tmp/data/")

Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz


In [13]:
n_epochs = 10
batch_size = 100

with tf.Session() as sess:
    init.run()
    for epoch in range(n_epochs):
        for iteration in range(mnist.train.num_examples // batch_size):
            X_batch, y_batch = mnist.train.next_batch(batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
        # Note: the train accuracy is measured on the last mini-batch of the epoch only
        acc_train = accuracy.eval(feed_dict={X: X_batch, y: y_batch})
        acc_test = accuracy.eval(feed_dict={X: mnist.test.images, y: mnist.test.labels})
        print(epoch, "Train accuracy:", acc_train, "Test accuracy:", acc_test)

        # Save a checkpoint at the end of every epoch
        save_path = saver.save(sess, "./my_mnist_model")

0 Train accuracy: 0.99 Test accuracy: 0.9802
1 Train accuracy: 1.0 Test accuracy: 0.9822
2 Train accuracy: 0.99 Test accuracy: 0.9868
3 Train accuracy: 0.99 Test accuracy: 0.9877
4 Train accuracy: 1.0 Test accuracy: 0.9899
5 Train accuracy: 1.0 Test accuracy: 0.9875
6 Train accuracy: 1.0 Test accuracy: 0.989
7 Train accuracy: 1.0 Test accuracy: 0.9885
8 Train accuracy: 1.0 Test accuracy: 0.9887
9 Train accuracy: 1.0 Test accuracy: 0.9811
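
<p>To make the printed numbers easier to interpret, here is a NumPy sketch of what the loss and accuracy nodes of the graph compute. It is a simplified stand-in for <code>tf.nn.sparse_softmax_cross_entropy_with_logits</code> and <code>tf.nn.in_top_k(logits, y, 1)</code>, not their actual implementation; the function names and the toy logits are made up, and tie handling is ignored. Since the train accuracy is evaluated on a batch of 100 images, it can only move in steps of 0.01.</p>

```python
import numpy as np

def sparse_softmax_cross_entropy(logits, labels):
    """Per-example cross-entropy between softmax(logits) and integer class labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]

def top1_accuracy(logits, labels):
    """Fraction of examples whose highest logit is at the true class."""
    return np.mean(np.argmax(logits, axis=1) == labels)

# Hypothetical logits for 3 examples and 3 classes
logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 0.1, 3.0],
                   [1.0, 1.5, 0.3]])
labels = np.array([0, 2, 0])

loss = sparse_softmax_cross_entropy(logits, labels).mean()
acc = top1_accuracy(logits, labels)
print(loss, acc)  # the third example is misclassified, so acc is 2/3
```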
