# CNN example with TensorFlow

In this example, we classify handwritten digits from the MNIST dataset with LeNet-5. The example is based on the official TensorFlow example.

## Procedures

This example takes the following steps:

1. Import packages
2. Prepare dataset
3. Prepare model and optimizer
4. Initialize parameters
5. Run training loop
6. Save model

## Code

### 1. Import packages

In [1]:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

### 2. Prepare dataset

In [2]:
mnist = input_data.read_data_sets('MNIST_Data/', one_hot=True)

Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting MNIST_Data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting MNIST_Data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting MNIST_Data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting MNIST_Data/t10k-labels-idx1-ubyte.gz


### 3. Prepare model and optimizer

In [3]:
# Allocate the tensors on GPU 0
with tf.device('/gpu:0'):
    # MNIST consists of 28x28 grayscale images,
    # each represented as a 784-dimensional vector.
    # None stands for the minibatch size, which is determined at execution time
    x = tf.placeholder(tf.float32, [None, 784])
    t = tf.placeholder(tf.float32, [None, 10])

    # -1 indicates the size of minibatches, determined on execution
    x_image = tf.reshape(x, [-1, 28, 28, 1])

    # layer 1: 5x5 convolution, tanh, 2x2 max pooling
    W1 = tf.Variable(tf.truncated_normal([5, 5, 1, 6], stddev=0.1))
    b1 = tf.Variable(tf.constant(0.1, shape=[6]))
    a1 = tf.nn.tanh(tf.nn.conv2d(x_image, W1, strides=[1, 1, 1, 1], padding='SAME') + b1)
    h1 = tf.nn.max_pool(a1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

    # layer 2: 5x5 convolution, tanh, 2x2 max pooling
    W2 = tf.Variable(tf.truncated_normal([5, 5, 6, 16], stddev=0.1))
    b2 = tf.Variable(tf.constant(0.1, shape=[16]))
    a2 = tf.nn.tanh(tf.nn.conv2d(h1, W2, strides=[1, 1, 1, 1], padding='VALID') + b2)
    h2 = tf.nn.max_pool(a2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    
    # layer 3: fully connected
    h2_flat = tf.reshape(h2, [-1, 5 * 5 * 16])
    W3 = tf.Variable(tf.truncated_normal([5 * 5 * 16, 120], stddev=0.1))
    b3 = tf.Variable(tf.constant(0.1, shape=[120]))
    h3 = tf.nn.tanh(tf.matmul(h2_flat, W3) + b3)
    
    # layer 4: fully connected
    W4 = tf.Variable(tf.truncated_normal([120, 84], stddev=0.1))
    b4 = tf.Variable(tf.constant(0.1, shape=[84]))
    h4 = tf.nn.tanh(tf.matmul(h3, W4) + b4)
    
    # layer 5: fully connected
    W5 = tf.Variable(tf.truncated_normal([84, 10], stddev=0.1))
    b5 = tf.Variable(tf.constant(0.1, shape=[10]))
    h5 = tf.matmul(h4, W5) + b5
    
    # Compute the loss value
    y = tf.nn.softmax(h5)
    cross_entropy = -tf.reduce_sum(t * tf.log(y))
    
    # Update formula
    train_step = tf.train.GradientDescentOptimizer(1e-3).minimize(cross_entropy)

# We can put operators on different devices just by
# using the 'with' statement.
with tf.device('/cpu:0'):
    # Compute accuracy for evaluation
    correct_or_not = tf.equal(tf.argmax(y, 1), tf.argmax(t, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_or_not, tf.float32))
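
To make the loss and accuracy nodes above more concrete, here is a minimal pure-Python sketch of the same arithmetic on a toy batch. The function names (`softmax`, `cross_entropy_sum`, `batch_accuracy`) and the toy numbers are assumptions made for illustration; they are not part of TensorFlow or of the original notebook.

```python
import math


def softmax(logits):
    # Subtract the maximum before exponentiating for numerical stability;
    # this does not change the resulting probabilities.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_sum(batch_logits, batch_onehot):
    # Mirrors -reduce_sum(t * log(y)): summed over classes and over the batch.
    loss = 0.0
    for logits, onehot in zip(batch_logits, batch_onehot):
        probs = softmax(logits)
        loss -= sum(t * math.log(p) for t, p in zip(onehot, probs))
    return loss


def batch_accuracy(batch_logits, batch_onehot):
    # Mirrors reduce_mean(cast(equal(argmax(y, 1), argmax(t, 1)))).
    hits = 0
    for logits, onehot in zip(batch_logits, batch_onehot):
        predicted = max(range(len(logits)), key=logits.__getitem__)
        actual = max(range(len(onehot)), key=onehot.__getitem__)
        hits += int(predicted == actual)
    return hits / len(batch_logits)


# Toy batch (hypothetical numbers): two samples, three classes.
toy_logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
toy_labels = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(cross_entropy_sum(toy_logits, toy_labels))
print(batch_accuracy(toy_logits, toy_labels))  # 0.5: first sample right, second wrong
```

Because the loss is a sum rather than a mean, its magnitude grows with the minibatch size. Note also that taking the logarithm of a probability that has underflowed to zero fails, which is one reason fused softmax cross-entropy operators are often preferred in practice.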

### How convolution and pooling layers handle margins in TensorFlow

Convolution and pooling layers sweep a kernel over the input tensor during forward propagation. The output size of a layer is determined by the input size, the kernel size, the stride (the distance the kernel moves at each step), and how the layer handles the margins of the input.

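There are two ways to handle the margins, selected by the ``padding`` argument. With ``VALID``, no padding is added, and trailing locations that do not fill a complete window are discarded. With ``SAME``, the input is padded (with zeros in the case of convolution) so that all locations are covered; the padding is split as evenly as possible between the two sides, and any extra element goes at the end.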

```
e.g. width = 9, kernel size = 4, stride = 2

* padding = 'VALID'
  1 2 3 4 5 6 7 8 9
  |-----|
      |-----|
          |-----|    

* padding = 'SAME'
0 1 2 3 4 5 6 7 8 9 0 0
|-----|
    |-----|
        |-----|
            |-----|
                |-----|
```
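
The diagram above can be checked with a few lines of arithmetic. The following is a sketch of the output-size and padding rules for one spatial dimension; `conv_output_size` is a hypothetical helper written for this explanation, not a TensorFlow function. It also traces the spatial size through the four convolution and pooling layers of the model defined in step 3, which shows where the `5 * 5 * 16` in the first fully connected layer comes from.

```python
def conv_output_size(width, kernel, stride, padding):
    """Return (output_size, pad_left, pad_right) along one dimension."""
    if padding == 'VALID':
        # Only windows that fit entirely inside the input are used.
        out = (width - kernel) // stride + 1
        return out, 0, 0
    if padding == 'SAME':
        # Ceiling division: one output per stride step across the input.
        out = -(-width // stride)
        total_pad = max((out - 1) * stride + kernel - width, 0)
        pad_left = total_pad // 2
        # Any odd leftover padding goes on the right.
        return out, pad_left, total_pad - pad_left
    raise ValueError('padding must be "VALID" or "SAME"')


# The example from the diagram: width = 9, kernel size = 4, stride = 2.
print(conv_output_size(9, 4, 2, 'VALID'))  # (3, 0, 0)
print(conv_output_size(9, 4, 2, 'SAME'))   # (5, 1, 2)

# Spatial size through the model in step 3, as (kernel, stride, padding).
layers = [
    (5, 1, 'SAME'),   # layer 1 convolution
    (2, 2, 'SAME'),   # layer 1 max pooling
    (5, 1, 'VALID'),  # layer 2 convolution
    (2, 2, 'SAME'),   # layer 2 max pooling
]
size = 28
sizes = []
for kernel, stride, padding in layers:
    size, _, _ = conv_output_size(size, kernel, stride, padding)
    sizes.append(size)
print(sizes)  # [28, 14, 10, 5]
```

The final 5x5 feature maps, with 16 channels each, are what `h2_flat` reshapes into vectors of length `5 * 5 * 16`.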

### 4. Initialize parameters

Once the computational graph is built, we can execute it with a ``Session``. ``Session.run`` takes _fetches_ as its first argument: the node, or list of nodes, whose values should be computed.

Before training starts, we have to initialize the parameters. This is done by running the initializer operator returned by ``tf.global_variables_initializer()``.

In [5]:
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

### 5. Run training loop

We can now enter the training loop.

In [6]:
accum_acc = 0
for i in range(10000):
    x_batch, t_batch = mnist.train.next_batch(100)
    _, acc = sess.run([train_step, accuracy], feed_dict={x: x_batch, t: t_batch})
    accum_acc += acc
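    if i % 500 == 0:
        # Note: at i == 0 only one batch has been accumulated, so dividing
        # by 500 understates the first reported train accuracy.
        print('step {} train accuracy: {}'.format(i, accum_acc / 500))
        accum_acc = 0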
        
        # Evaluate the model on the test dataset
        for _ in range(100):
            x_batch, t_batch = mnist.test.next_batch(100)
            accum_acc += sess.run(accuracy, feed_dict={x: x_batch, t: t_batch})
        print('step {} test accuracy: {}'.format(i, accum_acc / 100))
        accum_acc = 0

step 0 train accuracy: 0.00021999999880790712
step 0 test accuracy: 0.09260000005364417
step 500 train accuracy: 0.9054000027477741
step 500 test accuracy: 0.9637000048160553
step 1000 train accuracy: 0.9698000077009201
step 1000 test accuracy: 0.9772000122070312
step 1500 train accuracy: 0.9779400104284286
step 1500 test accuracy: 0.9819000118970871
step 2000 train accuracy: 0.9806600105762482
step 2000 test accuracy: 0.9829000109434127
step 2500 train accuracy: 0.98526001060009
step 2500 test accuracy: 0.9848000115156174
step 3000 train accuracy: 0.9872400100231171
step 3000 test accuracy: 0.9856000089645386
step 3500 train accuracy: 0.9889800088405609
step 3500 test accuracy: 0.9879000079631806
step 4000 train accuracy: 0.9905800082683563
step 4000 test accuracy: 0.9881000101566315
step 4500 train accuracy: 0.991140007853508
step 4500 test accuracy: 0.9873000109195709
step 5000 train accuracy: 0.9924200071096421
step 5000 test accuracy: 0.9883000087738038
step 5500 train accuracy: 0
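
The very first line of the log reports a train accuracy of about 0.00022 because, at step 0, `accum_acc` holds the accuracy of a single batch (roughly 0.11) but is still divided by 500. One way to avoid this is to divide by the number of values actually accumulated. The sketch below shows the idea in plain Python; `RunningMean`, `report_means`, and the accuracy values are hypothetical and only illustrate the bookkeeping.

```python
class RunningMean:
    """Accumulates values and reports their mean since the last reset."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def add(self, value):
        self.total += value
        self.count += 1

    def pop(self):
        # Divide by the number of values seen, not by a fixed interval.
        mean = self.total / self.count if self.count else float('nan')
        self.total = 0.0
        self.count = 0
        return mean


def report_means(accuracies, interval):
    """Return (step, mean accuracy since the previous report) pairs."""
    meter = RunningMean()
    reports = []
    for step, acc in enumerate(accuracies):
        meter.add(acc)
        if step % interval == 0:
            reports.append((step, meter.pop()))
    return reports


# Hypothetical per-batch accuracies: one poor first batch, then 500 good ones.
fake_accuracies = [0.11] + [0.9] * 500
reports = report_means(fake_accuracies, 500)
print(reports)
```

With this bookkeeping, the report at step 0 is the accuracy of the single batch seen so far, and later reports average over the 500 batches since the previous one.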

### 6. Save model

After training, we can save the model with ``Saver``. The value passed as ``global_step`` is appended to the checkpoint name.

In [7]:
saver = tf.train.Saver()
saver.save(sess, 'mnist_lenet5', global_step=10000)

'mnist_lenet5-10000'