# Digit Recogniser using Convolutional Neural Network


In this workshop we will develop and train a deep neural network, called a convolutional neural network (CNN), to recognise handwritten digits. We'll be using the MNIST ("Modified National Institute of Standards and Technology") dataset, often described as the "Hello World" dataset of computer vision and deep learning. It contains a large set of handwritten digit images, each annotated with the digit it shows.

By the end of the tutorial you will have learnt how convolutional neural networks work, how to deploy a pre-trained model, and how to visualise the results. We'll build this neural net using TensorFlow and deploy the pre-trained model as a REST API.

## Libraries and Verify TensorFlow is Installed

We'll start off by importing the libraries required for this project. The deep learning framework we are using is TensorFlow, so this cell also checks that a suitable version of TensorFlow is installed.

In [None]:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.examples.tutorials.mnist import input_data
from distutils.version import LooseVersion
import warnings
import time

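# Check TensorFlow Version (this notebook uses TensorFlow 1.x APIs such as tf.Session and tf.reset_default_graph)
assert LooseVersion('1.0') <= LooseVersion(tf.__version__) < LooseVersion('2.0'), 'Please use a TensorFlow 1.x version (1.0 or newer, below 2.0)'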
print('TensorFlow Version: {}'.format(tf.__version__))

%matplotlib inline

## MNIST Data Set

The MNIST data set is a collection of handwritten digit images comprising 60,000 training images and 10,000 test images. It is commonly used as a benchmark for new computer vision and pattern recognition algorithms. The images have been size-normalised and centred in a fixed-size image.

TensorFlow has a module that downloads this dataset, splits it into a training set, a validation set and a test set, and creates one-hot vectors for the label of each image. The training set contains 55,000 images, the validation set has 5,000 (held out from the original 60,000 training images) and the test set has 10,000 images.

**So why is it important to separate data into 3 sets?**

Each image is a 2D array of 28x28 pixel values that has been flattened into a vector of 784 values, so every image is a point in a 784-dimensional vector space.
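
To make the one-hot labels and the flattening concrete, here is a minimal NumPy sketch. It does not use the MNIST loader: `one_hot` is a hypothetical helper and `image_2d` is a made-up array that stands in for a real digit image.

```python
import numpy as np

def one_hot(label, num_classes=10):
    """Return a vector of zeros with a single 1.0 at index `label`."""
    vec = np.zeros(num_classes, dtype=np.float32)
    vec[label] = 1.0
    return vec

# A made-up 28x28 "image" holding the values 0..783, used only to show the reshaping.
image_2d = np.arange(28 * 28, dtype=np.float32).reshape(28, 28)
flat = image_2d.reshape(-1)        # shape (784,): the rows laid end to end
restored = flat.reshape(28, 28)    # back to the original 2D grid

print(one_hot(3))
print(flat.shape, restored.shape)
```

Reshaping only changes how the values are indexed, which is why the plotting cell further down can call `reshape([28,28])` on a flattened image to display it.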

In [None]:
mnist = input_data.read_data_sets('MNIST_data/', one_hot=True)

In [None]:
print("No. of Training Examples : ",mnist.train.num_examples)
print("No. of Validation Examples : ",mnist.validation.num_examples)
print("No. of Test Examples : ",mnist.test.num_examples)
print("Example of a one-hot encoded vector : \n",mnist.train.labels[0])
print("Flattened Image Shape : ", mnist.train.images[0].shape)
print("Example of a flattened image: \n", mnist.train.images[0])

In [None]:
img_array = mnist.train.images[np.random.randint(mnist.train.images.shape[0])]
img_array = 255 * img_array
img_array = img_array.astype("uint8")
#print(img_array.reshape([28,28]))
plt.imshow(img_array.reshape([28,28]))
plt.gray()
plt.grid(True)
plt.savefig('test.png')

## Build our network

You'll build the components listed below along with us to design the convolutional neural network.
- `input_placeholders`
- `init_weights_bias`
- `conv2d`
- `max_pool`
- `flatten`
- `fcn`
- `outputs`
- `cnn`

## Inputs

In [None]:
def input_placeholders():
    """
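    Create TF placeholders for inputs, targets, keep_prob, learning_rate.
    Name them "Inputs", "Labels", "keep_prob" and "learning_rate" so that the
    prediction cell can look them up by name in the restored graph.
    Return: A Tuple(inputs, targets, keep_prob, learning_rate)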
    """
    
    return

## Initialise Weights and Bias

In [None]:
def init_weights_bias(w_shape, b_shape):
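    """
    Create and initialise the weights and bias variables for a layer.
    w_shape: shape of the weights tensor
    b_shape: shape of the bias tensor
    """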

    return

## Convolutional Layer


In [None]:
def conv2d(inputs, weights, stride):
    """
    Apply convolution operation to inputs tensor.
    inputs: tensorflow tensor
    weights: weights initialised for convolution operation
    stride: 2d tuple for convolution
    """
    
    return 
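
As a rough illustration of what a convolution computes, the sketch below slides a kernel over a single-channel image in plain NumPy. It is not the TensorFlow implementation you will write in `conv2d`: the helper name `conv2d_single` is hypothetical, it handles one image with one channel, and it assumes no padding, whereas a TensorFlow convolution works on a 4D batch and may pad the input.

```python
import numpy as np

def conv2d_single(image, kernel, stride=(1, 1)):
    """Slide `kernel` over `image` without padding and sum the element-wise products.

    Like most deep learning libraries, this does not flip the kernel
    (strictly speaking it is a cross-correlation).
    """
    kh, kw = kernel.shape
    sh, sw = stride
    out_h = (image.shape[0] - kh) // sh + 1
    out_w = (image.shape[1] - kw) // sw + 1
    out = np.zeros((out_h, out_w), dtype=np.float32)
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i * sh:i * sh + kh, j * sw:j * sw + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

example_image = np.arange(16, dtype=np.float32).reshape(4, 4)
example_kernel = np.ones((2, 2), dtype=np.float32)
conv_result = conv2d_single(example_image, example_kernel, stride=(2, 2))
print(conv_result)
```

With a 2x2 kernel of ones and a stride of 2, each output value is the sum of one non-overlapping 2x2 patch, so the 4x4 input shrinks to 2x2.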

## Max Pooling Layer

In [None]:
def max_pool(inputs, kernel_size, stride):
    """
    Apply max pooling operation to inputs tensor.
    inputs: tensorflow tensor
    kernel_size: 2d tuple for pooling
    stride: 2d tuple for pooling
    """
    
    return 
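
Max pooling can be sketched in the same way. The NumPy example below is only an illustration under simplifying assumptions (one single-channel feature map, no padding); `max_pool_single` is a hypothetical helper, not part of TensorFlow.

```python
import numpy as np

def max_pool_single(feature_map, kernel_size=(2, 2), stride=(2, 2)):
    """Keep the largest value inside each pooling window."""
    kh, kw = kernel_size
    sh, sw = stride
    out_h = (feature_map.shape[0] - kh) // sh + 1
    out_w = (feature_map.shape[1] - kw) // sw + 1
    out = np.zeros((out_h, out_w), dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            window = feature_map[i * sh:i * sh + kh, j * sw:j * sw + kw]
            out[i, j] = window.max()
    return out

example_map = np.array([[1, 3, 2, 0],
                        [4, 6, 5, 1],
                        [7, 2, 9, 8],
                        [0, 1, 3, 4]], dtype=np.float32)
pooled = max_pool_single(example_map)
print(pooled)
```

A 2x2 window with a stride of 2 halves the height and width while keeping the strongest activation in each region.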

## Flatten Layer

In [None]:
def flatten(inputs):
    """
    Flatten inputs tensor to (batch_size, flattened_image_size).
    """
    
    return 
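
Flattening is a reshape that keeps the batch dimension and merges all the others. A minimal NumPy sketch follows; the feature map shape `(7, 7, 64)` is a made-up example, not a value fixed by this notebook, and `flatten_batch` is a hypothetical helper name.

```python
import numpy as np

def flatten_batch(batch):
    """Reshape (batch_size, height, width, channels) to (batch_size, height * width * channels)."""
    batch_size = batch.shape[0]
    return batch.reshape(batch_size, -1)

# Hypothetical output of the last pooling layer for a batch of 16 images.
example_feature_maps = np.zeros((16, 7, 7, 64), dtype=np.float32)
flat_batch = flatten_batch(example_feature_maps)
print(flat_batch.shape)
```

The second dimension of the result, here 7 * 7 * 64 = 3136, is the number of input features the first fully connected layer receives.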

## Fully Connected Layer

In [None]:
def fcn(inputs, num_outputs):
    """
    Apply a fully connected layer to inputs tensor using weights and bias.
    inputs: tensorflow tensor
    num_outputs: number of output units of the layer.
    """
    return
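
A fully connected layer is a matrix multiplication plus a bias, usually followed by a non-linearity. The NumPy sketch below assumes a ReLU activation, which is a common choice but is not prescribed by this notebook; `dense_relu` and the example values are hypothetical.

```python
import numpy as np

def dense_relu(x, w, b):
    """Compute relu(x @ w + b) for a batch of row vectors x."""
    return np.maximum(x @ w + b, 0.0)

x = np.array([[1.0, -2.0, 3.0]])   # one example with 3 features
w = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])         # 3 inputs -> 2 outputs
b = np.array([0.5, -2.0])
dense_out = dense_relu(x, w, b)
print(dense_out)
```

The weights have shape `(number of inputs, num_outputs)`, so the layer maps each row of `x` to `num_outputs` values. The second unit's pre-activation is negative here, so ReLU clips it to zero.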

## Output Layer

In [None]:
def outputs(inputs, num_outputs):
    """
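    Apply an output layer to inputs tensor using weights and bias.
    The prediction cell looks up the resulting tensor by the name
    "output_fc_layer/outputs:0", so name the scope and the op accordingly.
    inputs: tensorflow tensor
    num_outputs: number of output units (one per class).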
    """
    
    return
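
The output layer produces one raw score (logit) per class. The prediction cell at the end of this notebook passes those logits through a softmax and takes the argmax. The NumPy sketch below shows that step on made-up logits; `softmax_sketch` is a hypothetical helper, not the TensorFlow op.

```python
import numpy as np

def softmax_sketch(logits):
    """Turn each row of logits into probabilities that sum to 1."""
    shifted = logits - np.max(logits, axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=1, keepdims=True)

example_scores = np.array([[2.0, 1.0, 0.1],
                           [0.0, 0.0, 0.0]])
probs = softmax_sketch(example_scores)
predicted = np.argmax(probs, axis=1)
print(probs)
print(predicted)
```

Softmax preserves the ordering of the scores, so the predicted class is the same whether the argmax is taken over the logits or over the probabilities.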

## Create Convolutional Model

In [None]:
def cnn(inputs, keep_prob):
    """
    Create a convolutional model using inputs tensor and use keep_prob while applying dropout 
    """
    
    return
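
The `keep_prob` argument controls dropout: the training loop feeds 0.4 while training and 1.0 when evaluating. As a hedged illustration of the idea (inverted dropout, written in NumPy rather than TensorFlow, with the hypothetical helper name `dropout_sketch`):

```python
import numpy as np

def dropout_sketch(activations, keep_prob, rng):
    """Keep each unit with probability keep_prob and rescale the survivors by 1 / keep_prob."""
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
activations = np.ones((1000, 100))
train_out = dropout_sketch(activations, keep_prob=0.4, rng=rng)
eval_out = dropout_sketch(activations, keep_prob=1.0, rng=rng)

print(train_out.mean())                       # close to 1.0 on average
print(np.array_equal(eval_out, activations))  # keep_prob = 1.0 leaves the input unchanged
```

Rescaling by `1 / keep_prob` keeps the expected activation the same with and without dropout, which is why no extra correction is needed when `keep_prob` is set to 1.0 at evaluation time.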

## Display Stats

In [None]:
def display_stats(sess, batch_train, batch_train_labels, batch_val, batch_val_labels, cost, accuracy, writer, step):
    """
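    Display intermediate results.
    Relies on the global placeholders (inputs, targets, keep_prob, learning_rate)
    and the merged summary op (summ) defined in the "Construct graph" cell.
    """
    # train_data is [accuracy, summary, cost] for the training batch (dropout disabled)
    train_data = sess.run([accuracy, summ, cost], feed_dict={
        inputs: batch_train, targets: batch_train_labels, keep_prob: 1.0, learning_rate:0.0001})
    # val_data is [accuracy, cost] for the validation set
    val_data =sess.run([accuracy, cost], feed_dict={
        inputs: batch_val, targets: batch_val_labels, keep_prob: 1.0, learning_rate:0.0001})
    
    # Record the training summaries against the current step for TensorBoard
    writer.add_summary(train_data[1], step)
    
    print("Step {} Train_Loss : {:>10.4f} Train_Accuracy : {:.6f} Val_Loss : {:>10.4f} Val_Accuracy : {:.6f}"
         .format(step, train_data[2], train_data[0], val_data[1], val_data[0]))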
    return

## Construct graph

In [None]:
tf.reset_default_graph()

#Initialise Session
sess = tf.Session()

#Inputs
inputs, targets, keep_prob, learning_rate = input_placeholders()

#Model
logits = cnn(inputs, keep_prob)

#Loss Function
with tf.name_scope("Loss_Function"):
    cross_entropy = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(labels=targets, logits=logits))
    tf.summary.scalar("loss", cross_entropy)

#Optimizer
with tf.name_scope("Optimizer"):
    train_step = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cross_entropy)

# Report Accuracy
with tf.name_scope("Accuracy"):
    correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(targets, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    tf.summary.scalar("Accuracy", accuracy)

#Define variable to save summaries
summ = tf.summary.merge_all()

#Initialise all variables
sess.run(tf.global_variables_initializer())

#Create a saver object to save the model
saver = tf.train.Saver()

#Write Graph to Tensorboard (remove logs from earlier runs; -f avoids an error if log/ does not exist yet)
!rm -rf log/
writer = tf.summary.FileWriter("log/")
writer.add_graph(sess.graph)
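
The loss and accuracy nodes in the graph above can be mirrored in NumPy to see what they compute. This is only a sketch on made-up values: `mean_cross_entropy`, `batch_accuracy`, `example_logits` and `example_targets` are hypothetical names chosen so they do not clash with the tensors in the graph.

```python
import numpy as np

def mean_cross_entropy(logits, one_hot_targets):
    """Softmax cross-entropy per example, averaged over the batch."""
    shifted = logits - np.max(logits, axis=1, keepdims=True)
    log_probs = shifted - np.log(np.sum(np.exp(shifted), axis=1, keepdims=True))
    return -np.mean(np.sum(one_hot_targets * log_probs, axis=1))

def batch_accuracy(logits, one_hot_targets):
    """Fraction of examples whose highest logit matches the labelled class."""
    return np.mean(np.argmax(logits, axis=1) == np.argmax(one_hot_targets, axis=1))

example_logits = np.array([[5.0, 0.0, 0.0],
                           [0.0, 5.0, 0.0],
                           [0.0, 0.0, 5.0],
                           [5.0, 0.0, 0.0]])
example_targets = np.array([[1, 0, 0],
                            [0, 1, 0],
                            [0, 0, 1],
                            [0, 1, 0]], dtype=np.float32)

loss = mean_cross_entropy(example_logits, example_targets)
acc = batch_accuracy(example_logits, example_targets)
print(loss, acc)
```

Three of the four made-up examples are classified correctly, so the accuracy is 0.75, and the one confident mistake accounts for most of the loss.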

## Start Training

In [None]:
start=time.time()
for i in range(2000):
    batch = mnist.train.next_batch(16)
    if i % 100 == 0:
        display_stats(sess, batch[0], batch[1], mnist.validation.images, mnist.validation.labels, 
                      cross_entropy, accuracy, writer, i)
        saver.save(sess, 'trained/test_model')
    sess.run(train_step,feed_dict={inputs: batch[0], targets: batch[1], keep_prob: 0.4, learning_rate:0.0005})

end = time.time()

print("Total training time {:.2f} seconds".format(end-start))

print('\nTest accuracy (first 100 test images) : {}'.format(sess.run(accuracy,feed_dict={
    inputs: mnist.test.images[:100], targets: mnist.test.labels[:100], keep_prob: 1.0, learning_rate:0.0001})))

sess.close()

## Prediction using the Trained model

In [None]:
tf.reset_default_graph()

sess = tf.Session()

saver=tf.train.import_meta_graph('trained/test_model.meta')
saver.restore(sess, tf.train.latest_checkpoint('./trained/'))

graph = tf.get_default_graph()

output = graph.get_tensor_by_name("output_fc_layer/outputs:0")

inputs = graph.get_tensor_by_name("Inputs:0")
targets = graph.get_tensor_by_name("Labels:0")
keep_prob = graph.get_tensor_by_name("keep_prob:0")
learning_rate = graph.get_tensor_by_name("learning_rate:0")

pred = tf.nn.softmax(output)
img_predict_index = np.random.randint(mnist.test.images.shape[0])
img_array = 255 * mnist.test.images[img_predict_index]
img_array = img_array.astype("uint8")
plt.imshow(img_array.reshape([28,28]))
plt.gray()

predictions = sess.run(pred, feed_dict={inputs:mnist.test.images[img_predict_index].reshape(1,784), 
                                        targets: mnist.test.labels[img_predict_index].reshape(1,10), 
                                        keep_prob:1.0, learning_rate:0.0001})

print(np.argmax(predictions[0]))


sess.close()