## DIGIT RECOGNITION USING TENSORFLOW

The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used for training various image processing systems. This notebook uses the MNIST dataset to illustrate a convolutional neural network (CNN) built with TensorFlow in Python.


In [2]:
# Importing the required libraries

import tensorflow as tf

from tensorflow.examples.tutorials.mnist import input_data

In [3]:
# Load the dataset; one_hot=True encodes each label as a 10-element one-hot vector

mnist = input_data.read_data_sets("C:/Users/320/Python/CNN_Tensorflow/MNIST_Train", one_hot=True)

Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
Instructions for updating:
Please write your own downloading logic.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting C:/Users/320/Python/CNN_Tensorflow/MNIST_Train\train-images-idx3-ubyte.gz
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting C:/Users/320/Python/CNN_Tensorflow/MNIST_Train\train-labels-idx1-ubyte.gz
Instructions for updating:
Please use tf.one_hot on tensors.
Extracting C:/Users/320/Python/CNN_Tensorflow/MNIST_Train\t10k-images-idx3-ubyte.gz
Extracting C:/Users/320/Python/CNN_Tensorflow/MNIST_Train\t10k-labels-idx1-ubyte.gz
Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
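
The `one_hot = True` argument above means each digit label is stored as a vector of ten values with a single 1. The following is a minimal pure-Python sketch of that encoding, not the loader's actual implementation; the helper names `one_hot` and `decode` are hypothetical and chosen only for illustration.

```python
def one_hot(label, num_classes=10):
    """Return a list with 1.0 at index `label` and 0.0 elsewhere."""
    if not 0 <= label < num_classes:
        raise ValueError("label must be in the range [0, num_classes)")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def decode(vector):
    """Recover the integer label (the index of the largest entry)."""
    return max(range(len(vector)), key=lambda i: vector[i])


encoded = one_hot(3)
print(encoded)          # the digit 3 as a 10-element vector
print(decode(encoded))  # back to the integer label
```

Decoding with the index of the largest entry is the same idea the notebook later applies with `tf.argmax` when it compares predictions against labels.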


## Convolutional Neural Network

A CNN is an artificial neural network that is widely used for image processing. There are 3 major types of layers in a CNN:
1. Convolutional Layers
2. Pooling Layers
3. Fully Connected Layers

In [4]:
# Defining the weights, biases and functions for different layers

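def weight(shape):
    # Weights start as small random values drawn from a truncated normal distribution
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias(shape):
    # Biases start at a small positive constant (0.1)
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(x, W):
    # 2-D convolution with stride 1; 'SAME' padding keeps the height and width unchanged
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool(x):
    # 2x2 max pooling with stride 2; halves the height and width (rounding up)
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1], padding='SAME')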
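
To make the effect of `padding='SAME'` and of 2x2 max pooling concrete, here is a small pure-Python sketch. It assumes the usual rule that 'SAME' padding produces an output size of ceil(input / stride); the helpers `same_output_size` and `max_pool_2x2` are hypothetical names for illustration and operate on plain nested lists, not on tensors.

```python
import math


def same_output_size(input_size, stride):
    """Spatial output size under 'SAME' padding: ceil(input / stride)."""
    return math.ceil(input_size / stride)


def max_pool_2x2(grid):
    """2x2 max pooling with stride 2 on a 2-D list with even height and width."""
    rows, cols = len(grid), len(grid[0])
    return [
        [max(grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1])
         for c in range(0, cols, 2)]
        for r in range(0, rows, 2)
    ]


# A stride-1 convolution keeps 28x28; each stride-2 pooling halves it.
print(same_output_size(28, 1), same_output_size(28, 2), same_output_size(14, 2))

feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 6, 1, 1],
]
print(max_pool_2x2(feature_map))  # [[4, 5], [6, 7]]
```

Each output value is the maximum of one non-overlapping 2x2 window, which is why a 28x28 feature map shrinks to 14x14 and then to 7x7 after two pooling layers.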

In [6]:
# Placeholder for the input images; each 28x28 image is flattened to 784 values
x = tf.placeholder(tf.float32, [None, 784])
# Placeholder for the true labels of y (one-hot vectors over the 10 digit classes)
y_ = tf.placeholder(tf.float32, [None, 10])

# Defining the weights and bias for the first convolution and pooling layer
# 5x5 is the filter size, 1 is the number of input channels (grayscale) and 32 is the number of output feature maps
W_conv1 = weight([5, 5, 1, 32])
b_conv1 = bias([32])

# Reshaping the flat input into 28x28 images with 1 channel to pass through the conv layers; -1 infers the batch size
x_image = tf.reshape(x, [-1,28,28,1])

# The convolutional layer applies the weights to the input and adds the bias;
# the result is passed through the ReLU activation and then max-pooled
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool(h_conv1)


# Defining the weights and bias for the second convolution and pooling layer
# 5x5 is the filter size and 64 is the number of output feature maps; 32 matches the output channels of the previous layer
W_conv2 = weight([5, 5, 32, 64])
b_conv2 = bias([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool(h_conv2)

# Defining the weights and bias for the first fully connected layer
# 7*7*64 is the flattened size after two 2x2 poolings (28 -> 14 -> 7) with 64 feature maps; 1024 is the number of hidden units
W_fc1 = weight([7 * 7 * 64, 1024])
b_fc1 = bias([1024])

# Flattening the output of the second pooling layer so it can be fed to the fully connected layer
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# Introducing dropout in order to reduce overfitting of our network; keep_prob is the probability of keeping each unit
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# Defining the weight and Bias for the final Fully Connected Layer
# 1024 to accept the input from the previous layer and 10 is for the output classes
W_fc2 = weight([1024, 10])
b_fc2 = bias([10])

# Passing the output of the final fully connected layer to the softmax activation function to get class probabilities
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
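
The shapes used in this cell can be checked with simple arithmetic. The sketch below walks the spatial size through the two pooling layers and counts the trainable parameters implied by the weight and bias shapes above. It is plain Python for illustration only; `conv_params` and `dense_params` are hypothetical helper names, not TensorFlow functions.

```python
def conv_params(kernel_h, kernel_w, in_channels, out_channels):
    """Weights plus one bias per output feature map."""
    return kernel_h * kernel_w * in_channels * out_channels + out_channels


def dense_params(n_inputs, n_outputs):
    """Weight matrix plus one bias per output unit."""
    return n_inputs * n_outputs + n_outputs


size = 28          # input images are 28x28
size = size // 2   # after the first 2x2 pooling: 14
size = size // 2   # after the second 2x2 pooling: 7
flat_size = size * size * 64   # flattened input to the first fully connected layer

params = {
    "conv1": conv_params(5, 5, 1, 32),
    "conv2": conv_params(5, 5, 32, 64),
    "fc1": dense_params(flat_size, 1024),
    "fc2": dense_params(1024, 10),
}
total_params = sum(params.values())

print(flat_size)     # 3136
print(params)
print(total_params)  # 3274634
```

Almost all of the parameters sit in the first fully connected layer, which is one reason dropout is applied to its output.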

In [7]:
# Calculating the loss: mean cross-entropy between the true labels and the predicted probabilities
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_conv), axis=1))

# Setting up the training step: the Adam optimizer with a learning rate of 0.0001 minimizes the cross-entropy
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

# Tensors for the prediction and the accuracy
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
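
The loss and accuracy defined above can be reproduced by hand on a tiny example. The following pure-Python sketch is an illustration under stated assumptions, not the notebook's implementation: the function names are hypothetical, the softmax subtracts the largest logit before exponentiating for numerical stability, and terms with a zero label are skipped so that `log(0)` is never evaluated (the TensorFlow expression above multiplies by zero instead).

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities; shifting by the max avoids overflow."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(one_hot_label, probs):
    """-sum(label * log(prob)), skipping terms where the label is 0."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs) if t > 0)


def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])


def batch_accuracy(labels, probs_batch):
    """Fraction of examples whose predicted class equals the true class."""
    hits = sum(argmax(t) == argmax(p) for t, p in zip(labels, probs_batch))
    return hits / len(labels)


logits_batch = [[2.0, 1.0, 0.1], [0.1, 0.2, 3.0]]
labels = [[1, 0, 0], [0, 1, 0]]   # the second example will be misclassified

probs_batch = [softmax(row) for row in logits_batch]
losses = [cross_entropy(t, p) for t, p in zip(labels, probs_batch)]
mean_loss = sum(losses) / len(losses)

print(mean_loss)
print(batch_accuracy(labels, probs_batch))  # 0.5
```

Taking the log of a softmax output can become unstable when a probability underflows to zero. TensorFlow 1.x also provides `tf.nn.softmax_cross_entropy_with_logits`, which takes the unnormalized logits directly and may be a more robust choice.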

In [8]:
# Initializing the variables and running the session
init = tf.global_variables_initializer()

sess = tf.Session()
sess.run(init)

# Running 1000 training steps (mini-batches of 50 images each); these are steps, not full epochs
# The training accuracy on the current batch is printed every 100 steps
# The test accuracy is displayed at the end
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(50)
    if i%100 == 0:
        train_accuracy = accuracy.eval(session=sess,feed_dict={x:batch_xs, y_: batch_ys, keep_prob: 1.0})
        print("step %d, training accuracy %.3f"%(i, train_accuracy))
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys, keep_prob: 0.5})
print("test accuracy %g"%accuracy.eval(session=sess,feed_dict={x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))

step 0, training accuracy 0.100
step 100, training accuracy 0.920
step 200, training accuracy 0.920
step 300, training accuracy 0.920
step 400, training accuracy 0.900
step 500, training accuracy 0.980
step 600, training accuracy 0.960
step 700, training accuracy 0.920
step 800, training accuracy 0.980
step 900, training accuracy 0.980
test accuracy 0.967
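
The loop above feeds `keep_prob: 0.5` while training and `keep_prob: 1.0` while measuring accuracy. The sketch below illustrates why that works, assuming the inverted-dropout behaviour of `tf.nn.dropout`, where kept values are scaled by `1 / keep_prob` so the expected activation stays the same. The `dropout` function here is a hypothetical pure-Python stand-in, not the TensorFlow op.

```python
import random


def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob and scale it by 1/keep_prob."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in the range (0, 1]")
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)            # fixed seed so the sketch is repeatable
activations = [1.0] * 10000

dropped = dropout(activations, 0.5, rng)
mean_after = sum(dropped) / len(dropped)
print(mean_after)                 # close to the original mean of 1.0

unchanged = dropout(activations, 1.0, rng)
print(unchanged == activations)   # keep_prob of 1.0 leaves the values untouched
```

Because the scaling preserves the expected value, the same weights can be used at evaluation time with dropout switched off by setting `keep_prob` to 1.0.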
