In this notebook, we will build a neural network with TensorFlow to recognize handwritten digits from the MNIST dataset.

In [1]:
import tensorflow as tf

In [2]:
from tensorflow.examples.tutorials.mnist import input_data

In [None]:
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

<h5>One-hot encoding is applied to the labels. Because there are 10 digits (0-9), each label is a vector of length 10 with a 1 at the index of the digit. The digit 3, for example, is represented as [0, 0, 0, 1, 0, 0, 0, 0, 0, 0].

Each image is 28x28 pixels and is flattened into a 1-D array of 784 values.
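
Both ideas can be sketched in plain Python, independent of TensorFlow. The helper names `one_hot` and `flatten` are hypothetical and exist only for this illustration; the loader above already returns the data in this form.

```python
def one_hot(digit, num_classes=10):
    """Return a list with a 1 at index `digit` and 0 elsewhere."""
    encoding = [0] * num_classes
    encoding[digit] = 1
    return encoding


def flatten(image):
    """Concatenate the rows of a 2-D image into one 1-D list."""
    return [pixel for row in image for pixel in row]


label = one_hot(3)
print(label)  # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]

blank_image = [[0.0] * 28 for _ in range(28)]  # placeholder 28x28 image
print(len(flatten(blank_image)))  # 784
```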

In [11]:
n_train = mnist.train.num_examples
n_eval = mnist.validation.num_examples
n_test = mnist.test.num_examples

print('The number of images for training, evaluation and testing are {}, {}, {} images respectively'.format(n_train,n_eval,n_test))

The number of images for training, evaluation and testing are 55000, 5000, 10000 images respectively


<h5>Let's build our neural network. We will create 3 hidden layers with 512, 256 and 128 neurons respectively. The input layer has 784 neurons (one per pixel of a 28x28 image), and the output layer has 10 neurons, one for each class (digits 0-9).

Dropout randomly sets some units to zero during training to reduce overfitting. The keep_prob placeholder defined below is the probability that a unit is kept. We will use 0.5 during training, so each unit at the last hidden layer has a 50% chance of being dropped at every training step.
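
As a rough illustration of what a keep probability of 0.5 means, here is a minimal sketch in plain Python. The function name `dropout` and the `rng` argument are assumptions made for this example. Kept values are scaled by 1/keep_prob, which is also what `tf.nn.dropout` does, so that the expected sum stays the same.

```python
import random


def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob and scale it by 1/keep_prob;
    otherwise replace it with 0.0."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)  # fixed seed so the run is repeatable
activations = [1.0] * 10000
dropped = dropout(activations, keep_prob=0.5, rng=rng)

kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
mean_after = sum(dropped) / len(dropped)
print(kept_fraction)  # close to 0.5
print(mean_after)     # close to 1.0, the mean before dropout
```

With keep_prob set to 1.0 nothing is dropped, which is why that value is fed when measuring accuracy.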

In [13]:
X = tf.placeholder("float", [None, 784])  # input placeholder: a batch of flattened images
Y = tf.placeholder("float", [None, 10])   # label placeholder: one-hot digit labels
keep_prob = tf.placeholder(tf.float32)    # probability of keeping a unit in dropout

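Because training updates the weights, we need to give them initial values. There are four sets of connections (input to hidden layer 1, hidden layer 1 to 2, hidden layer 2 to 3, and hidden layer 3 to output), so we create 4 weight variables, each drawn from a truncated normal distribution with a standard deviation of 0.1.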

In [14]:
weights = {
'w1': tf.Variable(tf.truncated_normal([784, 512],stddev=0.1)),
'w2': tf.Variable(tf.truncated_normal([512, 256],stddev=0.1)),
'w3': tf.Variable(tf.truncated_normal([256, 128],stddev=0.1)),
'out': tf.Variable(tf.truncated_normal([128, 10],stddev=0.1)),
}

Instructions for updating:
Colocations handled automatically by placer.
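
To give a feel for the initial weight values, the sketch below imitates a truncated normal in plain Python by redrawing any sample that falls more than two standard deviations from the mean, which is how `tf.truncated_normal` is documented to behave. The function name `truncated_normal` and its `rng` argument are hypothetical here.

```python
import random


def truncated_normal(count, stddev, rng, mean=0.0):
    """Draw `count` normal samples, redrawing any that fall more than
    two standard deviations from the mean."""
    samples = []
    while len(samples) < count:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2 * stddev:
            samples.append(value)
    return samples


rng = random.Random(0)
row = truncated_normal(512, stddev=0.1, rng=rng)
print(min(row), max(row))  # both lie within [-0.2, 0.2]
```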


Initially, we use a small constant value of 0.1 for the biases so that the units produce non-zero outputs in the early stages and therefore contribute to backpropagation. A bias is defined for each of the 4 layers other than the input layer.

In [16]:
biases = {
'b1': tf.Variable(tf.constant(0.1, shape=[512])),
'b2': tf.Variable(tf.constant(0.1, shape=[256])),
'b3': tf.Variable(tf.constant(0.1, shape=[128])),
'out': tf.Variable(tf.constant(0.1, shape=[10]))
}
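
As a quick check on the shapes above, the following sketch counts the trainable parameters implied by the weight and bias definitions. The list `layer_sizes` is simply a restatement of the layer widths used in this notebook.

```python
layer_sizes = [784, 512, 256, 128, 10]  # input, three hidden layers, output

total = 0
for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
    weight_count = fan_in * fan_out  # one weight per connection
    bias_count = fan_out             # one bias per neuron in the receiving layer
    print(f"{fan_in:>3} -> {fan_out:<3}: {weight_count} weights, {bias_count} biases")
    total += weight_count + bias_count

print("trainable parameters:", total)  # 567434
```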

Let's define the operations for each layer. Each one multiplies the previous layer's outputs by the current layer's weight matrix and adds the bias.

In [17]:
layer_1 = tf.add(tf.matmul(X, weights['w1']), biases['b1'])         #Matrix Multiplication + Bias
layer_2 = tf.add(tf.matmul(layer_1, weights['w2']), biases['b2'])   #Matrix Multiplication + Bias
layer_3 = tf.add(tf.matmul(layer_2, weights['w3']), biases['b3'])   #Matrix Multiplication + Bias
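layer_drop = tf.nn.dropout(layer_3, keep_prob)                      #Dropout on the last hidden layer
#Note: output_layer below reads layer_3, so layer_drop is unused and dropout has no effect as written.
#Passing layer_drop instead would apply it, but keep_prob must then be fed in every sess.run on output_layer.
output_layer = tf.matmul(layer_3, weights['out']) + biases['out']   #Matrix Multiplication + Bias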

Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
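
The per-layer computation (matrix multiplication plus bias) can be sketched in plain Python for a single example. The function name `dense` and the toy sizes are assumptions made for illustration. As in the cell above, no activation function is applied.

```python
def dense(inputs, weights, biases):
    """Compute inputs @ weights + biases for a single example.

    inputs:  list of length n_in
    weights: n_in rows, each a list of length n_out
    biases:  list of length n_out
    """
    n_out = len(biases)
    return [
        sum(inputs[i] * weights[i][j] for i in range(len(inputs))) + biases[j]
        for j in range(n_out)
    ]


x = [1.0, 2.0]                      # toy input with 2 features
w1 = [[0.5, -1.0, 0.0],             # 2 x 3 weight matrix
      [0.25, 0.5, 1.0]]
b1 = [0.1, 0.1, 0.1]                # one bias per unit, as in the notebook

hidden = dense(x, w1, b1)
print(hidden)  # approximately [1.1, 0.1, 2.1]
```

Because no nonlinearity sits between the layers, stacking several of them is mathematically equivalent to a single linear layer. Adding an activation such as `tf.nn.relu` after each hidden layer is a common variation, although it is not part of this notebook.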


The final step in building the model is to define the loss function that we want to optimize. TensorFlow offers many options; we will use cross-entropy.

We also need to choose an optimizer to minimize the loss function. Gradient descent is a common method for finding the minimum of a function by taking iterative steps in the direction of the negative gradient. Here we choose the Adam optimizer, a gradient-based method that adapts the step size for each parameter.

In [314]:
cross_entropy = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits(
    labels=Y, logits=output_layer))   #this defines our loss function
train_step = tf.train.AdamOptimizer(0.0003).minimize(cross_entropy)
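
To make the loss more concrete, here is a minimal plain-Python sketch of softmax followed by cross-entropy for one example. The function names `softmax` and `cross_entropy` are chosen for this illustration; the TensorFlow op above computes the same quantity per example before `tf.reduce_mean` averages it over the batch.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(one_hot_label, logits):
    """Cross-entropy between a one-hot label and softmax(logits)."""
    probs = softmax(logits)
    return -sum(y * math.log(p) for y, p in zip(one_hot_label, probs))


label = [0, 0, 1]
confident_right = cross_entropy(label, [0.0, 0.0, 5.0])
confident_wrong = cross_entropy(label, [5.0, 0.0, 0.0])
print(confident_right)  # small loss
print(confident_wrong)  # much larger loss
```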

Finally, we have reached the training and testing step.

In [315]:
init = tf.global_variables_initializer() #to initialize all tf variables

In [316]:
with tf.Session() as sess:
    sess.run(init)

Training repeats these 4 steps:<br>
1. Propagate values forward through the network<br>
2. Compute the loss<br>
3. Propagate values backward through the network<br>
4. Update the parameters<br>

Our hyperparameters are as follows:<br>
1. Learning rate = 0.0003<br>
2. Batch size = 128<br>
3. Dropout keep probability = 0.5 (a 50% chance of dropping a unit at the last hidden layer)<br>
4. Total Iterations = 1000 steps
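
The four steps can be sketched on a much smaller problem. The example below is only an illustration under stated assumptions: it uses plain mini-batch gradient descent on a one-variable linear model with a mean squared error loss, rather than Adam on the digit network, and the toy data and all variable names are hypothetical.

```python
import random

rng = random.Random(0)
# Toy data following y = 2x + 1 (stands in for the MNIST batches)
data = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(-20, 21)]

w, b = 0.0, 0.0
learning_rate = 0.1
batch_size = 8
losses = []

for step in range(500):
    batch = rng.sample(data, batch_size)

    # 1. Propagate values forward through the model
    predictions = [w * x + b for x, _ in batch]

    # 2. Compute the loss (mean squared error)
    errors = [p - y for p, (_, y) in zip(predictions, batch)]
    loss = sum(e * e for e in errors) / batch_size
    losses.append(loss)

    # 3. Propagate backward: gradients of the loss with respect to w and b
    grad_w = sum(2 * e * x for e, (x, _) in zip(errors, batch)) / batch_size
    grad_b = sum(2 * e for e in errors) / batch_size

    # 4. Update the parameters
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(w, b)  # should approach 2 and 1
```

In the notebook, steps 3 and 4 are both handled by the `train_step` operation.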

In [317]:
#Accuracy: the fraction of examples whose predicted class matches the label
correct_pred = tf.equal(tf.argmax(output_layer, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))  
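
The accuracy computation compares the index of the largest logit with the index of the 1 in the one-hot label, then averages the matches. A plain-Python sketch, with hypothetical helper names `argmax` and `accuracy` and made-up toy values, looks like this:

```python
def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(logits_batch, labels_batch):
    """Fraction of rows where the predicted class matches the one-hot label."""
    correct = [
        argmax(logits) == argmax(label)
        for logits, label in zip(logits_batch, labels_batch)
    ]
    return sum(correct) / len(correct)


logits_batch = [[0.1, 2.0, -1.0], [1.5, 0.2, 0.3], [0.0, 0.1, 0.2], [0.9, 0.8, 0.1]]
labels_batch = [[0, 1, 0], [1, 0, 0], [0, 1, 0], [1, 0, 0]]
print(accuracy(logits_batch, labels_batch))  # 0.75
```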

In [329]:
with tf.Session() as sess:
    sess.run(init)
    for i in range(1000):
        batch_x, batch_y = mnist.train.next_batch(128)
        sess.run(train_step, feed_dict={
            X: batch_x, Y: batch_y, keep_prob: 0.5
            })
        # print loss and accuracy (per minibatch)
        if i % 100 == 0:
            minibatch_loss, minibatch_accuracy = sess.run(
            [cross_entropy, accuracy],
            feed_dict={X: batch_x, Y: batch_y, keep_prob: 1.0}
            )
            print(
            "Iteration",
            str(i),
            "\t| Loss =",
            str(minibatch_loss),
            "\t| Accuracy =",
            str(minibatch_accuracy)
            )
            
    test_accuracy = sess.run(accuracy, feed_dict={X: mnist.test.images, Y:
    mnist.test.labels, keep_prob: 1.0})
    print("\nAccuracy on test set:", test_accuracy)

Iteration 0 	| Loss = 3.6396756 	| Accuracy = 0.1171875
Iteration 100 	| Loss = 0.48735958 	| Accuracy = 0.875
Iteration 200 	| Loss = 0.31270728 	| Accuracy = 0.8828125
Iteration 300 	| Loss = 0.47652382 	| Accuracy = 0.8828125
Iteration 400 	| Loss = 0.2207914 	| Accuracy = 0.9296875
Iteration 500 	| Loss = 0.3489792 	| Accuracy = 0.9140625
Iteration 600 	| Loss = 0.24891691 	| Accuracy = 0.921875
Iteration 700 	| Loss = 0.19936237 	| Accuracy = 0.953125
Iteration 800 	| Loss = 0.30250138 	| Accuracy = 0.90625
Iteration 900 	| Loss = 0.25985283 	| Accuracy = 0.90625

Accuracy on test set: 0.9153


<h5>The accuracy values printed during training are measured on the current mini-batch of 128 images. The last line reports the accuracy of the model on the 10,000 test images.

Now, let's see our model work on a real image. We will need to import a couple of other libraries as well.

In [45]:
from PIL import Image
import numpy as np

In [326]:
img = np.invert(Image.open("dig.png").convert('L')).ravel()
#I am using an image with a black  background and white text

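First, convert('L') turns the image into a single-channel 8-bit grayscale image, in which black is 0 and white is 255. np.invert then flips every value (0 becomes 255 and vice versa), and ravel flattens the 28x28 array into 784 values. Note that MNIST stores the background as 0 and the digit strokes as bright values, so the inversion is only needed when the digit is dark on a light background; for an image that is already white on black it can be skipped. The MNIST images loaded above are also scaled to the range 0 to 1, so dividing the array by 255 keeps the input consistent with the training data.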
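
The preprocessing steps can be sketched without PIL on a toy image. The function `preprocess` and its `invert` flag are hypothetical, and the scaling to the 0 to 1 range is an assumption based on the MNIST arrays returned by `read_data_sets` being floats in that range.

```python
def preprocess(pixels, invert):
    """Flatten an 8-bit grayscale image (rows of 0-255 ints) and scale it to 0-1.

    Set invert=True for a dark digit on a light background so that the
    strokes end up bright and the background ends up at 0.
    """
    flat = [value for row in pixels for value in row]
    if invert:
        flat = [255 - value for value in flat]
    return [value / 255.0 for value in flat]


dark_on_light = [[255, 0], [255, 255]]  # toy 2x2 image: one black pixel on white
print(preprocess(dark_on_light, invert=True))  # [0.0, 1.0, 0.0, 0.0]
```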

In [328]:
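with tf.Session() as sess:
    #Caution: a new session plus sess.run(init) re-initializes the weights, so the trained values
    #from the previous cell are not used here unless they are saved and restored (e.g. tf.train.Saver)
    sess.run(init)
    prediction = sess.run(tf.argmax(output_layer, 1), feed_dict={X: [img], keep_prob: 1.0})
print("Prediction for test image:", np.squeeze(prediction))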

Prediction for test image: 0


That's all. A larger dataset will usually give better results, since neural networks are data hungry. For the image used in the prediction, you can take a sample image from the internet, or open Paint, set the canvas size to 28x28 pixels, fill the background with black, and draw a digit in white.