***Goal:*** Use TensorFlow to build and train a neural network that recognizes the digit displayed in an image and predicts its correct label (from 0 to 9).

### 1. Importing the MNIST Dataset

The dataset we will be using in this tutorial is the MNIST dataset. It is made up of images of handwritten digits, each 28x28 pixels in size. Here are some examples of the digits included in the dataset:

We will import the TensorFlow library first.

In [1]:
import tensorflow as tf

We will then import the MNIST dataset, which is available in the TensorFlow library.

In [2]:
from tensorflow.examples.tutorials.mnist import input_data

When reading in the data, we use one-hot encoding to represent the labels. One-hot encoding uses a vector of binary values to represent numeric or categorical values. One of these values is set to 1, to represent the digit at that index of the vector, and the rest are set to 0. For example, the digit 3 is represented using the vector [0, 0, 0, 1, 0, 0, 0, 0, 0, 0].
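
The encoding described above can be sketched in a few lines of plain Python, independent of TensorFlow. The helper names `one_hot` and `decode` are illustrative and are not part of the MNIST loader.

```python
def one_hot(digit, num_classes=10):
    """Return a list of num_classes zeros with a single 1 at index `digit`."""
    if not 0 <= digit < num_classes:
        raise ValueError("digit must be in [0, num_classes)")
    vector = [0] * num_classes
    vector[digit] = 1
    return vector


def decode(vector):
    """Recover the digit: the index of the largest entry (argmax)."""
    return vector.index(max(vector))


encoded = one_hot(3)
print(encoded)          # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
print(decode(encoded))  # 3
```

Decoding with an argmax is the same idea the tutorial later relies on when it compares predictions with labels.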

In [3]:
mnist = input_data.read_data_sets("MNIST_data/", one_hot = True) # y labels are one-hot-encoded

Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
Instructions for updating:
Please write your own downloading logic.
Instructions for updating:
Please use urllib or similar directly.
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting MNIST_data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Instructions for updating:
Please use tf.one_hot on tensors.
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz

The dataset is split into 55,000 images for training, 5,000 for validation, and 10,000 for testing.

In [4]:
n_train = mnist.train.num_examples  # 55,000
n_validation = mnist.validation.num_examples  # 5,000
n_test = mnist.test.num_examples  # 10,000
print("n_train:", n_train)
print("n_validation:", n_validation)
print("n_test:", n_test)

n_train: 55000
n_validation: 5000
n_test: 10000


### 3. Build neural network

We will store the number of units per layer in global variables.

In [5]:
n_input = 784 # input layer (28x28 pixels)
n_hidden1 = 512 # 1st hidden layer
n_hidden2 = 256 # 2nd hidden layer
n_hidden3 = 128 # 3rd hidden layer
n_output = 10 # output layer (0-9 digits)

Hyperparameter definitions:
* ***The learning rate*** represents how much the parameters are adjusted at each step of the learning process.
* ***The number of iterations*** refers to how many times we go through the training step.
* ***Batch size*** refers to how many training examples we use at each step.
* ***Dropout*** is the probability that a unit is kept at each training step. With a value of 0.5, each unit has a 50% chance of being eliminated. This helps prevent overfitting and reduces the effective complexity of our neural network.

In [6]:
learning_rate = 1e-4
n_iterations = 1000
batch_size = 128 # mini-batch size
dropout = 0.5
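
To put these values in perspective, the short calculation below works out how much of the training set the run covers. It is plain arithmetic on the numbers used in this tutorial (55,000 training images, 1,000 iterations, batches of 128) and does not touch TensorFlow.

```python
n_train = 55000      # training images, as reported by the dataset above
n_iterations = 1000
batch_size = 128

examples_seen = n_iterations * batch_size   # examples drawn over the whole run
epochs = examples_seen / n_train            # passes over the training set
steps_per_epoch = n_train / batch_size      # steps needed for one full pass

print(examples_seen)              # 128000
print(round(epochs, 2))           # 2.33
print(round(steps_per_epoch, 1))  # 429.7
```

In other words, 1,000 iterations correspond to a little over two passes through the training data.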

### 4. Building the TensorFlow Graph

The core concept of **TensorFlow** is the tensor, a data structure similar to an array or list. Tensors are initialized, manipulated as they are passed through the graph, and updated through the learning process. We’ll start by defining three tensors as placeholders, which are tensors that we’ll feed values into later.

In [7]:
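X = tf.placeholder("float", [None, n_input])   # flattened images; None = any batch size
Y = tf.placeholder("float", [None, n_output])  # one-hot labels
keep_prob = tf.placeholder(tf.float32)         # probability of keeping a unit in dropout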

Next, we declare the weights and biases of the neural network as variables and initialize them.

In [8]:
weights = {
    'w1':tf.Variable(tf.truncated_normal([n_input, n_hidden1],stddev = 0.1)),
    'w2':tf.Variable(tf.truncated_normal([n_hidden1, n_hidden2],stddev = 0.1)),
    'w3':tf.Variable(tf.truncated_normal([n_hidden2, n_hidden3],stddev = 0.1)),
    'out':tf.Variable(tf.truncated_normal([n_hidden3, n_output],stddev = 0.1))
}

biases = {
    'b1' : tf.Variable(tf.constant(0.1, shape = [n_hidden1])),
    'b2' : tf.Variable(tf.constant(0.1, shape = [n_hidden2])),
    'b3' : tf.Variable(tf.constant(0.1, shape = [n_hidden3])),
    'out' : tf.Variable(tf.constant(0.1, shape = [n_output]))
}
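
As a quick sanity check on the shapes above, the sketch below counts the trainable parameters implied by the layer sizes. It only does arithmetic on the sizes defined earlier (784, 512, 256, 128, 10). The list name `layer_sizes` is introduced here for illustration.

```python
layer_sizes = [784, 512, 256, 128, 10]  # n_input, n_hidden1..3, n_output

total = 0
for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    n_weights = fan_in * fan_out   # one weight per input/output pair
    n_biases = fan_out             # one bias per output unit
    print(f"{fan_in:>3} -> {fan_out:<3}: {n_weights:>6} weights, {n_biases:>3} biases")
    total += n_weights + n_biases

print("trainable parameters:", total)  # 567434
```

Most of the parameters sit in the first weight matrix, which connects the 784 input pixels to the 512 units of the first hidden layer.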

Next, we define the operations that manipulate the tensors as they pass through the layers.

In [9]:
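layer_1 = tf.add(tf.matmul(X, weights['w1']), biases['b1'])
layer_2 = tf.add(tf.matmul(layer_1, weights['w2']), biases['b2'])
layer_3 = tf.add(tf.matmul(layer_2, weights['w3']), biases['b3'])
layer_drop = tf.nn.dropout(layer_3, keep_prob)
# Note: output_layer reads layer_3, so layer_drop (and therefore dropout) has no
# effect on the output as written. Passing layer_drop here instead would enable
# dropout, and keep_prob would then need to be fed whenever output_layer is run.
output_layer = tf.add(tf.matmul(layer_3, weights['out']), biases['out'])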

Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
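
To make the dropout operation more concrete, here is a minimal plain-Python sketch of the usual "inverted dropout" rule: each value is kept with probability `keep_prob` and scaled by `1 / keep_prob`, otherwise it is set to 0. The function name `apply_dropout` and the fixed seed are illustrative choices, and the sketch is not TensorFlow's implementation.

```python
import random


def apply_dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob and scale it by 1/keep_prob;
    otherwise replace it with 0. The scaling keeps the expected sum unchanged."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)          # fixed seed so the run is repeatable
activations = [1.0] * 10000
dropped = apply_dropout(activations, keep_prob=0.5, rng=rng)

kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
print(kept_fraction)                 # close to 0.5
print(sum(dropped) / len(dropped))   # close to 1.0, the original mean
print(apply_dropout(activations[:3], keep_prob=1.0, rng=rng))  # [1.0, 1.0, 1.0]
```

With `keep_prob` set to 1.0 nothing is dropped, which is why the tutorial feeds 1.0 when it evaluates the model.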


Definition of the loss function. We will use the **cross-entropy** (also called **log-loss**) function.

In [10]:
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(
        labels=Y, logits=output_layer
    ))

Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See `tf.nn.softmax_cross_entropy_with_logits_v2`.
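
For a single example, softmax cross-entropy can be written out by hand. The sketch below is plain Python with made-up logits. It illustrates the formula `-sum(label * log(softmax(logits)))` and is not the TensorFlow implementation. The function names are illustrative.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)                        # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_cross_entropy(one_hot_label, logits):
    """Loss for one example: -sum(label * log(softmax(logits)))."""
    probs = softmax(logits)
    # Only the entry where the label is 1 contributes to the sum.
    return -sum(y * math.log(p) for y, p in zip(one_hot_label, probs) if y)


label = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]   # the digit 3
uniform = [0.0] * 10                      # network has no preference yet
confident = [0.0] * 10
confident[3] = 10.0                       # strongly favours the right digit

print(softmax_cross_entropy(label, uniform))    # log(10), about 2.303
print(softmax_cross_entropy(label, confident))  # close to 0
```

The loss is large when the network spreads its probability evenly and approaches 0 as it concentrates probability on the correct digit.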



Definition of the optimization algorithm: we use the **Adam optimizer**.

In [11]:
train_step = tf.train.AdamOptimizer(learning_rate).minimize(cross_entropy)
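
For intuition about what the optimizer does with each gradient, here is a sketch of the textbook Adam update for a single scalar parameter. The values for `beta1`, `beta2`, and `eps` are commonly used defaults and are assumptions here. The function `adam_step` and the toy objective `theta ** 2` are illustrative, and library implementations may differ in detail.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter (textbook form)."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 4):                  # three steps on f(theta) = theta ** 2
    grad = 2 * theta                   # derivative of theta ** 2
    theta, m, v = adam_step(theta, grad, m, v, t)
    print(t, theta)
```

While the gradient keeps the same sign, each step moves the parameter by roughly the learning rate, largely independent of the gradient's magnitude.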

### 5. Training and Testing

Next, we define the **evaluation method**: accuracy, the fraction of images whose predicted digit matches the true label.

In [12]:
correct_pred = tf.equal(tf.argmax(output_layer,1), tf.argmax(Y,1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
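
The same computation can be traced by hand on a tiny batch. The sketch below is plain Python with made-up scores for three examples and three classes. The helper names `argmax_index` and `batch_accuracy` are illustrative.

```python
def argmax_index(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])


def batch_accuracy(logits_batch, labels_batch):
    """Fraction of rows where the highest score sits at the label's 1."""
    correct = [
        argmax_index(logits) == argmax_index(label)
        for logits, label in zip(logits_batch, labels_batch)
    ]
    return sum(correct) / len(correct)   # True counts as 1, False as 0


# Three made-up examples with three classes each (illustrative values only).
logits_batch = [[2.0, 0.5, 0.1], [0.2, 0.3, 1.5], [0.9, 0.8, 0.1]]
labels_batch = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]

print(batch_accuracy(logits_batch, labels_batch))  # 2 of 3 correct
```

Casting the booleans to numbers and averaging them is what `tf.cast` and `tf.reduce_mean` do in the cell above.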

Initialize a session.

In [13]:
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

The essence of the training process in deep learning is to optimize the loss function. Here we are aiming to minimize the difference between the predicted labels of the images, and the true labels of the images. The process involves four steps which are repeated for a set number of iterations:
* Propagate values forward through the network;
* Compute the loss;
* Propagate values backward through the network;
* Update the parameters.
    
At each training step, the parameters are adjusted slightly to try and reduce the loss for the next step. As the learning progresses, we should see a reduction in loss, and eventually we can stop training and use the network as a model for testing our new data.
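
The four steps can be seen in isolation on a deliberately tiny problem. The sketch below fits a single parameter `w` so that `w * x` matches data generated from `y = 3 * x`, using plain gradient descent rather than Adam. The data, the squared-error loss, and the step size `toy_lr` are all made up for illustration.

```python
# Toy data generated from y = 3 * x (made up for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

w = 0.0               # the single trainable parameter
toy_lr = 0.01         # step size for this toy problem
losses = []

for step in range(200):
    # 1. Forward pass: compute predictions with the current parameter.
    preds = [w * x for x in xs]
    # 2. Loss: mean squared error between predictions and targets.
    loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)
    losses.append(loss)
    # 3. Backward pass: derivative of the loss with respect to w.
    grad = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
    # 4. Update: move w a small step against the gradient.
    w -= toy_lr * grad

print(round(w, 3))            # approaches 3.0
print(losses[0], losses[-1])  # the loss shrinks as training progresses
```

In the tutorial, TensorFlow performs the backward pass and the update automatically each time `train_step` is run.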


In [14]:
# train on mini-batches
for i in range(n_iterations):
    batch_x, batch_y = mnist.train.next_batch(batch_size)
    sess.run(train_step, feed_dict = {
        X: batch_x, Y: batch_y, keep_prob: dropout
    })
    # print loss and accuracy on the current mini-batch every 100 iterations
    if i % 100 == 0:
        minibatch_loss, minibatch_accuracy = sess.run([cross_entropy, accuracy], feed_dict = {X: batch_x, Y: batch_y, keep_prob:1.0})
        print(
            "Iteration",
            str(i),
            "\t| Loss =",
            str(minibatch_loss),
            "\t| Accuracy =",
            str(minibatch_accuracy)
            )

Iteration 0 	| Loss = 3.634748 	| Accuracy = 0.1640625
Iteration 100 	| Loss = 0.45517892 	| Accuracy = 0.890625
Iteration 200 	| Loss = 0.4856335 	| Accuracy = 0.84375
Iteration 300 	| Loss = 0.44947112 	| Accuracy = 0.875
Iteration 400 	| Loss = 0.3840907 	| Accuracy = 0.8515625
Iteration 500 	| Loss = 0.2897442 	| Accuracy = 0.9140625
Iteration 600 	| Loss = 0.3297691 	| Accuracy = 0.9140625
Iteration 700 	| Loss = 0.31791586 	| Accuracy = 0.90625
Iteration 800 	| Loss = 0.2644313 	| Accuracy = 0.921875
Iteration 900 	| Loss = 0.3957845 	| Accuracy = 0.90625


In [15]:
test_accuracy = sess.run(accuracy, feed_dict = {X: mnist.test.images, Y : mnist.test.labels, keep_prob: 1.0})

print("\nAccuracy on test set:", test_accuracy)


Accuracy on test set: 0.9154


Test our model on a single image of our own.

We should first create the test image, either by drawing a digit in a paint tool such as Windows Paint or by writing a digit by hand and taking a photo of it. After that, upload the image to the working directory. You can get the working directory using the ***pwd*** command in the notebook.

In [19]:
import numpy as np
from PIL import Image

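# Load the image of the handwritten digit, resize it to 28x28, convert it to
# grayscale ('L'), invert the pixel values, and flatten it to 784 values
img = np.invert(Image.open("tensorflow-demo/test_image.png").resize((28,28)).convert('L')).ravel()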
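
The inversion and flattening steps are easier to follow on a tiny example. The sketch below assumes the drawing is dark ink on a light background, while MNIST digits are stored as bright strokes on a dark background, which is why the pixel values are flipped. It uses plain Python lists and a made-up 3x3 image. The helper names `invert` and `flatten` are illustrative.

```python
def invert(pixels):
    """Flip 8-bit grayscale values so dark strokes become bright (255 - p)."""
    return [[255 - p for p in row] for row in pixels]


def flatten(pixels):
    """Concatenate the rows into one flat list, row by row."""
    return [p for row in pixels for p in row]


# A made-up 3x3 "drawing": a dark vertical stroke on a white background.
drawing = [
    [255, 0, 255],
    [255, 0, 255],
    [255, 0, 255],
]

flat = flatten(invert(drawing))
print(flat)       # [0, 255, 0, 0, 255, 0, 0, 255, 0]
print(len(flat))  # 9 here; a 28x28 image gives 784
```

The flattened length must match `n_input`, since the network expects one value per pixel.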

Get the prediction.

In [20]:
# keep_prob is fed as 1.0 so that no units would be dropped at prediction time
prediction = sess.run(tf.argmax(output_layer, 1), feed_dict = {X: [img], keep_prob: 1.0})
print("Prediction for test image:", np.squeeze(prediction))

Prediction for test image: 6
