# Neural Networks with MNIST

### In this notebook we build a neural network to predict the value of handwritten digits:
<img src="https://tensorflow.rstudio.com/tensorflow/articles/images/MNIST.png"
     alt="Image Didn't Load"
     style="float: left; margin-right: 10px;" />  
<br>
<br>
<br>
<br>
<br>
<br>



### We're going to use a network that looks like this:
![](./images/tf_vis.png)  

### Network structure
We will take in all 28 * 28 = 784 pixel values from each image (these are grayscale images, so we don't have to worry about color channels). Each hidden layer multiplies its inputs by that layer's weights, adds a bias, and puts the result through the activation function called ReLU (Rectified Linear Unit). The output layer produces ten scores, one per digit (0,...,9); applying softmax to those scores gives a probability distribution indicating how likely each digit is to be the number in the image.

![](./images/net_vis.png)
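
To make the input and output shapes concrete, here is a minimal sketch in plain Python (no TensorFlow) of flattening a 28 x 28 grayscale image into 784 values and of turning ten output scores into a probability distribution with softmax. The helper names `flatten` and `softmax` and the example scores are made up for illustration.

```python
import math

def flatten(image):
    """Turn a 2-D grid of pixel values (a list of rows) into one flat list."""
    return [pixel for row in image for pixel in row]

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shifted = [s - max(scores) for s in scores]  # subtract the max for numerical stability
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]

# A blank 28 x 28 grayscale image: one number per pixel, no color channels.
image = [[0.0] * 28 for _ in range(28)]
pixels = flatten(image)
print(len(pixels))  # 784 inputs to the network

# Hypothetical scores from the output layer, one per digit 0-9.
scores = [0.1, 0.3, 0.2, 4.0, 0.0, 0.5, 0.1, 0.2, 0.9, 0.4]
probs = softmax(scores)
predicted_digit = probs.index(max(probs))
print(predicted_digit)  # index of the most likely digit
```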

In [13]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

# import tensorflow
import tensorflow as tf
# import MNIST (MNIST is a dataset containing correctly labeled images of handwritten numbers)
from tensorflow.examples.tutorials.mnist import input_data

In [2]:
# Create a variable containing MNIST data
# There are 70,000 images in total (55,000 training, 5,000 validation, 10,000 test)
# Just imagine the man hours :P
mnist = input_data.read_data_sets("/tmp/data/", one_hot=False)
# Ignore the warnings :P

Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
Instructions for updating:
Please write your own downloading logic.
Instructions for updating:
Please use urllib or similar directly.
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting /tmp/data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.


### Setting hyperparameters 
This is where we specify how we want our network to "look" and behave.

In [3]:
# How much of a step we take. Higher learning rates mean faster learning but less stable training
learning_rate = 0.001 

# How many images we train on at every step.
# Remember this is mini-batch gradient descent, so each small step toward the goal uses just 128 images
batch_size = 128

# run 2000 steps
n_steps = 2000

# How many neurons for each layer
n_inputs = 28 * 28 # it's a 28 by 28 pixel image, so the input has 28 * 28 = 784 values
n_hidden1 = 300
n_hidden2 = 100
n_output = 10 # output is probabilities of the image being a particular number
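
As a rough sanity check on these settings, the sketch below counts the trainable parameters implied by the layer sizes (one weight per input-unit pair plus one bias per unit) and how many passes over the training set 2000 steps of 128 images amount to. The figure of 55,000 training images is an assumption based on the default train/validation split of `read_data_sets`; the variable names `layer_sizes`, `n_params`, and `n_train` are illustrative.

```python
batch_size = 128
n_steps = 2000
layer_sizes = [28 * 28, 300, 100, 10]  # inputs, hidden1, hidden2, output

# Each dense layer has (inputs * units) weights plus one bias per unit.
n_params = sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))
print(n_params)  # 266610

# Assumption: 55,000 training images (default train/validation split).
n_train = 55000
images_seen = n_steps * batch_size
epochs = images_seen / n_train
print(images_seen, round(epochs, 2))  # 256000 4.65
```

So the network has far more parameters than pixels per image, and the whole training run only passes over the training data a handful of times.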

### Placeholders
In a TensorFlow graph we create placeholders, which are filled with our input and output data when the graph is run (i.e. the images and the digit each one represents).

In [4]:
# create placeholders
# this is where we will feed in the images and the labels
# x is where we will feed in the image
x = tf.placeholder(tf.float32, shape=(None, n_inputs), name="X")

# y is where we will feed in the label for the corresponding image
# i.e. if the image is a 3, y is 3 (one integer label per image in the batch)
y = tf.placeholder(tf.int64, shape=(None,), name="y")

### Constructing the actual network
After the input layer we have three more layers: hidden layer 1, hidden layer 2, and the output layer. Here we will use tf.layers.dense(inputs, units, activation), where <b>inputs</b> is the input to the layer, <b>units</b> is the number of nodes in that layer, and <b>activation</b> is the activation function applied in that layer (no activation if left blank). Remember that we want to use the ReLU function, which outputs max(0, x) for an input x (https://machinelearningmastery.com/rectified-linear-activation-function-for-deep-learning-neural-networks/).
<br>
<b>ReLU</b>:

<img src="https://3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com/wp-content/uploads/2018/10/Line-Plot-of-Rectified-Linear-Activation-for-Negative-and-Positive-Inputs.png"
     alt="Image Didn't Load"/>
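
The following is a minimal plain-Python sketch of what a dense layer with a ReLU activation computes: a weighted sum of the inputs plus a bias for each unit, followed by max(0, x). It is not how `tf.layers.dense` is implemented; the function names `relu` and `dense`, the weights, and the inputs are invented for illustration.

```python
def relu(x):
    """Rectified Linear Unit: pass positive values through, clamp negatives to 0."""
    return max(0.0, x)

def dense(inputs, weights, biases, activation=None):
    """Compute one fully connected layer.

    weights[j] holds the weights of unit j (one per input), biases[j] its bias.
    """
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(activation(z) if activation is not None else z)
    return outputs

# Tiny example: 3 inputs feeding a layer of 2 units.
inputs = [1.0, 2.0, 3.0]
weights = [[0.5, -1.0, 0.5],   # unit 0: 0.5 - 2.0 + 1.5 = 0.0
           [-1.0, -1.0, 0.0]]  # unit 1: -1.0 - 2.0 + 0.0 = -3.0
biases = [1.0, 1.0]

print(dense(inputs, weights, biases))                   # [1.0, -2.0]
print(dense(inputs, weights, biases, activation=relu))  # [1.0, 0.0]
```

Without an activation the second unit outputs a negative value; with ReLU it is clamped to zero, which is what the hidden layers in this notebook do.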

In [6]:
# create the neural network
# hidden1 layer takes in our image input (x)
# we will be using the relu activation function

hidden1_layer = tf.layers.dense(x, n_hidden1, activation=tf.nn.relu)
hidden2_layer = tf.layers.dense(hidden1_layer, n_hidden2, activation=tf.nn.relu)
output_layer = tf.layers.dense(hidden2_layer, n_output)

# if you get some weird error about reuse, restart the kernel

# for more about activation functions:
# https://towardsdatascience.com/activation-functions-and-its-types-which-is-better-a9a5310cc8f

### Loss
Here we will use cross entropy to measure our error: it tells us how close our predicted probability distribution is to the ground-truth probability distribution (i.e. a one-hot vector with the one at the index of the correct digit for the image). We then reduce the errors of all the images in the batch to a single error by averaging them. More on cross entropy: https://jamesmccaffrey.wordpress.com/2013/11/05/why-you-should-use-cross-entropy-error-instead-of-classification-error-or-mean-squared-error-for-neural-network-classifier-training/.

<br>
<b>Cross Entropy:</b><br>
p(x) is 1 for the correct "class" and 0 otherwise; q(x) is our predicted probability for class x; the sum runs over each x in (0,...,9)
<img src="http://4.bp.blogspot.com/-O5gRO-nJIQc/WacXCgSMdxI/AAAAAAAA6Gw/ReT-cKX0MM0BYiDXJ9JJ_WFBzcBluvgqwCK4BGAYYCw/s1600/cross%2Bentropy.png"
     alt="Image Didn't Load"/>
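
Because p(x) is 1 only for the correct digit, the sum collapses to the negative log of the probability the network assigns to that digit. The sketch below works this through in plain Python for a single image, starting from raw output scores (logits) and an integer label. The helper names and numbers are illustrative and are not part of TensorFlow.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shifted = [z - max(logits) for z in logits]  # for numerical stability
    exps = [math.exp(z) for z in shifted]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Cross entropy between a one-hot target at index `label` and softmax(logits)."""
    q = softmax(logits)
    p = [1.0 if i == label else 0.0 for i in range(len(logits))]
    # -sum over x of p(x) * log q(x); terms with p(x) == 0 contribute nothing
    return -sum(p_x * math.log(q_x) for p_x, q_x in zip(p, q) if p_x > 0)

uniform = [0.0] * 10                # network has no idea: every digit gets 1/10
confident = [0.0] * 10
confident[3] = 8.0                  # network strongly favors the digit 3

print(cross_entropy(uniform, 3))    # log(10), about 2.30
print(cross_entropy(confident, 3))  # small: the favored digit is correct
print(cross_entropy(confident, 7))  # large: the favored digit is wrong

# The loss for a batch is the mean of the per-image values.
batch = [cross_entropy(uniform, 3), cross_entropy(confident, 3)]
print(sum(batch) / len(batch))
```

A confident correct prediction is barely penalized, while a confident wrong one is penalized heavily, which is the behavior we want from a classification loss.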

In [9]:
# training
# instead of using squared error like the 3Blue1Brown video, we use cross entropy error
cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=output_layer)

# average the per-image errors over the batch with reduce_mean
loss = tf.reduce_mean(cross_entropy)

### Setting up the training operations

In [10]:
# we will use Adam Optimizer to do all the backpropagation heavy lifting for us
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)

# create an operator that we can call on to minimize loss
training_op = optimizer.minimize(loss)
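
For intuition about what the optimizer does with each gradient, here is a minimal sketch of the Adam update rule for a single parameter in plain Python. It is a simplified illustration, not TensorFlow's implementation: the function `adam_minimize`, the toy loss (w - 3)^2, and the learning rate of 0.1 are assumptions chosen for the example, while beta1 = 0.9, beta2 = 0.999, and epsilon = 1e-8 follow the commonly used defaults.

```python
import math

def adam_minimize(grad_fn, w, learning_rate, n_steps,
                  beta1=0.9, beta2=0.999, epsilon=1e-8):
    """Minimize a one-parameter function given its gradient, using Adam."""
    m = 0.0  # running average of gradients (first moment)
    v = 0.0  # running average of squared gradients (second moment)
    for t in range(1, n_steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # correct the bias toward zero at early steps
        v_hat = v / (1 - beta2 ** t)
        w -= learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return w

# Toy problem: loss(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_final = adam_minimize(lambda w: 2 * (w - 3), w=0.0,
                        learning_rate=0.1, n_steps=1000)
print(w_final)  # ends close to the minimum at w = 3
```

In the notebook the gradients come from backpropagation through the network rather than from a hand-written formula, and the same kind of update is applied to every weight and bias.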

In [11]:
# create operators for evaluating the network
with tf.name_scope("eval"):
    # correct is a vector of booleans: True where the top prediction matches the label
    correct = tf.nn.in_top_k(output_layer, y, 1)
    # accuracy is the fraction of correct predictions (the mean of the 0/1 values)
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
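
With k = 1, `in_top_k` checks whether the highest-scoring digit equals the label, and the accuracy is the mean of those checks. A plain-Python sketch of the same idea follows. The function names and the three-class scores are made up for illustration, and ties are resolved by taking the first maximum, which may differ from how TensorFlow handles them.

```python
def is_top1_correct(scores, label):
    """True if the highest score sits at the index given by `label`."""
    return scores.index(max(scores)) == label

def accuracy(batch_scores, labels):
    """Fraction of examples whose top prediction matches the label."""
    correct = [is_top1_correct(s, y) for s, y in zip(batch_scores, labels)]
    return sum(correct) / len(correct)  # True counts as 1, False as 0

# Hypothetical scores for four images over three classes.
batch_scores = [
    [2.0, 0.1, 0.3],   # predicts class 0
    [0.2, 0.1, 1.5],   # predicts class 2
    [0.9, 1.1, 0.4],   # predicts class 1
    [0.3, 0.2, 0.1],   # predicts class 0
]
labels = [0, 2, 0, 0]

print(accuracy(batch_scores, labels))  # 0.75
```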

### Setting up the session and running the graph
Now that we have our data, network, hyperparameters, etc., we want to actually train the network so that it outputs correct values.
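
Each training step works on a small random batch rather than the whole training set. The sketch below shows one simple way to produce such batches in plain Python by shuffling once per pass over the data. It illustrates the idea only and is not the implementation of `mnist.train.next_batch`; the generator name `iterate_minibatches` and the toy data are hypothetical.

```python
import random

def iterate_minibatches(images, labels, batch_size, n_steps, seed=0):
    """Yield (image_batch, label_batch) pairs, reshuffling after each full pass."""
    rng = random.Random(seed)
    order = list(range(len(images)))
    rng.shuffle(order)
    position = 0
    for _ in range(n_steps):
        if position + batch_size > len(order):
            # Not enough unseen examples left: start a new pass (epoch).
            rng.shuffle(order)
            position = 0
        chosen = order[position:position + batch_size]
        position += batch_size
        yield [images[i] for i in chosen], [labels[i] for i in chosen]

# Toy data: 10 "images" identified by a number, each labeled with its parity.
images = list(range(10))
labels = [i % 2 for i in images]

batches = list(iterate_minibatches(images, labels, batch_size=4, n_steps=5))
for x_batch, y_batch in batches:
    print(x_batch, y_batch)
```

Every batch keeps images and labels aligned, which is the property the training loop relies on when it feeds `x` and `y` together.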

In [12]:
# creates a tensorflow session
# the session will keep all of our values and current state of the network
with tf.Session() as sess:
    
    # first initialize the values in the network
    tf.global_variables_initializer().run()
    
    for step in range(n_steps):
        # creates a batch of images and labels
        x_batch, y_batch = mnist.train.next_batch(batch_size)
        
        # feed in the batch of images and batch of labels
        # and run the training operator on them
        sess.run(training_op, feed_dict={x: x_batch, y: y_batch})
        
        # feed in the test images and the test labels (which the network has never seen)
        # and evaluate the accuracy
        accuracy_val = sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels})
        
        print(step, "Test accuracy:", accuracy_val)

0 Test accuracy: 0.1773
1 Test accuracy: 0.4132
2 Test accuracy: 0.5306
3 Test accuracy: 0.5869
4 Test accuracy: 0.6333
5 Test accuracy: 0.668
6 Test accuracy: 0.6878
7 Test accuracy: 0.7084
8 Test accuracy: 0.729
9 Test accuracy: 0.7553
10 Test accuracy: 0.7786
11 Test accuracy: 0.7905
12 Test accuracy: 0.7974
13 Test accuracy: 0.8028
14 Test accuracy: 0.8063
15 Test accuracy: 0.8143
16 Test accuracy: 0.8185
17 Test accuracy: 0.82
18 Test accuracy: 0.8223
19 Test accuracy: 0.837
20 Test accuracy: 0.8452
21 Test accuracy: 0.849
22 Test accuracy: 0.8523
23 Test accuracy: 0.8558
24 Test accuracy: 0.858
25 Test accuracy: 0.8601
26 Test accuracy: 0.8663
27 Test accuracy: 0.8688
28 Test accuracy: 0.8684
29 Test accuracy: 0.8704
30 Test accuracy: 0.8713
31 Test accuracy: 0.8729
32 Test accuracy: 0.8696
33 Test accuracy: 0.8684
34 Test accuracy: 0.8673
35 Test accuracy: 0.868
36 Test accuracy: 0.8694
37 Test accuracy: 0.8791
38 Test accuracy: 0.8846
39 Test accuracy: 0.8817
...

322 Test accuracy: 0.951
323 Test accuracy: 0.9523
324 Test accuracy: 0.9557
325 Test accuracy: 0.9567
326 Test accuracy: 0.957
327 Test accuracy: 0.9568
328 Test accuracy: 0.9553
329 Test accuracy: 0.9544
330 Test accuracy: 0.9559
331 Test accuracy: 0.9567
332 Test accuracy: 0.957
333 Test accuracy: 0.9576
334 Test accuracy: 0.958
335 Test accuracy: 0.9568
336 Test accuracy: 0.9532
337 Test accuracy: 0.9498
338 Test accuracy: 0.949
339 Test accuracy: 0.9506
340 Test accuracy: 0.9511
341 Test accuracy: 0.952
342 Test accuracy: 0.9547
343 Test accuracy: 0.9543
344 Test accuracy: 0.9546
345 Test accuracy: 0.952
346 Test accuracy: 0.9537
347 Test accuracy: 0.9537
348 Test accuracy: 0.9545
349 Test accuracy: 0.9563
350 Test accuracy: 0.9574
351 Test accuracy: 0.9584
352 Test accuracy: 0.9597
353 Test accuracy: 0.9585
354 Test accuracy: 0.9578
355 Test accuracy: 0.9587
356 Test accuracy: 0.9607
357 Test accuracy: 0.9606
358 Test accuracy: 0.9604
359 Test accuracy: 0.9603
...

641 Test accuracy: 0.9675
642 Test accuracy: 0.9686
643 Test accuracy: 0.9694
644 Test accuracy: 0.9706
645 Test accuracy: 0.9704
646 Test accuracy: 0.9689
647 Test accuracy: 0.9686
648 Test accuracy: 0.9687
649 Test accuracy: 0.9674
650 Test accuracy: 0.9663
651 Test accuracy: 0.966
652 Test accuracy: 0.9657
653 Test accuracy: 0.966
654 Test accuracy: 0.9673
655 Test accuracy: 0.9676
656 Test accuracy: 0.9679
657 Test accuracy: 0.9682
658 Test accuracy: 0.9682
659 Test accuracy: 0.9679
660 Test accuracy: 0.9675
661 Test accuracy: 0.9667
662 Test accuracy: 0.9666
663 Test accuracy: 0.9673
664 Test accuracy: 0.9685
665 Test accuracy: 0.9702
666 Test accuracy: 0.9704
667 Test accuracy: 0.9703
668 Test accuracy: 0.9699
669 Test accuracy: 0.97
670 Test accuracy: 0.9698
671 Test accuracy: 0.9694
672 Test accuracy: 0.9698
673 Test accuracy: 0.9698
674 Test accuracy: 0.9697
675 Test accuracy: 0.9691
676 Test accuracy: 0.9701
677 Test accuracy: 0.97
678 Test accuracy: 0.9692
...

960 Test accuracy: 0.9704
961 Test accuracy: 0.9698
962 Test accuracy: 0.9687
963 Test accuracy: 0.9693
964 Test accuracy: 0.9689
965 Test accuracy: 0.9695
966 Test accuracy: 0.9706
967 Test accuracy: 0.9716
968 Test accuracy: 0.9734
969 Test accuracy: 0.9735
970 Test accuracy: 0.9722
971 Test accuracy: 0.9717
972 Test accuracy: 0.9703
973 Test accuracy: 0.9705
974 Test accuracy: 0.9705
975 Test accuracy: 0.9714
976 Test accuracy: 0.9728
977 Test accuracy: 0.9736
978 Test accuracy: 0.9745
979 Test accuracy: 0.9744
980 Test accuracy: 0.9747
981 Test accuracy: 0.9752
982 Test accuracy: 0.9746
983 Test accuracy: 0.9749
984 Test accuracy: 0.9754
985 Test accuracy: 0.9753
986 Test accuracy: 0.9752
987 Test accuracy: 0.9749
988 Test accuracy: 0.9753
989 Test accuracy: 0.9752
990 Test accuracy: 0.9752
991 Test accuracy: 0.9752
992 Test accuracy: 0.9751
993 Test accuracy: 0.975
994 Test accuracy: 0.9755
995 Test accuracy: 0.9752
996 Test accuracy: 0.9756
997 Test accuracy: 0.9754
...

1267 Test accuracy: 0.9697
1268 Test accuracy: 0.9696
1269 Test accuracy: 0.9702
1270 Test accuracy: 0.9721
1271 Test accuracy: 0.9727
1272 Test accuracy: 0.9718
1273 Test accuracy: 0.9716
1274 Test accuracy: 0.9714
1275 Test accuracy: 0.9707
1276 Test accuracy: 0.9711
1277 Test accuracy: 0.9719
1278 Test accuracy: 0.973
1279 Test accuracy: 0.9734
1280 Test accuracy: 0.9737
1281 Test accuracy: 0.9738
1282 Test accuracy: 0.9743
1283 Test accuracy: 0.9741
1284 Test accuracy: 0.9744
1285 Test accuracy: 0.9744
1286 Test accuracy: 0.9751
1287 Test accuracy: 0.9757
1288 Test accuracy: 0.9772
1289 Test accuracy: 0.9775
1290 Test accuracy: 0.9764
1291 Test accuracy: 0.9752
1292 Test accuracy: 0.9749
1293 Test accuracy: 0.9752
1294 Test accuracy: 0.9762
1295 Test accuracy: 0.9767
1296 Test accuracy: 0.9767
1297 Test accuracy: 0.9769
1298 Test accuracy: 0.9771
1299 Test accuracy: 0.9755
1300 Test accuracy: 0.9744
1301 Test accuracy: 0.9733
1302 Test accuracy: 0.9724
1303 Test accuracy: 0.9724
...

1572 Test accuracy: 0.9766
1573 Test accuracy: 0.9762
1574 Test accuracy: 0.9757
1575 Test accuracy: 0.9771
1576 Test accuracy: 0.9768
1577 Test accuracy: 0.9766
1578 Test accuracy: 0.9772
1579 Test accuracy: 0.978
1580 Test accuracy: 0.9779
1581 Test accuracy: 0.9784
1582 Test accuracy: 0.9791
1583 Test accuracy: 0.9791
1584 Test accuracy: 0.9785
1585 Test accuracy: 0.9785
1586 Test accuracy: 0.9784
1587 Test accuracy: 0.9785
1588 Test accuracy: 0.9785
1589 Test accuracy: 0.9779
1590 Test accuracy: 0.9767
1591 Test accuracy: 0.9767
1592 Test accuracy: 0.9767
1593 Test accuracy: 0.9776
1594 Test accuracy: 0.9782
1595 Test accuracy: 0.9784
1596 Test accuracy: 0.9774
1597 Test accuracy: 0.9771
1598 Test accuracy: 0.9759
1599 Test accuracy: 0.9755
1600 Test accuracy: 0.9758
1601 Test accuracy: 0.9766
1602 Test accuracy: 0.9768
1603 Test accuracy: 0.9772
1604 Test accuracy: 0.9777
1605 Test accuracy: 0.9778
1606 Test accuracy: 0.9771
1607 Test accuracy: 0.9773
1608 Test accuracy: 0.9767
...

1879 Test accuracy: 0.975
1880 Test accuracy: 0.975
1881 Test accuracy: 0.9744
1882 Test accuracy: 0.9739
1883 Test accuracy: 0.9732
1884 Test accuracy: 0.9734
1885 Test accuracy: 0.9731
1886 Test accuracy: 0.9735
1887 Test accuracy: 0.9733
1888 Test accuracy: 0.9732
1889 Test accuracy: 0.9748
1890 Test accuracy: 0.9751
1891 Test accuracy: 0.9756
1892 Test accuracy: 0.9778
1893 Test accuracy: 0.9786
1894 Test accuracy: 0.979
1895 Test accuracy: 0.9794
1896 Test accuracy: 0.9784
1897 Test accuracy: 0.9786
1898 Test accuracy: 0.9785
1899 Test accuracy: 0.9776
1900 Test accuracy: 0.9769
1901 Test accuracy: 0.9767
1902 Test accuracy: 0.9771
1903 Test accuracy: 0.9777
1904 Test accuracy: 0.9791
1905 Test accuracy: 0.9794
1906 Test accuracy: 0.9795
1907 Test accuracy: 0.9791
1908 Test accuracy: 0.9782
1909 Test accuracy: 0.9769
1910 Test accuracy: 0.9758
1911 Test accuracy: 0.9745
1912 Test accuracy: 0.973
1913 Test accuracy: 0.9735
1914 Test accuracy: 0.9748
1915 Test accuracy: 0.9751
...