# MNIST for ML Beginners

In [1]:
import tensorflow as tf

## Intro

My notebook from working through the TensorFlow "MNIST for ML Beginners" tutorial.

This tutorial was found here:
https://www.tensorflow.org/versions/r0.10/tutorials/mnist/beginners/index.html#mnist-for-ml-beginners

The layout of this notebook is based on TensorFlow Mechanics 101, found here: https://www.tensorflow.org/versions/r0.10/tutorials/mnist/tf/index.html#tensorflow-mechanics-101

## Prepare the Data

### Download

Let's get our data. Here is some quick information about the MNIST dataset, according to the tutorial:

1. The dataset is split into 55k training, 10k test, and 5k validation records
2. Each record contains the image and the label
3. Each image is 28 by 28 pixels, represented by an array with length 784 (you'll see this number below)
4. Each label is an array with length 10. With `one_hot=True` it holds a 1 at the index of the digit and 0s everywhere else.
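To make the shapes in the list above concrete, here is a minimal sketch in plain Python (no TensorFlow needed). The helper names `flatten` and `one_hot` and the blank image are made up for illustration; they are not part of the tutorial or the `input_data` module.

```python
def one_hot(digit, num_classes=10):
    """Return a list of length num_classes with a 1.0 at index `digit`."""
    if not 0 <= digit < num_classes:
        raise ValueError("digit must be in range(num_classes)")
    vec = [0.0] * num_classes
    vec[digit] = 1.0
    return vec


def flatten(image):
    """Flatten a 2-D list of pixel rows into a single 1-D list, row by row."""
    return [pixel for row in image for pixel in row]


# A blank 28x28 image (all zeros), invented purely for illustration.
blank_image = [[0.0] * 28 for _ in range(28)]

flat = flatten(blank_image)
label = one_hot(3)

print(len(flat))  # 28 * 28 pixels
print(label)      # 1.0 at index 3, 0.0 elsewhere
```

The real dataset already comes flattened and one-hot encoded, so this is only meant to show where the numbers 784 and 10 come from.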

In [2]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


### Inputs and Placeholders

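According to the documentation, a placeholder is a tensor that "will be always fed". Placeholders don't store anything between runs; they are simply filled with data (through `feed_dict`) for the current iteration. The weights `W` and bias `b` below are Variables instead, so they keep their values across iterations and can be updated by training.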

In [3]:
# placeholder for a batch of flattened images (None = any batch size)
x = tf.placeholder(tf.float32, [None, 784])

# weight matrix and bias vector, both initialized with 0s
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

## Build the Graph

First we define what the network looks like and how it computes its output. Nothing is actually calculated at this point; TensorFlow only runs the graph later, when we call `sess.run`.

### Inference

This is the model. That's it! It multiplies x by W (both defined above), adds b, then runs the result through a softmax function to predict a label.

In [4]:
y = tf.nn.softmax(tf.matmul(x, W) + b)
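As a rough illustration of what that one line computes, here is a plain-Python sketch of softmax applied to `x W + b` for a single image. The names `softmax`, `predict`, `x_example`, `W_zero`, and `b_zero` are assumptions made for this example, and subtracting the maximum score is a common numerical-stability trick rather than something taken from the tutorial.

```python
import math


def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    # Subtracting the max does not change the result but avoids overflow.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(x, W, b):
    """Compute softmax(x W + b) for one flattened image x."""
    logits = [
        sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
        for j in range(len(b))
    ]
    return softmax(logits)


# With all-zero weights and biases, every score is 0,
# so every one of the 10 classes gets the same probability.
x_example = [0.5] * 784
W_zero = [[0.0] * 10 for _ in range(784)]
b_zero = [0.0] * 10

probs = predict(x_example, W_zero, b_zero)
print(probs)
```

This also shows why an untrained model with zero-initialized `W` and `b` has no preference for any digit: training has to move the weights away from zero before the predictions become useful.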

### Loss

With our variables and network set up, we now have to tell TensorFlow what to do with that network during each training iteration.

Here we are telling TensorFlow to do three things:

1. Predict y's using the current values of W, x, and b
2. Check to see how close these predicted y's (y) are to the actual y's (y\_)
3. Update the W and b values so the predicted y's move closer to the actual y's

In [5]:
# Placeholder for the actual (true) labels
y_ = tf.placeholder(tf.float32, [None, 10])

In [6]:
# Cross-entropy between true labels y_ and predictions y, averaged over the batch
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
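The same quantity can be sketched in plain Python: for each example, sum `-y_ * log(y)` over the classes, then average over the examples. The function name and the small 3-class toy values below are invented for illustration (MNIST has 10 classes), and the sketch assumes every predicted probability is strictly positive, since `math.log(0)` raises an error.

```python
import math


def cross_entropy(y_true, y_pred):
    """Mean over examples of -sum(y_true * log(y_pred))."""
    per_example = [
        -sum(t * math.log(p) for t, p in zip(true_row, pred_row))
        for true_row, pred_row in zip(y_true, y_pred)
    ]
    return sum(per_example) / len(per_example)


# Toy one-hot labels for two examples (true classes 0 and 1).
labels = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0]]

# Predictions that put most probability on the right class...
good_preds = [[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1]]
# ...and predictions that put most probability on a wrong class.
bad_preds = [[0.1, 0.2, 0.7],
             [0.8, 0.1, 0.1]]

good_loss = cross_entropy(labels, good_preds)
bad_loss = cross_entropy(labels, bad_preds)
print(good_loss, bad_loss)
```

Because the labels are one-hot, only the probability assigned to the true class contributes to each example's loss, so confident wrong predictions are penalized much more heavily than confident right ones.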

### Training

In [7]:
# One gradient descent step (learning rate 0.5) that minimizes cross_entropy
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
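To give a feel for what one such step does, here is a hand-written sketch of a single gradient descent update for softmax regression on one toy example. TensorFlow derives the gradients automatically; this sketch instead uses the standard closed-form result that the gradient of the cross-entropy with respect to the scores is `y - y_`. All names (`forward`, `loss_fn`, `sgd_step`) and the tiny 2-feature, 3-class setup are assumptions for illustration only.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, W, b):
    """softmax(x W + b) for a single example."""
    logits = [
        sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
        for j in range(len(b))
    ]
    return softmax(logits)


def loss_fn(y_true, y_pred):
    """Cross-entropy for a single example."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred))


def sgd_step(x, y_true, W, b, learning_rate=0.5):
    """Update W and b in place with one gradient descent step."""
    y_pred = forward(x, W, b)
    # Gradient of the loss with respect to the scores: prediction minus truth.
    grad_logits = [p - t for p, t in zip(y_pred, y_true)]
    for i in range(len(x)):
        for j in range(len(b)):
            W[i][j] -= learning_rate * x[i] * grad_logits[j]
    for j in range(len(b)):
        b[j] -= learning_rate * grad_logits[j]


# Toy problem: 2 input features, 3 classes, true class 0.
x_toy = [1.0, 2.0]
y_toy = [1.0, 0.0, 0.0]
W_toy = [[0.0] * 3 for _ in range(2)]
b_toy = [0.0] * 3

loss_before = loss_fn(y_toy, forward(x_toy, W_toy, b_toy))
sgd_step(x_toy, y_toy, W_toy, b_toy, learning_rate=0.5)
loss_after = loss_fn(y_toy, forward(x_toy, W_toy, b_toy))
print(loss_before, loss_after)
```

The loss drops after the update because the step nudges `W` and `b` in the direction that raises the score of the true class relative to the others.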

## Train the Model

We can now loop through 1000 iterations of the process defined above. Each iteration updates the W and b values, which is what trains the model.

In [8]:
# start our tensorflow session
sess = tf.Session()
# initialize the variables we defined above
init = tf.initialize_all_variables()
sess.run(init)

In [9]:
for i in range(1000):
    # randomly pull 100 records from mnist.train
    batch_xs, batch_ys = mnist.train.next_batch(100)
    # run train_step defined above on these random records
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

## Evaluate the Model

How does our trained model perform? Let's find out!

### Build the Eval Graph

In [10]:
# True where the most probable predicted digit matches the true digit
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))

In [11]:
# Cast booleans to 0.0/1.0 and average them to get the fraction correct
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
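The two eval cells boil down to "take the index of the largest value in each row, compare, and average". A minimal plain-Python sketch of that logic follows; the function names and the four made-up 3-class examples are assumptions for illustration.

```python
def argmax(values):
    """Index of the largest value (the first one, if there are ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(predictions, labels):
    """Fraction of rows where the predicted class matches the true class."""
    correct = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
    return sum(correct) / len(correct)


# Four made-up predictions over 3 classes, with their one-hot labels.
toy_predictions = [
    [0.7, 0.2, 0.1],  # predicts class 0
    [0.1, 0.8, 0.1],  # predicts class 1
    [0.3, 0.3, 0.4],  # predicts class 2
    [0.6, 0.3, 0.1],  # predicts class 0
]
toy_labels = [
    [1.0, 0.0, 0.0],  # true class 0 (correct)
    [0.0, 1.0, 0.0],  # true class 1 (correct)
    [0.0, 0.0, 1.0],  # true class 2 (correct)
    [0.0, 1.0, 0.0],  # true class 1 (wrong)
]

toy_accuracy = accuracy(toy_predictions, toy_labels)
print(toy_accuracy)  # 3 of 4 correct -> 0.75
```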

### Eval Output

In [12]:
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

0.917


The model reaches around 92% accuracy, which is what the tutorial said to expect. The tutorial also points out that this is quite poor compared with the current state of the art.