# TensorFlow Tutorial: MNIST for Beginners
As published on: https://www.tensorflow.org/versions/r0.11/tutorials/mnist/beginners/

## Don't forget to read
Visualizing MNIST: An Exploration of Dimensionality Reduction: http://colah.github.io/posts/2014-10-Visualizing-MNIST/

Softmax explained in detail: http://neuralnetworksanddeeplearning.com/chap3.html#softmax

Backprop: http://colah.github.io/posts/2015-08-Backprop/

Scoreboard for MNIST models: http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html

## The MNIST data
The MNIST data is hosted on Yann LeCun's website; the cell below downloads it and extracts it into `MNIST_data/`.

In [1]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting MNIST_data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
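
The `one_hot=True` argument above means each label is stored as a 10-element vector with a single 1 at the position of the digit, rather than as an integer. As a rough illustration in plain Python (the helper name `one_hot` is made up for this sketch and is not part of the TensorFlow API):

```python
def one_hot(digit, num_classes=10):
    """Return a list of zeros with a single 1.0 at index `digit`."""
    vec = [0.0] * num_classes
    vec[digit] = 1.0
    return vec

# Hypothetical helper, for illustration only.
print(one_hot(3))  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```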


Import the required library:

In [2]:
import tensorflow as tf

Define a placeholder for the input data (MNIST images), the model variables, and the model itself:

In [4]:
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

y = tf.nn.softmax(tf.matmul(x, W) + b)
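
To make the model line concrete, here is a minimal plain-Python sketch of what `softmax(xW + b)` computes for a single input row. The toy sizes (3 inputs, 2 classes instead of 784 and 10) and the helper names `softmax` and `linear` are assumptions chosen for readability; this is not how TensorFlow implements the op.

```python
import math

def softmax(logits):
    """Map a list of raw scores to probabilities that sum to 1."""
    # Subtracting the maximum does not change the result but avoids overflow in exp.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear(x, W, b):
    """Compute x W + b for one input row x, weights W (one row per input), bias b."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]

# Toy sizes, assumed for illustration: 3 inputs, 2 classes.
x = [1.0, 2.0, 3.0]
W = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]  # zero-initialised, like tf.zeros above
b = [0.0, 0.0]

y = softmax(linear(x, W, b))
print(y)  # [0.5, 0.5]: with zero weights every class is equally likely
```

With all-zero weights the model assigns the same probability to every class, which is why the untrained MNIST model starts at chance level.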

## Training
To train the model, we need to define what it means for the model to be good. In fact, we will define what it means for the model to be bad, using a cost function.
The cost function used in this case is called cross-entropy, and it is defined as follows:

$$H_{y'}(y)=-\sum_i{y'_{i}\log(y_{i})}$$

Where $y$ is our predicted probability distribution, and $y'$ is the true distribution (the one-hot vector with the digit labels).
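
As a small worked example of the formula, the sketch below evaluates the cross-entropy for one example with three classes. The probability values are invented for illustration. Note one deliberate difference from the TensorFlow cell: terms where $y'_i = 0$ are skipped, which gives the same sum but sidesteps $\log(0)$.

```python
import math

def cross_entropy(y_true, y_pred):
    """H_{y'}(y) = -sum_i y'_i * log(y_i) for a single example."""
    # Terms with y'_i == 0 contribute nothing, so they are skipped to avoid log(0).
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)

y_true = [0.0, 0.0, 1.0]         # one-hot label: the true class is index 2
confident = [0.05, 0.05, 0.90]   # most probability on the correct class
wrong = [0.80, 0.15, 0.05]       # most probability on an incorrect class

print(cross_entropy(y_true, confident))  # about 0.105
print(cross_entropy(y_true, wrong))      # about 2.996
```

Because the label is one-hot, the sum reduces to $-\log$ of the probability assigned to the true class, so the cost grows as that probability shrinks.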

In [8]:
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

# Alternatively, a numerically stable version computed directly from the logits:
# tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=tf.matmul(x, W) + b, labels=y_))
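
The comment above mentions numerical stability. One common way to get it is the log-sum-exp identity $\log \mathrm{softmax}(z)_k = z_k - \log\sum_j e^{z_j}$, which never forms a tiny probability before taking the log. The sketch below is plain Python with made-up function names and extreme logits chosen to trigger the problem; it illustrates the idea only and is not claimed to mirror TensorFlow's internal implementation.

```python
import math

def naive_cross_entropy(logits, label):
    """Softmax first, then log: fails when a probability underflows to 0."""
    exps = [math.exp(v) for v in logits]
    total = sum(exps)
    return -math.log(exps[label] / total)

def stable_cross_entropy(logits, label):
    """Cross-entropy from logits via log-sum-exp, shifted by the maximum logit."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[label]

extreme = [0.0, -800.0]  # exp(-800) underflows to 0.0 in double precision

try:
    naive_cross_entropy(extreme, 1)
except ValueError:
    print("naive version failed: log(0)")

print(stable_cross_entropy(extreme, 1))  # 800.0
```

For moderate logits the two functions agree; they differ only when the intermediate probability cannot be represented.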

In [9]:
# Ask TF to minimize cross_entropy using Gradient Descent with a learning rate of 0.5:
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

In [11]:
# Create the op that initializes all variables.
# Note: tf.initialize_all_variables is deprecated (see the warning below).
init = tf.initialize_all_variables()

Instructions for updating:
Use `tf.global_variables_initializer` instead.


In [12]:
sess = tf.Session()
sess.run(init)

Stochastic gradient descent:

In [13]:
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
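
To show roughly what each `train_step` run does, here is a plain-Python sketch of mini-batch gradient descent for softmax regression. It relies on the standard result that the gradient of the cross-entropy with respect to the logits is $y - y'$. The toy data set, the batch size of 20, and all function names are assumptions for illustration; TensorFlow derives the gradients automatically, and `next_batch` iterates over shuffled data rather than sampling as done here.

```python
import math
import random

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict(x, W, b):
    """Class probabilities softmax(x W + b) for one input row."""
    return softmax([sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
                    for j in range(len(b))])

def sgd_step(batch, W, b, lr):
    """One gradient-descent update on the mean cross-entropy of a mini-batch."""
    n_in, n_out = len(W), len(b)
    grad_W = [[0.0] * n_out for _ in range(n_in)]
    grad_b = [0.0] * n_out
    for x, y_true in batch:
        y = predict(x, W, b)
        for j in range(n_out):
            delta = (y[j] - y_true[j]) / len(batch)  # d(mean loss) / d(logit_j)
            grad_b[j] += delta
            for i in range(n_in):
                grad_W[i][j] += x[i] * delta
    for j in range(n_out):
        b[j] -= lr * grad_b[j]
        for i in range(n_in):
            W[i][j] -= lr * grad_W[i][j]

# Hypothetical toy data: class 0 if the first feature is larger, else class 1.
random.seed(0)
data = []
for _ in range(200):
    x = [random.random(), random.random()]
    data.append((x, [1.0, 0.0] if x[0] > x[1] else [0.0, 1.0]))

W = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]
for step in range(1000):
    batch = random.sample(data, 20)  # a random mini-batch
    sgd_step(batch, W, b, lr=0.5)

def predicted_class(x):
    probs = predict(x, W, b)
    return max(range(len(probs)), key=lambda j: probs[j])

correct = sum(predicted_class(x) == y.index(1.0) for x, y in data)
print(correct / len(data))  # fraction of the toy training set classified correctly
```

Using a small random batch per step keeps each update cheap while still following the gradient of the full cost on average, which is the "stochastic" part of the method.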

## Evaluating our model

In [16]:
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

0.9155
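
The evaluation above compares the index of the largest predicted probability with the index of the 1 in the one-hot label, then averages the matches. A minimal plain-Python sketch of the same computation, using four invented predictions over three classes:

```python
def argmax(values):
    """Index of the largest entry (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

# Made-up predictions and one-hot labels, for illustration only.
predictions = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
labels = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
    [0, 1, 0],
]

correct = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
accuracy = sum(correct) / len(correct)

print(correct)   # [True, False, True, True]
print(accuracy)  # 0.75
```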
