From the TensorFlow MNIST beginners tutorial (https://www.tensorflow.org/tutorials/mnist/beginners/)


The MNIST data is split into three parts: 55,000 data points of training data (mnist.train), 10,000 points of test data (mnist.test), and 5,000 points of validation data (mnist.validation).
We will build a softmax regression model that classifies these digit images:
![ML_MNIST](https://www.tensorflow.org/images/softmax-regression-scalargraph.png)
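
As a rough illustration of what the softmax regression diagram computes, here is a minimal pure-Python sketch. The helper names `softmax` and `predict` and the toy sizes (3 "pixels", 2 classes) are assumptions made for readability; the notebook itself uses TensorFlow with 784 pixels and 10 classes.

```python
import math

def softmax(evidence):
    """Turn a list of raw scores into probabilities that sum to 1."""
    # Subtracting the maximum leaves the result unchanged but avoids overflow in exp.
    shift = max(evidence)
    exps = [math.exp(e - shift) for e in evidence]
    total = sum(exps)
    return [e / total for e in exps]

def predict(pixels, weights, biases):
    """Softmax regression: evidence_j = sum_i pixels[i] * weights[i][j] + biases[j]."""
    evidence = [
        sum(p * row[j] for p, row in zip(pixels, weights)) + biases[j]
        for j in range(len(biases))
    ]
    return softmax(evidence)

# Toy input: 3 "pixels" and 2 classes instead of 784 pixels and 10 digits.
probs = predict([1.0, 0.0, 2.0], [[0.1, -0.1], [0.3, 0.2], [0.0, 0.5]], [0.0, 0.0])
```

Here the second class collects more evidence (0.9 versus 0.1), so it receives the larger probability.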

In [1]:
# Get the MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
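
With `one_hot=True`, each label is returned as a 10-dimensional vector containing a single 1 at the index of the digit. The sketch below shows the encoding in plain Python; `one_hot` and `from_one_hot` are hypothetical helpers written for illustration, not part of the `input_data` module.

```python
def one_hot(label, num_classes=10):
    """Encode an integer class label as a one-hot vector."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

def from_one_hot(vec):
    """Recover the integer label: the index of the largest entry."""
    return vec.index(max(vec))

three = one_hot(3)  # 1.0 at index 3, 0.0 everywhere else
```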


In [2]:
import tensorflow as tf

# Placeholder for the input images. None lets the batch dimension take any size;
# 784 is the length of each flattened 28x28 image.
x = tf.placeholder(tf.float32, [None, 784])

# Variables (modifiable tensors that live in the graph) for the weights and biases.
# W has shape [784, 10] because we multiply each 784-dimensional image vector by it
# to produce a 10-dimensional vector of evidence, one entry per digit class.
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# The model: softmax applied to the evidence tf.matmul(x, W) + b
y = tf.nn.softmax(tf.matmul(x, W) + b)

# input for the correct answers during training
y_ = tf.placeholder(tf.float32, [None, 10])

# Cross-entropy measures how far the predictions are from the correct labels;
# see http://colah.github.io/posts/2015-09-Visual-Information/
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

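First, tf.log computes the logarithm of each element of y. Next, we multiply each element of y_ by the corresponding element of tf.log(y). Then tf.reduce_sum adds up these products along the second dimension (the 10 classes), because of the reduction_indices=[1] parameter, giving one loss value per example. Finally, tf.reduce_mean computes the mean over all the examples in the batch.

Note that this formulation is numerically unstable. Applying tf.nn.softmax_cross_entropy_with_logits to the unnormalized logits tf.matmul(x, W) + b is more stable, because that function computes the softmax internally.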
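
To see why working from the logits helps, here is a minimal pure-Python sketch that computes the same cross-entropy through a log-sum-exp formulation. It only illustrates the idea and is not TensorFlow's actual implementation; the function names are assumptions.

```python
import math

def log_softmax(logits):
    """log(softmax(logits)), computed with the log-sum-exp trick."""
    shift = max(logits)
    log_sum = shift + math.log(sum(math.exp(z - shift) for z in logits))
    return [z - log_sum for z in logits]

def cross_entropy(label, logits):
    """-sum_i label[i] * log(softmax(logits)[i]) for a single example."""
    return -sum(t * lp for t, lp in zip(label, log_softmax(logits)))

def mean_cross_entropy(labels, batch_logits):
    """Average the per-example losses over a batch."""
    losses = [cross_entropy(t, z) for t, z in zip(labels, batch_logits)]
    return sum(losses) / len(losses)

# Uniform logits over 2 classes give a loss of log(2).
loss_uniform = cross_entropy([1.0, 0.0], [0.0, 0.0])

# Extreme logits: a naive softmax would assign probability 0 to class 0,
# and taking log(0) would fail. This version stays finite.
loss_extreme = cross_entropy([1.0, 0.0], [-1000.0, 1000.0])
```

The naive route computes the softmax first and takes its logarithm afterwards, so any probability that underflows to zero breaks the loss. Keeping everything in log space avoids that.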

For background on backpropagation, see http://colah.github.io/posts/2015-08-Backprop/

In [3]:
# Define the training step. Because TensorFlow knows the entire computation graph,
# it can apply backpropagation for us.
# We ask TensorFlow to minimize cross_entropy using gradient descent with a learning rate of 0.5.
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
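
For intuition about what a single gradient descent update does, the sketch below performs one step by hand for softmax regression on one example. It assumes the standard result that, for softmax followed by cross-entropy, the gradient with respect to each logit is the predicted probability minus the label. The toy sizes and helper names are made up, and TensorFlow derives the gradients automatically and averages them over a batch instead.

```python
import math

def softmax(z):
    shift = max(z)
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def forward(x, W, b):
    """Predicted class probabilities for one input vector x."""
    logits = [sum(xi * W[i][j] for i, xi in enumerate(x)) + b[j] for j in range(len(b))]
    return softmax(logits)

def loss(x, t, W, b):
    """Cross-entropy between one-hot label t and the prediction for x."""
    p = forward(x, W, b)
    return -sum(ti * math.log(pi) for ti, pi in zip(t, p))

def sgd_step(x, t, W, b, learning_rate=0.5):
    """One gradient descent update on a single example (modifies W and b in place)."""
    p = forward(x, W, b)
    # Assumed standard gradient: d(loss)/d(logit_j) = p_j - t_j.
    delta = [pj - tj for pj, tj in zip(p, t)]
    for i, xi in enumerate(x):
        for j, dj in enumerate(delta):
            W[i][j] -= learning_rate * xi * dj
    for j, dj in enumerate(delta):
        b[j] -= learning_rate * dj

# Toy problem: 2 features, 2 classes, parameters initialised to zero.
x, t = [1.0, 2.0], [0.0, 1.0]
W = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]
loss_before = loss(x, t, W, b)
sgd_step(x, t, W, b)
loss_after = loss(x, t, W, b)
```

After the step, the weights have moved toward the correct class and the loss on this example is lower than before.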

In [4]:
# Initialize the variables and launch the graph in a session
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

In [5]:
# Training loop: 1000 steps, each on a batch of 100 training examples
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

tf.argmax is a useful function that returns the index of the highest entry in a tensor along some axis. For example, tf.argmax(y, 1) is the label our model thinks is most likely for each input, while tf.argmax(y_, 1) is the correct label.
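
The same comparison can be sketched in plain Python to make the accuracy computation concrete. The `argmax` and `accuracy` helpers and the three example rows are invented for illustration.

```python
def argmax(values):
    """Index of the largest entry (the first one if there are ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(predictions, labels):
    """Fraction of rows where the predicted class matches the one-hot label's class."""
    correct = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
    return sum(correct) / len(correct)

# Three made-up predictions over 3 classes; the model is right on the first two.
predictions = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.3, 0.3, 0.4]]
labels = [[0, 1, 0], [1, 0, 0], [0, 1, 0]]
acc = accuracy(predictions, labels)
```

Averaging the booleans works because Python counts True as 1 and False as 0, which mirrors casting to float and taking the mean.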

In [6]:
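# Boolean vector: True where the predicted label matches the correct label
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
# Cast the booleans to floats and average them to get the fraction correct
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Evaluate the accuracy on the test set
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))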


0.92
