From: https://www.tensorflow.org/get_started/mnist/beginners

# Importing the data

In [1]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting MNIST_data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


In [2]:
import tensorflow as tf

`x` is a placeholder for the input images. Each image has 28 x 28 pixels, so every image is flattened into a 784-dimensional vector. The number of images we feed in is not fixed in advance, so the first dimension is `None`.

In [3]:
x = tf.placeholder(tf.float32, [None, 784])
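To make the shape `[None, 784]` concrete, here is a minimal plain-Python sketch of flattening one image. It assumes the pixels are flattened row by row; the image contents and the names `image`, `flat` and `batch` are made up for illustration.

```python
# A hypothetical 28 x 28 grayscale image as nested lists (all pixels 0.0).
image = [[0.0 for _ in range(28)] for _ in range(28)]
image[14][14] = 1.0  # set the pixel at row 14, column 14

# Flatten row by row into a single list of 784 values.
flat = [pixel for row in image for pixel in row]

# A batch is a list of such vectors; its length plays the role of `None`.
batch = [flat, flat, flat]

print(len(flat))                   # 784
print(len(batch), len(batch[0]))   # 3 784
print(flat.index(1.0))             # 406, i.e. 14 * 28 + 14
```

Flattening throws away the 2D layout of the image; the model in this tutorial only sees 784 independent numbers.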

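`W` holds the weights and `b` the biases. Both are variables which TensorFlow adjusts during training when the session runs. `W` has shape [784, 10] (one weight per pixel and digit) and `b` has shape [10] (one bias per digit); both start out as zeros.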

In [4]:
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# Defining the model

This is actually the whole model. The idea is to compute, for each of the ten digits, a weighted sum over all pixels of the image (`x`). Each weight in `W` says how much a set pixel counts as evidence for or against a certain digit. If the pixel in the center of the image is set (meaning it is black, not white), it is very unlikely that the image is a 0, so the weight for that pixel and digit will be a negative number. The same pixel is typical for a 1, so there the weight will be a positive number. `matmul` stands for matrix multiplication. Afterwards, we add the vector `b` (the bias).

Softmax is a function which does two things: 1. It exponentiates its inputs, which enhances differences, so small differences get "enlarged". 2. It normalizes the values into probabilities which add up to 1.

The result we get is a matrix of probabilities with one row per image, where each column corresponds to a label. In our case the labels are 0,1,2,3,4,5,6,7,8,9. If you wanted to distinguish between cats and dogs, your labels would be `cat` and `dog`. So if your model thinks an image is probably an 8 but could also be a 9, then the row in `y` would be something like [0,0,0,0,0,0,0,0,.6,.4]. (With real data no probability will be exactly 0, since there is always a slight chance.)

In [5]:
y = tf.nn.softmax(tf.matmul(x, W) + b)
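As an illustration of what `tf.nn.softmax(tf.matmul(x, W) + b)` computes for a single image, here is a rough plain-Python sketch. The helper names `softmax` and `predict` and the toy numbers (2 inputs, 3 classes) are assumptions for this example, not part of the tutorial.

```python
import math

def softmax(evidence):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    # Subtracting the maximum leaves the result unchanged but avoids overflow.
    m = max(evidence)
    exps = [math.exp(e - m) for e in evidence]
    total = sum(exps)
    return [e / total for e in exps]

def predict(x, W, b):
    """Compute softmax(x . W + b) for one input vector x."""
    evidence = [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
                for j in range(len(b))]
    return softmax(evidence)

# Toy example with 2 inputs and 3 classes (hypothetical numbers).
x_toy = [1.0, 0.0]
W_toy = [[2.0, 1.0, 0.0],
         [0.0, 0.0, 0.0]]
b_toy = [0.0, 0.0, 0.0]

probs = predict(x_toy, W_toy, b_toy)   # evidence is [2.0, 1.0, 0.0]
print(probs)                           # roughly [0.665, 0.245, 0.090]

# With all-zero weights and biases every class is equally likely.
W_zero = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
print(predict(x_toy, W_zero, b_toy))   # each entry is 1/3
```

The second call mirrors the state of the model before training, since `W` and `b` are initialized with zeros.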

Now we need a way to compute how our model can improve itself. This is done by calculating the "error" of the model. For this to happen we first need a placeholder which holds the correct label. So `y` is the predicted label and `y_` is the provided, correct label (also in vector form, e.g. for 3 it's [0,0,0,1,0,0,0,0,0,0]).

In [6]:
y_ = tf.placeholder(tf.float32, [None, 10])
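The vector form of a label is called one-hot encoding, which is what `one_hot=True` requested when the data was loaded. A minimal sketch of the encoding, with `one_hot` being a hypothetical helper written for this example:

```python
def one_hot(label, num_classes=10):
    """Return a list with 1.0 at position `label` and 0.0 everywhere else."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

print(one_hot(3))
# [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```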

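Cross-entropy gives us a way to express how different two probability distributions are. It is small when the prediction `y` puts a high probability on the correct label and grows the less probability the correct label gets. In the code below, `reduce_sum` adds up the terms over the ten classes of each image and `reduce_mean` averages the result over all images in the batch.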

In [7]:
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
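To get a feeling for the numbers, here is a small plain-Python sketch of the same formula. The function names and the two example predictions are assumptions for illustration. One deliberate difference: this sketch skips terms where the label is 0, so it never evaluates `log(0)`, while the TensorFlow expression above evaluates every term.

```python
import math

def cross_entropy(y_true, y_pred):
    """Cross-entropy of one prediction against a one-hot label."""
    # Terms with a label of 0 contribute nothing, so they are skipped.
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)

def mean_cross_entropy(labels, predictions):
    """Average the per-image cross-entropy over a batch."""
    losses = [cross_entropy(t, p) for t, p in zip(labels, predictions)]
    return sum(losses) / len(losses)

label_8 = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
good = [0, 0, 0, 0, 0, 0, 0, 0, 0.6, 0.4]   # 60% on the correct digit
poor = [0.1] * 10                           # 10% on every digit

print(cross_entropy(label_8, good))   # about 0.51, i.e. -log(0.6)
print(cross_entropy(label_8, poor))   # about 2.30, i.e. -log(0.1)
print(mean_cross_entropy([label_8, label_8], [good, poor]))
```

The better prediction gets the lower loss, which is exactly the signal the training step needs.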

# Training

In training, we reduce the cross-entropy with gradient descent, using a learning rate of 0.05.

In [8]:
train_step = tf.train.GradientDescentOptimizer(0.05).minimize(cross_entropy)
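What one gradient descent step does can be sketched in plain Python for a single example. TensorFlow derives the gradients automatically; this sketch instead uses the known hand-derived gradient for softmax followed by cross-entropy. All names and the toy sizes (2 inputs, 3 classes) are assumptions for illustration.

```python
import math

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def forward(x, W, b):
    z = [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
         for j in range(len(b))]
    return softmax(z)

def loss(x, t, W, b):
    y = forward(x, W, b)
    return -sum(tj * math.log(yj) for tj, yj in zip(t, y) if tj > 0)

def sgd_step(x, t, W, b, lr=0.05):
    """Update W and b in place with one gradient descent step."""
    y = forward(x, W, b)
    # For softmax plus cross-entropy: d(loss)/d(evidence_j) = y_j - t_j.
    delta = [yj - tj for yj, tj in zip(y, t)]
    for i in range(len(x)):
        for j in range(len(b)):
            W[i][j] -= lr * x[i] * delta[j]
    for j in range(len(b)):
        b[j] -= lr * delta[j]

x_toy = [1.0, 0.5]
t_toy = [0.0, 1.0, 0.0]                  # correct class is index 1
W_toy = [[0.0] * 3 for _ in range(2)]    # start at zero, like tf.zeros
b_toy = [0.0] * 3

before = loss(x_toy, t_toy, W_toy, b_toy)   # log(3), about 1.10
for _ in range(10):
    sgd_step(x_toy, t_toy, W_toy, b_toy)
after = loss(x_toy, t_toy, W_toy, b_toy)
print(before, after)                        # the loss goes down
```

Each step moves the weights a small distance against the gradient; the learning rate controls how large that distance is.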

The actual calculation is done outside of Python to optimize for speed. So everything until now was just preparation to tell TensorFlow what to do while calculating (which means the calculating itself can also be done by a GPU or external services).

In [9]:
sess = tf.InteractiveSession()

In [10]:
tf.global_variables_initializer().run()

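Now here the actual training takes place. In each of the 1000 steps we first get a batch of 100 examples from the MNIST training data (which was downloaded at the very beginning of this tutorial) and then run one training step on it.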

In [17]:
for _ in range(1000):
  batch_xs, batch_ys = mnist.train.next_batch(100)
  sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
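The idea of training on small random batches instead of the whole data set can be sketched as follows. Note that this `next_batch` is a hypothetical stand-in that simply samples at random; it is not the implementation behind `mnist.train.next_batch`, and the tiny data set is made up.

```python
import random

def next_batch(images, labels, batch_size, rng):
    """Pick `batch_size` random examples, keeping images and labels aligned."""
    idx = rng.sample(range(len(images)), batch_size)
    return [images[i] for i in idx], [labels[i] for i in idx]

rng = random.Random(0)
images = [[float(i)] for i in range(10)]   # 10 fake one-pixel "images"
labels = [i % 2 for i in range(10)]        # fake label: is the pixel value odd?

batch_xs, batch_ys = next_batch(images, labels, 4, rng)
print(len(batch_xs), len(batch_ys))        # 4 4
```

Using a small batch per step keeps each step cheap while still giving a useful estimate of the gradient.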

Congrats! You now have a trained model! Let's find out how good it is! Since `y` and `y_` are both matrices with one row per image, we find the index of the highest number in each row with the TensorFlow function `argmax`, then compare the two indices with `equal`. The result is a vector of boolean values (e.g. `[True, False, False, True, True]`).

In [12]:
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))

Now we cast the boolean values to floats and calculate the mean value from them.

In [13]:
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
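The same two steps (compare the `argmax` of prediction and label, then average) look like this in plain Python. The helper names and the four toy predictions are assumptions for illustration.

```python
def argmax(values):
    """Index of the largest entry in a list."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy_of(predictions, labels):
    """Fraction of rows where predicted and correct class agree."""
    correct = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
    # True counts as 1 and False as 0, like the cast to float32.
    return sum(correct) / len(correct)

preds = [[0.1, 0.7, 0.2],
         [0.6, 0.3, 0.1],
         [0.2, 0.2, 0.6],
         [0.5, 0.4, 0.1]]
truth = [[0, 1, 0],
         [1, 0, 0],
         [0, 1, 0],
         [1, 0, 0]]

print(accuracy_of(preds, truth))   # 0.75, three of four are correct
```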

Remember that no calculation is actually done in Python; we just set up TensorFlow to run our calculations in its session. This is what we do now. Since we did not start a new session, the trained weights (`W`) and biases (`b`) are still around.

In [18]:
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

0.9142
