In [None]:
# This notebook follows the TensorFlow tutorial "Deep MNIST for Experts":
# https://www.tensorflow.org/versions/0.6.0/tutorials/mnist/pros/index.html

In [3]:
#Load MNIST Data
import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting MNIST_data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


In [4]:
'''
Start TensorFlow InteractiveSession
TensorFlow relies on a highly efficient C++ backend to do its computation.
The connection to this backend is called a session.
The common usage for TensorFlow programs is to first create a graph and then launch it in a session.
Here we instead use the convenient InteractiveSession class, which makes TensorFlow more flexible about how you structure your code.
It allows you to interleave operations which build a computation graph with ones that run the graph.
This is particularly convenient when working in interactive contexts like IPython.
If you are not using an InteractiveSession, then you should build the entire computation graph before starting a session and launching the graph.
'''
import tensorflow as tf
sess = tf.InteractiveSession()

In [5]:
#Build a Softmax Regression Model
'''
Here x and y_ aren't specific values.
Rather, they are each a placeholder -- a value that we'll input when we ask TensorFlow to run a computation.
The input images x will consist of a 2d tensor of floating point numbers. 
Here we assign it a shape of [None, 784], where 784 is the dimensionality of a single flattened MNIST image,
and None indicates that the first dimension, corresponding to the batch size, can be of any size. 
The target output classes y_ will also consist of a 2d tensor, where each row is a one-hot 10-dimensional vector indicating which digit class the corresponding MNIST image belongs to.
The shape argument to placeholder is optional, but it allows TensorFlow to automatically catch bugs stemming from inconsistent tensor shapes.
'''
x = tf.placeholder("float", shape=[None, 784])
y_ = tf.placeholder("float", shape=[None, 10])
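
To make the two placeholder shapes concrete, here is a minimal pure-Python sketch of what one row of x and one row of y_ look like. The helper names `flatten` and `one_hot` are hypothetical and are not part of the tutorial's input_data module; the all-zero image is a stand-in for a real MNIST digit.

```python
# Hypothetical helpers for illustration only; input_data already
# returns flattened images and one-hot labels when one_hot=True.

def flatten(image):
    """Flatten a 2-D list of pixel rows into one 1-D list."""
    return [pixel for row in image for pixel in row]

def one_hot(label, num_classes=10):
    """Return a list of length num_classes with 1.0 at index `label`."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

# A dummy 28x28 image of zeros stands in for a real MNIST digit.
image = [[0.0] * 28 for _ in range(28)]
flat = flatten(image)   # one row of x: 28 * 28 = 784 values
label = one_hot(3)      # one row of y_: the digit class 3

print(len(flat))
print(label)
```

A batch of N such examples gives x the shape [N, 784] and y_ the shape [N, 10], which is what the None in the first dimension allows.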

In [6]:
#Variables
'''
We now define the weights W and biases b for our model. 
We could imagine treating these like additional inputs, 
but TensorFlow has an even better way to handle them: Variable. 
A Variable is a value that lives in TensorFlow's computation graph. 
It can be used and even modified by the computation. 
In machine learning applications, the model parameters are generally Variables.
'''
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
'''
We pass the initial value for each parameter in the call to tf.Variable. 
In this case, we initialize both W and b as tensors full of zeros. 
W is a 784x10 matrix (because we have 784 input features and 10 outputs) and b is a 10-dimensional vector (because we have 10 classes).
Before Variables can be used within a session, they must be initialized using that session. 
This step takes the initial values (in this case tensors full of zeros) that have already been specified, and assigns them to each Variable. 
This can be done for all Variables at once.
'''
sess.run(tf.initialize_all_variables())

In [7]:
#Predicted Class and Cost Function
'''
We can now implement our regression model.
It only takes one line! We multiply the vectorized input images x by the weight matrix W, 
add the bias b, and compute the softmax probabilities that are assigned to each class.

The cost function to be minimized during training can be specified just as easily. 
Our cost function will be the cross-entropy between the target and the model's prediction.
'''
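# y holds one row of class probabilities per input image.
y = tf.nn.softmax(tf.matmul(x,W) + b)
# reduce_sum adds up the cross-entropy over all classes and all images in the batch.
cross_entropy = -tf.reduce_sum(y_*tf.log(y))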
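
As a rough illustration of what the two lines above compute for a single image, the following pure-Python sketch implements softmax and cross-entropy by hand. It is not TensorFlow's implementation: subtracting the maximum logit and skipping zero target entries are choices made in this sketch for numerical safety. With W and b initialized to zeros, every logit is 0, so the prediction is uniform and the loss for one image is log(10).

```python
import math

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    # Subtracting the max does not change the result but avoids overflow.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target, predicted):
    """Compute -sum(target * log(predicted)) for one example."""
    # Entries with target 0 contribute nothing, so they are skipped.
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)

# Zero weights and biases give a logit of 0 for every class.
probs = softmax([0.0] * 10)

# One-hot target for the digit class 3.
target = [0.0] * 10
target[3] = 1.0

loss = cross_entropy(target, probs)
print(probs[0])
print(loss)
```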



In [8]:
#Train the Model
'''
Now that we have defined our model and training cost function, it is straightforward to train using TensorFlow.
Because TensorFlow knows the entire computation graph, it can use automatic differentiation to find the gradients of the cost with respect to each of the variables.
TensorFlow has a variety of builtin optimization algorithms.
For this example, we will use steepest gradient descent, with a step length of 0.01, to descend the cross entropy.
'''
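# Running train_step applies one gradient descent update to W and b.
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
for i in range(1000):
  # Each iteration loads 50 training examples and feeds them to the placeholders.
  batch = mnist.train.next_batch(50)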
  train_step.run(feed_dict={x: batch[0], y_: batch[1]})
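
The sketch below shows, under simplifying assumptions, what a single gradient descent update does for softmax regression. It uses a toy problem with 4 features and 3 classes and one example instead of a batch of 50. The gradient is derived by hand here (for softmax with cross-entropy, the gradient with respect to the logits is the predicted probabilities minus the target), whereas TensorFlow obtains it through automatic differentiation. The function names are hypothetical.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(x, W, b):
    # logits[j] = sum_i x[i] * W[i][j] + b[j]
    logits = [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
              for j in range(len(b))]
    return softmax(logits)

def loss(x, target, W, b):
    probs = predict(x, W, b)
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

def gradient_descent_step(x, target, W, b, learning_rate=0.01):
    """Update W and b in place with one gradient descent step."""
    probs = predict(x, W, b)
    # Gradient of the cross-entropy with respect to each logit.
    delta = [p - t for p, t in zip(probs, target)]
    for i in range(len(x)):
        for j in range(len(b)):
            W[i][j] -= learning_rate * x[i] * delta[j]
    for j in range(len(b)):
        b[j] -= learning_rate * delta[j]

# Toy data: one example with 4 features whose true class is 1.
x = [1.0, 0.5, 0.0, 2.0]
target = [0.0, 1.0, 0.0]
W = [[0.0] * 3 for _ in range(4)]
b = [0.0] * 3

loss_before = loss(x, target, W, b)
gradient_descent_step(x, target, W, b)
loss_after = loss(x, target, W, b)
print(loss_before)
print(loss_after)
```

The loss after the step is slightly lower than before it; repeating the step many times over different batches is what the training loop does.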
    


In [9]:
#Evaluate the Model
'''
How well did our model do?

First we'll figure out where we predicted the correct label. 
tf.argmax is an extremely useful function which gives you the index of the highest entry in a tensor along some axis.
For example, tf.argmax(y,1) is the label our model thinks is most likely for each input, while tf.argmax(y_,1) is the true label. 
We can use tf.equal to check if our prediction matches the truth.
'''
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
#That gives us a list of booleans. To determine what fraction are correct, we cast to floating point numbers and then take the mean.
#For example, [True, False, True, True] would become [1,0,1,1] which would become 0.75.
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

0.9092
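
The accuracy computation can be traced by hand on a few rows. This pure-Python sketch mirrors the argmax, equal, cast, and mean steps on made-up predictions for 3 classes; the `argmax` helper and the data are hypothetical and chosen to reproduce the [True, False, True, True] example from the comments.

```python
def argmax(values):
    """Return the index of the largest entry in a list."""
    return max(range(len(values)), key=lambda i: values[i])

# Made-up model outputs (rows of class probabilities) and one-hot labels.
predictions = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
    [0.8, 0.1, 0.1],
]
labels = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
]

# Compare the predicted class with the true class for each row.
correct = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]

# Cast booleans to floats and take the mean.
accuracy = sum(1.0 if c else 0.0 for c in correct) / len(correct)

print(correct)
print(accuracy)
```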
