## Deep MNIST TensorFlow Tutorial

Source: https://www.tensorflow.org/get_started/mnist/pros

In this tutorial:
- Create a softmax regression model that recognizes MNIST digits by looking at every pixel in the image
- Use TensorFlow to train the model to recognize digits by having it "look" at thousands of examples (and run our first TensorFlow session to do so)
- Check the model's accuracy with our test data
- Build, train, and test a multilayer convolutional neural network to improve the results

### Setup

Loading MNIST Data

In [1]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting MNIST_data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
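
The `one_hot=True` argument above makes each label a 10-element vector rather than a single digit. As a minimal sketch of that encoding in plain Python (the helper `one_hot` is a hypothetical name for illustration, not part of the `input_data` module):

```python
def one_hot(label, num_classes=10):
    """Return a list with 1.0 at index `label` and 0.0 everywhere else."""
    if not 0 <= label < num_classes:
        raise ValueError("label must be in [0, num_classes)")
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

# The digit 3 becomes a vector whose fourth entry is 1.0.
encoded = one_hot(3)
print(encoded)
```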


### Start TensorFlow InteractiveSession 

- TensorFlow relies on a C++ backend for computation
- A session is the connection to that backend

In [2]:
import tensorflow as tf
sess = tf.InteractiveSession()

### Build a Softmax Regression Model

#### Placeholders

In [8]:
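# Placeholders are filled with data when the graph is run.
# x: batch of flattened 28x28 images (784 pixels each)
# y_: batch of one-hot target labels (10 classes)
# None means the batch size can vary.
x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])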
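
The 784 in the placeholder shape is 28 x 28: each image is flattened into a single row of pixels. The sketch below illustrates this in plain Python, assuming row-major flattening; `flatten` and `IMAGE_SIDE` are hypothetical names used only for this example.

```python
IMAGE_SIDE = 28  # MNIST images are 28x28 pixels

def flatten(image):
    """Flatten a 2-D list of pixel rows into one flat list, row by row."""
    return [pixel for row in image for pixel in row]

# A blank image with a single bright pixel at row 5, column 7.
image = [[0.0] * IMAGE_SIDE for _ in range(IMAGE_SIDE)]
image[5][7] = 1.0

flat = flatten(image)
batch = [flat, flat, flat]  # a batch of 3 examples; any batch size is allowed

print(len(batch), len(batch[0]))
```

The bright pixel ends up at flat index `5 * 28 + 7`, which shows that flattening discards the 2-D layout but keeps every pixel value.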

#### Variables

Before Variables can be used within a session, they must be initialized using that session. 

In [9]:
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# Initialize all variables at once
sess.run(tf.global_variables_initializer())

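# Alternative that applies softmax explicitly (not used here, because the
# loss function defined later applies softmax itself):
# y = tf.nn.softmax(tf.matmul(x, W) + b)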

#### Predicting Class and Loss Function 

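We can now implement the regression model: the unnormalized class scores (logits) are the input images multiplied by the weight matrix `W`, plus the bias `b`.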

In [10]:
y = tf.matmul(x, W) + b
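
To make the shapes concrete, here is a rough plain-Python sketch of the same computation with toy sizes (3 features and 2 classes instead of 784 and 10). The helper `linear` and the numeric values are illustrative assumptions, not part of the tutorial.

```python
def linear(x_batch, W, b):
    """Compute x_batch @ W + b for lists of lists (one row per example)."""
    return [
        [sum(x_i * W[i][j] for i, x_i in enumerate(x)) + b[j]
         for j in range(len(b))]
        for x in x_batch
    ]

# Toy weights: 3 input features, 2 output classes.
W = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
b = [0.5, -0.5]

# Two examples in the batch, so the result has two rows of logits.
x_batch = [[1.0, 2.0, 3.0],
           [0.0, 0.0, 0.0]]
logits = linear(x_batch, W, b)
print(logits)
```

An all-zero input yields the bias alone, which is why the second row of `logits` equals `b`.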

Next, specify the loss function. The loss measures how bad the model's prediction is on a single example. Here, the loss is the cross-entropy between the target and the softmax activation applied to the model's prediction.

In [13]:
# Note that tf.nn.softmax_cross_entropy_with_logits internally applies
# the softmax to the model's unnormalized prediction and sums across all classes,
# and tf.reduce_mean takes the average over these sums.
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
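
As a sanity check on what this loss computes, the following plain-Python sketch reimplements softmax cross-entropy for one example and averages it over a batch. It is an illustration under the assumption of one-hot labels, not TensorFlow's actual implementation; `softmax_cross_entropy` and `mean_loss` are hypothetical helper names.

```python
import math

def softmax_cross_entropy(logits, labels):
    """Cross-entropy between `labels` and softmax(`logits`) for one example."""
    # Log-sum-exp with the max subtracted for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # log softmax(z)_k = z_k - log_sum_exp; sum over classes weighted by labels.
    return -sum(t * (z - log_sum_exp) for z, t in zip(logits, labels))

def mean_loss(batch_logits, batch_labels):
    """Average the per-example losses, as tf.reduce_mean does in the cell above."""
    losses = [softmax_cross_entropy(z, t)
              for z, t in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)

# With W and b initialized to zeros, every logit is 0 and the softmax is uniform.
zero_logits = [0.0] * 10
label = [0.0] * 10
label[3] = 1.0

loss = mean_loss([zero_logits], [label])
print(loss)
```

With all-zero logits each class gets probability 1/10, so the loss equals ln(10), roughly 2.30. That is the value to expect before any training step has run.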

### Train the Model

Since we have defined the model and the loss function, TensorFlow knows the entire computation graph and can use automatic differentiation to find the gradients of the loss with respect to each of the variables.

In [14]:
# Optimization algorithm: steepest gradient descent with a step length (learning rate) of 0.5
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
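
To illustrate what a single gradient descent step does, the sketch below performs one update by hand in plain Python. It assumes a simplified bias-only model (the logits are just `b`, with no weights or inputs) and uses the known gradient of softmax cross-entropy with respect to the logits, `softmax(logits) - labels`. TensorFlow derives its gradients automatically; this hand-written version only mirrors the idea.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, labels):
    """Cross-entropy between one-hot `labels` and softmax(`logits`)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(labels, probs) if t > 0)

LEARNING_RATE = 0.5  # same step length as in the cell above

label = [0.0] * 10
label[3] = 1.0
b = [0.0] * 10  # bias-only toy model: the logits are b itself

loss_before = cross_entropy(b, label)

# Gradient of the loss with respect to the logits: softmax(logits) - labels.
grad = [p - t for p, t in zip(softmax(b), label)]

# One steepest-descent update: move against the gradient.
b = [b_k - LEARNING_RATE * g for b_k, g in zip(b, grad)]

loss_after = cross_entropy(b, label)
print(loss_before, loss_after)
```

The update raises the logit of the correct class and lowers the others, so the loss on this example decreases after the step.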