# TensorFlow Demo
## MNIST classification for ML Beginners

### About

TensorFlow is a Python library that provides versatile machine learning tools.

### Scope

For this demo, I will recreate one of the TensorFlow examples, which recognizes handwritten digits.

Taken from: https://www.tensorflow.org/get_started/mnist/beginners

### Softmax regression

[Softmax Regression](http://ufldl.stanford.edu/tutorial/supervised/SoftmaxRegression/) is multinomial logistic regression: a generalization of logistic regression to more than two classes. Whereas logistic regression is a binary classifier, softmax regression assigns each input a probability for every one of several classes (here, the ten digits).

**Model:**

![Softmax Regression Image 1](https://www.tensorflow.org/images/softmax-regression-scalargraph.png)


**Equations:**

![Softmax Regression Image 2](https://www.tensorflow.org/images/softmax-regression-scalarequation.png)

**Compacted:**

`y = softmax(Wx + b)`
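
To make the compact form concrete, here is a minimal sketch in plain Python (no TensorFlow) of `y = softmax(Wx + b)` for a single input. The helper names `softmax` and `predict` and the toy values are assumptions made for illustration, not part of the tutorial. Note that this sketch stores one row of weights per class, whereas the TensorFlow code in this demo uses a `[784, 10]` weight and multiplies as `x * W`.

```python
import math

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    # Subtracting the max does not change the result but avoids overflow in exp
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(W, x, b):
    """Compute y = softmax(Wx + b) for one input vector x.

    W holds one row of weights per class; b holds one bias per class.
    """
    evidence = [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
                for row, b_i in zip(W, b)]
    return softmax(evidence)

# Hypothetical toy values: 3 classes, 2 input features
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.0, 0.0, -1.0]
x = [2.0, 1.0]

y = predict(W, x, b)
print(y)       # three probabilities, one per class
print(sum(y))  # sums to 1 up to floating-point rounding
```

The class with the largest evidence gets the largest probability, and all outputs stay between 0 and 1.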

In [2]:
# First we need to import our dependencies
# Tensor Flow helps with the ML side of this demo
import tensorflow as tf
# Pandas helps with data output and processing
import pandas as pd
# We're going to import the tutorial library to gather the sample data we need
from tensorflow.examples.tutorials.mnist import input_data

In [3]:
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


In [4]:
p_holder = tf.placeholder(tf.float32, [None, 784])

# Our weight has a shape of [784, 10] because we want to multiply the 784-dimensional image
# vectors by it to produce 10-dimensional vectors of evidence for the different classes.
weight = tf.Variable(tf.zeros([784, 10]))
bias = tf.Variable(tf.zeros([10]))

# Implement our model, softmax
y = tf.nn.softmax(tf.matmul(p_holder, weight) + bias)

### Training The Model

When training a model, we usually don't define what makes it good; instead we define what makes it bad, called the loss, and try to minimize it. A common loss function is [cross-entropy](https://www.tensorflow.org/get_started/mnist/beginners#training), which measures how poorly the predicted probabilities describe the true labels.

So using cross-entropy, we can train our model.
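
As a rough illustration of what cross-entropy measures, the sketch below computes it in plain Python for one-hot labels. The function names and the toy batches are assumptions for this example only. It follows the same formula as the TensorFlow expression used in this demo: the negative sum of `label * log(prediction)` per example, averaged over the batch.

```python
import math

def cross_entropy(true_dist, predicted):
    """Cross-entropy -sum_i t_i * log(p_i) for a single example."""
    # Terms with true probability 0 contribute nothing, so skip them;
    # this also avoids evaluating log(0) for classes that are not the label
    return -sum(t * math.log(p) for t, p in zip(true_dist, predicted) if t > 0)

def mean_cross_entropy(labels, predictions):
    """Average the per-example loss over a batch."""
    losses = [cross_entropy(t, p) for t, p in zip(labels, predictions)]
    return sum(losses) / len(losses)

# Hypothetical batch of two examples with three classes each
labels = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
confident = [[0.05, 0.90, 0.05], [0.80, 0.10, 0.10]]
unsure = [[0.40, 0.30, 0.30], [0.30, 0.40, 0.30]]

print(mean_cross_entropy(labels, confident))  # lower loss
print(mean_cross_entropy(labels, unsure))     # higher loss
```

Predictions that put more probability on the correct class give a lower loss, which is why minimizing cross-entropy pushes the model toward correct answers. This sketch would still fail if the model assigned exactly zero probability to the true class.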

In [7]:
# Create a placeholder to input the correct answers
y_placeholder = tf.placeholder(tf.float32, [None, 10])

# Implement our cross entropy function

cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_placeholder * tf.log(y), reduction_indices=[1]))
# Explanation:
# tf.log() computes the logarithm of each element of y, the output of our softmax model
# Then we multiply each element of y_placeholder with the corresponding element of tf.log(y)
# tf.reduce_sum adds up the elements along the second dimension (the 10 classes), because of reduction_indices=[1]
# Finally tf.reduce_mean computes the mean over all the examples in the batch

# TensorFlow uses backpropagation to efficiently determine how each variable affects the loss we ask it to minimize.

# To minimize the loss, we're going to use TensorFlow's gradient descent optimizer with a learning rate of 0.5

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

# We're going to launch the model
session = tf.InteractiveSession()

# Create and run an operation to initialize the variables we created

tf.global_variables_initializer().run()

# Now we're going to start the actual training

for _ in range(1000):
    # A batch of 100 random data points from our training set
    batch_xs, batch_ys = mnist.train.next_batch(100)
    # Run one training step, feeding the batch into our placeholders
    session.run(train_step, feed_dict={p_holder: batch_xs, y_placeholder: batch_ys})
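
To show what a gradient descent training loop does under the hood, here is a hand-written sketch in plain Python for a tiny softmax model. TensorFlow derives the gradients automatically; this sketch instead uses the standard result that, for softmax combined with cross-entropy, the gradient with respect to each class's evidence is the predicted probability minus the target. All names and the toy data are hypothetical, and for simplicity it reuses the full batch on every step rather than drawing random mini-batches.

```python
import math

def softmax(logits):
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def forward(W, b, x):
    """Predicted class probabilities for one input (one weight row per class)."""
    return softmax([sum(w * xj for w, xj in zip(row, x)) + bi
                    for row, bi in zip(W, b)])

def loss(W, b, batch):
    """Mean cross-entropy, with labels given as class indices."""
    return sum(-math.log(forward(W, b, x)[label]) for x, label in batch) / len(batch)

def gradient_step(W, b, batch, learning_rate):
    """Apply one gradient descent update to W and b in place."""
    n_classes, n_features = len(W), len(W[0])
    grad_W = [[0.0] * n_features for _ in range(n_classes)]
    grad_b = [0.0] * n_classes
    for x, label in batch:
        y = forward(W, b, x)
        for i in range(n_classes):
            # d(loss)/d(evidence_i) = predicted probability - target
            delta = (y[i] - (1.0 if i == label else 0.0)) / len(batch)
            grad_b[i] += delta
            for j in range(n_features):
                grad_W[i][j] += delta * x[j]
    for i in range(n_classes):
        b[i] -= learning_rate * grad_b[i]
        for j in range(n_features):
            W[i][j] -= learning_rate * grad_W[i][j]

# Hypothetical toy data: two features, two classes
batch = [([1.0, 0.0], 0), ([0.9, 0.1], 0), ([0.0, 1.0], 1), ([0.2, 0.8], 1)]
W = [[0.0, 0.0], [0.0, 0.0]]  # start from zeros, as in the demo
b = [0.0, 0.0]

loss_before = loss(W, b, batch)
for _ in range(100):
    gradient_step(W, b, batch, learning_rate=0.5)
loss_after = loss(W, b, batch)
print(loss_before, loss_after)  # the loss should drop after training
```

Each step nudges the weights in the direction that reduces the loss, which is the same idea `GradientDescentOptimizer` applies to the 784-by-10 weight matrix above.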

### Evaluating the model

How well did the model do? Let's find out!

We first need to figure out where we predicted the correct label.
For that we'll use `tf.argmax`, which returns the index of the highest entry in a tensor along a given axis.

In [9]:
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_placeholder, 1))

# We're converting the list of Booleans to floats so we can calculate the percentage
# of correct predictions
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Finally we'll ask TensorFlow for the accuracy on our test data
print(session.run(accuracy, feed_dict={p_holder: mnist.test.images, y_placeholder: mnist.test.labels}))

0.9203
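
The accuracy calculation can be sketched in plain Python as follows. The `argmax` and `accuracy` helpers and the sample values are made up for illustration; they mirror the argmax, equality, cast and mean steps used in the cell above.

```python
def argmax(values):
    """Index of the largest entry (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(predictions, labels):
    """Fraction of examples whose predicted class matches the label."""
    correct = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
    # True counts as 1 and False as 0, so the mean is the fraction correct
    return sum(correct) / len(correct)

# Hypothetical predictions and one-hot labels for four examples
predictions = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.3, 0.3, 0.4]]
labels = [[0, 1, 0], [1, 0, 0], [0, 0, 1], [0, 0, 1]]

print(accuracy(predictions, labels))  # 0.75: three of the four predictions match
```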


### Conclusion

The value printed above is the fraction of test images our model classified correctly: about 92%. That may seem good, but it actually isn't. If we were to make some improvements to our model, we could get the accuracy up to 97%. The best models have over 99% accuracy.

This was just a high-level demo of TensorFlow, taken from their tutorial page, so it's mostly meant for educational purposes. Even though 92% may be pretty bad in a production environment, it is more than good enough for testing.