# Import TensorFlow and the MNIST Dataset

In [1]:
import tensorflow as tf

In [2]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


# Declaring the features, labels, weights, and biases

In [3]:
x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])

In [4]:
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

`x` and `y_` represent the features and the labels used for training. They are declared as placeholders because their values are only supplied later, when the graph is run. `x` holds any number of flattened 28×28 images (784 pixels each) and `y_` holds the matching labels, one value per digit class. The weights `W` and the biases `b`, on the other hand, are variables: they hold values of a fixed shape that the model updates during training, and here they are initialized to zeros.

# Declaring the softmax regression

In [5]:
y = tf.nn.softmax(tf.matmul(x, W) + b)

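This is where the model is defined: a `softmax` regression, also called multinomial logistic regression. `tf.matmul(x, W) + b` produces ten scores per image, one for each digit, and softmax converts them into probabilities that are non-negative and sum to 1. Because the labels were loaded with `one_hot=True`, each label is a ten-element vector with a single 1 at the true class, so it can be compared directly against these probabilities. `y` is also where the model's predictions come from later on.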
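To make the softmax step concrete, here is a minimal sketch in plain Python rather than TensorFlow. The function name `softmax` and the example scores are our own choices for illustration, not part of the notebook's graph.

```python
import math

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    # Subtracting the maximum leaves the result unchanged but avoids overflow in exp.
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three classes (MNIST has ten).
probs = softmax([2.0, 1.0, 0.1])
print(probs)

# With W and b initialized to zeros every score is 0,
# so each of the ten classes starts with the same probability.
uniform = softmax([0.0] * 10)
print(uniform)
```

The largest score gets the largest probability, and before any training the model is equally unsure about all ten digits.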

# Training

In [6]:
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

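In `cross_entropy`, each true label vector (`y_`) is multiplied element-wise by the logarithm of the prediction (`y`). `tf.reduce_sum` with `reduction_indices=[1]` then adds these products along the second dimension (the ten classes), which after negation gives one loss value per example. Since the labels are one-hot, only the log-probability of the true class survives the sum. Finally, `tf.reduce_mean` averages the per-example losses over the batch.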
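The same computation can be sketched in plain Python on a tiny batch. The helper name `cross_entropy` and the numbers below are made up for illustration; they are not taken from MNIST.

```python
import math

def cross_entropy(labels, predictions):
    """Mean over examples of -sum_j labels[j] * log(predictions[j])."""
    per_example = []
    for y_true, y_pred in zip(labels, predictions):
        # Sum over the class axis (the role of reduction_indices=[1]).
        per_example.append(-sum(t * math.log(p) for t, p in zip(y_true, y_pred)))
    # Average over the batch (the role of tf.reduce_mean).
    return sum(per_example) / len(per_example)

# Two examples, three classes, one-hot labels.
labels = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
predictions = [[0.2, 0.7, 0.1], [0.5, 0.3, 0.2]]
loss = cross_entropy(labels, predictions)
print(loss)
```

Only the probabilities assigned to the true classes (0.7 and 0.5) contribute, so the loss here is the average of `-log(0.7)` and `-log(0.5)`.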

In [7]:
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

This is where the model learns: at every step, gradient descent adjusts the weights and biases in the direction that reduces the cross entropy. Here it uses a learning rate of 0.5.
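As a rough sketch of what one gradient descent step does, the plain-Python code below updates a tiny softmax model by hand on a single example. It relies on the standard result that, for softmax followed by cross entropy, the gradient with respect to the scores is the prediction minus the one-hot label. All names (`predict`, `loss`, `sgd_step`) and the two-feature input are assumptions for illustration; TensorFlow derives the gradients automatically.

```python
import math

def softmax(logits):
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(x, W, b):
    # logits[j] = sum_i x[i] * W[i][j] + b[j]
    logits = [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
              for j in range(len(b))]
    return softmax(logits)

def loss(x, y_true, W, b):
    y = predict(x, W, b)
    return -sum(t * math.log(p) for t, p in zip(y_true, y))

def sgd_step(x, y_true, W, b, learning_rate):
    y = predict(x, W, b)
    # Gradient of the loss with respect to the scores: prediction minus label.
    delta = [p - t for p, t in zip(y, y_true)]
    new_W = [[W[i][j] - learning_rate * x[i] * delta[j] for j in range(len(b))]
             for i in range(len(x))]
    new_b = [b[j] - learning_rate * delta[j] for j in range(len(b))]
    return new_W, new_b

# Tiny stand-in problem: 2 features, 3 classes, parameters start at zero.
x = [1.0, 0.5]
y_true = [0.0, 1.0, 0.0]
W = [[0.0] * 3 for _ in range(2)]
b = [0.0] * 3

before = loss(x, y_true, W, b)
W, b = sgd_step(x, y_true, W, b, learning_rate=0.5)
after = loss(x, y_true, W, b)
print(before, after)
```

After one step the loss on this example is lower than before, which is the effect `train_step` has on each batch.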

In [8]:
init = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
sess = tf.InteractiveSession()

The operations defined so far only describe a graph in TensorFlow; nothing is computed until the graph is run in a session. Additionally, variables must be initialized explicitly: `tf.global_variables_initializer()` returns an operation that, when run, assigns the variables (here `W` and `b`) their initial values. The code groups it with `tf.local_variables_initializer()` into a single `init` operation.

Graphs are normally run with `tf.Session()`, but since we are working in an interactive kernel (a Jupyter notebook) we use `tf.InteractiveSession()` instead, which installs itself as the default session.

# Running and Evaluation

In [9]:
sess.run(init)
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

Now we run the graph we created. The loop performs 1000 training steps; at each step it fetches a batch of 100 images (`batch_xs`) and their labels (`batch_ys`) from the training set and runs `train_step` on them. `feed_dict` supplies the values for our placeholders: it is a Python dictionary that maps each placeholder (`x`, `y_`) to the data it should hold for that run.
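The idea of drawing a small batch at every step can be sketched without TensorFlow. The `next_batch` function below is a hypothetical stand-in written for illustration; the real `mnist.train.next_batch` may work differently, for example by shuffling once per pass over the data. The stand-in data and the string keys in `feed` are also assumptions, since a real `feed_dict` uses the placeholder objects themselves as keys.

```python
import random

def next_batch(images, labels, batch_size, rng):
    """Hypothetical helper: draw batch_size (image, label) pairs at random."""
    indices = rng.sample(range(len(images)), batch_size)
    return [images[i] for i in indices], [labels[i] for i in indices]

# Stand-in data: 500 fake "images" of 784 zeros, with integer labels 0-9.
images = [[0.0] * 784 for _ in range(500)]
labels = [i % 10 for i in range(500)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
batch_xs, batch_ys = next_batch(images, labels, 100, rng)

# A plain dictionary playing the role of feed_dict.
feed = {"x": batch_xs, "y_": batch_ys}
print(len(feed["x"]), len(feed["x"][0]), len(feed["y_"]))
```

Training on a different small batch at each step is much cheaper than using the full training set every time.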

In [10]:
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))

`tf.argmax(y, 1)` returns, for each example, the index of the class with the highest predicted probability, and `tf.argmax(y_, 1)` returns the index of the true class. `tf.equal` compares the two, giving `True` wherever the predicted label (`y`) matches the true label (`y_`).

In [11]:
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

This casts the booleans to floats (1.0 for a correct prediction, 0.0 otherwise) and takes their mean, which is the fraction of examples predicted correctly.
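The argmax comparison and the averaging can be sketched in plain Python on four made-up predictions. The helper names `argmax` and `accuracy` are ours for illustration.

```python
def argmax(values):
    """Index of the largest entry (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(predictions, labels):
    correct = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
    # True counts as 1.0 and False as 0.0, as with tf.cast(..., tf.float32).
    return sum(1.0 if c else 0.0 for c in correct) / len(correct)

# Four examples, three classes; the third prediction is wrong.
predictions = [[0.1, 0.8, 0.1], [0.6, 0.3, 0.1], [0.2, 0.3, 0.5], [0.4, 0.5, 0.1]]
labels = [[0, 1, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0]]

acc = accuracy(predictions, labels)
print(acc)
```

Three of the four predicted classes match the labels, so the accuracy on this toy batch is 0.75.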

In [12]:
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

0.9121


This evaluates the accuracy on the test set by feeding the test images and labels into the placeholders. In this run the model scored 91.21%.