# MNIST data set with softmax regression

## Data

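Retrieve the data set. It consists of scanned images of handwritten digits, each labeled with the digit it represents. The task is to train a neural network to recognize handwritten digits. With `one_hot=True`, each label is encoded as a vector of length 10 with a single 1 at the position of the digit.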

In [1]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data/', one_hot=True)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
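To make the label encoding concrete, here is a minimal sketch of one-hot encoding in plain Python. The helper `one_hot` is a hypothetical name used only for illustration; it is not part of the TensorFlow loader used above.

```python
def one_hot(digit, num_classes=10):
    """Return a list with 1.0 at index `digit` and 0.0 elsewhere."""
    vec = [0.0] * num_classes
    vec[digit] = 1.0
    return vec

# The digit 3 becomes a vector whose fourth entry (index 3) is 1.0.
label = one_hot(3)
print(label)  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```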


Import the modules required.

In [2]:
import tensorflow as tf

## Model

The input data are $28 \times 28$ pixel images, where each pixel is an intensity between 0 and 1.  The output is one of ten labels, the digits $0$ to $9$.

The network is defined as $y = \mathrm{softmax}(x \cdot W + b)$, where $x$ represents the pixels of the image flattened into a vector ($28 \times 28 = 784$), $y$ represents the probabilities of the 10 digits, $W$ represents the weights of the network ($784 \times 10$), and $b$ is the bias for each digit ($10$).
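The formula can be traced by hand on a toy problem. The following sketch evaluates $\mathrm{softmax}(x \cdot W + b)$ in plain Python for a single input, assuming 4 "pixels" and 3 classes instead of 784 and 10; the helper names `softmax` and `predict` are illustrative and not part of the notebook.

```python
import math

def softmax(logits):
    """Map a list of scores to probabilities that sum to 1."""
    m = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(x, W, b):
    """Compute softmax(x . W + b) for one input vector x."""
    logits = [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
              for j in range(len(b))]
    return softmax(logits)

# Toy sizes (an assumption for readability): 4 inputs, 3 classes.
x = [0.0, 0.5, 1.0, 0.25]
W = [[0.0] * 3 for _ in range(4)]  # all-zero weights
b = [0.0] * 3                      # all-zero biases
y = predict(x, W, b)
print(y)  # three equal probabilities, since every logit is zero
```

With all weights and biases at zero, every class gets the same probability; training has to move $W$ and $b$ away from this uninformative starting point.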

Define a placeholder for the input data.

In [3]:
x = tf.placeholder(tf.float32, [None, 28*28])

Define the variables to be determined by the training algorithm, the weights $W$, and the bias $b$.

In [4]:
W = tf.Variable(tf.zeros([28*28, 10]))
b = tf.Variable(tf.zeros([10]))

Define the network model.

In [5]:
y = tf.nn.softmax(tf.matmul(x, W) + b)

## Loss function and training

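We define the loss function as the cross entropy $H_{y'}(y) = -\sum_i y'_i \log y_i$, where $y'$ is the true (one-hot) label distribution and $y$ is the distribution predicted by the model.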
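As a worked example, the sketch below evaluates the cross entropy in plain Python for a three-class toy case. For a one-hot label the sum collapses to $-\log y_k$, where $k$ is the true class. The function name `cross_entropy` and the example values are chosen for illustration only.

```python
import math

def cross_entropy(y_true, y_pred):
    """Compute -sum_i y_true[i] * log(y_pred[i]).

    Terms with y_true[i] == 0 are skipped; they contribute nothing to the sum.
    """
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)

y_true = [0.0, 1.0, 0.0]   # one-hot label for class 1
good = [0.1, 0.7, 0.2]     # most probability on the correct class
bad = [0.7, 0.1, 0.2]      # most probability on a wrong class

print(round(cross_entropy(y_true, good), 4))  # 0.3567
print(round(cross_entropy(y_true, bad), 4))   # 2.3026
```

The loss is small when the model puts high probability on the correct digit and grows as that probability approaches zero.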

Define a placeholder for the labels.

In [6]:
y_ = tf.placeholder(tf.float32, [None, 10])

Define the cross entropy.

In [7]:
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), axis=1))

A training step is a gradient descent step (here with learning rate 0.5) that reduces the loss function; TensorFlow computes the required gradients by backpropagation.

In [8]:
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
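What such a step does can be sketched in plain Python for a single training example. This is an illustration under simplifying assumptions (one example instead of a batch, toy sizes of 4 inputs and 3 classes), not TensorFlow's implementation; the names `forward` and `sgd_step` are hypothetical. It relies on the standard result that, for softmax followed by cross entropy, the gradient of the loss with respect to the logits is $y - y'$.

```python
import math

def softmax(logits):
    m = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def forward(x, W, b):
    """Compute softmax(x . W + b) for one input vector x."""
    logits = [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
              for j in range(len(b))]
    return softmax(logits)

def sgd_step(x, y_true, W, b, learning_rate=0.5):
    """Update W and b in place with one gradient descent step on one example."""
    y = forward(x, W, b)
    # Gradient of the cross entropy with respect to the logits: y - y'.
    delta = [y[j] - y_true[j] for j in range(len(b))]
    for i in range(len(x)):
        for j in range(len(b)):
            W[i][j] -= learning_rate * x[i] * delta[j]
    for j in range(len(b)):
        b[j] -= learning_rate * delta[j]

x = [0.0, 0.5, 1.0, 0.25]
y_true = [0.0, 1.0, 0.0]           # the true class is 1
W = [[0.0] * 3 for _ in range(4)]
b = [0.0] * 3

loss_before = -math.log(forward(x, W, b)[1])
sgd_step(x, y_true, W, b)
loss_after = -math.log(forward(x, W, b)[1])
print(loss_before > loss_after)  # True
```

In the notebook the loss is averaged over a batch, so the update uses the mean of these per-example gradients rather than a single one.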

Define the operation that initializes all the variables.

In [9]:
init = tf.global_variables_initializer()

Define a session to run the computations.

In [10]:
session = tf.Session()

Perform the initialization of the variables.

In [11]:
session.run(init)

Train the model by using 1000 batches of 100 training examples each.

In [12]:
for i in range(1000):
  batch_xs, batch_ys = mnist.train.next_batch(100)
  session.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

## Evaluation

What is the accuracy of our model? A correct prediction means that the model assigns the highest probability to the digit that is the label in our test set.

In [13]:
correct_predictions = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_predictions, tf.float32))
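The same computation can be sketched in plain Python on a handful of made-up predictions: take the index of the largest entry in each prediction and each one-hot label, compare them, and average the matches. The helper names and the data are illustrative assumptions, not output of the trained model.

```python
def argmax(values):
    """Return the index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(predictions, labels):
    """Fraction of rows where predicted and true class agree."""
    correct = [argmax(p) == argmax(l) for p, l in zip(predictions, labels)]
    return sum(correct) / len(correct)

# Four made-up predictions over three classes, with one-hot labels.
predictions = [[0.1, 0.7, 0.2],
               [0.6, 0.3, 0.1],
               [0.2, 0.3, 0.5],
               [0.4, 0.5, 0.1]]
labels = [[0, 1, 0],
          [1, 0, 0],
          [0, 1, 0],
          [0, 1, 0]]

print(accuracy(predictions, labels))  # 0.75
```

Three of the four predictions match their labels; the third one predicts class 2 while the label is class 1.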

Run the model on the test data to get the accuracy as defined above.

In [14]:
print(session.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

0.9169
