# MNIST Exercise with softmax regression

#### This exercise follows the TensorFlow beginners tutorial: https://www.tensorflow.org/versions/r1.4/get_started/mnist/beginners
#### Section 11 (Train) follows the "pros" variant of the same tutorial: https://www.tensorflow.org/versions/r1.4/get_started/mnist/pros

#### 1-1. The MNIST Data
MNIST is a simple computer vision dataset. It consists of images of handwritten digits and a label for each image. Running the code below may print a warning; it does not affect the results of the exercise.

In [1]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

Extracting MNIST_data\train-images-idx3-ubyte.gz
Extracting MNIST_data\train-labels-idx1-ubyte.gz
Extracting MNIST_data\t10k-images-idx3-ubyte.gz
Extracting MNIST_data\t10k-labels-idx1-ubyte.gz


#### 1-2. Extract the dataset
Use NumPy to collect the training and test images and labels into arrays.

In [2]:
import numpy as np
X_train = np.vstack([img.reshape(-1,) for img in mnist.train.images])
y_train = mnist.train.labels

X_test = np.vstack([img.reshape(-1,) for img in mnist.test.images])
y_test = mnist.test.labels

#### 1-3. Shape of dataset

In [3]:
print(X_train.shape)
print(y_train.shape)
print(X_test.shape)
print(y_test.shape)

(55000, 784)
(55000, 10)
(10000, 784)
(10000, 10)
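
The label arrays have 10 columns because the data was loaded with `one_hot=True`: each digit is stored as a vector with a single 1 at the index of its class. The sketch below illustrates the encoding in plain NumPy. The helper `one_hot` and the sample `digits` are hypothetical names used only for illustration; they are not part of the notebook or of TensorFlow.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return a (len(labels), num_classes) array with a single 1.0 per row."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Set column `labels[i]` of row i to 1.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

digits = np.array([3, 0, 9])       # made-up example labels
encoded = one_hot(digits)          # shape (3, 10)
decoded = encoded.argmax(axis=1)   # argmax recovers the original digits
```

Taking `argmax` along axis 1 inverts the encoding, which is the same operation the evaluation step later applies to the model output.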


#### 1-4. Data visualization

In [4]:
import matplotlib.pyplot as plt
for i in range(1,10):
    plt.subplot(330+i)  # cell i of a 3x3 grid
    plt.imshow(X_train[30+i*5000].reshape(28,28),cmap='gray_r')
plt.show()  # call once, after the loop, so all nine digits appear in one figure

#### 2. Import TensorFlow

In [None]:
import tensorflow as tf
## tf.__version__ 

#### 3. Placeholders
`x` is not a specific value. It is a placeholder: a value that we supply when we ask TensorFlow to run a computation. We want to be able to input any number of MNIST images, each flattened into a 784-dimensional vector. We represent this as a 2-D tensor of floating-point numbers with shape [None, 784]. (Here None means that the dimension can be of any length.)

In [None]:
x = tf.placeholder(tf.float32, shape=[None, 784])

#### 4. Variables: Weights and biases
A Variable is a modifiable tensor that lives in TensorFlow's graph of interacting operations.

In [None]:
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))

#### 5. Model Implementation
1. Multiply x by W
2. Add b
3. Apply tf.nn.softmax

In [None]:
y = tf.nn.softmax(tf.matmul(x, W) + b)
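
The three steps above can be checked outside TensorFlow with a small NumPy sketch. This is only an illustration under stated assumptions: `softmax`, `x_batch`, `W0` and `b0` are names introduced here, the input batch is random data rather than MNIST, and subtracting the row maximum is a common stability trick that the notebook code does not do explicitly.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max avoids overflow in exp."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
x_batch = rng.random((5, 784)).astype(np.float32)   # 5 fake flattened images
W0 = np.zeros((784, 10), dtype=np.float32)          # same zero init as W
b0 = np.zeros(10, dtype=np.float32)                 # same zero init as b

# 1. multiply x by W, 2. add b, 3. apply softmax
probs = softmax(x_batch @ W0 + b0)                  # shape (5, 10)
```

With zero-initialized weights every logit is 0, so each row of `probs` is the uniform distribution over the 10 classes. That is the model's starting point before any training.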

## Training

#### 6. Cross-entropy placeholder
Add a new placeholder to input the correct answers (the one-hot labels).

In [None]:
y_ = tf.placeholder(tf.float32, [None, 10])

#### 7. Cross-entropy function

API reference
- tf.reduce_mean https://www.tensorflow.org/api_docs/python/tf/reduce_mean
- tf.nn.softmax_cross_entropy_with_logits_v2 https://www.tensorflow.org/api_docs/python/tf/nn/softmax_cross_entropy_with_logits_v2

In [None]:
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), axis=[1]))

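## Numerically more stable alternative (later TensorFlow API).
## It applies softmax internally, so pass the unnormalized logits, not the softmax output y:
## cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_, logits=tf.matmul(x, W) + b))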
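
To see what the cross-entropy expression computes, and why computing it from logits is preferred, here is a hedged NumPy sketch. The helpers `log_softmax` and `cross_entropy_from_logits` and the sample arrays are hypothetical and written for illustration; they mimic the idea behind the fused TensorFlow op but are not its implementation.

```python
import numpy as np

def log_softmax(logits):
    """Row-wise log(softmax(logits)) computed without forming tiny probabilities."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def cross_entropy_from_logits(logits, labels):
    """Mean over the batch of -sum(labels * log_softmax(logits))."""
    return float(np.mean(-np.sum(labels * log_softmax(logits), axis=1)))

logits = np.array([[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]])   # made-up scores
labels = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])   # one-hot truth

# Naive form, as in the notebook cell: softmax first, then log.
probs = np.exp(log_softmax(logits))
naive = float(np.mean(-np.sum(labels * np.log(probs), axis=1)))
stable = cross_entropy_from_logits(logits, labels)

# With an extreme logit the true class has probability exp(-1000), which
# underflows to 0, so log(prob) would be -inf. The logit form stays finite.
extreme = cross_entropy_from_logits(np.array([[1000.0, 0.0]]),
                                    np.array([[0.0, 1.0]]))
```

For ordinary inputs `naive` and `stable` agree; they differ only when a predicted probability underflows to zero.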

#### 8. Optimization algorithm

In [None]:
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
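
TensorFlow derives the gradients automatically, but a single gradient descent step for this model is short enough to write by hand. The sketch below is an illustration on random toy data, not the optimizer's actual code. It uses the standard result that the gradient of the mean softmax cross-entropy with respect to the logits is `(probs - labels) / n`. All names (`loss_and_grads`, `X_toy`, `Y_toy`, `W_toy`, `b_toy`) are hypothetical, and the `1e-12` offset inside the log is an assumption added to avoid `log(0)`.

```python
import numpy as np

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def loss_and_grads(X, Y, W, b):
    """Mean cross-entropy of softmax(X @ W + b) and its gradients w.r.t. W and b."""
    probs = softmax(X @ W + b)
    loss = -np.mean(np.sum(Y * np.log(probs + 1e-12), axis=1))
    diff = (probs - Y) / X.shape[0]        # gradient w.r.t. the logits
    return loss, X.T @ diff, diff.sum(axis=0)

rng = np.random.default_rng(0)
X_toy = rng.random((20, 4))                      # 20 samples, 4 features
Y_toy = np.eye(3)[rng.integers(0, 3, size=20)]   # random one-hot labels, 3 classes
W_toy = np.zeros((4, 3))
b_toy = np.zeros(3)

learning_rate = 0.5                              # same value as in the cell above
loss_before, grad_W, grad_b = loss_and_grads(X_toy, Y_toy, W_toy, b_toy)
W_toy -= learning_rate * grad_W                  # one gradient descent update
b_toy -= learning_rate * grad_b
loss_after, _, _ = loss_and_grads(X_toy, Y_toy, W_toy, b_toy)
```

Comparing `loss_before` with `loss_after` shows the effect of one update; `train_step` performs the same kind of update on `W` and `b` each time it is run.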

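#### 9. Launch the model
Launch the model in an InteractiveSession. It installs itself as the default session, so operations can be run with `.run()` and tensors evaluated with `.eval()` without passing the session explicitly.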

In [None]:
sess = tf.InteractiveSession()

#### 10. Initialize the variables

In [None]:
tf.global_variables_initializer().run()

#### 11. Train
Load 100 training examples in each training iteration and run 1000 iterations.

In [None]:
for _ in range(1000):
  batch = mnist.train.next_batch(100)
  train_step.run(feed_dict={x: batch[0], y_: batch[1]})

## Stochastic training sample 
## Using small batches of random data is called stochastic training -- in this case, stochastic gradient descent. 
## Ideally, we'd like to use all our data for every step of training because that would give us a better sense of what we should be doing, but that's expensive. 
## So, instead, we use a different subset every time. Doing this is cheap and has much of the same benefit.

## Equivalent loop that calls sess.run instead of train_step.run:
## for _ in range(1000):
##  batch_xs, batch_ys = mnist.train.next_batch(100)
##  sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
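
As a rough picture of what drawing "a different subset every time" means, here is one common way to produce mini-batches in NumPy: shuffle the indices once, then slice them. This is a sketch under assumptions, not the code behind `mnist.train.next_batch`; the generator `iterate_minibatches` and the toy arrays are hypothetical.

```python
import numpy as np

def iterate_minibatches(X, Y, batch_size, rng):
    """Yield (X_batch, Y_batch) pairs covering the data once in random order."""
    order = rng.permutation(X.shape[0])
    for start in range(0, X.shape[0], batch_size):
        idx = order[start:start + batch_size]
        yield X[idx], Y[idx]

rng = np.random.default_rng(0)
X_toy = np.arange(20, dtype=np.float32).reshape(10, 2)   # 10 fake samples
Y_toy = np.arange(10)                                    # one fake label per sample

batches = list(iterate_minibatches(X_toy, Y_toy, batch_size=4, rng=rng))
batch_sizes = [bx.shape[0] for bx, _ in batches]         # last batch may be smaller
```

Each sample appears in exactly one batch per pass, and the last batch is smaller when the dataset size is not a multiple of `batch_size`.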

## Evaluate the Model
#### 12. Compare the prediction with the truth

- tf.argmax gives the index of the highest entry in a tensor along some axis, so tf.argmax(y,1) is the predicted digit and tf.argmax(y_,1) is the true digit
- tf.equal checks whether our prediction matches the truth

In [None]:
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))

That gives us a list of booleans. To determine what fraction are correct, we cast to floating point numbers and then take the mean. For example, [True, False, True, True] would become [1,0,1,1] which would become 0.75

In [None]:
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
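
The same argmax, equal, cast and mean chain can be reproduced in NumPy. The arrays below are made-up values chosen so that the comparison yields the [True, False, True, True] example from the text; `predicted`, `truth`, `correct` and `accuracy_value` are illustrative names only.

```python
import numpy as np

# Made-up model outputs (rows sum to 1) and one-hot truth for 4 samples, 3 classes.
predicted = np.array([[0.1, 0.8, 0.1],
                      [0.6, 0.3, 0.1],
                      [0.2, 0.2, 0.6],
                      [0.7, 0.2, 0.1]])
truth = np.array([[0, 1, 0],
                  [0, 0, 1],
                  [0, 0, 1],
                  [1, 0, 0]])

# Compare the most likely class with the true class for each sample.
correct = predicted.argmax(axis=1) == truth.argmax(axis=1)

# Cast booleans to floats and average to get the fraction that is correct.
accuracy_value = correct.astype(np.float32).mean()
```

Here `correct` is [True, False, True, True] and `accuracy_value` is 0.75.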

In [None]:
print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))