# Feed-Forward Neural Nets on the MNIST Dataset

In this exercise you will implement a 3-layer feed-forward neural network with ReLU activations to classify images of handwritten digits. The exercise was originally posed as a binary task (classifying images as 8s or 5s); the code below keeps all ten digit classes with one-hot labels.

<img src="mnist_sample.png" height="200" width="200">

## Task 1: Load the Data

Import all the data. The code below uses the `input_data.read_data_sets()` helper from the TensorFlow tutorials package rather than `np.loadtxt()`.

In [1]:
import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data


In [3]:
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


In [4]:
training_images = mnist.train.images
labels = mnist.train.labels
test_images = mnist.test.images
test_labels = mnist.test.labels
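
Because the data is loaded with `one_hot=True`, each label is a length-10 vector with a single 1 at the index of the digit. The following is a minimal NumPy sketch of that encoding. The `one_hot` helper and the example digits are hypothetical, written only for illustration, and are not part of the TensorFlow loader.

```python
import numpy as np

def one_hot(digits, num_classes=10):
    """Return a (len(digits), num_classes) float32 array with a 1 at each digit's index."""
    digits = np.asarray(digits)
    encoded = np.zeros((digits.size, num_classes), dtype=np.float32)
    encoded[np.arange(digits.size), digits] = 1.0
    return encoded

# Made-up labels for illustration only; not taken from MNIST.
example_digits = [5, 8, 0]
example_labels = one_hot(example_digits)

print(example_labels.shape)           # (3, 10)
print(example_labels.argmax(axis=1))  # argmax recovers the original digits
```

Taking the arg-max of a one-hot row recovers the digit, which is how the accuracy computation later compares predictions with labels.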

## Declaring Placeholders (Inputs)

TensorFlow uses `tf.placeholder` to declare values that are fed into the network as input at run time. Write all the placeholders needed for our learning problem:

In [5]:
in_layer = 784
out_layer = 10
learning_rate = 0.01

In [6]:
img = tf.placeholder(tf.float32, [None, in_layer])
ans = tf.placeholder(tf.float32, [None, out_layer])
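
The `None` in each placeholder shape leaves the batch dimension open, so the same graph accepts one image or many; only the second dimension has to match. A small NumPy sketch of the shapes involved, assuming zero-filled stand-in arrays rather than real MNIST data:

```python
import numpy as np

in_layer = 784

# Stand-in for one flattened 28x28 image (zeros, not real data).
single_image = np.zeros(in_layer, dtype=np.float32)

# Wrapping the row in a list adds the batch dimension expected by a
# [None, 784] placeholder: shape (784,) becomes (1, 784).
batch_of_one = np.asarray([single_image])

# A larger batch has the same second dimension and is equally valid.
batch_of_32 = np.zeros((32, in_layer), dtype=np.float32)

print(single_image.shape, batch_of_one.shape, batch_of_32.shape)
```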

## Network Architecture

Your network should contain 3 feed-forward layers, each with a bias vector. The structure should be as follows:
* Feed-forward layer from 784 nodes to 784 nodes
* Feed-forward layer from 784 nodes to 256 nodes
* Feed-forward layer from 256 nodes to 10 nodes (one per digit class, matching `out_layer` in the code)

In [7]:
hiddenSz1 = 784
hiddenSz2 = 256

weights = {
    'w1': tf.Variable(tf.random_normal([in_layer, hiddenSz1], stddev = 0.1)),
    'w2': tf.Variable(tf.random_normal([hiddenSz1, hiddenSz2], stddev = 0.1)),
    'out': tf.Variable(tf.random_normal([hiddenSz2, out_layer], stddev = 0.1))
}
biases = {
    'b1': tf.Variable(tf.random_normal([hiddenSz1], stddev = 0.1)),
    'b2': tf.Variable(tf.random_normal([hiddenSz2], stddev = 0.1)),
    'out': tf.Variable(tf.random_normal([out_layer], stddev = 0.1))
}
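
To get a feel for the size of this network, the number of trainable parameters can be counted from the layer sizes. The sketch below assumes the sizes used in the code above (784, 784, 256 and 10); the variable names are illustrative only.

```python
# Layer sizes as used in the code above: input, two hidden layers, output.
layer_sizes = [784, 784, 256, 10]

total = 0
for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    n_weights = fan_in * fan_out  # one weight per input/output pair
    n_biases = fan_out            # one bias per output node
    print("%d -> %d: %d weights + %d biases" % (fan_in, fan_out, n_weights, n_biases))
    total += n_weights + n_biases

print("Total trainable parameters:", total)  # 818970
```

Most of the parameters sit in the first 784-to-784 layer, which alone accounts for 615,440 of them.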

## Forward and Backward Pass
Implement the forward and backward pass for the neural network:
* Use ReLU activation
* Use softmax probabilities
* Use cross entropy loss function
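
The three ingredients listed above can be sketched in plain NumPy before writing them in TensorFlow. This is an illustrative sketch only: the parameters are random, the inputs and labels are made up, and the helper names (`relu`, `softmax`, `cross_entropy`, `forward`) are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def softmax(logits):
    # Subtract the row-wise max before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, one_hot_labels, eps=1e-12):
    # Mean over the batch of -sum(label * log(prob)).
    return float(-np.mean(np.sum(one_hot_labels * np.log(probs + eps), axis=1)))

rng = np.random.default_rng(0)
sizes = [784, 784, 256, 10]
# Random weights and biases with standard deviation 0.1 for each layer.
params = [(rng.normal(0, 0.1, (m, n)), rng.normal(0, 0.1, n))
          for m, n in zip(sizes[:-1], sizes[1:])]

def forward(x):
    (w1, b1), (w2, b2), (w3, b3) = params
    h1 = relu(x @ w1 + b1)
    h2 = relu(h1 @ w2 + b2)
    return h2 @ w3 + b3  # logits; softmax is applied separately

x = rng.random((4, 784))      # 4 random stand-in "images"
y = np.eye(10)[[5, 8, 5, 8]]  # made-up one-hot labels
probs = softmax(forward(x))
loss_value = cross_entropy(probs, y)
print(probs.shape, probs.sum(axis=1), loss_value)
```

Each row of `probs` sums to 1, and the loss is the average negative log-probability assigned to the true class.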

In [8]:
# Performing the forward pass through the layers
def neural_network(data):
    layer_1 = tf.add(tf.matmul(data, weights['w1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    layer_2 = tf.add(tf.matmul(layer_1, weights['w2']), biases['b2'])
    layer_2 = tf.nn.relu(layer_2)
    output = tf.matmul(layer_2, weights['out']) + biases['out']
    return output

# Softmax Probabilities (network output)
output = neural_network(img)
prediction = tf.nn.softmax(output)

# backward pass -- adjusting the parameters
# Note: You don't need to compute the gradient yourself. 
# Simply use the tf.train.GradientDescentOptimizer() function
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels = ans, logits = output))
train = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)

# Compute the accuracy
NumCorrect = tf.equal(tf.argmax(prediction, 1), tf.argmax(ans, 1))
accuracy = tf.reduce_mean(tf.cast(NumCorrect, tf.float32))


In [9]:
sess = tf.Session()
sess.run(tf.global_variables_initializer())

## Training the Model
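Use SGD to train the network. The loop below makes a single pass over the training set and updates the parameters after each image (batch size 1).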

In [10]:
for i in range(len(training_images)):
    imgs = [training_images[i]]
    anss = [labels[i]]
    sess.run(train, feed_dict = {img: imgs, ans: anss})    
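
To make concrete what one such update does, here is a minimal NumPy sketch of a single-example SGD step. For brevity it assumes a single linear softmax layer rather than the three-layer network, and relies on the fact that the gradient of softmax cross entropy with respect to the logits is the predicted probabilities minus the one-hot label. The `sgd_step` helper, the random input and the label are all hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def sgd_step(w, b, x, y, learning_rate=0.01):
    """One SGD update on a single example for a linear softmax classifier."""
    probs = softmax(x @ w + b)
    loss = -np.log(probs[y.argmax()])
    grad_logits = probs - y  # d(loss)/d(logits) for softmax + cross entropy
    # Update the parameters in place, moving against the gradient.
    w -= learning_rate * np.outer(x, grad_logits)
    b -= learning_rate * grad_logits
    return loss

rng = np.random.default_rng(0)
w = rng.normal(0, 0.1, (784, 10))
b = np.zeros(10)
x = rng.random(784)  # random stand-in for one image
y = np.eye(10)[8]    # made-up label: the digit 8

losses = [sgd_step(w, b, x, y) for _ in range(5)]
print(losses)  # the loss on this one example should shrink from step to step
```

In the TensorFlow code, `GradientDescentOptimizer(...).minimize(loss)` derives the corresponding gradients for every layer automatically.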

## Finishing Training and Computing Final Training and Testing Accuracy
Now that the model is trained, check the accuracy and observe the improvement!

In [11]:
# Each call evaluates a single example, so `accuracy` is 1.0 or 0.0;
# summing and dividing by the dataset size gives the fraction correct.
test_acc = 0
for i in range(len(test_images)):
    imgs = [test_images[i]]
    anss = [test_labels[i]]
    test_acc += sess.run(accuracy, feed_dict={img: imgs, ans: anss})

train_acc = 0
for i in range(len(training_images)):
    imgs = [training_images[i]]
    anss = [labels[i]]
    train_acc += sess.run(accuracy, feed_dict={img: imgs, ans: anss})

print("Train Accuracy: %r" % (train_acc / len(training_images)))
print("Test Accuracy: %r" % (test_acc / len(test_images)))

Train Accuracy: 0.9659454545454546
Test Accuracy: 0.9597
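
The accuracy above is accumulated one example at a time. The NumPy sketch below, which uses made-up predictions and labels over 3 classes purely for illustration, shows that this gives the same number as comparing arg-maxes over the whole batch at once.

```python
import numpy as np

# Made-up softmax outputs for 4 examples over 3 classes (illustration only).
predictions = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.2, 0.6],
])
# Made-up one-hot labels: true classes are 0, 1, 0, 2.
targets = np.eye(3)[[0, 1, 0, 2]]

# Whole batch at once: compare arg-max of prediction and label, then average.
correct = predictions.argmax(axis=1) == targets.argmax(axis=1)
batch_accuracy = correct.astype(np.float32).mean()

# One example at a time: sum the 0/1 results and divide by the count.
running = 0.0
for p, t in zip(predictions, targets):
    running += float(p.argmax() == t.argmax())
loop_accuracy = running / len(predictions)

print(batch_accuracy, loop_accuracy)  # 0.75 0.75
```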
