# Feed-Forward Neural Nets on the MNIST Dataset

In the last homework, our network struggled to differentiate between 8s and 5s. In this assignment, we will use a deep learning framework known as TensorFlow to improve our model.

In this exercise you will implement a 3-layer feed-forward neural network with ReLU activation to perform a binary classification task. The network takes in images of handwritten digits and classifies them as 8s or 5s.

<img src="mnist_sample.png" height="200" width="200">

## Task 1: Load the Data

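Import all the data. The cell below uses TensorFlow's input_data.read_data_sets() helper rather than np.loadtxt(), with one_hot=True so that each label is a 10-element one-hot vector.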

In [1]:
import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

training_images = mnist.train.images
labels = mnist.train.labels
test_images = mnist.test.images
test_labels = mnist.test.labels

print(mnist)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
Datasets(train=<tensorflow.contrib.learn.python.learn.datasets.mnist.DataSet object at 0x11b09dcf8>, validation=<tensorflow.contrib.learn.python.learn.datasets.mnist.DataSet object at 0x11b0a7cc0>, test=<tensorflow.contrib.learn.python.learn.datasets.mnist.DataSet object at 0x11b0a7b70>)


In [2]:
np.shape(training_images)

(55000, 784)
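
The shape above reflects that each 28x28 image is stored as a flat vector of 784 values, and that one_hot=True turns each label into a 10-element vector. The following is a minimal NumPy sketch of both ideas. The image here is a made-up stand-in rather than real MNIST data, and the one_hot helper is a hypothetical name introduced only for illustration.

```python
import numpy as np

# Hypothetical stand-in for one 28x28 grayscale digit (not real MNIST data)
image_2d = np.arange(28 * 28, dtype=np.float32).reshape(28, 28) / 783.0

# Flatten row by row into a single 784-element vector
image_flat = image_2d.reshape(-1)

def one_hot(label, num_classes=10):
    """Return a vector of zeros with a single 1.0 at index `label`."""
    vec = np.zeros(num_classes, dtype=np.float32)
    vec[label] = 1.0
    return vec

label_vec = one_hot(5)
print(image_flat.shape)  # (784,)
print(label_vec)
```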

## Declaring Placeholders (Inputs)

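TensorFlow uses tf.placeholder() to declare values that are fed into the network as input at run time. Write all placeholders necessary to perform our learning problem: one for the flattened images and one for the one-hot labels. A first dimension of None lets the batch size vary.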

In [3]:
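# Batch of flattened 28x28 images; None leaves the batch size unspecified
img = tf.placeholder(tf.float32, [None, 784])
# Matching batch of one-hot labels, one entry per digit class
ans = tf.placeholder(tf.float32, [None, 10])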

## Network Architecture

Your network should contain 3 feed-forward layers, each with a bias vector. The structure should be as follows:
* Feed-forward layer from 784 nodes to 784 nodes
* Feed-forward layer from 784 nodes to 256 nodes
* Feed-forward layer from 256 nodes to 2 nodes (the code below uses 10 output nodes, one per digit, because the labels are loaded as 10-element one-hot vectors)
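
To get a feel for the size of this architecture, the trainable parameters can be counted layer by layer: each layer contributes one weight per input-output pair plus one bias per output. The count_parameters helper below is a hypothetical name used only for this sketch. It is evaluated for both a 2-node output layer and the 10-node output layer that the code cell uses (out_layer = 10).

```python
def count_parameters(layer_sizes):
    """Sum weights (n_in * n_out) and biases (n_out) over consecutive layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total

# 10 outputs, one per digit class, as in out_layer = 10
params_ten_way = count_parameters([784, 784, 256, 10])
# 2 outputs, as in a strictly binary 8-vs-5 output layer
params_binary = count_parameters([784, 784, 256, 2])

print(params_ten_way)  # 818970
print(params_binary)   # 816914
```

Almost all of the parameters sit in the first two weight matrices, so the choice of 2 or 10 outputs changes the total by only about 2,000.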

In [4]:
# Layer sizes: 784 inputs -> 784 hidden -> 256 hidden -> 10 outputs
hiddenSz1 = 784
hiddenSz2 = 256
inputs = 784
out_layer = 10

# Weights are drawn with stddev 0.1; biases use random_normal's default stddev of 1.0
w1 = tf.Variable(tf.random_normal([inputs, hiddenSz1], stddev=0.1))
b1 = tf.Variable(tf.random_normal([hiddenSz1]))
w2 = tf.Variable(tf.random_normal([hiddenSz1, hiddenSz2], stddev=0.1))
b2 = tf.Variable(tf.random_normal([hiddenSz2]))
w3 = tf.Variable(tf.random_normal([hiddenSz2, out_layer], stddev=0.1))
b3 = tf.Variable(tf.random_normal([out_layer]))

## Forward and Backward Pass
Implement the forward and backward pass for the neural net:
* Use ReLU activation
* Use softmax probabilities
* Use the cross-entropy loss function
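
The three ingredients listed above can be sketched in plain NumPy to show what the TensorFlow ops compute in the forward pass. This is an illustration under assumptions, not the notebook's implementation: the input batch and labels are random stand-ins, the biases start at zero for simplicity, and the helper names relu, softmax, and cross_entropy are introduced only for this sketch.

```python
import numpy as np

def relu(x):
    # Zero out negative activations, keep positive ones unchanged
    return np.maximum(x, 0.0)

def softmax(logits):
    # Subtract each row's max before exponentiating for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=1, keepdims=True)

def cross_entropy(probs, one_hot_labels):
    # Mean over the batch of -log(probability assigned to the true class)
    true_class_probs = np.sum(probs * one_hot_labels, axis=1)
    return float(-np.mean(np.log(true_class_probs + 1e-12)))

rng = np.random.default_rng(0)
x = rng.random((4, 784))        # hypothetical batch of 4 flattened images
y = np.eye(10)[[8, 5, 8, 5]]    # hypothetical one-hot labels

# Same layer sizes as the notebook; zero biases are an assumption of this sketch
w1, b1 = rng.normal(0.0, 0.1, (784, 784)), np.zeros(784)
w2, b2 = rng.normal(0.0, 0.1, (784, 256)), np.zeros(256)
w3, b3 = rng.normal(0.0, 0.1, (256, 10)), np.zeros(10)

h1 = relu(x @ w1 + b1)
h2 = relu(h1 @ w2 + b2)
probs = softmax(h2 @ w3 + b3)
loss = cross_entropy(probs, y)
print(probs.shape, loss)
```

Each row of probs sums to 1, and the loss shrinks as the probability assigned to the correct class grows.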

In [5]:
# Performing the forward pass through the layers
L1 = tf.nn.relu(tf.matmul(img,w1)+b1)
L2 = tf.nn.relu(tf.matmul(L1,w2)+b2)
# Logits (pre-softmax scores), then softmax probabilities (network output)
before_prbs = tf.matmul(L2,w3)+b3
prbs = tf.nn.softmax(before_prbs)

# backward pass -- adjusting the parameters
# Note: You don't need to compute the gradient yourself. 
# Simply use the tf.train.GradientDescentOptimizer() function 
xEnt = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=ans, logits=before_prbs))
train = tf.train.GradientDescentOptimizer(0.01).minimize(xEnt)

# Compute the accuracy
numCorrect = tf.equal(tf.argmax(prbs,1),tf.argmax(ans,1))
accuracy = tf.reduce_mean(tf.cast(numCorrect, tf.float32))


sess = tf.Session()
sess.run(tf.global_variables_initializer())


## Training the Model
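Use SGD to train the network. The loop below makes a single pass over the training set, feeding one image at a time (batch size 1).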

In [None]:
for i in range(len(training_images)):
    imgs = [training_images[i]]
    anss = [labels[i]]
    sess.run(train, feed_dict={img: imgs, ans: anss})
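
Each call to sess.run(train, ...) performs one gradient-descent update on a single example. As a rough illustration of what such an update does, here is a NumPy sketch of one SGD step for a much simpler model: a single softmax layer rather than the 3-layer network. The input vector, the sgd_step and loss helpers, and the zero initialization are all assumptions made for this example.

```python
import numpy as np

def softmax(z):
    z = z - z.max()              # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def loss(W, b, x, y):
    # Cross entropy for one example: -log(probability of the true class)
    return float(-np.log(softmax(x @ W + b) @ y))

def sgd_step(W, b, x, y, lr):
    """One SGD update for a single-layer softmax classifier on one example."""
    p = softmax(x @ W + b)
    grad_logits = p - y          # gradient of cross entropy w.r.t. the logits
    W_new = W - lr * np.outer(x, grad_logits)
    b_new = b - lr * grad_logits
    return W_new, b_new

x = np.linspace(0.0, 1.0, 784)   # hypothetical flattened image
y = np.zeros(10)
y[8] = 1.0                       # one-hot label for the digit 8
W, b = np.zeros((784, 10)), np.zeros(10)

loss_before = loss(W, b, x, y)
W, b = sgd_step(W, b, x, y, lr=0.01)   # same learning rate as the optimizer above
loss_after = loss(W, b, x, y)
print(loss_before, loss_after)
```

With all-zero parameters the model starts out predicting every class with probability 0.1, so loss_before equals ln(10), and the single update lowers the loss on that example.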

## Finishing Training and Computing Final Training and Testing Accuracy
Now that the model is trained, check the accuracy and observe the improvement!

In [None]:
sumAcc=0
for i in range(len(test_images)):
    imgs = [test_images[i]]
    anss = [test_labels[i]]
    sumAcc += sess.run(accuracy, feed_dict={img: imgs, ans: anss})

trainacc = 0
for i in range(len(training_images)):
    imgs = [training_images[i]]
    anss = [labels[i]]
    trainacc += sess.run(accuracy, feed_dict={img: imgs, ans: anss})


print("Train Accuracy: %r" % (trainacc/len(training_images)))
print("Test Accuracy: %r" % (sumAcc/len(test_images)))
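
The accuracy op compares the index of the largest predicted probability with the index of the 1 in the one-hot label, then averages the matches. The same computation can be sketched in NumPy on a whole batch at once. The probabilities and labels below are invented for illustration (3 classes instead of 10), and batch_accuracy is a hypothetical helper name.

```python
import numpy as np

def batch_accuracy(probs, one_hot_labels):
    """Fraction of rows whose highest-probability class matches the label."""
    predicted = np.argmax(probs, axis=1)
    actual = np.argmax(one_hot_labels, axis=1)
    return float(np.mean(predicted == actual))

# Invented predictions for 4 examples over 3 classes
probs = np.array([[0.1, 0.7, 0.2],
                  [0.8, 0.1, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.2, 0.5, 0.3]])
labels_one_hot = np.array([[0, 1, 0],
                           [1, 0, 0],
                           [0, 1, 0],
                           [0, 1, 0]], dtype=float)

acc = batch_accuracy(probs, labels_one_hot)
print(acc)  # 0.75
```

Because the placeholders accept any batch size, the same idea should apply to the TensorFlow accuracy op: feeding all test images in one call would give the same average as summing per-example results.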
