## MNIST

In [7]:
from __future__ import division, print_function, unicode_literals
import tensorflow as tf
from time import time
import numpy as np
from tqdm import tqdm_notebook as tqdm

# matplotlib theme
from jupyterthemes import jtplot
jtplot.style()

In the next exercise, we will use a neural network composed only of fully-connected layers to classify the MNIST dataset. Instead of the classic 10-class classification, we will classify the digits into two classes: even digits and odd digits.
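
To make the two-class target concrete, here is a minimal NumPy sketch of turning digit labels into even/odd labels. The function name `to_even_odd` and the column order (column 0 for even, column 1 for odd) are assumptions made for illustration; the helper used later in this notebook may order the columns differently.

```python
import numpy as np

def to_even_odd(digits):
    """Map integer digit labels (0-9) to one-hot even/odd labels.

    Column 0 is 1 for even digits, column 1 is 1 for odd digits
    (an illustrative convention, not necessarily the notebook's).
    """
    parity = np.asarray(digits) % 2          # 0 for even, 1 for odd
    return np.eye(2, dtype=np.int8)[parity]  # pick the matching one-hot row

labels = to_even_odd([0, 3, 4, 7])
print(labels)
```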

The next code snippet downloads the MNIST dataset for you and defines two helper functions: one that creates weight variables and one that creates bias variables.
We will use those functions to construct our network.
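
The weight helper in the next cell draws its initial values from a truncated normal distribution, which redraws any sample that falls more than two standard deviations from the mean. As a rough illustration of that idea only, here is a NumPy sketch; the function `truncated_normal` and its `seed` argument are hypothetical and are not part of the notebook or of TensorFlow's implementation.

```python
import numpy as np

def truncated_normal(shape, stddev=0.01, seed=0):
    """Sample N(0, stddev^2), redrawing values beyond 2 standard deviations."""
    rng = np.random.default_rng(seed)
    samples = rng.normal(0.0, stddev, size=shape)
    out_of_range = np.abs(samples) > 2 * stddev
    while out_of_range.any():
        # redraw only the entries that landed outside the allowed range
        samples[out_of_range] = rng.normal(0.0, stddev, size=out_of_range.sum())
        out_of_range = np.abs(samples) > 2 * stddev
    return samples

W = truncated_normal((784, 500))
print(W.shape, np.abs(W).max())
```

Keeping the initial weights small and bounded like this avoids a few very large values dominating the first forward passes.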




In [2]:
# Disable warnings before download / Reenable afterwards
old_v = tf.logging.get_verbosity()
tf.logging.set_verbosity(tf.logging.ERROR)

# Download the dataset
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
tf.logging.set_verbosity(old_v)

def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.01)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.0, shape=shape)
    return tf.Variable(initial)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz



Start by defining two placeholders: one to hold the images and one to hold the two-class labels.
Use tf.float32 for the placeholder type.

In [3]:
batch_size = 1000

# correct labels
y_ = tf.placeholder(tf.int8, shape=[batch_size, 2])

# input data
x = tf.placeholder(tf.float32, shape=[batch_size, 784])


Next, define the network itself. It is up to you how many layers to use, and the number of hidden units in each layer.

You are allowed to use only the following functions:
* weight_variable
* bias_variable
* tf.nn.relu
* tf.nn.dropout
* tf.nn.softmax
* tf.matmul

Please note that each layer includes not only tf.matmul, but also a bias variable.
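
As a framework-free illustration of what "matmul plus bias" means for each layer, here is a small NumPy sketch of a two-layer forward pass. The helper `dense_relu`, the layer width of 16, and the random inputs are assumptions chosen for the example; they are not the sizes used in this notebook.

```python
import numpy as np

def dense_relu(x, W, b):
    """One fully-connected layer: affine map (x @ W + b) followed by ReLU."""
    return np.maximum(x @ W + b, 0.0)

rng = np.random.default_rng(0)
x = rng.random((4, 784))                 # 4 fake flattened 28x28 images
W1 = rng.normal(0.0, 0.01, (784, 16))    # hidden-layer weights
b1 = np.zeros(16)                        # hidden-layer bias
W2 = rng.normal(0.0, 0.01, (16, 2))      # output-layer weights
b2 = np.zeros(2)                         # output-layer bias

hidden = dense_relu(x, W1, b1)
logits = hidden @ W2 + b2                # raw class scores, no activation
print(hidden.shape, logits.shape)
```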


In [4]:
# build the net
hidden_size = 6
num_classes = 2
neuron_per_layer = 500

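# First layer: 784 input pixels -> neuron_per_layer units
W_fc1 = weight_variable([784, neuron_per_layer])
b_fc1 = bias_variable([neuron_per_layer])

# Hidden layers: h_fch stacks each layer's activations along axis 2
h_fch = tf.expand_dims(tf.nn.relu(tf.matmul(x, W_fc1) + b_fc1), 2)
# one weight matrix per slice; only the first hidden_size-1 slices are used
W_fch = weight_variable([neuron_per_layer, neuron_per_layer, hidden_size])
for i in range(hidden_size-1):
    # note: the same bias b_fc1 is shared by every hidden layer
    mult = tf.nn.relu(tf.matmul(h_fch[:,:,i], W_fch[:,:,i])+ b_fc1)
    h_fch = tf.concat([h_fch, tf.expand_dims(mult, 2)], 2)
    
# Output layer: logits from the last hidden activation (no bias, no softmax here)
W_fco = weight_variable([neuron_per_layer, num_classes])
y = tf.matmul(h_fch[:, :, hidden_size-1], W_fco)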


Complete the snippet below using your own code.

Define the loss function and the optimizer.
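
For reference, here is a NumPy sketch of softmax cross-entropy computed from raw logits and one-hot labels, which is the kind of loss used in the next cell. The function name `softmax_cross_entropy` and the toy inputs are assumptions for illustration, and the sketch is not TensorFlow's actual implementation.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Per-example cross-entropy between one-hot labels and softmax(logits)."""
    # subtract the row maximum so the exponentials cannot overflow
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(labels * log_probs).sum(axis=1)

logits = np.array([[2.0, 0.0], [0.0, 0.0]])
labels = np.array([[1.0, 0.0], [0.0, 1.0]])
losses = softmax_cross_entropy(logits, labels)
cost = losses.mean()                     # scalar averaged over the batch
print(losses, cost)
```

The first example is classified confidently and correctly, so its loss is small; the second has equal logits, so its loss is log 2.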

In [5]:
# define the loss function
cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(logits=y,
                                                        labels=y_)
cost = tf.reduce_mean(cross_entropy)

# define Optimizer
Optimizer = tf.train.AdamOptimizer(1e-3).minimize(cost)

# accuracy: copy each row's largest logit into both columns
reduction = tf.matmul(tf.reshape(tf.reduce_max(y, reduction_indices=[1]), [batch_size,1]), tf.ones([1,2]))

# one-hot prediction: 1 where the logit equals the row maximum, 0 elsewhere
prediction = tf.cast(tf.equal(y, reduction), tf.int8)
correct_predictions = tf.math.equal(y_, prediction)
accuracy = tf.reduce_mean(tf.cast(correct_predictions, tf.float32))
init = tf.global_variables_initializer()

The next code snippet trains and evaluates the network. It does this by opening a session to run the TensorFlow graph that we have defined.
Complete the code at the locations marked #YOUR CODE below to train the network and evaluate its accuracy every 50 steps.
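
The accuracy reported during training compares a one-hot prediction, built from the largest logit in each row, with the one-hot label. Barring ties between logits, this should be equivalent to comparing argmax indices, as in the following NumPy sketch. The function name `accuracy` and the toy values are assumptions for illustration.

```python
import numpy as np

def accuracy(logits, one_hot_labels):
    """Fraction of rows where the largest logit matches the labeled class."""
    predicted = logits.argmax(axis=1)
    actual = one_hot_labels.argmax(axis=1)
    return float((predicted == actual).mean())

logits = np.array([[1.5, -0.3], [0.2, 0.9], [0.1, 0.4], [2.0, 1.0]])
labels = np.array([[1, 0], [0, 1], [1, 0], [1, 0]])
acc = accuracy(logits, labels)
print(acc)  # 0.75: three of the four rows are classified correctly
```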

In [8]:
def restrict_to_2_classes(y):
    # reduce to 2 classes: each row becomes [digit % 2, (digit + 1) % 2],
    # i.e. [0, 1] for even digits and [1, 0] for odd digits
    return (y @ np.arange(0,10).reshape(10,1) + [0,1]) % 2

with tf.Session() as sess:
    sess.run(init)
    for i in tqdm(range(701)):
        input_images, y_train = mnist.train.next_batch(batch_size)
        y_train = restrict_to_2_classes(y_train)

        sess.run([y, cost, Optimizer], feed_dict={x: input_images, y_: y_train})

        if i % 50 == 0:
            train_accuracy = sess.run(accuracy, feed_dict={x: input_images, y_: y_train})
            print("step %d, training accuracy %g" % (i, train_accuracy))

            # validate on a batch drawn from the test set
            test_input_images, test_correct_predictions = mnist.test.next_batch(batch_size)

            test_correct_predictions = restrict_to_2_classes(test_correct_predictions)
            test_accuracy = sess.run(accuracy, feed_dict={x: test_input_images, y_: test_correct_predictions})

            print("Validation accuracy: %g." % test_accuracy)

step 0, training accuracy 0.523
Validation accuracy: 0.508.
step 50, training accuracy 0.945
Validation accuracy: 0.935.
step 100, training accuracy 0.978
Validation accuracy: 0.978.
step 150, training accuracy 0.992
Validation accuracy: 0.98.
step 200, training accuracy 0.99
Validation accuracy: 0.987.
step 250, training accuracy 0.993
Validation accuracy: 0.988.
step 300, training accuracy 0.997
Validation accuracy: 0.984.
step 350, training accuracy 1
Validation accuracy: 0.99.
step 400, training accuracy 0.998
Validation accuracy: 0.984.
step 450, training accuracy 1
Validation accuracy: 0.988.
step 500, training accuracy 0.999
Validation accuracy: 0.987.
step 550, training accuracy 0.997
Validation accuracy: 0.987.
step 600, training accuracy 0.999
Validation accuracy: 0.984.
step 650, training accuracy 0.999
Validation accuracy: 0.99.
step 700, training accuracy 1
Validation accuracy: 0.98.

