# Intermediate Neural Network in TensorFlow

In this notebook, we convert our [intermediate-depth MNIST-classifying neural network](https://github.com/the-deep-learners/TensorFlow-LiveLessons/blob/master/notebooks/intermediate_net_in_keras.ipynb) from Keras to TensorFlow (compare them side by side), following the style of Aymeric Damien's [Multi-Layer Perceptron Notebook](https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/3_NeuralNetworks/multilayer_perceptron.ipynb).

#### Load dependencies

In [1]:
import numpy as np
np.random.seed(42)
import tensorflow as tf
tf.set_random_seed(42)

#### Load data

A small cheat here: it is easier to load the data through Keras. TensorFlow previously shipped these common datasets in `tf.contrib`, but that module is experimental and likely to change heavily between TensorFlow versions. As such, it is easier to get the data from a stable source.

You could, however, download the data yourself from [Yann LeCun's website](http://yann.lecun.com/exdb/mnist/) and process it on your own. We think loading it through Keras is the simplest approach here, since this notebook is not about how to process MNIST images.

In [2]:
from keras.datasets import mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Reshape the data and cast as float
X_train = X_train.reshape(60000, 784).astype('float32')
X_test = X_test.reshape(10000, 784).astype('float32')

# Normalize the data so that each pixel value is in the range [0,1]
X_train /= 255
X_test /= 255

# One-hot encode the labels
from keras.utils import to_categorical
y_train = to_categorical(y_train,10)
y_test = to_categorical(y_test,10)

Using TensorFlow backend.
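
To see what the preprocessing does without downloading MNIST, the following NumPy sketch applies the same three steps (flatten, scale to [0, 1], one-hot encode) to a tiny made-up batch. The array `fake_images`, the labels, and the `one_hot` helper are illustrative assumptions and not part of the original notebook; `one_hot` only mimics the kind of output `to_categorical` produces.

```python
import numpy as np

# Hypothetical stand-in for a batch of MNIST images: 3 images of 28x28 uint8 pixels.
rng = np.random.RandomState(42)
fake_images = rng.randint(0, 256, size=(3, 28, 28)).astype('uint8')
fake_labels = np.array([5, 0, 9])

# Flatten each 28x28 image into a 784-element row and cast to float32.
flat = fake_images.reshape(3, 784).astype('float32')

# Scale pixel values from [0, 255] to [0, 1].
flat /= 255

def one_hot(labels, n_classes):
    """Illustrative helper: row i of the identity matrix encodes class i."""
    return np.eye(n_classes, dtype='float32')[labels]

encoded = one_hot(fake_labels, 10)
print(flat.shape, flat.dtype)
print(encoded)
```

Each row of `encoded` contains a single 1 at the index of the digit it represents, which is the label format the loss function later in the notebook expects.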


#### Set neural network hyperparameters (tidier at top of file!)

In [4]:
lr = 0.1
epochs = 20
train_batch_size = 128
weight_initializer = tf.contrib.layers.xavier_initializer()
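
As a rough illustration of what the Xavier (Glorot) initializer does, here is a NumPy sketch of the uniform variant, which draws weights from a range that shrinks as the layer gets wider. It assumes the uniform form of the scheme with limit `sqrt(6 / (fan_in + fan_out))`; the function name `glorot_uniform` and the variable `W1_sketch` are hypothetical and do not appear in the notebook.

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, rng):
    """Sample a (fan_in, fan_out) matrix uniformly from [-limit, limit]."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out)).astype('float32')

rng = np.random.RandomState(42)

# Same shape as the first weight matrix in this notebook: 784 inputs, 64 neurons.
W1_sketch = glorot_uniform(784, 64, rng)
limit = np.sqrt(6.0 / (784 + 64))

print(W1_sketch.shape)
print("sampling limit:", float(limit))
print("largest magnitude drawn:", float(np.abs(W1_sketch).max()))
```

Keeping the initial weights in this narrow range is intended to stop activations from growing or vanishing as they pass through the layers.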

#### Set number of neurons for each layer

In [5]:
n_input = 784
n_dense_1 = 64
n_dense_2 = 64
n_classes = 10
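
These four numbers fully determine the size of the model. As a quick sanity check, the sketch below counts the trainable parameters implied by the layer sizes, assuming every layer is fully connected with one bias per neuron (as the later cells define them). The names `layer_sizes` and `total` are illustrative.

```python
layer_sizes = [784, 64, 64, 10]  # n_input, n_dense_1, n_dense_2, n_classes

total = 0
for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    n_params = fan_in * fan_out + fan_out  # weight matrix plus bias vector
    total += n_params
    print(fan_in, "->", fan_out, ":", n_params, "parameters")

print("total:", total)  # 50240 + 4160 + 650 = 55050
```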

#### Define placeholder Tensors for inputs and labels

In [6]:
x_placeholder = tf.placeholder(tf.float32, [None, n_input])
y_placeholder = tf.placeholder(tf.int32, [None, n_classes])
batch_size = tf.placeholder(tf.int64)

#### Create TensorFlow data sets

In [7]:
mnist_data = tf.data.Dataset.from_tensor_slices((x_placeholder, y_placeholder)).repeat().batch(batch_size)

#### Initialize iterator for data

In [8]:
iterator = mnist_data.make_initializable_iterator()
x, y = iterator.get_next()
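
To build intuition for the `.repeat().batch(batch_size)` pipeline and the iterator that reads from it, here is a plain-Python sketch of the same idea using a generator. It is only an analogy, not how `tf.data` is implemented; `repeat_then_batch` and the ten-element toy dataset are hypothetical.

```python
from itertools import cycle, islice

def repeat_then_batch(examples, batch_size):
    """Yield batches forever from an endlessly repeated sequence of examples."""
    stream = cycle(examples)  # plays the role of .repeat()
    while True:
        # plays the role of .batch(batch_size)
        yield list(islice(stream, batch_size))

# Toy dataset of 10 examples, batch size 4.
batches = repeat_then_batch(list(range(10)), batch_size=4)
first_three = [next(batches) for _ in range(3)]
print(first_three)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 0, 1]]

# Number of batches per epoch for 60000 training images and a batch size of 128.
n_batches = int(round(60000 / 128))
print(n_batches)  # 469
```

Because repetition happens before batching, a batch can straddle the end of one pass over the data and the start of the next, as the third toy batch does. With 60,000 training images and a batch size of 128, an "epoch" of 469 batches is therefore only approximately one pass over the data.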

#### Define types of layers

In [9]:
# dense layer with ReLU activation:
def dense(x, W, b):
    z = tf.add(tf.matmul(x, W), b)
    a = tf.nn.relu(z)
    return a
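
The same computation can be checked by hand on a tiny example. The following NumPy sketch mirrors the `dense` function above; `dense_np` and the demo arrays are illustrative values chosen so the arithmetic is easy to follow.

```python
import numpy as np

def dense_np(x, W, b):
    """NumPy counterpart of `dense`: affine transform followed by ReLU."""
    z = x @ W + b
    return np.maximum(z, 0.0)

x_demo = np.array([[1.0, -2.0]])          # one example with two features
W_demo = np.array([[1.0, -1.0, 0.5],
                   [0.5,  1.0, 0.0]])     # two inputs, three neurons
b_demo = np.array([1.0, 0.5, 1.0])

# z = [1.0, -2.5, 1.5]; ReLU zeroes out the negative entry.
a_demo = dense_np(x_demo, W_demo, b_demo)
print(a_demo)  # [[1.  0.  1.5]]
```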

#### Design neural network architecture

In [10]:
def network(x, weights, biases):
    
    # two dense hidden layers: 
    dense_1 = dense(x, weights['W1'], biases['b1'])
    dense_2 = dense(dense_1, weights['W2'], biases['b2'])
    
    # linear output layer: returns logits (softmax is applied inside the loss)
    out_layer_z = tf.add(tf.matmul(dense_2, weights['W_out']), biases['b_out'])
    
    return out_layer_z
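
Note that `network` returns raw scores (logits) rather than probabilities. As a minimal sketch of how logits relate to class probabilities, the NumPy code below applies a row-wise softmax; the `softmax` helper and `logits_demo` values are illustrative and not part of the notebook.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max keeps exp() from overflowing."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

logits_demo = np.array([[2.0, 1.0, 0.1],
                        [0.0, 0.0, 0.0]])
probs = softmax(logits_demo)

print(probs.sum(axis=1))  # each row sums to 1
print(probs.argmax(axis=1), logits_demo.argmax(axis=1))
```

Softmax preserves the ordering of the scores within a row, so the predicted class can be read directly from the logits with `argmax`.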

#### Define dictionaries for storing weights and biases for each layer -- and initialize

In [11]:
bias_dict = {
    'b1': tf.Variable(tf.zeros([n_dense_1])), 
    'b2': tf.Variable(tf.zeros([n_dense_2])),
    'b_out': tf.Variable(tf.zeros([n_classes]))
}

weight_dict = {
    'W1': tf.get_variable('W1', [n_input, n_dense_1], initializer=weight_initializer),
    'W2': tf.get_variable('W2', [n_dense_1, n_dense_2], initializer=weight_initializer),
    'W_out': tf.get_variable('W_out', [n_dense_2, n_classes], initializer=weight_initializer)
}

#### Build model

In [12]:
predictions = network(x, weights=weight_dict, biases=bias_dict)

#### Define model's loss and its optimizer

In [18]:
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=predictions, labels=y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=lr).minimize(cost)
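
To make the loss and the update rule concrete, here is a NumPy sketch of mean softmax cross-entropy computed from logits, followed by a single gradient descent step. For simplicity it treats the logits themselves as the quantity being updated, which is an assumption made for illustration only; in the notebook the optimizer updates the weights and biases. The helper `softmax_cross_entropy` and the demo arrays are hypothetical.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy between one-hot labels and softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(np.mean(-(labels * log_probs).sum(axis=1)))

# Four examples, ten classes, and an untrained model that scores all classes equally.
uniform_logits = np.zeros((4, 10))
labels_demo = np.eye(10)[[3, 1, 4, 1]]

loss = softmax_cross_entropy(uniform_logits, labels_demo)
print(round(loss, 3))  # log(10), about 2.303

# Gradient of the mean loss with respect to the logits: (softmax - labels) / N.
probs = np.exp(uniform_logits) / np.exp(uniform_logits).sum(axis=1, keepdims=True)
grad = (probs - labels_demo) / len(labels_demo)

# One gradient descent step with the notebook's learning rate of 0.1.
updated_logits = uniform_logits - 0.1 * grad
new_loss = softmax_cross_entropy(updated_logits, labels_demo)
print(new_loss < loss)  # True
```

A loss near log(10) is what a ten-class model that guesses uniformly achieves, which gives a useful baseline for reading the training log.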

#### Define evaluation metrics

In [14]:
# calculate accuracy by identifying test cases where the model's highest-probability class matches the true y label: 
correct_prediction = tf.equal(tf.argmax(predictions, 1), tf.argmax(y, 1))
accuracy_pct = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) * 100
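
The same accuracy calculation can be traced on a handful of examples in NumPy. The arrays below are made-up values for illustration.

```python
import numpy as np

# Logits for four examples over three classes (illustrative values).
logits_demo = np.array([[0.1, 2.0, 0.3],
                        [1.5, 0.2, 0.1],
                        [0.3, 0.2, 0.9],
                        [0.8, 0.1, 0.4]])
labels_demo = np.eye(3)[[1, 0, 2, 2]]  # one-hot true classes

# Predicted classes are [1, 0, 2, 0]; true classes are [1, 0, 2, 2].
correct = logits_demo.argmax(axis=1) == labels_demo.argmax(axis=1)
accuracy_pct_demo = correct.astype('float32').mean() * 100
print(accuracy_pct_demo)  # 75.0
```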

#### Create op for variable initialization

In [15]:
initializer_op = tf.global_variables_initializer()

#### Train the network in a session

In [19]:
with tf.Session() as session:
    # Initialize the session variables
    session.run(initializer_op)
    # Initialize the dataset iterator
    session.run(iterator.initializer, feed_dict={x_placeholder: X_train,
                                                 y_placeholder: y_train,
                                                 batch_size: train_batch_size})
    
    print("Training for", epochs, "epochs.")
    
    # loop over epochs: 
    for epoch in range(epochs):
        avg_cost = 0.0 # track cost to monitor performance during training
        avg_accuracy_pct = 0.0
        n_batches = int(round(X_train.shape[0] / train_batch_size))
        
        # Loop over batches
        for batch in range(n_batches):
            # pull the next batch from the iterator, run optimization, and fetch cost and accuracy: 
            _, batch_cost, batch_acc = session.run([optimizer, cost, accuracy_pct])
            
            # accumulate mean loss and accuracy over epoch: 
            avg_cost += batch_cost / n_batches
            avg_accuracy_pct += batch_acc / n_batches
            
        # output logs at end of each epoch of training:
        print("Epoch ", '%03d' % (epoch+1), 
              ": cost = ", '{:.3f}'.format(avg_cost), 
              ", accuracy = ", '{:.2f}'.format(avg_accuracy_pct), "%", 
              sep='')
    
    print("Training Complete. Testing Model.\n")
    
    # Re-initialize the dataset iterator with the test data
    session.run(iterator.initializer, feed_dict={x_placeholder: X_test,
                                                 y_placeholder: y_test,
                                                 batch_size: X_test.shape[0]})
    
    # Fetch the cost and accuracy on the test data; the optimizer op is not run, so no weights are updated
    test_cost, test_accuracy_pct = session.run([cost, accuracy_pct])
    
    # Print the final results
    print("Test Cost:", '{:.3f}'.format(test_cost))
    print("Test Accuracy: ", '{:.2f}'.format(test_accuracy_pct), "%", sep='')

Training for 20 epochs.
Epoch 001: cost = 0.477, accuracy = 86.41%
Epoch 002: cost = 0.237, accuracy = 93.05%
Epoch 003: cost = 0.181, accuracy = 94.62%
Epoch 004: cost = 0.147, accuracy = 95.69%
Epoch 005: cost = 0.124, accuracy = 96.38%
Epoch 006: cost = 0.108, accuracy = 96.89%
Epoch 007: cost = 0.096, accuracy = 97.21%
Epoch 008: cost = 0.086, accuracy = 97.54%
Epoch 009: cost = 0.077, accuracy = 97.78%
Epoch 010: cost = 0.071, accuracy = 97.96%
Epoch 011: cost = 0.065, accuracy = 98.13%
Epoch 012: cost = 0.059, accuracy = 98.31%
Epoch 013: cost = 0.055, accuracy = 98.48%
Epoch 014: cost = 0.051, accuracy = 98.61%
Epoch 015: cost = 0.047, accuracy = 98.71%
Epoch 016: cost = 0.043, accuracy = 98.84%
Epoch 017: cost = 0.040, accuracy = 98.96%
Epoch 018: cost = 0.037, accuracy = 99.03%
Epoch 019: cost = 0.034, accuracy = 99.11%
Epoch 020: cost = 0.032, accuracy = 99.21%
Training Complete. Testing Model.

Test Cost: 0.088
Test Accuracy: 97.31%
