
### Recall that the MNIST dataset contains 70,000 handwritten digit images, each 28×28 pixels.

In [0]:
# To support both Python 2 and Python 3
from __future__ import division, print_function, unicode_literals

# Common imports
import numpy as np
import os

# To make this notebook's output stable across runs.
# Note: `tf` is imported in the next cell, so call reset_graph() only after that import.
def reset_graph(seed=42):
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)

# To plot pretty figures
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
plt.rcParams['axes.labelsize'] = 14
plt.rcParams['xtick.labelsize'] = 12
plt.rcParams['ytick.labelsize'] = 12

# Where to save the figures
PROJECT_ROOT_DIR = "."
CHAPTER_ID = "rnn"

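def save_fig(fig_id, tight_layout=True):
    fig_dir = os.path.join(PROJECT_ROOT_DIR, "images", CHAPTER_ID)
    if not os.path.isdir(fig_dir):
        os.makedirs(fig_dir)  # create the output directory on first use
    path = os.path.join(fig_dir, fig_id + ".png")
    print("Saving figure", fig_id)
    if tight_layout:
        plt.tight_layout()
    plt.savefig(path, format='png', dpi=300)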

In [0]:
import tensorflow as tf

Note: we will use `tf.layers.dense()` instead of `tensorflow.contrib.layers.fully_connected()`. The `dense()` function is almost identical to the `fully_connected()` function. The main relevant differences are:
* several parameters are renamed: `scope` becomes `name`, `activation_fn` becomes `activation` (and similarly the `_fn` suffix is removed from other parameters such as `normalizer_fn`), `weights_initializer` becomes `kernel_initializer`, etc.
* the default `activation` is now `None` rather than `tf.nn.relu`.
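
As a rough illustration of the second point, the sketch below uses plain NumPy (not TensorFlow) to show what a dense layer computes when no activation is applied: just the affine map `inputs @ kernel + bias`. The helpers `dense` and `relu` and all shapes are hypothetical, chosen only for illustration.

```python
import numpy as np

def dense(inputs, kernel, bias, activation=None):
    """Hypothetical NumPy stand-in for a fully connected layer."""
    z = inputs @ kernel + bias
    return z if activation is None else activation(z)

def relu(z):
    return np.maximum(z, 0.0)

rng = np.random.RandomState(42)
x = rng.randn(4, 150)          # batch of 4 feature vectors
W = rng.randn(150, 10) * 0.1   # kernel
b = np.zeros(10)               # bias

linear_out = dense(x, W, b)                  # activation=None: raw scores, can be negative
relu_out = dense(x, W, b, activation=relu)   # ReLU clips negative scores to zero

print(linear_out.shape)
print((linear_out < 0).any(), (relu_out < 0).any())
```

This matters for the models below: the output layer should produce raw logits, so leaving `activation` at its default of `None` is what we want there.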

### First, let's train a single-layer RNN with 150 neurons...

In [0]:
reset_graph()

n_steps = 28
n_inputs = 28
n_neurons = 150
n_outputs = 10   # predict digits 0-9

learning_rate = 0.001
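
With `n_steps = 28` and `n_inputs = 28`, each image is treated as a sequence of 28 rows, one row of 28 pixels per time step. The NumPy sketch below shows the same reshape the notebook applies to each batch, using a made-up array (`flat_batch`) in place of real MNIST data.

```python
import numpy as np

n_steps, n_inputs = 28, 28

# Stand-in for a batch of 5 flattened MNIST images (784 pixels each).
flat_batch = np.arange(5 * n_steps * n_inputs, dtype=np.float32).reshape(5, -1)

# Same reshape the notebook applies before feeding the RNN.
seq_batch = flat_batch.reshape((-1, n_steps, n_inputs))
print(flat_batch.shape, "->", seq_batch.shape)

# Time step t of image i is simply row t of that image.
i, t = 2, 7
row = flat_batch[i, t * n_inputs:(t + 1) * n_inputs]
print(np.array_equal(seq_batch[i, t], row))
```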

In [0]:
# Each image is fed as a sequence of n_steps rows with n_inputs pixels per row.
X = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
y = tf.placeholder(tf.int32, [None])

basic_cell = tf.contrib.rnn.BasicRNNCell(num_units=n_neurons)
outputs, states = tf.nn.dynamic_rnn(basic_cell, X, dtype=tf.float32)

# Classify from the final state, i.e. the cell's state after the last row.
logits = tf.layers.dense(states, n_outputs)
xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits)
loss = tf.reduce_mean(xentropy)
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(loss)
correct = tf.nn.in_top_k(logits, y, 1)
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
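
To make the graph above more concrete, here is a minimal NumPy sketch of the forward pass it describes: a basic tanh RNN unrolled over the 28 rows, starting from a zero state, followed by a linear layer on the final state. The function `rnn_forward`, the random weights, and the omission of the output bias are simplifications for illustration; this is not how TensorFlow implements `dynamic_rnn`.

```python
import numpy as np

def rnn_forward(X, Wx, Wh, b):
    """Unroll a basic tanh RNN; return all outputs and the final state."""
    batch_size, n_steps, _ = X.shape
    h = np.zeros((batch_size, Wh.shape[0]))   # zero initial state
    outputs = []
    for t in range(n_steps):
        # New state mixes the current input row with the previous state.
        h = np.tanh(X[:, t, :] @ Wx + h @ Wh + b)
        outputs.append(h)
    return np.stack(outputs, axis=1), h

rng = np.random.RandomState(42)
n_steps, n_inputs, n_neurons, n_outputs = 28, 28, 150, 10

X = rng.rand(4, n_steps, n_inputs)            # 4 fake "images"
Wx = rng.randn(n_inputs, n_neurons) * 0.1
Wh = rng.randn(n_neurons, n_neurons) * 0.1
b = np.zeros(n_neurons)

outputs, final_state = rnn_forward(X, Wx, Wh, b)

W_out = rng.randn(n_neurons, n_outputs) * 0.1
logits = final_state @ W_out                  # output layer (bias omitted for brevity)

print(outputs.shape, final_state.shape, logits.shape)
```

For a basic cell like this, the final state is the same as the output at the last time step, which is why the classifier can be attached to `states` directly.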

In [0]:
init = tf.global_variables_initializer()

In [7]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/")
X_test = mnist.test.images.reshape((-1, n_steps, n_inputs))  # shape of (10000, 28, 28)
y_test = mnist.test.labels                                   # shape of (10000,)

Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting /tmp/data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz


In [8]:
n_epochs = 100
batch_size = 150

with tf.Session() as sess:
    init.run()
    for epoch in range(n_epochs):
        for iteration in range(mnist.train.num_examples // batch_size):
            X_batch, y_batch = mnist.train.next_batch(batch_size)
            X_batch = X_batch.reshape((-1, n_steps, n_inputs))
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})

        # Train accuracy is measured on the last mini-batch only;
        # test accuracy is measured on the full test set.
        acc_train = accuracy.eval(feed_dict={X: X_batch, y: y_batch})
        acc_test = accuracy.eval(feed_dict={X: X_test, y: y_test})

        print(epoch, "Train accuracy:", acc_train, "Test accuracy:", acc_test)

0 Train accuracy: 0.93333334 Test accuracy: 0.9311
1 Train accuracy: 0.96666664 Test accuracy: 0.9522
2 Train accuracy: 0.97333336 Test accuracy: 0.9579
3 Train accuracy: 0.96666664 Test accuracy: 0.9625
4 Train accuracy: 0.97333336 Test accuracy: 0.9645
5 Train accuracy: 0.9866667 Test accuracy: 0.9679
6 Train accuracy: 0.96 Test accuracy: 0.9626
7 Train accuracy: 0.98 Test accuracy: 0.9718
8 Train accuracy: 0.94666666 Test accuracy: 0.9692
9 Train accuracy: 0.98 Test accuracy: 0.9714
10 Train accuracy: 0.9866667 Test accuracy: 0.9758
11 Train accuracy: 0.96 Test accuracy: 0.9745
12 Train accuracy: 0.98 Test accuracy: 0.972
13 Train accuracy: 0.9866667 Test accuracy: 0.9737
14 Train accuracy: 0.98 Test accuracy: 0.9709
15 Train accuracy: 1.0 Test accuracy: 0.9752
16 Train accuracy: 0.99333334 Test accuracy: 0.9773
17 Train accuracy: 0.9866667 Test accuracy: 0.9733
18 Train accuracy: 0.98 Test accuracy: 0.9663
19 Train accuracy: 0.9866667 Test accuracy: 0.979
20 Train accuracy: 0.97333
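
The train accuracy above is computed on a single batch of 150 images, so it moves in steps of 1/150 (for example, 0.96666664 is about 145/150) and is much noisier than the test accuracy. As a rough guide to what the loss and accuracy nodes compute, here is a NumPy sketch with hypothetical helpers `sparse_softmax_xentropy` and `top1_accuracy` and made-up logits. It ignores how `tf.nn.in_top_k` handles ties.

```python
import numpy as np

def sparse_softmax_xentropy(logits, labels):
    """Per-example cross-entropy from raw logits and integer class labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]

def top1_accuracy(logits, labels):
    """Fraction of examples whose highest-scoring class matches the label."""
    return np.mean(logits.argmax(axis=1) == labels)

# Three made-up examples with three classes each.
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 3.0],
                   [1.5, 1.0, 0.0]])
labels = np.array([0, 2, 1])

loss = sparse_softmax_xentropy(logits, labels).mean()
acc = top1_accuracy(logits, labels)
print("loss:", loss, "accuracy:", acc)
```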

### Test accuracy approaches 98% (97.9% at epoch 19 in the output shown above)...not bad! We could likely improve this with hyperparameter tuning and some dropout.

### Let's now train a multi-layer RNN for the MNIST dataset...

In [0]:
reset_graph()

n_steps = 28
n_inputs = 28
n_outputs = 10
n_neurons = 100
n_layers = 3

learning_rate = 0.001

In [0]:
X = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
y = tf.placeholder(tf.int32, [None])

layers = [tf.contrib.rnn.BasicRNNCell(num_units=n_neurons, activation=tf.nn.relu)
          for layer in range(n_layers)]
multi_layer_cell = tf.contrib.rnn.MultiRNNCell(layers)
outputs, states = tf.nn.dynamic_rnn(multi_layer_cell, X, dtype=tf.float32)

In [0]:
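# `states` is a tuple with one final state per layer; concatenate them
# into a single [batch_size, n_layers * n_neurons] tensor.
states_concat = tf.concat(axis=1, values=states)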
logits = tf.layers.dense(states_concat, n_outputs)
xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits)
loss = tf.reduce_mean(xentropy)
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(loss)
correct = tf.nn.in_top_k(logits, y, 1)
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
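
The stacked model can be sketched in NumPy in the same spirit: at each time step, every layer updates its own state, and its output becomes the input of the layer above. The function `stacked_rnn_forward` and the random weights are hypothetical; the point is only to show why concatenating the three final states yields 300 features per image.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def stacked_rnn_forward(X, params):
    """Run a stack of basic ReLU RNN layers.

    Returns the top layer's outputs and the list of per-layer final states.
    """
    batch_size, n_steps, _ = X.shape
    states = [np.zeros((batch_size, Wh.shape[0])) for _, Wh, _ in params]
    top_outputs = []
    for t in range(n_steps):
        layer_input = X[:, t, :]
        for k, (Wx, Wh, b) in enumerate(params):
            states[k] = relu(layer_input @ Wx + states[k] @ Wh + b)
            layer_input = states[k]   # output of layer k feeds layer k + 1
        top_outputs.append(layer_input)
    return np.stack(top_outputs, axis=1), states

rng = np.random.RandomState(42)
n_steps, n_inputs, n_neurons, n_layers = 28, 28, 100, 3

params = []
for k in range(n_layers):
    in_dim = n_inputs if k == 0 else n_neurons   # only layer 0 sees raw pixels
    params.append((rng.randn(in_dim, n_neurons) * 0.05,
                   rng.randn(n_neurons, n_neurons) * 0.05,
                   np.zeros(n_neurons)))

X = rng.rand(4, n_steps, n_inputs)
outputs, states = stacked_rnn_forward(X, params)
states_concat = np.concatenate(states, axis=1)   # one row of 3 * 100 features per image

print(outputs.shape, states_concat.shape)
```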

In [0]:
init = tf.global_variables_initializer()

In [26]:
n_epochs = 10
batch_size = 150

with tf.Session() as sess:
    init.run()
    for epoch in range(n_epochs):
        for iteration in range(mnist.train.num_examples // batch_size):
            X_batch, y_batch = mnist.train.next_batch(batch_size)
            X_batch = X_batch.reshape((-1, n_steps, n_inputs))
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
        acc_train = accuracy.eval(feed_dict={X: X_batch, y: y_batch})
        acc_test = accuracy.eval(feed_dict={X: X_test, y: y_test})
        print(epoch, "Train accuracy:", acc_train, "Test accuracy:", acc_test)

0 Train accuracy: 0.96 Test accuracy: 0.9367
1 Train accuracy: 0.9866667 Test accuracy: 0.963
2 Train accuracy: 0.9533333 Test accuracy: 0.9639
3 Train accuracy: 0.96666664 Test accuracy: 0.9732
4 Train accuracy: 0.98 Test accuracy: 0.9748
5 Train accuracy: 0.96666664 Test accuracy: 0.9756
6 Train accuracy: 0.98 Test accuracy: 0.9803
7 Train accuracy: 0.9866667 Test accuracy: 0.9802
8 Train accuracy: 0.99333334 Test accuracy: 0.982
9 Train accuracy: 0.99333334 Test accuracy: 0.9829
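
For a sense of scale, the two networks in this notebook can be compared by trainable parameter count. The arithmetic below assumes the usual parameterization of a basic RNN cell (one weight matrix over the concatenated input and previous state, plus one bias per neuron); `basic_rnn_params` and `dense_params` are hypothetical helpers, not TensorFlow functions.

```python
def basic_rnn_params(n_inputs, n_neurons):
    """Weights from the input and the previous state, plus one bias per neuron."""
    return (n_inputs + n_neurons) * n_neurons + n_neurons

def dense_params(n_in, n_out):
    """Kernel plus bias of a fully connected layer."""
    return n_in * n_out + n_out

# Single-layer network: one 150-unit cell, dense layer on its final state.
single = basic_rnn_params(28, 150) + dense_params(150, 10)

# Three-layer network: 100-unit cells, dense layer on the 3 concatenated states.
multi = (basic_rnn_params(28, 100)
         + 2 * basic_rnn_params(100, 100)
         + dense_params(3 * 100, 10))

print("single-layer:", single)
print("multi-layer:", multi)
```

Under these assumptions the three-layer model has roughly twice as many parameters as the single-layer one.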


### Again, we achieve over 98% accuracy on the test set (98.29% at the final epoch), this time after only 10 epochs!