# Recurrent Neural Network
In this tutorial, we implement a recurrent neural network (RNN) to classify the MNIST dataset.

## Import library and load MNIST dataset
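The MNIST dataset has 55,000 training examples and 10,000 test examples. Each image is stored as a flat vector of 784 pixels (28x28). We reshape each image to 28x28 so that the network can read it as a sequence of 28 rows with 28 pixels each.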

In [1]:
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf


# one_hot=False keeps the labels as integer class ids (0-9)
mnist = input_data.read_data_sets("mnist/", one_hot=False)
# reshape each flat 784-pixel image into 28 rows (time steps) of 28 pixels
x_train = mnist.train.images.reshape(-1, 28, 28)
y_train = mnist.train.labels
x_test = mnist.test.images.reshape(-1, 28, 28)
y_test = mnist.test.labels

print("x_train:  %s" % (x_train.shape,))
print("y_train:  %s" % (y_train.shape,))
print("x_test:  %s" % (x_test.shape,))
print("y_test:  %s" % (y_test.shape,))

Extracting mnist/train-images-idx3-ubyte.gz
Extracting mnist/train-labels-idx1-ubyte.gz
Extracting mnist/t10k-images-idx3-ubyte.gz
Extracting mnist/t10k-labels-idx1-ubyte.gz
x_train:  (55000, 28, 28)
y_train:  (55000,)
x_test:  (10000, 28, 28)
y_test:  (10000,)
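
The reshape is what turns an image into a sequence: each 784-value vector becomes 28 rows of 28 pixels, and the network reads one row per time step. A minimal NumPy sketch of that view, using a synthetic array in place of real MNIST data (the array `flat` and its values are made up for illustration):

```python
import numpy as np

# Synthetic stand-in for two flattened MNIST images (values are not real pixels).
flat = np.arange(2 * 784, dtype=np.float32).reshape(2, 784)

# Same reshape as in the loading cell: (batch, 784) -> (batch, 28, 28).
seq = flat.reshape(-1, 28, 28)

# Row t of an image is the input at time step t.
first_step = seq[:, 0, :]   # pixels 0..27 of each image
last_step = seq[:, -1, :]   # pixels 756..783 of each image

print(seq.shape)
print(first_step.shape)
```

Reading the image row by row is only one possible way to serialize it; it is the choice this tutorial makes.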


## Hyper-parameter settings

In [2]:
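seq_length = 28   # number of time steps (image rows)
dim_input = 28    # input size per time step (pixels per row)
dim_hidden = 32   # number of LSTM hidden units
dim_out = 10      # number of digit classes
num_layers = 1    # number of stacked LSTM layers
keep_prob = 0.7   # keep probability for dropout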

## Define and construct model

Now we will develop a recurrent neural network with an LSTM cell. The network reads an image row by row and classifies it from the output at the last time step.

In [3]:
def recurrent_network(x, mode='train'):
    # x: [batch_size, seq_length, dim_input]
    with tf.variable_scope('lstm') as scope:
        lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=dim_hidden)
        # apply dropout to the LSTM outputs only while training
        if mode == 'train':
            lstm_cell = tf.nn.rnn_cell.DropoutWrapper(cell=lstm_cell, output_keep_prob=keep_prob)
        lstm_cell = tf.nn.rnn_cell.MultiRNNCell(cells=[lstm_cell]*num_layers, state_is_tuple=True)
        # outputs: [batch_size, seq_length, dim_hidden]
        outputs, states = tf.nn.dynamic_rnn(cell=lstm_cell, inputs=x, dtype=tf.float32, scope=scope)

    with tf.variable_scope('logits'):
        w = tf.get_variable('w', shape=[dim_hidden, dim_out], initializer=tf.random_normal_initializer())
        b = tf.get_variable('b', shape=[dim_out], initializer=tf.constant_initializer(0.0))
        # classify from the output at the last time step: [batch_size, dim_out]
        out = tf.matmul(tf.reshape(outputs[:, -1, :], [-1, dim_hidden]), w) + b
        return out
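
The readout at the end of `recurrent_network` can be sketched in NumPy to make the shapes explicit. The random arrays below are stand-ins for the real LSTM outputs and weights; only the shapes follow the tutorial.

```python
import numpy as np

batch, seq_length, dim_hidden, dim_out = 4, 28, 32, 10
rng = np.random.RandomState(0)

# Stand-in for the dynamic_rnn outputs: one hidden vector per time step.
outputs = rng.randn(batch, seq_length, dim_hidden)
w = rng.randn(dim_hidden, dim_out)
b = np.zeros(dim_out)

last = outputs[:, -1, :]    # hidden state after the final row: (batch, dim_hidden)
logits = last.dot(w) + b    # one score per digit class: (batch, dim_out)

print(last.shape, logits.shape)
```

The slice already has shape `[batch, dim_hidden]`, so the `tf.reshape` in the model does not change the values; it only makes the static shape explicit before the matrix multiplication.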

In [4]:
x = tf.placeholder(dtype=tf.float32, shape=[None, 28, 28], name='inputs')
y = tf.placeholder(dtype=tf.int64, shape=[None], name='labels')
# build the graph twice over the same weights: with dropout for training, without for testing
out = recurrent_network(x, mode='train')
tf.get_variable_scope().reuse_variables()
out2 = recurrent_network(x, mode='test')

# keyword arguments avoid depending on the positional order of logits and labels
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=out, labels=y))
train_op = tf.train.RMSPropOptimizer(learning_rate=0.01).minimize(loss)

pred = tf.argmax(out2, 1)  # predicted class per example, shape [batch_size]

correct_pred = tf.equal(pred, y)
incorrect_pred = tf.not_equal(pred, y)

accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
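
As a rough illustration of what the loss and accuracy ops compute, here is a NumPy sketch of sparse softmax cross-entropy on a tiny hand-made batch. The function `sparse_softmax_cross_entropy` and the example logits are hypothetical; they mirror the math, not TensorFlow's implementation.

```python
import numpy as np

def sparse_softmax_cross_entropy(logits, labels):
    """Per-example cross-entropy between softmax(logits) and integer class labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]

logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 0.1, 3.0],
                   [1.0, 1.0, 1.0]])
labels = np.array([0, 2, 1])   # integer class ids, as with one_hot=False

per_example = sparse_softmax_cross_entropy(logits, labels)
loss = per_example.mean()              # counterpart of tf.reduce_mean over the batch
pred = logits.argmax(axis=1)           # counterpart of tf.argmax(out2, 1)
accuracy = (pred == labels).mean()
print(loss, accuracy)
```

"Sparse" here means the labels are integer class ids rather than one-hot vectors, which is why the labels placeholder has shape `[None]`.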

In [6]:
for var in tf.trainable_variables():
    print("%s %s" % (var.op.name, var.get_shape()))

lstm/MultiRNNCell/Cell0/BasicLSTMCell/Linear/Matrix (60, 128)
lstm/MultiRNNCell/Cell0/BasicLSTMCell/Linear/Bias (128,)
logits/w (32, 10)
logits/b (10,)
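
The shapes printed above follow from how an LSTM cell is parameterized: the input (28) and the previous hidden state (32) are concatenated, and a single matrix produces all four gate pre-activations (4 x 32). The NumPy sketch below reproduces that arithmetic and runs one step. The gate order `(i, j, f, o)` and the `forget_bias` of 1.0 are assumptions of this sketch, and `lstm_step` is a hypothetical helper, not the library code.

```python
import numpy as np

dim_input, dim_hidden = 28, 32

# Input and previous hidden state are concatenated; the four gates share one matrix.
matrix_shape = (dim_input + dim_hidden, 4 * dim_hidden)   # (60, 128)
bias_shape = (4 * dim_hidden,)                            # (128,)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b, forget_bias=1.0):
    """One LSTM step; the gate order (i, j, f, o) is an assumption of this sketch."""
    z = np.concatenate([x_t, h_prev], axis=1).dot(W) + b
    i, j, f, o = np.split(z, 4, axis=1)
    c = c_prev * sigmoid(f + forget_bias) + sigmoid(i) * np.tanh(j)
    h = np.tanh(c) * sigmoid(o)
    return h, c

rng = np.random.RandomState(0)
W = rng.randn(*matrix_shape) * 0.1
b = np.zeros(bias_shape)
h = np.zeros((2, dim_hidden))
c = np.zeros((2, dim_hidden))
x_t = rng.randn(2, dim_input)   # one image row per example
h, c = lstm_step(x_t, h, c, W, b)

# LSTM parameters only: 60 * 128 weights plus 128 biases.
num_params = matrix_shape[0] * matrix_shape[1] + bias_shape[0]
print(matrix_shape, bias_shape, num_params)
```

Together with the 32 x 10 weights and 10 biases of the `logits` layer, the whole model has 7808 + 330 = 8138 trainable parameters.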


## Train and test the model

With the graph defined, we now create a session in which the model is actually trained and tested on the MNIST dataset.
#### Train
First, we initialize all the variables created above by running `tf.initialize_all_variables()`. The key line of the training phase is `sess.run([train_op, loss], feed_dict={x: x_batch, y: y_batch})`. It feeds one mini-batch into the placeholders and runs `train_op`, which updates the variables once with `tf.train.RMSPropOptimizer`. The loss is evaluated in the same call so that we can print the average loss for each epoch.
#### Test
The testing phase is simpler. We run `sess.run(accuracy, feed_dict={x: x_batch, y: y_batch})` on each test batch and print the average over all batches as the test accuracy. Because `accuracy` is computed from `out2`, dropout is disabled during testing.

In [5]:
batch_size = 100
num_epoch = 5
num_iter_per_epoch = int(x_train.shape[0] / batch_size)


config = tf.ConfigProto()
config.gpu_options.allow_growth = True
with tf.Session(config=config) as sess:
    # initialize tensor variables
    tf.initialize_all_variables().run()
    for e in range(num_epoch):
        avg_loss = 0.
        
        # loop over all batches
        for i in range(num_iter_per_epoch):
            x_batch = x_train[i*batch_size:(i+1)*batch_size]
            y_batch = y_train[i*batch_size:(i+1)*batch_size]
            
            # run optimization op (backprop) and loss op (to get loss value)
            _, c = sess.run([train_op, loss], feed_dict={x: x_batch, y: y_batch})
            # compute average loss
            avg_loss += c / num_iter_per_epoch
        print("Epoch %d, Loss: %.3f" % (e+1, avg_loss))
    print("Finished training!")
    
    num_iter_per_epoch = int(x_test.shape[0] / batch_size)
    test_accuracy = 0.0
    for i in range(num_iter_per_epoch):
        x_batch = x_test[i*batch_size:(i+1)*batch_size]
        y_batch = y_test[i*batch_size:(i+1)*batch_size]
       
        acc = sess.run(accuracy, feed_dict={x: x_batch, y: y_batch})
        test_accuracy += acc

    print ("Test accuracy: %.3f" %(test_accuracy/num_iter_per_epoch))

Epoch 1, Loss: 0.740
Epoch 2, Loss: 0.186
Epoch 3, Loss: 0.136
Epoch 4, Loss: 0.109
Epoch 5, Loss: 0.097
Finished training!
Test accuracy: 0.968
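
The test loop averages per-batch accuracies instead of computing one accuracy over the whole test set. The two agree when every batch has the same size, which holds here because 10,000 is divisible by 100. A small NumPy check on synthetic correctness flags (the 95% rate is made up and unrelated to the result above):

```python
import numpy as np

rng = np.random.RandomState(0)
correct = rng.rand(10000) < 0.95     # synthetic per-example correctness flags
batch_size = 100
num_batches = len(correct) // batch_size

# Accuracy of each batch, as returned by one sess.run(accuracy, ...) call.
batch_acc = [correct[i * batch_size:(i + 1) * batch_size].mean()
             for i in range(num_batches)]

mean_of_batches = np.mean(batch_acc)   # what the test loop reports
overall = correct.mean()               # accuracy over all examples at once
print(mean_of_batches, overall)
```

If the dataset size were not a multiple of `batch_size`, the loop as written would skip the leftover examples, so the reported value would cover only the full batches.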
