In this example, we are going to focus on a Bi-Directional Recurrent Neural Network. This type of recurrent neural network is unique from others becasue instead of keeping recurrent in a forward way it also keeps record of a backwards way. Essentially looking into the future and also looking at the past examples.

In [1]:
import tensorflow as tf
from tensorflow.contrib import rnn
import numpy as np

We are going to get a few things started before we dive deep with deep learning. Before we start we need get the input data and define a few paramaters such as: hyper paramaters, network paramaters, define the weights and define the TF graph.

In [3]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("../data-sets/MNIST/", one_hot=True)
# parameters
learning_rate = 0.001
training_iters = 100000
batch_size = 128
display_step = 10

# network parameters
n_input = 28
n_steps = 28
n_hidden = 128
n_classes = 10

# defince weights & biases
weights = {'out': tf.Variable(tf.random_normal([2*n_hidden, n_classes]))}
biases = {'out': tf.Variable(tf.random_normal([n_classes]))}

# define the graph
x = tf.placeholder("float", [None, n_steps, n_input])
y = tf.placeholder("float", [None, n_classes])

Extracting ../data-sets/MNIST/train-images-idx3-ubyte.gz
Extracting ../data-sets/MNIST/train-labels-idx1-ubyte.gz
Extracting ../data-sets/MNIST/t10k-images-idx3-ubyte.gz
Extracting ../data-sets/MNIST/t10k-labels-idx1-ubyte.gz


Next lets define the Bi-Directional Recurant Neural Network. This RNN cell requires a forward cell and a backwards cell which we will define using the Basic LSTM Cell function. After we define the LSTM cells we must pass it to static bi-directional rnn which will take into account both. Once we get the output we multiple the outputs with the weights and add the bias.

In [4]:
def BiRNN(x, weights, biases):
    """Prepare data and bidirectiona RNN
    Current data: (batch_size, n_steps, n_input)
    required data: 'n_steps' tensor list of shape (batch_size, n_input)
    """
    # permutating batch_size and n_steps
    x = tf.transpose(x, [1, 0, 2])
    # reshape to (n_steps*batch_size, n_input)
    x = tf.reshape(x, [-1, n_input])
    # split to get a list of n_steps tensor of shape (batch_size, n_input)
    x = tf.split(x, n_steps, 0)

    # define lstm cells with tensorflow
    # forward deriction cell
    lstm_fw_cell = rnn.BasicLSTMCell(n_hidden, forget_bias=1.0)
    # backward direction cell
    lstm_bw_cell = rnn.BasicLSTMCell(n_hidden, forget_bias=1.0)

    # get lstm cell output
    try:
        outputs, _, _ = rnn.static_bidirectional_rnn(lstm_fw_cell, lstm_bw_cell, x,
                                                dtype=tf.float32)
    except Exception:
        outputs = rnn.static_bidirectional_rnn(lstm_fw_cell, lstm_bw_cell, x,
                                          dtype=tf.float32)
    return tf.matmul(outputs[-1], weights['out']) + biases['out']

pred = BiRNN(x, weights, biases)

In this section of the code we are going to define a few critical componets of a neural network. We are going to define the cost function and the optimizer function. The cost function in this example is in charge of reducing the mean by doing a softmax on the cross entropy with logits. We are also going to define function to evaluate the neural network. We will define a funttion that let us not wether the prediction is correct and we are also going to define the accurace of the model. After we defin our function we are going to initilize our varibles.

In [None]:
# define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# initializing the varibales
init = tf.global_variables_initializer()

Now we can finally start the training. 

In this section of the code we are going to start the tensorflow session and begin training with batches of the data. First we loop through the number of training interations we want train for. Once in the loop we can get the MNIST data, hand writted digits, in batches. Next, we need to reshape the data to 28 sequences of 28 elements becuase the size of the image is 28x28. Now we are ready to optimize so we feed the image and the answer. 

In the following step we will keep track of our accuracy and loss score. We print to the console every so often we don't flood it with information.

We are finallly done trainin the neural network! But now we should probably test the model. To test the model we will feed it never before seen examples and grade how will the neural network performs. We define the number of images we want to test as well the test data and the label data. Now we finally done with it all!

In [None]:
with tf.Session() as sess:
    sess.run(init)
    step = 1
    while step * batch_size < training_iters:
        # lets get the data in batches
        batch_x, batch_y = mnist.train.next_batch(batch_size)

        # reshape the data to 28 seq of 28 elements to run optimization
        batch_x = batch_x.reshape((batch_size, n_steps, n_input))
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})

        if step % display_step == 0:
            # calculate batch accuracy & loss
            acc = sess.run(cost, feed_dict={x: batch_x, y: batch_y})
            loss = sess.run(cost, feed_dict={x: batch_x, y: batch_y})
            print "Iter " + str(step*batch_size) + ", Minibatch Loss= " + \
                "{:.6f}".format(loss) + ", Training Accuracy= " + \
                "{:.5f}".format(acc)
        step += 1
    print "Optimization Finished!"
  
    # calculate accuracy for 128 test images
    test_len = 128
    test_data = mnist.test.images[:test_len].reshape((-1, n_steps, n_input))
    test_label = mnist.test.labels[:test_len]
    print "Testing Accuracy: ", sess.run(accuracy, feed_dict={x: test_data, y: test_label})

Iter 1280, Minibatch Loss= 2.030882, Training Accuracy= 2.03088
Iter 2560, Minibatch Loss= 1.768116, Training Accuracy= 1.76812
Iter 3840, Minibatch Loss= 1.435340, Training Accuracy= 1.43534
Iter 5120, Minibatch Loss= 1.206275, Training Accuracy= 1.20628
Iter 6400, Minibatch Loss= 1.076237, Training Accuracy= 1.07624
Iter 7680, Minibatch Loss= 1.253329, Training Accuracy= 1.25333
Iter 8960, Minibatch Loss= 0.993062, Training Accuracy= 0.99306
Iter 10240, Minibatch Loss= 0.822306, Training Accuracy= 0.82231
Iter 11520, Minibatch Loss= 0.532800, Training Accuracy= 0.53280
Iter 12800, Minibatch Loss= 0.842915, Training Accuracy= 0.84292
Iter 14080, Minibatch Loss= 0.637548, Training Accuracy= 0.63755
Iter 15360, Minibatch Loss= 0.443063, Training Accuracy= 0.44306
Iter 16640, Minibatch Loss= 0.499470, Training Accuracy= 0.49947
Iter 17920, Minibatch Loss= 0.377550, Training Accuracy= 0.37755
Iter 19200, Minibatch Loss= 0.313469, Training Accuracy= 0.31347
Iter 20480, Minibatch Loss= 0.19