## Training a Sequence Classifier

The code below comes from the book's [GitHub repository](https://github.com/ageron/handson-ml/):

In [6]:
def reset_graph(seed=42):
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)

In [1]:
import numpy as np
import tensorflow as tf
from tensorflow.contrib.layers import fully_connected
from tensorflow.examples.tutorials.mnist import input_data

The dimensions of the RNN:

In [2]:
n_steps   = 28   # One time step per image row.
n_inputs  = 28   # Each row has 28 pixels, so each step takes 28 inputs.
n_neurons = 150  # Number of neurons in the recurrent layer.
n_outputs = 10   # Number of neurons in the fully connected output layer, one per digit.

learning_rate = 0.001
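
To make the "one row per step" idea concrete, here is a minimal NumPy sketch of the reshape that the training loop later applies to each batch. The array `flat_batch` is a made-up stand-in for real MNIST pixels (pixel k simply holds the value k), so it only illustrates how a flat vector of 784 values is split into 28 steps of 28 inputs.

```python
import numpy as np

n_steps, n_inputs = 28, 28

# Hypothetical stand-in for a batch of two flattened 28x28 images:
# pixel k of each image holds the value k.
flat_batch = np.tile(np.arange(n_steps * n_inputs, dtype=np.float32), (2, 1))

# Same reshape as in the training loop: (batch, 784) -> (batch, 28, 28).
seq_batch = flat_batch.reshape((-1, n_steps, n_inputs))

print(seq_batch.shape)      # (2, 28, 28)
print(seq_batch[0, 0, :3])  # [0. 1. 2.]    -> start of row 0, fed at step 0
print(seq_batch[0, 1, :3])  # [28. 29. 30.] -> start of row 1, fed at step 1
```

Row `t` of the image is exactly the slice of 28 consecutive pixels starting at index `28 * t`, which is what the RNN receives at time step `t`.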

Build the RNN:

In [9]:
reset_graph()

X = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
y = tf.placeholder(tf.int32, [None])

# The recurrent layer.
basic_cell = tf.contrib.rnn.BasicRNNCell(num_units=n_neurons)
outputs, states = tf.nn.dynamic_rnn(basic_cell, X, dtype=tf.float32)

# The output layer, fed with the final state of the recurrent layer.
logits = fully_connected(states, n_outputs, activation_fn=None)

# Define our loss measure.
xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits)
loss = tf.reduce_mean(xentropy)

# Declare an optimizer to minimize the loss.
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(loss)

# An accuracy measurement.
correct = tf.nn.in_top_k(logits, y, 1)
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

init = tf.global_variables_initializer()
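
The loss and accuracy nodes above can be hard to picture from the graph definition alone. The following NumPy sketch shows, under simplifying assumptions, what they compute: a softmax cross-entropy taken directly from integer labels, and a top-1 accuracy. The helper names `sparse_softmax_cross_entropy` and `top1_accuracy` and the logits are invented for illustration, and the accuracy helper uses a plain `argmax`, so it may differ from `tf.nn.in_top_k` when logits are tied.

```python
import numpy as np

def sparse_softmax_cross_entropy(logits, labels):
    """Per-example cross-entropy from raw logits and integer class labels."""
    # Subtract the row maximum for numerical stability; softmax is unchanged.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the true class for each example.
    return -log_probs[np.arange(len(labels)), labels]

def top1_accuracy(logits, labels):
    """Fraction of examples whose highest logit is the true class."""
    return float(np.mean(logits.argmax(axis=1) == labels))

# Made-up logits for 3 examples and 4 classes.
logits = np.array([[2.0, 0.5, 0.1, 0.0],
                   [0.1, 0.2, 3.0, 0.3],
                   [1.0, 1.5, 0.2, 0.1]])
labels = np.array([0, 2, 0])

loss = sparse_softmax_cross_entropy(logits, labels).mean()
acc = top1_accuracy(logits, labels)
print(loss, acc)
```

The first two examples are classified correctly and the third is not, so the accuracy here is 2/3.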

The resulting network is shown below (the diagram source is [RNN.xml](RNN.xml), which can be edited in [draw.io](http://draw.io)):

<img src="RNN.svg" />

To keep the diagram readable, not every connection between neurons is drawn.
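
As a rough illustration of what the graph computes, here is a NumPy sketch of the forward pass: a tanh recurrent layer unrolled over the 28 rows, followed by a linear output layer applied to the final state only. The weights are random stand-ins rather than trained values, and the names `Wx`, `Wh`, `Wo`, `b`, `bo` and `rnn_logits` are invented for this sketch. TensorFlow's `BasicRNNCell` stores the input and recurrent weights in one concatenated matrix, which is mathematically equivalent to the two separate matrices used here.

```python
import numpy as np

n_steps, n_inputs, n_neurons, n_outputs = 28, 28, 150, 10
rng = np.random.default_rng(42)

# Randomly initialised stand-ins for the trained weights.
Wx = rng.normal(scale=0.1, size=(n_inputs, n_neurons))   # input  -> hidden
Wh = rng.normal(scale=0.1, size=(n_neurons, n_neurons))  # hidden -> hidden
b = np.zeros(n_neurons)
Wo = rng.normal(scale=0.1, size=(n_neurons, n_outputs))  # hidden -> logits
bo = np.zeros(n_outputs)

def rnn_logits(X):
    """X has shape (batch, n_steps, n_inputs); returns (batch, n_outputs)."""
    h = np.zeros((X.shape[0], n_neurons))   # initial state is all zeros
    for t in range(X.shape[1]):             # one image row per time step
        h = np.tanh(X[:, t, :] @ Wx + h @ Wh + b)
    return h @ Wo + bo                      # only the final state is used

X = rng.random((4, n_steps, n_inputs))      # 4 fake "images"
logits = rnn_logits(X)
n_params = Wx.size + Wh.size + b.size + Wo.size + bo.size
print(logits.shape, n_params)               # (4, 10) 28360
```

The parameter count breaks down as 28 × 150 + 150 × 150 + 150 = 26,850 for the recurrent layer and 150 × 10 + 10 = 1,510 for the output layer.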

Read the MNIST data:

In [4]:
mnist  = input_data.read_data_sets('/tmp/data/')
X_test = mnist.test.images.reshape((-1, n_steps, n_inputs))
y_test = mnist.test.labels

Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz


Train the network and print the accuracy after each epoch:

In [10]:
n_epochs   = 100
batch_size = 150

with tf.Session() as sess:
    init.run()
    for epoch in range(n_epochs):
        for iteration in range(mnist.train.num_examples // batch_size):
            X_batch, y_batch = mnist.train.next_batch(batch_size)
            X_batch = X_batch.reshape((-1, n_steps, n_inputs))
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
        
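        # Training accuracy is measured on the last mini-batch only;
        # test accuracy is measured on the full test set.
        acc_train = accuracy.eval(feed_dict={X: X_batch, y: y_batch})
        acc_test  = accuracy.eval(feed_dict={X: X_test,  y: y_test})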
        print(epoch, "Train accuracy:", acc_train, "Test accuracy:", acc_test)

0 Train accuracy: 0.9 Test accuracy: 0.9225
1 Train accuracy: 0.966667 Test accuracy: 0.9401
2 Train accuracy: 0.96 Test accuracy: 0.9617
3 Train accuracy: 0.96 Test accuracy: 0.951
4 Train accuracy: 0.973333 Test accuracy: 0.9664
5 Train accuracy: 0.98 Test accuracy: 0.9684
6 Train accuracy: 0.986667 Test accuracy: 0.9732
7 Train accuracy: 0.986667 Test accuracy: 0.9713
8 Train accuracy: 0.973333 Test accuracy: 0.9719
9 Train accuracy: 0.986667 Test accuracy: 0.969
10 Train accuracy: 0.973333 Test accuracy: 0.9733
11 Train accuracy: 1.0 Test accuracy: 0.9683
12 Train accuracy: 0.986667 Test accuracy: 0.9717
13 Train accuracy: 0.973333 Test accuracy: 0.9757
14 Train accuracy: 0.993333 Test accuracy: 0.9759
15 Train accuracy: 0.98 Test accuracy: 0.9703
16 Train accuracy: 1.0 Test accuracy: 0.9757
17 Train accuracy: 0.98 Test accuracy: 0.9731
18 Train accuracy: 0.993333 Test accuracy: 0.9771
19 Train accuracy: 0.993333 Test accuracy: 0.9727
20 Train accuracy: 0.986667 Test accuracy: 0.97
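
The training accuracy column is noisier than the test column because it is evaluated on a single batch of 150 images, so it can only take values of the form k/150. The short sketch below works through that arithmetic. It assumes the training split holds 55,000 examples (the size obtained when `read_data_sets` is left at its default validation split); that number is an assumption here, not something printed in the output above.

```python
batch_size = 150
n_train = 55000  # ASSUMPTION: training split size with the default validation split

# Number of full batches drawn per epoch by the loop above.
iterations_per_epoch = n_train // batch_size
remainder = n_train % batch_size
print(iterations_per_epoch, remainder)  # 366 100

# Accuracy on one batch of 150 examples moves in steps of 1/150.
batch_accuracy = {k: round(k / batch_size, 6) for k in (145, 146, 148, 150)}
print(batch_accuracy)
# {145: 0.966667, 146: 0.973333, 148: 0.986667, 150: 1.0}
```

A reported training accuracy of 0.973333 therefore means 146 of the 150 images in the last batch were classified correctly, which is a much coarser estimate than the test accuracy.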