# Classifying MNIST digits with an RNN

Adapted from Aymeric Damien's [Tensorflow tutorials](https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/3_NeuralNetworks/recurrent_network.ipynb)

The same digit classification problem as before, this time with better-organized code and two more Tensorflow features: scoping and Tensorboard.

References:
- [Variable sharing](https://www.tensorflow.org/programmers_guide/variable_scope)
- [Tensorboard basics](https://www.tensorflow.org/get_started/summaries_and_tensorboard)
	- Official tutorial on how to use summaries for later display on Tensorboard. Includes sample code for training and testing a convnet that shows best practices for scoping

## Setting up

In [24]:
import tensorflow as tf
import numpy as np
import os
cur_dir = os.getcwd()  # Summaries for Tensorboard are written under this directory

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


## Params

In [2]:
learning_rate = 0.001
training_iters = 10000
batch_size = 10

n_hidden=50

n_steps= 28
n_input= 28
n_classes= 10
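
To make these sizes concrete: each 28x28 image is read as a sequence of `n_steps = 28` rows, and each row is an `n_input = 28` dimensional vector. Below is a minimal NumPy sketch of that reshaping. It uses random data in place of MNIST, and the array names are illustrative rather than taken from the notebook.

```python
import numpy as np

n_steps, n_input, batch_size = 28, 28, 10

# Stand-in for a batch of flattened MNIST images (784 pixels each).
flat_batch = np.random.rand(batch_size, n_steps * n_input).astype(np.float32)

# Row t of each image becomes the input at time step t.
seq_batch = flat_batch.reshape(batch_size, n_steps, n_input)

# Splitting along axis 1 gives one (batch_size, n_input) array per time step,
# which is the same layout tf.unstack(x, axis=1) produces for the RNN.
steps = [seq_batch[:, t, :] for t in range(n_steps)]

print(seq_batch.shape)             # (10, 28, 28)
print(len(steps), steps[0].shape)  # 28 (10, 28)
```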

## Building Model and data ingestion

In [30]:
tf.reset_default_graph()


x= tf.placeholder(tf.float32, (None, n_steps, n_input), 'input')
y= tf.placeholder(tf.float32, (None, n_classes), 'label')

with tf.variable_scope('rnnlm'):
    W= tf.Variable(tf.random_uniform((n_hidden, n_classes),0,1), name='W')
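    b= tf.Variable(tf.random_uniform((n_classes,),0,1), name='b') # One bias per class, broadcast across the batch
    
inps= tf.unstack(x, num=n_steps, axis=1) # List of n_steps tensors, each (batch_size, n_input)
lstm_cell= tf.contrib.rnn.LSTMCell(n_hidden)
outputs, states= tf.contrib.rnn.static_rnn(lstm_cell, inps, dtype=tf.float32, scope= 'rnnlm')
pred=tf.matmul(outputs[-1],W) + b # (batch_size, n_hidden) x (n_hidden, n_classes) -> (batch_size, n_classes)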
#Forward pass ends here. Fetch pred for test accuracy

with tf.name_scope('loss'):
    loss= tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels= y, logits= pred))
tf.summary.scalar('loss',loss)

with tf.name_scope('train'):
    train_op = tf.train.AdamOptimizer(learning_rate).minimize(loss)

with tf.name_scope('accuracy'):
    correct_pred = tf.equal(tf.argmax(pred,1), tf.argmax(y,1))
    accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
tf.summary.scalar('accuracy',accuracy)

merged = tf.summary.merge_all()


init_op = tf.global_variables_initializer()
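
One detail worth checking in `pred = tf.matmul(outputs[-1], W) + b` is the shape of `b`, because broadcasting decides how the bias is added. Tensorflow follows NumPy-style broadcasting, so the sketch below uses plain NumPy with made-up values. It illustrates the broadcasting rule only and is not the notebook's code.

```python
import numpy as np

n_classes = 10
bias = np.arange(n_classes, dtype=np.float64)  # bias[k] belongs to class k

# Case 1: a bias of shape (n_classes,) is added per class (per column).
logits = np.zeros((4, n_classes))
per_class = logits + bias  # every row equals the bias vector

# Case 2: a bias of shape (n_classes, 1) only broadcasts when the batch size
# happens to equal n_classes, and then it is added per example (per row).
square_logits = np.zeros((n_classes, n_classes))
per_row = square_logits + bias.reshape(-1, 1)  # row i is filled with bias[i]

# With any other batch size the column-shaped bias cannot broadcast at all.
try:
    logits + bias.reshape(-1, 1)
    column_bias_failed = False
except ValueError:
    column_bias_failed = True

print(per_class[2])        # the full bias vector
print(per_row[3])          # ten copies of bias[3]
print(column_bias_failed)  # True
```

A bias of shape `(n_classes,)` is therefore the safer choice: it behaves the same for every batch size.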

## Training

In [46]:
def data_loader(data, batch_size):
    inp,target= data.next_batch(batch_size) # Use the requested batch size so it matches the reshape below
    inp=np.reshape(inp, (batch_size, n_steps, n_input))
    return inp, target

with tf.Session() as sess:
    train_writer = tf.summary.FileWriter(cur_dir+'/train', sess.graph) # Forward slashes work on Windows and Linux
    test_writer = tf.summary.FileWriter(cur_dir+'/test',sess.graph)
    sess.run(init_op) # Initialize all global variables in this session
    for i in range(training_iters):
        inp, target= data_loader(mnist.train, batch_size)
        _, l, acc, summary= sess.run([train_op,loss,accuracy, merged], feed_dict= {x:inp, y:target})
        train_writer.add_summary(summary,i)
        if i%1000==0:
            print(i, l, acc)
    
    print("Now testing")
    accs=[]
    while(mnist.test.epochs_completed==0):
        inp, target= data_loader(mnist.test, batch_size) # Keep the test batch; otherwise the last training batch is re-evaluated
        acc, summary= sess.run([accuracy,merged], feed_dict= {x:inp, y:target})
        accs.append(acc)
        i= len(accs)
        test_writer.add_summary(summary,i)
        if i%100==0:
            print(i, acc)
    print("Accuracy on test set:", np.mean(accs))

0 2.30122 0.0
1000 0.194261 0.9
2000 0.209981 0.9
3000 0.027512 1.0
4000 0.0834822 1.0
5000 0.0079508 1.0
6000 0.00940743 1.0
7000 0.00399902 1.0
8000 0.0100951 1.0
9000 0.090911 0.9
Now testing
Accuracy on test set: nan


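
The `nan` reported above is what `np.mean` returns for an empty list, which is what `accs` would be if the body of the `while` loop never ran. One plausible cause, stated here as an assumption, is that `mnist.test.epochs_completed` was already nonzero from an earlier run of the cell. Below is a sketch of an evaluation loop that does not depend on such a counter. It uses a hypothetical `iterate_once` helper and random stand-in data.

```python
import warnings
import numpy as np

def iterate_once(images, labels, batch_size):
    """Yield consecutive batches that cover the data exactly once."""
    for start in range(0, len(images), batch_size):
        yield images[start:start + batch_size], labels[start:start + batch_size]

# The mean of an empty list is nan (NumPy also emits a RuntimeWarning).
with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)
    empty_mean = np.mean([])

# Stand-in test set: 25 "images" with integer labels.
images = np.random.rand(25, 28, 28)
labels = np.random.randint(0, 10, size=25)
batch_sizes = [len(x) for x, _ in iterate_once(images, labels, 10)]

print(empty_mean, batch_sizes)  # nan [10, 10, 5]
```

Because the helper restarts from the first example on every call, running the evaluation cell again gives the same number of batches each time.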


## Tensorboard Graph Visualization

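Tensorboard makes extensive use of name scopes: ops created inside the same `tf.name_scope` are grouped into a single collapsible node in the graph view. The better you organize your code into name scopes, the more readable the visualization.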

![Tensorboard Graph Visualization](tensorboard-graph.png)

## Notes

- Even when using namespaces, it seems unlikely (from where I'm standing now) that there is a perfect scheme for organizing your code. There's too much criss-crossing of data to make a perfect abstraction. But one must strive...
- Understanding scoping is needed to make use of Tensorboard features, and it is also useful, nay necessary, for organizing large Tensorflow projects.
- A good way to run the model on test data is to use feed_dict to feed the test data but fetch only the predictions. Tensorflow then evaluates only the part of the graph that `pred` depends on, so the loss and optimizer ops are not run.
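
The last note can be taken one step further: once only `pred` is fetched, accuracy can be computed outside the graph. Below is a minimal NumPy sketch. Here `logits` stands in for the fetched predictions, and `accuracy_from_logits` is a hypothetical helper that is not part of the notebook.

```python
import numpy as np

def accuracy_from_logits(logits, one_hot_labels):
    """Fraction of rows whose highest-scoring class matches the label."""
    predicted = np.argmax(logits, axis=1)
    actual = np.argmax(one_hot_labels, axis=1)
    return float(np.mean(predicted == actual))

# Four made-up examples over three classes.
logits = np.array([[2.0, 0.1, 0.3],
                   [0.2, 0.1, 3.0],
                   [0.5, 1.5, 0.2],
                   [1.0, 0.9, 0.8]])
labels = np.eye(3)[[0, 2, 0, 0]]  # one-hot labels for classes 0, 2, 0, 0

print(accuracy_from_logits(logits, labels))  # 0.75
```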