## Tut5 - RNN with variable-length sequences

From: https://gist.github.com/evanthebouncy/8e16148687e807a46e3f#file-tensorflow-rnn-variable-seq-length
https://www.reddit.com/r/MachineLearning/comments/3sok8k/tensorflow_basic_rnn_example_with_variable_length/

The link includes some comments from the original author (reproduced below).

The code does behave as intended.


#### Original Comments:

I've finally gotten a chance to look at recurrence in TensorFlow. The documentation examples are a bit complicated for understanding the bare bones of what is happening. This is the basic example I've come up with for just passing some data through an LSTM with no learning going on; it's useful for understanding how to set things up.

My takeaways from writing this are:
- Getting the inputs in is a little weird, since the recurrence loop is built with a Python loop. Because of this I had to define the input, then use tf.split to break it into discrete timesteps. Split also keeps the dimension you split on, so there is a reshape in there as well. If you aren't comfortable with list comprehensions, they are something you will want to bone up on for TF.
- Variable initialization is an operation you run, not a function you call. This threw me off (coming from Theano): I assumed calling tf.initialize_all_variables() was all I needed, but you have to actually pass the op it returns into the session. Makes sense in hindsight.
- Conditionals aren't documented at all on the TensorFlow website, but they are in the library. This is how we 'bail' from the recurrent loop for variable-length sequences. Check out rnn.py to see how it is used in action. Same idea as Theano's ifelse.
- For variable-length sequences you will need to build the graph out to the maximum length you want to support, then exit early during runtime. You can pass in the bail point for your sequence at each .run() call, since the conditional check is in TensorFlow and not Python.
- You are going to need to pad your input to the maximum size of the loop. I didn't play with tf.pad enough to figure out if you can actually pass in variable-length sequences to the .run() call, but the inputs you pass to the rnn when constructing it need to be the maximum length, so I had to make the placeholder that long. Worst case is you will need to pad your data before passing it into .run(); I assume the pain of this is lessened with the Queue setup that is available.
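
The padding step from the last takeaway can be sketched in plain Python. This is only an illustration under assumptions: pad_time_major is a hypothetical helper (not part of TensorFlow or the original gist), and it produces nested lists in the same [n_steps, batch_size, seq_width] layout that the placeholder below expects, along with the true lengths to use as bail points.

```python
def pad_time_major(sequences, n_steps, seq_width):
    """Zero-pad variable-length sequences (hypothetical helper, for illustration).

    sequences: list of sequences, each a list of timesteps of width seq_width.
    Returns (padded, lengths), where padded is indexed [time][batch][feature]
    and lengths holds the true length of each sequence.
    """
    lengths = [len(seq) for seq in sequences]
    if any(n > n_steps for n in lengths):
        raise ValueError("a sequence is longer than n_steps")
    padded = []
    for t in range(n_steps):
        step = []
        for seq in sequences:
            # real element while inside the sequence, zeros once past its end
            step.append(list(seq[t]) if t < len(seq) else [0.0] * seq_width)
        padded.append(step)
    return padded, lengths


# two toy sequences of width 2, with 4 and 6 real timesteps
seq_a = [[0.1, 0.2]] * 4
seq_b = [[0.3, 0.4]] * 6
padded, lengths = pad_time_major([seq_a, seq_b], n_steps=10, seq_width=2)
print(lengths)  # [4, 6]
```

To feed something like this to a session, the nested list would presumably be converted first, e.g. with np.asarray(padded, dtype='float32'), and lengths would go into the early-stop placeholder.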


In [1]:
import tensorflow as tf    
import numpy as np

In [2]:
#if __name__ == '__main__':
np.random.seed(1)  
# the size of the hidden state for the LSTM (the LSTM state holds 2x this amount, so the state will actually have size 2)
size = 1
# 2 different sequences in total
batch_size = 2
# the maximum number of steps for either sequence is 10
n_steps = 10
# each element of the sequence has dimension of 2
seq_width = 2

# the first input is to be stopped at 4 steps, the second at 6 steps
e_stop = np.array([4,6])

initializer = tf.random_uniform_initializer(-1,1) 

# the input sequences, padded to n_steps timesteps; layout is [time, batch, feature]
seq_input = tf.placeholder(tf.float32, [n_steps, batch_size, seq_width])
# what timesteps we want to stop at, notice it's different for each batch hence dimension of [batch]
early_stop = tf.placeholder(tf.int32, [batch_size])

# inputs for rnn needs to be a list, each item being a timestep. 
# we need to split our input into each timestep, and reshape it because split keeps dims by default  
inputs = [tf.reshape(i, (batch_size, seq_width)) for i in tf.split(0, n_steps, seq_input)]

cell = tf.nn.rnn_cell.LSTMCell(size, seq_width, initializer=initializer)  
initial_state = cell.zero_state(batch_size, tf.float32)

# ========= This is the most important part ==========
# outputs is a list of n_steps tensors; each sequence's outputs are zero after its stop step (4 and 6)
# the state is the final state at termination (stopped at step 4 and 6)
outputs, state = tf.nn.rnn(cell, inputs, initial_state=initial_state, sequence_length=early_stop)

# the usual setup: initialize variables, open a session, build the feed dict
iop = tf.initialize_all_variables()
session = tf.Session()
session.run(iop)
feed = {early_stop:e_stop, seq_input:np.random.rand(n_steps, batch_size, seq_width).astype('float32')}



To interpret this:
- outputs is a list of n_steps = 10 arrays, one per timestep.
- each array has one row per sequence, so there are two rows here.
- the first sequence stops after 4 steps and the second after 6; past those points the outputs are padded with zeros.
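
The early-stop behaviour can be mimicked in plain Python to make the zero padding concrete. This is a rough sketch, not how tf.nn.rnn is implemented: run_with_early_stop and toy_step are hypothetical names, and toy_step is a stand-in scalar recurrence rather than an LSTM. The only point is the masking: once a sequence passes its length, its output is zero and its state is carried through unchanged.

```python
import math


def run_with_early_stop(inputs, lengths, step_fn, init_state=0.0):
    """Unroll a scalar recurrence over inputs indexed [time][batch].

    Past lengths[b], sequence b emits 0.0 and keeps its last state.
    Returns (outputs, final_states).
    """
    batch = len(lengths)
    states = [init_state] * batch
    outputs = []
    for t, x_t in enumerate(inputs):
        out_t = []
        for b in range(batch):
            if t < lengths[b]:
                # still inside the sequence: update the state and emit it
                states[b] = step_fn(x_t[b], states[b])
                out_t.append(states[b])
            else:
                # past the bail point: zero output, state left untouched
                out_t.append(0.0)
        outputs.append(out_t)
    return outputs, states


def toy_step(x, h):
    # stand-in for a real cell, chosen only for illustration
    return math.tanh(x + 0.5 * h)


inputs = [[1.0, 1.0] for _ in range(10)]  # 10 steps, batch of 2
outputs, final = run_with_early_stop(inputs, [4, 6], toy_step)
for t, out_t in enumerate(outputs):
    print("{}: {}".format(t, out_t))
```

With lengths of 4 and 6, the first column turns to zeros from step 4 and the second from step 6, which is the same pattern as in the real outputs below.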

In [3]:
print("outputs, should be 2 things one of length 4 and other of 6")
outs = session.run(outputs, feed_dict=feed)
for i, x in enumerate(outs):
    print("{}: {}".format(i,x))

outputs, should be 2 things one of length 4 and other of 6
0: [[ 0.04543639]
 [ 0.03712803]]
1: [[ 0.02025254]
 [ 0.05193096]]
2: [[ 0.04791925]
 [ 0.08056492]]
3: [[ 0.09846628]
 [ 0.11049453]]
4: [[ 0.        ]
 [ 0.07409128]]
5: [[ 0.        ]
 [ 0.10492748]]
6: [[ 0.]
 [ 0.]]
7: [[ 0.]
 [ 0.]]
8: [[ 0.]
 [ 0.]]
9: [[ 0.]
 [ 0.]]


In [4]:
print("states, 2 things total both of size 2, which is the size of the hidden state")
st = session.run(state, feed_dict=feed)
print(st)

states, 2 things total both of size 2, which is the size of the hidden state
[[ 0.16851933  0.09846628]
 [ 0.17804879  0.10492748]]
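
One way to sanity-check the printed state against the printed outputs is sketched below. It assumes that each state row is laid out as [cell state, hidden output], which is an assumption about this version of LSTMCell rather than something stated in the original post. Under that assumption, the second half of each row should equal the last non-zero output of the corresponding sequence.

```python
size = 1  # hidden size used in the example above

# final state rows, copied from the printed state above (one row per sequence)
state = [[0.16851933, 0.09846628],
         [0.17804879, 0.10492748]]

# last non-padded output of each sequence, copied from the printed outputs above:
# step 3 for the first sequence (length 4), step 5 for the second (length 6)
last_valid_output = [0.09846628, 0.10492748]

matches = []
for row, out in zip(state, last_valid_output):
    c, h = row[:size], row[size:]  # assumed layout: [cell state, hidden output]
    matches.append(abs(h[0] - out) < 1e-8)

print(matches)  # [True, True]
```

The match suggests the state really was frozen at steps 4 and 6 instead of being updated on the padded timesteps.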
