<h2>Simple Recurrent Neural Network in TensorFlow</h2>

<h3>Basic</h3>

An RNN is a model used for sequence learning, where the particular order of the data points matters. It works on the principle that each member of the output is a function of the previous members of the output, i.e.,


<b>present state of system h(t) = function of (previous state of system h(t-1) and present input x(t)) </b>

<img src="img/RNN.png"/>

 The figure shows the computational graph used to compute the training loss of a recurrent network that maps an input sequence of <b>x</b> values to a corresponding sequence of output <b>o</b> values. A loss <b>L</b> measures how far each <b>o</b> is from the corresponding training target <b>y</b>. When using softmax outputs, we assume <b>o</b> holds the unnormalized log probabilities. The loss <b>L</b> internally computes <b>ŷ</b> = softmax(<b>o</b>) and compares this to the target <b>y</b>. The RNN has input-to-hidden connections parametrized by a weight matrix <b>U</b>, hidden-to-hidden recurrent connections parametrized by a weight matrix <b>W</b>, and hidden-to-output connections parametrized by a weight matrix <b>V</b>.
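The forward computation described above can be sketched in plain NumPy. This is only an illustration, not the TensorFlow model built later: the sizes, the random initialization, the bias vectors `b` and `c`, and the helper names `softmax` and `rnn_forward` are assumptions made for this example.

```python
import numpy as np

def softmax(o):
    # subtract the max for numerical stability before exponentiating
    e = np.exp(o - np.max(o))
    return e / e.sum()

def rnn_forward(xs, U, W, V, b, c, h0):
    """Run a vanilla RNN over xs; return the hidden states and softmax outputs."""
    h = h0
    hs, y_hats = [], []
    for x in xs:
        # h(t) = tanh(b + W h(t-1) + U x(t))
        h = np.tanh(b + W @ h + U @ x)
        # o(t) = c + V h(t), y_hat(t) = softmax(o(t))
        o = c + V @ h
        hs.append(h)
        y_hats.append(softmax(o))
    return np.array(hs), np.array(y_hats)

rng = np.random.default_rng(0)
input_size, hidden_size, output_size, steps = 1, 4, 2, 6  # illustrative sizes
U = rng.normal(size=(hidden_size, input_size))    # input-to-hidden
W = rng.normal(size=(hidden_size, hidden_size))   # hidden-to-hidden
V = rng.normal(size=(output_size, hidden_size))   # hidden-to-output
b = np.zeros(hidden_size)
c = np.zeros(output_size)
xs = rng.integers(0, 2, size=(steps, input_size)).astype(float)

hs, y_hats = rnn_forward(xs, U, W, V, b, c, h0=np.zeros(hidden_size))
print(hs.shape, y_hats.shape)  # (6, 4) (6, 2)
```

Each row of `y_hats` is a probability distribution over the output classes, and the same three matrices are reused at every time step.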

<h3>Echo RNN</h3>

In [8]:
#import modules
import numpy as np
import tensorflow as tf

In [150]:
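#parameters
no_timestep = 3                    #echo delay: Y is X shifted right by this many steps
n = 10000                          #total number of data points in the series
batchsize = 5
truncated_backprop_length = 15     #number of time steps unrolled per training step
state_size = 4                     #number of hidden units in the RNN state
num_classes = 2                    #the outputs are binary (0 or 1)
num_epochs = 500
num_batches = n//batchsize//truncated_backprop_length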

When an RNN is trained, it is actually treated as a deep neural network with the same weights recurring in every layer. Unrolling these layers back to the beginning of time would be too computationally expensive, so the unrolling is truncated at a limited number of time steps. In the sample schematic above, the error is backpropagated three steps; the code here uses truncated_backprop_length = 15.
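To make the truncation concrete, the sketch below splits one row of the reshaped series into consecutive windows of `truncated_backprop_length` steps, using the parameter values from the cell above. The helper name `truncated_windows` is hypothetical and only serves this illustration.

```python
n = 10000
batchsize = 5
truncated_backprop_length = 15

def truncated_windows(seq_len, window):
    """Yield (start, end) column indices of consecutive full windows."""
    for i in range(seq_len // window):
        yield i * window, (i + 1) * window

seq_len = n // batchsize   # 2000 columns per row after reshaping
windows = list(truncated_windows(seq_len, truncated_backprop_length))

print(len(windows))              # 133, the same value as num_batches
print(windows[0], windows[-1])   # (0, 15) (1980, 1995)
print(seq_len - windows[-1][1])  # 5 trailing columns that no window covers
```

Since 2000 is not a multiple of 15, the last 5 columns of every row are left out of training in each epoch.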

In [131]:
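def generate_data(with_reshape = True):
    #X is a random sequence of 0s and 1s
    X = np.random.choice(2, n, p = [0.5, 0.5])
    #Y echoes X delayed by no_timestep steps; the wrapped-around values are zeroed
    Y = np.roll(X, no_timestep)
    Y[0:no_timestep] = 0
    if with_reshape:
        #one row per batch element, each row holding a contiguous chunk of the series
        X = X.reshape((batchsize, -1))
        Y = Y.reshape((batchsize, -1))
    return X, Y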
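The echo relationship between X and Y is easy to verify on a short hand-written sequence. The example below is a small sketch with a fixed input instead of random data, so the result can be checked by eye.

```python
import numpy as np

no_timestep = 3  # echo delay, as in the parameters cell

X = np.array([1, 0, 1, 1, 0, 0, 1, 0])
Y = np.roll(X, no_timestep)  # shift right; the last 3 values wrap around to the front
Y[:no_timestep] = 0          # overwrite the wrapped-around values

print(Y)                                                  # [0 0 0 1 0 1 1 0]
print(np.array_equal(Y[no_timestep:], X[:-no_timestep]))  # True
```

From position 3 onwards, every value of Y equals the value of X three steps earlier, which is what the network has to learn to reproduce.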

In [132]:
X, Y = generate_data(False)

In [133]:
X[:10], Y[:10]

(array([1, 1, 1, 0, 0, 0, 1, 1, 1, 1]), array([0, 0, 0, 1, 1, 1, 0, 0, 0, 1]))

In [134]:
X.shape, Y.shape

((10000,), (10000,))

In [135]:
X, Y = generate_data(True)

In [136]:
X.shape, Y.shape

((5, 2000), (5, 2000))

<h3>Defining the model</h3>

In [146]:
#define X and Y placeholders
X_placeholder = tf.placeholder(tf.float32, [batchsize, truncated_backprop_length])
Y_placeholder = tf.placeholder(tf.int32, [batchsize, truncated_backprop_length])

#the RNN state is supplied through a placeholder and is fed with the state returned by the previous run,
#so besides X and Y we also feed the previous state
init_state = tf.placeholder(tf.float32, [batchsize, state_size])

#define the weights and biases as tf variables
#the extra +1 row in W is for the single input value that is concatenated with the previous state
W = tf.Variable(np.random.rand(state_size + 1, state_size), dtype = tf.float32)
b = tf.Variable(np.zeros((1, state_size)), dtype = tf.float32)

W2 = tf.Variable(np.random.rand(state_size, num_classes), dtype = tf.float32)
b2 = tf.Variable(np.zeros((1, num_classes)), dtype = tf.float32)
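The shape `(state_size + 1, state_size)` of `W` comes from stacking the input weights on top of the recurrent weights, so that one matrix multiplication handles both. The NumPy sketch below checks this equivalence; the names `W_x` and `W_h` and the random values are assumptions for illustration only.

```python
import numpy as np

batchsize, state_size = 5, 4
rng = np.random.default_rng(0)

x = rng.integers(0, 2, size=(batchsize, 1)).astype(float)  # one input value per row
h = rng.random((batchsize, state_size))                     # previous state
W = rng.random((state_size + 1, state_size))                # stacked weights

# concatenated form: a single matmul over [input, state]
concat = np.concatenate([x, h], axis=1) @ W

# split form: the first row of W acts on the input, the remaining rows on the state
W_x, W_h = W[:1], W[1:]
split = x @ W_x + h @ W_h

print(np.allclose(concat, split))  # True
```

The first row of `W` plays the role of the input-to-hidden weights and the remaining `state_size` rows that of the hidden-to-hidden weights.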


In [147]:
#unpack the [batchsize, truncated_backprop_length] placeholders into lists with one [batchsize] tensor per time step
inputs_series = tf.unstack(X_placeholder, axis=1)
labels_series = tf.unstack(Y_placeholder, axis=1)
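What unstacking along axis 1 produces can be seen with a NumPy analogue: a list with one column per time step. The array contents below are made up so that each column is easy to recognize.

```python
import numpy as np

batchsize, truncated_backprop_length = 5, 15
batchX = np.arange(batchsize * truncated_backprop_length).reshape(
    batchsize, truncated_backprop_length)

# NumPy analogue of unstacking along axis 1: one array per time step
inputs_series = [batchX[:, t] for t in range(batchX.shape[1])]

print(len(inputs_series))      # 15
print(inputs_series[0].shape)  # (5,)
print(inputs_series[0])        # [ 0 15 30 45 60]
```

Each list element holds the values of all batch rows at a single time step, which is why the forward pass later reshapes it to `[batchsize, 1]`.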


<h3>Forward Pass</h3>

<img src="img/RNNEq.png"/>

In [148]:
previous_state = init_state 
states_series = []
for current_input in inputs_series:
    current_input = tf.reshape(current_input, [batchsize, 1])
    
    #the input to the cell is the current input concatenated with the previous state,
    #i.e. the state computed for the previous input
    input_and_state_concatenated = tf.concat([current_input, previous_state], 1)
    
    #calculate next state Eq 10.8
    aggregate = tf.matmul(input_and_state_concatenated, W) + b
    #Eq 10.9
    current_state = tf.tanh(aggregate)
    
    #add the next state to the list which keeps a history of states
    states_series.append(current_state)
    
    #set the current state as the previous state
    previous_state = current_state
    


<h3>Calculating loss</h3>

In [149]:
#Eq 10.10: multiply each state by the weights W2 and add the bias b2 to get the logits
logits_series  = [tf.matmul(state, W2) + b2 for state in states_series]
#Eq 10.11: apply softmax to get the final prediction values
predictions_series = [tf.nn.softmax(logit) for logit in logits_series]

#cross-entropy between the logits and the integer labels at each time step
losses = [tf.nn.sparse_softmax_cross_entropy_with_logits(logits = logits, labels = labels)
          for logits, labels in zip(logits_series, labels_series)]
total_loss = tf.reduce_mean(losses)

train_step = tf.train.AdamOptimizer(0.3).minimize(total_loss)
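For reference, the per-example quantity computed by the sparse softmax cross-entropy is the negative log-probability that the softmax assigns to the correct class. The NumPy sketch below reproduces that definition on hand-picked logits; the function name `sparse_softmax_cross_entropy` is hypothetical and this is not the TensorFlow implementation.

```python
import numpy as np

def sparse_softmax_cross_entropy(logits, labels):
    """Per-example loss for logits [batch, classes] and integer labels [batch]."""
    # log-softmax, computed stably by shifting each row by its maximum
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # negative log-probability of the correct class in each row
    return -log_probs[np.arange(len(labels)), labels]

logits = np.array([[2.0, 0.0],
                   [0.0, 0.0],
                   [-1.0, 3.0]])
labels = np.array([0, 1, 1])

losses = sparse_softmax_cross_entropy(logits, labels)
print(losses)         # one loss value per example
print(losses.mean())  # scalar, analogous to taking the mean over all losses
```

The second row has equal logits, so its loss is log(2); the more confident and correct the logits, the closer the loss gets to zero.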


<h3>Running the RNN</h3>

In [161]:

with tf.Session() as sess:
    #the initializer op has to be executed with sess.run; only constructing it leaves the
    #variables uninitialized and the first training step fails with
    #FailedPreconditionError: Attempting to use uninitialized value
    sess.run(tf.global_variables_initializer())
    loss_list = []
    for epoch in range(num_epochs):
        #generate the data
        x, y = generate_data(True)
        #initialize the current state
        _previous_state = np.zeros((batchsize, state_size))

        print("Epoch :", epoch)

        #selecting the batch data by updating the start and end index
        for batch_idx in range(num_batches):
            start_idx = batch_idx * truncated_backprop_length
            end_idx = start_idx + truncated_backprop_length

            batchX = x[:,start_idx:end_idx]
            batchY = y[:,start_idx:end_idx]

            #train the model; previous_state is the last state tensor of the unrolled graph,
            #and its value is fed back in as init_state for the next batch
            _total_loss, _train_step, _previous_state, _predictions_series = sess.run(
                    [total_loss, train_step, previous_state, predictions_series],
                    feed_dict={
                        X_placeholder:batchX,
                        Y_placeholder:batchY,
                        init_state:_previous_state
                        })
            loss_list.append(_total_loss)

            #print the loss every 100 steps
            if batch_idx%100 == 0:
                print("Step",batch_idx, "Loss", _total_loss)
