# LSTM Basics

In [1]:
import numpy as np
import tensorflow as tf
sess = tf.Session()

We want to create a network that has only one LSTM cell. We have to pass two elements to the LSTM: the previous output (<b>prv_output</b>) and the previous state (<b>prv_state</b>), usually called <b>h</b> and <b>c</b>. Therefore, we initialize a state vector, <b>state</b>. Here, <b>state</b> is a tuple with 2 elements, each of size [1 x 4]: one carries prv_output to the next time step, and the other carries prv_state to the next time step.

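Let us define a sample input. In this example, batch_size = 1 and seq_len = 6. Because the input is passed to the cell directly, these 6 values are consumed together as the feature vector of a single time step: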

In [2]:
LSTM_CELL_SIZE = 4  # output size (dimension), which is same as hidden size in the cell

lstm_cell = tf.contrib.rnn.BasicLSTMCell(LSTM_CELL_SIZE, state_is_tuple=True)
state = (tf.zeros([1,LSTM_CELL_SIZE]),)*2
print(state)
sample_input = tf.constant([[3,2,2,2,2,2]],dtype=tf.float32)
print(sess.run(sample_input))


(<tf.Tensor 'zeros:0' shape=(1, 4) dtype=float32>, <tf.Tensor 'zeros:0' shape=(1, 4) dtype=float32>)
[[3. 2. 2. 2. 2. 2.]]


Now, we can pass the input to lstm_cell, and check the new state:

In [3]:
with tf.variable_scope("LSTM_sample1"):
    output, state_new = lstm_cell(sample_input, state)
sess.run(tf.global_variables_initializer())
print(sess.run(state_new))

LSTMStateTuple(c=array([[0.38289595, 0.06775343, 0.31134155, 0.83778703]], dtype=float32), h=array([[0.3377007 , 0.05769717, 0.16416939, 0.1088017 ]], dtype=float32))


As we can see, the state has two parts: the new cell state c and the output h. Let us check the output as well; it is identical to h:

In [4]:
print(sess.run(output))

[[0.3377007  0.05769717 0.16416939 0.1088017 ]]
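To make the cell's computation more concrete, here is a minimal NumPy sketch of one LSTM step using the standard LSTM equations. The weights are random, so the numbers will not match the TensorFlow output above. The helper name `lstm_step`, the gate ordering, and the forget-gate bias of 1.0 are assumptions made for illustration, not a reproduction of the library's internals.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, c_prev, h_prev, kernel, bias, forget_bias=1.0):
    """One LSTM time step (hypothetical helper, standard LSTM equations).

    x: [batch, input_dim]; c_prev, h_prev: [batch, units]
    kernel: [input_dim + units, 4 * units]; bias: [4 * units]
    """
    # All four gates are computed with a single matrix multiplication.
    z = np.concatenate([x, h_prev], axis=1) @ kernel + bias
    i, j, f, o = np.split(z, 4, axis=1)  # assumed gate order
    # New cell state: keep part of the old state, add gated new information.
    c = sigmoid(f + forget_bias) * c_prev + sigmoid(i) * np.tanh(j)
    # New output: a gated view of the squashed cell state.
    h = sigmoid(o) * np.tanh(c)
    return c, h

rng = np.random.default_rng(0)
units, input_dim = 4, 6
x = np.array([[3, 2, 2, 2, 2, 2]], dtype=np.float32)
c0 = np.zeros((1, units), dtype=np.float32)
h0 = np.zeros((1, units), dtype=np.float32)
kernel = rng.normal(scale=0.5, size=(input_dim + units, 4 * units))  # random, illustrative
bias = np.zeros(4 * units)

c1, h1 = lstm_step(x, c0, h0, kernel, bias)
print(c1.shape, h1.shape)
```

Both `c1` and `h1` have the same [1 x 4] shape as the initial state, which is why the returned state can be fed straight back into the cell for the next time step.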


# Stacked LSTM

Now let us build a 2-layer LSTM. In this case, the output of the first layer becomes the input of the second.

In [5]:
input_dim = 6
cells = []

Create the LSTM cell for each layer. We can use tf.contrib.rnn.LSTMCell or tf.keras.layers.LSTMCell:

In [6]:
LSTM_CELL_SIZE_1 = 4  # 4 hidden nodes
cell1 = tf.keras.layers.LSTMCell(LSTM_CELL_SIZE_1)
cells.append(cell1)
LSTM_CELL_SIZE_2 = 5  # 5 hidden nodes
cell2 = tf.keras.layers.LSTMCell(LSTM_CELL_SIZE_2)
cells.append(cell2)

To create a multi-layer LSTM we use <b>tf.contrib.rnn.MultiRNNCell</b> or <b>tf.keras.layers.StackedRNNCells</b>, which takes multiple single-layer LSTM cells and combines them into a multi-layer stacked LSTM model.

In [7]:
stacked_lstm = tf.keras.layers.StackedRNNCells(cells)
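As a rough check of how the layers chain together, the sketch below computes the weight shapes each layer would need. It assumes the usual LSTM layout of one input kernel, one recurrent kernel, and one bias per cell, each covering four gates; the helpers `lstm_param_shapes` and `count_params` are hypothetical and written only for this illustration.

```python
def lstm_param_shapes(input_dim, units):
    """Weight shapes for one LSTM cell (hypothetical helper; 4 gates assumed)."""
    return {
        "kernel": (input_dim, 4 * units),
        "recurrent_kernel": (units, 4 * units),
        "bias": (4 * units,),
    }

def count_params(shapes):
    """Total number of scalar weights across all shapes."""
    total = 0
    for shape in shapes.values():
        n = 1
        for d in shape:
            n *= d
        total += n
    return total

input_dim = 6
layer_sizes = [4, 5]

layer_shapes = []
in_dim = input_dim
for units in layer_sizes:
    shapes = lstm_param_shapes(in_dim, units)
    layer_shapes.append(shapes)
    print(units, shapes, count_params(shapes))
    in_dim = units  # the next layer consumes this layer's output
```

Under these assumptions the first layer maps 6 input features to 4 units, and the second layer's input kernel has 4 rows because it reads the first layer's 4-dimensional output.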

Now we can create the RNN from <b>stacked_lstm</b>:

In [8]:
# Batch size x time steps x features.
data = tf.placeholder(tf.float32, [None, None, input_dim])
output, state = tf.nn.dynamic_rnn(stacked_lstm, data, dtype=tf.float32)



Let's say the input sequence length is 3 and the dimensionality of the inputs is 6. The input should be a Tensor of shape [batch_size, max_time, dimension]; in our case, with a batch of 2 sequences, it is (2, 3, 6).

In [9]:
#Batch size x time steps x features.
sample_input = [[[1,2,3,4,3,2], [1,2,1,1,1,2],[1,2,2,2,2,2]],[[1,2,3,4,3,2],[3,2,2,1,1,2],[0,0,0,0,3,2]]]
sample_input

[[[1, 2, 3, 4, 3, 2], [1, 2, 1, 1, 1, 2], [1, 2, 2, 2, 2, 2]],
 [[1, 2, 3, 4, 3, 2], [3, 2, 2, 1, 1, 2], [0, 0, 0, 0, 3, 2]]]
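A quick way to confirm the shape before feeding the data is to convert the nested list to a NumPy array. This is only a sanity-check sketch that restates the same values as above.

```python
import numpy as np

sample_input = [
    [[1, 2, 3, 4, 3, 2], [1, 2, 1, 1, 1, 2], [1, 2, 2, 2, 2, 2]],
    [[1, 2, 3, 4, 3, 2], [3, 2, 2, 1, 1, 2], [0, 0, 0, 0, 3, 2]],
]

# Unpack the three axes: batch size x time steps x features.
batch_size, max_time, dimension = np.array(sample_input).shape
print(batch_size, max_time, dimension)  # 2 3 6
```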

Check the output tensor. Its batch and time dimensions show as ? because the placeholder leaves them unspecified:

In [10]:
output

<tf.Tensor 'rnn/transpose_1:0' shape=(?, ?, 5) dtype=float32>

In [11]:
sess.run(tf.global_variables_initializer())
sess.run(output, feed_dict={data: sample_input})

array([[[-0.01052323, -0.00220595,  0.00668446,  0.01008166,
         -0.00985327],
        [ 0.00688282, -0.00148626,  0.03038777, -0.00444307,
         -0.0344028 ],
        [ 0.01346044,  0.00659346,  0.06291576, -0.01084299,
         -0.04741259]],

       [[-0.01052323, -0.00220595,  0.00668446,  0.01008166,
         -0.00985327],
        [-0.03165527, -0.01431225, -0.03297432,  0.04273326,
         -0.01937381],
        [-0.01501577,  0.00385762, -0.00544851,  0.04087001,
          0.00090047]]], dtype=float32)

As you can see, the output is of shape (2, 3, 5), which corresponds to the 2 sequences in our batch, the 3 time steps in each sequence, and the output dimensionality of 5, the size of the last LSTM layer.
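The same shape can be reproduced with a small NumPy sketch that unrolls two stacked LSTM layers over time. This is an illustration under stated assumptions, not the TensorFlow implementation: the weights are random, the gate ordering is assumed, and the helpers `init_layer` and `run_stacked_lstm` are hypothetical. It also shows why the first time step is identical for both sequences in the output above: they share the same first input and both start from a zero state.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_layer(rng, input_dim, units):
    # Random weights for illustration only; not the values TensorFlow would use.
    kernel = rng.normal(scale=0.1, size=(input_dim + units, 4 * units))
    bias = np.zeros(4 * units)
    return kernel, bias

def run_stacked_lstm(data, layers, layer_sizes):
    """Unroll stacked LSTM layers over time; return the top layer's h per step."""
    batch, steps, _ = data.shape
    # One (c, h) pair per layer, all starting at zero.
    states = [(np.zeros((batch, u)), np.zeros((batch, u))) for u in layer_sizes]
    outputs = []
    for t in range(steps):
        x = data[:, t, :]
        for k, (kernel, bias) in enumerate(layers):
            c_prev, h_prev = states[k]
            z = np.concatenate([x, h_prev], axis=1) @ kernel + bias
            i, g, f, o = np.split(z, 4, axis=1)  # assumed gate order
            c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
            h = sigmoid(o) * np.tanh(c)
            states[k] = (c, h)
            x = h  # the output of this layer feeds the next layer
        outputs.append(x)
    return np.stack(outputs, axis=1)  # [batch, steps, last_layer_units]

rng = np.random.default_rng(0)
input_dim, layer_sizes = 6, [4, 5]
dims = [input_dim] + layer_sizes
layers = [init_layer(rng, dims[k], dims[k + 1]) for k in range(len(layer_sizes))]

data = np.array(
    [[[1, 2, 3, 4, 3, 2], [1, 2, 1, 1, 1, 2], [1, 2, 2, 2, 2, 2]],
     [[1, 2, 3, 4, 3, 2], [3, 2, 2, 1, 1, 2], [0, 0, 0, 0, 3, 2]]],
    dtype=np.float64,
)
outputs = run_stacked_lstm(data, layers, layer_sizes)
print(outputs.shape)  # (2, 3, 5)
```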