# Long short-term memory (LSTM)

In [None]:
import numpy as np
import tensorflow as tf

We want to create a network that has only one LSTM cell. An LSTM carries two elements from one time step to the next: <b>prv_output</b> and <b>prv_state</b>, usually called <b>h</b> (the hidden state) and <b>c</b> (the cell state). We therefore initialize a state, <b>state</b>, as a tuple with 2 elements, each of size \[1 x 4]: one passes prv_output to the next time step, and the other passes prv_state to the next time step.



In [None]:
LSTM_CELL_SIZE = 4  # output size (dimension), which is same as hidden size in the cell

state = (tf.zeros([1,LSTM_CELL_SIZE]),)*2
state
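To make the roles of <b>h</b> and <b>c</b> concrete, here is a minimal NumPy sketch of a single LSTM time step. It is not the Keras implementation: the weights are random, the gate ordering (input, forget, candidate, output) is an assumption, and the names `lstm_step`, `W`, `U`, and `b` are chosen only for this illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W: (features, 4*units), U: (units, 4*units), b: (4*units,)
    """
    z = x @ W + h_prev @ U + b
    # Split the pre-activations into the four gates (ordering is an assumption).
    i, f, g, o = np.split(z, 4, axis=1)
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)  # new cell state
    h = sigmoid(o) * np.tanh(c)                        # new output / hidden state
    return h, c

rng = np.random.default_rng(0)
units, features = 4, 6
W = rng.normal(scale=0.1, size=(features, 4 * units))
U = rng.normal(scale=0.1, size=(units, 4 * units))
b = np.zeros(4 * units)

x = np.array([[3, 2, 2, 2, 2, 2]], dtype=np.float64)  # one sample, 6 features
h0 = np.zeros((1, units))  # previous output, all zeros
c0 = np.zeros((1, units))  # previous cell state, all zeros

h1, c1 = lstm_step(x, h0, c0, W, U, b)
print(h1.shape, c1.shape)  # (1, 4) (1, 4)
```

Both returned arrays have the same \[1 x 4] shape as the two elements of <b>state</b>, which is why the state can be fed back in unchanged at the next time step.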

In [None]:
lstm = tf.keras.layers.LSTM(LSTM_CELL_SIZE, return_sequences=True, return_state=True)

lstm.states=state

# As we can see, the state has 2 parts: the output h (hidden state) and the cell state c.
print(lstm.states)

Let's define a sample input.

In [None]:
# One sample with 6 features; it is reshaped below to (batch size, time steps, features).
sample_input = tf.constant([[3,2,2,2,2,2]],dtype=tf.float32)

batch_size = 1
sentence_max_length = 1
n_features = 6

new_shape = (batch_size, sentence_max_length, n_features)

inputs = tf.constant(np.reshape(sample_input, new_shape), dtype = tf.float32)

Now we can pass the input to the <b>lstm</b> layer and check the new state:

In [None]:
output, final_memory_state, final_carry_state = lstm(inputs)

print('Output shape: ', tf.shape(output))
print('Output :', output)

print('Memory shape: ', tf.shape(final_memory_state))
print('Memory : ', final_memory_state)

print('Carry state shape: ', tf.shape(final_carry_state))
print('Carry state: ', final_carry_state)

# Stacked LSTM

What if we want an RNN with stacked LSTM layers, for example a 2-layer LSTM? In this case, the output of the first layer becomes the input of the second.

Let's create the stacked LSTM cell:


In [None]:
cells = []

Creating the first-layer LSTM cell.

In [None]:
LSTM_CELL_SIZE_1 = 4 #4 hidden nodes
cell1 = tf.keras.layers.LSTMCell(LSTM_CELL_SIZE_1)
cells.append(cell1)

Creating the second-layer LSTM cell.

In [None]:
LSTM_CELL_SIZE_2 = 5 #5 hidden nodes
cell2 = tf.keras.layers.LSTMCell(LSTM_CELL_SIZE_2)
cells.append(cell2)

To create a multi-layer LSTM we use the <b>tf.keras.layers.StackedRNNCells</b> wrapper. It takes a list of single-layer LSTM cells and combines them into one stacked, multi-layer cell.

In [None]:
stacked_lstm =  tf.keras.layers.StackedRNNCells(cells)
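Because the first layer's 4-dimensional output is the second layer's input, the two cells have different weight shapes. As a rough check, assuming the standard LSTM parameterization (an input kernel, a recurrent kernel, and a bias for each of the four gates) and the 6 input features used in this notebook, the trainable parameter counts can be worked out as follows. The helper `lstm_param_count` is a hypothetical name used only for this sketch.

```python
def lstm_param_count(input_dim, units):
    # kernel: (input_dim, 4*units), recurrent kernel: (units, 4*units), bias: (4*units,)
    return 4 * units * (input_dim + units + 1)

# Layer 1 sees the 6 input features; layer 2 sees layer 1's 4-dimensional output.
layer1_params = lstm_param_count(input_dim=6, units=4)
layer2_params = lstm_param_count(input_dim=4, units=5)
print(layer1_params, layer2_params)  # 176 200
```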

In [None]:
#Now we can create the RNN from stacked_lstm:
lstm_layer= tf.keras.layers.RNN(stacked_lstm ,return_sequences=True, return_state=True)

In [None]:
#Batch size x time steps x features.
sample_input = [[[1,2,3,4,3,2], [1,2,1,1,1,2],[1,2,2,2,2,2]],[[1,2,3,4,3,2],[3,2,2,1,1,2],[0,0,0,0,3,2]]]
sample_input

batch_size = 2
time_steps = 3
features = 6
new_shape = (batch_size, time_steps, features)

x = tf.constant(np.reshape(sample_input, new_shape), dtype = tf.float32)

In [None]:
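# With StackedRNNCells, return_state=True returns one state per layer,
# and each LSTMCell state is an [h, c] pair (hidden state, cell state).
output, layer1_state, layer2_state = lstm_layer(x)

In [None]:
print('Output shape: ', tf.shape(output))
print('Output : ', output)

print('Layer 1 h shape: ', tf.shape(layer1_state[0]))
print('Layer 1 c shape: ', tf.shape(layer1_state[1]))

print('Layer 2 h shape: ', tf.shape(layer2_state[0]))
print('Layer 2 h : ', layer2_state[0])
print('Layer 2 c : ', layer2_state[1])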

As you can see, the output has shape (2, 3, 5): a batch size of 2, 3 time steps per sequence, and an output dimensionality of 5, which is the number of hidden nodes in the second LSTM layer.
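The shape bookkeeping above can be reproduced with a small NumPy sketch of two stacked LSTM layers. This is an illustration under assumptions, not what Keras does internally: the weights are random, the gate ordering is assumed, and `run_lstm_layer` is a hypothetical helper that exists only in this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def run_lstm_layer(x, units, rng):
    """Run one LSTM layer over x of shape (batch, time, features).

    Returns the per-step outputs plus the final h and c.
    """
    batch, time_steps, features = x.shape
    W = rng.normal(scale=0.1, size=(features, 4 * units))
    U = rng.normal(scale=0.1, size=(units, 4 * units))
    b = np.zeros(4 * units)
    h = np.zeros((batch, units))
    c = np.zeros((batch, units))
    outputs = []
    for t in range(time_steps):
        z = x[:, t, :] @ W + h @ U + b
        i, f, g, o = np.split(z, 4, axis=1)  # gate ordering is an assumption
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs, axis=1), h, c

rng = np.random.default_rng(0)
x = np.array([[[1, 2, 3, 4, 3, 2], [1, 2, 1, 1, 1, 2], [1, 2, 2, 2, 2, 2]],
              [[1, 2, 3, 4, 3, 2], [3, 2, 2, 1, 1, 2], [0, 0, 0, 0, 3, 2]]],
             dtype=np.float64)

out1, h1, c1 = run_lstm_layer(x, units=4, rng=rng)     # first layer, 4 hidden nodes
out2, h2, c2 = run_lstm_layer(out1, units=5, rng=rng)  # second layer, 5 hidden nodes

print(out1.shape, out2.shape)           # (2, 3, 4) (2, 3, 5)
print(np.allclose(out2[:, -1, :], h2))  # True
```

The second layer consumes the full sequence of first-layer outputs, so only its width (5) appears in the final output shape, and the last time step of the output sequence is the top layer's final hidden state.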