
In [1]:
# Resource: https://machinelearningmastery.com/return-sequences-and-return-states-for-lstms-in-keras/

In [2]:
# The Keras deep learning library provides an implementation of the Long Short-Term Memory, or LSTM, recurrent neural network.

# As part of this implementation, the Keras API provides access to both return sequences and return state.
# How these two outputs are used, and how they differ, can be confusing when designing sophisticated recurrent neural
# network models, such as the encoder-decoder model.

# After completing this tutorial, you will know:

# That return sequences return the hidden state output for EACH input time step.
# That return state returns the hidden state output and cell state for the LAST input time step.
# That return sequences and return state can be used at the same time.

<img src="https://i.stack.imgur.com/bVXaT.png" />

In [3]:
# Each unit or cell within the layer has an internal cell state, often abbreviated as "c", and outputs a hidden state,
#  often abbreviated as "h".

# The Keras API allows you to access these data, which can be useful or even required when developing sophisticated 
# recurrent neural network architectures, such as the encoder-decoder model.

# For the rest of this tutorial, we will look at the API for accessing these data.
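
# To make the roles of c and h concrete, here is a minimal pure-Python sketch of one time step of a single-unit LSTM
# with one input feature, using the standard LSTM gate equations. The helper names (sigmoid, lstm_step) and the
# WEIGHTS values are made up for illustration; they are not Keras internals or trained weights.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, w):
    """One time step of a single-unit LSTM with one input feature.

    w maps each gate name to (input weight, recurrent weight, bias).
    """
    def pre(gate):
        wx, wh, b = w[gate]
        return wx * x + wh * h_prev + b

    i = sigmoid(pre("i"))      # input gate
    f = sigmoid(pre("f"))      # forget gate
    g = math.tanh(pre("g"))    # candidate cell value
    o = sigmoid(pre("o"))      # output gate
    c = f * c_prev + i * g     # new cell state (internal memory)
    h = o * math.tanh(c)       # new hidden state (what the layer outputs)
    return h, c


# Illustrative weights only (assumption); a real layer initializes its own.
WEIGHTS = {"i": (0.5, 0.1, 0.0), "f": (0.4, 0.2, 1.0),
           "g": (0.9, 0.3, 0.0), "o": (0.6, 0.1, 0.0)}

h, c = lstm_step(0.1, 0.0, 0.0, WEIGHTS)
print("h =", h, "c =", c)
```

# The cell state c is carried forward to the next time step, while h is both carried forward and exposed as output.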

In [4]:
# Return Sequences

# Each LSTM cell will output one hidden state h for each input.
# In this example, we will have one input sample with 3 time steps and one feature:

from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, LSTM
from numpy import array

# define model
inputs1 = Input(shape=(3, 1))

# number of units=1
lstm1 = LSTM(1)(inputs1)

model = Model(inputs=inputs1, outputs=lstm1)

# define input data
data = array([0.1, 0.2, 0.3]).reshape((1,3,1))

# make and show prediction
print(model.predict(data))

[[-0.07235589]]


In [5]:
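# Running the example outputs a single hidden state for the single input sequence with 3 time steps: the hidden state
# at the last time step. The exact value differs from run to run because the layer's weights are randomly initialized.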

In [6]:
# It is possible to access the hidden state output for each input time step.
# This can be done by setting the return_sequences argument to True when defining the LSTM layer, as follows:

# define model
inputs1 = Input(shape=(3, 1))

lstm1 = LSTM(1, return_sequences=True)(inputs1)

model = Model(inputs=inputs1, outputs=lstm1)

# define input data
data = array([0.1, 0.2, 0.3]).reshape((1,3,1))

# make and show prediction
print(model.predict(data))

[[[0.00627518]
  [0.01691228]
  [0.03039883]]]


In [7]:
# Running the example returns a sequence of 3 values, one hidden state output for each input time step for the
# single LSTM cell in the layer.
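
# A rough pure-Python sketch of what return_sequences changes: the loop over time steps is the same either way, and
# the flag only decides whether every hidden state or just the last one is handed back. run_lstm, lstm_step and the
# WEIGHTS values are hypothetical names and numbers chosen for illustration, not part of Keras.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h, c, w):
    # Standard LSTM update for one unit and one input feature.
    i, f, o = (sigmoid(w[k][0] * x + w[k][1] * h + w[k][2]) for k in "ifo")
    g = math.tanh(w["g"][0] * x + w["g"][1] * h + w["g"][2])
    c = f * c + i * g
    return o * math.tanh(c), c


def run_lstm(xs, w, return_sequences=False):
    h, c = 0.0, 0.0                 # both states start at zero
    hidden_states = []
    for x in xs:                    # one update per input time step
        h, c = lstm_step(x, h, c, w)
        hidden_states.append(h)
    # Either every step's hidden state, or only the final one.
    return hidden_states if return_sequences else h


# Illustrative weights only (assumption): gate -> (input w, recurrent w, bias).
WEIGHTS = {"i": (0.5, 0.1, 0.0), "f": (0.4, 0.2, 1.0),
           "g": (0.9, 0.3, 0.0), "o": (0.6, 0.1, 0.0)}

xs = [0.1, 0.2, 0.3]
last_h = run_lstm(xs, WEIGHTS)
all_h = run_lstm(xs, WEIGHTS, return_sequences=True)
print(last_h)
print(all_h)
```

# The last entry of all_h is the same number as last_h, which mirrors how the two Keras outputs relate.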

In [8]:
# Return States

# The output of an LSTM cell or layer of cells is called the hidden state.

# This can be confusing, because each LSTM cell also retains an internal state that is not output, called the cell state, or c.

# Generally, we do not need to access the cell state unless we are developing sophisticated models where 
# subsequent layers may need to have their cell state initialized with the final cell state of another layer, 
# such as in an encoder-decoder model.

# Keras provides the return_state argument to the LSTM layer that will provide access to the hidden state output (state_h) 
# and the cell state (state_c). For example:

# define model
inputs1 = Input(shape=(3, 1))

lstm1, state_h, state_c = LSTM(1, return_state=True)(inputs1)

model = Model(inputs=inputs1, outputs=[lstm1, state_h, state_c])

# define input data
data = array([0.1, 0.2, 0.3]).reshape((1,3,1))

# make and show prediction
print(model.predict(data))

[array([[0.0659593]], dtype=float32), array([[0.0659593]], dtype=float32), array([[0.15195765]], dtype=float32)]


In [9]:
# Running the example returns 3 arrays:

# The LSTM hidden state output for the last time step.
# The LSTM hidden state output for the last time step (again), because without return_sequences the layer's output is already the last hidden state.
# The LSTM cell state for the last time step.
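
# The final h and c are what an encoder-decoder model hands from one layer to the next. In Keras this is done by
# passing them as initial_state when calling the second LSTM layer. The sketch below imitates that hand-off in pure
# Python. As a simplifying assumption, the "encoder" and "decoder" share one set of made-up weights (real models
# use separate layers), so the result can be checked against a single uninterrupted run. run_lstm, lstm_step and
# WEIGHTS are hypothetical names, not Keras APIs.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h, c, w):
    # Standard LSTM update for one unit and one input feature.
    i, f, o = (sigmoid(w[k][0] * x + w[k][1] * h + w[k][2]) for k in "ifo")
    g = math.tanh(w["g"][0] * x + w["g"][1] * h + w["g"][2])
    c = f * c + i * g
    return o * math.tanh(c), c


def run_lstm(xs, w, initial_state=(0.0, 0.0)):
    h, c = initial_state
    outputs = []
    for x in xs:
        h, c = lstm_step(x, h, c, w)
        outputs.append(h)
    return outputs, h, c            # sequence, final hidden state, final cell state


# Illustrative weights only (assumption): gate -> (input w, recurrent w, bias).
WEIGHTS = {"i": (0.5, 0.1, 0.0), "f": (0.4, 0.2, 1.0),
           "g": (0.9, 0.3, 0.0), "o": (0.6, 0.1, 0.0)}

# "Encoder": read the first two time steps and keep only the final states.
_, enc_h, enc_c = run_lstm([0.1, 0.2], WEIGHTS)

# "Decoder": continue from the encoder's states instead of from zeros.
dec_seq, dec_h, dec_c = run_lstm([0.3], WEIGHTS, initial_state=(enc_h, enc_c))

# Reference: the same three steps processed in one go.
_, full_h, full_c = run_lstm([0.1, 0.2, 0.3], WEIGHTS)
print(dec_h == full_h, dec_c == full_c)
```

# Both h and c have to be handed over: restarting the second run from zero states gives a different result.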

In [10]:
# Return States and Sequences

# We can access the sequence of hidden states together with the final hidden state and final cell state.
# This can be done by configuring the LSTM layer to both return sequences and return states.

# define model
inputs1 = Input(shape=(3, 1))

lstm1, state_h, state_c = LSTM(1, return_sequences=True, return_state=True)(inputs1)

model = Model(inputs=inputs1, outputs=[lstm1, state_h, state_c])

# define input data
data = array([0.1, 0.2, 0.3]).reshape((1,3,1))

# make and show prediction
print(model.predict(data))

[array([[[0.0187664 ],
        [0.05030981],
        [0.0897956 ]]], dtype=float32), array([[0.0897956]], dtype=float32), array([[0.21259052]], dtype=float32)]


In [None]:
# The layer returns the hidden state for each input time step, then separately, the hidden state output for the last
#  time step and the cell state for the last input time step.

# This can be confirmed by seeing that the last value in the returned sequences (first array) matches the value 
# in the hidden state (second array).
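
# As a compact recap, the four flag combinations can be tabulated by the shapes they return. The helper below is a
# hypothetical bookkeeping function (not a Keras API) that encodes the rule seen in the outputs above: the main output
# is (batch, timesteps, units) with return_sequences and (batch, units) without, and each state is (batch, units).

```python
def lstm_output_shapes(batch, timesteps, units,
                       return_sequences=False, return_state=False):
    """Shapes of the arrays an LSTM layer returns for the given flags."""
    if return_sequences:
        main = (batch, timesteps, units)   # one hidden state per time step
    else:
        main = (batch, units)              # last hidden state only
    if not return_state:
        return [main]
    state = (batch, units)
    return [main, state, state]            # output, state_h, state_c


# Same setting as the examples: 1 sample, 3 time steps, 1 unit.
for seq in (False, True):
    for st in (False, True):
        shapes = lstm_output_shapes(1, 3, 1, return_sequences=seq, return_state=st)
        print("return_sequences =", seq, "| return_state =", st, "->", shapes)
```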