## Intro
This notebook describes the difference between the `return_sequences` and `return_state` arguments of the LSTM/RNN layers in TensorFlow/Keras.

In [1]:
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input
from tensorflow.keras.layers import LSTM
import numpy as np


np.set_printoptions(suppress=True) # to suppress scientific notation while printing arrays

def reset_graph(seed=2):
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)

In [2]:
seq_len = 10
in_features = 3
batch_size = 2
units = 5

# define input data
data = np.random.normal(0,1, size=(batch_size, seq_len, in_features))
print('input shape is', data.shape)


input shape is (2, 10, 3)


In [3]:
reset_graph()

# define model
inputs1 = Input(shape=(seq_len, in_features))
lstm1 = LSTM(units)(inputs1)
model = Model(inputs=inputs1, outputs=lstm1)

# check output
output = model.predict(data)
print('output shape is ', output.shape)
print(output)



output shape is  (2, 5)
[[-0.11063188  0.05536821 -0.00112649 -0.04271765 -0.04566799]
 [-0.07557035  0.11102945 -0.25066054  0.05387768  0.0188199 ]]


### Return Sequences
With `return_sequences=True`, the layer returns the hidden state (which is also its output) at every time step, instead of only the final one. The output shape changes from `(batch_size, units)` to `(batch_size, seq_len, units)`.

In [4]:
reset_graph()

print('input shape is', data.shape)

# define model
inputs1 = Input(shape=(seq_len, in_features))
lstm1 = LSTM(units, return_sequences=True)(inputs1)
model = Model(inputs=inputs1, outputs=lstm1)

# check output
output = model.predict(data)
print('output shape is ', output.shape)
print(output)

input shape is (2, 10, 3)
output shape is  (2, 10, 5)
[[[-0.00959962  0.21499124 -0.17225736 -0.06007293 -0.01426327]
  [-0.13070984  0.21733947 -0.29199782 -0.02914739 -0.05185204]
  [ 0.00457804  0.12874249 -0.07751576 -0.10823897  0.12222017]
  [-0.03010699  0.2299158  -0.25902417 -0.11228839  0.04989259]
  [ 0.08941653  0.07683862 -0.04283267  0.01186768  0.15385148]
  [-0.17279187  0.20893084 -0.19236568 -0.04028462 -0.06045159]
  [-0.1421089   0.09406001 -0.05696344  0.06213297 -0.07928834]
  [-0.14963478  0.10163516 -0.10343038  0.07638221 -0.07256965]
  [-0.13566948  0.02508682  0.08233571  0.09342598 -0.12378266]
  [-0.11063188  0.05536821 -0.00112649 -0.04271765 -0.04566799]]

 [[ 0.05736056 -0.074391    0.19495797  0.08077516  0.06829052]
  [ 0.23908362 -0.10028643  0.1312296  -0.0465778   0.16892622]
  [ 0.04285454 -0.08158276  0.2188183   0.01525448  0.029679  ]
  [ 0.01975241 -0.00393771  0.12709902 -0.09848741 -0.05448593]
  [ 0.02327636 -0.02502736  0.13571821  0.041887
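
As a quick sanity check, the last time step of the sequence output should equal the output obtained without `return_sequences`, because `reset_graph()` re-seeds the weights identically in both cells. The sketch below copies the printed values for the first sample only (the second sample's printout is cut off above); the variable names `seq_sample0` and `final_sample0` are introduced here for illustration.

```python
import numpy as np

# Hidden states of the first sample at all 10 time steps,
# copied from the return_sequences=True printout above.
seq_sample0 = np.array([
    [-0.00959962, 0.21499124, -0.17225736, -0.06007293, -0.01426327],
    [-0.13070984, 0.21733947, -0.29199782, -0.02914739, -0.05185204],
    [0.00457804, 0.12874249, -0.07751576, -0.10823897, 0.12222017],
    [-0.03010699, 0.2299158, -0.25902417, -0.11228839, 0.04989259],
    [0.08941653, 0.07683862, -0.04283267, 0.01186768, 0.15385148],
    [-0.17279187, 0.20893084, -0.19236568, -0.04028462, -0.06045159],
    [-0.1421089, 0.09406001, -0.05696344, 0.06213297, -0.07928834],
    [-0.14963478, 0.10163516, -0.10343038, 0.07638221, -0.07256965],
    [-0.13566948, 0.02508682, 0.08233571, 0.09342598, -0.12378266],
    [-0.11063188, 0.05536821, -0.00112649, -0.04271765, -0.04566799],
])

# Output of the first sample from the cell without return_sequences.
final_sample0 = np.array(
    [-0.11063188, 0.05536821, -0.00112649, -0.04271765, -0.04566799]
)

print(seq_sample0.shape)                             # (10, 5)
print(np.allclose(seq_sample0[-1], final_sample0))   # True
```

On a full batch the same check would be `output[:, -1, :]` compared against the `(batch_size, units)` output.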

### Return State
With `return_state=True`, the layer returns three tensors: its usual output (the final hidden state), the final hidden state again, and the final cell state.

In [5]:
reset_graph()

# define model
inputs1 = Input(shape=(seq_len, in_features))
lstm1, state_h, state_c = LSTM(units, return_state=True)(inputs1)
model = Model(inputs=inputs1, outputs=[lstm1, state_h, state_c])

# check output
_h, h, c = model.predict(data)
print('_h: shape {} values \n {}\n'.format(_h.shape, _h))
print('h: shape {} values \n {}\n'.format(h.shape, h))
print('c: shape {} values \n {}'.format(c.shape, c))

_h: shape (2, 5) values 
 [[-0.11063188  0.05536821 -0.00112649 -0.04271765 -0.04566799]
 [-0.07557035  0.11102945 -0.25066054  0.05387768  0.0188199 ]]

h: shape (2, 5) values 
 [[-0.11063188  0.05536821 -0.00112649 -0.04271765 -0.04566799]
 [-0.07557035  0.11102945 -0.25066054  0.05387768  0.0188199 ]]

c: shape (2, 5) values 
 [[-0.18525022  0.09085277 -0.0023678  -0.09870709 -0.07825917]
 [-0.14414924  0.23718104 -0.4945711   0.10743085  0.04478739]]
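
The printed values show that `_h` and `h` are identical, while `c` differs. In a standard LSTM the two states are linked by `h = o * tanh(c)`, where `o` is the output gate, a sigmoid-like value between 0 and 1. As a rough check, assuming the layer's default `tanh` activation, the implied output gate can be recovered from the printed numbers. The name `o` is introduced here for illustration and is not something the layer returns.

```python
import numpy as np

# Final hidden state h and cell state c, copied from the printout above.
h = np.array([
    [-0.11063188, 0.05536821, -0.00112649, -0.04271765, -0.04566799],
    [-0.07557035, 0.11102945, -0.25066054, 0.05387768, 0.0188199],
])
c = np.array([
    [-0.18525022, 0.09085277, -0.0023678, -0.09870709, -0.07825917],
    [-0.14414924, 0.23718104, -0.4945711, 0.10743085, 0.04478739],
])

# Standard LSTM relation: h = o * tanh(c)  =>  o = h / tanh(c).
# Assumes the default tanh activation was used.
o = h / np.tanh(c)

print(o.round(3))
# Every implied gate value should lie strictly between 0 and 1.
print(np.all((o > 0) & (o < 1)))
```

This also explains why each entry of `h` has the same sign as the matching entry of `c` but a smaller magnitude.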


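### Using both at the same time
We can also use `return_sequences` and `return_state` together. The first output then holds the hidden state at every time step, while `state_h` and `state_c` are still the final states only.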

In [6]:
reset_graph()

# define model
inputs1 = Input(shape=(seq_len, in_features))
lstm1, state_h, state_c = LSTM(units, return_state=True, return_sequences=True)(inputs1)
model = Model(inputs=inputs1, outputs=[lstm1, state_h, state_c])

# check output
_h, h, c = model.predict(data)
print('_h: shape {} values \n {}\n'.format(_h.shape, _h))
print('h: shape {} values \n {}\n'.format(h.shape, h))
print('c: shape {} values \n {}'.format(c.shape, c))

_h: shape (2, 10, 5) values 
 [[[-0.00959962  0.21499124 -0.17225736 -0.06007293 -0.01426327]
  [-0.13070984  0.21733947 -0.29199782 -0.02914739 -0.05185204]
  [ 0.00457804  0.12874249 -0.07751576 -0.10823897  0.12222017]
  [-0.03010699  0.2299158  -0.25902417 -0.11228839  0.04989259]
  [ 0.08941653  0.07683862 -0.04283267  0.01186768  0.15385148]
  [-0.17279187  0.20893084 -0.19236568 -0.04028462 -0.06045159]
  [-0.1421089   0.09406001 -0.05696344  0.06213297 -0.07928834]
  [-0.14963478  0.10163516 -0.10343038  0.07638221 -0.07256965]
  [-0.13566948  0.02508682  0.08233571  0.09342598 -0.12378266]
  [-0.11063188  0.05536821 -0.00112649 -0.04271765 -0.04566799]]

 [[ 0.05736056 -0.074391    0.19495797  0.08077516  0.06829052]
  [ 0.23908362 -0.10028643  0.1312296  -0.0465778   0.16892622]
  [ 0.04285454 -0.08158276  0.2188183   0.01525448  0.029679  ]
  [ 0.01975241 -0.00393771  0.12709902 -0.09848741 -0.05448593]
  [ 0.02327636 -0.02502736  0.13571821  0.04188707 -0.00229754]
  [ 0.02
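
To see how the two flags relate, here is a toy LSTM forward pass in plain NumPy. It is a minimal sketch, not Keras's implementation: the function `lstm_forward`, the helper `sigmoid`, the random weights `W`, `U`, `b`, and the gate ordering are all assumptions made for illustration, so the numbers will not match the notebook's outputs. It only shows that both flags select different parts of the same computation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(x, W, U, b, return_sequences=False, return_state=False):
    """Toy LSTM forward pass (hypothetical helper, not the Keras code).

    x: (batch, seq_len, in_features)
    W: (in_features, 4 * units), U: (units, 4 * units), b: (4 * units,)
    """
    batch, seq_len, _ = x.shape
    units = U.shape[0]
    h = np.zeros((batch, units))  # hidden state, also the per-step output
    c = np.zeros((batch, units))  # cell state
    hs = []
    for t in range(seq_len):
        z = x[:, t, :] @ W + h @ U + b       # (batch, 4 * units)
        i, f, g, o = np.split(z, 4, axis=1)  # gate order chosen for this sketch
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        hs.append(h)
    # return_sequences picks all hidden states or only the last one.
    out = np.stack(hs, axis=1) if return_sequences else h
    # return_state appends the final hidden and cell states.
    return (out, h, c) if return_state else out

rng = np.random.default_rng(0)
batch_size, seq_len, in_features, units = 2, 10, 3, 5
x = rng.normal(size=(batch_size, seq_len, in_features))
W = rng.normal(scale=0.1, size=(in_features, 4 * units))
U = rng.normal(scale=0.1, size=(units, 4 * units))
b = np.zeros(4 * units)

seq, h, c = lstm_forward(x, W, U, b, return_sequences=True, return_state=True)
last = lstm_forward(x, W, U, b)

print(seq.shape, h.shape, c.shape)   # (2, 10, 5) (2, 5) (2, 5)
print(np.allclose(seq[:, -1, :], h), np.allclose(last, h))  # True True
```

The loop always computes every hidden state and the cell state; the flags only decide which of them are handed back to the caller.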

## Credits
This post is inspired by Jason Brownlee's [page](https://machinelearningmastery.com/return-sequences-and-return-states-for-lstms-in-keras/).