<a href="https://colab.research.google.com/github/AtrCheema/Miscellaneous_DL_Tutorials/blob/master/return_sequences_vs_return_states.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Intro
This notebook describes the difference between the `return_sequences` and `return_state` arguments of the LSTM/RNN layers in TensorFlow/Keras.

In [1]:
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input
from tensorflow.keras.layers import LSTM
import numpy as np
from tensorflow.keras.layers import MaxPooling1D, Flatten, Conv1D


np.set_printoptions(suppress=True) # to suppress scientific notation while printing arrays

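def reset_graph(seed=313):
    """Reset the default graph and re-seed TensorFlow and NumPy so that every model below is built from the same seed."""
    tf.compat.v1.reset_default_graph()
    tf.compat.v1.set_random_seed(seed)
    np.random.seed(seed)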

tf.__version__

'2.2.0'

In [2]:
seq_len = 10
in_features = 3
batch_size = 2
units = 5

# define input data
data = np.random.normal(0,1, size=(batch_size, seq_len, in_features))
print('input shape is', data.shape)


input shape is (2, 10, 3)


In [3]:
reset_graph()

# define model
inputs1 = Input(shape=(seq_len, in_features))
lstm1 = LSTM(units)(inputs1)
model = Model(inputs=inputs1, outputs=lstm1)

# check output
output = model.predict(data)
print('output shape is ', output.shape)
print(output)

output shape is  (2, 5)
[[-0.22040175 -0.05103757  0.11528992  0.19536228  0.05018005]
 [-0.15136391 -0.04078249  0.24475926  0.20049164  0.06182506]]


### Return Sequence
With `return_sequences=True`, the layer returns the hidden state (which is also the layer's output) at every time step, instead of only the final one.

In [4]:
reset_graph()

print('input shape is', data.shape)

# define model
inputs1 = Input(shape=(seq_len, in_features))
lstm1 = LSTM(units, return_sequences=True)(inputs1)
model = Model(inputs=inputs1, outputs=lstm1)

# check output
output = model.predict(data)
print('output shape is ', output.shape)
print(output)

input shape is (2, 10, 3)
output shape is  (2, 10, 5)
[[[-0.24374452 -0.09573656  0.05246579  0.30608448 -0.00679237]
  [ 0.03395621  0.12980539  0.02035871  0.15129386  0.04364523]
  [ 0.13811205  0.21388356  0.02841143  0.02947732  0.18655203]
  [ 0.19264422  0.24333045  0.04939342  0.02225473  0.1564781 ]
  [-0.1532616  -0.07438491 -0.16188076  0.10787361 -0.05223484]
  [-0.287893   -0.11108701 -0.01680756  0.2710228  -0.04088374]
  [-0.07195782  0.02167981 -0.13415164  0.18494166 -0.19910014]
  [-0.08962998  0.00878443  0.01223693  0.07652631  0.0050485 ]
  [-0.27628472 -0.07134081  0.07844858  0.2543428   0.00744304]
  [-0.22040175 -0.05103757  0.11528992  0.19536228  0.05018005]]

 [[ 0.08307333  0.05767366  0.11010146 -0.05327794  0.23188937]
  [ 0.18589358  0.09882817  0.14136802  0.11742888  0.19913538]
  [ 0.05701723  0.0274472   0.22920178  0.18271996  0.189026  ]
  [-0.01641063 -0.06463264  0.13454576  0.24447107  0.09633702]
  [-0.01751786 -0.02926924  0.16121973  0.143784
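
A quick check, using only values printed above for the first sample, suggests that the last time step of the `return_sequences=True` output is the same vector the default LSTM returned earlier. The arrays are copied from the printouts (only the last two time steps are included), and the variable names are illustrative rather than part of the notebook.

```python
import numpy as np

# Output of the default LSTM for the first sample (copied from the first printout).
final_output = np.array([-0.22040175, -0.05103757, 0.11528992, 0.19536228, 0.05018005])

# Last two time steps of the return_sequences=True output for the same sample.
sequence_tail = np.array([
    [-0.27628472, -0.07134081, 0.07844858, 0.2543428, 0.00744304],   # time step 9
    [-0.22040175, -0.05103757, 0.11528992, 0.19536228, 0.05018005],  # time step 10
])

# Indexing the last step along the time axis recovers the default output.
last_step_matches = np.allclose(sequence_tail[-1], final_output)
print(last_step_matches)  # True
```

With the full array returned by `model.predict`, the same slice would be `output[:, -1, :]`.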

### Return States
With `return_state=True`, the layer returns three arrays: its usual output (here the final hidden state), the final hidden state again, and the final cell state.

In [5]:
reset_graph()

# define model
inputs1 = Input(shape=(seq_len, in_features))
lstm1, state_h, state_c = LSTM(units, return_state=True)(inputs1)
model = Model(inputs=inputs1, outputs=[lstm1, state_h, state_c])

# check output
_h, h, c = model.predict(data)
print('_h: shape {} values \n {}\n'.format(_h.shape, _h))
print('h: shape {} values \n {}\n'.format(h.shape, h))
print('c: shape {} values \n {}'.format(c.shape, c))

_h: shape (2, 5) values 
 [[-0.22040175 -0.05103757  0.11528992  0.19536228  0.05018005]
 [-0.15136391 -0.04078249  0.24475926  0.20049164  0.06182506]]

h: shape (2, 5) values 
 [[-0.22040175 -0.05103757  0.11528992  0.19536228  0.05018005]
 [-0.15136391 -0.04078249  0.24475926  0.20049164  0.06182506]]

c: shape (2, 5) values 
 [[-0.44137242 -0.10972145  0.27500203  0.50635266  0.09567016]
 [-0.3063129  -0.13716078  0.42285413  0.5800175   0.26215786]]
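
To see where the hidden state `h` and the cell state `c` come from, here is a minimal NumPy sketch of an LSTM forward pass. It is a simplified illustration, not the Keras implementation: the weights are random, the gate order (input, forget, candidate, output) is an assumption, and the names `lstm_forward`, `kernel`, `recurrent_kernel` and `bias` are chosen for this example, so the numbers will not match the ones printed above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(x, kernel, recurrent_kernel, bias):
    """Run a plain LSTM over x with shape (batch, seq_len, in_features).

    Returns (sequences, h, c): the hidden state at every time step, plus the
    final hidden state and the final cell state.
    """
    batch, seq_len, _ = x.shape
    units = recurrent_kernel.shape[0]
    h = np.zeros((batch, units))
    c = np.zeros((batch, units))
    sequences = []
    for t in range(seq_len):
        z = x[:, t, :] @ kernel + h @ recurrent_kernel + bias
        # Assumed gate order: input, forget, candidate, output.
        i, f, g, o = np.split(z, 4, axis=1)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # update the cell state
        h = sigmoid(o) * np.tanh(c)                   # hidden state = gated cell state
        sequences.append(h)
    return np.stack(sequences, axis=1), h, c

rng = np.random.default_rng(313)
batch_size, seq_len, in_features, units = 2, 10, 3, 5
x = rng.normal(size=(batch_size, seq_len, in_features))
kernel = rng.normal(scale=0.5, size=(in_features, 4 * units))
recurrent_kernel = rng.normal(scale=0.5, size=(units, 4 * units))
bias = np.zeros(4 * units)

sequences, h, c = lstm_forward(x, kernel, recurrent_kernel, bias)
print(sequences.shape, h.shape, c.shape)  # (2, 10, 5) (2, 5) (2, 5)
```

Because `h` is computed as an output gate (between 0 and 1) times `tanh(c)`, each entry of `h` has the same sign as the matching entry of `c` and a smaller magnitude, which is consistent with the `h` and `c` values printed above.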


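## Using both at the same time
We can also use `return_sequences` and `return_state` together. The first output then holds the hidden state at every time step, while `h` and `c` are still the final hidden and cell states.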

In [6]:
reset_graph()

# define model
inputs1 = Input(shape=(seq_len, in_features))
lstm1, state_h, state_c = LSTM(units, return_state=True, return_sequences=True)(inputs1)
model = Model(inputs=inputs1, outputs=[lstm1, state_h, state_c])

# check output
_h, h, c = model.predict(data)
print('_h: shape {} values \n {}\n'.format(_h.shape, _h))
print('h: shape {} values \n {}\n'.format(h.shape, h))
print('c: shape {} values \n {}'.format(c.shape, c))

_h: shape (2, 10, 5) values 
 [[[-0.24374452 -0.09573656  0.05246579  0.30608448 -0.00679237]
  [ 0.03395621  0.12980539  0.02035871  0.15129386  0.04364523]
  [ 0.13811205  0.21388356  0.02841143  0.02947732  0.18655203]
  [ 0.19264422  0.24333045  0.04939342  0.02225473  0.1564781 ]
  [-0.1532616  -0.07438491 -0.16188076  0.10787361 -0.05223484]
  [-0.287893   -0.11108701 -0.01680756  0.2710228  -0.04088374]
  [-0.07195782  0.02167981 -0.13415164  0.18494166 -0.19910014]
  [-0.08962998  0.00878443  0.01223693  0.07652631  0.0050485 ]
  [-0.27628472 -0.07134081  0.07844858  0.2543428   0.00744304]
  [-0.22040175 -0.05103757  0.11528992  0.19536228  0.05018005]]

 [[ 0.08307333  0.05767366  0.11010146 -0.05327794  0.23188937]
  [ 0.18589358  0.09882817  0.14136802  0.11742888  0.19913538]
  [ 0.05701723  0.0274472   0.22920178  0.18271996  0.189026  ]
  [-0.01641063 -0.06463264  0.13454576  0.24447107  0.09633702]
  [-0.01751786 -0.02926924  0.16121973  0.14378461  0.15671824]
  [-0.23
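
The four combinations of the two flags can be summarised by the shapes they return. The helper below is hypothetical (it is not a Keras function) and only encodes the shapes observed in the outputs of this notebook.

```python
def lstm_output_shapes(batch_size, seq_len, units,
                       return_sequences=False, return_state=False):
    """Hypothetical helper: shapes of the arrays an LSTM layer returns."""
    if return_sequences:
        main = (batch_size, seq_len, units)  # hidden state at every time step
    else:
        main = (batch_size, units)           # final hidden state only
    if not return_state:
        return [main]
    state = (batch_size, units)
    return [main, state, state]  # output, final hidden state h, final cell state c

for seqs in (False, True):
    for state in (False, True):
        print(seqs, state, lstm_output_shapes(2, 10, 5, seqs, state))
# False False [(2, 5)]
# False True [(2, 5), (2, 5), (2, 5)]
# True False [(2, 10, 5)]
# True True [(2, 10, 5), (2, 5), (2, 5)]
```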

## LSTM to 1D CNN

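We can place a 1D CNN after the LSTM to extract further features from the LSTM output. The LSTM must use `return_sequences=True` so that `Conv1D` receives a 3D input of shape `(batch, seq_len, units)`.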

In [7]:
reset_graph()

print('input shape is', data.shape)

# define model
inputs = Input(shape=(seq_len, in_features))
lstm_layer = LSTM(units, return_sequences=True)
lstm_outputs = lstm_layer(inputs)
print('lstm output: ', lstm_outputs.shape)

conv1 = Conv1D(filters=64, kernel_size=2, activation='relu')(lstm_outputs)
print('conv output: ', conv1.shape)

max1d1 = MaxPooling1D(pool_size=2)(conv1)
print('max pool output: ', max1d1.shape)

flat1 = Flatten()(max1d1)
print('flatten output: ', flat1.shape)

model = Model(inputs=inputs, outputs=flat1)

# check output
output = model.predict(data)
print('output shape: ', output.shape)

input shape is (2, 10, 3)
lstm output:  (None, 10, 5)
conv output:  (None, 9, 64)
max pool output:  (None, 4, 64)
flatten output:  (None, 256)
output shape:  (2, 256)
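
The shapes printed above can be reproduced by hand. The sketch below assumes the default settings used in the cell (`padding='valid'`, a convolution stride of 1, and a pooling stride equal to `pool_size`); the helper names are illustrative and not part of Keras.

```python
def conv1d_valid_length(length, kernel_size, stride=1):
    """Output length of a 1D convolution with 'valid' padding."""
    return (length - kernel_size) // stride + 1

def maxpool1d_valid_length(length, pool_size, stride=None):
    """Output length of 1D max pooling with 'valid' padding."""
    stride = pool_size if stride is None else stride
    return (length - pool_size) // stride + 1

seq_len, filters = 10, 64
conv_len = conv1d_valid_length(seq_len, kernel_size=2)    # 10 - 2 + 1 = 9
pool_len = maxpool1d_valid_length(conv_len, pool_size=2)  # (9 - 2) // 2 + 1 = 4
flat_size = pool_len * filters                            # 4 * 64 = 256
print(conv_len, pool_len, flat_size)  # 9 4 256
```

Note that the number of LSTM `units` does not affect these lengths: it is the channel dimension consumed by `Conv1D`, whose output has `filters` channels instead.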


## Credits
This post is inspired by Jason Brownlee's [page](https://machinelearningmastery.com/return-sequences-and-return-states-for-lstms-in-keras/).