# Introduction
<hr style="border:2px solid black"> </hr>

<div class="alert alert-warning">
<font color=black>

**What?** How to use return_state or return_sequences in Keras?

</font>
</div>

# Import modules
<hr style="border:2px solid black"> </hr>

In [1]:
from keras.models import Model
from keras.layers import Input
from keras.layers import LSTM
from numpy import array
import keras

# Return sequences
<hr style="border:2px solid black"> </hr>

<div class="alert alert-info">
<font color=black>

- `return_sequences=True` returns the hidden state at every time step: a(t1), a(t2), ..., a(T).
- `return_sequences=False` is the **default** and returns only the last hidden state a(T).

- When to use which:
    - The **last hidden state** is a summary representation of the whole input sequence. For tasks such as classification and regression, this is all we need.
    - In other cases we need the full sequence as the output. One example is the encoder-decoder case.
    
</font>
</div>

In [5]:
# Constant initialiser so the results are reproducible
k_init = keras.initializers.Constant(value=0.1)
b_init = keras.initializers.Constant(value=0)
r_init = keras.initializers.Constant(value=0.1)

# LSTM units
units = 1

# Define the model input: 3 time steps, 2 features per step
inputs1 = Input(shape=(3, 2))

In [10]:
# Define the model
lstm1 = LSTM(units, return_sequences=True, kernel_initializer=k_init,
             bias_initializer=b_init, recurrent_initializer=r_init)(inputs1)
model = Model(inputs=inputs1, outputs=lstm1)

# Define input data
data = array([0.1, 0.2, 0.3, 0.1, 0.2, 0.3]).reshape((1, 3, 2))

# Make and show prediction
output = model.predict(data)
print(output, output.shape)

[[[0.00772376]
  [0.01633997]
  [0.02572775]]] (1, 3, 1)


In [11]:
# Define model
lstm1 = LSTM(units, kernel_initializer=k_init,
             bias_initializer=b_init, recurrent_initializer=r_init)(inputs1)
model = Model(inputs=inputs1, outputs=lstm1)

# Define input data
data = array([0.1, 0.2, 0.3, 0.1, 0.2, 0.3]).reshape((1, 3, 2))

# Make and show prediction
preds = model.predict(data)
print(preds, preds.shape)

[[0.02572775]] (1, 1)


<div class="alert alert-info">
<font color=black>

- With `return_sequences=True`, the output of the LSTM layer has shape `(1,3,1)`, which stands for `(#Samples, #Time steps, #LSTM units)`.
- With `return_sequences=False`, the output has shape `(#Samples, #LSTM units)`, here `(1,1)`, because only the hidden state of the last time step is returned.

</font>
</div>
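To see where these numbers come from, the recurrence can be written out by hand. The following is a minimal sketch in plain Python, not the Keras implementation: it assumes sigmoid gate activations, a `tanh` cell activation, zero initial states, and a forget-gate bias of 1 (Keras' default `unit_forget_bias=True`, which applies even though `bias_initializer` is zero). The helper names `sigmoid` and `lstm_single_unit` are made up for this illustration.

```python
import math


def sigmoid(x):
    """Logistic function, assumed here as the gate activation."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_single_unit(sequence, w=0.1, u=0.1, forget_bias=1.0):
    """Run a one-unit LSTM by hand.

    w is the value of every input (kernel) weight, u the value of every
    recurrent weight. forget_bias=1.0 is an assumption that mirrors the
    Keras default unit_forget_bias=True.
    """
    h, c = 0.0, 0.0  # initial hidden state and cell state
    hidden_states = []
    for x in sequence:
        # With constant weights, all four gates share one pre-activation;
        # only the forget gate gets the extra bias.
        z = w * sum(x) + u * h
        i = sigmoid(z)                # input gate
        f = sigmoid(z + forget_bias)  # forget gate
        o = sigmoid(z)                # output gate
        g = math.tanh(z)              # candidate cell value
        c = f * c + i * g             # new cell state
        h = o * math.tanh(c)          # new hidden state
        hidden_states.append(h)
    return hidden_states, h, c


# Same data as above: 3 time steps with 2 features each
sequence = [(0.1, 0.2), (0.3, 0.1), (0.2, 0.3)]
hidden_states, last_h, last_c = lstm_single_unit(sequence)

print(hidden_states)  # one hidden state per step, as with return_sequences=True
print(last_h)         # last hidden state only, as with return_sequences=False
```

Under these assumptions the printed values should come out close to the Keras predictions above, up to float32 rounding.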

# Return states
<hr style="border:2px solid black"> </hr>

<div class="alert alert-info">
<font color=black>

- In Keras we can output an RNN's last cell state in addition to its hidden states by setting `return_state=True`.
- `return_state` refers to returning the cell state `c<t>`. For GRU, as we discussed in the "RNN in a nutshell" section, `a<t>=c<t>`, so there is no separate cell state to return. For LSTM, however, the hidden state and the cell state are not the same.
- The output of the LSTM layer then has three components, `(a<T>, a<T>, c<T>)`, where "T" stands for the last time step: the layer output (which is the last hidden state), the last hidden state again, and the last cell state. Each one has the shape `(#Samples, #LSTM units)`.
- The main reason to set `return_state` is that an RNN may need its state initialized from the final state of another RNN, such as the decoder in an encoder-decoder model, which starts from the encoder's final hidden and cell states.
    
</font>
</div>
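The encoder-decoder use mentioned above can be sketched as follows. This is only an illustration of passing states between layers, not a complete sequence-to-sequence model: the layer size, the decoder length of 5 steps, and the all-zero dummy data are arbitrary choices made for this example.

```python
import numpy as np
from keras.models import Model
from keras.layers import Input, LSTM

units = 4  # hypothetical size, chosen only for illustration

# Encoder: discard the output and keep the final hidden and cell states
encoder_inputs = Input(shape=(3, 2))
_, enc_h, enc_c = LSTM(units, return_state=True)(encoder_inputs)

# Decoder: starts from the encoder's final states and returns
# one hidden state per time step
decoder_inputs = Input(shape=(5, 2))
decoder_outputs = LSTM(units, return_sequences=True)(
    decoder_inputs, initial_state=[enc_h, enc_c])

model = Model(inputs=[encoder_inputs, decoder_inputs], outputs=decoder_outputs)

# Dummy data: 1 sample, 3 encoder steps, 5 decoder steps, 2 features each
enc_data = np.zeros((1, 3, 2), dtype="float32")
dec_data = np.zeros((1, 5, 2), dtype="float32")
decoded = model.predict([enc_data, dec_data])
print(decoded.shape)  # (1, 5, 4): (#Samples, #Time steps, #LSTM units)
```

The encoder and decoder must use the same number of units here, because the encoder states are used directly as the decoder's initial states.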

In [12]:
# define model
inputs1 = Input(shape=(3, 2))
lstm1, state_h, state_c = LSTM(units, return_state=True, kernel_initializer=k_init,
                               bias_initializer=b_init, recurrent_initializer=r_init)(inputs1)
model = Model(inputs=inputs1, outputs=[lstm1, state_h, state_c])
# define input data
data = array([0.1, 0.2, 0.3, 0.1, 0.2, 0.3]).reshape((1,3,2))

# make and show prediction
output = model.predict(data)
print(output)
for a in output:
    print(a.shape) 

[array([[0.02572775]], dtype=float32), array([[0.02572775]], dtype=float32), array([[0.05020291]], dtype=float32)]
(1, 1)
(1, 1)
(1, 1)
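
The two flags can also be combined. As a small sketch using the same constant initialisers and input data as in the cells above, setting both `return_sequences=True` and `return_state=True` should give the full hidden-state sequence together with the last hidden state and the last cell state:

```python
from numpy import array
import keras
from keras.models import Model
from keras.layers import Input, LSTM

# Constant initialisers, as in the cells above
k_init = keras.initializers.Constant(value=0.1)
b_init = keras.initializers.Constant(value=0)
r_init = keras.initializers.Constant(value=0.1)

inputs = Input(shape=(3, 2))
seq, state_h, state_c = LSTM(1, return_sequences=True, return_state=True,
                             kernel_initializer=k_init,
                             bias_initializer=b_init,
                             recurrent_initializer=r_init)(inputs)
model = Model(inputs=inputs, outputs=[seq, state_h, state_c])

data = array([0.1, 0.2, 0.3, 0.1, 0.2, 0.3]).reshape((1, 3, 2))
seq_out, h_out, c_out = model.predict(data)

# Hidden-state sequence, last hidden state, last cell state
print(seq_out.shape, h_out.shape, c_out.shape)  # (1, 3, 1) (1, 1) (1, 1)
```

The last entry of the sequence output is the same value as the returned hidden state, since both are `a<T>`.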


# References
<hr style="border:2px solid black"> </hr>

<div class="alert alert-warning">
<font color=black>

- https://www.dlology.com/blog/how-to-use-return_state-or-return_sequences-in-keras/

</font>
</div>