# LSTM_Autoencoders_with_Keras

## 1.0 Autoencoders?
An **Autoencoder** is a type of artificial neural network that learns a compressed representation of its input (see [Wiki](https://en.wikipedia.org/wiki/Autoencoder)). An LSTM autoencoder applies this idea to sequence data using an Encoder-Decoder LSTM architecture. Once the model is trained, the encoder part compresses an input sequence into a fixed-length vector, which can be used as a feature vector for other supervised learning algorithms or for data visualization.

The length of the sequences in a sequence prediction problem can vary from one sample to the next, but ANNs are usually designed to work with fixed-length feature vectors. Encoders therefore play an important role in such scenarios: they encode sequences of varying length into a fixed-length representation, making them suitable as input for ANN models.

## 2.0 Encoder-Decoder LSTM Model

In [1]:
import numpy as np
from numpy import array

np.arange(36).reshape(1, 36, 1)

array([[[ 0],
        [ 1],
        [ 2],
        [ 3],
        [ 4],
        [ 5],
        [ 6],
        [ 7],
        [ 8],
        [ 9],
        [10],
        [11],
        [12],
        [13],
        [14],
        [15],
        [16],
        [17],
        [18],
        [19],
        [20],
        [21],
        [22],
        [23],
        [24],
        [25],
        [26],
        [27],
        [28],
        [29],
        [30],
        [31],
        [32],
        [33],
        [34],
        [35]]])

#### NOTE: 
The shape (x, y, z) means x arrays, each containing y rows by z columns. That is, (1, 36, 1) means a single column vector (1 column) with 36 rows, as displayed above.
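
As a small illustration of this reading of a shape tuple, the sketch below reshapes the same 36 numbers in two ways. The variable names `a` and `b` and the (4, 3, 3) shape are chosen only for this example.

```python
import numpy as np

a = np.arange(36).reshape(1, 36, 1)   # 1 array of 36 rows by 1 column
b = np.arange(36).reshape(4, 3, 3)    # 4 arrays of 3 rows by 3 columns

print(a.shape, a.ndim)   # shape tuple and number of axes
print(b[0])              # the first of the four 3x3 arrays
print(a.size == b.size)  # reshaping never changes the element count
```

Both arrays hold the same 36 values; only the way they are grouped along the three axes differs.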

## 3.0 Reconstruction: Encoder-Decoder LSTM
This is the simplest LSTM autoencoder: it just learns to reconstruct its input sequence.

### 3.1 The Input Sequence to Reconstruct

In [2]:
# define input sequence
sequence = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
print(type(sequence))
# number of timesteps, used later to reshape the input into [samples, timesteps, features]
n_in = len(sequence)
sequence

<class 'numpy.ndarray'>


array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])

### 3.2 Reshape for LSTM
A Keras LSTM layer always expects a 3D array as input, with shape **(batch_size, timesteps, input_dim)**. In this simple example the batch contains a single sample, the number of timesteps is 9 (i.e., 0.1 through 0.9), and each timestep has only 1 feature (i.e., the single number 0.1, 0.2, and so on). Hence we reshape the input to (1, 9, 1), i.e., a **3D array as presented in the following:**

In [3]:
sequence = sequence.reshape((1, n_in, 1))
sequence

array([[[0.1],
        [0.2],
        [0.3],
        [0.4],
        [0.5],
        [0.6],
        [0.7],
        [0.8],
        [0.9]]])
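
The same reshape extends to more than one sample. The following is a minimal sketch, assuming three hypothetical univariate sequences of equal length (the data and the names `sequences`, `data` and `batch` are illustrative and not part of the original example):

```python
import numpy as np

# Three hypothetical univariate sequences of equal length (illustrative data only)
sequences = [
    [0.1, 0.2, 0.3, 0.4],
    [0.2, 0.3, 0.4, 0.5],
    [0.3, 0.4, 0.5, 0.6],
]

data = np.array(sequences)                          # shape (3, 4): samples x timesteps
n_samples, n_timesteps = data.shape
batch = data.reshape((n_samples, n_timesteps, 1))   # add the feature axis

print(batch.shape)
print(batch[0, :, 0])  # first sample, all timesteps, the single feature
```

Here the first axis indexes the samples, the second the timesteps, and the third the single feature per timestep.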

### 3.3 Define Model

#### Useful Links: Understanding LSTM Setup in Keras
1. https://stackoverflow.com/questions/38714959/understanding-keras-lstms#
2. https://medium.com/@shivajbd/understanding-input-and-output-shape-in-lstm-keras-c501ee95c65e
3. (Keep Standalone LSTM Encoder) https://machinelearningmastery.com/lstm-autoencoders/

In [4]:
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.utils import plot_model
from keras.models import Model

# define model
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_in,1)))
model.add(RepeatVector(n_in))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
model.summary()

Using TensorFlow backend.


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_1 (LSTM)                (None, 100)               40800     
_________________________________________________________________
repeat_vector_1 (RepeatVecto (None, 9, 100)            0         
_________________________________________________________________
lstm_2 (LSTM)                (None, 9, 100)            80400     
_________________________________________________________________
time_distributed_1 (TimeDist (None, 9, 1)              101       
Total params: 121,301
Trainable params: 121,301
Non-trainable params: 0
_________________________________________________________________
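
The **Param #** column of the summary can be checked by hand. The sketch below assumes the standard LSTM parameterization (four gates, each with input weights, recurrent weights and a bias); the helper names `lstm_params` and `dense_params` are made up for this illustration and are not Keras functions.

```python
def lstm_params(units, input_dim):
    """Weights of one LSTM layer: 4 gates, each with input, recurrent and bias terms."""
    return 4 * (units * input_dim + units * units + units)


def dense_params(units, input_dim):
    """Weights of a Dense layer: one weight per input per unit, plus one bias per unit."""
    return units * input_dim + units


encoder = lstm_params(units=100, input_dim=1)     # lstm_1 reads 1 feature per timestep
decoder = lstm_params(units=100, input_dim=100)   # lstm_2 reads the repeated 100-dim vector
output = dense_params(units=1, input_dim=100)     # Dense(1) inside TimeDistributed

print(encoder, decoder, output, encoder + decoder + output)
# prints: 40800 80400 101 121301
```

The three counts and their total match the summary above. RepeatVector contributes nothing because it has no weights.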


### 3.4 Note: Understanding the Model Shapes and Structures

#### 3.4.1 Layer 01 (LSTM)

`model.add(LSTM(100, activation='relu', input_shape=(n_in,1)))`

**>>Input:** The first layer of the model takes the 3D array reshaped above as its input. The batch dimension is not part of `input_shape`: Keras leaves it as **None** (any batch size), so `input_shape=(n_in,1)` corresponds to a full input shape of `(None, n_in, 1)`. That is, **batch = Unknown/None, timesteps = 9, and feature dimension per timestep = 1**. This first LSTM layer contains 100 units.

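**>>Output:** The output shape is (None, 100). Because `return_sequences` defaults to `False`, the layer returns only its final hidden state: one 100-dimensional vector per input sequence. We can use the following sample code to examine the output layer by layer. For example, the output of the first layer can be found as follows: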

In [9]:
# lstm autoencoder recreate sequence
from numpy import array
from keras.models import Sequential
from keras.models import Model
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.utils import plot_model

sequence = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])

n_in = len(sequence)
sequence = sequence.reshape((1, n_in, 1))

model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_in,1)))
model.add(RepeatVector(n_in))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(sequence, sequence, epochs=300, verbose=0)

################################
# Output of Layer 0
################################
model = Model(inputs=model.inputs, outputs=model.layers[0].output)

# get the feature vector for the input sequence
yhat = model.predict(sequence)
print(yhat.shape)
print(yhat)

(1, 100)
[[0.         0.08526425 0.         0.08668269 0.         0.
  0.         0.08373296 0.06473152 0.         0.10590955 0.13052945
  0.00632676 0.09575267 0.         0.         0.         0.
  0.         0.         0.         0.09235387 0.         0.03934644
  0.03931042 0.         0.05035766 0.         0.         0.
  0.         0.0008235  0.         0.         0.10211127 0.10406243
  0.         0.         0.         0.10208047 0.02607506 0.
  0.04599658 0.06028002 0.         0.06495491 0.04745879 0.06214409
  0.         0.         0.         0.09650227 0.05514612 0.
  0.         0.04339869 0.0507921  0.10846753 0.         0.
  0.02928454 0.         0.03709463 0.06268398 0.11128254 0.
  0.08007399 0.         0.         0.         0.         0.0214268
  0.         0.         0.05982703 0.06501038 0.06495484 0.08794844
  0.08229375 0.         0.         0.00263113 0.10752204 0.
  0.         0.         0.10252966 0.         0.         0.
  0.         0.         0.         0.       
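
To see where the fixed-length output comes from, the following is a rough NumPy sketch of a single LSTM layer's forward pass. It is not the Keras implementation: the weights are random rather than trained, the batch axis is omitted, the gate order (input, forget, candidate, output) is an assumption of this sketch, and `lstm_forward` and `sigmoid` are hypothetical helper names.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_forward(x, W, U, b, units):
    """Run one LSTM layer over x of shape (timesteps, features).

    W: (features, 4*units), U: (units, 4*units), b: (4*units,).
    Returns the hidden state at every timestep, shape (timesteps, units).
    """
    h = np.zeros(units)   # hidden state
    c = np.zeros(units)   # cell state
    states = []
    for x_t in x:
        z = x_t @ W + h @ U + b
        i, f, g, o = np.split(z, 4)          # assumed gate order: i, f, candidate, o
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.maximum(g, 0.0)               # relu on the candidate values
        c = f * c + i * g
        h = o * np.maximum(c, 0.0)           # relu on the cell state
        states.append(h)
    return np.stack(states)


rng = np.random.default_rng(0)
units, features = 100, 1
W = rng.normal(scale=0.1, size=(features, 4 * units))   # random, untrained weights
U = rng.normal(scale=0.1, size=(units, 4 * units))
b = np.zeros(4 * units)

x = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]).reshape(9, 1)

all_states = lstm_forward(x, W, U, b, units)   # analogous to return_sequences=True
last_state = all_states[-1]                    # analogous to return_sequences=False
short_state = lstm_forward(x[:5], W, U, b, units)[-1]

print(all_states.shape, last_state.shape, short_state.shape)
```

The final hidden state has 100 entries whether the input has 9 timesteps or only 5, which is what makes it usable as a fixed-length encoding. Since the hidden state is a sigmoid gate multiplied by a ReLU output, its entries can never be negative, and any unit whose cell state is not positive yields exactly 0, consistent with the many zeros in the printed output above.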

#### 3.4.2 Layer 02 (RepeatVector)

As noted above, the output of the previous layer is a 2D array, i.e., (1, 100). However, the next layer is another LSTM, which accepts only a 3D array as input. We therefore use a **RepeatVector** layer, which simply duplicates the output vector `n_in` = 9 times, giving a (1, 9, 100) array that is suitable as input for the next LSTM. Note that there is nothing for the network to learn at this step, hence the **Param #** in the model summary is 0. Because each cell retrains the model from a fresh random initialization, the values printed below differ from those shown for Layer 01. The output of this RepeatVector layer is as follows:

In [10]:
# lstm autoencoder recreate sequence
from numpy import array
from keras.models import Sequential
from keras.models import Model
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.utils import plot_model

sequence = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])

n_in = len(sequence)
sequence = sequence.reshape((1, n_in, 1))

model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_in,1)))
model.add(RepeatVector(n_in))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(sequence, sequence, epochs=300, verbose=0)

################################
# Output of Layer 1
################################
model = Model(inputs=model.inputs, outputs=model.layers[1].output)

# get the output of the RepeatVector layer for the input sequence
yhat = model.predict(sequence)
print(yhat.shape)
print(yhat)

(1, 9, 100)
[[[0.         0.         0.06829739 0.09849601 0.11844321 0.
   0.10648596 0.         0.         0.06946453 0.         0.00135783
   0.04175166 0.         0.         0.10055738 0.10648762 0.
   0.         0.         0.         0.07734827 0.         0.
   0.         0.07906882 0.         0.         0.         0.02494789
   0.         0.09046537 0.05006797 0.         0.02365715 0.
   0.         0.         0.         0.         0.         0.
   0.13598846 0.         0.05468839 0.04727116 0.         0.
   0.         0.03248655 0.05415042 0.         0.         0.
   0.         0.00435075 0.09720369 0.         0.         0.
   0.08369314 0.         0.         0.05793652 0.         0.08219028
   0.         0.         0.01177892 0.10881354 0.         0.
   0.         0.06474225 0.05809931 0.         0.         0.02664913
   0.01288957 0.         0.03107853 0.         0.         0.
   0.13248082 0.         0.         0.         0.         0.
   0.         0.10522534 0.         0.074
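
The effect of RepeatVector can be mimicked in plain NumPy. This is only a sketch of the idea, with a hypothetical helper `repeat_vector` and a made-up 5-dimensional vector standing in for the 100-dimensional encoder output:

```python
import numpy as np


def repeat_vector(x, n):
    """Repeat each row of x, shape (batch, features), n times along a new time axis."""
    return np.repeat(x[:, np.newaxis, :], n, axis=1)


encoded = np.arange(5, dtype=float).reshape(1, 5)   # hypothetical (1, 5) encoder output
repeated = repeat_vector(encoded, 9)

print(repeated.shape)                                   # (1, 9, 5)
print(np.array_equal(repeated[0, 0], repeated[0, 8]))   # True: every timestep is a copy
```

Every timestep of the result holds the same vector, so the decoder LSTM receives the identical encoding at each step.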

#### 3.4.3 Layer 03 (LSTM)
This is another LSTM layer with 100 units. The RepeatVector layer has already set up its 3D input. With the default settings, this layer would return only its final hidden state, with shape (1, 100). Here, however, `return_sequences=True` makes it return the hidden state at each of the 9 timesteps, hence the output shape of this layer is (1, 9, 100). Example in the following:

In [11]:
# lstm autoencoder recreate sequence
from numpy import array
from keras.models import Sequential
from keras.models import Model
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.utils import plot_model

sequence = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])

n_in = len(sequence)
sequence = sequence.reshape((1, n_in, 1))

model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_in,1)))
model.add(RepeatVector(n_in))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(sequence, sequence, epochs=300, verbose=0)

################################
# Output of Third Layer 
################################
model = Model(inputs=model.inputs, outputs=model.layers[2].output)

# get the output of the second LSTM layer for the input sequence
yhat = model.predict(sequence)
print(yhat.shape)
print(yhat)

(1, 9, 100)
[[[4.64976057e-02 2.06131376e-02 5.08040003e-02 7.50926556e-03
   0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
   0.00000000e+00 2.83644926e-02 0.00000000e+00 0.00000000e+00
   0.00000000e+00 0.00000000e+00 1.83519516e-02 1.23489164e-02
   2.26073666e-03 0.00000000e+00 2.92143710e-02 2.48436835e-02
   3.61590311e-02 0.00000000e+00 0.00000000e+00 0.00000000e+00
   0.00000000e+00 0.00000000e+00 0.00000000e+00 1.79621559e-02
   0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
   0.00000000e+00 0.00000000e+00 0.00000000e+00 2.65220553e-03
   0.00000000e+00 4.07699011e-02 0.00000000e+00 0.00000000e+00
   0.00000000e+00 3.89819629e-02 4.06789966e-02 0.00000000e+00
   0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
   0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
   0.00000000e+00 4.01959606e-02 0.00000000e+00 0.00000000e+00
   0.00000000e+00 3.23795229e-02 0.00000000e+00 0.00000000e+00
   0.00000000e+00 0.00000000e+00 0.00000000

#### 3.4.4 Layer 04 (TimeDistributed, the Last Layer)
This layer wraps a single node, `Dense(1)`, that takes all 100 outputs of the previous layer at one timestep as its input. **TimeDistributed** applies this same node, with the same weights, to each of the 9 timesteps, so it predicts a single number per timestep and generates an output sequence of 9 values with shape (1, 9, 1). Example:

In [12]:
# lstm autoencoder recreate sequence
from numpy import array
from keras.models import Sequential
from keras.models import Model
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.utils import plot_model

sequence = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])

n_in = len(sequence)
sequence = sequence.reshape((1, n_in, 1))

model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_in,1)))
model.add(RepeatVector(n_in))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(sequence, sequence, epochs=300, verbose=0)

################################
# Output of Fourth Layer
################################
model = Model(inputs=model.inputs, outputs=model.layers[3].output)

# get the output of the TimeDistributed layer for the input sequence
yhat = model.predict(sequence)
print(yhat.shape)
print(yhat)

(1, 9, 1)
[[[0.10291987]
  [0.19771233]
  [0.2986349 ]
  [0.4012162 ]
  [0.5014176 ]
  [0.60020894]
  [0.69911045]
  [0.79914594]
  [0.900632  ]]]
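
What TimeDistributed does with `Dense(1)` can be sketched in NumPy as one shared weight vector applied at every timestep. The arrays below are random stand-ins, not the trained weights, and the names `decoded`, `w` and `bias` are chosen only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
decoded = rng.random((1, 9, 100))    # stand-in for the (1, 9, 100) decoder output
w = rng.normal(size=(100, 1))        # Dense(1) kernel: 100 weights
bias = np.zeros(1)                   # Dense(1) bias: 1 weight

# The same (w, bias) pair is applied independently at every timestep.
per_step = np.stack([decoded[0, t] @ w + bias for t in range(9)])[np.newaxis]

# Equivalent single matrix product over the last axis.
vectorised = decoded @ w + bias

print(per_step.shape, vectorised.shape)
print(np.allclose(per_step, vectorised))
```

Both routes give a (1, 9, 1) array. The weights are shared across timesteps, which is why the layer has only 101 parameters regardless of the sequence length.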


### 3.5 Reproduce the Input Sequence
The full code for reproducing the input sequence:

In [13]:
# lstm autoencoder recreate sequence
from numpy import array
from keras.models import Sequential
from keras.models import Model
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.utils import plot_model

sequence = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])

n_in = len(sequence)
sequence = sequence.reshape((1, n_in, 1))

model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_in,1)))
model.add(RepeatVector(n_in))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(sequence, sequence, epochs=300, verbose=0)


# reconstruct the input sequence with the full autoencoder
yhat = model.predict(sequence)
print(yhat.shape)
print(yhat)

(1, 9, 1)
[[[0.10714255]
  [0.20551506]
  [0.30306545]
  [0.40028146]
  [0.49773994]
  [0.596095  ]
  [0.6960797 ]
  [0.7986292 ]
  [0.90475166]]]
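
One simple way to judge the reconstruction is to compare it with the input directly. The sketch below copies the predictions printed above and computes the mean squared error (the same quantity as the `mse` training loss) and the largest absolute error; the variable names are illustrative.

```python
import numpy as np

target = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
# Predictions copied from the output printed above
predicted = np.array([0.10714255, 0.20551506, 0.30306545, 0.40028146, 0.49773994,
                      0.596095, 0.6960797, 0.7986292, 0.90475166])

errors = predicted - target
mse = np.mean(errors ** 2)
max_abs = np.max(np.abs(errors))

print(f"MSE: {mse:.2e}, largest absolute error: {max_abs:.4f}")
```

For this run the error at every timestep stays below 0.01. A different run will give slightly different numbers, because the model starts from a new random initialization each time it is trained.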


## 4.0 Keeping the Standalone Encoder
As discussed above, we can use the output of the first LSTM layer as the encoding of the input. That is, we train the whole model and afterwards use only the output of the first LSTM layer as a standalone encoder. An example of such an encoder follows:

In [14]:
# lstm autoencoder: keep the standalone encoder
from numpy import array
from keras.models import Sequential
from keras.models import Model
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.utils import plot_model
# define input sequence
sequence = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
# reshape input into [samples, timesteps, features]
n_in = len(sequence)
sequence = sequence.reshape((1, n_in, 1))
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_in,1)))
model.add(RepeatVector(n_in))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(sequence, sequence, epochs=300, verbose=0)
# connect the encoder LSTM as the output layer
model = Model(inputs=model.inputs, outputs=model.layers[0].output)

# get the feature vector for the input sequence
yhat = model.predict(sequence)
print(yhat.shape)
print(yhat)

(1, 100)
[[0.         0.         0.05114636 0.         0.03163566 0.
  0.10090625 0.08232216 0.         0.06240095 0.         0.
  0.13956766 0.         0.02350081 0.         0.0130856  0.
  0.0372422  0.         0.         0.05535569 0.06782397 0.
  0.         0.1143057  0.         0.         0.08879374 0.07698841
  0.         0.         0.04171373 0.         0.         0.03945698
  0.09528719 0.         0.06141445 0.10190405 0.         0.
  0.01744407 0.         0.         0.02750597 0.05886816 0.
  0.06366922 0.         0.07811875 0.         0.09518047 0.
  0.11582238 0.         0.00261876 0.         0.05623443 0.08539794
  0.07995928 0.03876071 0.12081884 0.         0.         0.04304748
  0.         0.05891561 0.09684899 0.         0.04247667 0.05588685
  0.         0.04837837 0.         0.         0.         0.
  0.         0.         0.         0.06170175 0.11355913 0.
  0.10680667 0.         0.06447414 0.0725139  0.         0.0655063
  0.07715986 0.         0.         0.0627669