<a href="https://colab.research.google.com/github/cagBRT/timeSeries/blob/main/11a_LSTMs.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!git clone -l -s https://github.com/cagBRT/timeSeries.git cloned-repo
%cd cloned-repo

In [None]:
from IPython.display import Image
def page(num):
    return Image("LSTM"+str(num)+ ".png" , width=640)

# **Recurrent Neural Networks (RNNs)**

Humans don’t start their thinking from scratch every second. As you read this sentence, you understand each word based on your understanding of previous words. Your thoughts have persistence.<br>
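It therefore makes sense to build deep neural networks (DNNs) that have persistence too. Recurrent neural networks do this with a loop that carries information from one step to the next.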

RNNs can connect information from the very recent past to the present task, for example using the previous frames of a video to work out what is happening in the current frame.<br>

In [None]:
page(1)

When we unroll an RNN, we see a chain of copies of the same cell, one per time step, each passing its state on to the next.

In [None]:
page(2)
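
To make the unrolling concrete, here is a minimal NumPy sketch of a plain RNN forward pass. The weight names (`W_x`, `W_h`, `b`), the layer sizes, and the random initialization are illustrative assumptions and are not taken from the figures. The point is only that the same weights are reused at every time step while the hidden state carries information forward.

```python
import numpy as np

def rnn_forward(inputs, W_x, W_h, b):
    """Run a plain RNN over a sequence and return the hidden state at every step."""
    h = np.zeros(W_h.shape[0])            # initial hidden state
    states = []
    for x_t in inputs:                    # one iteration per unrolled cell
        # the same weights are applied at every time step
        h = np.tanh(W_x @ x_t + W_h @ h + b)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
n_steps, n_features, n_hidden = 5, 1, 4   # illustrative sizes (assumed)
inputs = rng.normal(size=(n_steps, n_features))
W_x = rng.normal(size=(n_hidden, n_features))
W_h = rng.normal(size=(n_hidden, n_hidden))
b = np.zeros(n_hidden)

states = rnn_forward(inputs, W_x, W_h, b)
print(states.shape)  # (5, 4): one hidden state per time step
```

Each row of `states` corresponds to one cell in the unrolled diagram.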

RNNs can learn to predict the very next piece of information, as long as the gap between context and the prediction is small. <br>
For example:<br>
If we had a sentence, "The car is on the ????"<br>
It would be fairly straightforward to predict "road". The gap between context and prediction is small.

In [None]:
page(5)

There are times when the information we need is not from the very recent past (the previous frame of a video), but from a slightly more distant past. <br>
For example:<br><br>
I was born in Spain but live in Canada, so I am fluent in English and ????
<br><br>
As you can see from this example, we need information that is contained at the beginning of the sentence to determine the correct word to finish the sentence. <br><br>

In [None]:
page(6)

For this kind of task we need Long Short-Term Memory networks.

# **Long Short-Term Memory Networks (LSTMs)**


The difference between RNNs and LSTMs is in the architecture of the repeating module.<br>

In a standard RNN, the repeating module contains a single layer.

In [None]:
page(7)

In an LSTM, the repeating module contains four interacting layers.

In [None]:
page(8)

The architecture of the LSTM gives it the ability to add information to, or remove information from, its internal state. This means the gap between context and prediction can be larger.

In [None]:
page(9)
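
As a rough illustration of the four interacting layers, the sketch below implements a single LSTM time step in NumPy. It follows the common textbook formulation (forget gate, input gate, candidate values, output gate). The parameter names `W`, `U`, `b`, the stacking order of the four layers, and the sizes are assumptions made for this example and do not necessarily match how Keras stores its weights.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W, U and b hold the parameters of the four layers, stacked in the
    (assumed) order: forget gate, input gate, candidate, output gate.
    """
    n = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b       # pre-activations of all four layers
    f = sigmoid(z[0:n])                # forget gate: what to remove from the cell state
    i = sigmoid(z[n:2 * n])            # input gate: how much new information to add
    g = np.tanh(z[2 * n:3 * n])        # candidate values to add
    o = sigmoid(z[3 * n:4 * n])        # output gate: what to expose as hidden state
    c = f * c_prev + i * g             # updated cell state
    h = o * np.tanh(c)                 # updated hidden state
    return h, c

rng = np.random.default_rng(0)
n_features, n_hidden = 1, 4            # illustrative sizes (assumed)
W = rng.normal(size=(4 * n_hidden, n_features))
U = rng.normal(size=(4 * n_hidden, n_hidden))
b = np.zeros(4 * n_hidden)

h = np.zeros(n_hidden)
c = np.zeros(n_hidden)
for value in [10.0, 20.0, 30.0]:
    h, c = lstm_step(np.array([value]), h, c, W, U, b)

print(h.shape, c.shape)  # (4,) (4,)
```

The line `c = f * c_prev + i * g` is where information is removed (via `f`) and added (via `i * g`).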

LSTMs are a special class of Recurrent Neural Networks (RNNs).<br>
They can learn long-term dependencies.

**Univariate LSTM Models**

**Data Preparation**<br>
[10, 20, 30, 40, 50, 60, 70, 80, 90]<br>

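The sequence above can be reframed as input/output samples for the model: each sample uses three consecutive values as input (X) and the value that follows them as output (y).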

X → y<br>
[10, 20, 30] → 40<br>
[20, 30, 40] → 50<br>
[30, 40, 50] → 60

In [None]:
from numpy import array

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
  """Return (X, y): windows of n_steps values and the value that follows each window."""
  X, y = list(), list()
  for i in range(len(sequence)):
    # find the end of this pattern
    end_ix = i + n_steps
    # stop once the target index would fall beyond the sequence
    if end_ix > len(sequence) - 1:
      break
    # gather input and output parts of the pattern
    seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
    X.append(seq_x)
    y.append(seq_y)
  return array(X), array(y)
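
For comparison, the same windowing can be sketched without an explicit loop. This version assumes NumPy 1.20 or later, which provides `sliding_window_view`. The function name `split_sequence_vectorized` is a hypothetical helper introduced here for illustration and is not part of the notebook.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def split_sequence_vectorized(sequence, n_steps):
    """Hypothetical loop-free helper: windows of n_steps inputs plus one target each."""
    # each window holds n_steps inputs followed by the target value
    windows = sliding_window_view(np.asarray(sequence), n_steps + 1)
    # copy so the results are independent arrays rather than views
    return windows[:, :-1].copy(), windows[:, -1].copy()

X_demo, y_demo = split_sequence_vectorized([10, 20, 30, 40, 50, 60, 70, 80, 90], 3)
print(X_demo.shape, y_demo.shape)  # (6, 3) (6,)
print(X_demo[0], y_demo[0])        # [10 20 30] 40
```

A sequence of 9 values with 3 input steps yields 6 samples, because the last usable window must leave one value over for the target.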

In [None]:
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
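# plot_model is exposed from keras.utils; the keras.utils.vis_utils path is not available in newer Keras releases
from keras.utils import plot_model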

**Prepare the data**

In [None]:
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]

In [None]:
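n_steps = 3
n_features = 1
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# summarize the data
for i in range(len(X)):
  print(X[i], y[i])
# reshape from [samples, timesteps] to [samples, timesteps, features],
# the 3D input shape the LSTM layer expects
X = X.reshape((X.shape[0], X.shape[1], n_features))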

**Using an off-the-shelf LSTM model** 

In [None]:
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(n_steps, n_features))) 
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
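
The "four interacting layers" also show up in the size of the model. The sketch below estimates the number of trainable parameters by hand, assuming a standard LSTM layer with bias terms, where each of the four layers has its own input weights, recurrent weights, and bias. The helper names are hypothetical; the result can be checked against `model.summary()`.

```python
def lstm_param_count(units, n_features):
    """Parameters of a standard LSTM layer with bias (assumed formulation)."""
    # each of the four layers has input weights, recurrent weights and a bias
    per_layer = units * n_features + units * units + units
    return 4 * per_layer

def dense_param_count(n_inputs, n_outputs):
    """Parameters of a fully connected layer with bias."""
    return n_inputs * n_outputs + n_outputs

lstm_params = lstm_param_count(units=50, n_features=1)
dense_params = dense_param_count(n_inputs=50, n_outputs=1)
print(lstm_params, dense_params, lstm_params + dense_params)  # 10400 51 10451
```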

In [None]:
model.fit(X, y, epochs=200, verbose=0)

In [None]:
plot_model(model, show_shapes=True, show_layer_names=True)

**Make a Prediction**<br>
The input [70, 80, 90] continues the training pattern, so we are expecting a value close to:<br>
100

In [None]:
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
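
The reshape matters because the LSTM layer takes 3D input of the form (samples, time steps, features). The NumPy-only sketch below shows how the shape changes for a single window and for a small batch. The `_demo` variable names and the second window are illustrative assumptions.

```python
import numpy as np

n_steps_demo, n_features_demo = 3, 1

# a single window: 1 sample, 3 time steps, 1 feature per step
single = np.array([70, 80, 90])
print(single.shape)                      # (3,)
single = single.reshape((1, n_steps_demo, n_features_demo))
print(single.shape)                      # (1, 3, 1)

# two windows at once: 2 samples, 3 time steps, 1 feature per step
batch = np.array([[60, 70, 80], [70, 80, 90]])
batch = batch.reshape((batch.shape[0], n_steps_demo, n_features_demo))
print(batch.shape)                       # (2, 3, 1)
```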

In [None]:
yhat = model.predict(x_input, verbose=0)
print(yhat)