[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/drdave-teaching/OPIM5509Files/blob/main/OPIM5509_Module4_Files/Advanced_RNN_Theory.ipynb)

# Advanced RNN Theory
**Dr. Dave Wanik - University of Connecticut**

----------------------------


**Convolution and Pooling (1D - for sequences!)**
Pay attention to how the output shape changes when you include convolution and pooling.

The number of rows shrinks because the kernel stops when it hits the end of the sequence (with no padding, output rows = input rows - kernel size + 1), and the number of columns becomes your number of 'feature maps' (filters). Convolution does elementwise multiplication and summing across all input features, so the original feature dimension (e.g. 9 stocks) is not preserved. See below.

For pooling, if you have any remainders, they simply get chopped off. So try to pick a pool size that divides the sequence length evenly, or include some padding (e.g. `padding='same'`) to help maintain shape and retain all information.
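
To make the 'remainders get chopped off' point concrete, here is a minimal pure-Python sketch of max pooling over a single column of a sequence. The helper name `max_pool_1d` is made up for illustration and only mimics the no-padding behaviour described above; it is not how Keras implements `MaxPooling1D` internally.

```python
def max_pool_1d(values, pool_size=2):
    """Max-pool a single column with non-overlapping windows (stride = pool_size).

    Any leftover values that do not fill a complete window are dropped,
    mirroring the 'remainders get chopped off' behaviour with no padding.
    """
    n_windows = len(values) // pool_size  # integer division discards the remainder
    return [
        max(values[w * pool_size:(w + 1) * pool_size])
        for w in range(n_windows)
    ]


# 5 values with a pool size of 2: the last value (9) never makes it into a window
pooled = max_pool_1d([1, 3, 2, 8, 9], pool_size=2)
print(pooled)  # [3, 8]
```

Notice that the largest value in the toy column (9) is lost entirely, which is why an evenly dividing pool size or some padding can matter.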

**Recurrent Dropout vs. Dropout**
Recurrent dropout occurs within a SimpleRNN, LSTM or GRU layer, on the connections between time steps. It is DIFFERENT from a Dropout() layer, which is applied to the output of the previous layer.

**Bidirectional Layers**
Sequences can be read both forwards and backwards, so the Bidirectional() layer wraps around a SimpleRNN, LSTM or GRU and fits two copies of it at the same time, one per direction (roughly twice the parameters, and it takes about twice as long to run). Yes, they can also be stacked!

**Monster**
Put all of these concepts together and explain the output shape and trainable parameters... good luck!!!

In [None]:
# import modules
# standard modules
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# RNN-specific modules
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import classification_report, accuracy_score
from tensorflow.keras import layers, Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D
from tensorflow.keras.layers import Dense, Dropout, SimpleRNN, GRU, LSTM
from tensorflow.keras.callbacks import EarlyStopping

# Intro to Convolution and Pooling (Sequences)

![alt text](https://qph.fs.quoracdn.net/main-qimg-523434af0d21bb0b59454aa9563cc90b.webp)

In [None]:
# it does not matter how many features you have - the convolution for a single
# feature map will recode your 9 features into 1 via elementwise multiplication and summing

# link: https://www.quora.com/What-does-it-mean-by-1D-convolutional-neural-network

# there are more examples on text sequences, and I'll share these in future videos!
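
The 'elementwise multiplication and summing' idea can be sketched by hand for a single feature map. This is a simplified illustration in plain Python, assuming no padding and a stride of 1; the function name `conv1d_single_filter` and the toy data of all ones are invented for the example and are not part of Keras.

```python
def conv1d_single_filter(sequence, kernel, bias=0.0):
    """Slide one filter down a multi-feature sequence (no padding, stride 1).

    sequence: list of n_steps rows, each a list of n_features values
    kernel:   list of kernel_size rows, each a list of n_features weights
    Returns one number per window, so the n_features columns collapse into 1.
    """
    kernel_size = len(kernel)
    outputs = []
    for start in range(len(sequence) - kernel_size + 1):
        total = bias
        for offset in range(kernel_size):
            row = sequence[start + offset]
            weights = kernel[offset]
            # elementwise multiplication, then summing across all features
            total += sum(x * w for x, w in zip(row, weights))
        outputs.append(total)
    return outputs


# toy data: 21 days x 9 stocks, every value equal to 1
toy_sequence = [[1] * 9 for _ in range(21)]
# toy kernel: width 2 x 9 features, every weight equal to 1
toy_kernel = [[1] * 9 for _ in range(2)]

feature_map = conv1d_single_filter(toy_sequence, toy_kernel)
print(len(feature_map))  # 20 rows: 21 - 2 + 1
print(feature_map[0])    # 18.0: 2 rows x 9 features of 1*1, plus a bias of 0
```

Each extra filter would simply repeat this with its own kernel and bias, adding one more output column.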

# Single SimpleRNN with Conv1D and MaxPool1D

In [None]:
# same as Excel Spreadsheet
n_features = 9 # 9 stock prices (WMT, GOOG, NETF, GE, AMD...)
n_steps = 21 # lookback 21 days

# my data is 9 columns and 21 days long

# for Conv1D, you can play with the filters and kernel size

# define model
model = Sequential()
# remember - convolution does elementwise multiplication and summation!
model.add(Conv1D(filters=3, # these are the new columns that will be created (YOU GET TO PICK THIS!)
                 kernel_size=2, # this dictates the size of the modified lookback (YOU GET TO PICK THIS!)
                 input_shape=(n_steps,n_features))) # notice how input shape goes in first layer
# you are just downsampling that 1D array
model.add(MaxPooling1D(2)) # the output from pooling is still a sequence (but is richer and denser)
# the new sequence we create will go into a SimpleRNN layer
model.add(SimpleRNN(30, activation='relu'))
# the output from this layer will be (None, 30) and we just return the last hidden state
model.add(Dropout(0.1)) # pick a number between 0.1 and 0.3
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse',metrics=['mae'])
model.summary()

Model: "sequential_31"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv1d_31 (Conv1D)          (None, 20, 3)             57        
                                                                 
 max_pooling1d_31 (MaxPoolin  (None, 10, 3)            0         
 g1D)                                                            
                                                                 
 simple_rnn_15 (SimpleRNN)   (None, 30)                1020      
                                                                 
 dropout_31 (Dropout)        (None, 30)                0         
                                                                 
 dense_31 (Dense)            (None, 1)                 31        
                                                                 
Total params: 1,108
Trainable params: 1,108
Non-trainable params: 0
___________________________________________________
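
The row counts in the summary above (21, then 20, then 10) can be checked with two small helpers. This sketch assumes the default Keras settings used in this notebook (no padding, a convolution stride of 1, and a pooling stride equal to the pool size); the function names are hypothetical.

```python
def conv1d_output_steps(n_steps, kernel_size):
    """Rows left after a Conv1D with no padding and a stride of 1."""
    return n_steps - kernel_size + 1


def pool1d_output_steps(n_steps, pool_size):
    """Rows left after pooling with no padding; the remainder is dropped."""
    return n_steps // pool_size


n_steps = 21
after_conv = conv1d_output_steps(n_steps, kernel_size=2)
after_pool = pool1d_output_steps(after_conv, pool_size=2)
print(after_conv, after_pool)  # 20 10
```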

In [None]:
# convolutional layer (with filters=1, as in the LSTM models below)
# 9 input columns get transformed into 1 output column
# kernel width of 2
# 21 - 2 + 1 = 20 (shape)
# 1 is equal to the number of feature maps

# elementwise multiplication and summation
# weights + biases
9*2*1 + 1  # 9 input columns * 2 (kernel width) * 1 output feature map + 1 (bias for the one feature map)

19

In [None]:
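# SimpleRNN trainable params
# g × [h(h+i) + h], where g = number of gates (1 for SimpleRNN), h = hidden units, i = input features
# with 1 feature map going in: 1 x [30(30+1)+30]
# (the model above has 3 feature maps going in, so i = 3 and 1 x [30(30+3)+30] = 1020)
1*(30*(30+1)+30)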

960
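
The g × [h(h+i) + h] rule can be wrapped in a small function and checked against the model summaries in this notebook. The function name is invented for illustration. The sketch covers SimpleRNN (g = 1) and LSTM (g = 4) only; GRU layers in recent TensorFlow versions appear to carry an extra set of biases by default, so this formula may undercount them.

```python
def recurrent_params(gates, hidden_units, input_features):
    """g * (h*(h+i) + h): weights on the previous state and the inputs, plus biases."""
    h, i = hidden_units, input_features
    return gates * (h * (h + i) + h)


# SimpleRNN has 1 'gate'; LSTM has 4
print(recurrent_params(1, 30, 1))  # 960
print(recurrent_params(1, 30, 3))  # 1020
print(recurrent_params(4, 30, 1))  # 3840
```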

In [None]:
# convolution layer (with filters=3, as in the SimpleRNN model above)
# elementwise multiplication and summation
# weights + biases
9*2*3 + 3 # 9 input columns * 2 (kernel width) * 3 output feature maps + 3 (one bias per feature map)

57
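
The same arithmetic generalizes to any Conv1D layer in this notebook. A minimal sketch, with a hypothetical helper name, that reproduces both hand calculations:

```python
def conv1d_params(input_features, kernel_size, filters):
    """Weights (inputs x kernel width x filters) plus one bias per filter."""
    return input_features * kernel_size * filters + filters


print(conv1d_params(9, 2, 1))  # 19
print(conv1d_params(9, 2, 3))  # 57
```

For a second, stacked Conv1D layer, `input_features` is the number of filters in the Conv1D layer before it, since pooling leaves the column count unchanged.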

# Intro to Recurrent Dropout
Imagine some of the connections being randomly turned off during training in each of the feed-forward networks WITHIN the recurrent layer!

![alt text](https://miro.medium.com/max/2250/1*goJVQs-p9kgLODFNyhl9zA.gif)
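
A toy sketch may help show where each kind of dropout acts. It is a deliberately simplified illustration, not the Keras implementation: the masks are passed in by hand instead of being drawn at random, the rescaling of kept units that dropout normally applies is left out, and all names and numbers are made up.

```python
def simple_rnn_last_state(inputs, w_x, w_h, bias, recurrent_mask=None):
    """Run a tiny ReLU SimpleRNN and return its last hidden state.

    inputs:  list of time steps, each a list of input features
    w_x[j]:  weights from each input feature to hidden unit j
    w_h[j]:  weights from each previous hidden unit to hidden unit j
    recurrent_mask[k]: multiplier on previous hidden unit k (0 = dropped)
    """
    n_hidden = len(bias)
    if recurrent_mask is None:
        recurrent_mask = [1] * n_hidden  # nothing dropped
    state = [0] * n_hidden
    for x in inputs:
        # recurrent dropout acts here, on the state carried between time steps
        carried = [s * m for s, m in zip(state, recurrent_mask)]
        state = [
            max(0, sum(w * xi for w, xi in zip(w_x[j], x))
                + sum(w * c for w, c in zip(w_h[j], carried))
                + bias[j])
            for j in range(n_hidden)
        ]
    return state


def dropout_layer(values, mask):
    """A Dropout() layer acts on a finished output, not inside the recurrence."""
    return [v * m for v, m in zip(values, mask)]


inputs = [[1], [1], [1]]   # 3 time steps, 1 feature
w_x = [[1], [1]]           # 2 hidden units
w_h = [[0, 1], [1, 0]]     # each unit listens to the other one
bias = [0, 0]

no_dropout = simple_rnn_last_state(inputs, w_x, w_h, bias)
with_recurrent = simple_rnn_last_state(inputs, w_x, w_h, bias, recurrent_mask=[1, 0])
with_layer = dropout_layer(no_dropout, mask=[1, 0])
print(no_dropout, with_recurrent, with_layer)  # [3, 3] [1, 2] [3, 0]
```

The recurrent mask changes how the hidden state evolves at every time step, while the Dropout() layer only zeroes values after the recurrent layer has finished.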




# Single LSTM layer with Conv1D and MaxPool1D

In [None]:
# same as Excel Spreadsheet
n_features = 9 # 9 stock prices
n_steps = 21 # lookback

# for Conv1D, you can play with the filters and kernel size

# define model
model = Sequential()
# remember - convolution does elementwise multiplication and summation!
model.add(Conv1D(filters=1,
                 kernel_size=2,
                 input_shape=(n_steps,n_features))) # notice how input shape goes in first layer
model.add(MaxPooling1D(2))
model.add(LSTM(30, activation='relu', recurrent_dropout=0.2))
model.add(Dropout(0.1)) # pick a number between 0.1 and 0.3
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse',metrics=['mae'])
model.summary()

Model: "sequential_39"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv1d_39 (Conv1D)          (None, 20, 1)             19        
                                                                 
 max_pooling1d_39 (MaxPoolin  (None, 10, 1)            0         
 g1D)                                                            
                                                                 
 lstm_56 (LSTM)              (None, 30)                3840      
                                                                 
 dropout_39 (Dropout)        (None, 30)                0         
                                                                 
 dense_39 (Dense)            (None, 1)                 31        
                                                                 
Total params: 3,890
Trainable params: 3,890
Non-trainable params: 0
___________________________________________________

# Multiple LSTM layers with Conv1D and MaxPool1D and Recurrent Dropout
`recurrent_dropout` (a fraction, e.g. `recurrent_dropout=0.1`) as an argument in an LSTM layer is different from a Dropout() layer.

In [None]:
# same as Excel Spreadsheet
n_features = 9 # 9 stock prices
n_steps = 21 # lookback

# for Conv1D, you can play with the filters and kernel size

# define model
model = Sequential()
# remember - convolution does elementwise multiplication and summation!
model.add(Conv1D(filters=1, kernel_size=2, input_shape=(n_steps,n_features))) # notice how input shape goes in first layer
model.add(MaxPooling1D(2))
# return_sequences is only true for the recurrent layers going INTO other recurrent layers
model.add(LSTM(30, activation='relu', recurrent_dropout=0.1, return_sequences=True))
model.add(LSTM(40, activation='relu', recurrent_dropout=0.1, return_sequences=True))
model.add(LSTM(50, activation='relu', recurrent_dropout=0.1, return_sequences=True))
model.add(LSTM(60, activation='relu', recurrent_dropout=0.1))
model.add(Dropout(0.1)) # pick a number between 0.1 and 0.3
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse',metrics=['mae'])
model.summary()

Model: "sequential_33"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv1d_33 (Conv1D)          (None, 20, 1)             19        
                                                                 
 max_pooling1d_33 (MaxPoolin  (None, 10, 1)            0         
 g1D)                                                            
                                                                 
 lstm_45 (LSTM)              (None, 10, 30)            3840      
                                                                 
 lstm_46 (LSTM)              (None, 10, 40)            11360     
                                                                 
 lstm_47 (LSTM)              (None, 10, 50)            18200     
                                                                 
 lstm_48 (LSTM)              (None, 60)                26640     
                                                     

# Monster #1
(stacked Conv1D and MaxPooling1D layers, with recurrent_dropout)

In [None]:
# same as Excel Spreadsheet
n_features = 20 # 20 stock prices
n_steps = 50 # lookback

# for Conv1D, you can play with the filters and kernel size

# define model
model = Sequential()
# remember - convolution does elementwise multiplication and summation!
model.add(Conv1D(filters=64, kernel_size=2, input_shape=(n_steps,n_features))) # notice how input shape goes in first layer
model.add(MaxPooling1D(2))
model.add(Conv1D(filters=128, kernel_size=2)) # no need for input shape!
model.add(MaxPooling1D(2))
model.add(SimpleRNN(64, activation='relu', return_sequences=True))
model.add(SimpleRNN(120, activation='relu',
                    recurrent_dropout=0.1))
model.add(Dropout(0.1)) # pick a number between 0.1 and 0.3
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse',metrics=['mae'])
model.summary()

Model: "sequential_34"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv1d_34 (Conv1D)          (None, 49, 64)            2624      
                                                                 
 max_pooling1d_34 (MaxPoolin  (None, 24, 64)           0         
 g1D)                                                            
                                                                 
 conv1d_35 (Conv1D)          (None, 23, 128)           16512     
                                                                 
 max_pooling1d_35 (MaxPoolin  (None, 11, 128)          0         
 g1D)                                                            
                                                                 
 simple_rnn_16 (SimpleRNN)   (None, 11, 64)            12352     
                                                                 
 simple_rnn_17 (SimpleRNN)   (None, 120)             

# Introduction to Bidirectional Layers
See Google Sheets for overview.

# Single Bidirectional LSTM Layer

In [None]:
from tensorflow.keras.layers import Bidirectional # same package as the other layers

# same as Excel Spreadsheet
n_features = 9 # 9 stock prices
n_steps = 21 # lookback

# for Conv1D, you can play with the filters and kernel size

# define model
model = Sequential()
model.add(Bidirectional(LSTM(30, activation='relu'),
                        input_shape=(n_steps,n_features)))
model.add(Dropout(0.1)) # pick a number between 0.1 and 0.3
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse',metrics=['mae'])
model.summary()

# if you ever get an error like 'the model has not yet been built',
# it means that you have not provided an input shape to your model.
# note how input_shape is an argument to the Bidirectional layer, NOT the LSTM!
# g*(h*(h+i)+h)   for a regular recurrent layer
# 2*g*(h*(h+i)+h) for a bidirectional recurrent layer
# 4*(30*(30+9)+30) = 4800 for an LSTM with 30 hidden units
# 2*4*(30*(30+9)+30) = 9600 for a bidirectional LSTM with 30 hidden units

Model: "sequential_35"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional_24 (Bidirecti  (None, 60)               9600      
 onal)                                                           
                                                                 
 dropout_35 (Dropout)        (None, 60)                0         
                                                                 
 dense_35 (Dense)            (None, 1)                 61        
                                                                 
Total params: 9,661
Trainable params: 9,661
Non-trainable params: 0
_________________________________________________________________


# Stacked Bidirectional Layers

In [None]:
from tensorflow.keras.layers import Bidirectional # same package as the other layers

# same as Excel Spreadsheet
n_features = 9 # 9 stock prices
n_steps = 21 # lookback

# for Conv1D, you can play with the filters and kernel size

# define model
model = Sequential()
model.add(Bidirectional(LSTM(30, activation='relu', return_sequences=True),
                        input_shape=(n_steps,n_features)))
model.add(Bidirectional(LSTM(10, activation='relu')))
model.add(Dropout(0.1)) # pick a number between 0.1 and 0.3
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse',metrics=['mae'])
model.summary()

Model: "sequential_36"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional_25 (Bidirecti  (None, 21, 60)           9600      
 onal)                                                           
                                                                 
 bidirectional_26 (Bidirecti  (None, 20)               5680      
 onal)                                                           
                                                                 
 dropout_36 (Dropout)        (None, 20)                0         
                                                                 
 dense_36 (Dense)            (None, 1)                 21        
                                                                 
Total params: 15,301
Trainable params: 15,301
Non-trainable params: 0
_________________________________________________________________
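
The output widths and parameter counts of the two bidirectional summaries above can be reproduced with a short sketch. It assumes the wrapper concatenates the forward and backward outputs (which matches the 60 and 20 columns shown) and that each direction follows the g*(h*(h+i)+h) rule; the helper names are hypothetical.

```python
def recurrent_params(gates, hidden_units, input_features):
    """g * (h*(h+i) + h) for one direction."""
    h, i = hidden_units, input_features
    return gates * (h * (h + i) + h)


def bidirectional_summary(gates, hidden_units, input_features):
    """Return (output_width, params) when forward and backward outputs are concatenated."""
    output_width = 2 * hidden_units
    params = 2 * recurrent_params(gates, hidden_units, input_features)
    return output_width, params


LSTM_GATES = 4
first = bidirectional_summary(LSTM_GATES, hidden_units=30, input_features=9)
# the second layer sees the 60 concatenated outputs of the first as its inputs
second = bidirectional_summary(LSTM_GATES, hidden_units=10, input_features=first[0])
print(first)   # (60, 9600)
print(second)  # (20, 5680)
```

The key detail when stacking is that the next layer's input width is twice the hidden size of the bidirectional layer before it.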


# Monster #2
Convolution, pooling, bidirectional etc. All of it!

In [None]:
from tensorflow.keras.layers import Bidirectional # same package as the other layers

# same as Excel Spreadsheet
n_features = 50 # 50 stock prices
n_steps = 21 # lookback

# for Conv1D, you can play with the filters and kernel size

# define model
model = Sequential()
model.add(Conv1D(filters=20, kernel_size=2, input_shape=(n_steps,n_features))) # notice how input shape goes in first layer
model.add(MaxPooling1D(2))
model.add(Conv1D(filters=10, kernel_size=2))
model.add(MaxPooling1D(2))
model.add(Bidirectional(LSTM(30, activation='relu', return_sequences=True)))
model.add(Bidirectional(LSTM(60, activation='relu', return_sequences=True)))
model.add(GRU(40, activation='relu', return_sequences=True))
model.add(Bidirectional(LSTM(10, activation='relu')))
model.add(Dropout(0.1)) # pick a number between 0.1 and 0.3
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse',metrics=['mae'])
model.summary()

Model: "sequential_37"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv1d_36 (Conv1D)          (None, 20, 20)            2020      
                                                                 
 max_pooling1d_36 (MaxPoolin  (None, 10, 20)           0         
 g1D)                                                            
                                                                 
 conv1d_37 (Conv1D)          (None, 9, 10)             410       
                                                                 
 max_pooling1d_37 (MaxPoolin  (None, 4, 10)            0         
 g1D)                                                            
                                                                 
 bidirectional_27 (Bidirecti  (None, 4, 60)            9840      
 onal)                                                           
                                                     