# Exercise - building RNNs

1. Build three RNNs, each of which takes as input a sequence of length 5 with 20 features, uses a single recurrent layer with 20 nodes and a $\texttt{ReLU}$ activation function, and then uses a $\texttt{Dense}$ layer to classify into 10 classes. Use $\texttt{SimpleRNN}$, $\texttt{LSTM}$, and $\texttt{GRU}$ for the recurrent layers (hence 3 models in total). What is the number of parameters in each of the 3 cases?
1. Modify your models from **1.** using $\texttt{Bidirectional}$ layers. Do this for (at least) the merging methods $\texttt{concat}$ and $\texttt{sum}$. What is the number of parameters in each of the 6 cases?
1. Modify your models from **1.** by using a *second* recurrent layer (of the same type, so still 3 total models) with 10 nodes. What is the number of parameters in each of the 3 cases?

**Note**: You may want to use:
1. https://www.tensorflow.org/api_docs/python/tf/keras/layers/SimpleRNN
1. https://www.tensorflow.org/api_docs/python/tf/keras/layers/LSTM
1. https://www.tensorflow.org/api_docs/python/tf/keras/layers/GRU
1. https://www.tensorflow.org/api_docs/python/tf/keras/layers/Bidirectional

**See slides for more details!**

# Setup

In [1]:
import tensorflow as tf

# Exercise 1

Build three RNNs, each of which takes as input a sequence of length 5 with 20 features, uses a single recurrent layer with 20 nodes and a $\texttt{ReLU}$ activation function, and then uses a $\texttt{Dense}$ layer to classify into 10 classes. Use $\texttt{SimpleRNN}$, $\texttt{LSTM}$, and $\texttt{GRU}$ for the recurrent layers (hence 3 models in total). What is the number of parameters in each of the 3 cases?

In [2]:
recurrent_layer_types = [
    tf.keras.layers.SimpleRNN,
    tf.keras.layers.LSTM,
    tf.keras.layers.GRU,
]

for layer_type in recurrent_layer_types:
    model = tf.keras.models.Sequential([
        layer_type(20, activation='relu', input_shape=(5, 20)),
        tf.keras.layers.Dense(10, activation='softmax'),
    ])
    model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 simple_rnn (SimpleRNN)      (None, 20)                820       
                                                                 
 dense (Dense)               (None, 10)                210       
                                                                 
Total params: 1,030
Trainable params: 1,030
Non-trainable params: 0
_________________________________________________________________
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lstm (LSTM)                 (None, 20)                3280      
                                                                 
 dense_1 (Dense)             (None, 10)                210       
                                                                 
Total params: 3,490
Trainable pa

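**Comment**: On a GPU, using ReLU prevents Keras from selecting the fast cuDNN kernel for the $\texttt{LSTM}$ and $\texttt{GRU}$ layers, because that kernel requires the default hyperbolic tangent activation (among other conditions). The layers then fall back to the slower generic implementation.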
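
The parameter counts reported by `model.summary()` can be checked by hand. The sketch below is plain Python with no TensorFlow dependency; the helper names are hypothetical and not part of the exercise. It assumes every layer uses a bias, and that $\texttt{GRU}$ runs with `reset_after=True` (the TensorFlow 2 default), which gives it two bias vectors per gate block. Under those assumptions it reproduces the 820, 3280, and 210 shown in the summaries.

```python
def simple_rnn_params(input_dim, units):
    # kernel (input_dim x units) + recurrent kernel (units x units) + bias (units)
    return units * (input_dim + units + 1)


def lstm_params(input_dim, units):
    # four weight blocks (input, forget, cell candidate, output),
    # each shaped like a SimpleRNN layer
    return 4 * units * (input_dim + units + 1)


def gru_params(input_dim, units, reset_after=True):
    # three weight blocks (update, reset, candidate); with reset_after=True
    # (assumed default) there is a separate input bias and recurrent bias
    n_bias = 2 if reset_after else 1
    return 3 * units * (input_dim + units + n_bias)


def dense_params(input_dim, units):
    # weight matrix (input_dim x units) + bias (units)
    return units * (input_dim + 1)


# Exercise 1: 20 input features, 20 recurrent units, 10 classes
for name, count_fn in [("SimpleRNN", simple_rnn_params),
                       ("LSTM", lstm_params),
                       ("GRU", gru_params)]:
    recurrent = count_fn(20, 20)
    head = dense_params(20, 10)
    print(f"{name}: recurrent={recurrent}, dense={head}, total={recurrent + head}")
```

Note that the sequence length (5) does not appear anywhere: the same weights are reused at every time step, so only the feature and unit counts matter.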

# Exercise 2

Modify your models from **1.** using $\texttt{Bidirectional}$ layers. Do this for (at least) the merging methods $\texttt{concat}$ and $\texttt{sum}$. What is the number of parameters in each of the 6 cases?

In [3]:
merging_methods = [
    'concat',
    'sum'
]

for layer_type in recurrent_layer_types:
    for merging_method in merging_methods:
        model = tf.keras.models.Sequential([
            tf.keras.layers.Bidirectional(
                layer_type(20, activation='relu'), 
                merge_mode=merging_method,
                input_shape=(5, 20)),
            tf.keras.layers.Dense(10, activation='softmax'),
        ])
        model.summary()

Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 40)               1640      
 l)                                                              
                                                                 
 dense_3 (Dense)             (None, 10)                410       
                                                                 
Total params: 2,050
Trainable params: 2,050
Non-trainable params: 0
_________________________________________________________________
Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional_1 (Bidirectio  (None, 20)               1640      
 nal)                                                            
                                                                 
 dense_4 (Dense)             (
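
The bidirectional counts follow from two observations: the wrapper holds a forward and a backward copy of the recurrent layer, so its parameter count doubles regardless of the merge mode, while only $\texttt{concat}$ doubles the width seen by the $\texttt{Dense}$ layer. A minimal sketch in plain Python, with hypothetical helper names and the same assumptions as before (biases everywhere, $\texttt{GRU}$ with `reset_after=True`):

```python
def recurrent_params(kind, input_dim, units):
    # number of weight blocks per layer type
    blocks = {"SimpleRNN": 1, "LSTM": 4, "GRU": 3}[kind]
    # GRU with reset_after=True (assumed default) carries two bias vectors
    n_bias = 2 if kind == "GRU" else 1
    return blocks * units * (input_dim + units + n_bias)


def bidirectional_model_params(kind, merge_mode,
                               input_dim=20, units=20, classes=10):
    # forward + backward copies of the same layer
    wrapped = 2 * recurrent_params(kind, input_dim, units)
    # 'concat' stacks both outputs; 'sum' keeps the original width
    merged_dim = 2 * units if merge_mode == "concat" else units
    dense = classes * (merged_dim + 1)
    return wrapped, dense, wrapped + dense


for kind in ("SimpleRNN", "LSTM", "GRU"):
    for merge_mode in ("concat", "sum"):
        wrapped, dense, total = bidirectional_model_params(kind, merge_mode)
        print(f"{kind:9s} {merge_mode:6s} "
              f"bidirectional={wrapped} dense={dense} total={total}")
```

For $\texttt{SimpleRNN}$ with $\texttt{concat}$ this gives 1640, 410, and 2050, matching the first summary in the output.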

# Exercise 3

Modify your models from **1.** by using a *second* recurrent layer (of the same type, so still 3 total models) with 10 nodes. What is the number of parameters in each of the 3 cases?

In [4]:
for layer_type in recurrent_layer_types:
    model = tf.keras.models.Sequential([
        layer_type(20, activation='relu', return_sequences=True, input_shape=(5, 20)),
        layer_type(10),  # activation left at its default (hyperbolic tangent)
        tf.keras.layers.Dense(10, activation='softmax'),
    ])
    model.summary()

Model: "sequential_9"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 simple_rnn_3 (SimpleRNN)    (None, 5, 20)             820       
                                                                 
 simple_rnn_4 (SimpleRNN)    (None, 10)                310       
                                                                 
 dense_9 (Dense)             (None, 10)                110       
                                                                 
Total params: 1,240
Trainable params: 1,240
Non-trainable params: 0
_________________________________________________________________
Model: "sequential_10"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lstm_3 (LSTM)               (None, 5, 20)             3280      
                                                                 
 lstm_4 (LSTM)               
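
For the stacked models, the second recurrent layer receives the 20-dimensional output of the first layer at each time step, and the $\texttt{Dense}$ layer receives the 10-dimensional final state of the second. A minimal sketch in plain Python that chains these sizes, with hypothetical helper names and the same assumptions as the summaries suggest (biases everywhere, $\texttt{GRU}$ with `reset_after=True`, the TensorFlow 2 default):

```python
def recurrent_params(kind, input_dim, units):
    # number of weight blocks per layer type
    blocks = {"SimpleRNN": 1, "LSTM": 4, "GRU": 3}[kind]
    # GRU with reset_after=True (assumed default) carries two bias vectors
    n_bias = 2 if kind == "GRU" else 1
    return blocks * units * (input_dim + units + n_bias)


def stacked_model_params(kind, input_dim=20, layer_units=(20, 10), classes=10):
    counts = []
    in_dim = input_dim
    for units in layer_units:
        counts.append(recurrent_params(kind, in_dim, units))
        in_dim = units  # the next layer sees this layer's output width
    counts.append(classes * (in_dim + 1))  # Dense head: weights + bias
    return counts


for kind in ("SimpleRNN", "LSTM", "GRU"):
    per_layer = stacked_model_params(kind)
    print(f"{kind}: per layer={per_layer}, total={sum(per_layer)}")
```

For $\texttt{SimpleRNN}$ this yields 820, 310, and 110, for a total of 1240, in agreement with the first summary above.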