# 12 Advanced Recurrent Neural Networks

Advanced recurrent neural network (RNN) architectures are widely used in deep learning for sequence modeling and processing. They build on traditional feedforward neural networks by adding recurrent connections, which give them temporal dynamics and a form of memory.

* The Elman RNN feeds the previous hidden state back into its hidden layer. This simple recurrent loop lets it capture short-term temporal dependencies, which makes it suitable for applications such as speech recognition and time series analysis.
* The Jordan RNN has feedback connections from the output layer to the hidden layer, so its context comes from previous outputs rather than previous hidden states. It has been applied to machine translation and language modeling tasks.
* The Bidirectional RNN processes the sequence both forward and backward, so each prediction can draw on past and future context. This makes it effective in natural language processing tasks such as sentiment analysis and named entity recognition.

These recurrent architectures extend the modeling capabilities of feedforward networks and are widely used for processing sequential data.
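To make the three feedback patterns concrete, the following is a minimal NumPy sketch of one forward pass through each. The weight names (`W_x`, `W_h`, `W_y`, `W_o`), the `tanh` activation, the linear output layer, and the layer sizes are assumptions chosen for illustration; none of them come from the exercise below.

```python
import numpy as np

def elman_forward(xs, W_x, W_h, b):
    """Elman recurrence: the previous hidden state feeds the hidden layer."""
    h = np.zeros(W_h.shape[0])
    states = []
    for x in xs:
        h = np.tanh(W_x @ x + W_h @ h + b)
        states.append(h)
    return np.stack(states)

def jordan_forward(xs, W_x, W_y, W_o, b):
    """Jordan recurrence: the previous output feeds the hidden layer."""
    y = np.zeros(W_o.shape[0])
    outputs = []
    for x in xs:
        h = np.tanh(W_x @ x + W_y @ y + b)
        y = W_o @ h  # linear output layer (an assumption of this sketch)
        outputs.append(y)
    return np.stack(outputs)

def bidirectional_forward(xs, fwd_params, bwd_params):
    """One Elman pass left-to-right, one right-to-left, concatenated per step."""
    forward = elman_forward(xs, *fwd_params)
    # Reverse the input, run the recurrence, then restore the original order.
    backward = elman_forward(xs[::-1], *bwd_params)[::-1]
    return np.concatenate([forward, backward], axis=1)

rng = np.random.default_rng(0)
T, n_in, n_hidden, n_out = 5, 3, 4, 2  # illustrative sizes
xs = rng.normal(size=(T, n_in))
W_x = rng.normal(size=(n_hidden, n_in))
W_h = rng.normal(size=(n_hidden, n_hidden))
W_y = rng.normal(size=(n_hidden, n_out))
W_o = rng.normal(size=(n_out, n_hidden))
b = np.zeros(n_hidden)

elman_states = elman_forward(xs, W_x, W_h, b)
jordan_outputs = jordan_forward(xs, W_x, W_y, W_o, b)
# Real bidirectional layers train separate weights per direction;
# the same weights are reused here only to keep the sketch short.
bi_states = bidirectional_forward(xs, (W_x, W_h, b), (W_x, W_h, b))
print(elman_states.shape, jordan_outputs.shape, bi_states.shape)
```

The only difference between the first two functions is which quantity is carried to the next step: `h` for Elman, `y` for Jordan.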

### Exercise
Use the IMDB movie reviews dataset to perform sentiment analysis with an Elman, a Jordan and a Bidirectional RNN.
Highlight the differences in the performance of each architecture.

In [23]:
from tensorflow.keras.datasets import imdb
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (
    Embedding, SimpleRNN, Dense, Bidirectional, GlobalMaxPooling1D
)

1. Load the IMDB movie reviews dataset

In [24]:
max_features = 5000  # Number of words to consider as features
max_len_short = 100  # Maximum sequence length for short sequences
max_len_long = 500   # Maximum sequence length for long sequences

(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
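Each review returned by `imdb.load_data` is a list of integers rather than text. With the default arguments, 0 is reserved for padding, 1 marks the start of a review, 2 stands for out-of-vocabulary words, and real words are stored as their frequency rank plus 3. The sketch below shows how such a list could be decoded; `decode_review` and the toy `word_index` are hypothetical names for illustration, and in the notebook the real mapping would come from `imdb.get_word_index()`.

```python
def decode_review(encoded, word_index, index_from=3):
    """Map integer tokens back to words, assuming the default Keras offsets."""
    reverse_index = {rank + index_from: word for word, rank in word_index.items()}
    special = {0: "<pad>", 1: "<start>", 2: "<oov>"}
    tokens = []
    for token in encoded:
        if token in special:
            tokens.append(special[token])
        else:
            tokens.append(reverse_index.get(token, "<oov>"))
    return " ".join(tokens)

# Toy vocabulary (word -> frequency rank), standing in for imdb.get_word_index().
toy_word_index = {"the": 1, "movie": 2, "was": 3, "great": 4}
toy_review = [0, 0, 1, 4, 5, 6, 2, 7]
decoded = decode_review(toy_review, toy_word_index)
print(decoded)  # <pad> <pad> <start> the movie was <oov> great
```

In the notebook itself the call would look like `decode_review(x_train[0], imdb.get_word_index())`.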

In [25]:
from tensorflow.keras.preprocessing.sequence import pad_sequences


2. Pad sequences to a fixed length for RNN input

In [26]:
x_train_short = pad_sequences(x_train, maxlen=max_len_short)
x_test_short = pad_sequences(x_test, maxlen=max_len_short)

x_train_long = pad_sequences(x_train, maxlen=max_len_long)
x_test_long = pad_sequences(x_test, maxlen=max_len_long)
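By default `pad_sequences` pads and truncates at the start of each sequence (`padding='pre'`, `truncating='pre'`), so with `maxlen=100` the models only see the last 100 tokens of a longer review. The function below is a small NumPy sketch of that default behavior, written for illustration; `pad_pre` is a hypothetical name, not part of Keras.

```python
import numpy as np

def pad_pre(sequences, maxlen, value=0):
    """Left-pad short sequences and keep only the last `maxlen` tokens of long ones."""
    out = np.full((len(sequences), maxlen), value, dtype="int32")
    for row, seq in enumerate(sequences):
        if len(seq) == 0:
            continue  # nothing to copy, the row stays all padding
        kept = seq[-maxlen:]           # truncation drops tokens from the front
        out[row, -len(kept):] = kept   # padding fills the front
    return out

padded = pad_pre([[1, 2, 3], [4, 5, 6, 7, 8]], maxlen=4)
print(padded.tolist())  # [[0, 1, 2, 3], [5, 6, 7, 8]]
```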

3. Build the distinct RNN models

In [33]:
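def build_elman_rnn_model():
    # SimpleRNN feeds its previous hidden state back into the layer (Elman-style).
    model = Sequential()
    model.add(Embedding(max_features, 32))
    model.add(SimpleRNN(32, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    return model

def build_jordan_rnn_model():
    # Note: Keras has no built-in Jordan layer, and this model does not feed
    # outputs back into the hidden layer. It is a SimpleRNN that returns the
    # hidden state at every time step, followed by max pooling over time.
    model = Sequential()
    model.add(Embedding(max_features, 32))
    model.add(SimpleRNN(32, activation='relu', return_sequences=True))
    model.add(GlobalMaxPooling1D())
    model.add(Dense(1, activation='sigmoid'))
    return model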



def build_bidirectional_rnn_model():
    model = Sequential()
    model.add(Embedding(max_features, 32))
    model.add(Bidirectional(SimpleRNN(32, activation='relu')))
    model.add(Dense(1, activation='sigmoid'))
    return model
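The three models differ less in size than their names might suggest. As a rough check, the trainable parameters can be counted by hand with the standard formulas for embedding, simple recurrent, and dense layers with biases enabled. The helper names below are hypothetical, and the totals are worth comparing against `model.summary()` rather than taking on trust.

```python
def embedding_params(vocab_size, dim):
    return vocab_size * dim

def simple_rnn_params(input_dim, units):
    # input weights + recurrent weights + biases
    return units * input_dim + units * units + units

def dense_params(input_dim, units):
    return input_dim * units + units

vocab_size, embed_dim, units = 5000, 32, 32  # values used in the models above

embedding = embedding_params(vocab_size, embed_dim)
rnn = simple_rnn_params(embed_dim, units)

# Elman model: Embedding -> SimpleRNN -> Dense(1)
elman_total = embedding + rnn + dense_params(units, 1)
# "Jordan" model as built above: the pooling layer adds no parameters.
jordan_total = embedding + rnn + dense_params(units, 1)
# Bidirectional: two SimpleRNN copies, outputs concatenated (default merge mode).
bidirectional_total = embedding + 2 * rnn + dense_params(2 * units, 1)

print(elman_total, jordan_total, bidirectional_total)  # 162113 162113 164225
```

Under these assumptions the embedding layer accounts for 160,000 of the parameters in every model, so the recurrent part is a small fraction of each network.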

4. Define a helper to train and evaluate a model

In [34]:
def train_and_evaluate_model(model, x_train, y_train, x_test, y_test):
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    history = model.fit(x_train, y_train, epochs=5, batch_size=128, validation_split=0.2)
    loss, accuracy = model.evaluate(x_test, y_test)
    return loss, accuracy, history
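The helper returns `history`, whose `history.history` attribute is a dictionary of per-epoch metrics (with the settings above: `loss`, `accuracy`, `val_loss`, `val_accuracy`). A small sketch of how it could be inspected follows; `best_epoch` is a hypothetical helper, and the numbers in `example_history` are made up for illustration and are not results from this notebook.

```python
def best_epoch(history_dict, metric="val_accuracy"):
    """Return (1-based epoch, value) where the metric is highest."""
    values = history_dict[metric]
    best_index = max(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]

# Made-up per-epoch values, shaped like history.history after 5 epochs.
example_history = {
    "accuracy": [0.70, 0.82, 0.88, 0.91, 0.94],
    "val_accuracy": [0.78, 0.83, 0.85, 0.84, 0.83],
}
epoch, value = best_epoch(example_history)
print(f"Best validation accuracy {value:.2f} at epoch {epoch}")
```

In the notebook the call would be, for example, `best_epoch(history_elman_short.history)`. A validation accuracy that peaks before the last epoch while training accuracy keeps rising is a common sign of overfitting.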

5. Train and evaluate the RNN models on short sequences

In [35]:
print("\nTraining Elman RNN model on short sequences:")
rnn_model_elman_short = build_elman_rnn_model()
loss_elman_short, accuracy_elman_short, history_elman_short = train_and_evaluate_model(
    rnn_model_elman_short, x_train_short, y_train, x_test_short, y_test
)


Training Elman RNN model on short sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [40]:
print("\nTraining Jordan RNN model on short sequences:")
rnn_model_jordan_short = build_jordan_rnn_model()
loss_jordan_short, accuracy_jordan_short, history_jordan_short = train_and_evaluate_model(
    rnn_model_jordan_short, x_train_short, y_train, x_test_short, y_test
)


Training Jordan RNN model on short sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [41]:
print("\nTraining Bidirectional RNN model on short sequences:")
rnn_model_bidirectional_short = build_bidirectional_rnn_model()
loss_bidirectional_short, accuracy_bidirectional_short, history_bidirectional_short = train_and_evaluate_model(
    rnn_model_bidirectional_short, x_train_short, y_train, x_test_short, y_test
)



Training Bidirectional RNN model on short sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


6. Train and evaluate the RNN models on long sequences

In [44]:
print("\nTraining Elman RNN model on long sequences:")
rnn_model_elman_long = build_elman_rnn_model()
loss_elman_long, accuracy_elman_long, history_elman_long = train_and_evaluate_model(
    rnn_model_elman_long, x_train_long, y_train, x_test_long, y_test
)


Training Elman RNN model on long sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [42]:
print("\nTraining Jordan RNN model on long sequences:")
rnn_model_jordan_long = build_jordan_rnn_model()
loss_jordan_long, accuracy_jordan_long, history_jordan_long = train_and_evaluate_model(
    rnn_model_jordan_long, x_train_long, y_train, x_test_long, y_test
)


Training Jordan RNN model on long sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [43]:
print("\nTraining Bidirectional RNN model on long sequences:")
rnn_model_bidirectional_long = build_bidirectional_rnn_model()
loss_bidirectional_long, accuracy_bidirectional_long, history_bidirectional_long = train_and_evaluate_model(
    rnn_model_bidirectional_long, x_train_long, y_train, x_test_long, y_test
)


Training Bidirectional RNN model on long sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


7. Compare the results

In [45]:
# Collect (model, sequence length, loss, accuracy) and print them in one loop.
results = [
    ("Elman RNN", "Short", loss_elman_short, accuracy_elman_short),
    ("Elman RNN", "Long", loss_elman_long, accuracy_elman_long),
    ("Jordan RNN", "Short", loss_jordan_short, accuracy_jordan_short),
    ("Jordan RNN", "Long", loss_jordan_long, accuracy_jordan_long),
    ("Bidirectional RNN", "Short", loss_bidirectional_short, accuracy_bidirectional_short),
    ("Bidirectional RNN", "Long", loss_bidirectional_long, accuracy_bidirectional_long),
]

for name, length, loss, accuracy in results:
    print(f"\nResults for {name} on {length} Sequences:")
    print(f"Loss: {loss:.4f}, Accuracy: {accuracy:.4f}")



Results for Elman RNN on Short Sequences:
Loss: 0.3885, Accuracy: 0.8369

Results for Elman RNN on Long Sequences:
Loss: 0.3330, Accuracy: 0.8631

Results for Jordan RNN on Short Sequences:
Loss: 0.4037, Accuracy: 0.8247

Results for Jordan RNN on Long Sequences:
Loss: 0.3630, Accuracy: 0.8612

Results for Bidirectional RNN on Short Sequences:
Loss: 0.4451, Accuracy: 0.8294

Results for Bidirectional RNN on Long Sequences:
Loss: 0.3623, Accuracy: 0.8470
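To highlight the differences the exercise asks for, the printed results can be arranged side by side. The sketch below copies the reported losses and accuracies into a dictionary and computes the accuracy gain from 100-token to 500-token inputs; `reported` and `summarize` are hypothetical names introduced here.

```python
# (loss, accuracy) pairs copied from the printed results above.
reported = {
    "Elman": {"short": (0.3885, 0.8369), "long": (0.3330, 0.8631)},
    "Jordan": {"short": (0.4037, 0.8247), "long": (0.3630, 0.8612)},
    "Bidirectional": {"short": (0.4451, 0.8294), "long": (0.3623, 0.8470)},
}

def summarize(results):
    """Return (name, short acc, long acc, gain) rows, best long accuracy first."""
    rows = []
    for name, runs in results.items():
        acc_short = runs["short"][1]
        acc_long = runs["long"][1]
        rows.append((name, acc_short, acc_long, acc_long - acc_short))
    return sorted(rows, key=lambda row: row[2], reverse=True)

rows = summarize(reported)
print(f"{'Model':<15}{'Acc (100)':>10}{'Acc (500)':>10}{'Gain':>9}")
for name, acc_short, acc_long, gain in rows:
    print(f"{name:<15}{acc_short:>10.4f}{acc_long:>10.4f}{gain:>+9.4f}")
```

In this run every model improves with the longer input window, and the plain Elman model reaches the highest test accuracy on both lengths. The gaps are at most a few percentage points and come from a single run without a fixed random seed, so repeated runs would be needed before ranking the architectures.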
