# 12 Advanced Recurrent Neural Networks

Advanced recurrent neural network (RNN) architectures are widely used in sequence modeling and processing. They extend traditional feedforward neural networks with recurrent connections, which give them temporal dynamics and a form of memory.

* The Elman RNN feeds its hidden state back into the hidden layer at the next time step. This simple recurrent loop captures short-term temporal dependencies and suits applications such as speech recognition and time series analysis.
* The Jordan RNN feeds the previous output back into the hidden layer, so its memory is carried by the output rather than by the hidden state. It has been applied to machine translation and language modeling tasks.
* The Bidirectional RNN processes the sequence in both the forward and backward directions, so each prediction can draw on past and future context. This makes it effective in natural language processing tasks such as sentiment analysis and named entity recognition.

These architectures extend the modeling capabilities of feedforward neural networks and are widely used in sequential data processing applications.
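
To make the difference between the Elman and Jordan feedback paths concrete, here is a minimal pure-Python sketch of a forward pass through each. The weight names (`W_x`, `W_h`, `W_o`, `W_y`), the tanh activation, the omitted biases, and the toy values are assumptions chosen for illustration; they are not taken from the exercise that follows.

```python
import math

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def add(a, b):
    """Element-wise sum of two equally sized vectors."""
    return [x + y for x, y in zip(a, b)]

def elman_forward(xs, W_x, W_h, W_y):
    """Elman network: the previous hidden state feeds the hidden layer."""
    h = [0.0] * len(W_h)
    outputs = []
    for x in xs:
        h = [math.tanh(a) for a in add(matvec(W_x, x), matvec(W_h, h))]
        outputs.append(matvec(W_y, h))
    return outputs

def jordan_forward(xs, W_x, W_o, W_y):
    """Jordan network: the previous output feeds the hidden layer."""
    y = [0.0] * len(W_y)
    outputs = []
    for x in xs:
        h = [math.tanh(a) for a in add(matvec(W_x, x), matvec(W_o, y))]
        y = matvec(W_y, h)
        outputs.append(y)
    return outputs

# Toy setup (hypothetical values): 1 input feature, 2 hidden units, 1 output.
xs = [[1.0], [0.5], [-1.0]]
W_x = [[0.5], [-0.3]]            # input  -> hidden
W_h = [[0.1, 0.2], [0.0, 0.4]]   # hidden -> hidden (Elman feedback)
W_o = [[0.7], [-0.2]]            # output -> hidden (Jordan feedback)
W_y = [[1.0, -1.0]]              # hidden -> output

elman_out = elman_forward(xs, W_x, W_h, W_y)
jordan_out = jordan_forward(xs, W_x, W_o, W_y)
print("Elman: ", elman_out)
print("Jordan:", jordan_out)
```

Both networks start from a zero state, so their first outputs coincide; they diverge from the second step onward because the feedback signal differs.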

### Exercise
Use the IMDB movie reviews dataset to perform sentiment analysis with an Elman, a Jordan, and a Bidirectional RNN.
Highlight the differences in the performance of each architecture.

In [8]:
from keras.datasets import imdb
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Embedding, SimpleRNN, Dense, Bidirectional
import tensorflow as tf
from tensorflow import keras

1. Load the IMDB movie reviews dataset

In [9]:
max_features = 5000  # Number of words to consider as features
max_len_short = 100  # Maximum sequence length for short sequences
max_len_long = 500   # Maximum sequence length for long sequences

(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)

2. Pad sequences to a fixed length for RNN input

In [10]:
x_train_short = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_len_short)
x_test_short = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_len_short)

x_train_long = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_len_long)
x_test_long = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_len_long)
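
With its default arguments, `pad_sequences` pads and truncates at the start of each sequence, so a review longer than `maxlen` keeps only its last `maxlen` tokens. The helper `pad_pre` below is a hypothetical pure-Python stand-in that illustrates this behavior on toy token lists; it is a sketch, not the Keras implementation.

```python
def pad_pre(tokens, maxlen, value=0):
    """Mimic the default pad_sequences behavior for a single sequence."""
    if maxlen <= 0:
        raise ValueError("maxlen must be positive")
    kept = tokens[-maxlen:]                       # pre-truncation: keep the last maxlen tokens
    return [value] * (maxlen - len(kept)) + kept  # pre-padding: fill values go first

# Toy token ids (hypothetical), not real IMDB indices.
short_review = [11, 22, 33]
long_review = [1, 2, 3, 4, 5, 6]

padded_short = pad_pre(short_review, 5)
padded_long = pad_pre(long_review, 4)
print(padded_short)  # [0, 0, 11, 22, 33]
print(padded_long)   # [3, 4, 5, 6]
```

For the exercise, this means the short variant (`max_len_short = 100`) sees only the final 100 tokens of longer reviews, while the long variant keeps up to the final 500.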

3. Build the distinct RNN models

In [14]:
from keras.layers import GlobalMaxPooling1D

def build_elman_rnn_model():
    model = Sequential()
    model.add(Embedding(max_features, 32))
    model.add(SimpleRNN(32, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    return model

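def build_jordan_rnn_model():
    # Note: SimpleRNN recurs on the hidden state (Elman-style). This model stands in for a
    # Jordan network by pooling the per-step outputs rather than feeding the output back.
    model = Sequential()
    model.add(Embedding(max_features, 32))
    model.add(SimpleRNN(32, activation='relu', return_sequences=True))
    # Reduces the per-timestep outputs to one vector per review so the shape fits the last layer;
    # without this line the model output has an extra dimension and training raises an error.
    model.add(GlobalMaxPooling1D())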
    model.add(Dense(1, activation='sigmoid'))
    return model

def build_bidirectional_rnn_model():
    model = Sequential()
    model.add(Embedding(max_features, 32))
    model.add(Bidirectional(SimpleRNN(32, activation='relu')))
    model.add(Dense(1, activation='sigmoid'))
    return model
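
The shape change in the Jordan-style builder can be illustrated on a single sample. With `return_sequences=True` the recurrent layer emits one vector per time step, and global max pooling keeps the largest value of each feature across time. The function `global_max_pool_1d` and the toy values below are assumptions for illustration, not the Keras layer itself.

```python
def global_max_pool_1d(hidden_states):
    """Take the per-feature maximum over the time axis.

    hidden_states: list of time steps, each a list of `units` floats.
    Returns one list of `units` floats.
    """
    return [max(feature) for feature in zip(*hidden_states)]

# One sample with 4 time steps and 3 units (toy values).
states = [
    [0.1, 0.9, 0.0],
    [0.4, 0.2, 0.3],
    [0.2, 0.5, 0.8],
    [0.0, 0.1, 0.6],
]

pooled = global_max_pool_1d(states)
print(pooled)  # [0.4, 0.9, 0.8]
```

The time axis disappears, so the final `Dense(1)` layer receives one fixed-size vector per review, just as it does in the other two builders.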

4. Train and evaluate the RNN model

In [15]:
def train_and_evaluate_model(model, x_train, y_train, x_test, y_test):
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    history = model.fit(x_train, y_train, epochs=5, batch_size=128, validation_split=0.2)
    loss, accuracy = model.evaluate(x_test, y_test)
    return loss, accuracy, history
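
The two numbers returned by `model.evaluate` can be reproduced by hand on a tiny example. The sketch below computes mean binary cross-entropy and accuracy at a 0.5 threshold in pure Python; the function names, the clipping constant `eps`, and the toy labels and probabilities are assumptions for illustration.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)], with p clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the label."""
    hits = sum(int(p > threshold) == y for y, p in zip(y_true, y_prob))
    return hits / len(y_true)

# Toy labels and sigmoid outputs (hypothetical values).
labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.4, 0.6]

loss = binary_crossentropy(labels, probs)
acc = binary_accuracy(labels, probs)
print(f"Loss: {loss:.4f}, Accuracy: {acc:.4f}")
```

Loss and accuracy can move in different directions: loss penalizes confident wrong predictions heavily, while accuracy only counts which side of the threshold each prediction falls on.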

5. Train and evaluate the RNN models on short sequences

In [16]:
print("\nTraining RNN Elman model on short sequences:")
rnn_model_short_elman = build_elman_rnn_model()
loss_short_elmam, accuracy_short_elman, history_short_elman = train_and_evaluate_model(
    rnn_model_short_elman, x_train_short, y_train, x_test_short, y_test
)

print("\nTraining RNN Jordan model on short sequences:")
rnn_model_short_jordan = build_jordan_rnn_model()
loss_short_jordan, accuracy_short_jordan, history_short_jordan = train_and_evaluate_model(
    rnn_model_short_jordan, x_train_short, y_train, x_test_short, y_test
)

print("\nTraining RNN Bidirectional model on short sequences:")
rnn_model_short_bidirection = build_bidirectional_rnn_model()
loss_short_bidirection, accuracy_short_bidirection, history_short_bidirection = train_and_evaluate_model(
    rnn_model_short_bidirection, x_train_short, y_train, x_test_short, y_test
)


Training RNN Elman model on short sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5

Training RNN Jordan model on short sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5

Training RNN Bidirectional model on short sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


6. Train and evaluate the RNN models on long sequences

In [17]:
print("\nTraining Elman RNN model on long sequences:")
rnn_model_long_elman = build_elman_rnn_model()
loss_long_elman, accuracy_long_elman, history_long_elman = train_and_evaluate_model(
    rnn_model_long_elman, x_train_long, y_train, x_test_long, y_test
)

print("\nTraining Jordan RNN model on long sequences:")
rnn_model_long_jordan = build_jordan_rnn_model()
loss_long_jordan, accuracy_long_jordan, history_long_jordan = train_and_evaluate_model(
    rnn_model_long_jordan, x_train_long, y_train, x_test_long, y_test
)

print("\nTraining Bidirectional RNN model on long sequences:")
rnn_model_long_bidirection = build_bidirectional_rnn_model()
loss_long_bidirection, accuracy_long_bidirection, history_long_bidirection = train_and_evaluate_model(
    rnn_model_long_bidirection, x_train_long, y_train, x_test_long, y_test
)


Training Elman RNN model on long sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5

Training Jordan RNN model on long sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5

Training Bidirectional RNN model on long sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


7. Compare the results

In [18]:
print("\nResults on Short Sequences of Elman:")
print(f"Loss: {loss_short_elmam:.4f}, Accuracy: {accuracy_short_elman:.4f}")

print("\nResults on Long Sequences of Elman:")
print(f"Loss: {loss_long_elman:.4f}, Accuracy: {accuracy_long_elman:.4f}")


Results on Short Sequences of Elman:
Loss: 0.3622, Accuracy: 0.8424

Results on Long Sequences of Elman:
Loss: 0.3962, Accuracy: 0.8318


In [19]:
print("\nResults on Short Sequences of Jordan:")
print(f"Loss: {loss_short_jordan:.4f}, Accuracy: {accuracy_short_jordan:.4f}")

print("\nResults on Long Sequences of Jordan:")
print(f"Loss: {loss_long_jordan:.4f}, Accuracy: {accuracy_long_jordan:.4f}")


Results on Short Sequences of Jordan:
Loss: 0.4381, Accuracy: 0.8231

Results on Long Sequences of Jordan:
Loss: 0.3320, Accuracy: 0.8550


In [20]:
print("\nResults on Short Sequences of Bidirection:")
print(f"Loss: {loss_short_bidirection:.4f}, Accuracy: {accuracy_short_bidirection:.4f}")

print("\nResults on Long Sequences of Bidirection:")
print(f"Loss: {loss_long_bidirection:.4f}, Accuracy: {accuracy_long_bidirection:.4f}")


Results on Short Sequences of Bidirection:
Loss: 0.3980, Accuracy: 0.8371

Results on Long Sequences of Bidirection:
Loss: 0.3430, Accuracy: 0.8653


We can conclude that the Bidirectional model achieves the highest accuracy of the three models on long sequences, at 0.8653.
On short sequences, the Elman model achieves the highest accuracy, at 0.8424.

As for loss, the Jordan model has the highest loss on short sequences, at 0.4381, while on long sequences the Elman model has the highest loss of the three, at 0.3962.
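
The six results printed above can be gathered into one table to make the comparison easier to read. The sketch below hard-codes the reported loss and accuracy values; the helper names `best_by_accuracy` and `worst_by_loss` are hypothetical. Each model was trained once, so small gaps between models should be read with caution.

```python
# (model, sequence length) -> (test loss, test accuracy), copied from the outputs above.
results = {
    ("Elman", "short"): (0.3622, 0.8424),
    ("Elman", "long"): (0.3962, 0.8318),
    ("Jordan", "short"): (0.4381, 0.8231),
    ("Jordan", "long"): (0.3320, 0.8550),
    ("Bidirectional", "short"): (0.3980, 0.8371),
    ("Bidirectional", "long"): (0.3430, 0.8653),
}

def best_by_accuracy(seq_len):
    """Return the model with the highest accuracy for the given sequence length."""
    scores = {m: acc for (m, s), (_, acc) in results.items() if s == seq_len}
    return max(scores, key=scores.get)

def worst_by_loss(seq_len):
    """Return the model with the highest loss for the given sequence length."""
    losses = {m: loss for (m, s), (loss, _) in results.items() if s == seq_len}
    return max(losses, key=losses.get)

print(f"{'Model':<15}{'Length':<8}{'Loss':>8}{'Accuracy':>10}")
for (model_name, seq_len), (loss, acc) in results.items():
    print(f"{model_name:<15}{seq_len:<8}{loss:>8.4f}{acc:>10.4f}")

for seq_len in ("short", "long"):
    print(f"{seq_len}: best accuracy = {best_by_accuracy(seq_len)}, "
          f"highest loss = {worst_by_loss(seq_len)}")
```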