# 12 Advanced Recurrent Neural Networks

Advanced recurrent neural network (RNN) architectures are central to sequence modeling and processing in deep learning. They extend traditional feedforward networks with recurrent connections, which carry a state from one time step to the next and so give the network temporal dynamics and a form of memory.

* The Elman RNN feeds the previous activation of its hidden layer back into the hidden layer. This simple recurrent loop lets it capture short-term temporal dependencies, and it has been applied to tasks such as speech recognition and time series analysis.
* The Jordan RNN feeds the previous output of the network, rather than the hidden state, back into the hidden layer. This feedback path is intended to help it model longer-term dependencies, and it has been applied to machine translation and language modeling.
* The Bidirectional RNN processes the sequence in both the forward and the backward direction, so each prediction can draw on past and future context. This makes it effective for natural language processing tasks such as sentiment analysis and named entity recognition.

These recurrent architectures extend the modeling capabilities of feedforward networks and are widely used in sequential data processing applications.
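The difference between the three feedback schemes can be sketched in a few lines of NumPy. This is an illustrative sketch, not the implementation used in the exercise below: the weight names (`W_xh`, `W_hh`, `W_yh`, `W_hy`), the `tanh` activation, the linear output layer and the toy dimensions are all assumptions chosen for readability.

```python
import numpy as np

def elman_step(x_t, h_prev, W_xh, W_hh, b_h):
    """Elman update: the previous hidden state is fed back into the hidden layer."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

def jordan_step(x_t, y_prev, W_xh, W_yh, b_h):
    """Jordan update: the previous network output is fed back into the hidden layer."""
    return np.tanh(W_xh @ x_t + W_yh @ y_prev + b_h)

def run_elman(xs, W_xh, W_hh, b_h):
    """Run an Elman RNN over a sequence and return the final hidden state."""
    h = np.zeros(W_hh.shape[0])
    for x_t in xs:
        h = elman_step(x_t, h, W_xh, W_hh, b_h)
    return h

def run_jordan(xs, W_xh, W_yh, W_hy, b_h, b_y):
    """Run a Jordan RNN and return the final hidden state and output."""
    h = np.zeros(W_xh.shape[0])
    y = np.zeros(W_hy.shape[0])
    for x_t in xs:
        h = jordan_step(x_t, y, W_xh, W_yh, b_h)
        y = W_hy @ h + b_y  # linear output layer (an assumption of this sketch)
    return h, y

def run_bidirectional(xs, fwd_params, bwd_params):
    """Concatenate the final states of a forward and a time-reversed Elman pass."""
    h_fwd = run_elman(xs, *fwd_params)
    h_bwd = run_elman(xs[::-1], *bwd_params)
    return np.concatenate([h_fwd, h_bwd])

# Toy dimensions and random weights, for illustration only.
rng = np.random.default_rng(0)
n_in, n_hid, n_out, n_steps = 4, 3, 2, 5
xs = rng.normal(size=(n_steps, n_in))

def elman_params():
    return (rng.normal(size=(n_hid, n_in)),
            rng.normal(size=(n_hid, n_hid)),
            np.zeros(n_hid))

h_elman = run_elman(xs, *elman_params())
h_jordan, y_jordan = run_jordan(
    xs,
    rng.normal(size=(n_hid, n_in)),   # W_xh
    rng.normal(size=(n_hid, n_out)),  # W_yh
    rng.normal(size=(n_out, n_hid)),  # W_hy
    np.zeros(n_hid),                  # b_h
    np.zeros(n_out),                  # b_y
)
h_bi = run_bidirectional(xs, elman_params(), elman_params())
print(h_elman.shape, y_jordan.shape, h_bi.shape)
```

Note that the Jordan feedback vector has the size of the output layer, while the Elman feedback vector has the size of the hidden layer, and the bidirectional state is twice as wide as a single Elman state.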

### Exercise
Use the IMDB movie reviews dataset to perform sentiment analysis with an Elman, a Jordan and a Bidirectional RNN.
Highlight the differences in the performance of each architecture.

In [1]:
#Libraries for method 1
from keras.datasets import imdb
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Embedding, SimpleRNN, Dense, Bidirectional

In [2]:
#Libraries for method 2
import tensorflow as tf
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing import sequence
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Embedding, SimpleRNN, Dense

In [3]:
#Libraries for method 3
from keras.datasets import imdb
from tensorflow.keras.preprocessing import sequence  # Change: import sequence (for pad_sequences) from tensorflow.keras
from keras.models import Sequential
from keras.layers import Embedding, SimpleRNN, Dense, Bidirectional

1. Load the IMDB movie reviews dataset

In [4]:
max_features = 5000  # Number of words to consider as features
max_len_short = 100  # Maximum sequence length for short sequences
max_len_long = 500   # Maximum sequence length for long sequences

(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz
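Each review returned by `imdb.load_data` is a list of integer word ids. The sketch below shows how such a list maps back to text under the default encoding, in which ids 0, 1 and 2 are reserved for padding, start-of-sequence and out-of-vocabulary words and every word rank is shifted by `index_from=3`. The helper name `decode_review` and the three-word `toy_word_index` are hypothetical and used only for illustration.

```python
def decode_review(encoded, word_index, index_from=3):
    """Map integer ids back to words under the default `imdb.load_data` encoding."""
    # Word ranks start at 1 and are shifted by `index_from` in the encoded data.
    reverse_index = {rank + index_from: word for word, rank in word_index.items()}
    # Ids 0, 1 and 2 are reserved for padding, start-of-sequence and out-of-vocabulary.
    specials = {0: "<pad>", 1: "<start>", 2: "<oov>"}
    return " ".join(specials.get(i) or reverse_index.get(i, "<oov>") for i in encoded)

# Hypothetical word index (word -> frequency rank), standing in for the real one.
toy_word_index = {"the": 1, "movie": 2, "great": 3}
decoded = decode_review([1, 4, 5, 2, 6], toy_word_index)
print(decoded)  # <start> the movie <oov> great
```

With the real dataset, the word index can be obtained with `imdb.get_word_index()` and passed in place of `toy_word_index`. Because `num_words=max_features` is set above, words outside the 5000 most frequent appear as the out-of-vocabulary id.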


2. Pad sequences to a fixed length for RNN input

In [5]:
x_train_short = sequence.pad_sequences(x_train, maxlen=max_len_short)
x_test_short = sequence.pad_sequences(x_test, maxlen=max_len_short)

x_train_long = sequence.pad_sequences(x_train, maxlen=max_len_long)
x_test_long = sequence.pad_sequences(x_test, maxlen=max_len_long)
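With its default arguments, `pad_sequences` pads with zeros at the start of short sequences and truncates long sequences from the start, so only the last `maxlen` tokens of a long review are kept. The pure-Python function below is a sketch of that default behavior for illustration; the name `pad_pre` is hypothetical, and it returns nested lists rather than the NumPy array that Keras produces.

```python
def pad_pre(sequences, maxlen, value=0):
    """Mimic the default pre-padding and pre-truncation of `pad_sequences`."""
    padded = []
    for seq in sequences:
        kept = list(seq)[-maxlen:]  # pre-truncation: keep the last `maxlen` tokens
        padded.append([value] * (maxlen - len(kept)) + kept)  # pre-padding with `value`
    return padded

example = pad_pre([[1, 2, 3, 4, 5], [6, 7]], maxlen=4)
print(example)  # [[2, 3, 4, 5], [0, 0, 6, 7]]
```

For this exercise it means that the short variant (`max_len_short = 100`) sees only the final 100 tokens of each review, while the long variant sees up to the final 500.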

3. Build the distinct RNN models

In [6]:
#method 1
def build_elman_rnn_model():
    model = Sequential()
    model.add(Embedding(max_features, 32))
    model.add(SimpleRNN(32, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    return model

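#method 2
def build_jordan_rnn_model(max_len):
    # Note: Keras has no built-in Jordan layer. SimpleRNN feeds back its own hidden
    # state (Elman-style), so this model does not feed the output layer back into the
    # hidden layer. It differs from method 1 only in being built with the functional API.
    inputs = Input(shape=(max_len,))
    embedding = Embedding(max_features, 32)(inputs)
    # With return_sequences=False, rnn_state equals rnn_output; it is not used further.
    rnn_output, rnn_state = SimpleRNN(32, activation='relu', return_sequences=False, return_state=True)(embedding)
    output = Dense(1, activation='sigmoid')(rnn_output)
    model = Model(inputs=inputs, outputs=output)
    return model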

#method 3
def build_bidirectional_rnn_model():
    model = Sequential()
    model.add(Embedding(max_features, 32))
    model.add(Bidirectional(SimpleRNN(32, activation='relu'), merge_mode='concat'))
    model.add(Dense(1, activation='sigmoid'))
    return model
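A quick way to compare the three builders is to count their trainable parameters by hand. The sketch below uses the standard formulas for `Embedding`, `SimpleRNN` and `Dense` layers with biases; the helper function names are hypothetical. Since the Elman and Jordan builders above stack the same layers with the same sizes, the Elman count applies to both.

```python
def embedding_params(vocab_size, dim):
    """One `dim`-sized vector per vocabulary entry."""
    return vocab_size * dim

def simple_rnn_params(input_dim, units):
    """Input kernel + recurrent kernel + bias of a SimpleRNN layer."""
    return units * (input_dim + units + 1)

def dense_params(input_dim, units):
    """Kernel + bias of a Dense layer."""
    return units * (input_dim + 1)

max_features, embed_dim, units = 5000, 32, 32

elman_total = (embedding_params(max_features, embed_dim)
               + simple_rnn_params(embed_dim, units)
               + dense_params(units, 1))

# Bidirectional wraps two independent SimpleRNN layers; merge_mode='concat'
# doubles the input width of the final Dense layer.
bidirectional_total = (embedding_params(max_features, embed_dim)
                       + 2 * simple_rnn_params(embed_dim, units)
                       + dense_params(2 * units, 1))

print(elman_total)          # 162113
print(bidirectional_total)  # 164225
```

In all three models the embedding table accounts for 160,000 of the parameters, so the recurrent layers themselves are small by comparison.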

4. Train and evaluate the RNN model

In [7]:
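def train_and_evaluate_model(model, x_train, y_train, x_test, y_test):
    # Compile for binary classification, train for 5 epochs while holding out 20% of
    # the training data for validation, then report loss and accuracy on the test set.
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    history = model.fit(x_train, y_train, epochs=5, batch_size=128, validation_split=0.2)
    loss, accuracy = model.evaluate(x_test, y_test)
    return loss, accuracy, history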
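The arithmetic behind the `fit` call can be checked by hand. Keras takes the validation samples from the end of the training arrays, before any shuffling, and trains on the rest. The sketch below computes the resulting sizes for the 25,000 IMDB training reviews; the function name `fit_split_sizes` is hypothetical.

```python
import math

def fit_split_sizes(n_samples, validation_split, batch_size):
    """Return (training samples, validation samples, batches per epoch)."""
    n_train = int(math.floor(n_samples * (1.0 - validation_split)))
    n_val = n_samples - n_train  # taken from the end of the arrays
    steps_per_epoch = math.ceil(n_train / batch_size)
    return n_train, n_val, steps_per_epoch

sizes = fit_split_sizes(n_samples=25000, validation_split=0.2, batch_size=128)
print(sizes)  # (20000, 5000, 157)
```

The reported loss and accuracy come from `model.evaluate` on the separate test set, not from this validation portion.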


#Model 1

5. Train and evaluate the RNN model on short sequences

In [8]:
print("\nTraining RNN model on short sequences:")
rnn_model_short = build_elman_rnn_model()
loss_short, accuracy_short, history_short = train_and_evaluate_model(
    rnn_model_short, x_train_short, y_train, x_test_short, y_test
)


Training RNN model on short sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


6. Train and evaluate the RNN model on long sequences

In [9]:
print("\nTraining Elman RNN model on long sequences:")
rnn_model_long = build_elman_rnn_model()
loss_long, accuracy_long, history_long = train_and_evaluate_model(
    rnn_model_long, x_train_long, y_train, x_test_long, y_test
)


Training Elman RNN model on long sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


7. Compare the results

#Model 2

In [16]:
print("\nTraining Jordan RNN model on short sequences:")
jordan_model_short = build_jordan_rnn_model(max_len_short)
loss_jordan_short, accuracy_jordan_short, history_jordan_short = train_and_evaluate_model(
    jordan_model_short, x_train_short, y_train, x_test_short, y_test
)


Training Jordan RNN model on short sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [19]:
print("\nTraining Jordan RNN model on long sequences:")
jordan_model_long = build_jordan_rnn_model(max_len_long)
loss_jordan_long,  accuracy_jordan_long, history_jordan_long = train_and_evaluate_model(
    jordan_model_long, x_train_long, y_train, x_test_long, y_test
)


Training Jordan RNN model on long sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


#Model 3

In [12]:
print("\nTraining Bidirectional RNN model on short sequences:")  # Change: add a print statement for the new model
bidirectional_model_short = build_bidirectional_rnn_model()
loss_bidirectional_short, accuracy_bidirectional_short, history_bidirectional_short = train_and_evaluate_model(
    bidirectional_model_short, x_train_short, y_train, x_test_short, y_test
)


Training Bidirectional RNN model on short sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [13]:
print("\nTraining Bidirectional RNN model on long sequences:")  # Change: add a print statement for the new model
bidirectional_model_long = build_bidirectional_rnn_model()
loss_bidirectional_long, accuracy_bidirectional_long, history_bidirectional_long = train_and_evaluate_model(
    bidirectional_model_long, x_train_long, y_train, x_test_long, y_test
)


Training Bidirectional RNN model on long sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


#Results obtained

In [20]:
#Model 1
print("Model 1 results\n")
print("\nResults on Short Sequences:")
print(f"Loss: {loss_short:.4f}, Accuracy: {accuracy_short:.4f}")

print("\nResults on Long Sequences:")
print(f"Loss: {loss_long:.4f}, Accuracy: {accuracy_long:.4f}")

print("\n=====================================================")
print("Model 2 results\n")
print("\nResults on Short Sequences:")
print(f"Loss: {loss_jordan_short:.4f}, Accuracy: {accuracy_jordan_short:.4f}")

print("\nResults on Long Sequences:")
print(f"Loss: {loss_jordan_long:.4f}, Accuracy: {accuracy_jordan_long:.4f}")

print("\n=====================================================")
print("Model 3 results\n")
print("\nResults on Bidirectional Short Sequences:")
print(f"Loss: {loss_bidirectional_short:.4f}, Accuracy: {accuracy_bidirectional_short:.4f}")

print("\nResults on Bidirectional Long Sequences:")
print(f"Loss: {loss_bidirectional_long:.4f}, Accuracy: {accuracy_bidirectional_long:.4f}")

Model 1 results


Results on Short Sequences:
Loss: 0.4145, Accuracy: 0.8378

Results on Long Sequences:
Loss: 0.4071, Accuracy: 0.8283

Model 2 results


Results on Short Sequences:
Loss: 0.3833, Accuracy: 0.6552

Results on Long Sequences:
Loss: 0.3393, Accuracy: 0.8563

Model 3 results


Results on Bidirectional Short Sequences:
Loss: 0.3665, Accuracy: 0.8457

Results on Bidirectional Long Sequences:
Loss: 0.5368, Accuracy: 0.7252
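To highlight the differences the exercise asks for, the reported test metrics can be collected and ranked. The sketch below copies the numbers from the output above, labels each model with the name the notebook gives it (Model 1 as Elman, Model 2 as Jordan, Model 3 as Bidirectional) and uses 100 and 500 for the short and long sequence lengths. These figures come from a single run of 5 epochs with no random seed fixed, so the gaps between models may not reproduce exactly.

```python
# Test-set metrics copied from the output above:
# (model, sequence length) -> (loss, accuracy)
results = {
    ("Elman", 100): (0.4145, 0.8378),
    ("Elman", 500): (0.4071, 0.8283),
    ("Jordan", 100): (0.3833, 0.6552),
    ("Jordan", 500): (0.3393, 0.8563),
    ("Bidirectional", 100): (0.3665, 0.8457),
    ("Bidirectional", 500): (0.5368, 0.7252),
}

def accuracy_change(model):
    """Accuracy on 500-token inputs minus accuracy on 100-token inputs."""
    return results[(model, 500)][1] - results[(model, 100)][1]

# Rank all six runs by test accuracy, best first.
ranked = sorted(results.items(), key=lambda item: item[1][1], reverse=True)
for (model, max_len), (loss, acc) in ranked:
    print(f"{model:<14} maxlen={max_len:<4} loss={loss:.4f} accuracy={acc:.4f}")

# Show how each model reacts to longer inputs.
for model in ("Elman", "Jordan", "Bidirectional"):
    print(f"{model:<14} accuracy change (long - short): {accuracy_change(model):+.4f}")
```

In this run the model labeled Jordan on long sequences has the highest test accuracy, the Elman model is almost insensitive to sequence length, and the Bidirectional model loses about 12 points of accuracy when the input grows from 100 to 500 tokens.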
