# 12 Advanced Recurrent Neural Networks

Recurrent neural network (RNN) architectures are widely used in deep learning for sequence modeling and processing. They extend traditional feedforward networks with recurrent connections, which give them temporal dynamics and a form of memory over earlier inputs.

* The Elman RNN feeds its hidden layer back into itself at the next time step, which lets it capture short-term temporal dependencies. It has been applied to tasks such as speech recognition and time series analysis.
* The Jordan RNN feeds the output layer back into the hidden layer, so the state carried between time steps is the network's previous output rather than its hidden activations. It is sometimes described as better suited to longer-term dependencies, with applications in machine translation and language modeling.
* The Bidirectional RNN processes the sequence both forward and backward, so each prediction can use past and future context. This makes it effective in natural language processing tasks such as sentiment analysis and named entity recognition.
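
The three recurrences described above can be sketched in plain Python with scalar states. This is only an illustration: the function names and the weights `w_x`, `w_h`, `w_y`, and `v` are assumptions made for the sketch, and real layers use weight matrices and bias terms.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def elman_states(xs, w_x=0.5, w_h=0.8):
    """Elman: the previous hidden state feeds back into the hidden layer."""
    h, states = 0.0, []
    for x in xs:
        h = math.tanh(w_x * x + w_h * h)
        states.append(h)
    return states

def jordan_states(xs, w_x=0.5, w_y=0.8, v=1.0):
    """Jordan: the previous output feeds back into the hidden layer."""
    y, states = 0.0, []
    for x in xs:
        h = math.tanh(w_x * x + w_y * y)  # uses the previous output y
        y = sigmoid(v * h)                # output computed from the hidden state
        states.append(h)
    return states

def bidirectional_states(xs, **weights):
    """Bidirectional: pair a forward pass with a pass over the reversed input."""
    forward = elman_states(xs, **weights)
    backward = elman_states(xs[::-1], **weights)[::-1]
    return list(zip(forward, backward))

xs = [1.0, 0.0, -1.0]
print(elman_states(xs))
print(jordan_states(xs))
print(bidirectional_states(xs))
```

In the bidirectional case, the backward state at the last position has seen only the last input, while the forward state at that position has seen the whole sequence.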

These recurrent architectures extend what traditional feedforward networks can model and are widely used in applications that process sequential data.

### Exercise
Use the IMDB movie reviews dataset to perform sentiment analysis with an Elman, a Jordan, and a Bidirectional RNN.
Highlight the differences in performance between the architectures.

In [4]:
from datetime import datetime, timezone, timedelta
# Import everything from tensorflow.keras to avoid mixing two Keras installations
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing import sequence
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SimpleRNN, Dense, Bidirectional, GlobalMaxPool1D

1. Load the IMDB movie reviews dataset

In [5]:
max_features = 5000  # Number of words to consider as features
max_len_short = 100  # Maximum sequence length for short sequences
max_len_long = 500   # Maximum sequence length for long sequences

(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)

2. Pad sequences to a fixed length for RNN input

In [6]:
x_train_short = sequence.pad_sequences(x_train, maxlen=max_len_short)
x_test_short = sequence.pad_sequences(x_test, maxlen=max_len_short)

x_train_long = sequence.pad_sequences(x_train, maxlen=max_len_long)
x_test_long = sequence.pad_sequences(x_test, maxlen=max_len_long)
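
By default `pad_sequences` pads and truncates at the start of each sequence (`padding='pre'`, `truncating='pre'`), so with `maxlen=100` only the last 100 tokens of a longer review are kept. The pure-Python function below is a hypothetical re-implementation of that default for a single sequence, written only to illustrate the behavior; it is not the Keras code.

```python
def pad_pre(seq, maxlen, value=0):
    """Mimic default pre-truncation and pre-padding on one sequence."""
    seq = list(seq)[-maxlen:]                    # keep the last `maxlen` tokens
    return [value] * (maxlen - len(seq)) + seq   # left-pad with `value`

print(pad_pre([5, 6, 7], maxlen=5))            # [0, 0, 5, 6, 7]
print(pad_pre([1, 2, 3, 4, 5, 6], maxlen=4))   # [3, 4, 5, 6]
```

The short and long inputs therefore differ in how much of the end of each review the model sees, not in which reviews are used.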

3. Build the distinct RNN models

In [7]:
def build_elman_rnn_model():
    # Elman-style network: SimpleRNN feeds its hidden state back into itself.
    model = Sequential()
    model.add(Embedding(max_features, 32))
    model.add(SimpleRNN(32, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    return model

def build_jordan_rnn_model():
    # Note: Keras has no built-in Jordan layer. This model is a stand-in that
    # keeps every hidden state and max-pools over time; it does not feed the
    # output back into the hidden layer as a true Jordan network would.
    model = Sequential()
    model.add(Embedding(max_features, 32))
    model.add(SimpleRNN(32, activation='relu', return_sequences=True))
    #model.add(Flatten())  # Add a Flatten layer
    #model.add(TimeDistributed(Dense(1, activation='sigmoid')))  # Time-distributed output layer
    model.add(GlobalMaxPool1D())
    model.add(Dense(1, activation='sigmoid'))
    return model

def build_bidirectional_rnn_model():
    # Two SimpleRNN passes (forward and backward) whose final states are concatenated.
    model = Sequential()
    model.add(Embedding(max_features, 32))
    model.add(Bidirectional(SimpleRNN(32, activation='relu')))
    model.add(Dense(1, activation='sigmoid'))
    return model
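
A quick parameter count can help when interpreting the run times reported later. The arithmetic below is a sketch that assumes the standard `SimpleRNN` weight layout (input kernel, recurrent kernel, bias), the default concatenation of the two directions in `Bidirectional`, and the layer sizes used in the builders above; the helper names are made up for this example.

```python
def simple_rnn_params(input_dim, units):
    # input kernel + recurrent kernel + bias
    return input_dim * units + units * units + units

def dense_params(input_dim, units):
    # kernel + bias
    return input_dim * units + units

embedding = 5000 * 32                # max_features x embedding size
rnn = simple_rnn_params(32, 32)      # one recurrent layer with 32 units

elman = embedding + rnn + dense_params(32, 1)
pooled = embedding + rnn + dense_params(32, 1)        # max pooling adds no weights
bidirectional = embedding + 2 * rnn + dense_params(64, 1)

print(elman, pooled, bidirectional)  # 162113 162113 164225
```

The bidirectional model adds only about two thousand weights, so any extra cost comes mainly from running two recurrent passes over every sequence rather than from model size.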

4. Train and evaluate the RNN model

In [8]:
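def train_and_evaluate_model(model, x_train, y_train, x_test, y_test):
    """Compile, train for 5 epochs, and return test loss, test accuracy, and history."""
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    # 20% of the training data is held out for validation during training
    history = model.fit(x_train, y_train, epochs=5, batch_size=128, validation_split=0.2)
    # Final metrics are computed on the separate test set
    loss, accuracy = model.evaluate(x_test, y_test)
    return loss, accuracy, history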
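
To make the training setup concrete, the following arithmetic works out how many samples and batches each epoch uses. It assumes the standard IMDB training split of 25,000 reviews; the variable names are chosen for this sketch only.

```python
import math

n_train_total = 25_000      # size of the IMDB training split
validation_split = 0.2
batch_size = 128

n_val = round(n_train_total * validation_split)   # held out for validation
n_fit = n_train_total - n_val                     # used for weight updates
steps_per_epoch = math.ceil(n_fit / batch_size)

print(n_fit, n_val, steps_per_epoch)   # 20000 5000 157
```

Keras takes the validation samples from the end of the arrays before shuffling, so the same 5,000 reviews are held out for every model.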

5. Train and evaluate the Elman RNN model on short sequences



In [9]:
start_time = datetime.now(timezone(timedelta(hours=-5)))
print("Start time:", start_time)

print("\nTraining Elman RNN model on short sequences:")
rnn_model_short = build_elman_rnn_model()
loss_short, accuracy_short, history_short = train_and_evaluate_model(
    rnn_model_short, x_train_short, y_train, x_test_short, y_test
)

end_time = datetime.now(timezone(timedelta(hours=-5)))
print("End time:", end_time)

execution_time = end_time - start_time
print("Total execution time:", execution_time)

Start time: 2023-08-04 20:25:55.409682-05:00

Training Elman RNN model on short sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
End time: 2023-08-04 20:28:31.165663-05:00
Total execution time: 0:02:35.755981
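
The start and stop bookkeeping in this cell is repeated for every model. One way to factor it out is a small context manager. `timed` is a hypothetical helper, not part of the notebook, and it uses `time.perf_counter` instead of timezone-aware datetimes.

```python
import time
from contextlib import contextmanager
from datetime import timedelta

@contextmanager
def timed(label, results):
    """Store the elapsed time of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = timedelta(seconds=time.perf_counter() - start)
        print(f"{label}: {results[label]}")

timings = {}
with timed("demo sleep", timings):
    time.sleep(0.05)   # stand-in for building, training, and evaluating a model
```

Collecting the durations in one dictionary also avoids copying them by hand into the comparison at the end.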


6. Train and evaluate the Elman RNN model on long sequences.

In [10]:
start_time = datetime.now(timezone(timedelta(hours=-5)))
print("Start time:", start_time)

print("\nTraining Elman RNN model on long sequences:")
rnn_model_long = build_elman_rnn_model()
loss_long, accuracy_long, history_long = train_and_evaluate_model(
    rnn_model_long, x_train_long, y_train, x_test_long, y_test
)

end_time = datetime.now(timezone(timedelta(hours=-5)))
print("End time:", end_time)

execution_time = end_time - start_time
print("Total execution time:", execution_time)

Start time: 2023-08-04 20:28:31.177206-05:00

Training Elman RNN model on long sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
End time: 2023-08-04 20:37:11.456196-05:00
Total execution time: 0:08:40.278990


7. Compare the results

In [16]:
print("\nResults on Short Sequences:")
print(f"Loss: {loss_short:.4f}, Accuracy: {accuracy_short:.4f}")

print("\nResults on Long Sequences:")
print(f"Loss: {loss_long:.4f}, Accuracy: {accuracy_long:.4f}")


Results on Short Sequences:
Loss: 0.4191, Accuracy: 0.8363

Results on Long Sequences:
Loss: 0.3696, Accuracy: 0.8454


8. Train and evaluate the Jordan RNN model on short sequences




In [18]:
start_time = datetime.now(timezone(timedelta(hours=-5)))
print("Start time:", start_time)

print("\nTraining Jordan RNN model on short sequences:")
rnn_model_short = build_jordan_rnn_model()
loss_short, accuracy_short, history_short = train_and_evaluate_model(
    rnn_model_short, x_train_short, y_train, x_test_short, y_test
)

end_time = datetime.now(timezone(timedelta(hours=-5)))
print("End time:", end_time)

execution_time = end_time - start_time
print("Total execution time:", execution_time)

Start time: 2023-08-04 20:44:34.228598-05:00

Training Jordan RNN model on short sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
End time: 2023-08-04 20:46:42.695749-05:00
Total execution time: 0:02:08.467151


9. Train and evaluate the Jordan RNN model on long sequences

In [19]:
start_time = datetime.now(timezone(timedelta(hours=-5)))
print("Start time:", start_time)

print("\nTraining Jordan RNN model on long sequences:")
rnn_model_long = build_jordan_rnn_model()
loss_long, accuracy_long, history_long = train_and_evaluate_model(
    rnn_model_long, x_train_long, y_train, x_test_long, y_test
)

end_time = datetime.now(timezone(timedelta(hours=-5)))
print("End time:", end_time)

execution_time = end_time - start_time
print("Total execution time:", execution_time)

Start time: 2023-08-04 20:46:42.711087-05:00

Training Jordan RNN model on long sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
End time: 2023-08-04 20:55:46.932527-05:00
Total execution time: 0:09:04.221440


10. Compare the results

In [21]:
print("\nResults on Short Sequences:")
print(f"Loss: {loss_short:.4f}, Accuracy: {accuracy_short:.4f}")

print("\nResults on Long Sequences:")
print(f"Loss: {loss_long:.4f}, Accuracy: {accuracy_long:.4f}")


Results on Short Sequences:
Loss: 0.4021, Accuracy: 0.8272

Results on Long Sequences:
Loss: 0.3282, Accuracy: 0.8629


11. Train and evaluate the Bidirectional RNN model on short sequences

In [22]:
start_time = datetime.now(timezone(timedelta(hours=-5)))
print("Start time:", start_time)

print("\nTraining Bidirectional RNN model on short sequences:")
rnn_model_short = build_bidirectional_rnn_model()
loss_short, accuracy_short, history_short = train_and_evaluate_model(
    rnn_model_short, x_train_short, y_train, x_test_short, y_test
)

end_time = datetime.now(timezone(timedelta(hours=-5)))
print("End time:", end_time)

execution_time = end_time - start_time
print("Total execution time:", execution_time)

Start time: 2023-08-04 20:58:48.886321-05:00

Training Bidirectional RNN model on short sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
End time: 2023-08-04 21:02:27.182522-05:00
Total execution time: 0:03:38.296201


12. Train and evaluate the Bidirectional RNN model on long sequences

In [23]:
# Note: this cell uses a naive datetime (no UTC-5 offset), so the timestamps
# below are in the host's local time, unlike the earlier cells.
start_time = datetime.now()
print("Start time:", start_time)
print("\nTraining Bidirectional RNN model on long sequences:")
rnn_model_long = build_bidirectional_rnn_model()
loss_long, accuracy_long, history_long = train_and_evaluate_model(
    rnn_model_long, x_train_long, y_train, x_test_long, y_test
)
end_time = datetime.now()
print("End time:", end_time)

execution_time = end_time - start_time
print("Total execution time:", execution_time)

Start time: 2023-08-05 02:02:27.200772

Training Bidirectional RNN model on long sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
End time: 2023-08-05 02:18:13.509753
Total execution time: 0:15:46.308981


13. Compare the results

In [25]:
print("\nResults on Short Sequences:")
print(f"Loss: {loss_short:.4f}, Accuracy: {accuracy_short:.4f}")

print("\nResults on Long Sequences:")
print(f"Loss: {loss_long:.4f}, Accuracy: {accuracy_long:.4f}")


Results on Short Sequences:
Loss: 0.4125, Accuracy: 0.8403

Results on Long Sequences:
Loss: 0.3429, Accuracy: 0.8644


14. Accuracy comparison:

Test accuracy differed between short and long sequences for each type of RNN. The results are as follows:

Elman RNN:

Accuracy on short sequences: 0.8363

Accuracy on long sequences: 0.8454

Jordan RNN:

Accuracy on short sequences: 0.8272

Accuracy on long sequences: 0.8629

Bidirectional RNN:

Accuracy on short sequences: 0.8403

Accuracy on long sequences: 0.8644

These results suggest that the Jordan RNN gains the most accuracy when moving from short to long sequences (+0.0357). The other architectures also improve, but by less: +0.0241 for the Bidirectional RNN and +0.0091 for the Elman RNN.
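
The gains can be recomputed directly from the reported accuracies. The snippet below is a small sketch that hard-codes the figures listed above. Each figure comes from a single training run, so small differences between architectures should be read with caution.

```python
# (short-sequence accuracy, long-sequence accuracy) as reported above
results = {
    "Elman": (0.8363, 0.8454),
    "Jordan": (0.8272, 0.8629),
    "Bidirectional": (0.8403, 0.8644),
}

# Accuracy gained by moving from 100-token to 500-token inputs
gains = {
    name: round(long_acc - short_acc, 4)
    for name, (short_acc, long_acc) in results.items()
}

for name, gain in gains.items():
    print(f"{name}: {gain:+.4f}")

best = max(gains, key=gains.get)
print("Largest gain:", best)
```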

15. Execution time:

Wall-clock run times were measured for short and long sequences with each architecture. Each figure covers model construction, training, and evaluation on the test set. The results are as follows:

Elman RNN:

Run time on short sequences: 0:02:35.755981

Run time on long sequences: 0:08:40.278990

Jordan RNN:

Run time on short sequences: 0:02:08.467151

Run time on long sequences: 0:09:04.221440

Bidirectional RNN:

Run time on short sequences: 0:03:38.296201

Run time on long sequences: 0:15:46.308981

The Bidirectional RNN takes the longest in both settings, especially on long sequences, where it needs 15:46 compared with 8:40 for the Elman RNN and 9:04 for the Jordan RNN.
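
To see how run time scales with sequence length, the reported durations can be converted to ratios. This is a sketch that parses the figures listed above; `parse_duration` is a helper written for this example.

```python
from datetime import timedelta

def parse_duration(hms):
    """Convert an 'H:MM:SS.ffffff' string into a timedelta."""
    hours, minutes, seconds = hms.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))

# (short-sequence run time, long-sequence run time) as reported above
run_times = {
    "Elman": ("0:02:35.755981", "0:08:40.278990"),
    "Jordan": ("0:02:08.467151", "0:09:04.221440"),
    "Bidirectional": ("0:03:38.296201", "0:15:46.308981"),
}

# Dividing two timedeltas yields a plain float
ratios = {
    name: parse_duration(long_t) / parse_duration(short_t)
    for name, (short_t, long_t) in run_times.items()
}

for name, ratio in ratios.items():
    print(f"{name}: long/short = {ratio:.2f}x")
```

All three ratios fall between roughly 3.3x and 4.3x, below the 5x increase in sequence length from 100 to 500 tokens.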

16. Strengths and weaknesses:

Overall, the Elman, Jordan, and Bidirectional RNN architectures are all able to learn and generalize patterns in the movie reviews. Some differences stand out:

- The Elman RNN performs well on both sequence lengths, with slightly higher accuracy on long sequences.
- The Jordan RNN improves notably on long sequences, which suggests it captures long-range dependencies better, although the model built in this notebook pools over all time steps, which may also explain the gain.
- The Bidirectional RNN reaches the highest accuracy on both sequence lengths (0.8403 and 0.8644) but has the longest run times, which warrant closer analysis.

In practice, the choice of architecture depends on the sequence length and on how much training time is acceptable.

