
# 12 Advanced Recurrent Neural Networks

Advanced recurrent neural network (RNN) architectures are widely used in deep learning for sequence modeling and processing. They extend traditional feedforward networks with recurrent connections, which give them temporal dynamics and a form of memory.

* The Elman RNN feeds its hidden state back into the hidden layer at the next time step. This simple loop captures short-term temporal dependencies and suits applications such as speech recognition and time series analysis.
* The Jordan RNN instead feeds the output layer back into the hidden layer, so the context carried between time steps is the previous output rather than the previous hidden state. It has been applied to tasks such as machine translation and language modeling.
* The Bidirectional RNN processes the sequence both forward and backward, so each prediction can draw on past and future context. This makes it effective in natural language processing tasks such as sentiment analysis and named entity recognition.

These architectures extend what traditional feedforward networks can model and are widely used in sequential data processing applications.
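The three recurrences described above can be sketched in plain NumPy. This is a minimal illustration, not the Keras implementation: the weight names (`W_x`, `W_h`, `W_y`, `W_o`), the tiny dimensions, and the `tanh`/sigmoid activations are assumptions chosen for readability, and the bidirectional pass reuses one set of weights for both directions, whereas a real bidirectional layer learns separate weights for each.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out, T = 4, 3, 1, 5           # illustrative sizes (assumed)
xs = rng.normal(size=(T, n_in))               # one toy input sequence
W_x = rng.normal(size=(n_hid, n_in))          # input  -> hidden
W_h = rng.normal(size=(n_hid, n_hid))         # hidden -> hidden (Elman feedback)
W_y = rng.normal(size=(n_hid, n_out))         # output -> hidden (Jordan feedback)
W_o = rng.normal(size=(n_out, n_hid))         # hidden -> output
b_h = np.zeros(n_hid)
b_o = np.zeros(n_out)

def elman_step(x_t, h_prev, W_x, W_h, b_h):
    """Elman update: the previous hidden state feeds the hidden layer."""
    return np.tanh(W_x @ x_t + W_h @ h_prev + b_h)

def jordan_step(x_t, y_prev, W_x, W_y, b_h):
    """Jordan update: the previous output feeds the hidden layer."""
    return np.tanh(W_x @ x_t + W_y @ y_prev + b_h)

def readout(h_t):
    """Sigmoid output layer applied to a hidden state."""
    return 1.0 / (1.0 + np.exp(-(W_o @ h_t + b_o)))

def run_elman(xs):
    h = np.zeros(n_hid)
    states = []
    for x_t in xs:
        h = elman_step(x_t, h, W_x, W_h, b_h)
        states.append(h)
    return np.stack(states)

def run_jordan(xs):
    y = np.zeros(n_out)
    states = []
    for x_t in xs:
        h = jordan_step(x_t, y, W_x, W_y, b_h)
        y = readout(h)                         # this output is fed back next step
        states.append(h)
    return np.stack(states)

def run_bidirectional(xs):
    fwd = run_elman(xs)                        # left-to-right pass
    bwd = run_elman(xs[::-1])[::-1]            # right-to-left pass, re-aligned in time
    return np.concatenate([fwd, bwd], axis=1)  # concatenate both directions per step

print(run_elman(xs).shape, run_jordan(xs).shape, run_bidirectional(xs).shape)
```

The only difference between `elman_step` and `jordan_step` is which quantity is carried between time steps, which is the distinction the list above draws.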

### Exercise
Use the IMDB movie reviews dataset to perform sentiment analysis with an Elman, a Jordan, and a Bidirectional RNN.
Highlight the differences in the performance of each architecture.

In [7]:
from keras.datasets import imdb
from keras.models import Sequential
from keras.layers import Embedding, SimpleRNN, Dense, Bidirectional
from tensorflow.keras.preprocessing.sequence import pad_sequences

#### 1. Load the IMDB movie reviews dataset

In [2]:
max_features = 5000  # Number of words to consider as features
max_len_short = 100  # Maximum sequence length for short sequences
max_len_long = 500   # Maximum sequence length for long sequences

(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz


#### 2. Pad sequences to a fixed length for RNN input

In [11]:
x_train_short = pad_sequences(x_train, maxlen=max_len_short)
x_test_short = pad_sequences(x_test, maxlen=max_len_short)

x_train_long = pad_sequences(x_train, maxlen=max_len_long)
x_test_long = pad_sequences(x_test, maxlen=max_len_long)
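With its default arguments, `pad_sequences` pads on the left with zeros and, for sequences longer than `maxlen`, keeps the last `maxlen` tokens. The sketch below mimics that default behavior in NumPy so it can be inspected without downloading the dataset. The helper name `pad_pre` and the toy token ids are hypothetical, and the helper assumes `maxlen` is positive.

```python
import numpy as np

def pad_pre(seqs, maxlen, value=0):
    """Left-pad with `value` and keep the last `maxlen` tokens of long sequences."""
    out = np.full((len(seqs), maxlen), value, dtype="int32")
    for i, seq in enumerate(seqs):
        kept = seq[-maxlen:]                   # truncate from the front
        out[i, maxlen - len(kept):] = kept     # right-align, zeros on the left
    return out

toy_reviews = [[11, 12, 13], [21, 22, 23, 24, 25, 26]]  # made-up token ids
padded = pad_pre(toy_reviews, maxlen=4)
print(padded)
```

Under these defaults, a review longer than `max_len_short` contributes only its final 100 tokens to the short-sequence experiments, while `max_len_long` keeps up to the final 500.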

#### 3. Build the distinct RNN models

In [38]:
def build_elman_rnn_model():
    model = Sequential()
    model.add(Embedding(max_features, 32))
    model.add(SimpleRNN(32, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    return model

def build_jordan_rnn_model():
    model_jd = Sequential()
    model_jd.add(Embedding(max_features, 32))
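    # return_sequences=False (the default) returns only the last time step's output,
    # so the layer produces a single vector at the end of the sequence.
    # Note: SimpleRNN feeds back its hidden state (Elman-style), so this model has the
    # same structure as build_elman_rnn_model(). A true Jordan network would feed the
    # previous output back into the hidden layer instead.
    model_jd.add(SimpleRNN(32, activation='relu', return_sequences=False))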
    model_jd.add(Dense(1, activation='sigmoid'))
    return model_jd

def build_bidirectional_rnn_model():
    model_bd = Sequential()
    model_bd.add(Embedding(max_features, 32))
    model_bd.add(Bidirectional(SimpleRNN(32, activation='relu')))
    model_bd.add(Dense(1, activation='sigmoid'))
    return model_bd
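To compare the size of the models built above, the trainable parameter counts can be worked out by hand. The sketch below uses the standard formulas for an embedding, a simple recurrent layer (input kernel, recurrent kernel, bias), and a dense layer. The helper names are hypothetical, and the bidirectional count assumes the two directions are concatenated, which is the Keras default. Because `build_elman_rnn_model` and `build_jordan_rnn_model` stack the same layers, they share the first total.

```python
def embedding_params(vocab_size, embed_dim):
    return vocab_size * embed_dim

def simple_rnn_params(input_dim, units):
    # input kernel + recurrent kernel + bias
    return units * input_dim + units * units + units

def dense_params(input_dim, units):
    return input_dim * units + units

vocab_size, embed_dim, units = 5000, 32, 32    # values used in the builders above

unidirectional_total = (
    embedding_params(vocab_size, embed_dim)
    + simple_rnn_params(embed_dim, units)
    + dense_params(units, 1)
)

bidirectional_total = (
    embedding_params(vocab_size, embed_dim)
    + 2 * simple_rnn_params(embed_dim, units)  # one RNN per direction
    + dense_params(2 * units, 1)               # Dense sees the concatenated outputs
)

print(unidirectional_total, bidirectional_total)
```

The embedding dominates both totals, so the bidirectional model is only slightly larger than the unidirectional ones.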

#### 4. Train and evaluate the RNN model

##### Elman

In [13]:
def train_and_evaluate_model(model, x_train, y_train, x_test, y_test):
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    history = model.fit(x_train, y_train, epochs=5, batch_size=128, validation_split=0.2)
    loss, accuracy = model.evaluate(x_test, y_test)
    return loss, accuracy, history

##### Jordan

In [26]:
def train_and_evaluate_model_jd(model_jd, x_train, y_train, x_test, y_test):
    model_jd.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    history_jd = model_jd.fit(x_train, y_train, epochs=5, batch_size=128, validation_split=0.2)
    loss_jd, accuracy_jd = model_jd.evaluate(x_test, y_test)
    return loss_jd, accuracy_jd, history_jd

##### Bidirectional

In [35]:
def train_and_evaluate_model_bd(model_bd, x_train, y_train, x_test, y_test):
    model_bd.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    history_bd = model_bd.fit(x_train, y_train, epochs=5, batch_size=128, validation_split=0.2)
    loss_bd, accuracy_bd = model_bd.evaluate(x_test, y_test)
    return loss_bd, accuracy_bd, history_bd
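All three helpers use `validation_split=0.2` and `batch_size=128`. In Keras, `validation_split` holds out the last fraction of the samples passed to `fit`, before any shuffling. The arithmetic below is a sketch of what that means for each epoch, assuming the standard 25,000-review IMDB training split. The variable names are illustrative only.

```python
import math

n_train_total = 25000     # assumed size of the IMDB training split
validation_split = 0.2    # as passed to model.fit above
batch_size = 128          # as passed to model.fit above

n_val = round(n_train_total * validation_split)   # reviews held out for validation
n_fit = n_train_total - n_val                     # reviews actually used for fitting
steps_per_epoch = math.ceil(n_fit / batch_size)   # gradient updates per epoch
last_batch = n_fit - (steps_per_epoch - 1) * batch_size

print(n_fit, n_val, steps_per_epoch, last_batch)
```

Over 5 epochs each model therefore performs `5 * steps_per_epoch` weight updates, and the reported test metrics come from the separate test split passed to `evaluate`.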

#### 5. Train and evaluate the RNN model on short sequences

##### Elman

In [21]:
print("\nTraining Elman RNN model on short sequences:")
rnn_model_short = build_elman_rnn_model()
loss_short, accuracy_short, history_short = train_and_evaluate_model(
    rnn_model_short, x_train_short, y_train, x_test_short, y_test
)


Training Elman RNN model on short sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


##### Jordan

In [43]:
print("\nTraining Jordan RNN model on short sequences:")
rnn_model_jd_short = build_jordan_rnn_model()
loss_short_jd, accuracy_short_jd, history_short_jd = train_and_evaluate_model_jd(
    rnn_model_jd_short, x_train_short, y_train, x_test_short, y_test
)


Training Jordan RNN model on short sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


##### Bidirectional

In [44]:
print("\nTraining Bidirectional RNN model on short sequences:")
rnn_model_bd_short = build_bidirectional_rnn_model()
loss_short_bd, accuracy_short_bd, history_short_bd = train_and_evaluate_model_bd(
    rnn_model_bd_short, x_train_short, y_train, x_test_short, y_test
)


Training Bidirectional RNN model on short sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


#### 6. Train and evaluate the RNN model on long sequences

##### Elman

In [39]:
print("\nTraining Elman RNN model on long sequences:")
rnn_model_long = build_elman_rnn_model()
loss_long, accuracy_long, history_long = train_and_evaluate_model(
    rnn_model_long, x_train_long, y_train, x_test_long, y_test
)


Training Elman RNN model on long sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


##### Jordan

In [45]:
print("\nTraining Jordan RNN model on long sequences:")
rnn_model_long_jd = build_jordan_rnn_model()
loss_long_jd, accuracy_long_jd, history_long_jd = train_and_evaluate_model_jd(
    rnn_model_long_jd, x_train_long, y_train, x_test_long, y_test
)


Training Jordan RNN model on long sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


##### Bidirectional

In [46]:
print("\nTraining Bidirectional RNN model on long sequences:")
rnn_model_long_bd = build_bidirectional_rnn_model()
loss_long_bd, accuracy_long_bd, history_long_bd = train_and_evaluate_model_bd(
    rnn_model_long_bd, x_train_long, y_train, x_test_long, y_test
)


Training Bidirectional RNN model on long sequences:
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


#### 7. Compare the results

In [47]:
#Elman
print("\nResults on Short Sequences (Elman):")
print(f"Loss: {loss_short:.4f}, Accuracy: {accuracy_short:.4f}")

print("\nResults on Long Sequences (Elman):")
print(f"Loss: {loss_long:.4f}, Accuracy: {accuracy_long:.4f}")

#Jordan
print("\nResults on Short Sequences (Jordan):")
print(f"Loss: {loss_short_jd:.4f}, Accuracy: {accuracy_short_jd:.4f}")

print("\nResults on Long Sequences (Jordan):")
print(f"Loss: {loss_long_jd:.4f}, Accuracy: {accuracy_long_jd:.4f}")

#Bidirectional
print("\nResults on Short Sequences (Bidirectional):")
print(f"Loss: {loss_short_bd:.4f}, Accuracy: {accuracy_short_bd:.4f}")

print("\nResults on Long Sequences (Bidirectional):")
print(f"Loss: {loss_long_bd:.4f}, Accuracy: {accuracy_long_bd:.4f}")


Results on Short Sequences (Elman):
Loss: 0.3681, Accuracy: 0.8418

Results on Long Sequences (Elman):
Loss: 0.3536, Accuracy: 0.8472

Results on Short Sequences (Jordan):
Loss: 0.4126, Accuracy: 0.8369

Results on Long Sequences (Jordan):
Loss: 0.4489, Accuracy: 0.7917

Results on Short Sequences (Bidirectional):
Loss: 0.3923, Accuracy: 0.8372

Results on Long Sequences (Bidirectional):
Loss: 0.3337, Accuracy: 0.8630


In [48]:
from tabulate import tabulate

# Collect the measured loss and accuracy for each model and sequence length
results = [
    ["Elman", "Short Sequences", loss_short, accuracy_short],
    ["Elman", "Long Sequences", loss_long, accuracy_long],
    ["Jordan", "Short Sequences", loss_short_jd, accuracy_short_jd],
    ["Jordan", "Long Sequences", loss_long_jd, accuracy_long_jd],
    ["Bidirectional", "Short Sequences", loss_short_bd, accuracy_short_bd],
    ["Bidirectional", "Long Sequences", loss_long_bd, accuracy_long_bd]
]

headers = ["Model Type", "Sequence Length", "Loss", "Accuracy"]

print("\nResults for Different RNN Models:")
print("=" * 50)

# Print the results in a formatted table
print(tabulate(results, headers=headers, tablefmt="grid"))



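Results for Different RNN Models:
==================================================
+---------------+-------------------+----------+------------+
| Model Type    | Sequence Length   |     Loss |   Accuracy |
+===============+===================+==========+============+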
| Elman         | Short Sequences   | 0.368058 |    0.84176 |
+---------------+-------------------+----------+------------+
| Elman         | Long Sequences    | 0.353645 |    0.84724 |
+---------------+-------------------+----------+------------+
| Jordan        | Short Sequences   | 0.412564 |    0.83692 |
+---------------+-------------------+----------+------------+
| Jordan        | Long Sequences    | 0.448894 |    0.79168 |
+---------------+-------------------+----------+------------+
| Bidirectional | Short Sequences   | 0.392342 |    0.83724 |
+---------------+-------------------+----------+------------+
| Bidirectional | Long Sequences    | 0.333651 |    0.86304 |
+---------------+-------------------+----------+------------+


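According to the results, the Bidirectional model achieved the highest accuracy on long sequences (86.30%, with a loss of 0.3337). On short sequences it reached 83.72% (loss 0.3923), slightly behind the Elman model at 84.18% and essentially level with the Jordan model at 83.69%. Because the Jordan builder uses the same SimpleRNN layer as the Elman builder, the gap between those two reflects run-to-run variation (random initialization and shuffling) rather than a different architecture.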
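The ranking can be checked directly from the reported numbers. This is a minimal sketch with the values copied from the results table above. The helper `best_by_accuracy` is hypothetical and not part of the original notebook.

```python
# (model, sequence length, test loss, test accuracy) as reported in the table
reported = [
    ("Elman", "short", 0.368058, 0.84176),
    ("Elman", "long", 0.353645, 0.84724),
    ("Jordan", "short", 0.412564, 0.83692),
    ("Jordan", "long", 0.448894, 0.79168),
    ("Bidirectional", "short", 0.392342, 0.83724),
    ("Bidirectional", "long", 0.333651, 0.86304),
]

def best_by_accuracy(rows, length):
    """Return the row with the highest accuracy for the given sequence length."""
    candidates = [row for row in rows if row[1] == length]
    return max(candidates, key=lambda row: row[3])

best_short = best_by_accuracy(reported, "short")
best_long = best_by_accuracy(reported, "long")

print(f"Best on short sequences: {best_short[0]} ({best_short[3]:.4f})")
print(f"Best on long sequences:  {best_long[0]} ({best_long[3]:.4f})")
```

These figures come from a single training run per configuration, so differences of well under one percentage point should be read with caution.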