# Bi-Directional RNN and Sentiment Analysis
A Bidirectional Recurrent Neural Network (Bi-RNN) is an extension of the traditional RNN that can improve model performance on sequence classification problems. In a Bi-RNN, the input sequence is processed in both forward and backward directions, allowing the network to have information from both past and future states. This is particularly useful for tasks where context from both directions is important, such as sentiment analysis, where understanding the sentiment of a sentence often requires context from the entire sentence.

In this notebook, we will explore the use of Bi-RNNs for a sentiment-analysis-style task on a dataset of tweets: binary classification of whether a tweet is about a real disaster. We will compare the performance of different models, including a shallow RNN, a unidirectional LSTM, and a bidirectional LSTM, to understand the benefits of using bidirectional layers.
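
To make the idea of reading a sequence in both directions concrete, here is a minimal NumPy sketch of a bidirectional pass built from a plain tanh RNN. It is only an illustration, not how Keras implements `Bidirectional`; the helper names (`simple_rnn`, `bidirectional_rnn`), the random weights, and the toy sizes are all assumptions made for this example.

```python
import numpy as np

def simple_rnn(inputs, W_x, W_h, b):
    """Run a plain tanh RNN over inputs of shape (timesteps, features).

    Returns the hidden state at every time step, shape (timesteps, units).
    """
    h = np.zeros(W_h.shape[0])
    states = []
    for x_t in inputs:
        h = np.tanh(W_x @ x_t + W_h @ h + b)
        states.append(h)
    return np.stack(states)

def bidirectional_rnn(inputs, fwd_params, bwd_params):
    """Concatenate a forward pass with a backward pass at each time step."""
    fwd = simple_rnn(inputs, *fwd_params)
    # Feed the reversed sequence, then flip the states back to the original order
    bwd = simple_rnn(inputs[::-1], *bwd_params)[::-1]
    return np.concatenate([fwd, bwd], axis=-1)

rng = np.random.default_rng(0)
timesteps, features, units = 5, 3, 4

def make_params():
    # Hypothetical random weights, just to have something to run
    return (rng.normal(size=(units, features)),
            rng.normal(size=(units, units)),
            np.zeros(units))

fwd_p, bwd_p = make_params(), make_params()
x = rng.normal(size=(timesteps, features))
out = bidirectional_rnn(x, fwd_p, bwd_p)
print(out.shape)  # (5, 8): 4 forward units + 4 backward units per time step
```

At the first time step, the forward half of the output has only seen the first token, while the backward half has already read the whole sequence from the end. That is the sense in which a Bi-RNN has access to "future" context.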
/**
Dataset Description:
  
This dataset is used for predicting whether a given tweet is about a real disaster or not. The dataset consists of three main files: train.csv, test.csv, and sample_submission.csv.
  
 Files:
  - train.csv: This is the set that contains labeled data.
  
 Data Format:
  Each sample in the train and test set includes the following information:
  - The text of a tweet.
  - A keyword from that tweet (this field may be blank).
  - The location from where the tweet was sent (this field may also be blank).
  
  Prediction Task:
  You are predicting whether a given tweet is about a real disaster or not. If the tweet is about a real disaster, predict a 1. If not, predict a 0.
  
 Columns:
  - id: A unique identifier for each tweet.
  - text: The text content of the tweet.
  - location: The location from where the tweet was sent (this field may be blank).
  - keyword: A particular keyword from the tweet (this field may be blank).
  - target: This column is present only in train.csv and denotes whether a tweet is about a real disaster (1) or not (0).
 */

In [None]:
# Load, explore and plot data
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Train test split
from sklearn.model_selection import train_test_split

# Evaluation metrics
from sklearn.metrics import f1_score, accuracy_score, precision_score, recall_score

# Text pre-processing
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.callbacks import EarlyStopping

# Modeling
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, GRU, Dense, Embedding, Dropout, Bidirectional, Flatten, GlobalAveragePooling1D, GlobalMaxPool1D

## Data Splitting and Pre-processing
In this section, we will load the dataset and perform initial data exploration. We will then split the data into training and testing sets. After splitting, we will preprocess the text data by tokenizing and padding the sequences to ensure uniform input length for the neural network models. This pre-processing step is crucial for preparing the data for training the models.

In [None]:
# Read the CSV file into a DataFrame
df = pd.read_csv('train.csv')

# Select only the 'text' and 'target' columns from the DataFrame
df = df[['text', 'target']]

# Rename the 'target' column to 'label'
df.rename(columns={'target': 'label'}, inplace=True)

# Display the first 10 rows of the DataFrame
df.head(10)

In [None]:
# Extract the tweet text and the labels as NumPy arrays
tweet = df['text'].values   # numpy.ndarray containing the text data
y = df['label'].values      # numpy.ndarray containing the label data

# Descriptive statistics (central tendency, dispersion, shape) of the DataFrame
df.describe()

In [None]:
# Group by 'label' and show descriptive statistics for each class (transposed).
# This expression must be the last one in the cell for its output to be displayed.
df.groupby('label').describe().T


**Distribution of Labels**

The following plot shows the distribution of disaster and non-disaster tweets in the dataset. This helps us understand the balance of the dataset and the proportion of each class.

In [None]:
plt.figure(figsize=(8,6))
sns.countplot(data=df, x='label', hue='label', palette=['#FF9999', '#66B2FF'], legend=False)

plt.xticks([0, 1], ['Not Disaster', 'Disaster'])

plt.xlabel('Disaster Tweets', fontsize=14)
plt.ylabel('Count', fontsize=14)
plt.title('Distribution of Labels', fontsize=16)

plt.show()

**Train Test Split**


The dataset is split into training and testing sets using a 90-10 split ratio. The `train_test_split` function from `sklearn.model_selection` is used for this purpose, ensuring that the data is randomly shuffled and split. The `random_state` parameter is set to 1000 to ensure reproducibility of the results.

In [None]:
tweet_train, tweet_test, y_train, y_test = train_test_split(tweet, y, test_size=0.1, random_state=1000)
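
As a rough illustration of what a seeded 90-10 shuffle split does, here is a pure-Python sketch. It does not reproduce scikit-learn's exact shuffling; the function name `shuffled_split` is hypothetical, and the row count of 7,613 is an assumption chosen so that the training size matches the 6,851 rows noted in the shape comment later in this notebook.

```python
import math
import random

def shuffled_split(items, test_size=0.1, seed=1000):
    """Shuffle indices with a fixed seed, then cut off the first ceil(test_size * n) as test."""
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(test_size * len(items))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return [items[i] for i in train_idx], [items[i] for i in test_idx]

rows = list(range(7613))  # assumed number of labeled tweets
train, test = shuffled_split(rows)
print(len(train), len(test))  # 6851 762
```

Because the seed is fixed, calling the function again returns the same partition, which is the role `random_state` plays in `train_test_split`.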

**Tokenization**

The `Tokenizer` class from Keras is used to vectorize a corpus of text by turning each text into either a sequence of integers or into a vector where the coefficient for each token could be binary, based on word count, based on tf-idf, etc. In this step, we fit the tokenizer on the training tweets only and convert both the training and testing tweets into sequences of integers. We also determine the vocabulary size (the number of distinct words plus one, because index 0 is reserved for padding) and the maximum sequence length for padding purposes.

In [None]:
tokenizer = Tokenizer()
tokenizer.fit_on_texts(tweet_train)

X_train = tokenizer.texts_to_sequences(tweet_train)
X_test = tokenizer.texts_to_sequences(tweet_test)

vocab_size = len(tokenizer.word_index) + 1

print(tweet_train[2])  # the raw tweet that corresponds to X_train[2] below
print(X_train[2])
print(vocab_size)
maxlen = max(len(seq) for seq in X_train)
maxlen_test = max(len(seq) for seq in X_test)
print(maxlen, maxlen_test)
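
The following pure-Python sketch shows the basic idea behind fitting a word index and converting texts to integer sequences. It is a simplification and not the Keras implementation (for instance, it only lowercases and splits on whitespace, without punctuation filtering); the function names and toy tweets are assumptions for illustration.

```python
from collections import Counter

def fit_word_index(texts):
    """Map each word to an integer rank, 1 = most frequent. Index 0 stays free for padding."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # most_common() sorts by count and keeps first-seen order among ties
    return {word: rank for rank, (word, _) in enumerate(counts.most_common(), start=1)}

def texts_to_sequences(texts, word_index):
    """Replace known words by their index; words not seen during fitting are dropped."""
    return [[word_index[w] for w in text.lower().split() if w in word_index]
            for text in texts]

train_texts = ["fire in the forest", "the fire spread", "nice weather today"]
word_index = fit_word_index(train_texts)
vocab_size = len(word_index) + 1  # +1 for the reserved padding index 0

print(vocab_size)  # 9
print(texts_to_sequences(["the forest fire is unseen"], word_index))  # [[2, 4, 1]]
```

The words "is" and "unseen" never appeared in the training texts, so they vanish from the test sequence. This is why the test sequences in this notebook can be shorter than the raw tweets suggest.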


**Sequence Length Distribution**

Before padding, we plot the distribution of sequence lengths in the training and testing sets. This shows how much padding most tweets will need and whether the longest training sequence is a reasonable choice for the common input length.

In [None]:
# Calculate lengths of all sequences
sequence_lengths_train = [len(seq) for seq in X_train]
sequence_lengths_test = [len(seq) for seq in X_test]

# Plot histogram
plt.hist(sequence_lengths_train, bins=30, alpha=0.7, label='Train')
plt.hist(sequence_lengths_test, bins=30, alpha=0.7, label='Test')
plt.xlabel('Sequence Length')
plt.ylabel('Frequency')
plt.title('Distribution of Sequence Lengths')
plt.legend(loc='upper right')
plt.show()



**Padding Sequences**

In this step, we pad the sequences to ensure that all input sequences have the same length, which is necessary for batch processing in neural networks. Padding is done using the `pad_sequences` function from Keras with `padding='post'`, which appends zeros to the end of shorter sequences. The common length `maxlen` is the longest sequence in the training set; any test sequence longer than `maxlen` is truncated to fit.

In [None]:
# Training
print(maxlen)
# maxlen = 128
X_train = pad_sequences(X_train, padding='post', maxlen=maxlen)
# Testing
X_test = pad_sequences(X_test, padding='post', maxlen=maxlen)

print(X_train[0, :])
print('Shape of training tensor: ', X_train.shape)
print('Shape of testing tensor: ', X_test.shape)
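
A small pure-Python sketch of post-padding may help clarify what happens to short and long sequences. The helper `pad_post` is hypothetical and only mimics the behaviour used in this notebook (`padding='post'` with the default truncation from the front).

```python
def pad_post(sequences, maxlen, value=0):
    """Right-pad with `value`; sequences longer than maxlen keep their last maxlen tokens."""
    padded = []
    for seq in sequences:
        seq = list(seq)[-maxlen:]  # drop tokens from the front if the sequence is too long
        padded.append(seq + [value] * (maxlen - len(seq)))
    return padded

print(pad_post([[5, 2], [7, 1, 4, 9, 3]], maxlen=4))
# [[5, 2, 0, 0], [1, 4, 9, 3]]
```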

**Embedding with GloVe**
*Each word is represented by a 100-dimensional vector, and the embedding matrix extracted from GloVe is stored in a variable called `embedding_matrix`.*


**Loading GloVe Vectors**

The `create_embedding_matrix` function defined here reads a GloVe text file line by line. Each line holds a word followed by the components of its vector. If the word is in the tokenizer's `word_index` and the vector has the expected dimension, the vector is copied into the row at that word's index. Rows for words without a GloVe entry, and row 0 (reserved for padding), stay at zero.


In [None]:
def create_embedding_matrix(filepath, word_index, embedding_dim):
    vocab_size = len(word_index) + 1
    embedding_matrix = np.zeros((vocab_size, embedding_dim))

    with open(filepath, 'r', encoding='utf-8') as f:
        for line in f:
            word, *vector = line.split()
            if word in word_index:
                idx = word_index[word]
                if len(vector) == embedding_dim:
                    embedding_matrix[idx] = np.array(vector, dtype=np.float32)
    return embedding_matrix

**Creating the Embedding Matrix**

In this step, we call `create_embedding_matrix` with the 100-dimensional GloVe file `glove.6B.100d.txt`. The result is a 2D array where each row corresponds to a word in the vocabulary and each column corresponds to an embedding dimension. We also calculate the proportion of rows that received a GloVe vector, which tells us how much of our vocabulary the pre-trained embeddings cover.

In [None]:
embedding_dim = 100
embedding_matrix = create_embedding_matrix('glove.6B.100d.txt', tokenizer.word_index, embedding_dim)

print(embedding_matrix.shape)

# Count the rows with at least one non-zero entry, i.e. the words found in GloVe
nonzero_rows = np.count_nonzero(np.count_nonzero(embedding_matrix, axis=1))
nonzero_rows / vocab_size
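
To see the lookup and the coverage calculation on something small, the sketch below runs the same parsing logic on three made-up "GloVe" lines held in memory. The vectors, the toy word index, and the helper name `load_vectors` are invented for illustration and are not real GloVe data.

```python
import io
import numpy as np

def load_vectors(lines, word_index, embedding_dim):
    """Fill one row per vocabulary word; rows for unknown words stay at zero."""
    matrix = np.zeros((len(word_index) + 1, embedding_dim))
    for line in lines:
        word, *vector = line.split()
        if word in word_index and len(vector) == embedding_dim:
            matrix[word_index[word]] = np.asarray(vector, dtype=np.float32)
    return matrix

# Made-up 3-dimensional vectors in the GloVe text layout: word, then components
fake_glove = io.StringIO("fire 0.1 0.2 0.3\nthe 0.4 0.5 0.6\nflood 0.7 0.8 0.9\n")
toy_index = {"fire": 1, "the": 2, "wildfire": 3}

matrix = load_vectors(fake_glove, toy_index, embedding_dim=3)
covered = np.count_nonzero(np.any(matrix != 0, axis=1))
print(matrix.shape)               # (4, 3)
print(covered / len(toy_index))   # 2 of 3 vocabulary words are covered
```

Here the coverage is divided by the number of words rather than by `vocab_size`, which also counts the padding row. For a large vocabulary the difference is negligible.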

# Creating Multiple Models

**Model-1: Shallow RNN model with an embedding layer, a dense layer with 32 hidden units and an output layer.**

The shallow RNN model consists of an embedding layer initialized with pre-trained GloVe embeddings, followed by a global max pooling layer, a dense layer with 32 hidden units and ReLU activation, and a final dense layer with a sigmoid activation for binary classification. Note that this baseline has no recurrent layer: the global max pooling layer takes the maximum of each embedding dimension over all time steps, so word order is ignored. The model is compiled with the Adam optimizer and binary cross-entropy loss function.


In [None]:
# Dimensions
print(X_train.shape)        # (6851, 33)
print(embedding_matrix.shape)  # (21084, 100)
print(vocab_size)         # 21084
print(maxlen)         # 33

# Model (explicit input = input_shape)
model0 = Sequential([
    Embedding(input_dim=vocab_size,
              output_dim=embedding_dim,
              weights=[embedding_matrix],
              input_length=maxlen,
              input_shape=(maxlen,)),
    GlobalMaxPool1D(),
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid')
])

#Compilation
model0.compile(optimizer='adam',
               loss='binary_crossentropy',
               metrics=['accuracy'])

#Summary
model0.summary()
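
The effect of global max pooling can be checked with a few lines of NumPy. This is a sketch of the operation only, on random stand-in embeddings rather than the notebook's data; `global_max_pool` is a hypothetical helper name.

```python
import numpy as np

def global_max_pool(embedded):
    """Collapse (timesteps, embedding_dim) to (embedding_dim,) by taking the max over time."""
    return embedded.max(axis=0)

rng = np.random.default_rng(1)
tokens = rng.normal(size=(6, 4))          # 6 time steps, 4-dimensional embeddings
shuffled = tokens[rng.permutation(6)]     # the same tokens in a different order

print(global_max_pool(tokens).shape)      # (4,)
print(np.array_equal(global_max_pool(tokens), global_max_pool(shuffled)))  # True
```

Shuffling the tokens leaves the pooled vector unchanged, which is why this baseline cannot use word order, unlike the LSTM models that follow.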

**Model-2: Introduction of a gated recurrence relation to the shallow RNN model, with an embedding layer, a unidirectional LSTM layer with 32 hidden units and an output layer.**

The unidirectional LSTM model introduces a gated recurrence relation to the shallow RNN model. This model consists of an embedding layer initialized with pre-trained GloVe embeddings, followed by a unidirectional LSTM layer with 32 hidden units, a dropout layer with a rate of 0.1, and a final dense layer with a sigmoid activation for binary classification. The model is compiled with the Adam optimizer and binary cross-entropy loss function.

In [None]:
# Defined Parameters:
n_lstm = 32
drop_rate = 0.1

# Model
model1 = Sequential([
    Embedding(vocab_size, embedding_dim,
              weights=[embedding_matrix],
              input_length=maxlen,
              input_shape=(maxlen,)),
    LSTM(n_lstm, return_sequences=False),
    Dropout(drop_rate),
    Dense(1, activation='sigmoid')
])

#Compilation
model1.compile(optimizer='adam',
               loss='binary_crossentropy',
               metrics=['accuracy'])
#Summary
model1.summary()
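
For readers who want to see what the "gated recurrence relation" looks like, here is a minimal NumPy sketch of a single LSTM time step. It follows the standard LSTM equations with stacked gate weights; the gate ordering, random weights, and function names are assumptions for illustration and are not taken from the Keras source.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. W: (4*units, features), U: (4*units, units), b: (4*units,)."""
    units = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0 * units:1 * units])   # input gate: how much new information to write
    f = sigmoid(z[1 * units:2 * units])   # forget gate: how much old cell state to keep
    g = np.tanh(z[2 * units:3 * units])   # candidate cell values
    o = sigmoid(z[3 * units:4 * units])   # output gate: how much of the cell to expose
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c

features, units = 100, 32   # same sizes as the embedding and LSTM layer above
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4 * units, features))
U = rng.normal(scale=0.1, size=(4 * units, units))
b = np.zeros(4 * units)

h, c = lstm_step(rng.normal(size=features), np.zeros(units), np.zeros(units), W, U, b)
print(h.shape, c.shape)  # (32,) (32,)
```

With `return_sequences=False`, only the hidden state `h` after the last time step is passed on to the dropout and dense layers.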

**Model-3: Replacing the unidirectional LSTM layer with a bidirectional LSTM layer.**

The bidirectional LSTM model processes the input sequence in both forward and backward directions. This model consists of an embedding layer initialized with pre-trained GloVe embeddings, followed by a bidirectional LSTM layer with 32 hidden units per direction (by default the two outputs are concatenated into a 64-dimensional vector), a dropout layer with a rate of 0.1, and a final dense layer with a sigmoid activation for binary classification. The model is compiled with the Adam optimizer and binary cross-entropy loss function.

In [None]:
# Defined Parameters:
n_lstm = 32
drop_rate = 0.1

# Model
model2 = Sequential([
    Embedding(input_dim=vocab_size,
              output_dim=embedding_dim,
              weights=[embedding_matrix],
              input_length=maxlen,
              input_shape=(maxlen,)),
    Bidirectional(LSTM(n_lstm, return_sequences=False)),
    Dropout(drop_rate),
    Dense(1, activation='sigmoid')
])

# Model Compilation
model2.compile(optimizer='adam',
               loss='binary_crossentropy',
               metrics=['accuracy'])

# Model Summary
model2.summary()
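
The cost of going bidirectional can be estimated by hand. The sketch below uses the standard LSTM parameter formula (four gates, each with input weights, recurrent weights, and a bias) and assumes the two directions are concatenated; the function names are hypothetical, and the numbers are worth comparing against the `model1.summary()` and `model2.summary()` outputs.

```python
def lstm_param_count(input_dim, units):
    """Four gates, each with (input_dim + units) * units weights and `units` biases."""
    return 4 * (units * (input_dim + units) + units)

def bidirectional_summary(input_dim, units):
    """Two independent LSTMs whose outputs are concatenated."""
    per_direction = lstm_param_count(input_dim, units)
    return {"output_dim": 2 * units, "params": 2 * per_direction}

print(lstm_param_count(input_dim=100, units=32))       # 17024
print(bidirectional_summary(input_dim=100, units=32))  # {'output_dim': 64, 'params': 34048}
```

The recurrent layer doubles in size, and the final dense layer sees 64 inputs instead of 32.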

# Training, Testing and Visual Representation of output.

In [None]:
# Import from the same Keras namespace used in the rest of the notebook
from tensorflow.keras.backend import clear_session
clear_session()

**General plotting code**


The following functions are used to plot the training history and compare metrics between training and testing datasets:

- `plot_training_history(history)`: This function takes the training history of a model as input and generates two subplots. The first subplot shows the training and validation accuracy over epochs, while the second subplot shows the training and validation loss over epochs. It helps visualize how the model's performance evolves during training.
- `plot_metrics_comparison(train_metrics, test_metrics)`: This function takes two lists of metrics (one for training and one for testing) and generates a bar chart comparing these metrics. The metrics compared are Accuracy, Precision, Recall, and F1 Score. This visualization helps in understanding how well the model performs on the training data versus the testing data.


In [None]:
def plot_training_history(history):
    fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 10))

    # Plot training & validation accuracy values
    ax1.plot(history.history['accuracy'])
    ax1.plot(history.history['val_accuracy'])
    ax1.set_title('Model Accuracy')
    ax1.set_ylabel('Accuracy')
    ax1.set_xlabel('Epoch')
    ax1.legend(['Train', 'Validation'], loc='lower right')

    # Plot training & validation loss values
    ax2.plot(history.history['loss'])
    ax2.plot(history.history['val_loss'])
    ax2.set_title('Model Loss')
    ax2.set_ylabel('Loss')
    ax2.set_xlabel('Epoch')
    ax2.legend(['Train', 'Validation'], loc='upper right')

    plt.tight_layout()
    plt.show()

def plot_metrics_comparison(train_metrics, test_metrics):
    metrics = ['Accuracy', 'Precision', 'Recall', 'F1 Score']

    x = np.arange(len(metrics))
    width = 0.35

    fig, ax = plt.subplots(figsize=(10, 6))
    rects1 = ax.bar(x - width/2, train_metrics, width, label='Train')
    rects2 = ax.bar(x + width/2, test_metrics, width, label='Test')

    ax.set_ylabel('Scores')
    ax.set_title('Training vs Testing Metrics')
    ax.set_xticks(x)
    ax.set_xticklabels(metrics)
    ax.legend()

    ax.bar_label(rects1, padding=3)
    ax.bar_label(rects2, padding=3)

    fig.tight_layout()
    plt.show()

# **Model-01**

The shallow RNN model is trained using early stopping to prevent overfitting. The model is evaluated on both training and testing datasets, and various metrics such as accuracy, precision, recall, and F1 score are calculated to assess the model's performance. The results are printed for both training and testing datasets.


In [None]:
# Early Stopping
early_stopping = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

# Train
basic_rnn = model0.fit(
    X_train, y_train,
    epochs=20,
    batch_size=32,
    validation_split=0.1,
    callbacks=[early_stopping],
    verbose=1
)

# Evaluation on training data
train_loss, train_accuracy = model0.evaluate(X_train, y_train, verbose=0)
print(f"Training Loss: {train_loss:.4f}")
print(f"Training Accuracy: {train_accuracy:.4f}")

# Evaluation on testing data
test_loss, test_accuracy = model0.evaluate(X_test, y_test, verbose=0)
print(f"Testing Loss: {test_loss:.4f}")
print(f"Testing Accuracy: {test_accuracy:.4f}")

# Predictions
y_train_pred_prob = model0.predict(X_train)
y_test_pred_prob = model0.predict(X_test)

y_train_pred = (y_train_pred_prob > 0.5).astype(int)
y_test_pred = (y_test_pred_prob > 0.5).astype(int)

# Metrics Calc for training data
train_accuracy = accuracy_score(y_train, y_train_pred)
train_f1 = f1_score(y_train, y_train_pred)
train_precision = precision_score(y_train, y_train_pred)
train_recall = recall_score(y_train, y_train_pred)

print("\nTraining Metrics:")
print(f"Accuracy: {train_accuracy:.4f}")
print(f"Precision: {train_precision:.4f}")
print(f"Recall: {train_recall:.4f}")
print(f"F1 Score: {train_f1:.4f}")

# Metrics Calc for testing data
test_accuracy = accuracy_score(y_test, y_test_pred)
test_f1 = f1_score(y_test, y_test_pred)
test_precision = precision_score(y_test, y_test_pred)
test_recall = recall_score(y_test, y_test_pred)

print("\nTesting Metrics:")
print(f"Accuracy: {test_accuracy:.4f}")
print(f"Precision: {test_precision:.4f}")
print(f"Recall: {test_recall:.4f}")
print(f"F1 Score: {test_f1:.4f}")
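
The early stopping rule used above (`patience=3` on the validation loss) can be sketched in plain Python. This is an illustration of the stopping logic only, with invented loss values; `early_stopping_epoch` is a hypothetical helper, not part of Keras.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch), both 0-based, for per-epoch validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best: remember it and reset the patience counter
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

# Invented validation losses: improvement stalls after the third epoch
losses = [0.60, 0.52, 0.48, 0.49, 0.50, 0.51, 0.47]
print(early_stopping_epoch(losses))  # (5, 2)
```

Training stops at epoch index 5, after three epochs without improvement, and `restore_best_weights=True` corresponds to going back to the weights from epoch index 2. The lower loss at the final position is never reached, which is the trade-off of a small patience value.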

**Model-01 Training History and Metrics Comparison**

In this section, we visualize the training history of the shallow RNN model and compare the performance metrics between the training and testing datasets. The training history plot shows the accuracy and loss over epochs, while the metrics comparison plot highlights the differences in accuracy, precision, recall, and F1 score between the training and testing datasets. This helps us understand how well the model generalizes to unseen data.

In [None]:
# Plot training history
plot_training_history(basic_rnn)

# Prepare metrics for comparison plot
train_metrics = [train_accuracy, train_precision, train_recall, train_f1]
test_metrics = [test_accuracy, test_precision, test_recall, test_f1]

# Plot metrics comparison
plot_metrics_comparison(train_metrics, test_metrics)

# Print final loss values
print(f"Final Training Loss: {train_loss:.4f}")
print(f"Final Testing Loss: {test_loss:.4f}")
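
As a reminder of how the four reported metrics relate to each other, here is a small pure-Python sketch that thresholds predicted probabilities at 0.5 and computes the metrics from confusion counts. The labels and probabilities are invented, and `binary_metrics` is a hypothetical helper that mirrors what the scikit-learn functions compute for the positive class.

```python
def binary_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, precision, recall and F1 for the positive class (label 1)."""
    y_pred = [int(p > threshold) for p in y_prob]
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Invented example: 3 disaster tweets, 3 non-disaster tweets
result = binary_metrics([1, 0, 1, 1, 0, 0], [0.9, 0.6, 0.4, 0.8, 0.2, 0.1])
print({name: round(value, 4) for name, value in result.items()})
```

In this toy case there are two true positives, one false positive, and one false negative, so precision, recall, and F1 all come out at two thirds.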

# **Model-02**

The unidirectional LSTM model is trained using early stopping to prevent overfitting. The model is evaluated on both training and testing datasets, and various metrics such as accuracy, precision, recall, and F1 score are calculated to assess the model's performance. The results are printed for both training and testing datasets.

In [None]:
# Early Stopping
early_stopping = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

# Train
uni_LSTM = model1.fit(
    X_train, y_train,
    epochs=20,
    batch_size=32,
    validation_split=0.1,
    callbacks=[early_stopping],
    verbose=1
)

# Evaluation on training data
train_loss, train_accuracy = model1.evaluate(X_train, y_train, verbose=0)
print(f"Training Loss: {train_loss:.4f}")
print(f"Training Accuracy: {train_accuracy:.4f}")

# Evaluation on testing data
test_loss, test_accuracy = model1.evaluate(X_test, y_test, verbose=0)
print(f"Testing Loss: {test_loss:.4f}")
print(f"Testing Accuracy: {test_accuracy:.4f}")

# Predictions
y_train_pred_prob = model1.predict(X_train)
y_test_pred_prob = model1.predict(X_test)

y_train_pred = (y_train_pred_prob > 0.5).astype(int)
y_test_pred = (y_test_pred_prob > 0.5).astype(int)

# Metrics Calc for training data
train_accuracy = accuracy_score(y_train, y_train_pred)
train_f1 = f1_score(y_train, y_train_pred)
train_precision = precision_score(y_train, y_train_pred)
train_recall = recall_score(y_train, y_train_pred)

print("\nTraining Metrics:")
print(f"Accuracy: {train_accuracy:.4f}")
print(f"Precision: {train_precision:.4f}")
print(f"Recall: {train_recall:.4f}")
print(f"F1 Score: {train_f1:.4f}")

# Metrics Calc for testing data
test_accuracy = accuracy_score(y_test, y_test_pred)
test_f1 = f1_score(y_test, y_test_pred)
test_precision = precision_score(y_test, y_test_pred)
test_recall = recall_score(y_test, y_test_pred)

print("\nTesting Metrics:")
print(f"Accuracy: {test_accuracy:.4f}")
print(f"Precision: {test_precision:.4f}")
print(f"Recall: {test_recall:.4f}")
print(f"F1 Score: {test_f1:.4f}")


**Model-02 Training History and Metrics Comparison**

In this section, we visualize the training history of the unidirectional LSTM model and compare the performance metrics between the training and testing datasets. The training history plot shows the accuracy and loss over epochs, while the metrics comparison plot highlights the differences in accuracy, precision, recall, and F1 score between the training and testing datasets. This helps us understand how well the model generalizes to unseen data.


In [None]:
# Plot training history
plot_training_history(uni_LSTM)

# Prepare metrics for comparison plot
train_metrics = [train_accuracy, train_precision, train_recall, train_f1]
test_metrics = [test_accuracy, test_precision, test_recall, test_f1]

# Plot metrics comparison
plot_metrics_comparison(train_metrics, test_metrics)

# Print final loss values
print(f"Final Training Loss: {train_loss:.4f}")
print(f"Final Testing Loss: {test_loss:.4f}")

# **Model-03**

The bidirectional LSTM model is trained using early stopping to prevent overfitting. The model is evaluated on both training and testing datasets, and various metrics such as accuracy, precision, recall, and F1 score are calculated to assess the model's performance. The results are printed for both training and testing datasets.

In [None]:
# Early Stopping
early_stopping = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

# Train
bi_LSTM = model2.fit(
    X_train, y_train,
    epochs=20,
    batch_size=32,
    validation_split=0.1,
    callbacks=[early_stopping],
    verbose=1
)

# Evaluation on training data
train_loss, train_accuracy = model2.evaluate(X_train, y_train, verbose=0)
print(f"Training Loss: {train_loss:.4f}")
print(f"Training Accuracy: {train_accuracy:.4f}")

# Evaluation on testing data
test_loss, test_accuracy = model2.evaluate(X_test, y_test, verbose=0)
print(f"Testing Loss: {test_loss:.4f}")
print(f"Testing Accuracy: {test_accuracy:.4f}")

# Predictions
y_train_pred_prob = model2.predict(X_train)
y_test_pred_prob = model2.predict(X_test)

y_train_pred = (y_train_pred_prob > 0.5).astype(int)
y_test_pred = (y_test_pred_prob > 0.5).astype(int)

# Metrics Calc for training data
train_accuracy = accuracy_score(y_train, y_train_pred)
train_f1 = f1_score(y_train, y_train_pred)
train_precision = precision_score(y_train, y_train_pred)
train_recall = recall_score(y_train, y_train_pred)

print("\nTraining Metrics:")
print(f"Accuracy: {train_accuracy:.4f}")
print(f"Precision: {train_precision:.4f}")
print(f"Recall: {train_recall:.4f}")
print(f"F1 Score: {train_f1:.4f}")

# Metrics Calc for testing data
test_accuracy = accuracy_score(y_test, y_test_pred)
test_f1 = f1_score(y_test, y_test_pred)
test_precision = precision_score(y_test, y_test_pred)
test_recall = recall_score(y_test, y_test_pred)

print("\nTesting Metrics:")
print(f"Accuracy: {test_accuracy:.4f}")
print(f"Precision: {test_precision:.4f}")
print(f"Recall: {test_recall:.4f}")
print(f"F1 Score: {test_f1:.4f}")


**Model-03 Training History and Metrics Comparison**

In this section, we visualize the training history of the bidirectional LSTM model and compare the performance metrics between the training and testing datasets. The training history plot shows the accuracy and loss over epochs, while the metrics comparison plot highlights the differences in accuracy, precision, recall, and F1 score between the training and testing datasets. This helps us understand how well the model generalizes to unseen data.


In [None]:
# Plot training history
plot_training_history(bi_LSTM)

# Prepare metrics for comparison plot
train_metrics = [train_accuracy, train_precision, train_recall, train_f1]
test_metrics = [test_accuracy, test_precision, test_recall, test_f1]

# Plot metrics comparison
plot_metrics_comparison(train_metrics, test_metrics)

# Print final loss values
print(f"Final Training Loss: {train_loss:.4f}")
print(f"Final Testing Loss: {test_loss:.4f}")

**Model Comparison** (the metric values are hardcoded; the plotting template was picked up from the internet)

The following code compares the performance of three models (Shallow RNN, Unidirectional LSTM, and Bidirectional LSTM) on training and testing datasets. It visualizes the metrics (Accuracy, Precision, Recall, and F1 Score) using bar charts and line plots to provide a clear comparison of model performance.

In [None]:
import matplotlib.pyplot as plt
import numpy as np

# Data
models = ['Shallow RNN', 'Unidirectional LSTM', 'Bidirectional LSTM']
metrics = ['Accuracy', 'Precision', 'Recall', 'F1 Score']

training_data = np.array([
    [0.8866, 0.8861, 0.8442, 0.8647],
    [0.9118, 0.9241, 0.8656, 0.8939],
    [0.9238, 0.9287, 0.8908, 0.9094]
])

testing_data = np.array([
    [0.7979, 0.7642, 0.7734, 0.7688],
    [0.8123, 0.8154, 0.7341, 0.7727],
    [0.8163, 0.7957, 0.7764, 0.7859]
])

# Plotting
fig, axs = plt.subplots(2, 2, figsize=(15, 12))
fig.suptitle('Model Comparison: Training vs Testing Performance', fontsize=16)

bar_width = 0.25
index = np.arange(len(models))

for i, metric in enumerate(metrics):
    ax = axs[i // 2, i % 2]

    ax.bar(index, training_data[:, i], bar_width, label='Training', alpha=0.8)
    ax.bar(index + bar_width, testing_data[:, i], bar_width, label='Testing', alpha=0.8)

    ax.set_xlabel('Models')
    ax.set_ylabel(metric)
    ax.set_title(f'{metric} Comparison')
    ax.set_xticks(index + bar_width / 2)
    ax.set_xticklabels(models, rotation=45, ha='right')
    ax.legend()

    # Add value labels
    for j, v in enumerate(training_data[:, i]):
        ax.text(j, v, f'{v:.4f}', ha='center', va='bottom')
    for j, v in enumerate(testing_data[:, i]):
        ax.text(j + bar_width, v, f'{v:.4f}', ha='center', va='bottom')

plt.tight_layout()
plt.subplots_adjust(top=0.93)
plt.show()

# Line plot for overall comparison
plt.figure(figsize=(12, 6))

for i, metric in enumerate(metrics):
    plt.plot(models, training_data[:, i], marker='o', label=f'Training {metric}')
    plt.plot(models, testing_data[:, i], marker='s', linestyle='--', label=f'Testing {metric}')

plt.xlabel('Models')
plt.ylabel('Metric Value')
plt.title('Overall Model Performance Comparison')
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
plt.tight_layout()
plt.show()
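
The same hardcoded numbers can also be summarized without plots. The sketch below reuses the values from the comparison cell above to compute the train-test gap per model and to find which model scores best on each test metric; the variable names `gap` and `best` are introduced here for illustration.

```python
import numpy as np

models = ['Shallow RNN', 'Unidirectional LSTM', 'Bidirectional LSTM']
metrics = ['Accuracy', 'Precision', 'Recall', 'F1 Score']

training_data = np.array([
    [0.8866, 0.8861, 0.8442, 0.8647],
    [0.9118, 0.9241, 0.8656, 0.8939],
    [0.9238, 0.9287, 0.8908, 0.9094]
])
testing_data = np.array([
    [0.7979, 0.7642, 0.7734, 0.7688],
    [0.8123, 0.8154, 0.7341, 0.7727],
    [0.8163, 0.7957, 0.7764, 0.7859]
])

# Train-test gap per model and metric (larger gap = more overfitting)
gap = training_data - testing_data
for name, row in zip(models, gap):
    print(f"{name}: accuracy gap = {row[0]:.4f}")

# Best model on the test set for each metric
best = {metric: models[int(np.argmax(testing_data[:, i]))]
        for i, metric in enumerate(metrics)}
print(best)
```

The accuracy gap grows from about 0.09 for the shallow model to about 0.11 for the bidirectional LSTM, and the unidirectional LSTM has the highest test precision even though the bidirectional LSTM leads on the other three test metrics.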

In [None]:
clear_session()


# Conclusion

In this notebook, we explored the use of Bidirectional Recurrent Neural Networks (Bi-RNNs) for sentiment analysis on a dataset of tweets. We compared the performance of three different models: a shallow RNN, a unidirectional LSTM, and a bidirectional LSTM. Here are the key takeaways:

1. **Data Preprocessing**: We tokenized the tweets, padded the sequences to a common length, and built an embedding matrix from pre-trained GloVe embeddings. This step was crucial for preparing the text data for input into the neural network models.

2. **Model Training and Evaluation**:
    - **Shallow RNN**: The shallow RNN model provided a baseline for comparison. It reached a test accuracy of 0.7979 and a test F1 score of 0.7688, and was outperformed on both by the more complex models.
    - **Unidirectional LSTM**: Introducing a gated recurrence relation with a unidirectional LSTM raised test accuracy to 0.8123 and test precision to 0.8154, although test recall dropped from 0.7734 to 0.7341. This suggests a benefit from capturing temporal dependencies in the data.
    - **Bidirectional LSTM**: The bidirectional LSTM model achieved the highest test accuracy (0.8163) and test F1 score (0.7859) by processing the input sequence in both forward and backward directions, allowing the model to capture context from both past and future states.

3. **Performance Metrics**: We evaluated the models using accuracy, precision, recall, and F1 score. On the test set, the bidirectional LSTM had the best accuracy (0.8163), recall (0.7764), and F1 score (0.7859), while the unidirectional LSTM had the best precision (0.8154).

4. **Visualization**: We visualized the training history and compared the performance metrics between the training and testing datasets. These visualizations helped us understand how well the models generalized to unseen data and provided insights into their strengths and weaknesses.

Overall, the bidirectional LSTM model gave the best test accuracy and F1 score on the tweet dataset, which points to an advantage of bidirectional layers for sequence classification tasks. The margin over the unidirectional LSTM is small (0.8163 versus 0.8123 test accuracy), and all three models score roughly 9 to 11 accuracy points higher on the training set than on the test set, so some overfitting remains. This notebook provides a workflow for building and evaluating different RNN architectures for sentiment analysis, and the insights gained can be applied to similar natural language processing tasks.