#**Project Description**


This notebook presents a systematic benchmark of three popular deep learning architectures for text classification (specifically, sentiment analysis): Bidirectional Long Short-Term Memory (Bi-LSTM), Gated Recurrent Unit (GRU), and 1D Convolutional Neural Network (Conv1D).

**Objective**

The primary goal is to evaluate the trade-offs between sequential models (Bi-LSTM, GRU) and pattern-based models (Conv1D) in terms of prediction accuracy, computational cost (training time), and model complexity (trainable parameters).

**Methodology**

**Data Preprocessing**:

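The text data is tokenized using Keras's Tokenizer, with the vocabulary capped by MAX_WORDS and an out-of-vocabulary token for unseen words. Sequences are padded or truncated at the end to a uniform length (MAX_LEN).

Labels are converted to one-hot vectors using scikit-learn's LabelBinarizer.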
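
The preprocessing steps above can be illustrated without Keras. The following is a simplified pure-Python sketch, not the Keras or scikit-learn implementation: the helper names (fit_vocab, encode, pad_post, one_hot) are hypothetical, and whitespace splitting stands in for the Tokenizer's text cleaning. It follows the same conventions the notebook relies on: index 0 for padding, index 1 for out-of-vocabulary words, and padding and truncation at the end of the sequence.

```python
from collections import Counter

PAD_ID = 0  # index 0 is reserved for padding
OOV_ID = 1  # unseen words map to the out-of-vocabulary index


def fit_vocab(texts, max_words):
    """Map the most frequent words to indices 2..max_words-1."""
    counts = Counter(word for text in texts for word in text.lower().split())
    vocab = {}
    for word, _ in counts.most_common(max_words - 2):
        vocab[word] = len(vocab) + 2
    return vocab


def encode(text, vocab):
    """Convert text to word indices, using OOV_ID for unknown words."""
    return [vocab.get(word, OOV_ID) for word in text.lower().split()]


def pad_post(sequence, max_len):
    """Truncate at the end, then pad with PAD_ID at the end."""
    clipped = sequence[:max_len]
    return clipped + [PAD_ID] * (max_len - len(clipped))


def one_hot(labels):
    """Return the sorted class list and one indicator row per label."""
    classes = sorted(set(labels))
    return classes, [[int(label == c) for c in classes] for label in labels]


train_texts = ["good movie", "bad movie"]        # toy data
vocab = fit_vocab(train_texts, max_words=10)     # fitted on training text only
print(pad_post(encode("good film", vocab), max_len=4))   # [3, 1, 0, 0]
print(one_hot(["neg", "pos", "neu"]))
```

As in the notebook, the vocabulary is fitted on the training texts only, so a word such as "film" that appears only at test time falls back to the out-of-vocabulary index.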

**Model Definition**:

A factory function (create_model) builds each of the three architectures: an initial Embedding layer, followed by the respective core layer (Bi-LSTM or GRU with dropout, or Conv1D + GlobalMaxPooling), a 32-unit ReLU dense layer, and a softmax output layer.
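
The trainable-parameter counts of these architectures can be derived by hand from the layer sizes. The sketch below uses the notebook's settings (a 10,000-word vocabulary, 64-dimensional embeddings, 64 recurrent units, 128 filters of width 5) and makes two assumptions: three output classes, and the Keras default GRU variant (reset_after=True), which keeps two bias vectors per weight block. Under those assumptions the totals agree with the counts reported in the benchmark output.

```python
VOCAB_SIZE = 10000   # MAX_WORDS
EMBED_DIM = 64       # EMBEDDING_DIM
UNITS = 64           # LSTM / GRU units
NUM_CLASSES = 3      # assumption: three sentiment classes


def dense_params(inputs, outputs):
    # weight matrix plus one bias per output unit
    return inputs * outputs + outputs


def lstm_params(inputs, units):
    # four weight blocks (three gates plus the candidate cell state)
    return 4 * (inputs * units + units * units + units)


def gru_params(inputs, units):
    # three weight blocks; assumes reset_after=True (two bias vectors each)
    return 3 * (inputs * units + units * units + 2 * units)


def conv1d_params(inputs, filters, kernel_size):
    # one kernel of shape (kernel_size, inputs) plus a bias per filter
    return kernel_size * inputs * filters + filters


def head_params(features):
    # Dense(32) followed by the softmax output layer
    return dense_params(features, 32) + dense_params(32, NUM_CLASSES)


embedding = VOCAB_SIZE * EMBED_DIM

totals = {
    "Bi-LSTM": embedding + 2 * lstm_params(EMBED_DIM, UNITS) + head_params(2 * UNITS),
    "GRU": embedding + gru_params(EMBED_DIM, UNITS) + head_params(UNITS),
    "Conv1D": embedding + conv1d_params(EMBED_DIM, 128, 5) + head_params(128),
}
print(totals)  # {'Bi-LSTM': 710275, 'GRU': 667139, 'Conv1D': 685315}
```

In all three models the embedding layer accounts for 640,000 parameters, so the architectures differ by only a few tens of thousands of parameters in their core layers.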

**Benchmarking**:

The run_benchmark function trains each model for EPOCHS epochs, holding out 10% of the training data for validation. It then evaluates the model on a held-out test set and records the wall-clock training time.
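
The timing part of the benchmark amounts to reading a clock before and after the training call. Below is a minimal sketch of that pattern. The timed helper and the fake_training stand-in are hypothetical names used only for illustration, and time.perf_counter is swapped in for the notebook's time.time because it is a monotonic clock intended for measuring intervals.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_training(steps):
    # hypothetical stand-in for model.fit; just burns some CPU time
    total = 0
    for i in range(steps):
        total += i * i
    return total


result, seconds = timed(fake_training, 100_000)
print(f"result={result}, time={seconds:.4f}s")
```

Wall-clock measurements like this include everything that happens during the call, so they depend on the hardware and on whatever else is running at the time.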




In [None]:
import os
import pandas as pd
import numpy as np
import time
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelBinarizer
from sklearn.metrics import classification_report, accuracy_score
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (
    Embedding, Bidirectional, LSTM, GRU, Conv1D, GlobalMaxPooling1D, Dense, Dropout
)




In [None]:




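# Uncomment and set DRIVE_FILE_PATH to the dataset CSV path;
# it must be defined before the existence check below runs.
#DRIVE_FILE_PATH =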


if os.path.exists(DRIVE_FILE_PATH):

    df = pd.read_csv(DRIVE_FILE_PATH, encoding='latin-1')
    print("Data loaded successfully.")
else:
    raise FileNotFoundError(f"ERROR: {DRIVE_FILE_PATH} not found. "
                            "Please check the file path and ensure the file exists in your Google Drive.")




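# Hyperparameters shared by all three models
MAX_WORDS = 10000      # vocabulary size for the Tokenizer and Embedding layer
MAX_LEN = 100          # sequence length after padding/truncation
EMBEDDING_DIM = 64
EPOCHS = 10
BATCH_SIZE = 32


# Drop the saved index column, if present, then standardize the column names
if 'Unnamed: 0' in df.columns:
    df = df.drop(columns=['Unnamed: 0'])

df.columns = ['text', 'label']
df.dropna(subset=['text', 'label'], inplace=True)

# Count classes only after cleaning, so that NaN labels or a misidentified
# column cannot inflate the size of the output layer
NUM_CLASSES = df['label'].nunique()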


X = df['text'].values
y = df['label'].values
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)


lb = LabelBinarizer()
y_train_ohe = lb.fit_transform(y_train)
y_test_ohe = lb.transform(y_test)
CLASS_NAMES = lb.classes_.astype(str)


tokenizer = Tokenizer(num_words=MAX_WORDS, oov_token="<OOV>")
tokenizer.fit_on_texts(X_train)

X_train_sequences = tokenizer.texts_to_sequences(X_train)
X_test_sequences = tokenizer.texts_to_sequences(X_test)

X_train_padded = pad_sequences(X_train_sequences, maxlen=MAX_LEN, padding='post', truncating='post')
X_test_padded = pad_sequences(X_test_sequences, maxlen=MAX_LEN, padding='post', truncating='post')

print(f"Data ready. Sequences padded to length {MAX_LEN}.")
print(f"Train samples: {X_train_padded.shape[0]}, Test samples: {X_test_padded.shape[0]}")


def create_model(model_type):
    model = Sequential()
    model.add(Embedding(MAX_WORDS, EMBEDDING_DIM, input_length=MAX_LEN))

    if model_type == 'Bi-LSTM':

        model.add(Bidirectional(LSTM(64)))
        model.add(Dropout(0.5))

    elif model_type == 'GRU':

        model.add(GRU(64))
        model.add(Dropout(0.5))

    elif model_type == 'Conv1D':

        model.add(Conv1D(filters=128, kernel_size=5, activation='relu'))
        model.add(GlobalMaxPooling1D())


    model.add(Dense(32, activation='relu'))
    model.add(Dense(NUM_CLASSES, activation='softmax'))

    model.compile(
        optimizer='adam',
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )
    return model


def run_benchmark(model_type):
    print(f"\n---  BUILDING {model_type} Model ---")
    model = create_model(model_type)


    print(f"---  TRAINING {model_type} ---")
    start_time = time.time()
    history = model.fit(
        X_train_padded, y_train_ohe,
        epochs=EPOCHS,
        batch_size=BATCH_SIZE,
        validation_split=0.1,
        verbose=0
    )
    training_time = time.time() - start_time


    _, accuracy = model.evaluate(X_test_padded, y_test_ohe, verbose=0)


    y_pred_ohe = model.predict(X_test_padded, verbose=0)
    y_pred = np.argmax(y_pred_ohe, axis=1)
    y_test_labels = y_test_ohe.argmax(axis=1)

    report = classification_report(y_test_labels, y_pred, target_names=CLASS_NAMES, output_dict=True)

    metrics = {
        'Test Accuracy': accuracy,
        'Training Time (s)': training_time,
        'F1-Score (Macro Avg)': report['macro avg']['f1-score'],
        'Precision (Macro Avg)': report['macro avg']['precision'],
        'Recall (Macro Avg)': report['macro avg']['recall'],
        'Trainable Parameters': model.count_params()
    }

    print(f" {model_type} complete. Accuracy: {accuracy:.4f}, Time: {training_time:.2f}s")
    return metrics, model


models_to_test = ['Bi-LSTM', 'GRU', 'Conv1D']
benchmark_results = {}
trained_models = {}

for model_type in models_to_test:
    metrics, model = run_benchmark(model_type)
    benchmark_results[model_type] = metrics
    trained_models[model_type] = model


print("\n" + "="*70)
print("              DEEP LEARNING MODEL BENCHMARK RESULTS ")
print("="*70)

results_df = pd.DataFrame(benchmark_results).T


results_df['Test Accuracy'] = (results_df['Test Accuracy'] * 100).map('{:.2f}%'.format)
results_df['F1-Score (Macro Avg)'] = results_df['F1-Score (Macro Avg)'].map('{:.3f}'.format)
results_df['Training Time (s)'] = results_df['Training Time (s)'].map('{:.2f}s'.format)
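# Cast to int first: the transposed DataFrame holds the counts as floats,
# which would otherwise print with a trailing ".0"
results_df['Trainable Parameters'] = results_df['Trainable Parameters'].astype(int).map('{:,}'.format)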


best_model = max(benchmark_results, key=lambda k: benchmark_results[k]['Test Accuracy'])

print(results_df)

print("\n" + "="*70)
print(f" **BEST PERFORMER (by Test Accuracy): {best_model}** ")
print("="*70)


print("\n### ARCHITECTURAL ANALYSIS:")
print("* **Recurrent Models (Bi-LSTM, GRU):** These models excel at sequential data, capturing context by remembering information across the sequence (the 'memory' effect).")
print("  * **Bi-LSTM** is the most complex, offering high accuracy by processing text both forward and backward, but it is typically the slowest to train.")
print("  * **GRU** is a simplified, two-gate version of LSTM. It often offers a near-LSTM performance level while being computationally more efficient and faster to train.")
print("* **Convolutional Model (Conv1D):** This model operates like a pattern detector, scanning for local patterns (n-grams). It is exceptionally fast because convolutions are highly parallelizable (run simultaneously across the sequence).")

Data loaded successfully.
Data ready. Sequences padded to length 100.
Train samples: 192742, Test samples: 48186

---  BUILDING Bi-LSTM Model ---
---  TRAINING Bi-LSTM ---




 Bi-LSTM complete. Accuracy: 0.8418, Time: 6270.96s

---  BUILDING GRU Model ---
---  TRAINING GRU ---


  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


 GRU complete. Accuracy: 0.4278, Time: 4355.75s

---  BUILDING Conv1D Model ---
---  TRAINING Conv1D ---
 Conv1D complete. Accuracy: 0.8405, Time: 1663.47s

              DEEP LEARNING MODEL BENCHMARK RESULTS 
        Test Accuracy Training Time (s) F1-Score (Macro Avg)  \
Bi-LSTM        84.18%          6270.96s                0.834   
GRU            42.78%          4355.75s                0.200   
Conv1D         84.05%          1663.47s                0.832   

         Precision (Macro Avg)  Recall (Macro Avg) Trainable Parameters  
Bi-LSTM               0.833726            0.833335            710,275.0  
GRU                   0.409255            0.333478            667,139.0  
Conv1D                0.831810            0.832951            685,315.0  

 **BEST PERFORMER (by Test Accuracy): Bi-LSTM** 

### ARCHITECTURAL ANALYSIS:
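* **Recurrent Models (Bi-LSTM, GRU):** These models excel at sequential data, capturing context by remembering information across the sequence (the 'memory' effect).
  * **Bi-LSTM** is the most complex, offering high accuracy by processing text both forward and backward, but it is typically the slowest to train.
  * **GRU** is a simplified, two-gate version of LSTM. It often offers a near-LSTM performance level while being computationally more efficient and faster to train.
* **Convolutional Model (Conv1D):** This model operates like a pattern detector, scanning for local patterns (n-grams). It is exceptionally fast because convolutions are highly parallelizable (run simultaneously across the sequence).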
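
The GRU row in the results table reports a macro recall of 0.333478, which is very close to 1/3. With three classes, that is the value a model would produce if it assigned the same class to every test sample. This is an inference from the reported numbers, not something the notebook confirms. The sketch below shows the effect on toy labels (made up for illustration), using a hypothetical macro_recall helper written in plain Python rather than scikit-learn.

```python
from collections import Counter


def macro_recall(y_true, y_pred):
    """Average the per-class recall over every class present in y_true."""
    recalls = []
    for c in sorted(set(y_true)):
        support = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / support)
    return sum(recalls) / len(recalls)


y_true = [0, 0, 0, 1, 1, 2]     # toy labels, three classes
y_pred = [0] * len(y_true)      # a collapsed model: always predicts class 0

print(Counter(y_pred))                          # Counter({0: 6})
print(round(macro_recall(y_true, y_pred), 3))   # 0.333
```

Counting the predicted classes in the same way on the y_pred array inside run_benchmark would show whether the GRU collapsed like this.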

#**Key Findings**

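The results show how the models compare on this dataset:

**Bi-LSTM**:

It achieved the highest test accuracy (84.18%), but only 0.13 percentage points above Conv1D (84.05%), and it was the slowest to train (6270.96 s, roughly 3.8 times as long as Conv1D). Its ability to read the sequence in both forward and backward directions may contribute to the accuracy, but a margin this small from a single run does not establish that.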
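
The trade-off between accuracy and training time can be put into numbers using the values from the results table. The short sketch below copies those values into a dictionary and expresses each model's training time relative to the fastest model, along with its accuracy gap to Bi-LSTM. The variable names are chosen for illustration only.

```python
# Values copied from the benchmark results table
results = {
    "Bi-LSTM": {"accuracy": 84.18, "seconds": 6270.96},
    "GRU": {"accuracy": 42.78, "seconds": 4355.75},
    "Conv1D": {"accuracy": 84.05, "seconds": 1663.47},
}

fastest = min(results, key=lambda name: results[name]["seconds"])

# Training time as a multiple of the fastest model's training time
slowdowns = {
    name: row["seconds"] / results[fastest]["seconds"]
    for name, row in results.items()
}

for name, row in results.items():
    gap = row["accuracy"] - results["Bi-LSTM"]["accuracy"]
    print(f"{name}: {slowdowns[name]:.2f}x the training time of {fastest}, "
          f"{gap:+.2f} accuracy points vs Bi-LSTM")
```

These figures come from a single training run per model, so small accuracy differences should be read with caution.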

