# NLP Assignment #2
### by Prodromos Kampouridis 

### Γ. Text Classification with RNNs

#### IMPORTANT NOTE
##### *Because the code is lengthy, the answers to task Γ are also given as markdown cells below.*


##### *For more detailed information, please refer to the report entitled PRODROMOS KAMPOURIDIS REPORT*

First, we refactor the given code into reusable functions for each subquestion. The following cell contains the modified data preprocessing, model training, and model evaluation code. Data preprocessing now lives in a function named "preprocess_data" that takes max_words as a parameter, so that we can reuse it to produce data loaders and a vocabulary for a different max_words value in task 3. We also alter the TrainModel function to measure the average time per epoch in seconds and return it.

In [None]:
"""

A RNN classifier applied to AG_NEWS dataset

Download dataset:
https://www.kaggle.com/datasets/amananandrai/ag-news-classification-dataset

"""
import time
import random

import torch
import numpy as np

from torch.utils.data import DataLoader
from torchtext.data import get_tokenizer
from torchtext.vocab import build_vocab_from_iterator
from torch.utils.data.dataset import random_split
from torch import nn
from torch.nn import functional as F
import pandas as pd
from tqdm import tqdm
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# HYPER-PARAMETERS
MAX_WORDS = 25
EPOCHS = 15
LEARNING_RATE = 1e-3
BATCH_SIZE = 1024
EMBEDDING_DIM = 100
HIDDEN_DIM = 64

######################################################################
# Read dataset files 
# ------------------


train_data = pd.read_csv('train.csv')
test_data = pd.read_csv('test.csv')

######################################################################
# Data processing 
# -----------------------------


tokenizer = get_tokenizer("basic_english")

def preprocess_data(max_words, vocab=None):

    # All texts are truncated and padded to max_words tokens
    def collate_batch(batch):
        Y, X = list(zip(*batch))
        Y = torch.tensor(Y) - 1 # Target names in range [0,1,2,3] instead of [1,2,3,4]
        X = [vocab(tokenizer(text)) for text in X]
        # Bringing all samples to max_words length. Shorter texts are padded with <PAD> sequences, longer texts are truncated.
        X = [tokens+([vocab['<PAD>']]* (max_words-len(tokens))) if len(tokens)<max_words else tokens[:max_words] for tokens in X]
        return torch.tensor(X, dtype=torch.int32).to(device), Y.to(device) 

    train_dataset = [(label,train_data['Title'][i] + ' ' + train_data['Description'][i]) for i,label in enumerate(train_data['Class Index'])]
    test_dataset = [(label,test_data['Title'][i] + ' ' + test_data['Description'][i]) for i,label in enumerate(test_data['Class Index'])]

    train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE,
                                  shuffle=True, collate_fn=collate_batch)
    test_loader = DataLoader(test_dataset, batch_size=BATCH_SIZE,
                                  shuffle=False, collate_fn=collate_batch)
    
    def build_vocabulary(datasets):
        for dataset in datasets:
            for _, text in dataset:
                yield tokenizer(text)

    if vocab is None:
        # Vocabulary includes all tokens with at least 10 occurrences in the texts
        # Special tokens <PAD> and <UNK> are used for padding sequences and unknown words respectively
        vocab = build_vocab_from_iterator(build_vocabulary([train_dataset, test_dataset]), min_freq=10, specials=["<PAD>","<UNK>"])
        vocab.set_default_index(vocab["<UNK>"])

    return train_dataset, test_dataset, train_loader, test_loader, vocab

train_dataset, test_dataset, train_loader, test_loader, vocab = preprocess_data(MAX_WORDS)

target_classes = ["World", "Sports", "Business", "Sci/Tech"]


######################################################################
# Define functions to train and evaluate the model
# ------------------------------------------------

# Count model parameters
def count_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

def EvaluateModel(model, loss_fn, val_loader):
    model.eval()
    with torch.no_grad():
        Y_actual, Y_preds, losses = [],[],[]
        for X, Y in val_loader:
            preds = model(X)
            loss = loss_fn(preds, Y)
            losses.append(loss.item())

            Y_actual.append(Y)
            Y_preds.append(preds.argmax(dim=-1))

        Y_actual = torch.cat(Y_actual)
        Y_preds = torch.cat(Y_preds)
    
    # Returns mean loss, actual labels, predicted labels 
    return torch.tensor(losses).mean(), Y_actual.detach().cpu().numpy(), Y_preds.detach().cpu().numpy()


def TrainModel(model, loss_fn, optimizer, train_loader, epochs):
    epoch_times_list = []
    for i in range(1, epochs+1):
        start = time.time()
        model.train()
        print('Epoch:',i)
        losses = []
        for X, Y in tqdm(train_loader):
            Y_preds = model(X)

            loss = loss_fn(Y_preds, Y)
            losses.append(loss.item())

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        epoch_time = (time.time() - start)
        epoch_times_list.append(epoch_time)
        print("Train Loss : {:.3f}".format(torch.tensor(losses).mean()))
    
    # return the average epoch duration in seconds.
    return sum(epoch_times_list) / len(epoch_times_list)
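
The padding and truncation step inside collate_batch can be illustrated in isolation. The following is a minimal sketch that assumes a plain dictionary (`toy_vocab`, with made-up words and indices) in place of the torchtext vocabulary, and a hypothetical helper named `encode`; it only mirrors the logic of the cell above and is not part of the assignment code.

```python
# Toy stand-in for the torchtext vocabulary: index 0 is <PAD>, index 1 is <UNK>.
# The words and indices are made up for illustration only.
toy_vocab = {"<PAD>": 0, "<UNK>": 1, "stocks": 2, "rise": 3, "on": 4, "earnings": 5}


def encode(tokens, max_words, vocab=toy_vocab):
    """Map tokens to ids, then pad or truncate to exactly max_words ids."""
    # Unknown tokens fall back to the <UNK> index, like vocab.set_default_index.
    ids = [vocab.get(token, vocab["<UNK>"]) for token in tokens]
    if len(ids) < max_words:
        # Shorter texts are padded on the right with <PAD>.
        ids = ids + [vocab["<PAD>"]] * (max_words - len(ids))
    # Longer texts are cut to the first max_words tokens.
    return ids[:max_words]


short = encode(["stocks", "rise"], max_words=4)
long = encode(["stocks", "rise", "on", "strong", "earnings"], max_words=4)
print(short)  # [2, 3, 0, 0]
print(long)   # [2, 3, 4, 1]
```

Note that padding is appended at the end of the sequence, so for short texts the last time step of the RNN reads a `<PAD>` token.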


Next, we make the Model class configurable. The rnn_type, rnn_layers and bidirectional parameters let us create all the alternative models of the first task, while the pretrained_embeddings and freeze_embeddings parameters let us load pre-trained embeddings and choose whether to keep them frozen, for questions 4 and 5.

In [None]:
######################################################################
# Define the model
# ----------------
class Model(nn.Module):
    def __init__(self, input_dim, embedding_dim, hidden_dim, output_dim, rnn_type, rnn_layers, bidirectional, pretrained_embeddings=None, freeze_embeddings=False):
        super(Model, self).__init__()
        
        # Setting seed to ensure reproducibility
        SEED = 1
        random.seed(SEED)
        np.random.seed(SEED)
        torch.manual_seed(SEED)
        torch.cuda.manual_seed_all(SEED)

        if pretrained_embeddings is None:
            self.embedding_layer = nn.Embedding(num_embeddings=input_dim, embedding_dim=embedding_dim)
        else:
            # Create an embedding layer whose shape matches the pretrained embeddings matrix. Setting the padding index to 0 avoids training the <PAD> embedding when the embedding layer is not frozen.
            self.embedding_layer = nn.Embedding(num_embeddings=pretrained_embeddings.shape[0], embedding_dim=pretrained_embeddings.shape[1], padding_idx=0)
            self.embedding_layer.weight.data.copy_(pretrained_embeddings)
            self.embedding_layer.weight.requires_grad = not freeze_embeddings
        
        if rnn_type == "rnn":
          self.rnn = nn.RNN(input_size=embedding_dim, hidden_size=hidden_dim, batch_first=True, num_layers=rnn_layers, bidirectional=bidirectional)
        elif rnn_type == "lstm":
          self.rnn = nn.LSTM(input_size=embedding_dim, hidden_size=hidden_dim, batch_first=True, num_layers=rnn_layers, bidirectional=bidirectional)
        else:
          raise ValueError(f"Unsupported rnn type: {rnn_type}")

        if bidirectional:
            # In the case of bidirectional RNN, its output dimension will be 2 times the hidden size, 
            # since the hidden states of the forward and the backward RNN are concatenated. 
            # Thus, we change the input dimension of the linear layer to be 2 times the hidden dimension of the RNN, to match the output of the RNN.
            self.linear = nn.Linear(2*hidden_dim, output_dim)
        else:
            self.linear = nn.Linear(hidden_dim, output_dim)

    def forward(self, X_batch):
        embeddings = self.embedding_layer(X_batch)
        output, hidden = self.rnn(embeddings)
        logits = self.linear(output[:,-1])  # The last output of RNN is used for sequence classification
        probs = F.softmax(logits, dim=1)
        return probs
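
One detail of the forward pass is worth a small illustration. nn.CrossEntropyLoss applies a log-softmax internally, so it expects raw logits, whereas forward above returns softmax probabilities. The sketch below uses plain Python (no PyTorch) and two hypothetical helpers, `softmax` and `cross_entropy_from_logits`, to show what happens when probabilities are passed where logits are expected: with 4 classes, even a perfect prediction cannot push the loss below about 0.74. This may be related to the training losses reported further down levelling off slightly above 0.8, but that connection is an assumption and was not verified here.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_from_logits(logits, target):
    """Per-sample cross-entropy on logits: -log(softmax(logits)[target])."""
    return -math.log(softmax(logits)[target])


# A perfectly confident, correct prediction expressed as probabilities.
perfect_probs = [1.0, 0.0, 0.0, 0.0]

# Passing probabilities where logits are expected applies softmax a second time.
loss_on_probs = cross_entropy_from_logits(perfect_probs, target=0)

# The same prediction expressed as large raw logits gives a loss close to zero.
loss_on_logits = cross_entropy_from_logits([20.0, 0.0, 0.0, 0.0], target=0)

print(round(loss_on_probs, 4))  # 0.7437, i.e. log(1 + 3/e)
print(loss_on_logits < 1e-6)    # True
```

If one wanted to remove this floor, returning `logits` from forward would be the usual approach; argmax-based predictions are unaffected because softmax preserves the ordering of the scores.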

In addition, we define the run_experiments function, which trains and evaluates every model in an experimental_configs list; we use it for the first question. It also supports loading pre-trained embeddings (frozen or not), so we use it for questions 4 and 5 as well.

In [None]:
def run_experiments(experimental_configs, pretrained_embeddings=None, freeze_embeddings=False):
    results = []
    for config in experimental_configs:
        print(f"Experimental configuration: {config}")
        classifier = Model(
            input_dim=len(vocab), 
            embedding_dim=EMBEDDING_DIM, 
            hidden_dim=HIDDEN_DIM, 
            output_dim=len(target_classes),
            rnn_type=config["rnn_type"],
            rnn_layers=config["rnn_layers"],
            bidirectional=config["bidirectional"],
            pretrained_embeddings=pretrained_embeddings,
            freeze_embeddings=freeze_embeddings
        ).to(device)
        loss_fn = nn.CrossEntropyLoss()
        optimizer = torch.optim.Adam([param for param in classifier.parameters() if param.requires_grad == True],lr=LEARNING_RATE)
        num_of_parameters = count_parameters(classifier)

        print('\nModel:')
        print(classifier)
        print('Total parameters: ', num_of_parameters)
        print('\n\n')

        average_epoch_time = TrainModel(classifier, loss_fn, optimizer, train_loader, EPOCHS)
        _, Y_actual, Y_preds = EvaluateModel(classifier, loss_fn, test_loader)

        accuracy = accuracy_score(Y_actual, Y_preds)


        print(f"Average epoch duration: {average_epoch_time:.2f} seconds.")
        print("\nTest Accuracy : {:.3f}".format(accuracy))
        print("\nClassification Report : ")
        print(classification_report(Y_actual, Y_preds, target_names=target_classes))
        print("\nConfusion Matrix : ")
        print(confusion_matrix(Y_actual, Y_preds))
        print("-" * 80 + "\n")

        results.append(
            {
                "Experiment name": config["experiment_name"],
                "Accuracy": accuracy,
                "Parameters": num_of_parameters,
                "Time cost": average_epoch_time,
                "Y_preds": Y_preds,
                "Y_actual": Y_actual
            }
        )
    return results

Finally, we define the print_eval_table function, which prints a table summarizing the results of all models, as described in the first task.

In [None]:
from tabulate import tabulate


def print_eval_table(results):
    table = [
        [""] + [x["Experiment name"] for x in results],
        ["Accuracy"] + [x["Accuracy"] for x in results],
        ["Parameters"] + [x["Parameters"] for x in results],
        ["Time cost"] + [x["Time cost"] for x in results],
    ]
    print(tabulate(table, headers="firstrow", tablefmt="grid"))



## Answers

### 1.

Initially, we create a list of the experimental configurations that describe the 6 requested models. 

In [None]:
experimental_configs = [
    {
        "experiment_name": "1RNN",
        "rnn_type": "rnn",
        "bidirectional": False,
        "rnn_layers": 1
    },
    {
        "experiment_name": "1Bi-RNN",
        "rnn_type": "rnn",
        "bidirectional": True,
        "rnn_layers": 1
    },
    {
        "experiment_name": "2Bi-RNN",
        "rnn_type": "rnn",
        "bidirectional": True,
        "rnn_layers": 2
    },
    {
        "experiment_name": "1LSTM",
        "rnn_type": "lstm",
        "bidirectional": False,
        "rnn_layers": 1
    },
    {
        "experiment_name": "1Bi-LSTM",
        "rnn_type": "lstm",
        "bidirectional": True,
        "rnn_layers": 1
    },
    {
        "experiment_name": "2Bi-LSTM",
        "rnn_type": "lstm",
        "bidirectional": True,
        "rnn_layers": 2
    }
]
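
The same six configurations can also be generated programmatically. This is only an alternative sketch: the helper `make_config` and the names `variants` and `generated_configs` are hypothetical and do not appear in the assignment code.

```python
def make_config(rnn_type, rnn_layers, bidirectional):
    """Build one configuration dict in the format expected by run_experiments."""
    prefix = "Bi-" if bidirectional else ""
    return {
        "experiment_name": f"{rnn_layers}{prefix}{rnn_type.upper()}",
        "rnn_type": rnn_type,
        "bidirectional": bidirectional,
        "rnn_layers": rnn_layers,
    }


# (number of layers, bidirectional) variants requested for each RNN type.
variants = [(1, False), (1, True), (2, True)]

generated_configs = [
    make_config(rnn_type, layers, bidirectional)
    for rnn_type in ("rnn", "lstm")
    for layers, bidirectional in variants
]

print([config["experiment_name"] for config in generated_configs])
# ['1RNN', '1Bi-RNN', '2Bi-RNN', '1LSTM', '1Bi-LSTM', '2Bi-LSTM']
```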

We then create, train, and evaluate these models by passing experimental_configs to the run_experiments function.

In [None]:
results = run_experiments(experimental_configs)
print_eval_table(results)

Experimental configuration: {'experiment_name': '1RNN', 'rnn_type': 'rnn', 'bidirectional': False, 'rnn_layers': 1}

Model:
Model(
  (embedding_layer): Embedding(21254, 100)
  (rnn): RNN(100, 64, batch_first=True)
  (linear): Linear(in_features=64, out_features=4, bias=True)
)
Total parameters:  2136284



Epoch: 1


100%|██████████| 118/118 [00:06<00:00, 18.53it/s]


Train Loss : 1.292
Epoch: 2


100%|██████████| 118/118 [00:04<00:00, 27.66it/s]


Train Loss : 1.051
Epoch: 3


100%|██████████| 118/118 [00:04<00:00, 28.20it/s]


Train Loss : 0.963
Epoch: 4


100%|██████████| 118/118 [00:05<00:00, 22.25it/s]


Train Loss : 0.923
Epoch: 5


100%|██████████| 118/118 [00:04<00:00, 27.61it/s]


Train Loss : 0.903
Epoch: 6


100%|██████████| 118/118 [00:04<00:00, 26.45it/s]


Train Loss : 0.887
Epoch: 7


100%|██████████| 118/118 [00:05<00:00, 22.46it/s]


Train Loss : 0.876
Epoch: 8


100%|██████████| 118/118 [00:04<00:00, 28.27it/s]


Train Loss : 0.866
Epoch: 9


100%|██████████| 118/118 [00:05<00:00, 23.60it/s]


Train Loss : 0.859
Epoch: 10


100%|██████████| 118/118 [00:04<00:00, 26.79it/s]


Train Loss : 0.855
Epoch: 11


100%|██████████| 118/118 [00:04<00:00, 28.23it/s]


Train Loss : 0.848
Epoch: 12


100%|██████████| 118/118 [00:05<00:00, 22.40it/s]


Train Loss : 0.843
Epoch: 13


100%|██████████| 118/118 [00:04<00:00, 27.41it/s]


Train Loss : 0.839
Epoch: 14


100%|██████████| 118/118 [00:04<00:00, 26.87it/s]


Train Loss : 0.836
Epoch: 15


100%|██████████| 118/118 [00:05<00:00, 22.69it/s]


Train Loss : 0.834
Average epoch duration: 4.74 seconds.

Test Accuracy : 0.872

Classification Report : 
              precision    recall  f1-score   support

       World       0.90      0.86      0.88      1900
      Sports       0.92      0.95      0.93      1900
    Business       0.84      0.83      0.84      1900
    Sci/Tech       0.82      0.85      0.84      1900

    accuracy                           0.87      7600
   macro avg       0.87      0.87      0.87      7600
weighted avg       0.87      0.87      0.87      7600


Confusion Matrix : 
[[1635   80  108   77]
 [  33 1798   22   47]
 [  62   38 1577  223]
 [  79   42  162 1617]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '1Bi-RNN', 'rnn_type': 'rnn', 'bidirectional': True, 'rnn_layers': 1}

Model:
Model(
  (embedding_layer): Embedding(21254, 100)
  (rnn): RNN(100, 64, batch_first=True, bidirectional=True)
  (linear): Linear(in_featu

100%|██████████| 118/118 [00:04<00:00, 26.40it/s]


Train Loss : 1.296
Epoch: 2


100%|██████████| 118/118 [00:05<00:00, 21.13it/s]


Train Loss : 1.054
Epoch: 3


100%|██████████| 118/118 [00:04<00:00, 26.37it/s]


Train Loss : 0.962
Epoch: 4


100%|██████████| 118/118 [00:04<00:00, 26.04it/s]


Train Loss : 0.922
Epoch: 5


100%|██████████| 118/118 [00:05<00:00, 19.95it/s]


Train Loss : 0.900
Epoch: 6


100%|██████████| 118/118 [00:04<00:00, 26.47it/s]


Train Loss : 0.886
Epoch: 7


100%|██████████| 118/118 [00:05<00:00, 21.86it/s]


Train Loss : 0.874
Epoch: 8


100%|██████████| 118/118 [00:04<00:00, 25.89it/s]


Train Loss : 0.865
Epoch: 9


100%|██████████| 118/118 [00:04<00:00, 26.64it/s]


Train Loss : 0.859
Epoch: 10


100%|██████████| 118/118 [00:05<00:00, 21.18it/s]


Train Loss : 0.852
Epoch: 11


100%|██████████| 118/118 [00:04<00:00, 26.73it/s]


Train Loss : 0.846
Epoch: 12


100%|██████████| 118/118 [00:05<00:00, 21.73it/s]


Train Loss : 0.848
Epoch: 13


100%|██████████| 118/118 [00:05<00:00, 23.37it/s]


Train Loss : 0.841
Epoch: 14


100%|██████████| 118/118 [00:04<00:00, 25.18it/s]


Train Loss : 0.837
Epoch: 15


100%|██████████| 118/118 [00:05<00:00, 21.37it/s]


Train Loss : 0.834
Average epoch duration: 4.98 seconds.

Test Accuracy : 0.861

Classification Report : 
              precision    recall  f1-score   support

       World       0.86      0.88      0.87      1900
      Sports       0.91      0.96      0.93      1900
    Business       0.83      0.81      0.82      1900
    Sci/Tech       0.85      0.79      0.82      1900

    accuracy                           0.86      7600
   macro avg       0.86      0.86      0.86      7600
weighted avg       0.86      0.86      0.86      7600


Confusion Matrix : 
[[1678   82   92   48]
 [  26 1817   23   34]
 [ 122   43 1546  189]
 [ 133   58  206 1503]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '2Bi-RNN', 'rnn_type': 'rnn', 'bidirectional': True, 'rnn_layers': 2}

Model:
Model(
  (embedding_layer): Embedding(21254, 100)
  (rnn): RNN(100, 64, num_layers=2, batch_first=True, bidirectional=True)
  (linear): L

100%|██████████| 118/118 [00:04<00:00, 23.64it/s]


Train Loss : 1.263
Epoch: 2


100%|██████████| 118/118 [00:06<00:00, 19.15it/s]


Train Loss : 1.035
Epoch: 3


100%|██████████| 118/118 [00:04<00:00, 23.98it/s]


Train Loss : 0.966
Epoch: 4


100%|██████████| 118/118 [00:05<00:00, 20.30it/s]


Train Loss : 0.932
Epoch: 5


100%|██████████| 118/118 [00:05<00:00, 23.54it/s]


Train Loss : 0.910
Epoch: 6


100%|██████████| 118/118 [00:04<00:00, 23.78it/s]


Train Loss : 0.895
Epoch: 7


100%|██████████| 118/118 [00:05<00:00, 19.81it/s]


Train Loss : 0.889
Epoch: 8


100%|██████████| 118/118 [00:04<00:00, 23.86it/s]


Train Loss : 0.882
Epoch: 9


100%|██████████| 118/118 [00:05<00:00, 20.55it/s]


Train Loss : 0.874
Epoch: 10


100%|██████████| 118/118 [00:05<00:00, 23.42it/s]


Train Loss : 0.877
Epoch: 11


100%|██████████| 118/118 [00:05<00:00, 20.51it/s]


Train Loss : 0.864
Epoch: 12


100%|██████████| 118/118 [00:04<00:00, 24.02it/s]


Train Loss : 0.861
Epoch: 13


100%|██████████| 118/118 [00:05<00:00, 23.27it/s]


Train Loss : 0.856
Epoch: 14


100%|██████████| 118/118 [00:05<00:00, 20.47it/s]


Train Loss : 0.853
Epoch: 15


100%|██████████| 118/118 [00:04<00:00, 23.73it/s]


Train Loss : 0.850
Average epoch duration: 5.34 seconds.

Test Accuracy : 0.864

Classification Report : 
              precision    recall  f1-score   support

       World       0.90      0.84      0.87      1900
      Sports       0.90      0.96      0.93      1900
    Business       0.84      0.81      0.82      1900
    Sci/Tech       0.81      0.86      0.83      1900

    accuracy                           0.86      7600
   macro avg       0.86      0.86      0.86      7600
weighted avg       0.86      0.86      0.86      7600


Confusion Matrix : 
[[1590  101  130   79]
 [  24 1822   20   34]
 [  77   33 1533  257]
 [  68   60  147 1625]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '1LSTM', 'rnn_type': 'lstm', 'bidirectional': False, 'rnn_layers': 1}

Model:
Model(
  (embedding_layer): Embedding(21254, 100)
  (rnn): LSTM(100, 64, batch_first=True)
  (linear): Linear(in_features=64, out_feature

100%|██████████| 118/118 [00:05<00:00, 20.91it/s]


Train Loss : 1.240
Epoch: 2


100%|██████████| 118/118 [00:04<00:00, 24.33it/s]


Train Loss : 0.972
Epoch: 3


100%|██████████| 118/118 [00:05<00:00, 21.10it/s]


Train Loss : 0.911
Epoch: 4


100%|██████████| 118/118 [00:04<00:00, 24.75it/s]


Train Loss : 0.882
Epoch: 5


100%|██████████| 118/118 [00:04<00:00, 24.52it/s]


Train Loss : 0.865
Epoch: 6


100%|██████████| 118/118 [00:05<00:00, 20.48it/s]


Train Loss : 0.852
Epoch: 7


100%|██████████| 118/118 [00:04<00:00, 25.19it/s]


Train Loss : 0.844
Epoch: 8


100%|██████████| 118/118 [00:05<00:00, 21.12it/s]


Train Loss : 0.836
Epoch: 9


100%|██████████| 118/118 [00:04<00:00, 24.40it/s]


Train Loss : 0.831
Epoch: 10


100%|██████████| 118/118 [00:04<00:00, 23.81it/s]


Train Loss : 0.826
Epoch: 11


100%|██████████| 118/118 [00:05<00:00, 22.16it/s]


Train Loss : 0.821
Epoch: 12


100%|██████████| 118/118 [00:04<00:00, 25.59it/s]


Train Loss : 0.818
Epoch: 13


100%|██████████| 118/118 [00:05<00:00, 21.02it/s]


Train Loss : 0.815
Epoch: 14


100%|██████████| 118/118 [00:04<00:00, 24.72it/s]


Train Loss : 0.813
Epoch: 15


100%|██████████| 118/118 [00:05<00:00, 22.84it/s]


Train Loss : 0.811
Average epoch duration: 5.14 seconds.

Test Accuracy : 0.877

Classification Report : 
              precision    recall  f1-score   support

       World       0.89      0.88      0.89      1900
      Sports       0.94      0.95      0.94      1900
    Business       0.86      0.81      0.83      1900
    Sci/Tech       0.82      0.87      0.84      1900

    accuracy                           0.88      7600
   macro avg       0.88      0.88      0.88      7600
weighted avg       0.88      0.88      0.88      7600


Confusion Matrix : 
[[1680   62   86   72]
 [  40 1800   18   42]
 [  93   31 1541  235]
 [  77   30  148 1645]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '1Bi-LSTM', 'rnn_type': 'lstm', 'bidirectional': True, 'rnn_layers': 1}

Model:
Model(
  (embedding_layer): Embedding(21254, 100)
  (rnn): LSTM(100, 64, batch_first=True, bidirectional=True)
  (linear): Linear(in_fe

100%|██████████| 118/118 [00:05<00:00, 20.70it/s]


Train Loss : 1.252
Epoch: 2


100%|██████████| 118/118 [00:05<00:00, 19.71it/s]


Train Loss : 0.971
Epoch: 3


100%|██████████| 118/118 [00:05<00:00, 20.04it/s]


Train Loss : 0.909
Epoch: 4


100%|██████████| 118/118 [00:05<00:00, 20.63it/s]


Train Loss : 0.880
Epoch: 5


100%|██████████| 118/118 [00:06<00:00, 19.01it/s]


Train Loss : 0.863
Epoch: 6


100%|██████████| 118/118 [00:05<00:00, 21.32it/s]


Train Loss : 0.850
Epoch: 7


100%|██████████| 118/118 [00:06<00:00, 18.60it/s]


Train Loss : 0.841
Epoch: 8


100%|██████████| 118/118 [00:05<00:00, 22.19it/s]


Train Loss : 0.835
Epoch: 9


100%|██████████| 118/118 [00:06<00:00, 18.94it/s]


Train Loss : 0.829
Epoch: 10


100%|██████████| 118/118 [00:05<00:00, 21.95it/s]


Train Loss : 0.823
Epoch: 11


100%|██████████| 118/118 [00:06<00:00, 18.94it/s]


Train Loss : 0.821
Epoch: 12


100%|██████████| 118/118 [00:05<00:00, 21.97it/s]


Train Loss : 0.817
Epoch: 13


100%|██████████| 118/118 [00:06<00:00, 18.95it/s]


Train Loss : 0.814
Epoch: 14


100%|██████████| 118/118 [00:05<00:00, 21.24it/s]


Train Loss : 0.812
Epoch: 15


100%|██████████| 118/118 [00:06<00:00, 18.60it/s]


Train Loss : 0.810
Average epoch duration: 5.88 seconds.

Test Accuracy : 0.884

Classification Report : 
              precision    recall  f1-score   support

       World       0.90      0.89      0.89      1900
      Sports       0.93      0.95      0.94      1900
    Business       0.85      0.84      0.84      1900
    Sci/Tech       0.86      0.86      0.86      1900

    accuracy                           0.88      7600
   macro avg       0.88      0.88      0.88      7600
weighted avg       0.88      0.88      0.88      7600


Confusion Matrix : 
[[1685   58   87   70]
 [  45 1802   35   18]
 [  84   33 1595  188]
 [  62   39  162 1637]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '2Bi-LSTM', 'rnn_type': 'lstm', 'bidirectional': True, 'rnn_layers': 2}

Model:
Model(
  (embedding_layer): Embedding(21254, 100)
  (rnn): LSTM(100, 64, num_layers=2, batch_first=True, bidirectional=True)
  (linear)

100%|██████████| 118/118 [00:06<00:00, 18.23it/s]


Train Loss : 1.209
Epoch: 2


100%|██████████| 118/118 [00:07<00:00, 16.08it/s]


Train Loss : 0.956
Epoch: 3


100%|██████████| 118/118 [00:06<00:00, 18.48it/s]


Train Loss : 0.897
Epoch: 4


100%|██████████| 118/118 [00:07<00:00, 16.63it/s]


Train Loss : 0.872
Epoch: 5


100%|██████████| 118/118 [00:06<00:00, 18.28it/s]


Train Loss : 0.857
Epoch: 6


100%|██████████| 118/118 [00:07<00:00, 16.17it/s]


Train Loss : 0.847
Epoch: 7


100%|██████████| 118/118 [00:07<00:00, 16.44it/s]


Train Loss : 0.837
Epoch: 8


100%|██████████| 118/118 [00:06<00:00, 18.21it/s]


Train Loss : 0.830
Epoch: 9


100%|██████████| 118/118 [00:07<00:00, 16.16it/s]


Train Loss : 0.826
Epoch: 10


100%|██████████| 118/118 [00:06<00:00, 18.14it/s]


Train Loss : 0.821
Epoch: 11


100%|██████████| 118/118 [00:07<00:00, 16.18it/s]


Train Loss : 0.817
Epoch: 12


100%|██████████| 118/118 [00:06<00:00, 18.44it/s]


Train Loss : 0.814
Epoch: 13


100%|██████████| 118/118 [00:07<00:00, 16.67it/s]


Train Loss : 0.813
Epoch: 14


100%|██████████| 118/118 [00:06<00:00, 17.28it/s]


Train Loss : 0.810
Epoch: 15


100%|██████████| 118/118 [00:06<00:00, 17.50it/s]


Train Loss : 0.809
Average epoch duration: 6.87 seconds.

Test Accuracy : 0.890

Classification Report : 
              precision    recall  f1-score   support

       World       0.91      0.89      0.90      1900
      Sports       0.95      0.95      0.95      1900
    Business       0.86      0.85      0.85      1900
    Sci/Tech       0.85      0.87      0.86      1900

    accuracy                           0.89      7600
   macro avg       0.89      0.89      0.89      7600
weighted avg       0.89      0.89      0.89      7600


Confusion Matrix : 
[[1691   52   83   74]
 [  31 1799   33   37]
 [  75   25 1612  188]
 [  63   23  153 1661]]
--------------------------------------------------------------------------------

+------------+-------------+-------------+-----------+-------------+-------------+-------------+
|            |        1RNN |     1Bi-RNN |   2Bi-RNN |       1LSTM |    1Bi-LSTM |    2Bi-LSTM |
| Accuracy   | 0.871974    | 0.861053    | 0.864474  | 0.877105    | 

Although the differences in accuracy between the models are small, models with more parameters tend to perform better. 1RNN is an exception: it performs better than the more complex RNNs, which could be the result of a more favorable random initialization of this model. Nonetheless, every LSTM outperforms every RNN, even when the RNN has more parameters; e.g. 1LSTM performs better than 2Bi-RNN even though the latter has more parameters.

Moreover, time cost increases with the complexity (i.e. number of parameters) of the model. Models with 2 layers are slower than their 1-layer counterparts, with 2Bi-LSTM being the slowest model. Time cost was measured on GPU runtimes; even larger differences were observed on CPU runtimes.

Finally, using 2 layers instead of 1 slightly improves the accuracy of both Bi-RNN and Bi-LSTM.
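
The parameter counts referred to above can be reproduced by hand. The sketch below assumes the standard PyTorch layout for nn.RNN and nn.LSTM (per direction and layer: an input-to-hidden matrix, a hidden-to-hidden matrix and two bias vectors, multiplied by 4 gates for an LSTM); the helper names `recurrent_params` and `model_params` are hypothetical. The vocabulary size 21254 is taken from the printed model summaries.

```python
def recurrent_params(input_dim, hidden_dim, layers, bidirectional, gates):
    """Trainable parameters of a stacked (bi)directional RNN or LSTM."""
    directions = 2 if bidirectional else 1
    total = 0
    layer_input = input_dim
    for _ in range(layers):
        # W_ih, W_hh and the two bias vectors, for every gate.
        per_direction = gates * (
            layer_input * hidden_dim + hidden_dim * hidden_dim + 2 * hidden_dim
        )
        total += directions * per_direction
        # Deeper layers read the concatenated outputs of both directions.
        layer_input = hidden_dim * directions
    return total


def model_params(vocab_size, embedding_dim, hidden_dim, classes,
                 rnn_type, layers, bidirectional):
    """Embedding + recurrent + linear parameters of the classifier."""
    gates = {"rnn": 1, "lstm": 4}[rnn_type]
    directions = 2 if bidirectional else 1
    embedding = vocab_size * embedding_dim
    recurrent = recurrent_params(embedding_dim, hidden_dim, layers, bidirectional, gates)
    linear = hidden_dim * directions * classes + classes
    return embedding + recurrent + linear


one_rnn = model_params(21254, 100, 64, 4, "rnn", 1, False)
one_lstm = model_params(21254, 100, 64, 4, "lstm", 1, False)
two_bi_rnn = model_params(21254, 100, 64, 4, "rnn", 2, True)

print(one_rnn)                # 2136284, matching the printed "Total parameters"
print(two_bi_rnn > one_lstm)  # True
```

Under these assumptions the embedding layer accounts for 2,125,400 of the parameters in every model, so the recurrent and linear layers differ by only a few tens of thousands of parameters between configurations.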

### 2.

In [None]:
import numpy as np
from collections import Counter

mistakes_per_experiment = []
misstake_pairs_counter = Counter()
for r in results:
    wrong_preds_indices = np.where(r["Y_preds"] != r["Y_actual"])[0]
    mistakes_per_experiment.append(set(wrong_preds_indices.tolist()))
    for ind in wrong_preds_indices:
        actual_class = target_classes[r["Y_actual"][ind]]
        pred_class = target_classes[r["Y_preds"][ind]]
        misstake_pairs_counter.update([(actual_class, pred_class)])

common_mistakes = list(set.intersection(*mistakes_per_experiment))
print(f"There are {len(common_mistakes)} examples that were classified incorrectly by all {len(results)} experiments")

misclassfied_example = test_dataset[common_mistakes[0]]
print(f"\nExample:\n{misclassfied_example[1]}")
print(f"Actual label: {target_classes[misclassfied_example[0] - 1]}")

common_mistakes_per_class = Counter()
for mistake_ind in common_mistakes:
    label = target_classes[test_dataset[mistake_ind][0] - 1]
    common_mistakes_per_class.update([label])


print("\nNumber of common misclassifications across models, per category:")
print(tabulate(common_mistakes_per_class.most_common()))

print("\nNumber of common misclassifications across models, per actual class - wrong prediction pairs:")
print(tabulate(misstake_pairs_counter.most_common()))

most_common_pair = misstake_pairs_counter.most_common(1)[0]
print(f"\nThe most common actual class - wrong prediction pair is: \nActual class: {most_common_pair[0][0]}\nPredicted class: {most_common_pair[0][1]}\nOccurrences: {most_common_pair[1]}")

There are 414 examples that were classified incorrectly by all 6 experiments

Example:
Greenspan: Debt, home prices not dangerous The record level of debt carried by American households and soaring home prices do not appear to represent serious threats to the US economy, Federal Reserve Chairman Alan Greenspan said Tuesday.
Actual label: Sci/Tech

Number of common misclassifications across models, per category:
--------  ---
Business  155
World     122
Sci/Tech  114
Sports     23
--------  ---

Number of common misclassifications across models, per actual class - wrong prediction pairs:
------------------------  ----
('Business', 'Sci/Tech')  1280
('Sci/Tech', 'Business')   978
('World', 'Business')      586
('Business', 'World')      513
('Sci/Tech', 'World')      482
('World', 'Sports')        435
('World', 'Sci/Tech')      420
('Sci/Tech', 'Sports')     252
('Sports', 'Sci/Tech')     212
('Business', 'Sports')     203
('Sports', 'World')        199
('Sports', 'Business')     151
-
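
The logic of the analysis above can be checked on a toy example. The labels and predictions below are made up, and the two "models" are hypothetical; the sketch only mirrors the set intersection and the pair counting of the cell above. As in that cell, the pair counter accumulates the mistakes of every model, not only the mistakes shared by all of them.

```python
from collections import Counter

# Made-up labels and predictions for two hypothetical models on five test items.
y_actual = [0, 1, 2, 3, 2]
predictions = {
    "model_a": [0, 1, 3, 3, 0],
    "model_b": [1, 1, 3, 3, 2],
}

mistakes_per_model = []
pair_counter = Counter()
for preds in predictions.values():
    # Indices where this model disagrees with the true label.
    wrong = {i for i, (a, p) in enumerate(zip(y_actual, preds)) if a != p}
    mistakes_per_model.append(wrong)
    # Count (actual, predicted) pairs over all mistakes of this model.
    pair_counter.update((y_actual[i], preds[i]) for i in wrong)

# Items misclassified by every model.
common = set.intersection(*mistakes_per_model)

print(sorted(common))               # [2]
print(pair_counter.most_common(1))  # [((2, 3), 2)]
```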

### 3.

Next, we create new datasets, data loaders and a vocabulary by calling preprocess_data with MAX_WORDS=50.

We then rerun all 6 models with the run_experiments function, which now uses the new data loaders built with MAX_WORDS=50.

In [None]:
MAX_WORDS = 50
train_dataset, test_dataset, train_loader, test_loader, vocab = preprocess_data(MAX_WORDS)
results = run_experiments(experimental_configs)
print_eval_table(results)

Experimental configuration: {'experiment_name': '1RNN', 'rnn_type': 'rnn', 'bidirectional': False, 'rnn_layers': 1}

Model:
Model(
  (embedding_layer): Embedding(21254, 100)
  (rnn): RNN(100, 64, batch_first=True)
  (linear): Linear(in_features=64, out_features=4, bias=True)
)
Total parameters:  2136284



Epoch: 1


100%|██████████| 118/118 [00:04<00:00, 23.87it/s]


Train Loss : 1.378
Epoch: 2


100%|██████████| 118/118 [00:04<00:00, 25.17it/s]


Train Loss : 1.336
Epoch: 3


100%|██████████| 118/118 [00:05<00:00, 21.40it/s]


Train Loss : 1.357
Epoch: 4


100%|██████████| 118/118 [00:04<00:00, 24.51it/s]


Train Loss : 1.356
Epoch: 5


100%|██████████| 118/118 [00:05<00:00, 21.13it/s]


Train Loss : 1.355
Epoch: 6


100%|██████████| 118/118 [00:04<00:00, 25.54it/s]


Train Loss : 1.351
Epoch: 7


100%|██████████| 118/118 [00:04<00:00, 25.80it/s]


Train Loss : 1.353
Epoch: 8


100%|██████████| 118/118 [00:05<00:00, 20.81it/s]


Train Loss : 1.361
Epoch: 9


100%|██████████| 118/118 [00:04<00:00, 25.33it/s]


Train Loss : 1.324
Epoch: 10


100%|██████████| 118/118 [00:05<00:00, 20.97it/s]


Train Loss : 1.376
Epoch: 11


100%|██████████| 118/118 [00:04<00:00, 24.44it/s]


Train Loss : 1.366
Epoch: 12


100%|██████████| 118/118 [00:04<00:00, 25.86it/s]


Train Loss : 1.363
Epoch: 13


100%|██████████| 118/118 [00:05<00:00, 20.63it/s]


Train Loss : 1.360
Epoch: 14


100%|██████████| 118/118 [00:04<00:00, 25.35it/s]


Train Loss : 1.357
Epoch: 15


100%|██████████| 118/118 [00:05<00:00, 20.99it/s]


Train Loss : 1.350
Average epoch duration: 5.08 seconds.

Test Accuracy : 0.349

Classification Report : 
              precision    recall  f1-score   support

       World       0.45      0.14      0.21      1900
      Sports       0.34      0.62      0.44      1900
    Business       0.31      0.49      0.38      1900
    Sci/Tech       0.49      0.15      0.23      1900

    accuracy                           0.35      7600
   macro avg       0.40      0.35      0.32      7600
weighted avg       0.40      0.35      0.32      7600


Confusion Matrix : 
[[ 261  714  872   53]
 [  99 1175  535   91]
 [ 112  701  924  163]
 [ 106  816  685  293]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '1Bi-RNN', 'rnn_type': 'rnn', 'bidirectional': True, 'rnn_layers': 1}

Model:
Model(
  (embedding_layer): Embedding(21254, 100)
  (rnn): RNN(100, 64, batch_first=True, bidirectional=True)
  (linear): Linear(in_featu

100%|██████████| 118/118 [00:05<00:00, 23.44it/s]


Train Loss : 1.375
Epoch: 2


100%|██████████| 118/118 [00:05<00:00, 20.13it/s]


Train Loss : 1.314
Epoch: 3


100%|██████████| 118/118 [00:04<00:00, 23.70it/s]


Train Loss : 1.346
Epoch: 4


100%|██████████| 118/118 [00:05<00:00, 23.24it/s]


Train Loss : 1.346
Epoch: 5


100%|██████████| 118/118 [00:05<00:00, 20.24it/s]


Train Loss : 1.344
Epoch: 6


100%|██████████| 118/118 [00:05<00:00, 23.49it/s]


Train Loss : 1.342
Epoch: 7


100%|██████████| 118/118 [00:05<00:00, 20.05it/s]


Train Loss : 1.338
Epoch: 8


100%|██████████| 118/118 [00:05<00:00, 23.04it/s]


Train Loss : 1.316
Epoch: 9


100%|██████████| 118/118 [00:05<00:00, 19.74it/s]


Train Loss : 1.331
Epoch: 10


100%|██████████| 118/118 [00:05<00:00, 23.50it/s]


Train Loss : 1.363
Epoch: 11


100%|██████████| 118/118 [00:05<00:00, 21.89it/s]


Train Loss : 1.351
Epoch: 12


100%|██████████| 118/118 [00:05<00:00, 22.27it/s]


Train Loss : 1.275
Epoch: 13


100%|██████████| 118/118 [00:05<00:00, 23.31it/s]


Train Loss : 1.253
Epoch: 14


100%|██████████| 118/118 [00:05<00:00, 20.11it/s]


Train Loss : 1.265
Epoch: 15


100%|██████████| 118/118 [00:04<00:00, 24.06it/s]


Train Loss : 1.242
Average epoch duration: 5.36 seconds.

Test Accuracy : 0.512

Classification Report : 
              precision    recall  f1-score   support

       World       0.68      0.55      0.61      1900
      Sports       0.44      0.81      0.57      1900
    Business       0.53      0.63      0.57      1900
    Sci/Tech       0.42      0.07      0.12      1900

    accuracy                           0.51      7600
   macro avg       0.52      0.51      0.47      7600
weighted avg       0.52      0.51      0.47      7600


Confusion Matrix : 
[[1036  564  242   58]
 [ 100 1532  190   78]
 [ 215  449 1191   45]
 [ 171  972  626  131]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '2Bi-RNN', 'rnn_type': 'rnn', 'bidirectional': True, 'rnn_layers': 2}

Model:
Model(
  (embedding_layer): Embedding(21254, 100)
  (rnn): RNN(100, 64, num_layers=2, batch_first=True, bidirectional=True)
  (linear): L

100%|██████████| 118/118 [00:06<00:00, 17.89it/s]


Train Loss : 1.368
Epoch: 2


100%|██████████| 118/118 [00:05<00:00, 20.27it/s]


Train Loss : 1.293
Epoch: 3


100%|██████████| 118/118 [00:06<00:00, 17.84it/s]


Train Loss : 1.253
Epoch: 4


100%|██████████| 118/118 [00:05<00:00, 20.34it/s]


Train Loss : 1.282
Epoch: 5


100%|██████████| 118/118 [00:06<00:00, 18.37it/s]


Train Loss : 1.285
Epoch: 6


100%|██████████| 118/118 [00:05<00:00, 20.21it/s]


Train Loss : 1.311
Epoch: 7


100%|██████████| 118/118 [00:06<00:00, 18.31it/s]


Train Loss : 1.293
Epoch: 8


100%|██████████| 118/118 [00:05<00:00, 20.66it/s]


Train Loss : 1.285
Epoch: 9


100%|██████████| 118/118 [00:06<00:00, 17.63it/s]


Train Loss : 1.281
Epoch: 10


100%|██████████| 118/118 [00:05<00:00, 20.77it/s]


Train Loss : 1.255
Epoch: 11


100%|██████████| 118/118 [00:06<00:00, 18.14it/s]


Train Loss : 1.273
Epoch: 12


100%|██████████| 118/118 [00:05<00:00, 20.67it/s]


Train Loss : 1.299
Epoch: 13


100%|██████████| 118/118 [00:06<00:00, 17.65it/s]


Train Loss : 1.355
Epoch: 14


100%|██████████| 118/118 [00:05<00:00, 20.13it/s]


Train Loss : 1.274
Epoch: 15


100%|██████████| 118/118 [00:06<00:00, 18.25it/s]


Train Loss : 1.241
Average epoch duration: 6.20 seconds.

Test Accuracy : 0.469

Classification Report : 
              precision    recall  f1-score   support

       World       0.50      0.19      0.27      1900
      Sports       0.45      0.84      0.58      1900
    Business       0.47      0.70      0.56      1900
    Sci/Tech       0.56      0.15      0.23      1900

    accuracy                           0.47      7600
   macro avg       0.50      0.47      0.41      7600
weighted avg       0.50      0.47      0.41      7600


Confusion Matrix : 
[[ 360 1170  326   44]
 [ 144 1599  122   35]
 [  93  343 1328  136]
 [ 122  463 1036  279]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '1LSTM', 'rnn_type': 'lstm', 'bidirectional': False, 'rnn_layers': 1}

Model:
Model(
  (embedding_layer): Embedding(21254, 100)
  (rnn): LSTM(100, 64, batch_first=True)
  (linear): Linear(in_features=64, out_feature

100%|██████████| 118/118 [00:05<00:00, 22.09it/s]


Train Loss : 1.321
Epoch: 2


100%|██████████| 118/118 [00:06<00:00, 18.74it/s]


Train Loss : 1.078
Epoch: 3


100%|██████████| 118/118 [00:05<00:00, 21.78it/s]


Train Loss : 0.980
Epoch: 4


100%|██████████| 118/118 [00:06<00:00, 19.00it/s]


Train Loss : 0.934
Epoch: 5


100%|██████████| 118/118 [00:05<00:00, 21.18it/s]


Train Loss : 0.910
Epoch: 6


100%|██████████| 118/118 [00:06<00:00, 18.90it/s]


Train Loss : 0.899
Epoch: 7


100%|██████████| 118/118 [00:05<00:00, 21.78it/s]


Train Loss : 0.896
Epoch: 8


100%|██████████| 118/118 [00:06<00:00, 18.81it/s]


Train Loss : 0.881
Epoch: 9


100%|██████████| 118/118 [00:05<00:00, 21.83it/s]


Train Loss : 0.869
Epoch: 10


100%|██████████| 118/118 [00:06<00:00, 18.67it/s]


Train Loss : 0.860
Epoch: 11


100%|██████████| 118/118 [00:05<00:00, 22.07it/s]


Train Loss : 0.855
Epoch: 12


100%|██████████| 118/118 [00:05<00:00, 20.46it/s]


Train Loss : 0.851
Epoch: 13


100%|██████████| 118/118 [00:05<00:00, 20.70it/s]


Train Loss : 0.846
Epoch: 14


100%|██████████| 118/118 [00:05<00:00, 21.23it/s]


Train Loss : 0.841
Epoch: 15


100%|██████████| 118/118 [00:06<00:00, 19.43it/s]


Train Loss : 0.837
Average epoch duration: 5.81 seconds.

Test Accuracy : 0.882

Classification Report : 
              precision    recall  f1-score   support

       World       0.92      0.86      0.89      1900
      Sports       0.93      0.95      0.94      1900
    Business       0.86      0.84      0.85      1900
    Sci/Tech       0.82      0.88      0.85      1900

    accuracy                           0.88      7600
   macro avg       0.88      0.88      0.88      7600
weighted avg       0.88      0.88      0.88      7600


Confusion Matrix : 
[[1637   71  109   83]
 [  41 1803   15   41]
 [  57   24 1588  231]
 [  53   41  134 1672]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '1Bi-LSTM', 'rnn_type': 'lstm', 'bidirectional': True, 'rnn_layers': 1}

Model:
Model(
  (embedding_layer): Embedding(21254, 100)
  (rnn): LSTM(100, 64, batch_first=True, bidirectional=True)
  (linear): Linear(in_fe

100%|██████████| 118/118 [00:06<00:00, 18.23it/s]


Train Loss : 1.324
Epoch: 2


100%|██████████| 118/118 [00:06<00:00, 17.13it/s]


Train Loss : 1.042
Epoch: 3


100%|██████████| 118/118 [00:07<00:00, 16.65it/s]


Train Loss : 0.961
Epoch: 4


100%|██████████| 118/118 [00:06<00:00, 18.72it/s]


Train Loss : 0.921
Epoch: 5


100%|██████████| 118/118 [00:07<00:00, 16.41it/s]


Train Loss : 0.894
Epoch: 6


100%|██████████| 118/118 [00:06<00:00, 18.87it/s]


Train Loss : 0.879
Epoch: 7


100%|██████████| 118/118 [00:07<00:00, 16.43it/s]


Train Loss : 0.868
Epoch: 8


100%|██████████| 118/118 [00:06<00:00, 18.63it/s]


Train Loss : 0.858
Epoch: 9


100%|██████████| 118/118 [00:07<00:00, 16.74it/s]


Train Loss : 0.853
Epoch: 10


100%|██████████| 118/118 [00:06<00:00, 18.48it/s]


Train Loss : 0.847
Epoch: 11


100%|██████████| 118/118 [00:07<00:00, 16.42it/s]


Train Loss : 0.847
Epoch: 12


100%|██████████| 118/118 [00:06<00:00, 17.14it/s]


Train Loss : 0.845
Epoch: 13


100%|██████████| 118/118 [00:06<00:00, 17.68it/s]


Train Loss : 0.837
Epoch: 14


100%|██████████| 118/118 [00:07<00:00, 16.70it/s]


Train Loss : 0.834
Epoch: 15


100%|██████████| 118/118 [00:06<00:00, 18.25it/s]


Train Loss : 0.829
Average epoch duration: 6.77 seconds.

Test Accuracy : 0.887

Classification Report : 
              precision    recall  f1-score   support

       World       0.92      0.87      0.89      1900
      Sports       0.94      0.95      0.95      1900
    Business       0.85      0.84      0.85      1900
    Sci/Tech       0.84      0.89      0.86      1900

    accuracy                           0.89      7600
   macro avg       0.89      0.89      0.89      7600
weighted avg       0.89      0.89      0.89      7600


Confusion Matrix : 
[[1646   72  116   66]
 [  23 1814   23   40]
 [  54   21 1600  225]
 [  58   25  133 1684]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '2Bi-LSTM', 'rnn_type': 'lstm', 'bidirectional': True, 'rnn_layers': 2}

Model:
Model(
  (embedding_layer): Embedding(21254, 100)
  (rnn): LSTM(100, 64, num_layers=2, batch_first=True, bidirectional=True)
  (linear)

100%|██████████| 118/118 [00:07<00:00, 15.56it/s]


Train Loss : 1.268
Epoch: 2


100%|██████████| 118/118 [00:06<00:00, 17.52it/s]


Train Loss : 1.005
Epoch: 3


100%|██████████| 118/118 [00:07<00:00, 16.07it/s]


Train Loss : 0.934
Epoch: 4


100%|██████████| 118/118 [00:07<00:00, 15.75it/s]


Train Loss : 0.897
Epoch: 5


100%|██████████| 118/118 [00:06<00:00, 17.27it/s]


Train Loss : 0.881
Epoch: 6


100%|██████████| 118/118 [00:07<00:00, 15.56it/s]


Train Loss : 0.866
Epoch: 7


100%|██████████| 118/118 [00:06<00:00, 18.04it/s]


Train Loss : 0.856
Epoch: 8


100%|██████████| 118/118 [00:07<00:00, 16.22it/s]


Train Loss : 0.847
Epoch: 9


100%|██████████| 118/118 [00:06<00:00, 17.27it/s]


Train Loss : 0.841
Epoch: 10


100%|██████████| 118/118 [00:07<00:00, 15.89it/s]


Train Loss : 0.835
Epoch: 11


100%|██████████| 118/118 [00:07<00:00, 16.17it/s]


Train Loss : 0.830
Epoch: 12


100%|██████████| 118/118 [00:06<00:00, 17.84it/s]


Train Loss : 0.827
Epoch: 13


100%|██████████| 118/118 [00:07<00:00, 16.07it/s]


Train Loss : 0.825
Epoch: 14


100%|██████████| 118/118 [00:06<00:00, 17.60it/s]


Train Loss : 0.824
Epoch: 15


100%|██████████| 118/118 [00:07<00:00, 16.00it/s]


Train Loss : 0.822
Average epoch duration: 7.14 seconds.

Test Accuracy : 0.892

Classification Report : 
              precision    recall  f1-score   support

       World       0.91      0.88      0.89      1900
      Sports       0.93      0.96      0.95      1900
    Business       0.85      0.87      0.86      1900
    Sci/Tech       0.87      0.86      0.87      1900

    accuracy                           0.89      7600
   macro avg       0.89      0.89      0.89      7600
weighted avg       0.89      0.89      0.89      7600


Confusion Matrix : 
[[1664   74  108   54]
 [  29 1824   29   18]
 [  55   22 1660  163]
 [  74   38  154 1634]]
--------------------------------------------------------------------------------

+------------+-------------+-------------+-----------+-------------+-------------+-------------+
|            |        1RNN |     1Bi-RNN |   2Bi-RNN |       1LSTM |    1Bi-LSTM |    2Bi-LSTM |
| Accuracy   | 0.349079    | 0.511842    | 0.469211  | 0.881579    | 

RNN models perform significantly worse with MAX_WORDS=50 than with MAX_WORDS=25. This is probably due to the vanishing gradient problem of RNNs: the longer the sequence the RNN has to encode, the harder it is to retain information from the first timesteps. LSTMs mitigate this problem, since they manage to "remember" important information from earlier timesteps. Indeed, the LSTM models do not appear to be affected by vanishing gradients: they perform similarly to, and slightly better than, the experiments with MAX_WORDS=25.
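
The effect described above can be illustrated with a toy example. The sketch below is not the notebook's model: it assumes a scalar recurrence h_t = tanh(w * h_{t-1} + x_t), with an arbitrary weight `w` and constant inputs chosen only for illustration, and the helper name `gradient_wrt_first_state` is hypothetical. It multiplies the local derivatives along the sequence to show how the gradient with respect to the first state shrinks as the sequence grows from 25 to 50 steps.

```python
import math


def gradient_wrt_first_state(w, inputs):
    """Return d h_T / d h_0 for the scalar RNN h_t = tanh(w * h_{t-1} + x_t), h_0 = 0."""
    h = 0.0
    grad = 1.0
    for x in inputs:
        h = math.tanh(w * h + x)
        # Chain rule: local derivative d h_t / d h_{t-1} = w * (1 - h_t^2)
        grad *= w * (1.0 - h * h)
    return grad


w = 0.9  # arbitrary recurrent weight (assumption, for illustration only)
g25 = abs(gradient_wrt_first_state(w, [0.5] * 25))
g50 = abs(gradient_wrt_first_state(w, [0.5] * 50))
print(f"|grad| after 25 steps: {g25:.3e}")
print(f"|grad| after 50 steps: {g50:.3e}")
```

In this toy setting every factor has magnitude below 1, so doubling the sequence length makes the gradient reaching the first timestep many orders of magnitude smaller. This is consistent with, but does not prove, the explanation given for the MAX_WORDS=50 results.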

The complexity of the models remains the same regardless of the MAX_WORDS value, since MAX_WORDS only affects the number of timesteps (and therefore the number of computations), not the structure of the neural network.
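
As a check, the parameter count of the single-layer unidirectional model can be reproduced by hand. The sketch below assumes the standard PyTorch layout of one input-to-hidden matrix, one hidden-to-hidden matrix and two bias vectors per gate. The function name `count_parameters` and its `gates` argument are hypothetical helpers, not part of the notebook. Note that the sequence length does not appear among the arguments.

```python
def count_parameters(vocab_size, embedding_dim, hidden_dim, num_classes, gates=1):
    """Parameter count for Embedding -> single-layer unidirectional RNN -> Linear.

    gates=1 corresponds to a vanilla RNN; gates=4 would correspond to an LSTM.
    """
    embedding = vocab_size * embedding_dim
    # Per gate: W_ih (embedding_dim x hidden_dim), W_hh (hidden_dim x hidden_dim), two biases
    recurrent = gates * (embedding_dim * hidden_dim + hidden_dim * hidden_dim + 2 * hidden_dim)
    linear = hidden_dim * num_classes + num_classes
    return embedding + recurrent + linear


# Values taken from the 1RNN log above: Embedding(21254, 100), RNN(100, 64), Linear(64, 4)
total_1rnn = count_parameters(21254, 100, 64, 4)
print(total_1rnn)
```

The result equals the `Total parameters:  2136284` line printed for the 1RNN configuration.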

### 4.

Now, we use torchtext to load the GloVe embeddings. First, we create a vocabulary for this experiment by finding the common words between the vocab that we constructed previously from the dataset and the vocabulary of the GloVe embeddings. This way, we only load the embeddings that we need for the training and evaluation of the model. Using these common words, we create a new torchtext vocab that also contains the \<PAD> and \<UNK> tokens at the first two positions. We then create the embedding matrix by taking the embedding of each word in the new vocab, so that the rows of the matrix follow the same order as the vocabulary. We set the first row to zeros (for the \<PAD> token) and the second row to a random vector (for the \<UNK> token).
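
The alignment between the vocabulary and the rows of the embedding matrix can be sketched with plain Python structures. This is a toy illustration under stated assumptions, not the torchtext code used in the notebook: the helper `build_aligned_embeddings`, the two-dimensional vectors and the example tokens are all hypothetical.

```python
import random


def build_aligned_embeddings(dataset_tokens, pretrained, dim, seed=0):
    """Return (itos, matrix) where row i of matrix is the vector of token itos[i]."""
    rng = random.Random(seed)
    # Keep only dataset tokens that have a pre-trained vector, preserving dataset order
    common = [tok for tok in dataset_tokens if tok in pretrained]
    itos = ["<PAD>", "<UNK>"] + common
    matrix = [
        [0.0] * dim,                        # row 0: zeros for <PAD>
        [rng.random() for _ in range(dim)], # row 1: random vector for <UNK>
    ]
    matrix.extend(list(pretrained[tok]) for tok in common)
    return itos, matrix


toy_pretrained = {"stocks": [0.1, 0.2], "match": [0.3, 0.4], "unused": [0.9, 0.9]}
toy_dataset_tokens = ["match", "zzzunseen", "stocks"]

itos, matrix = build_aligned_embeddings(toy_dataset_tokens, toy_pretrained, dim=2)
stoi = {tok: idx for idx, tok in enumerate(itos)}

print(itos)
# Tokens without a pre-trained vector fall back to the <UNK> index
print(stoi.get("zzzunseen", stoi["<UNK>"]))
```

In this toy case the token without a pre-trained vector is dropped from the vocabulary and maps to the \<UNK> index at lookup time, while pre-trained tokens that never occur in the dataset are not loaded at all.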

To load the embeddings matrix, we modified the model so that:
 - The embedding layer is created with the same dimensions as the pre-trained embeddings matrix to load.
 - We explicitly set padding_idx=0 to prevent the \<PAD> embedding from updating during training.
 - We replace the randomly initialized embedding layer weight, with the pre-trained embeddings.

self.embedding_layer = nn.Embedding(num_embeddings=pretrained_embeddings.shape[0], embedding_dim=pretrained_embeddings.shape[1], padding_idx=0)

self.embedding_layer.weight.data.copy_(pretrained_embeddings)

In [None]:
import torchtext
from collections import OrderedDict
pretrained_embeddings = torchtext.vocab.GloVe(name='6B', dim=100)

# Creating the new embeddings vocabulary by keeping the common terms between our original dataset vocabulary and the GloVe-6B-100D vocabulary.
embeddings_vocab = set(pretrained_embeddings.itos)
embeddings_vocab = [t for t in vocab.get_itos() if t in embeddings_vocab]

# Creating an OrderedDict with dummy token frequencies (1), in order to build a torchtext vocab.
vocab = OrderedDict([(token, 1) for token in embeddings_vocab])

# Building torchtext vocab, with <PAD> and <UNK> special tokens at the first positions (0 and 1 respectively).
vocab = torchtext.vocab.vocab(vocab, min_freq=1, specials=["<PAD>","<UNK>"], special_first=True)
vocab.set_default_index(vocab["<UNK>"])

embeddings_matrix = pretrained_embeddings.get_vecs_by_tokens(vocab.get_itos())
# Overwrite the first two rows of the embeddings matrix: a zero vector for <PAD> and a random vector for <UNK>.
embeddings_matrix[0, :] = torch.zeros(1, embeddings_matrix.shape[1]) # Setting the 0th vector to zeros (for the <PAD> token)
embeddings_matrix[1, :] = torch.rand(1, embeddings_matrix.shape[1]) # Setting the 1st vector to random vector (for the <UNK> token)

.vector_cache/glove.6B.zip: 862MB [02:40, 5.36MB/s]                           
100%|█████████▉| 399999/400000 [00:18<00:00, 22058.35it/s]


In [None]:
MAX_WORDS = 25 # Setting MAX_WORDS back to 25 for Γ4.
train_dataset, test_dataset, train_loader, test_loader, vocab = preprocess_data(MAX_WORDS, vocab)
results = run_experiments(experimental_configs, pretrained_embeddings=embeddings_matrix, freeze_embeddings=False)
print_eval_table(results)

Experimental configuration: {'experiment_name': '1RNN', 'rnn_type': 'rnn', 'bidirectional': False, 'rnn_layers': 1}

Model:
Model(
  (embedding_layer): Embedding(20437, 100, padding_idx=0)
  (rnn): RNN(100, 64, batch_first=True)
  (linear): Linear(in_features=64, out_features=4, bias=True)
)
Total parameters:  2054584



Epoch: 1


100%|██████████| 118/118 [00:04<00:00, 28.02it/s]


Train Loss : 1.053
Epoch: 2


100%|██████████| 118/118 [00:05<00:00, 23.29it/s]


Train Loss : 0.901
Epoch: 3


100%|██████████| 118/118 [00:04<00:00, 28.74it/s]


Train Loss : 0.881
Epoch: 4


100%|██████████| 118/118 [00:03<00:00, 30.09it/s]


Train Loss : 0.866
Epoch: 5


100%|██████████| 118/118 [00:05<00:00, 23.09it/s]


Train Loss : 0.860
Epoch: 6


100%|██████████| 118/118 [00:04<00:00, 29.40it/s]


Train Loss : 0.857
Epoch: 7


100%|██████████| 118/118 [00:04<00:00, 29.31it/s]


Train Loss : 0.850
Epoch: 8


100%|██████████| 118/118 [00:05<00:00, 23.54it/s]


Train Loss : 0.850
Epoch: 9


100%|██████████| 118/118 [00:04<00:00, 28.93it/s]


Train Loss : 0.850
Epoch: 10


100%|██████████| 118/118 [00:04<00:00, 26.87it/s]


Train Loss : 0.849
Epoch: 11


100%|██████████| 118/118 [00:04<00:00, 24.97it/s]


Train Loss : 0.854
Epoch: 12


100%|██████████| 118/118 [00:04<00:00, 28.96it/s]


Train Loss : 0.845
Epoch: 13


100%|██████████| 118/118 [00:05<00:00, 23.41it/s]


Train Loss : 0.851
Epoch: 14


100%|██████████| 118/118 [00:04<00:00, 28.23it/s]


Train Loss : 0.841
Epoch: 15


100%|██████████| 118/118 [00:04<00:00, 28.75it/s]


Train Loss : 0.836
Average epoch duration: 4.41 seconds.

Test Accuracy : 0.885

Classification Report : 
              precision    recall  f1-score   support

       World       0.89      0.89      0.89      1900
      Sports       0.93      0.97      0.95      1900
    Business       0.87      0.81      0.84      1900
    Sci/Tech       0.84      0.87      0.85      1900

    accuracy                           0.88      7600
   macro avg       0.88      0.88      0.88      7600
weighted avg       0.88      0.88      0.88      7600


Confusion Matrix : 
[[1693   75   80   52]
 [  21 1845   14   20]
 [  91   30 1541  238]
 [  89   31  136 1644]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '1Bi-RNN', 'rnn_type': 'rnn', 'bidirectional': True, 'rnn_layers': 1}

Model:
Model(
  (embedding_layer): Embedding(20437, 100, padding_idx=0)
  (rnn): RNN(100, 64, batch_first=True, bidirectional=True)
  (linear): 

100%|██████████| 118/118 [00:05<00:00, 22.04it/s]


Train Loss : 1.070
Epoch: 2


100%|██████████| 118/118 [00:04<00:00, 28.23it/s]


Train Loss : 0.911
Epoch: 3


100%|██████████| 118/118 [00:04<00:00, 27.98it/s]


Train Loss : 0.879
Epoch: 4


100%|██████████| 118/118 [00:05<00:00, 21.96it/s]


Train Loss : 0.877
Epoch: 5


100%|██████████| 118/118 [00:04<00:00, 27.74it/s]


Train Loss : 0.893
Epoch: 6


100%|██████████| 118/118 [00:05<00:00, 22.07it/s]


Train Loss : 0.864
Epoch: 7


100%|██████████| 118/118 [00:04<00:00, 26.84it/s]


Train Loss : 0.854
Epoch: 8


100%|██████████| 118/118 [00:04<00:00, 26.26it/s]


Train Loss : 0.853
Epoch: 9


100%|██████████| 118/118 [00:05<00:00, 22.05it/s]


Train Loss : 0.864
Epoch: 10


100%|██████████| 118/118 [00:04<00:00, 25.72it/s]


Train Loss : 0.846
Epoch: 11


100%|██████████| 118/118 [00:04<00:00, 23.64it/s]


Train Loss : 0.851
Epoch: 12


100%|██████████| 118/118 [00:04<00:00, 24.57it/s]


Train Loss : 0.846
Epoch: 13


100%|██████████| 118/118 [00:04<00:00, 26.04it/s]


Train Loss : 0.841
Epoch: 14


100%|██████████| 118/118 [00:05<00:00, 22.38it/s]


Train Loss : 0.849
Epoch: 15


100%|██████████| 118/118 [00:04<00:00, 26.28it/s]


Train Loss : 0.851
Average epoch duration: 4.79 seconds.

Test Accuracy : 0.862

Classification Report : 
              precision    recall  f1-score   support

       World       0.85      0.89      0.87      1900
      Sports       0.96      0.89      0.92      1900
    Business       0.88      0.77      0.82      1900
    Sci/Tech       0.78      0.89      0.83      1900

    accuracy                           0.86      7600
   macro avg       0.87      0.86      0.86      7600
weighted avg       0.87      0.86      0.86      7600


Confusion Matrix : 
[[1690   48   78   84]
 [  85 1687   18  110]
 [ 134   11 1472  283]
 [  81    8  112 1699]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '2Bi-RNN', 'rnn_type': 'rnn', 'bidirectional': True, 'rnn_layers': 2}

Model:
Model(
  (embedding_layer): Embedding(20437, 100, padding_idx=0)
  (rnn): RNN(100, 64, num_layers=2, batch_first=True, bidirectional=True

100%|██████████| 118/118 [00:05<00:00, 22.01it/s]


Train Loss : 1.003
Epoch: 2


100%|██████████| 118/118 [00:05<00:00, 23.01it/s]


Train Loss : 0.894
Epoch: 3


100%|██████████| 118/118 [00:04<00:00, 24.84it/s]


Train Loss : 0.900
Epoch: 4


100%|██████████| 118/118 [00:05<00:00, 20.83it/s]


Train Loss : 0.869
Epoch: 5


100%|██████████| 118/118 [00:04<00:00, 23.77it/s]


Train Loss : 0.854
Epoch: 6


100%|██████████| 118/118 [00:05<00:00, 21.24it/s]


Train Loss : 0.851
Epoch: 7


100%|██████████| 118/118 [00:04<00:00, 24.98it/s]


Train Loss : 0.848
Epoch: 8


100%|██████████| 118/118 [00:04<00:00, 24.81it/s]


Train Loss : 0.854
Epoch: 9


100%|██████████| 118/118 [00:05<00:00, 20.50it/s]


Train Loss : 0.882
Epoch: 10


100%|██████████| 118/118 [00:04<00:00, 24.02it/s]


Train Loss : 0.847
Epoch: 11


100%|██████████| 118/118 [00:05<00:00, 20.95it/s]


Train Loss : 0.841
Epoch: 12


100%|██████████| 118/118 [00:04<00:00, 24.97it/s]


Train Loss : 0.865
Epoch: 13


100%|██████████| 118/118 [00:05<00:00, 22.01it/s]


Train Loss : 0.850
Epoch: 14


100%|██████████| 118/118 [00:05<00:00, 23.22it/s]


Train Loss : 0.870
Epoch: 15


100%|██████████| 118/118 [00:04<00:00, 24.85it/s]


Train Loss : 0.851
Average epoch duration: 5.15 seconds.

Test Accuracy : 0.872

Classification Report : 
              precision    recall  f1-score   support

       World       0.93      0.84      0.88      1900
      Sports       0.95      0.92      0.94      1900
    Business       0.76      0.88      0.82      1900
    Sci/Tech       0.86      0.85      0.86      1900

    accuracy                           0.87      7600
   macro avg       0.88      0.87      0.87      7600
weighted avg       0.88      0.87      0.87      7600


Confusion Matrix : 
[[1591   69  184   56]
 [  20 1756  108   16]
 [  42    9 1670  179]
 [  59   10  223 1608]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '1LSTM', 'rnn_type': 'lstm', 'bidirectional': False, 'rnn_layers': 1}

Model:
Model(
  (embedding_layer): Embedding(20437, 100, padding_idx=0)
  (rnn): LSTM(100, 64, batch_first=True)
  (linear): Linear(in_features=

100%|██████████| 118/118 [00:05<00:00, 21.39it/s]


Train Loss : 1.013
Epoch: 2


100%|██████████| 118/118 [00:04<00:00, 24.28it/s]


Train Loss : 0.858
Epoch: 3


100%|██████████| 118/118 [00:05<00:00, 21.57it/s]


Train Loss : 0.842
Epoch: 4


100%|██████████| 118/118 [00:04<00:00, 25.47it/s]


Train Loss : 0.833
Epoch: 5


100%|██████████| 118/118 [00:04<00:00, 24.67it/s]


Train Loss : 0.826
Epoch: 6


100%|██████████| 118/118 [00:05<00:00, 21.36it/s]


Train Loss : 0.820
Epoch: 7


100%|██████████| 118/118 [00:04<00:00, 24.88it/s]


Train Loss : 0.817
Epoch: 8


100%|██████████| 118/118 [00:05<00:00, 21.43it/s]


Train Loss : 0.811
Epoch: 9


100%|██████████| 118/118 [00:04<00:00, 25.93it/s]


Train Loss : 0.808
Epoch: 10


100%|██████████| 118/118 [00:04<00:00, 25.11it/s]


Train Loss : 0.806
Epoch: 11


100%|██████████| 118/118 [00:05<00:00, 21.56it/s]


Train Loss : 0.803
Epoch: 12


100%|██████████| 118/118 [00:04<00:00, 25.27it/s]


Train Loss : 0.802
Epoch: 13


100%|██████████| 118/118 [00:05<00:00, 21.26it/s]


Train Loss : 0.801
Epoch: 14


100%|██████████| 118/118 [00:04<00:00, 24.63it/s]


Train Loss : 0.800
Epoch: 15


100%|██████████| 118/118 [00:05<00:00, 22.46it/s]


Train Loss : 0.799
Average epoch duration: 5.08 seconds.

Test Accuracy : 0.908

Classification Report : 
              precision    recall  f1-score   support

       World       0.93      0.90      0.91      1900
      Sports       0.95      0.97      0.96      1900
    Business       0.88      0.86      0.87      1900
    Sci/Tech       0.87      0.90      0.88      1900

    accuracy                           0.91      7600
   macro avg       0.91      0.91      0.91      7600
weighted avg       0.91      0.91      0.91      7600


Confusion Matrix : 
[[1704   61   72   63]
 [  18 1840   19   23]
 [  57   23 1642  178]
 [  48   12  126 1714]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '1Bi-LSTM', 'rnn_type': 'lstm', 'bidirectional': True, 'rnn_layers': 1}

Model:
Model(
  (embedding_layer): Embedding(20437, 100, padding_idx=0)
  (rnn): LSTM(100, 64, batch_first=True, bidirectional=True)
  (linear

100%|██████████| 118/118 [00:05<00:00, 21.22it/s]


Train Loss : 1.024
Epoch: 2


100%|██████████| 118/118 [00:05<00:00, 20.26it/s]


Train Loss : 0.858
Epoch: 3


100%|██████████| 118/118 [00:05<00:00, 19.87it/s]


Train Loss : 0.843
Epoch: 4


100%|██████████| 118/118 [00:05<00:00, 21.86it/s]


Train Loss : 0.832
Epoch: 5


100%|██████████| 118/118 [00:06<00:00, 18.96it/s]


Train Loss : 0.824
Epoch: 6


100%|██████████| 118/118 [00:05<00:00, 22.03it/s]


Train Loss : 0.819
Epoch: 7


100%|██████████| 118/118 [00:06<00:00, 19.45it/s]


Train Loss : 0.816
Epoch: 8


100%|██████████| 118/118 [00:05<00:00, 22.09it/s]


Train Loss : 0.813
Epoch: 9


100%|██████████| 118/118 [00:06<00:00, 19.05it/s]


Train Loss : 0.809
Epoch: 10


100%|██████████| 118/118 [00:05<00:00, 21.97it/s]


Train Loss : 0.806
Epoch: 11


100%|██████████| 118/118 [00:06<00:00, 18.92it/s]


Train Loss : 0.805
Epoch: 12


100%|██████████| 118/118 [00:05<00:00, 22.43it/s]


Train Loss : 0.803
Epoch: 13


100%|██████████| 118/118 [00:06<00:00, 19.16it/s]


Train Loss : 0.800
Epoch: 14


100%|██████████| 118/118 [00:05<00:00, 21.63it/s]


Train Loss : 0.799
Epoch: 15


100%|██████████| 118/118 [00:06<00:00, 19.39it/s]


Train Loss : 0.797
Average epoch duration: 5.77 seconds.

Test Accuracy : 0.908

Classification Report : 
              precision    recall  f1-score   support

       World       0.92      0.90      0.91      1900
      Sports       0.95      0.98      0.96      1900
    Business       0.88      0.87      0.88      1900
    Sci/Tech       0.88      0.89      0.88      1900

    accuracy                           0.91      7600
   macro avg       0.91      0.91      0.91      7600
weighted avg       0.91      0.91      0.91      7600


Confusion Matrix : 
[[1710   62   68   60]
 [  15 1858   14   13]
 [  62   22 1652  164]
 [  66   18  134 1682]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '2Bi-LSTM', 'rnn_type': 'lstm', 'bidirectional': True, 'rnn_layers': 2}

Model:
Model(
  (embedding_layer): Embedding(20437, 100, padding_idx=0)
  (rnn): LSTM(100, 64, num_layers=2, batch_first=True, bidirectional=T

100%|██████████| 118/118 [00:06<00:00, 18.96it/s]


Train Loss : 0.980
Epoch: 2


100%|██████████| 118/118 [00:07<00:00, 16.81it/s]


Train Loss : 0.854
Epoch: 3


100%|██████████| 118/118 [00:06<00:00, 18.64it/s]


Train Loss : 0.842
Epoch: 4


100%|██████████| 118/118 [00:06<00:00, 16.91it/s]


Train Loss : 0.833
Epoch: 5


100%|██████████| 118/118 [00:06<00:00, 18.97it/s]


Train Loss : 0.827
Epoch: 6


100%|██████████| 118/118 [00:07<00:00, 16.60it/s]


Train Loss : 0.822
Epoch: 7


100%|██████████| 118/118 [00:06<00:00, 18.63it/s]


Train Loss : 0.817
Epoch: 8


100%|██████████| 118/118 [00:06<00:00, 16.91it/s]


Train Loss : 0.813
Epoch: 9


100%|██████████| 118/118 [00:06<00:00, 18.39it/s]


Train Loss : 0.811
Epoch: 10


100%|██████████| 118/118 [00:06<00:00, 17.37it/s]


Train Loss : 0.808
Epoch: 11


100%|██████████| 118/118 [00:07<00:00, 16.41it/s]


Train Loss : 0.807
Epoch: 12


100%|██████████| 118/118 [00:06<00:00, 18.82it/s]


Train Loss : 0.806
Epoch: 13


100%|██████████| 118/118 [00:07<00:00, 16.50it/s]


Train Loss : 0.804
Epoch: 14


100%|██████████| 118/118 [00:06<00:00, 18.46it/s]


Train Loss : 0.803
Epoch: 15


100%|██████████| 118/118 [00:07<00:00, 16.54it/s]


Train Loss : 0.801
Average epoch duration: 6.71 seconds.

Test Accuracy : 0.907

Classification Report : 
              precision    recall  f1-score   support

       World       0.93      0.90      0.91      1900
      Sports       0.96      0.97      0.96      1900
    Business       0.88      0.86      0.87      1900
    Sci/Tech       0.86      0.90      0.88      1900

    accuracy                           0.91      7600
   macro avg       0.91      0.91      0.91      7600
weighted avg       0.91      0.91      0.91      7600


Confusion Matrix : 
[[1701   57   75   67]
 [  19 1840   16   25]
 [  66   15 1640  179]
 [  45   10  130 1715]]
--------------------------------------------------------------------------------

+------------+-------------+-------------+------------+-------------+-------------+-------------+
|            |        1RNN |     1Bi-RNN |    2Bi-RNN |       1LSTM |    1Bi-LSTM |    2Bi-LSTM |
| Accuracy   | 0.884605    | 0.861579    | 0.871711   | 0.907895   

We observe that all models perform better when using pre-trained embeddings than with random embedding initialization. The best model with pre-trained embeddings is 1Bi-LSTM with accuracy 0.908158. The best model with random embedding initialization (task 1) was 2Bi-LSTM with accuracy 0.889868. That is, the use of pre-trained embeddings leads to significant improvements. The complexity of the models is similar to that of the models in the first task, since we only changed the way we initialize the weights of the embeddings, which are still parameters to be optimized during training. The number of parameters in the models with pre-trained embeddings is slightly smaller than in the models of task 1, only because of the slightly smaller vocabulary size: not all words of our original vocabulary had a corresponding GloVe embedding.
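
The size of this gap can be worked out from the 1RNN logs printed in this notebook, which show `Embedding(21254, 100)` for the randomly initialized run and `Embedding(20437, 100)` for the GloVe run. The calculation below assumes that the 21254-word vocabulary in that log is also the one used in task 1:

$$
(21254 - 20437) \times 100 = 81{,}700 = 2{,}136{,}284 - 2{,}054{,}584
$$

Under that assumption, the 817 words without a GloVe vector account for the entire difference between the two reported parameter totals.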

### 5.

In this task, the pre-trained embeddings should not be modified during model training (frozen embeddings). To achieve this, we modified the code so that gradient calculation is not required for the embedding layer, as follows:

self.embedding_layer.weight.requires_grad = not freeze_embeddings

i.e. when we want to freeze the embeddings this translates to:

self.embedding_layer.weight.requires_grad = False

During the creation of the optimizer, the embedding layer weight is ignored because its requires_grad attribute is False, so it is not optimized during training:

optimizer = torch.optim.Adam([param for param in classifier.parameters() if param.requires_grad == True],lr=LEARNING_RATE)
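
With the embedding layer frozen, only the recurrent layer and the linear layer remain trainable. For the 1RNN configuration, assuming the standard PyTorch parameter layout (input-to-hidden weights, hidden-to-hidden weights and two bias vectors for the RNN), the trainable count works out as:

$$
\underbrace{100 \cdot 64 + 64 \cdot 64 + 64 + 64}_{\text{RNN: } 10{,}624} + \underbrace{64 \cdot 4 + 4}_{\text{Linear: } 260} = 10{,}884
$$

This agrees with the `Total parameters:  10884` line in the 1RNN log for this task. The remaining $20437 \times 100 = 2{,}043{,}700$ embedding weights are kept fixed, and adding them back gives the 2,054,584 reported for the same model in task 4.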


In [None]:
MAX_WORDS = 25
train_dataset, test_dataset, train_loader, test_loader, vocab = preprocess_data(MAX_WORDS, vocab)
results = run_experiments(experimental_configs, pretrained_embeddings=embeddings_matrix, freeze_embeddings=True)
print_eval_table(results)

Experimental configuration: {'experiment_name': '1RNN', 'rnn_type': 'rnn', 'bidirectional': False, 'rnn_layers': 1}

Model:
Model(
  (embedding_layer): Embedding(20437, 100, padding_idx=0)
  (rnn): RNN(100, 64, batch_first=True)
  (linear): Linear(in_features=64, out_features=4, bias=True)
)
Total parameters:  10884



Epoch: 1


100%|██████████| 118/118 [00:04<00:00, 27.48it/s]


Train Loss : 1.063
Epoch: 2


100%|██████████| 118/118 [00:05<00:00, 23.00it/s]


Train Loss : 0.921
Epoch: 3


100%|██████████| 118/118 [00:03<00:00, 29.59it/s]


Train Loss : 0.903
Epoch: 4


100%|██████████| 118/118 [00:04<00:00, 28.25it/s]


Train Loss : 0.889
Epoch: 5


100%|██████████| 118/118 [00:05<00:00, 23.50it/s]


Train Loss : 0.888
Epoch: 6


100%|██████████| 118/118 [00:04<00:00, 28.80it/s]


Train Loss : 0.887
Epoch: 7


100%|██████████| 118/118 [00:04<00:00, 27.76it/s]


Train Loss : 0.881
Epoch: 8


100%|██████████| 118/118 [00:04<00:00, 24.13it/s]


Train Loss : 0.879
Epoch: 9


100%|██████████| 118/118 [00:04<00:00, 28.60it/s]


Train Loss : 0.885
Epoch: 10


100%|██████████| 118/118 [00:04<00:00, 25.11it/s]


Train Loss : 0.887
Epoch: 11


100%|██████████| 118/118 [00:04<00:00, 26.51it/s]


Train Loss : 0.876
Epoch: 12


100%|██████████| 118/118 [00:04<00:00, 28.26it/s]


Train Loss : 0.876
Epoch: 13


100%|██████████| 118/118 [00:05<00:00, 22.17it/s]


Train Loss : 0.874
Epoch: 14


100%|██████████| 118/118 [00:04<00:00, 28.56it/s]


Train Loss : 0.925
Epoch: 15


100%|██████████| 118/118 [00:04<00:00, 28.36it/s]


Train Loss : 0.883
Average epoch duration: 4.47 seconds.

Test Accuracy : 0.854

Classification Report : 
              precision    recall  f1-score   support

       World       0.86      0.88      0.87      1900
      Sports       0.92      0.95      0.94      1900
    Business       0.77      0.86      0.81      1900
    Sci/Tech       0.88      0.73      0.80      1900

    accuracy                           0.85      7600
   macro avg       0.86      0.85      0.85      7600
weighted avg       0.86      0.85      0.85      7600


Confusion Matrix : 
[[1671   78  111   40]
 [  34 1813   31   22]
 [ 109   34 1626  131]
 [ 123   45  351 1381]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '1Bi-RNN', 'rnn_type': 'rnn', 'bidirectional': True, 'rnn_layers': 1}

Model:
Model(
  (embedding_layer): Embedding(20437, 100, padding_idx=0)
  (rnn): RNN(100, 64, batch_first=True, bidirectional=True)
  (linear): 

100%|██████████| 118/118 [00:05<00:00, 22.15it/s]


Train Loss : 1.072
Epoch: 2


100%|██████████| 118/118 [00:04<00:00, 27.19it/s]


Train Loss : 0.918
Epoch: 3


100%|██████████| 118/118 [00:04<00:00, 23.88it/s]


Train Loss : 0.896
Epoch: 4


100%|██████████| 118/118 [00:04<00:00, 25.46it/s]


Train Loss : 0.890
Epoch: 5


100%|██████████| 118/118 [00:04<00:00, 28.28it/s]


Train Loss : 0.882
Epoch: 6


100%|██████████| 118/118 [00:05<00:00, 22.39it/s]


Train Loss : 0.889
Epoch: 7


100%|██████████| 118/118 [00:04<00:00, 28.32it/s]


Train Loss : 0.883
Epoch: 8


100%|██████████| 118/118 [00:04<00:00, 28.46it/s]


Train Loss : 0.883
Epoch: 9


100%|██████████| 118/118 [00:05<00:00, 20.91it/s]


Train Loss : 0.884
Epoch: 10


100%|██████████| 118/118 [00:04<00:00, 26.76it/s]


Train Loss : 0.874
Epoch: 11


100%|██████████| 118/118 [00:04<00:00, 23.76it/s]


Train Loss : 0.876
Epoch: 12


100%|██████████| 118/118 [00:04<00:00, 26.24it/s]


Train Loss : 0.874
Epoch: 13


100%|██████████| 118/118 [00:04<00:00, 25.37it/s]


Train Loss : 0.872
Epoch: 14


100%|██████████| 118/118 [00:05<00:00, 20.66it/s]


Train Loss : 0.874
Epoch: 15


100%|██████████| 118/118 [00:04<00:00, 27.63it/s]


Train Loss : 0.873
Average epoch duration: 4.75 seconds.

Test Accuracy : 0.868

Classification Report : 
              precision    recall  f1-score   support

       World       0.92      0.84      0.88      1900
      Sports       0.92      0.95      0.94      1900
    Business       0.82      0.80      0.81      1900
    Sci/Tech       0.81      0.87      0.84      1900

    accuracy                           0.87      7600
   macro avg       0.87      0.87      0.87      7600
weighted avg       0.87      0.87      0.87      7600


Confusion Matrix : 
[[1601   81  144   74]
 [  23 1813   41   23]
 [  50   29 1528  293]
 [  65   42  140 1653]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '2Bi-RNN', 'rnn_type': 'rnn', 'bidirectional': True, 'rnn_layers': 2}

Model:
Model(
  (embedding_layer): Embedding(20437, 100, padding_idx=0)
  (rnn): RNN(100, 64, num_layers=2, batch_first=True, bidirectional=True

100%|██████████| 118/118 [00:05<00:00, 23.11it/s]


Train Loss : 1.080
Epoch: 2


100%|██████████| 118/118 [00:05<00:00, 22.61it/s]


Train Loss : 0.905
Epoch: 3


100%|██████████| 118/118 [00:04<00:00, 25.60it/s]


Train Loss : 0.889
Epoch: 4


100%|██████████| 118/118 [00:05<00:00, 22.27it/s]


Train Loss : 0.901
Epoch: 5


100%|██████████| 118/118 [00:04<00:00, 26.00it/s]


Train Loss : 0.893
Epoch: 6


100%|██████████| 118/118 [00:04<00:00, 24.25it/s]


Train Loss : 0.887
Epoch: 7


100%|██████████| 118/118 [00:04<00:00, 23.82it/s]


Train Loss : 0.882
Epoch: 8


100%|██████████| 118/118 [00:04<00:00, 26.14it/s]


Train Loss : 0.881
Epoch: 9


100%|██████████| 118/118 [00:05<00:00, 22.20it/s]


Train Loss : 0.879
Epoch: 10


100%|██████████| 118/118 [00:04<00:00, 25.99it/s]


Train Loss : 0.876
Epoch: 11


100%|██████████| 118/118 [00:04<00:00, 27.00it/s]


Train Loss : 0.876
Epoch: 12


100%|██████████| 118/118 [00:05<00:00, 22.14it/s]


Train Loss : 0.883
Epoch: 13


100%|██████████| 118/118 [00:04<00:00, 26.24it/s]


Train Loss : 0.879
Epoch: 14


100%|██████████| 118/118 [00:05<00:00, 21.72it/s]


Train Loss : 0.886
Epoch: 15


100%|██████████| 118/118 [00:04<00:00, 26.15it/s]


Train Loss : 0.884
Average epoch duration: 4.88 seconds.

Test Accuracy : 0.862

Classification Report : 
              precision    recall  f1-score   support

       World       0.87      0.86      0.87      1900
      Sports       0.93      0.94      0.94      1900
    Business       0.82      0.82      0.82      1900
    Sci/Tech       0.83      0.83      0.83      1900

    accuracy                           0.86      7600
   macro avg       0.86      0.86      0.86      7600
weighted avg       0.86      0.86      0.86      7600


Confusion Matrix : 
[[1631   69  112   88]
 [  59 1787   24   30]
 [  99   30 1555  216]
 [  81   28  214 1577]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '1LSTM', 'rnn_type': 'lstm', 'bidirectional': False, 'rnn_layers': 1}

Model:
Model(
  (embedding_layer): Embedding(20437, 100, padding_idx=0)
  (rnn): LSTM(100, 64, batch_first=True)
  (linear): Linear(in_features=

100%|██████████| 118/118 [00:04<00:00, 27.26it/s]


Train Loss : 1.022
Epoch: 2


100%|██████████| 118/118 [00:05<00:00, 21.99it/s]


Train Loss : 0.874
Epoch: 3


100%|██████████| 118/118 [00:04<00:00, 27.06it/s]


Train Loss : 0.865
Epoch: 4


100%|██████████| 118/118 [00:05<00:00, 22.03it/s]


Train Loss : 0.860
Epoch: 5


100%|██████████| 118/118 [00:04<00:00, 25.63it/s]


Train Loss : 0.855
Epoch: 6


100%|██████████| 118/118 [00:04<00:00, 26.05it/s]


Train Loss : 0.851
Epoch: 7


100%|██████████| 118/118 [00:05<00:00, 21.45it/s]


Train Loss : 0.849
Epoch: 8


100%|██████████| 118/118 [00:04<00:00, 26.71it/s]


Train Loss : 0.846
Epoch: 9


100%|██████████| 118/118 [00:05<00:00, 22.83it/s]


Train Loss : 0.844
Epoch: 10


100%|██████████| 118/118 [00:04<00:00, 25.34it/s]


Train Loss : 0.843
Epoch: 11


100%|██████████| 118/118 [00:04<00:00, 26.91it/s]


Train Loss : 0.840
Epoch: 12


100%|██████████| 118/118 [00:05<00:00, 22.46it/s]


Train Loss : 0.839
Epoch: 13


100%|██████████| 118/118 [00:04<00:00, 27.89it/s]


Train Loss : 0.838
Epoch: 14


100%|██████████| 118/118 [00:04<00:00, 24.72it/s]


Train Loss : 0.837
Epoch: 15


100%|██████████| 118/118 [00:05<00:00, 22.95it/s]


Train Loss : 0.834
Average epoch duration: 4.81 seconds.

Test Accuracy : 0.899

Classification Report : 
              precision    recall  f1-score   support

       World       0.94      0.87      0.90      1900
      Sports       0.94      0.98      0.96      1900
    Business       0.86      0.86      0.86      1900
    Sci/Tech       0.86      0.89      0.88      1900

    accuracy                           0.90      7600
   macro avg       0.90      0.90      0.90      7600
weighted avg       0.90      0.90      0.90      7600


Confusion Matrix : 
[[1656   77   94   73]
 [  10 1853   24   13]
 [  60   22 1631  187]
 [  43   18  144 1695]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '1Bi-LSTM', 'rnn_type': 'lstm', 'bidirectional': True, 'rnn_layers': 1}

Model:
Model(
  (embedding_layer): Embedding(20437, 100, padding_idx=0)
  (rnn): LSTM(100, 64, batch_first=True, bidirectional=True)
  (linear

100%|██████████| 118/118 [00:04<00:00, 25.60it/s]


Train Loss : 1.030
Epoch: 2


100%|██████████| 118/118 [00:05<00:00, 21.03it/s]


Train Loss : 0.873
Epoch: 3


100%|██████████| 118/118 [00:04<00:00, 25.29it/s]


Train Loss : 0.863
Epoch: 4


100%|██████████| 118/118 [00:05<00:00, 22.88it/s]


Train Loss : 0.858
Epoch: 5


100%|██████████| 118/118 [00:05<00:00, 23.52it/s]


Train Loss : 0.853
Epoch: 6


100%|██████████| 118/118 [00:04<00:00, 24.95it/s]


Train Loss : 0.851
Epoch: 7


100%|██████████| 118/118 [00:05<00:00, 21.14it/s]


Train Loss : 0.848
Epoch: 8


100%|██████████| 118/118 [00:04<00:00, 24.75it/s]


Train Loss : 0.845
Epoch: 9


100%|██████████| 118/118 [00:05<00:00, 20.69it/s]


Train Loss : 0.843
Epoch: 10


100%|██████████| 118/118 [00:04<00:00, 25.20it/s]


Train Loss : 0.841
Epoch: 11


100%|██████████| 118/118 [00:04<00:00, 24.45it/s]


Train Loss : 0.839
Epoch: 12


100%|██████████| 118/118 [00:05<00:00, 21.10it/s]


Train Loss : 0.837
Epoch: 13


100%|██████████| 118/118 [00:04<00:00, 24.76it/s]


Train Loss : 0.835
Epoch: 14


100%|██████████| 118/118 [00:05<00:00, 20.49it/s]


Train Loss : 0.834
Epoch: 15


100%|██████████| 118/118 [00:04<00:00, 24.92it/s]


Train Loss : 0.831
Average epoch duration: 5.09 seconds.

Test Accuracy : 0.896

Classification Report : 
              precision    recall  f1-score   support

       World       0.93      0.88      0.90      1900
      Sports       0.95      0.97      0.96      1900
    Business       0.88      0.83      0.85      1900
    Sci/Tech       0.83      0.91      0.87      1900

    accuracy                           0.90      7600
   macro avg       0.90      0.90      0.90      7600
weighted avg       0.90      0.90      0.90      7600


Confusion Matrix : 
[[1674   63   74   89]
 [  15 1836   24   25]
 [  69   15 1570  246]
 [  44   12  114 1730]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '2Bi-LSTM', 'rnn_type': 'lstm', 'bidirectional': True, 'rnn_layers': 2}

Model:
Model(
  (embedding_layer): Embedding(20437, 100, padding_idx=0)
  (rnn): LSTM(100, 64, num_layers=2, batch_first=True, bidirectional=T

100%|██████████| 118/118 [00:05<00:00, 20.92it/s]


Train Loss : 0.984
Epoch: 2


100%|██████████| 118/118 [00:05<00:00, 23.01it/s]


Train Loss : 0.870
Epoch: 3


100%|██████████| 118/118 [00:05<00:00, 23.06it/s]


Train Loss : 0.864
Epoch: 4


100%|██████████| 118/118 [00:06<00:00, 19.64it/s]


Train Loss : 0.856
Epoch: 5


100%|██████████| 118/118 [00:05<00:00, 23.20it/s]


Train Loss : 0.853
Epoch: 6


100%|██████████| 118/118 [00:06<00:00, 18.95it/s]


Train Loss : 0.851
Epoch: 7


100%|██████████| 118/118 [00:05<00:00, 22.64it/s]


Train Loss : 0.848
Epoch: 8


100%|██████████| 118/118 [00:06<00:00, 19.55it/s]


Train Loss : 0.846
Epoch: 9


100%|██████████| 118/118 [00:04<00:00, 23.76it/s]


Train Loss : 0.844
Epoch: 10


100%|██████████| 118/118 [00:05<00:00, 20.08it/s]


Train Loss : 0.840
Epoch: 11


100%|██████████| 118/118 [00:05<00:00, 22.63it/s]


Train Loss : 0.839
Epoch: 12


100%|██████████| 118/118 [00:05<00:00, 23.21it/s]


Train Loss : 0.838
Epoch: 13


100%|██████████| 118/118 [00:05<00:00, 20.74it/s]


Train Loss : 0.836
Epoch: 14


100%|██████████| 118/118 [00:05<00:00, 22.78it/s]


Train Loss : 0.834
Epoch: 15


100%|██████████| 118/118 [00:05<00:00, 19.80it/s]


Train Loss : 0.832
Average epoch duration: 5.50 seconds.

Test Accuracy : 0.900

Classification Report : 
              precision    recall  f1-score   support

       World       0.93      0.89      0.91      1900
      Sports       0.94      0.98      0.96      1900
    Business       0.87      0.84      0.86      1900
    Sci/Tech       0.86      0.89      0.87      1900

    accuracy                           0.90      7600
   macro avg       0.90      0.90      0.90      7600
weighted avg       0.90      0.90      0.90      7600


Confusion Matrix : 
[[1687   64   87   62]
 [  15 1853   18   14]
 [  62   27 1603  208]
 [  54   17  133 1696]]
--------------------------------------------------------------------------------

+------------+--------------+--------------+--------------+--------------+--------------+---------------+
|            |         1RNN |      1Bi-RNN |      2Bi-RNN |        1LSTM |     1Bi-LSTM |      2Bi-LSTM |
| Accuracy   |     0.854079 |     0.867763 |     0.

With frozen pre-trained embeddings, the models perform slightly worse than with trainable (non-frozen) pre-trained embeddings. On the other hand, the number of trainable parameters is much smaller, since only the RNN/LSTM parameters and the linear layer parameters are now learned (for example, 10884 parameters for the 1RNN model). In the previous models, the majority of the parameters came from the embedding layer, which is now frozen during training.
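
The parameter counts reported in the outputs can be reproduced by hand. The sketch below is an illustration only: the helper names `rnn_params` and `linear_params` are hypothetical and not part of the notebook, and the formula assumes PyTorch's layout of one input-hidden matrix, one hidden-hidden matrix and two bias vectors per gate, layer and direction.

```python
def rnn_params(input_dim, hidden_dim, layers=1, bidirectional=False, gates=1):
    """Count weights and biases of a stacked RNN (gates=1) or LSTM (gates=4)."""
    directions = 2 if bidirectional else 1
    total = 0
    layer_input = input_dim
    for _ in range(layers):
        per_direction = gates * (
            hidden_dim * layer_input    # input-hidden weights
            + hidden_dim * hidden_dim   # hidden-hidden weights
            + 2 * hidden_dim            # two bias vectors
        )
        total += directions * per_direction
        # Deeper layers read the concatenated outputs of the layer below.
        layer_input = directions * hidden_dim
    return total


def linear_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


EMBEDDING_DIM, HIDDEN_DIM, NUM_CLASSES = 100, 64, 4
VOCAB_SIZE = 20437  # size of the embedding table shown in the model printout

# 1RNN with frozen embeddings: only the RNN and the linear layer are trained.
trainable = rnn_params(EMBEDDING_DIM, HIDDEN_DIM) + linear_params(HIDDEN_DIM, NUM_CLASSES)
frozen = VOCAB_SIZE * EMBEDDING_DIM

print(trainable)  # 10884
print(frozen)     # 2043700
```

Under these assumptions the frozen embedding table alone holds about two million weights, which is why freezing it shrinks the trainable part of the 1RNN model to roughly ten thousand parameters.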

### 6.

To complete the last task of this assignment, we modify the preprocessing step to match the new dataset format, which contains a "review" column with the text to classify and a "sentiment" column with the label ("positive" or "negative").

In [None]:
import pandas as pd
data = pd.read_csv('IMDB Dataset.csv')

In [None]:
import math
import random

def preprocess_imdb_data(max_words):

    # All texts are truncated and padded to max_words tokens
    def collate_batch(batch):
        Y, X = list(zip(*batch))
        Y = [0 if label == "negative" else 1 for label in Y]
        Y = torch.tensor(Y, dtype=torch.float)
        X = [vocab(tokenizer(text)) for text in X]
        # Bringing all samples to max_words length. Shorter texts are padded with <PAD> tokens, longer texts are truncated.
        X = [tokens+([vocab['<PAD>']]* (max_words-len(tokens))) if len(tokens)<max_words else tokens[:max_words] for tokens in X]
        return torch.tensor(X, dtype=torch.int32).to(device), Y.to(device) 

    dataset = [(row["sentiment"], row["review"]) for i, row in data.iterrows()]

    random.seed(1)
    random.shuffle(dataset)

    train_dataset = dataset[:math.floor(0.8*len(dataset))]
    test_dataset = dataset[math.floor(0.8*len(dataset)):]

    train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE,
                                  shuffle=True, collate_fn=collate_batch)
    test_loader = DataLoader(test_dataset, batch_size=BATCH_SIZE,
                                  shuffle=False, collate_fn=collate_batch)
    
    def build_vocabulary(datasets):
        for dataset in datasets:
            for _, text in dataset:
                yield tokenizer(text)

    # Vocabulary includes all tokens with at least 10 occurrences in the texts
    # Special tokens <PAD> and <UNK> are used for padding sequences and unknown words respectively
    vocab = build_vocab_from_iterator(build_vocabulary([train_dataset, test_dataset]), min_freq=10, specials=["<PAD>","<UNK>"])
    vocab.set_default_index(vocab["<UNK>"])

    return train_dataset, test_dataset, train_loader, test_loader, vocab

train_dataset, test_dataset, train_loader, test_loader, vocab = preprocess_imdb_data(MAX_WORDS)

target_classes = ["negative", "positive"]
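
The padding, truncation and label mapping performed in `collate_batch` can be shown in isolation. This is a minimal sketch, not the notebook's code: `pad_or_truncate` and `encode_label` are hypothetical helper names, the token ids are made up, and `pad_id=0` assumes that `<PAD>` is the first special token in the vocabulary.

```python
def pad_or_truncate(token_ids, max_words, pad_id=0):
    """Return a list of exactly max_words token ids."""
    if len(token_ids) < max_words:
        # Short text: append padding ids on the right.
        return token_ids + [pad_id] * (max_words - len(token_ids))
    # Long (or exact-length) text: keep only the first max_words ids.
    return token_ids[:max_words]


def encode_label(label):
    """Map the sentiment string to the binary target used for training."""
    return 0 if label == "negative" else 1


short_review = [12, 7, 93]             # made-up token ids
long_review = list(range(100, 130))    # 30 made-up token ids

print(pad_or_truncate(short_review, 5))        # [12, 7, 93, 0, 0]
print(len(pad_or_truncate(long_review, 25)))   # 25
print(encode_label("negative"), encode_label("positive"))  # 0 1
```

Note that with `MAX_WORDS = 25` only the first 25 tokens of each review are kept, so most of a typical review is discarded before it reaches the model.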


Furthermore, we modify the model into a binary classification model, which has an output dimension of 1 and uses a sigmoid activation function instead of softmax after the linear layer. The model predicts a number between 0 and 1: predictions of at most 0.5 are interpreted as "negative" and predictions above 0.5 as "positive".
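
As a small illustration of this decision rule, the sketch below applies a sigmoid to a few made-up logits and thresholds the result at 0.5. The names `sigmoid` and `to_label` are hypothetical helpers written in plain Python, not functions from the notebook.

```python
import math


def sigmoid(logit):
    """Squash a real-valued logit into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))


def to_label(prob, threshold=0.5):
    """Probabilities above the threshold are positive, the rest negative."""
    return "positive" if prob > threshold else "negative"


for logit in (-2.0, 0.0, 1.5):  # made-up outputs of the linear layer
    prob = sigmoid(logit)
    print(f"logit {logit:+.1f} -> probability {prob:.3f} -> {to_label(prob)}")
```

A logit of exactly 0 gives a probability of 0.5, which this rule assigns to the "negative" class.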

In [None]:
# Creating a Binary Classification model that always has output dimension of 1 and uses sigmoid instead of softmax.

class BinaryClassificationModel(nn.Module):
    def __init__(self, input_dim, embedding_dim, hidden_dim, rnn_type, rnn_layers, bidirectional):
        super(BinaryClassificationModel, self).__init__()

        self.embedding_layer = nn.Embedding(num_embeddings=input_dim, embedding_dim=embedding_dim)
        
        if rnn_type == "rnn":
            self.rnn = nn.RNN(input_size=embedding_dim, hidden_size=hidden_dim, batch_first=True, num_layers=rnn_layers, bidirectional=bidirectional)
        elif rnn_type == "lstm":
            self.rnn = nn.LSTM(input_size=embedding_dim, hidden_size=hidden_dim, batch_first=True, num_layers=rnn_layers, bidirectional=bidirectional)
        else:
            raise ValueError(f"Unsupported rnn type: {rnn_type}")

        if bidirectional:
            # In the case of bidirectional RNN, its output dimension will be 2 times the hidden size, 
            # since the hidden states of the forward and the backward RNN are concatenated. 
            # Thus, we change the input dimension of the linear layer to be 2 times the hidden dimension of the RNN, to match the output of the RNN.
            self.linear = nn.Linear(2*hidden_dim, 1)
        else:
            self.linear = nn.Linear(hidden_dim, 1)

    def forward(self, X_batch):
        embeddings = self.embedding_layer(X_batch)
        output, hidden = self.rnn(embeddings)
        logits = self.linear(output[:,-1])  # The last output of RNN is used for sequence classification
        probs = torch.sigmoid(logits)  # Map each logit to a probability in (0, 1)
        return probs.squeeze(-1)

We also change the loss to Binary Cross Entropy loss (BCELoss), to comply with the new model output.
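
To make the loss concrete, here is a plain-Python sketch of mean binary cross-entropy on a toy batch. The function `bce_loss` is a hypothetical stand-in for illustration; unlike a library implementation it does not guard against probabilities of exactly 0 or 1.

```python
import math


def bce_loss(probs, targets):
    """Mean binary cross-entropy between predicted probabilities and 0/1 targets."""
    total = 0.0
    for p, y in zip(probs, targets):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(probs)


# A model that always outputs 0.5 scores ln(2) regardless of the labels.
print(round(bce_loss([0.5, 0.5], [1, 0]), 3))  # 0.693

# Confident, correct predictions are rewarded with a lower loss.
print(bce_loss([0.9, 0.1], [1, 0]) < bce_loss([0.6, 0.4], [1, 0]))  # True
```

The value ln(2) ≈ 0.693 is a useful reference point: a training loss close to it means the model is still predicting at roughly chance level.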

In [None]:
def run_imdb_experiments(experimental_configs, pretrained_embeddings=None, freeze_embeddings=False):
    results = []
    for config in experimental_configs:
        print(f"Experimental configuration: {config}")
        classifier = BinaryClassificationModel(
            input_dim=len(vocab), 
            embedding_dim=EMBEDDING_DIM, 
            hidden_dim=HIDDEN_DIM, 
            rnn_type=config["rnn_type"],
            rnn_layers=config["rnn_layers"],
            bidirectional=config["bidirectional"]
        ).to(device)
        # Using binary cross-entropy loss, since the binary classification model returns a single probability in the range (0, 1).
        loss_fn = nn.BCELoss()
        optimizer = torch.optim.Adam([param for param in classifier.parameters() if param.requires_grad], lr=LEARNING_RATE)
        num_of_parameters = count_parameters(classifier)

        print('\nModel:')
        print(classifier)
        print('Total parameters: ', num_of_parameters)
        print('\n\n')

        average_epoch_time = TrainIMDBModel(classifier, loss_fn, optimizer, train_loader, EPOCHS)
        _, Y_actual, Y_preds = EvaluateIMDBModel(classifier, loss_fn, test_loader)
        accuracy = accuracy_score(Y_actual, Y_preds)


        print(f"Average epoch duration: {average_epoch_time:.2f} seconds.")
        print("\nTest Accuracy : {:.3f}".format(accuracy))
        print("\nClassification Report : ")
        print(classification_report(Y_actual, Y_preds, target_names=target_classes))
        print("\nConfusion Matrix : ")
        print(confusion_matrix(Y_actual, Y_preds))
        print("-" * 80 + "\n")

        results.append(
            {
                "Experiment name": config["experiment_name"],
                "Accuracy": accuracy,
                "Parameters": num_of_parameters,
                "Time cost": average_epoch_time,
                "Y_preds": Y_preds,
                "Y_actual": Y_actual
            }
        )
    return results

In [None]:
def EvaluateIMDBModel(model, loss_fn, val_loader):
    model.eval()
    with torch.no_grad():
        Y_actual, Y_preds, losses = [],[],[]
        for X, Y in val_loader:
            preds = model(X)
            loss = loss_fn(preds, Y)
            losses.append(loss.item())

            Y_actual.append(Y)
            # Threshold the probabilities at 0.5: 1 = positive, 0 = negative
            Y_preds.append((preds > 0.5).long().cpu())
        Y_actual = torch.cat(Y_actual)
        Y_preds = torch.cat(Y_preds)
    
    # Returns mean loss, actual labels, predicted labels 
    return torch.tensor(losses).mean(), Y_actual.detach().cpu().numpy(), Y_preds.detach().cpu().numpy()


def TrainIMDBModel(model, loss_fn, optimizer, train_loader, epochs):
    epoch_times_list = []
    for i in range(1, epochs+1):
        start = time.time()
        model.train()
        print('Epoch:',i)
        losses = []
        for X, Y in tqdm(train_loader):
            Y_preds = model(X)

            loss = loss_fn(Y_preds, Y)
            losses.append(loss.item())

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        epoch_time = (time.time() - start)
        epoch_times_list.append(epoch_time)
        print("Train Loss : {:.3f}".format(torch.tensor(losses).mean()))
    
    # return the average epoch duration in seconds.
    return sum(epoch_times_list) / len(epoch_times_list)

In [None]:
MAX_WORDS = 25
train_dataset, test_dataset, train_loader, test_loader, vocab = preprocess_imdb_data(MAX_WORDS)
results = run_imdb_experiments(experimental_configs)
print_eval_table(results)

Experimental configuration: {'experiment_name': '1RNN', 'rnn_type': 'rnn', 'bidirectional': False, 'rnn_layers': 1}

Model:
BinaryClassificationModel(
  (embedding_layer): Embedding(29065, 100)
  (rnn): RNN(100, 64, batch_first=True)
  (linear): Linear(in_features=64, out_features=1, bias=True)
)
Total parameters:  2917189



Epoch: 1


100%|██████████| 40/40 [00:06<00:00,  6.06it/s]


Train Loss : 0.696
Epoch: 2


100%|██████████| 40/40 [00:07<00:00,  5.65it/s]


Train Loss : 0.689
Epoch: 3


100%|██████████| 40/40 [00:06<00:00,  6.59it/s]


Train Loss : 0.681
Epoch: 4


100%|██████████| 40/40 [00:06<00:00,  5.74it/s]


Train Loss : 0.655
Epoch: 5


100%|██████████| 40/40 [00:05<00:00,  6.72it/s]


Train Loss : 0.614
Epoch: 6


100%|██████████| 40/40 [00:06<00:00,  5.75it/s]


Train Loss : 0.575
Epoch: 7


100%|██████████| 40/40 [00:05<00:00,  6.67it/s]


Train Loss : 0.534
Epoch: 8


100%|██████████| 40/40 [00:06<00:00,  5.81it/s]


Train Loss : 0.504
Epoch: 9


100%|██████████| 40/40 [00:06<00:00,  6.65it/s]


Train Loss : 0.473
Epoch: 10


100%|██████████| 40/40 [00:06<00:00,  5.78it/s]


Train Loss : 0.446
Epoch: 11


100%|██████████| 40/40 [00:06<00:00,  6.38it/s]


Train Loss : 0.415
Epoch: 12


100%|██████████| 40/40 [00:07<00:00,  5.59it/s]


Train Loss : 0.385
Epoch: 13


100%|██████████| 40/40 [00:06<00:00,  6.31it/s]


Train Loss : 0.363
Epoch: 14


100%|██████████| 40/40 [00:06<00:00,  5.87it/s]


Train Loss : 0.344
Epoch: 15


100%|██████████| 40/40 [00:06<00:00,  5.93it/s]


Train Loss : 0.322
Average epoch duration: 6.59 seconds.

Test Accuracy : 0.714

Classification Report : 
              precision    recall  f1-score   support

    negative       0.74      0.67      0.71      5104
    positive       0.69      0.76      0.72      4896

    accuracy                           0.71     10000
   macro avg       0.72      0.71      0.71     10000
weighted avg       0.72      0.71      0.71     10000


Confusion Matrix : 
[[3435 1669]
 [1194 3702]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '1Bi-RNN', 'rnn_type': 'rnn', 'bidirectional': True, 'rnn_layers': 1}

Model:
BinaryClassificationModel(
  (embedding_layer): Embedding(29065, 100)
  (rnn): RNN(100, 64, batch_first=True, bidirectional=True)
  (linear): Linear(in_features=128, out_features=1, bias=True)
)
Total parameters:  2927877



Epoch: 1


100%|██████████| 40/40 [00:06<00:00,  6.38it/s]


Train Loss : 0.696
Epoch: 2


100%|██████████| 40/40 [00:07<00:00,  5.70it/s]


Train Loss : 0.688
Epoch: 3


100%|██████████| 40/40 [00:06<00:00,  6.60it/s]


Train Loss : 0.667
Epoch: 4


100%|██████████| 40/40 [00:07<00:00,  5.67it/s]


Train Loss : 0.638
Epoch: 5


100%|██████████| 40/40 [00:06<00:00,  6.48it/s]


Train Loss : 0.598
Epoch: 6


100%|██████████| 40/40 [00:07<00:00,  5.44it/s]


Train Loss : 0.571
Epoch: 7


100%|██████████| 40/40 [00:06<00:00,  6.10it/s]


Train Loss : 0.540
Epoch: 8


100%|██████████| 40/40 [00:06<00:00,  5.87it/s]


Train Loss : 0.508
Epoch: 9


100%|██████████| 40/40 [00:07<00:00,  5.60it/s]


Train Loss : 0.477
Epoch: 10


100%|██████████| 40/40 [00:06<00:00,  6.60it/s]


Train Loss : 0.449
Epoch: 11


100%|██████████| 40/40 [00:07<00:00,  5.62it/s]


Train Loss : 0.423
Epoch: 12


100%|██████████| 40/40 [00:06<00:00,  6.63it/s]


Train Loss : 0.397
Epoch: 13


100%|██████████| 40/40 [00:06<00:00,  5.74it/s]


Train Loss : 0.373
Epoch: 14


100%|██████████| 40/40 [00:06<00:00,  6.53it/s]


Train Loss : 0.351
Epoch: 15


100%|██████████| 40/40 [00:07<00:00,  5.61it/s]


Train Loss : 0.332
Average epoch duration: 6.67 seconds.

Test Accuracy : 0.716

Classification Report : 
              precision    recall  f1-score   support

    negative       0.73      0.69      0.71      5104
    positive       0.70      0.74      0.72      4896

    accuracy                           0.72     10000
   macro avg       0.72      0.72      0.72     10000
weighted avg       0.72      0.72      0.72     10000


Confusion Matrix : 
[[3540 1564]
 [1281 3615]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '2Bi-RNN', 'rnn_type': 'rnn', 'bidirectional': True, 'rnn_layers': 2}

Model:
BinaryClassificationModel(
  (embedding_layer): Embedding(29065, 100)
  (rnn): RNN(100, 64, num_layers=2, batch_first=True, bidirectional=True)
  (linear): Linear(in_features=128, out_features=1, bias=True)
)
Total parameters:  2952709



Epoch: 1


100%|██████████| 40/40 [00:06<00:00,  5.97it/s]


Train Loss : 0.694
Epoch: 2


100%|██████████| 40/40 [00:06<00:00,  5.83it/s]


Train Loss : 0.679
Epoch: 3


100%|██████████| 40/40 [00:07<00:00,  5.45it/s]


Train Loss : 0.645
Epoch: 4


100%|██████████| 40/40 [00:06<00:00,  6.45it/s]


Train Loss : 0.616
Epoch: 5


100%|██████████| 40/40 [00:07<00:00,  5.55it/s]


Train Loss : 0.578
Epoch: 6


100%|██████████| 40/40 [00:06<00:00,  6.51it/s]


Train Loss : 0.543
Epoch: 7


100%|██████████| 40/40 [00:07<00:00,  5.67it/s]


Train Loss : 0.507
Epoch: 8


100%|██████████| 40/40 [00:06<00:00,  6.44it/s]


Train Loss : 0.471
Epoch: 9


100%|██████████| 40/40 [00:07<00:00,  5.57it/s]


Train Loss : 0.442
Epoch: 10


100%|██████████| 40/40 [00:06<00:00,  6.42it/s]


Train Loss : 0.413
Epoch: 11


100%|██████████| 40/40 [00:07<00:00,  5.55it/s]


Train Loss : 0.384
Epoch: 12


100%|██████████| 40/40 [00:06<00:00,  5.78it/s]


Train Loss : 0.357
Epoch: 13


100%|██████████| 40/40 [00:06<00:00,  6.15it/s]


Train Loss : 0.321
Epoch: 14


100%|██████████| 40/40 [00:07<00:00,  5.43it/s]


Train Loss : 0.290
Epoch: 15


100%|██████████| 40/40 [00:06<00:00,  6.42it/s]


Train Loss : 0.270
Average epoch duration: 6.77 seconds.

Test Accuracy : 0.710

Classification Report : 
              precision    recall  f1-score   support

    negative       0.76      0.63      0.69      5104
    positive       0.67      0.80      0.73      4896

    accuracy                           0.71     10000
   macro avg       0.72      0.71      0.71     10000
weighted avg       0.72      0.71      0.71     10000


Confusion Matrix : 
[[3205 1899]
 [ 999 3897]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '1LSTM', 'rnn_type': 'lstm', 'bidirectional': False, 'rnn_layers': 1}

Model:
BinaryClassificationModel(
  (embedding_layer): Embedding(29065, 100)
  (rnn): LSTM(100, 64, batch_first=True)
  (linear): Linear(in_features=64, out_features=1, bias=True)
)
Total parameters:  2949061



Epoch: 1


100%|██████████| 40/40 [00:07<00:00,  5.56it/s]


Train Loss : 0.691
Epoch: 2


100%|██████████| 40/40 [00:06<00:00,  6.42it/s]


Train Loss : 0.672
Epoch: 3


100%|██████████| 40/40 [00:07<00:00,  5.63it/s]


Train Loss : 0.621
Epoch: 4


100%|██████████| 40/40 [00:06<00:00,  5.91it/s]


Train Loss : 0.564
Epoch: 5


100%|██████████| 40/40 [00:06<00:00,  6.27it/s]


Train Loss : 0.517
Epoch: 6


100%|██████████| 40/40 [00:07<00:00,  5.65it/s]


Train Loss : 0.477
Epoch: 7


100%|██████████| 40/40 [00:06<00:00,  6.56it/s]


Train Loss : 0.448
Epoch: 8


100%|██████████| 40/40 [00:07<00:00,  5.67it/s]


Train Loss : 0.416
Epoch: 9


100%|██████████| 40/40 [00:06<00:00,  6.44it/s]


Train Loss : 0.385
Epoch: 10


100%|██████████| 40/40 [00:07<00:00,  5.71it/s]


Train Loss : 0.356
Epoch: 11


100%|██████████| 40/40 [00:06<00:00,  6.45it/s]


Train Loss : 0.332
Epoch: 12


100%|██████████| 40/40 [00:06<00:00,  5.72it/s]


Train Loss : 0.300
Epoch: 13


100%|██████████| 40/40 [00:06<00:00,  6.55it/s]


Train Loss : 0.270
Epoch: 14


100%|██████████| 40/40 [00:07<00:00,  5.70it/s]


Train Loss : 0.244
Epoch: 15


100%|██████████| 40/40 [00:06<00:00,  6.63it/s]


Train Loss : 0.223
Average epoch duration: 6.64 seconds.

Test Accuracy : 0.713

Classification Report : 
              precision    recall  f1-score   support

    negative       0.69      0.79      0.74      5104
    positive       0.74      0.63      0.68      4896

    accuracy                           0.71     10000
   macro avg       0.72      0.71      0.71     10000
weighted avg       0.72      0.71      0.71     10000


Confusion Matrix : 
[[4034 1070]
 [1799 3097]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '1Bi-LSTM', 'rnn_type': 'lstm', 'bidirectional': True, 'rnn_layers': 1}

Model:
BinaryClassificationModel(
  (embedding_layer): Embedding(29065, 100)
  (rnn): LSTM(100, 64, batch_first=True, bidirectional=True)
  (linear): Linear(in_features=128, out_features=1, bias=True)
)
Total parameters:  2991621



Epoch: 1


100%|██████████| 40/40 [00:06<00:00,  6.29it/s]


Train Loss : 0.691
Epoch: 2


100%|██████████| 40/40 [00:07<00:00,  5.45it/s]


Train Loss : 0.672
Epoch: 3


100%|██████████| 40/40 [00:06<00:00,  6.03it/s]


Train Loss : 0.619
Epoch: 4


100%|██████████| 40/40 [00:07<00:00,  5.45it/s]


Train Loss : 0.562
Epoch: 5


100%|██████████| 40/40 [00:06<00:00,  6.13it/s]


Train Loss : 0.515
Epoch: 6


100%|██████████| 40/40 [00:07<00:00,  5.49it/s]


Train Loss : 0.475
Epoch: 7


100%|██████████| 40/40 [00:07<00:00,  5.69it/s]


Train Loss : 0.442
Epoch: 8


100%|██████████| 40/40 [00:06<00:00,  6.12it/s]


Train Loss : 0.411
Epoch: 9


100%|██████████| 40/40 [00:07<00:00,  5.40it/s]


Train Loss : 0.378
Epoch: 10


100%|██████████| 40/40 [00:06<00:00,  6.13it/s]


Train Loss : 0.351
Epoch: 11


100%|██████████| 40/40 [00:07<00:00,  5.44it/s]


Train Loss : 0.328
Epoch: 12


100%|██████████| 40/40 [00:06<00:00,  6.14it/s]


Train Loss : 0.296
Epoch: 13


100%|██████████| 40/40 [00:07<00:00,  5.49it/s]


Train Loss : 0.271
Epoch: 14


100%|██████████| 40/40 [00:06<00:00,  5.87it/s]


Train Loss : 0.250
Epoch: 15


100%|██████████| 40/40 [00:06<00:00,  5.85it/s]


Train Loss : 0.218
Average epoch duration: 6.93 seconds.

Test Accuracy : 0.717

Classification Report : 
              precision    recall  f1-score   support

    negative       0.74      0.69      0.71      5104
    positive       0.70      0.75      0.72      4896

    accuracy                           0.72     10000
   macro avg       0.72      0.72      0.72     10000
weighted avg       0.72      0.72      0.72     10000


Confusion Matrix : 
[[3501 1603]
 [1227 3669]]
--------------------------------------------------------------------------------

Experimental configuration: {'experiment_name': '2Bi-LSTM', 'rnn_type': 'lstm', 'bidirectional': True, 'rnn_layers': 2}

Model:
BinaryClassificationModel(
  (embedding_layer): Embedding(29065, 100)
  (rnn): LSTM(100, 64, num_layers=2, batch_first=True, bidirectional=True)
  (linear): Linear(in_features=128, out_features=1, bias=True)
)
Total parameters:  3090949



Epoch: 1


100%|██████████| 40/40 [00:07<00:00,  5.16it/s]


Train Loss : 0.688
Epoch: 2


100%|██████████| 40/40 [00:06<00:00,  5.88it/s]


Train Loss : 0.638
Epoch: 3


100%|██████████| 40/40 [00:07<00:00,  5.23it/s]


Train Loss : 0.582
Epoch: 4


100%|██████████| 40/40 [00:07<00:00,  5.08it/s]


Train Loss : 0.527
Epoch: 5


100%|██████████| 40/40 [00:06<00:00,  5.87it/s]


Train Loss : 0.487
Epoch: 6


100%|██████████| 40/40 [00:07<00:00,  5.14it/s]


Train Loss : 0.454
Epoch: 7


100%|██████████| 40/40 [00:06<00:00,  5.74it/s]


Train Loss : 0.413
Epoch: 8


100%|██████████| 40/40 [00:07<00:00,  5.19it/s]


Train Loss : 0.385
Epoch: 9


100%|██████████| 40/40 [00:07<00:00,  5.09it/s]


Train Loss : 0.352
Epoch: 10


100%|██████████| 40/40 [00:06<00:00,  5.84it/s]


Train Loss : 0.315
Epoch: 11


100%|██████████| 40/40 [00:07<00:00,  5.07it/s]


Train Loss : 0.277
Epoch: 12


100%|██████████| 40/40 [00:06<00:00,  5.88it/s]


Train Loss : 0.244
Epoch: 13


100%|██████████| 40/40 [00:07<00:00,  5.22it/s]


Train Loss : 0.226
Epoch: 14


100%|██████████| 40/40 [00:07<00:00,  5.21it/s]


Train Loss : 0.185
Epoch: 15


100%|██████████| 40/40 [00:06<00:00,  5.83it/s]


Train Loss : 0.155
Average epoch duration: 7.41 seconds.

Test Accuracy : 0.721

Classification Report : 
              precision    recall  f1-score   support

    negative       0.72      0.74      0.73      5104
    positive       0.72      0.70      0.71      4896

    accuracy                           0.72     10000
   macro avg       0.72      0.72      0.72     10000
weighted avg       0.72      0.72      0.72     10000


Confusion Matrix : 
[[3768 1336]
 [1457 3439]]
--------------------------------------------------------------------------------

+------------+-------------+-------------+-------------+-------------+-------------+-------------+
|            |        1RNN |     1Bi-RNN |     2Bi-RNN |       1LSTM |    1Bi-LSTM |    2Bi-LSTM |
| Accuracy   | 0.7137      | 0.7155      | 0.7102      | 0.7131      | 0.717       | 0.7207      |
+------------+-------------+-------------+-------------+-------------+-------------+-------------+
| Parameters | 2.91719e+06 | 2.92788e+06 

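On this new dataset, more complex models again tend to perform slightly better, although all six accuracies lie within about one percentage point of each other (0.7102 to 0.7207). The exception is 2Bi-RNN, which at 0.7102 performs worse than the simpler 1RNN (0.7137) and 1Bi-RNN (0.7155). The most complex model, 2Bi-LSTM, achieves the best accuracy (0.7207).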
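To make the size of these differences concrete, the accuracies can be recomputed from the confusion matrices printed in the logs above. The following is a minimal sketch: the helper name `accuracy_from_confusion` is hypothetical, and the first matrix is assumed to belong to 1Bi-LSTM because it is the run logged immediately before 2Bi-LSTM and its 0.717 accuracy matches that column of the table.

```python
def accuracy_from_confusion(matrix):
    """Accuracy = correct predictions (the diagonal) / all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Confusion matrices copied from the logs (rows: true class, columns: predicted).
# Assumption: the first one is the 1Bi-LSTM run.
bi_lstm_1 = [[3501, 1603], [1227, 3669]]
bi_lstm_2 = [[3768, 1336], [1457, 3439]]

acc_1 = accuracy_from_confusion(bi_lstm_1)
acc_2 = accuracy_from_confusion(bi_lstm_2)

# Number of additional test reviews the 2-layer model classifies correctly.
extra_correct = (3768 + 3439) - (3501 + 3669)

print(f"1Bi-LSTM: {acc_1:.4f}  2Bi-LSTM: {acc_2:.4f}  extra correct: {extra_correct}")
```

Under these assumptions, the gap between the two Bi-LSTM models corresponds to 37 of the 10,000 test examples.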

Regarding time cost, the more complex a model is, the longer its average epoch runtime, although the differences on the GPU are small. The slowest model is again the most complex one, 2Bi-LSTM, at 7.41 seconds per epoch on average.

Using 2 layers instead of 1 lowers accuracy for the Bi-RNN models (0.7155 to 0.7102) but improves it for the Bi-LSTM models (0.717 to 0.7207), yielding the best model overall: 2Bi-LSTM.
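As a rough check on what "more complex" means here, the parameter counts reported in the logs can be reproduced by hand. The sketch below assumes the standard PyTorch layout for `nn.RNN` and `nn.LSTM` (one input weight matrix, one recurrent weight matrix and two bias vectors per gate and direction), and takes the vocabulary size 29065, embedding size 100 and hidden size 64 from the printed model. The function names `rnn_params` and `model_params` are hypothetical helpers, not part of the notebook's code.

```python
def rnn_params(input_dim, hidden_dim, layers, bidirectional, gates):
    """Weights and biases of a stacked RNN (gates=1) or LSTM (gates=4)."""
    directions = 2 if bidirectional else 1
    total = 0
    for layer in range(layers):
        # Layers after the first receive the concatenated outputs of both directions.
        in_dim = input_dim if layer == 0 else hidden_dim * directions
        # Per direction: input weights + recurrent weights + two bias vectors.
        per_direction = gates * (in_dim * hidden_dim + hidden_dim * hidden_dim + 2 * hidden_dim)
        total += directions * per_direction
    return total


def model_params(vocab_size, embedding_dim, hidden_dim, layers, bidirectional, gates):
    """Embedding + recurrent stack + single-output linear layer."""
    directions = 2 if bidirectional else 1
    embedding = vocab_size * embedding_dim
    recurrent = rnn_params(embedding_dim, hidden_dim, layers, bidirectional, gates)
    linear = hidden_dim * directions + 1  # weights + bias of Linear(..., 1)
    return embedding + recurrent + linear


one_rnn = model_params(29065, 100, 64, layers=1, bidirectional=False, gates=1)
two_bi_lstm = model_params(29065, 100, 64, layers=2, bidirectional=True, gates=4)
embedding_only = 29065 * 100

print(one_rnn, two_bi_lstm, embedding_only)
```

With these assumptions the sketch gives 2,917,189 parameters for 1RNN and 3,090,949 for 2Bi-LSTM, matching the logged totals. The embedding layer alone contributes 2,906,500 of them in every configuration, so the recurrent layers account for only a small share of each model's size.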