## Part 2. Model Training & Evaluation - RNN

Using the pretrained word embeddings from Part 1 and the dataset from Part 0, train a deep
learning model for topic classification on the training set. The model must conform to these
requirements:
- Use the pretrained word embeddings from Part 1 as inputs, together with your strategy for
mitigating the influence of OOV words; keep the embeddings as learnable parameters so that
they are updated during training.
- Design a simple recurrent neural network (RNN), taking the input word embeddings, and
predicting a topic label for each sentence. To do that, you need to consider how to aggregate
the word representations to represent a sentence.
- Use the validation set to gauge the performance of the model for each epoch during training.
You are required to use accuracy as the performance metric during validation and evaluation.
- Use the mini-batch strategy during training. You may choose any preferred optimizer (e.g.,
SGD, Adagrad, Adam, RMSprop). Be careful when you choose your initial learning rate and
mini-batch size. (You should use the validation set to determine the optimal configuration.)
Train the model until the accuracy score on the validation set has stopped increasing for a few
epochs.
- Try different regularization techniques to mitigate overfitting.
- Evaluate your trained model on the test dataset, observing the accuracy score.
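
The aggregation requirement above can be illustrated with a small NumPy sketch. It assumes the RNN outputs are right-padded to a common length and that the true sentence lengths are known; the function name `pool_hidden_states` and the array layout are illustrative assumptions, not the code in `models/RNN.py`. The three modes mirror the `"last"`, `"average"`, and `"max"` options searched later in this notebook.

```python
import numpy as np


def pool_hidden_states(hidden, lengths, mode):
    """Collapse per-token RNN outputs into one vector per sentence.

    hidden:  (batch, max_len, hidden_dim) RNN outputs, right-padded.
    lengths: (batch,) true sentence lengths, each at least 1.
    mode:    "last", "average", or "max".
    """
    hidden = np.asarray(hidden, dtype=float)
    lengths = np.asarray(lengths)
    batch, max_len, _ = hidden.shape
    # mask[b, t] is True for real tokens and False for padding.
    mask = np.arange(max_len)[None, :] < lengths[:, None]

    if mode == "last":
        # Hidden state at the final real token of each sentence.
        return hidden[np.arange(batch), lengths - 1]
    if mode == "average":
        # Mean over real tokens only; padded positions are zeroed out.
        summed = (hidden * mask[:, :, None]).sum(axis=1)
        return summed / lengths[:, None]
    if mode == "max":
        # Element-wise max over real tokens; padding is pushed to -inf.
        masked = np.where(mask[:, :, None], hidden, -np.inf)
        return masked.max(axis=1)
    raise ValueError(f"unknown mode: {mode}")
```

Masking matters for all three modes: without it, the "last" state of a short sentence would be computed on padding, and the mean or max could be dominated by padded positions.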

In [1]:
import json
import numpy as np
import random
import os
import lightning as L
from pathlib import Path
from torchtext import data, datasets
from utils.config import Config
from utils.train import train_rnn_model_with_parameters, train_rnn_model_with_parameters_regularzition
from utils.helper import SentenceDataset, collate_fn
from utils.analytics import load_tensorboard_logs
from models.RNN import RNN, RNNClassifier
from torch.utils.data import Dataset, DataLoader
from lightning.pytorch.callbacks import EarlyStopping, ModelCheckpoint



In [2]:
TEXT = data.Field(tokenize = 'spacy', tokenizer_language='en_core_web_sm', include_lengths=True)
LABEL = data.LabelField()

train_data, test_data = datasets.TREC.splits(TEXT, LABEL, fine_grained=False)

In [3]:
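# random.seed() returns None, so seed first and pass the generator state explicitly.
random.seed(Config.SEED)
train_data, valid_data = train_data.split(split_ratio=0.8, random_state=random.getstate())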

In [4]:
TEXT.build_vocab(train_data, vectors="glove.6B.300d")
LABEL.build_vocab(train_data)

### Import the embedding matrix and vocab index mapping (train data)

In [5]:
embedding_path = Path("models/embedding_matrix.npy")
index_from_word_path = Path("models/index_from_word.json")

embedding_matrix = np.load(embedding_path)
with index_from_word_path.open() as f:
    index_from_word = json.load(f)

### Dataset

In [6]:
train_dataset = SentenceDataset(train_data.examples, index_from_word, LABEL.vocab)
valid_dataset = SentenceDataset(valid_data.examples, index_from_word, LABEL.vocab)
test_dataset = SentenceDataset(test_data.examples, index_from_word, LABEL.vocab)        
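
`SentenceDataset` and `collate_fn` live in `utils.helper` and are not shown in this notebook. As a rough sketch of what such helpers typically do, the code below maps tokens to indices with an unknown-token fallback and right-pads a batch. The token names `<unk>` and `<pad>`, the padding id `0`, and both function names are assumptions made for illustration only.

```python
def encode_tokens(tokens, index_from_word, unk_token="<unk>"):
    """Map tokens to integer ids, falling back to the unknown-token id."""
    unk_id = index_from_word[unk_token]
    return [index_from_word.get(tok, unk_id) for tok in tokens]


def pad_batch(sequences, pad_id=0):
    """Right-pad id lists to a common length and return the true lengths."""
    lengths = [len(seq) for seq in sequences]
    max_len = max(lengths)
    padded = [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
    return padded, lengths


# Toy vocabulary (hypothetical); the real mapping is loaded from index_from_word.json.
toy_vocab = {"<pad>": 0, "<unk>": 1, "what": 2, "is": 3}
ids = encode_tokens(["what", "is", "zyx"], toy_vocab)  # "zyx" is out of vocabulary
batch, lengths = pad_batch([ids, [2]])
```

Keeping the true lengths alongside the padded batch lets the model ignore padded positions when it aggregates hidden states.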

### Train RNN Model

In [7]:
SEARCH_SPACE = {
    "batch_size": [32, 64, 128],
    "optimizer_name": ["RMSprop", "Adam"],
    "learning_rate": [5e-4, 1e-3, 5e-3, 1e-2],  # More reasonable range
    "hidden_dim": [32, 64, 128, 256],
    "num_layers": [1, 2, 4],
    "sentence_representation_type": ["last", "average", "max"],
}
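
For a sense of scale, the grid above can be enumerated with `itertools.product`. This sketch copies the search space into a local variable named `space` (an illustrative name) and counts the combinations; note that `product` iterates in dictionary key order, which may differ from the order used by a hand-written nested loop.

```python
from itertools import product

space = {
    "batch_size": [32, 64, 128],
    "optimizer_name": ["RMSprop", "Adam"],
    "learning_rate": [5e-4, 1e-3, 5e-3, 1e-2],
    "hidden_dim": [32, 64, 128, 256],
    "num_layers": [1, 2, 4],
    "sentence_representation_type": ["last", "average", "max"],
}

# One dict per hyperparameter combination.
keys = list(space)
configs = [dict(zip(keys, values)) for values in product(*space.values())]

print(len(configs))  # 3 * 2 * 4 * 4 * 3 * 3 = 864
```

With 864 full training runs, an exhaustive sweep is expensive; early stopping on validation accuracy keeps each run short.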

In [None]:
for hidden_dim in SEARCH_SPACE["hidden_dim"]:
    for num_layers in SEARCH_SPACE["num_layers"]:
        for optimizer_name in SEARCH_SPACE["optimizer_name"]:
            for batch_size in SEARCH_SPACE["batch_size"]:
                for learning_rate in SEARCH_SPACE["learning_rate"]:
                    for sentence_representation_type in SEARCH_SPACE["sentence_representation_type"]:
                        log_message = f"---------- batch_size_{batch_size}; lr_{learning_rate}; optimizer_{optimizer_name}; hidden_dim_{hidden_dim}; num_layers_{num_layers}; sentence_representation_{sentence_representation_type}  ----------"
                        print(log_message)
                        train_rnn_model_with_parameters(
                            embedding_matrix=embedding_matrix,
                            train_dataset=train_dataset,
                            val_dataset=valid_dataset,
                            batch_size=batch_size,
                            learning_rate=learning_rate,
                            optimizer_name=optimizer_name,
                            hidden_dim=hidden_dim,
                            num_layers=num_layers,
                            sentence_representation_type=sentence_representation_type,
                            show_progress=True,
                            freeze_embedding=False,
                        )
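
The body of `train_rnn_model_with_parameters` is not shown here. Assuming it stops training once validation accuracy has stalled for a few epochs, as the requirements state, the patience rule can be sketched in plain Python. The function name, the `patience` default, and the accuracy values below are illustrative assumptions, not results from this notebook and not Lightning's exact `EarlyStopping` implementation.

```python
def stopping_epoch(val_accuracies, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) under a simple patience rule.

    Training stops at the first epoch where validation accuracy has failed
    to improve by more than `min_delta` for `patience` consecutive epochs.
    """
    best_acc = float("-inf")
    best_epoch = -1
    epochs_without_gain = 0
    for epoch, acc in enumerate(val_accuracies):
        if acc > best_acc + min_delta:
            best_acc, best_epoch = acc, epoch
            epochs_without_gain = 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:
                return epoch, best_epoch
    # Patience never ran out: training used every epoch.
    return len(val_accuracies) - 1, best_epoch


# Made-up validation accuracies, for illustration only.
history = [0.50, 0.62, 0.70, 0.69, 0.70, 0.68, 0.71]
stop, best = stopping_epoch(history, patience=3)
```

Here training stops at epoch 5 and the best checkpoint is from epoch 2, which is why the metrics on the last progress-bar line of a run are not necessarily those of the best epoch.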

Seed set to 42


---------- batch_size_32; lr_0.0005; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


/home/linnsheng/Desktop/NTU/S3/Y1/NLP/SC4002/.venv/lib/python3.13/site-packages/lightning/pytorch/utilities/parsing.py:210: Attribute 'rnn_model' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['rnn_model'])`.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 18: 100%|██████████| 137/137 [00:03<00:00, 45.22it/s, v_num=0, train_loss=0.0145, train_acc=1.000, val_loss=2.350, val_acc=0.499] 


Seed set to 42


---------- batch_size_32; lr_0.0005; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


Epoch 9: 100%|██████████| 137/137 [00:02<00:00, 46.35it/s, v_num=0, train_loss=0.0252, train_acc=1.000, val_loss=0.703, val_acc=0.767]


Seed set to 42


---------- batch_size_32; lr_0.0005; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


Epoch 12: 100%|██████████| 137/137 [00:02<00:00, 45.79it/s, v_num=0, train_loss=0.126, train_acc=0.900, val_loss=1.920, val_acc=0.628]  


Seed set to 42


---------- batch_size_32; lr_0.001; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


Epoch 13: 100%|██████████| 137/137 [00:02<00:00, 64.81it/s, v_num=0, train_loss=0.0227, train_acc=1.000, val_loss=0.980, val_acc=0.735] 


Seed set to 42


---------- batch_size_32; lr_0.001; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


Epoch 13: 100%|██████████| 137/137 [00:02<00:00, 61.44it/s, v_num=0, train_loss=0.00331, train_acc=1.000, val_loss=1.110, val_acc=0.763] 


Seed set to 42


---------- batch_size_32; lr_0.001; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


Epoch 7: 100%|██████████| 137/137 [00:02<00:00, 65.45it/s, v_num=0, train_loss=0.00622, train_acc=1.000, val_loss=1.540, val_acc=0.655]


Seed set to 42


---------- batch_size_32; lr_0.005; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


Epoch 6: 100%|██████████| 137/137 [00:03<00:00, 37.17it/s, v_num=0, train_loss=0.205, train_acc=0.556, val_loss=2.360, val_acc=0.472] 


Seed set to 42


---------- batch_size_32; lr_0.005; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


Epoch 4: 100%|██████████| 137/137 [00:03<00:00, 40.46it/s, v_num=0, train_loss=0.000503, train_acc=1.000, val_loss=0.915, val_acc=0.718]


Seed set to 42


---------- batch_size_32; lr_0.005; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


Epoch 7: 100%|██████████| 137/137 [00:02<00:00, 49.77it/s, v_num=0, train_loss=0.00125, train_acc=1.000, val_loss=1.010, val_acc=0.786] 


Seed set to 42


---------- batch_size_32; lr_0.01; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


Epoch 4: 100%|██████████| 137/137 [00:02<00:00, 52.77it/s, v_num=0, train_loss=1.470, train_acc=0.500, val_loss=1.600, val_acc=0.214]


Seed set to 42


---------- batch_size_32; lr_0.01; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


Epoch 5: 100%|██████████| 137/137 [00:02<00:00, 51.35it/s, v_num=0, train_loss=0.0215, train_acc=1.000, val_loss=1.450, val_acc=0.669]  


Seed set to 42


---------- batch_size_32; lr_0.01; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


Epoch 7: 100%|██████████| 137/137 [00:02<00:00, 49.77it/s, v_num=0, train_loss=0.0082, train_acc=1.000, val_loss=1.060, val_acc=0.636]


Seed set to 42


---------- batch_size_64; lr_0.0005; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


Epoch 18: 100%|██████████| 69/69 [00:01<00:00, 63.68it/s, v_num=0, train_loss=0.00895, train_acc=1.000, val_loss=1.780, val_acc=0.582]


Seed set to 42


---------- batch_size_64; lr_0.0005; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


Epoch 9: 100%|██████████| 69/69 [00:01<00:00, 56.10it/s, v_num=0, train_loss=0.0748, train_acc=1.000, val_loss=0.664, val_acc=0.719]


Seed set to 42


---------- batch_size_64; lr_0.0005; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


Epoch 12: 100%|██████████| 69/69 [00:01<00:00, 60.90it/s, v_num=0, train_loss=0.117, train_acc=0.900, val_loss=1.390, val_acc=0.633] 


Seed set to 42


---------- batch_size_64; lr_0.001; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


Epoch 13: 100%|██████████| 69/69 [00:01<00:00, 63.12it/s, v_num=0, train_loss=0.00842, train_acc=1.000, val_loss=2.390, val_acc=0.516]


Seed set to 42


---------- batch_size_64; lr_0.001; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


Epoch 12: 100%|██████████| 69/69 [00:01<00:00, 60.34it/s, v_num=0, train_loss=0.00618, train_acc=1.000, val_loss=0.943, val_acc=0.756]


Seed set to 42


---------- batch_size_64; lr_0.001; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


Epoch 10: 100%|██████████| 69/69 [00:01<00:00, 56.83it/s, v_num=0, train_loss=0.00307, train_acc=1.000, val_loss=1.470, val_acc=0.646]


Seed set to 42


---------- batch_size_64; lr_0.005; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


Epoch 8: 100%|██████████| 69/69 [00:01<00:00, 61.88it/s, v_num=0, train_loss=0.00532, train_acc=1.000, val_loss=2.070, val_acc=0.503]


Seed set to 42


---------- batch_size_64; lr_0.005; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


Epoch 11: 100%|██████████| 69/69 [00:01<00:00, 56.60it/s, v_num=0, train_loss=0.00979, train_acc=1.000, val_loss=1.140, val_acc=0.732] 


Seed set to 42


---------- batch_size_64; lr_0.005; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


Epoch 4: 100%|██████████| 69/69 [00:01<00:00, 56.95it/s, v_num=0, train_loss=0.0047, train_acc=1.000, val_loss=1.230, val_acc=0.678] 


Seed set to 42


---------- batch_size_64; lr_0.01; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


Epoch 12: 100%|██████████| 69/69 [00:01<00:00, 62.97it/s, v_num=0, train_loss=0.379, train_acc=0.900, val_loss=2.420, val_acc=0.400]


Seed set to 42


---------- batch_size_64; lr_0.01; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


Epoch 5: 100%|██████████| 69/69 [00:01<00:00, 57.87it/s, v_num=0, train_loss=0.00154, train_acc=1.000, val_loss=0.989, val_acc=0.709] 


Seed set to 42


---------- batch_size_64; lr_0.01; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


Epoch 7: 100%|██████████| 69/69 [00:01<00:00, 56.89it/s, v_num=0, train_loss=0.0939, train_acc=1.000, val_loss=0.921, val_acc=0.717] 


Seed set to 42


---------- batch_size_128; lr_0.0005; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


Epoch 17: 100%|██████████| 35/35 [00:00<00:00, 59.58it/s, v_num=0, train_loss=0.0387, train_acc=1.000, val_loss=1.270, val_acc=0.547]


Seed set to 42


---------- batch_size_128; lr_0.0005; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


Epoch 12: 100%|██████████| 35/35 [00:00<00:00, 56.68it/s, v_num=0, train_loss=0.175, train_acc=0.900, val_loss=0.625, val_acc=0.700] 


Seed set to 42


---------- batch_size_128; lr_0.0005; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


Epoch 10: 100%|██████████| 35/35 [00:00<00:00, 58.44it/s, v_num=0, train_loss=0.0617, train_acc=1.000, val_loss=1.210, val_acc=0.556]


Seed set to 42


---------- batch_size_128; lr_0.001; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


Epoch 8: 100%|██████████| 35/35 [00:00<00:00, 53.87it/s, v_num=0, train_loss=0.0822, train_acc=1.000, val_loss=1.580, val_acc=0.457]


Seed set to 42


---------- batch_size_128; lr_0.001; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


Epoch 12: 100%|██████████| 35/35 [00:00<00:00, 47.08it/s, v_num=0, train_loss=0.0111, train_acc=1.000, val_loss=0.669, val_acc=0.727] 


Seed set to 42


---------- batch_size_128; lr_0.001; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


Epoch 13: 100%|██████████| 35/35 [00:00<00:00, 55.15it/s, v_num=0, train_loss=0.0188, train_acc=1.000, val_loss=1.470, val_acc=0.629] 


Seed set to 42


---------- batch_size_128; lr_0.005; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


Epoch 8: 100%|██████████| 35/35 [00:00<00:00, 59.88it/s, v_num=0, train_loss=0.0243, train_acc=1.000, val_loss=1.980, val_acc=0.466]


Seed set to 42


---------- batch_size_128; lr_0.005; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 7: 100%|██████████| 35/35 [00:00<00:00, 57.26it/s, v_num=0, train_loss=0.00447, train_acc=1.000, val_loss=0.977, val_acc=0.705]


Seed set to 42


---------- batch_size_128; lr_0.005; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 4: 100%|██████████| 35/35 [00:00<00:00, 61.20it/s, v_num=0, train_loss=0.00655, train_acc=1.000, val_loss=0.950, val_acc=0.644]


Seed set to 42


---------- batch_size_128; lr_0.01; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 19: 100%|██████████| 35/35 [00:00<00:00, 64.75it/s, v_num=0, train_loss=0.152, train_acc=1.000, val_loss=2.730, val_acc=0.409] 


Seed set to 42


---------- batch_size_128; lr_0.01; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 12: 100%|██████████| 35/35 [00:00<00:00, 52.32it/s, v_num=0, train_loss=0.000977, train_acc=1.000, val_loss=1.320, val_acc=0.676]


Seed set to 42


---------- batch_size_128; lr_0.01; optimizer_RMSprop; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 8: 100%|██████████| 35/35 [00:00<00:00, 56.82it/s, v_num=0, train_loss=0.0158, train_acc=1.000, val_loss=1.270, val_acc=0.608] 


Seed set to 42


---------- batch_size_32; lr_0.0005; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 11: 100%|██████████| 137/137 [00:02<00:00, 66.69it/s, v_num=0, train_loss=0.030, train_acc=1.000, val_loss=1.880, val_acc=0.525]  


Seed set to 42


---------- batch_size_32; lr_0.0005; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 15: 100%|██████████| 137/137 [00:02<00:00, 58.57it/s, v_num=0, train_loss=0.00342, train_acc=1.000, val_loss=0.818, val_acc=0.785]


Seed set to 42


---------- batch_size_32; lr_0.0005; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 7: 100%|██████████| 137/137 [00:02<00:00, 60.92it/s, v_num=0, train_loss=0.0506, train_acc=1.000, val_loss=1.390, val_acc=0.629]


Seed set to 42


---------- batch_size_32; lr_0.001; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 11: 100%|██████████| 137/137 [00:02<00:00, 61.24it/s, v_num=0, train_loss=0.00202, train_acc=1.000, val_loss=2.340, val_acc=0.548]


Seed set to 42


---------- batch_size_32; lr_0.001; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 6: 100%|██████████| 137/137 [00:02<00:00, 60.64it/s, v_num=0, train_loss=0.00772, train_acc=1.000, val_loss=0.643, val_acc=0.780]


Seed set to 42


---------- batch_size_32; lr_0.001; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 9: 100%|██████████| 137/137 [00:02<00:00, 51.17it/s, v_num=0, train_loss=0.00318, train_acc=1.000, val_loss=1.810, val_acc=0.645]


Seed set to 42


---------- batch_size_32; lr_0.005; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 7: 100%|██████████| 137/137 [00:02<00:00, 63.34it/s, v_num=0, train_loss=0.282, train_acc=0.733, val_loss=1.910, val_acc=0.492] 


Seed set to 42


---------- batch_size_32; lr_0.005; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 8: 100%|██████████| 137/137 [00:02<00:00, 58.05it/s, v_num=0, train_loss=0.0175, train_acc=1.000, val_loss=0.962, val_acc=0.778]  


Seed set to 42


---------- batch_size_32; lr_0.005; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 9: 100%|██████████| 137/137 [00:02<00:00, 59.91it/s, v_num=0, train_loss=0.000257, train_acc=1.000, val_loss=1.080, val_acc=0.757]


Seed set to 42


---------- batch_size_32; lr_0.01; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 6: 100%|██████████| 137/137 [00:02<00:00, 62.58it/s, v_num=0, train_loss=0.537, train_acc=0.917, val_loss=1.940, val_acc=0.320]


Seed set to 42


---------- batch_size_32; lr_0.01; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 6: 100%|██████████| 137/137 [00:02<00:00, 57.31it/s, v_num=0, train_loss=0.000329, train_acc=1.000, val_loss=1.140, val_acc=0.738]


Seed set to 42


---------- batch_size_32; lr_0.01; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 5: 100%|██████████| 137/137 [00:02<00:00, 58.90it/s, v_num=0, train_loss=0.149, train_acc=0.900, val_loss=1.120, val_acc=0.682]  


Seed set to 42


---------- batch_size_64; lr_0.0005; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 16: 100%|██████████| 69/69 [00:01<00:00, 61.66it/s, v_num=0, train_loss=0.0147, train_acc=1.000, val_loss=1.700, val_acc=0.563] 


Seed set to 42


---------- batch_size_64; lr_0.0005; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 11: 100%|██████████| 69/69 [00:01<00:00, 58.16it/s, v_num=0, train_loss=0.0242, train_acc=1.000, val_loss=0.647, val_acc=0.727]


Seed set to 42


---------- batch_size_64; lr_0.0005; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 29: 100%|██████████| 69/69 [00:01<00:00, 59.82it/s, v_num=0, train_loss=0.009, train_acc=1.000, val_loss=1.920, val_acc=0.634]  


Seed set to 42


---------- batch_size_64; lr_0.001; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 12: 100%|██████████| 69/69 [00:01<00:00, 58.84it/s, v_num=0, train_loss=0.00365, train_acc=1.000, val_loss=2.070, val_acc=0.522]


Seed set to 42


---------- batch_size_64; lr_0.001; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 11: 100%|██████████| 69/69 [00:01<00:00, 56.95it/s, v_num=0, train_loss=0.00917, train_acc=1.000, val_loss=0.688, val_acc=0.779]


Seed set to 42


---------- batch_size_64; lr_0.001; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 13: 100%|██████████| 69/69 [00:01<00:00, 63.71it/s, v_num=0, train_loss=0.0105, train_acc=1.000, val_loss=1.600, val_acc=0.649] 


Seed set to 42


---------- batch_size_64; lr_0.005; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 10: 100%|██████████| 69/69 [00:01<00:00, 63.66it/s, v_num=0, train_loss=0.00403, train_acc=1.000, val_loss=2.600, val_acc=0.492]


Seed set to 42


---------- batch_size_64; lr_0.005; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 10: 100%|██████████| 69/69 [00:01<00:00, 60.88it/s, v_num=0, train_loss=0.00037, train_acc=1.000, val_loss=0.885, val_acc=0.796] 


Seed set to 42


---------- batch_size_64; lr_0.005; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 9: 100%|██████████| 69/69 [00:01<00:00, 57.94it/s, v_num=0, train_loss=0.000982, train_acc=1.000, val_loss=1.490, val_acc=0.693]


Seed set to 42


---------- batch_size_64; lr_0.01; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 7: 100%|██████████| 69/69 [00:01<00:00, 63.24it/s, v_num=0, train_loss=0.129, train_acc=0.933, val_loss=1.840, val_acc=0.505] 


Seed set to 42


---------- batch_size_64; lr_0.01; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 8: 100%|██████████| 69/69 [00:01<00:00, 59.23it/s, v_num=0, train_loss=8.9e-5, train_acc=1.000, val_loss=0.906, val_acc=0.775]  


Seed set to 42


---------- batch_size_64; lr_0.01; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 9: 100%|██████████| 69/69 [00:01<00:00, 55.46it/s, v_num=0, train_loss=0.000335, train_acc=1.000, val_loss=1.210, val_acc=0.735]


Seed set to 42


---------- batch_size_128; lr_0.0005; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 15: 100%|██████████| 35/35 [00:00<00:00, 52.33it/s, v_num=0, train_loss=0.0262, train_acc=1.000, val_loss=1.590, val_acc=0.460]


Seed set to 42


---------- batch_size_128; lr_0.0005; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 23: 100%|██████████| 35/35 [00:00<00:00, 54.85it/s, v_num=0, train_loss=0.0248, train_acc=1.000, val_loss=0.667, val_acc=0.771]


Seed set to 42


---------- batch_size_128; lr_0.0005; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 30: 100%|██████████| 35/35 [00:00<00:00, 56.63it/s, v_num=0, train_loss=0.00302, train_acc=1.000, val_loss=1.510, val_acc=0.649]


Seed set to 42


---------- batch_size_128; lr_0.001; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 18: 100%|██████████| 35/35 [00:00<00:00, 59.41it/s, v_num=0, train_loss=0.00378, train_acc=1.000, val_loss=2.370, val_acc=0.488]


Seed set to 42


---------- batch_size_128; lr_0.001; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 22: 100%|██████████| 35/35 [00:00<00:00, 57.25it/s, v_num=0, train_loss=0.0112, train_acc=1.000, val_loss=0.868, val_acc=0.763] 


Seed set to 42


---------- batch_size_128; lr_0.001; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 20: 100%|██████████| 35/35 [00:00<00:00, 50.23it/s, v_num=0, train_loss=0.00209, train_acc=1.000, val_loss=1.770, val_acc=0.634]


Seed set to 42


---------- batch_size_128; lr_0.005; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 8: 100%|██████████| 35/35 [00:00<00:00, 60.47it/s, v_num=0, train_loss=0.00809, train_acc=1.000, val_loss=2.340, val_acc=0.494]


Seed set to 42


---------- batch_size_128; lr_0.005; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 8: 100%|██████████| 35/35 [00:00<00:00, 56.21it/s, v_num=0, train_loss=0.000933, train_acc=1.000, val_loss=0.840, val_acc=0.755]


Seed set to 42


---------- batch_size_128; lr_0.005; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 9: 100%|██████████| 35/35 [00:00<00:00, 52.98it/s, v_num=0, train_loss=0.00129, train_acc=1.000, val_loss=1.260, val_acc=0.694]


Seed set to 42


---------- batch_size_128; lr_0.01; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 7: 100%|██████████| 35/35 [00:00<00:00, 59.38it/s, v_num=0, train_loss=0.347, train_acc=0.900, val_loss=2.290, val_acc=0.456] 


Seed set to 42


---------- batch_size_128; lr_0.01; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 10: 100%|██████████| 35/35 [00:00<00:00, 52.73it/s, v_num=0, train_loss=0.000293, train_acc=1.000, val_loss=1.140, val_acc=0.762]


Seed set to 42


---------- batch_size_128; lr_0.01; optimizer_Adam; hidden_dim_32; num_layers_1; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.757     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 10: 100%|██████████| 35/35 [00:00<00:00, 50.92it/s, v_num=0, train_loss=0.000711, train_acc=1.000, val_loss=1.620, val_acc=0.641]


Seed set to 42


---------- batch_size_32; lr_0.0005; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 17: 100%|██████████| 137/137 [00:02<00:00, 62.29it/s, v_num=0, train_loss=0.0459, train_acc=1.000, val_loss=1.340, val_acc=0.745] 


Seed set to 42


---------- batch_size_32; lr_0.0005; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 13: 100%|██████████| 137/137 [00:03<00:00, 40.62it/s, v_num=0, train_loss=0.00792, train_acc=1.000, val_loss=0.746, val_acc=0.827]


Seed set to 42


---------- batch_size_32; lr_0.0005; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 8: 100%|██████████| 137/137 [00:02<00:00, 60.71it/s, v_num=0, train_loss=0.0147, train_acc=1.000, val_loss=0.693, val_acc=0.780]


Seed set to 42


---------- batch_size_32; lr_0.001; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 11: 100%|██████████| 137/137 [00:03<00:00, 43.85it/s, v_num=0, train_loss=0.0114, train_acc=1.000, val_loss=1.070, val_acc=0.703] 


Seed set to 42


---------- batch_size_32; lr_0.001; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 6: 100%|██████████| 137/137 [00:03<00:00, 40.31it/s, v_num=0, train_loss=0.00202, train_acc=1.000, val_loss=0.669, val_acc=0.784]


Seed set to 42


---------- batch_size_32; lr_0.001; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 5: 100%|██████████| 137/137 [00:03<00:00, 43.44it/s, v_num=0, train_loss=0.113, train_acc=0.917, val_loss=0.884, val_acc=0.765] 


Seed set to 42


---------- batch_size_32; lr_0.005; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 11: 100%|██████████| 137/137 [00:03<00:00, 44.91it/s, v_num=0, train_loss=0.051, train_acc=1.000, val_loss=2.720, val_acc=0.449] 


Seed set to 42


---------- batch_size_32; lr_0.005; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 5: 100%|██████████| 137/137 [00:03<00:00, 44.44it/s, v_num=0, train_loss=0.0423, train_acc=1.000, val_loss=1.660, val_acc=0.655] 


Seed set to 42


---------- batch_size_32; lr_0.005; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 7: 100%|██████████| 137/137 [00:03<00:00, 44.70it/s, v_num=0, train_loss=0.0352, train_acc=1.000, val_loss=0.835, val_acc=0.777] 


Seed set to 42


---------- batch_size_32; lr_0.01; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 9: 100%|██████████| 137/137 [00:03<00:00, 44.85it/s, v_num=0, train_loss=1.260, train_acc=0.500, val_loss=1.720, val_acc=0.238]


Seed set to 42


---------- batch_size_32; lr_0.01; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 5: 100%|██████████| 137/137 [00:02<00:00, 56.53it/s, v_num=0, train_loss=0.115, train_acc=0.733, val_loss=1.330, val_acc=0.654]  


Seed set to 42


---------- batch_size_32; lr_0.01; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 14: 100%|██████████| 137/137 [00:02<00:00, 54.78it/s, v_num=0, train_loss=0.0616, train_acc=1.000, val_loss=0.845, val_acc=0.731] 


Seed set to 42


---------- batch_size_64; lr_0.0005; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 11: 100%|██████████| 69/69 [00:01<00:00, 55.50it/s, v_num=0, train_loss=0.0354, train_acc=1.000, val_loss=0.918, val_acc=0.696]


Seed set to 42


---------- batch_size_64; lr_0.0005; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 11: 100%|██████████| 69/69 [00:01<00:00, 51.72it/s, v_num=0, train_loss=0.0686, train_acc=1.000, val_loss=0.659, val_acc=0.773]


Seed set to 42


---------- batch_size_64; lr_0.0005; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 15: 100%|██████████| 69/69 [00:01<00:00, 41.39it/s, v_num=0, train_loss=0.0174, train_acc=1.000, val_loss=0.720, val_acc=0.785] 


Seed set to 42


---------- batch_size_64; lr_0.001; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 11: 100%|██████████| 69/69 [00:01<00:00, 57.92it/s, v_num=0, train_loss=0.0697, train_acc=1.000, val_loss=1.480, val_acc=0.676] 


Seed set to 42


---------- batch_size_64; lr_0.001; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 14: 100%|██████████| 69/69 [00:01<00:00, 60.77it/s, v_num=0, train_loss=0.00102, train_acc=1.000, val_loss=0.781, val_acc=0.807]


Seed set to 42


---------- batch_size_64; lr_0.001; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 11: 100%|██████████| 69/69 [00:01<00:00, 44.96it/s, v_num=0, train_loss=0.00188, train_acc=1.000, val_loss=1.010, val_acc=0.778]


Seed set to 42


---------- batch_size_64; lr_0.005; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 5: 100%|██████████| 69/69 [00:01<00:00, 42.96it/s, v_num=0, train_loss=0.355, train_acc=0.875, val_loss=2.100, val_acc=0.421]


Seed set to 42


---------- batch_size_64; lr_0.005; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 7: 100%|██████████| 69/69 [00:01<00:00, 68.60it/s, v_num=0, train_loss=0.0767, train_acc=1.000, val_loss=1.450, val_acc=0.614] 


Seed set to 42


---------- batch_size_64; lr_0.005; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 13: 100%|██████████| 69/69 [00:00<00:00, 69.18it/s, v_num=0, train_loss=0.395, train_acc=0.667, val_loss=1.240, val_acc=0.749]   


Seed set to 42


---------- batch_size_64; lr_0.01; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 3: 100%|██████████| 69/69 [00:00<00:00, 69.31it/s, v_num=0, train_loss=1.620, train_acc=0.200, val_loss=1.660, val_acc=0.178]


Seed set to 42


---------- batch_size_64; lr_0.01; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 6: 100%|██████████| 69/69 [00:00<00:00, 71.86it/s, v_num=0, train_loss=0.00279, train_acc=1.000, val_loss=1.250, val_acc=0.643]


Seed set to 42


---------- batch_size_64; lr_0.01; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 12: 100%|██████████| 69/69 [00:01<00:00, 64.13it/s, v_num=0, train_loss=0.132, train_acc=0.900, val_loss=1.030, val_acc=0.710] 


Seed set to 42


---------- batch_size_128; lr_0.0005; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 15: 100%|██████████| 35/35 [00:00<00:00, 66.40it/s, v_num=0, train_loss=0.304, train_acc=0.833, val_loss=1.050, val_acc=0.621] 


Seed set to 42


---------- batch_size_128; lr_0.0005; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 17: 100%|██████████| 35/35 [00:00<00:00, 54.54it/s, v_num=0, train_loss=0.0279, train_acc=1.000, val_loss=0.625, val_acc=0.784]


Seed set to 42


---------- batch_size_128; lr_0.0005; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 11: 100%|██████████| 35/35 [00:00<00:00, 53.66it/s, v_num=0, train_loss=0.0262, train_acc=1.000, val_loss=0.644, val_acc=0.718]


Seed set to 42


---------- batch_size_128; lr_0.001; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 13: 100%|██████████| 35/35 [00:00<00:00, 58.94it/s, v_num=0, train_loss=0.0244, train_acc=1.000, val_loss=1.150, val_acc=0.681]


Seed set to 42


---------- batch_size_128; lr_0.001; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 15: 100%|██████████| 35/35 [00:00<00:00, 67.32it/s, v_num=0, train_loss=0.00511, train_acc=1.000, val_loss=0.756, val_acc=0.759]


Seed set to 42


---------- batch_size_128; lr_0.001; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_max  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 9: 100%|██████████| 35/35 [00:00<00:00, 59.54it/s, v_num=0, train_loss=0.00857, train_acc=1.000, val_loss=0.798, val_acc=0.732]


Seed set to 42


---------- batch_size_128; lr_0.005; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_last  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 5: 100%|██████████| 35/35 [00:00<00:00, 68.52it/s, v_num=0, train_loss=0.714, train_acc=0.854, val_loss=2.670, val_acc=0.365]


Seed set to 42


---------- batch_size_128; lr_0.005; optimizer_RMSprop; hidden_dim_32; num_layers_2; sentence_representation_average  ----------


GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.4 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.4 M     Trainable params
0         Non-trainable params
2.4 M     Total params
9.765     Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


### Model Configuration Comparison

In [None]:
train_results_df = load_tensorboard_logs(log_dir="tb_logs/rnn")

In [None]:
rmsprop_sorted_df = (
    train_results_df[train_results_df["optimizer_name"] == "RMSprop"]
    .sort_values(by=["val_acc"], ascending=False)
    .reset_index(drop=True)
)
rmsprop_sorted_df.head(n=50)

In [None]:
train_results_df = train_results_df.sort_values(
    by=["val_acc"], ascending=False
).reset_index(drop=True)
train_results_df.head(20)

| | val_acc | batch_size | hidden_dim | learning_rate | optimizer_name | train_loss | train_acc | num_layers | sentence_representation_type | freeze | epoch | val_loss | filename |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.741048 | 32 | 256 | 0.01 | RMSprop | 0.689391 | 0.753968 | 1 | average | False | 42.0 | 0.694172 | events.out.tfevents.1762253482.ArchThinkpadT14... |
| 1 | 0.739651 | 32 | 128 | 0.01 | Adam | 0.444329 | 0.88254 | 1 | max | False | 32.0 | 0.695786 | events.out.tfevents.1762239394.ArchThinkpadT14... |
| 2 | 0.739651 | 32 | 128 | 0.01 | RMSprop | 0.444329 | 0.88254 | 1 | max | False | 32.0 | 0.695786 | events.out.tfevents.1762236381.ArchThinkpadT14... |
| 3 | 0.721671 | 32 | 32 | 0.01 | RMSprop | 0.449705 | 0.885 | 2 | max | False | 42.0 | 0.766004 | events.out.tfevents.1762208799.ArchThinkpadT14... |
| 4 | 0.721671 | 32 | 32 | 0.01 | Adam | 0.449705 | 0.885 | 2 | max | False | 42.0 | 0.766004 | events.out.tfevents.1762211207.ArchThinkpadT14... |
| 5 | 0.718828 | 32 | 128 | 0.01 | RMSprop | 1.013844 | 0.502778 | 1 | average | False | 47.0 | 0.746652 | events.out.tfevents.1762236229.ArchThinkpadT14... |
| 6 | 0.718828 | 32 | 128 | 0.01 | Adam | 1.013844 | 0.502778 | 1 | average | False | 47.0 | 0.746652 | events.out.tfevents.1762239250.ArchThinkpadT14... |
| 7 | 0.706503 | 64 | 256 | 0.01 | RMSprop | 0.465816 | 0.732843 | 1 | average | False | 63.0 | 0.702203 | events.out.tfevents.1762254632.ArchThinkpadT14... |
| 8 | 0.701665 | 32 | 256 | 0.01 | RMSprop | 0.929195 | 0.769444 | 1 | last | False | 36.0 | 0.83259 | events.out.tfevents.1762253362.ArchThinkpadT14... |
| 9 | 0.698184 | 32 | 128 | 0.01 | RMSprop | 0.661537 | 0.711111 | 2 | max | False | 26.0 | 0.80168 | events.out.tfevents.1762242385.ArchThinkpadT14... |


### a) Report the final configuration of your best model, namely the number of training epochs, learning rate, optimizer, batch size and hidden dimension.

In [None]:
train_results_df.head(1)

| | val_acc | batch_size | hidden_dim | learning_rate | optimizer_name | train_loss | train_acc | num_layers | sentence_representation_type | freeze | epoch | val_loss | filename |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.741048 | 32 | 256 | 0.01 | RMSprop | 0.689391 | 0.753968 | 1 | average | False | 42.0 | 0.694172 | events.out.tfevents.1762253482.ArchThinkpadT14... |


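Answer: The best model (validation accuracy 0.741) uses a batch size of 32, a hidden dimension of 256, a learning rate of 0.01 and the RMSprop optimizer, with 42 training epochs. It has a single RNN layer and uses the average of the hidden states as the sentence representation.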

### b) Report all the regularization strategies you have tried. Compare the accuracy on the test set among all strategies and the one without any regularization.

In [None]:
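# Use about half of the CPU cores, but at least one worker:
# os.cpu_count() can return None, and persistent_workers=True requires num_workers > 0.
num_workers = max(1, (os.cpu_count() or 2) // 2)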

train_dataloader = DataLoader(
    train_dataset,
    batch_size=32,
    shuffle=True,
    num_workers=num_workers,
    collate_fn=collate_fn,
    multiprocessing_context='spawn',
    persistent_workers=True,
)

valid_dataloader = DataLoader(
    valid_dataset,  
    batch_size=32,
    shuffle=False,
    num_workers=num_workers,
    collate_fn=collate_fn,
    multiprocessing_context='spawn',
    persistent_workers=True,
)

test_dataloader = DataLoader(  
    test_dataset,  
    batch_size=32,
    shuffle=False,
    num_workers=num_workers,
    collate_fn=collate_fn,
    multiprocessing_context='spawn',
    persistent_workers=True,
)

L.seed_everything(Config.SEED)

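# Callbacks shared by the early-stopping runs below.
# Note: callback objects keep internal state (e.g. best score and patience counter),
# so reusing the same instances across several Trainers may carry that state over;
# building a fresh list per run avoids this.
callbacks = [
    # Stop once val_acc has not improved by at least 1e-4 for 3 consecutive validation checks.
    EarlyStopping(
        monitor="val_acc",
        mode="max",
        patience=3,
        min_delta=1e-4,
    ),
    # Keep only the checkpoint with the lowest validation loss.
    ModelCheckpoint(
        monitor="val_loss",
        save_top_k=1,
        mode="min",
    ),
]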

Seed set to 42
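
To make the stopping rule concrete, here is a minimal pure-Python sketch of patience-based early stopping with the same settings as the `EarlyStopping` callback above (`mode="max"`, `patience=3`, `min_delta=1e-4`). `PatienceStopper`, `epochs_run` and the accuracy values are hypothetical names and numbers chosen for illustration; this is not Lightning's implementation. The second half shows what can happen when one stateful stopper object is reused for a new run, which may be relevant to the runs below that stop at epoch 0, although the logs do not confirm the cause.

```python
from dataclasses import dataclass


@dataclass
class PatienceStopper:
    """Toy patience-based stopper for a metric that should increase (mode="max")."""

    patience: int = 3
    min_delta: float = 1e-4
    best: float = float("-inf")
    wait: int = 0

    def update(self, value: float) -> bool:
        """Record one validation score; return True when training should stop."""
        if value > self.best + self.min_delta:
            # Meaningful improvement: remember it and reset the counter.
            self.best = value
            self.wait = 0
        else:
            # No improvement at this validation check.
            self.wait += 1
        return self.wait >= self.patience


def epochs_run(scores, stopper):
    """Return how many epochs are consumed before the stopper fires."""
    for epoch, score in enumerate(scores, start=1):
        if stopper.update(score):
            return epoch
    return len(scores)


# Hypothetical validation accuracies, for illustration only.
scores = [0.60, 0.70, 0.69, 0.70, 0.68, 0.71]

# A fresh stopper tolerates three non-improving epochs before stopping.
fresh_epochs = epochs_run(scores, PatienceStopper())

# Reusing one stopper keeps `best` and `wait` from the first run,
# so the second run stops after its very first validation check.
shared = PatienceStopper()
first_run_epochs = epochs_run(scores, shared)
second_run_epochs = epochs_run(scores, shared)

print(fresh_epochs, first_run_epochs, second_run_epochs)
```

A simple way to avoid carried-over state is to construct new callback objects for every `Trainer`, for example through a small helper function that returns a fresh list.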


#### Without Regularization

In [None]:
rnn_model = RNN(
    embedding_matrix=embedding_matrix,
    hidden_dim=256,
    num_layers=1,
    sentence_representation_type="average",
    output_dim=6,
    freeze_embedding=False,
)
model = RNNClassifier(rnn_model, lr=0.01, optimizer_name="RMSprop", show_progress=True)

/home/linnsheng/Desktop/NTU/S3/Y1/NLP/SC4002/.venv/lib/python3.13/site-packages/lightning/pytorch/utilities/parsing.py:210: Attribute 'rnn_model' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['rnn_model'])`.


In [None]:
trainer = L.Trainer(
    max_epochs=50,
    accelerator="auto",
    gradient_clip_val=1.0,
)

trainer.fit(model, train_dataloaders=train_dataloader, val_dataloaders=valid_dataloader)

GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.6 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.6 M     Trainable params
0         Non-trainable params
2.6 M     Total params
10.418    Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 49: 100%|██████████| 110/110 [00:02<00:00, 44.52it/s, v_num=13, train_loss=0.490, train_acc=0.333, val_loss=0.735, val_acc=0.715] 

`Trainer.fit` stopped: `max_epochs=50` reached.




In [None]:
trainer.test(model=model, dataloaders=test_dataloader)

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Testing DataLoader 0: 100%|██████████| 16/16 [00:00<00:00, 104.40it/s]


[{'test_loss': 0.6915724873542786, 'test_acc': 0.7517745494842529}]
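
The baseline trainer above sets `gradient_clip_val=1.0`, which in Lightning clips by gradient norm by default. The sketch below illustrates that idea on a flat list of gradient values in plain Python; `clip_by_global_norm` is a hypothetical helper written for this example, not the library routine. Note that the "L2" and "DropOut" trainers later in this section do not set `gradient_clip_val`, so their results are not a like-for-like comparison with this baseline.

```python
import math


def clip_by_global_norm(grads, max_norm=1.0, eps=1e-6):
    """Rescale gradients so that their L2 norm does not exceed max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    # Scale factor is capped at 1.0, so small gradients are left untouched.
    scale = min(1.0, max_norm / (total_norm + eps))
    return [g * scale for g in grads], total_norm


# A large gradient (norm 5) is shrunk to roughly unit norm.
clipped, norm_before = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
norm_after = math.sqrt(sum(g * g for g in clipped))

# A small gradient (norm 0.5) passes through unchanged.
unchanged, _ = clip_by_global_norm([0.3, 0.4], max_norm=1.0)

print(norm_before, norm_after, unchanged)
```

Clipping keeps the direction of the update and only limits its size, which is commonly used to tame exploding gradients in recurrent networks.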

#### Early Stopping

In [None]:
rnn_model = RNN(
    embedding_matrix=embedding_matrix,
    hidden_dim=256,
    num_layers=1,
    sentence_representation_type="average",
    output_dim=6,
    freeze_embedding=False,
)
model = RNNClassifier(rnn_model, lr=0.01, optimizer_name="RMSprop")

In [None]:
trainer = L.Trainer(
    max_epochs=50,
    callbacks=callbacks,
    accelerator="auto",
    devices="auto",
    gradient_clip_val=1.0,
)

trainer.fit(model, train_dataloaders=train_dataloader, val_dataloaders=valid_dataloader)

GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]



  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.6 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.6 M     Trainable params
0         Non-trainable params
2.6 M     Total params
10.418    Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 49: 100%|██████████| 137/137 [00:02<00:00, 51.45it/s, v_num=8]        

`Trainer.fit` stopped: `max_epochs=50` reached.




In [None]:
trainer.test(model=model, dataloaders=test_dataloader)

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Testing DataLoader 0: 100%|██████████| 16/16 [00:00<00:00, 148.04it/s]


[{'test_loss': 0.6336861252784729, 'test_acc': 0.7604537606239319}]

#### L2 + Early Stopping

In [None]:
rnn_model = RNN(
    embedding_matrix=embedding_matrix,
    hidden_dim=256,
    num_layers=1,
    sentence_representation_type="average",
    output_dim=6,
    freeze_embedding=False,
)
model = RNNClassifier(rnn_model, lr=0.01, optimizer_name="RMSprop", weight_decay=1e-5, show_progress=True)
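
As a rough illustration of what `weight_decay` does, the sketch below applies one update step in which the L2 penalty is added to the gradient as `weight_decay * w`, which is how PyTorch optimizers treat this argument. It uses plain SGD for readability rather than RMSprop, and it assumes that `RNNClassifier` forwards `weight_decay` to the optimizer; `step_with_weight_decay` is a hypothetical helper, not part of the project code.

```python
def step_with_weight_decay(weights, grads, lr=0.01, weight_decay=1e-5):
    """One plain-SGD step where the L2 penalty enters through the gradient."""
    return [
        w - lr * (g + weight_decay * w)  # data gradient plus decay term
        for w, g in zip(weights, grads)
    ]


# With zero data gradient, only the decay term acts and pulls weights toward 0.
# The large lr and weight_decay here are illustrative values, not the run's settings.
shrunk = step_with_weight_decay([1.0, -2.0], [0.0, 0.0], lr=0.1, weight_decay=0.5)

# With weight_decay=0 the step reduces to ordinary gradient descent.
plain = step_with_weight_decay([1.0, -2.0], [0.5, 0.5], lr=0.1, weight_decay=0.0)

print(shrunk, plain)
```

Because the decay term scales with the weight itself, it also acts on the embedding matrix when the embeddings are trainable, as they are here with `freeze_embedding=False`.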

In [None]:
trainer = L.Trainer(
    max_epochs=50,
    callbacks=callbacks,
    accelerator="auto",
    devices="auto",
)

trainer.fit(model, train_dataloaders=train_dataloader, val_dataloaders=valid_dataloader)

GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.6 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.6 M     Trainable params
0         Non-trainable params
2.6 M     Total params
10.418    Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 21: 100%|██████████| 110/110 [00:02<00:00, 41.82it/s, v_num=14, train_loss=2.540, train_acc=0.000, val_loss=2.260, val_acc=0.270]


In [None]:
trainer.test(model=model, dataloaders=test_dataloader)

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Testing DataLoader 0: 100%|██████████| 16/16 [00:00<00:00, 128.46it/s]


[{'test_loss': 2.2802817821502686, 'test_acc': 0.2334180325269699}]

#### Dropout + Early Stopping

In [None]:
rnn_model = RNN(
    embedding_matrix=embedding_matrix,
    hidden_dim=256,
    num_layers=1,
    sentence_representation_type="average",
    output_dim=6,
    freeze_embedding=False,
    dropout=0.5,
    embedding_dropout=0.2,
)
model = RNNClassifier(rnn_model, lr=0.01, optimizer_name="RMSprop")
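
The `dropout=0.5` and `embedding_dropout=0.2` arguments above presumably map to standard dropout layers inside the project's `RNN` class; that is an assumption, since the class is not shown here. The sketch below shows the usual "inverted dropout" rule on a plain list: each value is zeroed with probability `p` during training and the survivors are scaled by `1 / (1 - p)`, while evaluation leaves the input untouched. `inverted_dropout` is a hypothetical helper for illustration.

```python
import random


def inverted_dropout(values, p=0.5, rng=None, training=True):
    """Zero each value with probability p and rescale survivors by 1 / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        # Evaluation mode (or p = 0): identity.
        return list(values)
    rng = rng or random.Random()
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]


activations = [1.0] * 1000

# Training mode: roughly half the entries become 0, the rest become 2.0,
# so the expected value of each entry stays at 1.0.
dropped = inverted_dropout(activations, p=0.5, rng=random.Random(0))
mean_after = sum(dropped) / len(dropped)

# Evaluation mode: nothing is dropped or rescaled.
evaluated = inverted_dropout(activations, p=0.5, training=False)

print(mean_after, evaluated[:3])
```

The rescaling is what allows the same weights to be used at test time without any correction.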

In [None]:
trainer = L.Trainer(
    max_epochs=50,
    callbacks=callbacks,
    accelerator="auto",
    devices="auto",
)

trainer.fit(model, train_dataloaders=train_dataloader, val_dataloaders=valid_dataloader)

Trainer already configured with model summary callbacks: [<class 'lightning.pytorch.callbacks.model_summary.ModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
/home/linnsheng/Desktop/NTU/S3/Y1/NLP/SC4002/.venv/lib/python3.13/site-packages/lightning/pytorch/callbacks/model_checkpoint.py:751: Checkpoint directory /home/linnsheng/Desktop/NTU/S3/Y1/NLP/SC4002/lightning_logs/version_14/checkpoints exists and is not empty.
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]



  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.6 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.6 M     Trainable params
0         Non-trainable params
2.6 M     Total params
10.418    Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 0: 100%|██████████| 110/110 [00:09<00:00, 11.58it/s, v_num=15]       


In [None]:
trainer.test(model=model, dataloaders=test_dataloader)

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Testing DataLoader 0: 100%|██████████| 16/16 [00:00<00:00, 85.10it/s]


[{'test_loss': 1.7479004859924316, 'test_acc': 0.18720002472400665}]

#### L2 + Dropout + Early Stopping

In [None]:
rnn_model = RNN(
    embedding_matrix=embedding_matrix,
    hidden_dim=256,
    num_layers=1,
    sentence_representation_type="average",
    output_dim=6,
    freeze_embedding=False,
    dropout=0.5,
    embedding_dropout=0.2,
)
model = RNNClassifier(rnn_model, lr=0.01, optimizer_name="RMSprop", weight_decay=1e-4)

In [None]:
trainer = L.Trainer(
    max_epochs=50,
    callbacks=callbacks,
    accelerator="auto",
    devices="auto",
)
trainer.fit(model, train_dataloaders=train_dataloader, val_dataloaders=valid_dataloader)

Trainer already configured with model summary callbacks: [<class 'lightning.pytorch.callbacks.model_summary.ModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.6 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.6 M     Trainable params
0         Non-trainable params
2.6 M     Total params
10.418    Total estimated model params size (MB)
9         Modules in train mode
0         Modules in eval mode


Epoch 0: 100%|██████████| 110/110 [00:02<00:00, 53.77it/s, v_num=16]        


In [None]:
trainer.test(model=model, dataloaders=test_dataloader)

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Testing DataLoader 0: 100%|██████████| 16/16 [00:00<00:00, 93.94it/s] 


[{'test_loss': 1.7192174196243286, 'test_acc': 0.34235653281211853}]
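
To line the strategies up side by side, the sketch below collects the `test_acc` values reported by the `trainer.test` calls in this section and prints them sorted, together with the difference from the unregularized baseline. The numbers are copied from the outputs above; the dictionary keys and variable names are only labels chosen for this example.

```python
# Test accuracies copied from the trainer.test outputs in this section.
results = {
    "No regularization": 0.7517745494842529,
    "Early stopping": 0.7604537606239319,
    "L2 + early stopping": 0.2334180325269699,
    "Dropout + early stopping": 0.18720002472400665,
    "L2 + dropout + early stopping": 0.34235653281211853,
}

baseline = results["No regularization"]

# Sort strategies from best to worst test accuracy.
ranking = sorted(results.items(), key=lambda item: item[1], reverse=True)

print(f"{'strategy':<32}{'test_acc':>10}{'vs. baseline':>14}")
for name, acc in ranking:
    print(f"{name:<32}{acc:>10.4f}{acc - baseline:>+14.4f}")
```

When reading this ranking, keep in mind that the runs differ in more than the regularizer: the logs show the L2 run ending at epoch 21 and both dropout runs ending at epoch 0, and those three trainers were created without `gradient_clip_val`. The low accuracies may therefore reflect training that stopped early or diverged rather than the regularization technique itself.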