## Part 2. Model Training & Evaluation - RNN

Now, with the pretrained word embeddings from Part 1 and the dataset from Part 0, you need to train
a deep learning model for topic classification on the training set, conforming to these
requirements:
- Use the pretrained word embeddings from Part 1 as inputs, together with your method for
mitigating the influence of OOV words; make the embeddings learnable parameters so that they are
updated during training.
- Design a simple recurrent neural network (RNN), taking the input word embeddings, and
predicting a topic label for each sentence. To do that, you need to consider how to aggregate
the word representations to represent a sentence.
- Use the validation set to gauge the performance of the model for each epoch during training.
You are required to use accuracy as the performance metric during validation and evaluation.
- Use mini-batch training. You may choose any optimizer (e.g., SGD, Adagrad, Adam, RMSprop).
Choose the initial learning rate and mini-batch size carefully, using the validation set to
determine the best configuration. Train the model until the accuracy on the validation set has
not improved for a few epochs.
- Try different regularization techniques to mitigate overfitting.
- Evaluate your trained model on the test dataset, observing the accuracy score.
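
The aggregation step mentioned in the second requirement can be sketched as follows. This is a minimal NumPy illustration, not the code in `utils`: the function name `pool_hidden_states` and the assumption that padding sits at the end of each sequence are introduced here. It shows three common strategies (last state, average, max), each of which ignores padded positions.

```python
import numpy as np

def pool_hidden_states(hidden, lengths, mode):
    """Collapse per-token RNN outputs into one vector per sentence.

    hidden:  (batch, max_len, hidden_dim) array of RNN outputs, right-padded.
    lengths: (batch,) true sentence lengths, each at least 1.
    mode:    "last", "average", or "max".
    """
    hidden = np.asarray(hidden, dtype=float)
    lengths = np.asarray(lengths)
    batch, max_len, _ = hidden.shape
    # mask[i, t] is True for real tokens and False for padding.
    mask = np.arange(max_len)[None, :] < lengths[:, None]

    if mode == "last":
        # Output at the last real token, not at the last padded position.
        return hidden[np.arange(batch), lengths - 1]
    if mode == "average":
        summed = (hidden * mask[:, :, None]).sum(axis=1)
        return summed / lengths[:, None]
    if mode == "max":
        # Padding is set to -inf so it can never win the max.
        masked = np.where(mask[:, :, None], hidden, -np.inf)
        return masked.max(axis=1)
    raise ValueError(f"unknown mode: {mode}")

# Two sentences of lengths 3 and 2 with hidden_dim = 2. The last row of the
# second sentence is padding and must not affect any of the results.
hidden = np.array([
    [[1.0, 0.0], [2.0, 4.0], [3.0, 2.0]],
    [[5.0, 1.0], [1.0, 3.0], [9.0, 9.0]],
])
lengths = np.array([3, 2])
last_vec = pool_hidden_states(hidden, lengths, "last")
avg_vec = pool_hidden_states(hidden, lengths, "average")
max_vec = pool_hidden_states(hidden, lengths, "max")
```

Without the mask, the padded row `[9.0, 9.0]` would dominate the max and skew the average of the second sentence, which is why the true lengths are carried alongside the batch.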

In [1]:
import json
import numpy as np
import random
import itertools
from pathlib import Path
from torchtext import data, datasets
from torch.utils.data import TensorDataset, DataLoader
from utils.config import Config
from utils.train import train_rnn_model_with_parameters
from utils.helper import SentenceDataset, collate_fn


In [2]:
TEXT = data.Field(tokenize = 'spacy', tokenizer_language='en_core_web_sm', include_lengths=True)
LABEL = data.LabelField()

train_data, test_data = datasets.TREC.splits(TEXT, LABEL, fine_grained=False)

In [3]:
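# random.seed() returns None, so seed first and pass the resulting RNG state explicitly.
random.seed(Config.SEED)
train_data, valid_data = train_data.split(split_ratio=0.8, random_state=random.getstate())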

In [4]:
TEXT.build_vocab(train_data, vectors="glove.6B.300d")
LABEL.build_vocab(train_data)

### Import the embedding matrix and vocab index mapping (train data)

In [5]:
embedding_path = Path("models/embedding_matrix.npy")
index_from_word_path = Path("models/index_from_word.json")

embedding_matrix = np.load(embedding_path)
with index_from_word_path.open() as f:
    index_from_word = json.load(f)

In [6]:
train_dataset = SentenceDataset(train_data.examples, index_from_word, LABEL.vocab)
valid_dataset = SentenceDataset(valid_data.examples, index_from_word, LABEL.vocab)
test_dataset = SentenceDataset(test_data.examples, index_from_word, LABEL.vocab)        
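
How `SentenceDataset` turns tokens into indices is not shown in this notebook. As a rough sketch of the usual pattern, assuming `index_from_word` maps each known token to a row of `embedding_matrix` and reserves an entry for unknown words, the lookup could look like the following. The `<unk>` key and the `encode_tokens` helper are hypothetical names used only for illustration.

```python
def encode_tokens(tokens, index_from_word, unk_token="<unk>"):
    """Map tokens to embedding-row indices, sending OOV tokens to unk_token.

    The "<unk>" key is an assumption for this sketch; the real mapping from
    Part 1 may use a different name or a different OOV strategy.
    """
    unk_index = index_from_word[unk_token]
    return [index_from_word.get(token, unk_index) for token in tokens]

# Toy mapping standing in for models/index_from_word.json.
toy_index_from_word = {"<pad>": 0, "<unk>": 1, "what": 2, "is": 3, "nlp": 4}
encoded = encode_tokens(["what", "is", "quokka"], toy_index_from_word)
print(encoded)  # [2, 3, 1]
```

Because validation and test sentences can contain words never seen in training, the same fallback has to be applied to all three splits.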

### Hyperparameter search

In [7]:
SEARCH_SPACE = {
    "batch_size": [32, 64, 128, 256, 512, 1024, 2048],
    "learning_rate": [1e-5, 1e-4, 1e-3, 1e-2, 1e-1],
    "optimizer_name": ["SGD", "Adagrad", "RMSprop", "Adam"],
    "hidden_dim": [256, 128, 64, 32],
    "num_layers": [1, 2, 4],
    "sentence_representation_type": ["last", "average", "max"],
}
all_combinations = list(itertools.product(
    SEARCH_SPACE["batch_size"],
    SEARCH_SPACE["learning_rate"],
    SEARCH_SPACE["optimizer_name"],
    SEARCH_SPACE["hidden_dim"],
    SEARCH_SPACE["num_layers"],
    SEARCH_SPACE["sentence_representation_type"]
))
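
To get a sense of how large this grid is, the number of configurations is the product of the list lengths. The sketch below recomputes it from a copy of `SEARCH_SPACE` so that it runs on its own; `num_configurations` is a name introduced here.

```python
import itertools
import math

# Copy of the search space defined above, so this block is self-contained.
SEARCH_SPACE = {
    "batch_size": [32, 64, 128, 256, 512, 1024, 2048],
    "learning_rate": [1e-5, 1e-4, 1e-3, 1e-2, 1e-1],
    "optimizer_name": ["SGD", "Adagrad", "RMSprop", "Adam"],
    "hidden_dim": [256, 128, 64, 32],
    "num_layers": [1, 2, 4],
    "sentence_representation_type": ["last", "average", "max"],
}

# Size of the Cartesian product = product of the individual list lengths.
num_configurations = math.prod(len(values) for values in SEARCH_SPACE.values())

# Cross-check against an explicit enumeration of the product.
assert num_configurations == len(list(itertools.product(*SEARCH_SPACE.values())))
print(num_configurations)  # 7 * 5 * 4 * 4 * 3 * 3 = 5040
```

An exhaustive sweep therefore means 5040 separate training runs. If that is too expensive, sampling a random subset of `all_combinations` is a common, cheaper alternative.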

In [8]:
# train_rnn_model_with_parameters(
#     embedding_matrix=embedding_matrix,
#     train_dataset=train_dataset,
#     val_dataset=valid_dataset,
#     batch_size=128,
#     learning_rate=5e-4,
#     optimizer_name='Adam',
#     hidden_dim=128,
#     num_layers=1,
#     sentence_representation_type='average',
#     freeze_embedding=False,
#     show_progress=True,
# )

In [9]:
# for batch_size, lr, optimizer_name, hidden_dim, num_layers, sr_type in all_combinations:
#     print(f"Training with configuration: batch_size={batch_size}, lr={lr}, optimizer={optimizer_name}, "
#           f"hidden_dim={hidden_dim}, num_layers={num_layers}, sentence_repr={sr_type}")

#     train_rnn_model_with_parameters(
#         embedding_matrix=embedding_matrix,
#         train_dataset=train_dataset,
#         val_dataset=valid_dataset,
#         batch_size=batch_size,
#         learning_rate=lr,
#         optimizer_name=optimizer_name,
#         hidden_dim=hidden_dim,
#         num_layers=num_layers,
#         sentence_representation_type=sr_type,
#         freeze_embedding=False,
#         show_progress=True,
#     )

In [None]:
# Grid search over SEARCH_SPACE. itertools.product iterates in the same order as
# nested for-loops: hidden_dim varies slowest, sentence_representation_type fastest.
search_grid = itertools.product(
    SEARCH_SPACE["hidden_dim"],
    SEARCH_SPACE["num_layers"],
    SEARCH_SPACE["optimizer_name"],
    SEARCH_SPACE["batch_size"],
    SEARCH_SPACE["learning_rate"],
    SEARCH_SPACE["sentence_representation_type"],
)
for hidden_dim, num_layers, optimizer_name, batch_size, learning_rate, sentence_representation_type in search_grid:
    log_message = (
        f"---------- batch_size_{batch_size}; lr_{learning_rate}; optimizer_{optimizer_name}; "
        f"hidden_dim_{hidden_dim}; num_layers_{num_layers}; "
        f"sentence_representation_{sentence_representation_type}  ----------"
    )
    print(log_message)
    train_rnn_model_with_parameters(
        embedding_matrix=embedding_matrix,
        train_dataset=train_dataset,
        val_dataset=valid_dataset,
        batch_size=batch_size,
        learning_rate=learning_rate,
        optimizer_name=optimizer_name,
        hidden_dim=hidden_dim,
        num_layers=num_layers,
        sentence_representation_type=sentence_representation_type,
        show_progress=True,
        freeze_embedding=False,
    )

Seed set to 42
/home/linnsheng/Desktop/NTU/S3/Y1/NLP/SC4002/.venv/lib/python3.13/site-packages/lightning/pytorch/utilities/parsing.py:210: Attribute 'rnn_model' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['rnn_model'])`.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


---------- batch_size_32; lr_1e-05; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type               | Params | Mode 
------------------------------------------------------
0 | model  | RNN                | 2.6 M  | train
1 | metric | MulticlassAccuracy | 0      | train
------------------------------------------------------
2.6 M     Trainable params
0         Non-trainable params
2.6 M     Total params
10.418    Total estimated model params size (MB)
7         Modules in train mode
0         Modules in eval mode


Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                           Logits range: [-0.140, 0.144]
Loss: 1.793
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.0667, val_loss=1.810, val_acc=0.151]          Logits range: [-0.117, 0.130]
Loss: 1.786
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.250, val_loss=1.810, val_acc=0.150]            Logits range: [-0.148, 0.162]
Loss: 1.803
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.375, val_loss=1.810, val_acc=0.151]            Logits range: [-0.117, 0.127]
Loss: 1.785
Predictions: tensor([3, 4, 4,
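
A useful reference point when reading these logs: the coarse TREC labels form six classes, so a classifier that outputs a uniform distribution has a cross-entropy of ln 6. The short check below computes that baseline and confirms that all-zero logits reproduce it. The names `NUM_CLASSES` and `cross_entropy` are introduced here for illustration.

```python
import math

NUM_CLASSES = 6  # coarse-grained TREC labels (fine_grained=False)

def cross_entropy(logits, target):
    """Cross-entropy of one example from raw logits (log-sum-exp form)."""
    max_logit = max(logits)
    log_norm = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_norm - logits[target]

uniform_loss = math.log(NUM_CLASSES)
# All-zero logits give exactly the uniform distribution over the classes.
zero_logit_loss = cross_entropy([0.0] * NUM_CLASSES, target=0)
print(round(uniform_loss, 3))  # 1.792
```

Losses close to 1.79 combined with logits confined to a narrow band around zero, as in the run above, suggest that the model has barely moved away from its initialization.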

---------- batch_size_32; lr_1e-05; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.080, 0.122]
Loss: 1.806
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.000, val_loss=1.810, val_acc=0.185]          Logits range: [-0.081, 0.120]
Loss: 1.794
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.200, val_loss=1.810, val_acc=0.185]           Logits range: [-0.089, 0.134]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_

---------- batch_size_32; lr_1e-05; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.215, 0.176]
Loss: 1.808
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.830, train_acc=0.250, val_loss=1.830, val_acc=0.130]          Logits range: [-0.190, 0.185]
Loss: 1.798
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.200, val_loss=1.820, val_acc=0.131]           Logits range: [-0.184, 0.157]
Loss: 1.814
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=

---------- batch_size_32; lr_0.0001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.140, 0.144]
Loss: 1.793
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.0667, val_loss=1.800, val_acc=0.154]          Logits range: [-0.115, 0.128]
Loss: 1.784
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.250, val_loss=1.800, val_acc=0.165]            Logits range: [-0.144, 0.157]
Loss: 1.800
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_

---------- batch_size_32; lr_0.0001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.080, 0.122]
Loss: 1.806
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.000, val_loss=1.810, val_acc=0.184]          Logits range: [-0.080, 0.119]
Loss: 1.793
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.200, val_loss=1.810, val_acc=0.185]          Logits range: [-0.087, 0.132]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_

---------- batch_size_32; lr_0.0001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.215, 0.176]
Loss: 1.808
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.830, train_acc=0.250, val_loss=1.820, val_acc=0.139]          Logits range: [-0.186, 0.182]
Loss: 1.795
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.200, val_loss=1.820, val_acc=0.156]           Logits range: [-0.175, 0.151]
Loss: 1.807
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc

---------- batch_size_32; lr_0.001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.140, 0.144]
Loss: 1.793
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.0667, val_loss=1.790, val_acc=0.161]          Logits range: [-0.156, 0.111]
Loss: 1.769
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.150, val_loss=1.770, val_acc=0.199]           Logits range: [-0.278, 0.146]
Loss: 1.770
Predictions: tensor([1, 3, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_ac

---------- batch_size_32; lr_0.001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.080, 0.122]
Loss: 1.806
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.000, val_loss=1.800, val_acc=0.185]          Logits range: [-0.071, 0.111]
Loss: 1.787
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.200, val_loss=1.790, val_acc=0.186]          Logits range: [-0.070, 0.120]
Loss: 1.791
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_a

---------- batch_size_32; lr_0.001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.215, 0.176]
Loss: 1.808
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.200, val_loss=1.780, val_acc=0.195]          Logits range: [-0.139, 0.151]
Loss: 1.766
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.300, val_loss=1.750, val_acc=0.184]           Logits range: [-0.305, 0.154]
Loss: 1.763
Predictions: tensor([1, 0, 1, 1, 0], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=

---------- batch_size_32; lr_0.01; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.140, 0.144]
Loss: 1.793
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.670, train_acc=0.125, val_loss=1.680, val_acc=0.197]          Logits range: [-0.977, 0.328]
Loss: 1.699
Predictions: tensor([0, 1, 1, 0, 0], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.720, train_acc=0.200, val_loss=1.650, val_acc=0.211]          Logits range: [-1.706, 0.554]
Loss: 1.683
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.690, train_acc=0

---------- batch_size_32; lr_0.01; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.080, 0.122]
Loss: 1.806
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.375, val_loss=1.750, val_acc=0.217]          Logits range: [-0.337, 0.151]
Loss: 1.745
Predictions: tensor([0, 1, 1, 1, 0], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.300, val_loss=1.710, val_acc=0.236]          Logits range: [-0.734, 0.308]
Loss: 1.725
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_ac

---------- batch_size_32; lr_0.01; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.215, 0.176]
Loss: 1.808
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.660, train_acc=0.125, val_loss=1.660, val_acc=0.201]          Logits range: [-1.242, 0.507]
Loss: 1.681
Predictions: tensor([0, 1, 1, 0, 0], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.680, train_acc=0.300, val_loss=1.630, val_acc=0.208]          Logits range: [-1.968, 0.661]
Loss: 1.684
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.680, train_acc=0.

---------- batch_size_32; lr_0.1; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.140, 0.144]
Loss: 1.793
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.440, train_acc=0.250, val_loss=1.510, val_acc=0.335]          Logits range: [-2.737, 2.547]
Loss: 1.535
Predictions: tensor([0, 0, 2, 2, 0], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.520, train_acc=0.400, val_loss=1.400, val_acc=0.427]          Logits range: [-3.094, 2.716]
Loss: 1.501
Predictions: tensor([0, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.450, train_acc=0.

---------- batch_size_32; lr_0.1; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.080, 0.122]
Loss: 1.806
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.580, train_acc=0.250, val_loss=1.610, val_acc=0.291]          Logits range: [-2.940, 1.438]
Loss: 1.668
Predictions: tensor([0, 0, 0, 2, 0], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.430, train_acc=0.400, val_loss=1.460, val_acc=0.368]          Logits range: [-4.340, 3.089]
Loss: 1.558
Predictions: tensor([1, 3, 0, 3, 3], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.360, train_acc

---------- batch_size_32; lr_0.1; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.215, 0.176]
Loss: 1.808
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.250, train_acc=0.625, val_loss=1.380, val_acc=0.369]          Logits range: [-3.029, 3.092]
Loss: 1.405
Predictions: tensor([0, 1, 2, 2, 0], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.180, train_acc=0.600, val_loss=1.100, val_acc=0.585]          Logits range: [-3.895, 3.558]
Loss: 1.227
Predictions: tensor([0, 2, 0, 3, 4], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.531, train_acc=0.7

---------- batch_size_64; lr_1e-05; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]                            Logits range: [-0.173, 0.149]
Loss: 1.794
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.0667, val_loss=1.810, val_acc=0.150]         Logits range: [-0.135, 0.144]
Loss: 1.798
Predictions: tensor([4, 4, 3, 4, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.250, val_loss=1.810, val_acc=0.150]         Logits range: [-0.149, 0.165]
Loss: 1.808
Predictions: tensor([3, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.375

---------- batch_size_64; lr_1e-05; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]                             Logits range: [-0.088, 0.123]
Loss: 1.801
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.000, val_loss=1.810, val_acc=0.179]         Logits range: [-0.081, 0.128]
Loss: 1.802
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.200, val_loss=1.810, val_acc=0.179]         Logits range: [-0.083, 0.130]
Loss: 1.813
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.

---------- batch_size_64; lr_1e-05; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]                             Logits range: [-0.215, 0.182]
Loss: 1.806
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.830, train_acc=0.250, val_loss=1.830, val_acc=0.135]         Logits range: [-0.191, 0.185]
Loss: 1.814
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.200, val_loss=1.830, val_acc=0.135]         Logits range: [-0.192, 0.181]
Loss: 1.829
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.250,

---------- batch_size_64; lr_0.0001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]                            Logits range: [-0.173, 0.149]
Loss: 1.794
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.0667, val_loss=1.810, val_acc=0.149]         Logits range: [-0.134, 0.142]
Loss: 1.797
Predictions: tensor([4, 4, 3, 4, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.250, val_loss=1.800, val_acc=0.153]         Logits range: [-0.146, 0.162]
Loss: 1.806
Predictions: tensor([3, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.37

---------- batch_size_64; lr_0.0001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]                             Logits range: [-0.088, 0.123]
Loss: 1.801
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.000, val_loss=1.810, val_acc=0.179]         Logits range: [-0.080, 0.128]
Loss: 1.802
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.200, val_loss=1.810, val_acc=0.178]         Logits range: [-0.082, 0.129]
Loss: 1.812
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0

---------- batch_size_64; lr_0.0001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]                             Logits range: [-0.215, 0.182]
Loss: 1.806
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.830, train_acc=0.250, val_loss=1.820, val_acc=0.142]         Logits range: [-0.188, 0.184]
Loss: 1.812
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.200, val_loss=1.820, val_acc=0.148]         Logits range: [-0.187, 0.178]
Loss: 1.824
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.250

---------- batch_size_64; lr_0.001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.149]
Loss: 1.794
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.0667, val_loss=1.800, val_acc=0.157]         Logits range: [-0.123, 0.131]
Loss: 1.787
Predictions: tensor([4, 4, 3, 4, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.200, val_loss=1.790, val_acc=0.154]          Logits range: [-0.189, 0.145]
Loss: 1.786
Predictions: tensor([3, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.4

---------- batch_size_64; lr_0.001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]                             Logits range: [-0.088, 0.123]
Loss: 1.801
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.000, val_loss=1.800, val_acc=0.176]         Logits range: [-0.075, 0.123]
Loss: 1.798
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.200, val_loss=1.800, val_acc=0.175]         Logits range: [-0.073, 0.122]
Loss: 1.805
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.

---------- batch_size_64; lr_0.001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]                             Logits range: [-0.215, 0.182]
Loss: 1.806
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.200, val_loss=1.800, val_acc=0.176]         Logits range: [-0.163, 0.167]
Loss: 1.793
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.200, val_loss=1.780, val_acc=0.185]         Logits range: [-0.159, 0.148]
Loss: 1.786
Predictions: tensor([3, 3, 3, 3, 0], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.125,

---------- batch_size_64; lr_0.01; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.149]
Loss: 1.794
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.000, val_loss=1.730, val_acc=0.179]         Logits range: [-0.539, 0.213]
Loss: 1.723
Predictions: tensor([1, 0, 1, 0, 1], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.720, train_acc=0.000, val_loss=1.680, val_acc=0.176]         Logits range: [-1.032, 0.338]
Loss: 1.679
Predictions: tensor([1, 0, 1, 0, 1], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.500,

---------- batch_size_64; lr_0.01; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]                             Logits range: [-0.088, 0.123]
Loss: 1.801
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.780, train_acc=0.000, val_loss=1.770, val_acc=0.197]         Logits range: [-0.150, 0.092]
Loss: 1.769
Predictions: tensor([0, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.300, val_loss=1.750, val_acc=0.207]         Logits range: [-0.354, 0.157]
Loss: 1.750
Predictions: tensor([1, 1, 1, 0, 1], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.1

---------- batch_size_64; lr_0.01; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]                             Logits range: [-0.215, 0.182]
Loss: 1.806
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.720, train_acc=0.000, val_loss=1.700, val_acc=0.185]         Logits range: [-0.706, 0.319]
Loss: 1.699
Predictions: tensor([0, 0, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.700, train_acc=0.400, val_loss=1.660, val_acc=0.189]         Logits range: [-1.283, 0.498]
Loss: 1.663
Predictions: tensor([0, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.375, 

---------- batch_size_64; lr_0.1; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.149]
Loss: 1.794
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.640, train_acc=0.312, val_loss=1.620, val_acc=0.239]         Logits range: [-1.950, 1.034]
Loss: 1.632
Predictions: tensor([0, 2, 2, 0, 2], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.630, train_acc=0.400, val_loss=1.450, val_acc=0.346]         Logits range: [-2.708, 1.591]
Loss: 1.447
Predictions: tensor([0, 2, 0, 0, 2], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.150, train_acc=0.420, 

---------- batch_size_64; lr_0.1; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]                             Logits range: [-0.088, 0.123]
Loss: 1.801
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.650, train_acc=0.250, val_loss=1.660, val_acc=0.178]         Logits range: [-1.801, 0.803]
Loss: 1.667
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.680, train_acc=0.400, val_loss=1.620, val_acc=0.281]         Logits range: [-3.125, 1.113]
Loss: 1.632
Predictions: tensor([0, 1, 0, 0, 1], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.480, train_acc=0.20

---------- batch_size_64; lr_0.1; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]                             Logits range: [-0.215, 0.182]
Loss: 1.806
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.580, train_acc=0.312, val_loss=1.570, val_acc=0.291]         Logits range: [-2.422, 1.150]
Loss: 1.563
Predictions: tensor([0, 2, 2, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.400, train_acc=0.600, val_loss=1.350, val_acc=0.421]         Logits range: [-3.368, 2.640]
Loss: 1.366
Predictions: tensor([0, 1, 1, 0, 2], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=0.874, train_acc=0.750, v

---------- batch_size_128; lr_1e-05; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.182]
Loss: 1.798
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.0667, val_loss=1.810, val_acc=0.153]         Logits range: [-0.135, 0.145]
Loss: 1.798
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.250, val_loss=1.810, val_acc=0.153]         Logits range: [-0.166, 0.165]
Loss: 1.800
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.3

---------- batch_size_128; lr_1e-05; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.087, 0.141]
Loss: 1.805
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.000, val_loss=1.810, val_acc=0.171]         Logits range: [-0.083, 0.129]
Loss: 1.802
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.200, val_loss=1.810, val_acc=0.171]         Logits range: [-0.083, 0.134]
Loss: 1.805
Predictions: tensor([4, 4, 4, 3, 4], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0

---------- batch_size_128; lr_1e-05; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.215, 0.192]
Loss: 1.812
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.830, train_acc=0.250, val_loss=1.830, val_acc=0.139]         Logits range: [-0.196, 0.207]
Loss: 1.816
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.200, val_loss=1.830, val_acc=0.140]         Logits range: [-0.214, 0.192]
Loss: 1.814
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.250

---------- batch_size_128; lr_0.0001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.182]
Loss: 1.798
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.0667, val_loss=1.810, val_acc=0.153]         Logits range: [-0.134, 0.145]
Loss: 1.798
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.250, val_loss=1.810, val_acc=0.152]         Logits range: [-0.165, 0.164]
Loss: 1.799
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.

---------- batch_size_128; lr_0.0001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                            Logits range: [-0.087, 0.141]
Loss: 1.805
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.000, val_loss=1.810, val_acc=0.171]         Logits range: [-0.082, 0.128]
Loss: 1.802
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.200, val_loss=1.810, val_acc=0.171]         Logits range: [-0.083, 0.134]
Loss: 1.805
Predictions: tensor([4, 4, 4, 3, 4], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0

---------- batch_size_128; lr_0.0001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.215, 0.192]
Loss: 1.812
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.830, train_acc=0.250, val_loss=1.820, val_acc=0.143]         Logits range: [-0.195, 0.206]
Loss: 1.815
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.200, val_loss=1.820, val_acc=0.146]         Logits range: [-0.211, 0.190]
Loss: 1.812
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.25

---------- batch_size_128; lr_0.001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                            Logits range: [-0.173, 0.182]
Loss: 1.798
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.0667, val_loss=1.800, val_acc=0.158]         Logits range: [-0.129, 0.142]
Loss: 1.793
Predictions: tensor([4, 4, 4, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.300, val_loss=1.800, val_acc=0.152]          Logits range: [-0.154, 0.150]
Loss: 1.790
Predictions: tensor([3, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.2

---------- batch_size_128; lr_0.001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.087, 0.141]
Loss: 1.805
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.000, val_loss=1.810, val_acc=0.170]         Logits range: [-0.080, 0.126]
Loss: 1.800
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.200, val_loss=1.800, val_acc=0.170]         Logits range: [-0.078, 0.129]
Loss: 1.801
Predictions: tensor([4, 4, 4, 3, 4], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0

---------- batch_size_128; lr_0.001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.215, 0.192]
Loss: 1.812
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.250, val_loss=1.810, val_acc=0.170]         Logits range: [-0.183, 0.196]
Loss: 1.805
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.200, val_loss=1.800, val_acc=0.172]         Logits range: [-0.185, 0.171]
Loss: 1.793
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.175

---------- batch_size_128; lr_0.01; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.182]
Loss: 1.798
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.0833, val_loss=1.760, val_acc=0.174]         Logits range: [-0.317, 0.134]
Loss: 1.756
Predictions: tensor([1, 3, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.200, val_loss=1.730, val_acc=0.177]         Logits range: [-0.572, 0.205]
Loss: 1.727
Predictions: tensor([1, 1, 1, 1, 0], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.25

---------- batch_size_128; lr_0.01; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                            Logits range: [-0.087, 0.141]
Loss: 1.805
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.000, val_loss=1.790, val_acc=0.182]         Logits range: [-0.068, 0.108]
Loss: 1.785
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.200, val_loss=1.770, val_acc=0.199]         Logits range: [-0.168, 0.100]
Loss: 1.772
Predictions: tensor([4, 3, 4, 3, 4], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.3

---------- batch_size_128; lr_0.01; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                            Logits range: [-0.215, 0.192]
Loss: 1.812
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.000, val_loss=1.740, val_acc=0.175]         Logits range: [-0.387, 0.197]
Loss: 1.742
Predictions: tensor([0, 0, 0, 1, 1], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_acc=0.200, val_loss=1.700, val_acc=0.177]         Logits range: [-0.778, 0.325]
Loss: 1.706
Predictions: tensor([0, 1, 1, 0, 0], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.375, 

---------- batch_size_128; lr_0.1; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                            Logits range: [-0.173, 0.182]
Loss: 1.798
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.640, train_acc=0.250, val_loss=1.660, val_acc=0.196]         Logits range: [-1.808, 0.706]
Loss: 1.669
Predictions: tensor([2, 2, 0, 2, 2], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_acc=0.400, val_loss=1.610, val_acc=0.282]         Logits range: [-2.221, 0.941]
Loss: 1.637
Predictions: tensor([1, 0, 1, 1, 0], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.420, train_acc=0.500, 

---------- batch_size_128; lr_0.1; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.087, 0.141]
Loss: 1.805
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.690, train_acc=0.125, val_loss=1.700, val_acc=0.178]         Logits range: [-0.841, 0.349]
Loss: 1.705
Predictions: tensor([0, 1, 0, 0, 1], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.720, train_acc=0.200, val_loss=1.660, val_acc=0.213]         Logits range: [-1.804, 0.677]
Loss: 1.673
Predictions: tensor([0, 1, 0, 0, 0], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_acc=0.2

---------- batch_size_128; lr_0.1; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.215, 0.192]
Loss: 1.812
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.620, train_acc=0.250, val_loss=1.640, val_acc=0.205]         Logits range: [-2.170, 0.824]
Loss: 1.643
Predictions: tensor([0, 2, 2, 0, 2], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.610, train_acc=0.400, val_loss=1.560, val_acc=0.290]         Logits range: [-2.817, 1.117]
Loss: 1.567
Predictions: tensor([0, 1, 1, 0, 0], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.410, train_acc=0.625, 

---------- batch_size_256; lr_1e-05; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]                            Logits range: [-0.173, 0.182]
Loss: 1.789
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.0667, val_loss=1.810, val_acc=0.154]         Logits range: [-0.151, 0.157]
Loss: 1.799
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.250, val_loss=1.810, val_acc=0.154]         Logits range: [-0.166, 0.172]
Loss: 1.793
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.37

---------- batch_size_256; lr_1e-05; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]                             Logits range: [-0.087, 0.141]
Loss: 1.797
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.000, val_loss=1.810, val_acc=0.168]         Logits range: [-0.082, 0.122]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.200, val_loss=1.810, val_acc=0.168]         Logits range: [-0.085, 0.134]
Loss: 1.801
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0

---------- batch_size_256; lr_1e-05; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]                             Logits range: [-0.215, 0.209]
Loss: 1.799
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.830, train_acc=0.250, val_loss=1.830, val_acc=0.133]         Logits range: [-0.201, 0.207]
Loss: 1.818
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.200, val_loss=1.830, val_acc=0.135]         Logits range: [-0.214, 0.192]
Loss: 1.810
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.200

---------- batch_size_256; lr_0.0001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.182]
Loss: 1.789
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.0667, val_loss=1.810, val_acc=0.153]         Logits range: [-0.151, 0.156]
Loss: 1.799
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.250, val_loss=1.810, val_acc=0.153]         Logits range: [-0.165, 0.171]
Loss: 1.793
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.

---------- batch_size_256; lr_0.0001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]                             Logits range: [-0.087, 0.141]
Loss: 1.797
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.000, val_loss=1.810, val_acc=0.168]         Logits range: [-0.082, 0.122]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.200, val_loss=1.810, val_acc=0.168]         Logits range: [-0.085, 0.133]
Loss: 1.801
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=

---------- batch_size_256; lr_0.0001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]                             Logits range: [-0.215, 0.209]
Loss: 1.799
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.830, train_acc=0.250, val_loss=1.830, val_acc=0.136]         Logits range: [-0.200, 0.207]
Loss: 1.817
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.200, val_loss=1.820, val_acc=0.140]         Logits range: [-0.213, 0.191]
Loss: 1.809
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.25

---------- batch_size_256; lr_0.001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.182]
Loss: 1.789
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.0667, val_loss=1.800, val_acc=0.153]         Logits range: [-0.148, 0.153]
Loss: 1.797
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.250, val_loss=1.800, val_acc=0.157]         Logits range: [-0.160, 0.165]
Loss: 1.788
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.3

---------- batch_size_256; lr_0.001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]                             Logits range: [-0.087, 0.141]
Loss: 1.797
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.000, val_loss=1.810, val_acc=0.168]         Logits range: [-0.081, 0.120]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.200, val_loss=1.810, val_acc=0.168]         Logits range: [-0.083, 0.131]
Loss: 1.799
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0

---------- batch_size_256; lr_0.001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]                             Logits range: [-0.215, 0.209]
Loss: 1.799
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.830, train_acc=0.250, val_loss=1.820, val_acc=0.159]         Logits range: [-0.192, 0.202]
Loss: 1.812
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.200, val_loss=1.810, val_acc=0.168]         Logits range: [-0.199, 0.182]
Loss: 1.799
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.250

---------- batch_size_256; lr_0.01; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.182]
Loss: 1.789
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.0667, val_loss=1.780, val_acc=0.163]         Logits range: [-0.206, 0.135]
Loss: 1.774
Predictions: tensor([4, 3, 3, 3, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.150, val_loss=1.760, val_acc=0.176]         Logits range: [-0.341, 0.160]
Loss: 1.748
Predictions: tensor([1, 3, 1, 1, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.25

---------- batch_size_256; lr_0.01; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]
Logits range: [-0.087, 0.141]
Loss: 1.797
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.000, val_loss=1.800, val_acc=0.172]
Logits range: [-0.065, 0.110]
Loss: 1.795
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.200, val_loss=1.790, val_acc=0.186]
Logits range: [-0.074, 0.111]
Loss: 1.783
Predictions: tensor([4, 4, 3, 4, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.


---------- batch_size_256; lr_0.01; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]
Logits range: [-0.215, 0.209]
Loss: 1.799
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.133, val_loss=1.770, val_acc=0.176]
Logits range: [-0.198, 0.165]
Loss: 1.770
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.400, val_loss=1.740, val_acc=0.172]
Logits range: [-0.412, 0.196]
Loss: 1.735
Predictions: tensor([1, 0, 1, 0, 0], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.375,


---------- batch_size_256; lr_0.1; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]
Logits range: [-0.173, 0.182]
Loss: 1.789
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.660, train_acc=0.250, val_loss=1.670, val_acc=0.172]
Logits range: [-1.264, 0.542]
Loss: 1.682
Predictions: tensor([2, 2, 2, 2, 2], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.720, train_acc=0.200, val_loss=1.660, val_acc=0.199]
Logits range: [-1.933, 0.508]
Loss: 1.656
Predictions: tensor([1, 1, 0, 1, 1], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.660, train_acc=0.400,


---------- batch_size_256; lr_0.1; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]
Logits range: [-0.087, 0.141]
Loss: 1.797
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.375, val_loss=1.740, val_acc=0.174]
Logits range: [-0.407, 0.160]
Loss: 1.739
Predictions: tensor([0, 0, 0, 1, 0], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.200, val_loss=1.700, val_acc=0.214]
Logits range: [-0.823, 0.336]
Loss: 1.701
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.2


---------- batch_size_256; lr_0.1; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]
Logits range: [-0.215, 0.209]
Loss: 1.799
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.650, train_acc=0.125, val_loss=1.660, val_acc=0.190]
Logits range: [-1.637, 0.669]
Loss: 1.665
Predictions: tensor([2, 2, 0, 2, 2], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.670, train_acc=0.500, val_loss=1.640, val_acc=0.187]
Logits range: [-2.346, 0.712]
Loss: 1.632
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.640, train_acc=0.400, 


---------- batch_size_512; lr_1e-05; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.173, 0.182]
Loss: 1.795
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.169, val_loss=1.810, val_acc=0.154]
Logits range: [-0.151, 0.175]
Loss: 1.799
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.174, val_loss=1.810, val_acc=0.154]
Logits range: [-0.173, 0.172]
Loss: 1.795
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.165, va


---------- batch_size_512; lr_1e-05; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.084, 0.131]
Loss: 1.802
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.169]
Logits range: [-0.082, 0.132]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.169]
Logits range: [-0.088, 0.141]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.167,


---------- batch_size_512; lr_1e-05; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.227, 0.214]
Loss: 1.810
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.830, train_acc=0.160, val_loss=1.830, val_acc=0.141]
Logits range: [-0.205, 0.207]
Loss: 1.815
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.163, val_loss=1.830, val_acc=0.141]
Logits range: [-0.235, 0.205]
Loss: 1.812
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.176, val


---------- batch_size_512; lr_0.0001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.173, 0.182]
Loss: 1.795
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.169, val_loss=1.810, val_acc=0.154]
Logits range: [-0.151, 0.175]
Loss: 1.798
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.174, val_loss=1.810, val_acc=0.154]
Logits range: [-0.173, 0.171]
Loss: 1.795
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.165, v


---------- batch_size_512; lr_0.0001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.084, 0.131]
Loss: 1.802
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.169]
Logits range: [-0.082, 0.132]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.169]
Logits range: [-0.088, 0.141]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.167


---------- batch_size_512; lr_0.0001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.227, 0.214]
Loss: 1.810
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.830, train_acc=0.160, val_loss=1.830, val_acc=0.144]
Logits range: [-0.205, 0.207]
Loss: 1.814
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.163, val_loss=1.830, val_acc=0.144]
Logits range: [-0.234, 0.205]
Loss: 1.811
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.176, va


---------- batch_size_512; lr_0.001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.173, 0.182]
Loss: 1.795
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.174, val_loss=1.800, val_acc=0.153]
Logits range: [-0.150, 0.173]
Loss: 1.797
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.174, val_loss=1.800, val_acc=0.153]
Logits range: [-0.171, 0.168]
Loss: 1.792
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.162, va


---------- batch_size_512; lr_0.001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.084, 0.131]
Loss: 1.802
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.169]
Logits range: [-0.082, 0.132]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.168]
Logits range: [-0.087, 0.140]
Loss: 1.802
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.167,


---------- batch_size_512; lr_0.001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.227, 0.214]
Loss: 1.810
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.165, val_loss=1.820, val_acc=0.151]
Logits range: [-0.201, 0.204]
Loss: 1.812
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.163, val_loss=1.820, val_acc=0.164]
Logits range: [-0.228, 0.200]
Loss: 1.806
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.176, val_


---------- batch_size_512; lr_0.01; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.173, 0.182]
Loss: 1.795
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.166, val_loss=1.790, val_acc=0.153]
Logits range: [-0.173, 0.155]
Loss: 1.785
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.780, train_acc=0.166, val_loss=1.780, val_acc=0.153]
Logits range: [-0.222, 0.144]
Loss: 1.769
Predictions: tensor([1, 3, 4, 1, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.175, val


---------- batch_size_512; lr_0.01; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.084, 0.131]
Loss: 1.802
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.172, val_loss=1.800, val_acc=0.168]
Logits range: [-0.076, 0.127]
Loss: 1.799
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.167, val_loss=1.800, val_acc=0.169]
Logits range: [-0.076, 0.130]
Loss: 1.793
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.170, 


---------- batch_size_512; lr_0.01; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.227, 0.214]
Loss: 1.810
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.164, val_loss=1.790, val_acc=0.173]
Logits range: [-0.172, 0.176]
Loss: 1.788
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.192, val_loss=1.770, val_acc=0.188]
Logits range: [-0.231, 0.156]
Loss: 1.764
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.155, val_


---------- batch_size_512; lr_0.1; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.173, 0.182]
Loss: 1.795
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_acc=0.172, val_loss=1.710, val_acc=0.169]
Logits range: [-0.712, 0.280]
Loss: 1.710
Predictions: tensor([1, 1, 0, 0, 1], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.660, train_acc=0.179, val_loss=1.670, val_acc=0.182]
Logits range: [-1.288, 0.435]
Loss: 1.667
Predictions: tensor([1, 1, 0, 1, 1], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.660, train_acc=0.214, val_l


---------- batch_size_512; lr_0.1; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.084, 0.131]
Loss: 1.802
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.175, val_loss=1.770, val_acc=0.193]
Logits range: [-0.202, 0.095]
Loss: 1.766
Predictions: tensor([1, 0, 0, 1, 0], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.168, val_loss=1.740, val_acc=0.198]
Logits range: [-0.405, 0.180]
Loss: 1.735
Predictions: tensor([1, 1, 1, 1, 0], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.720, train_acc=0.193, va


---------- batch_size_512; lr_0.1; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.227, 0.214]
Loss: 1.810
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.680, train_acc=0.199, val_loss=1.680, val_acc=0.177]
Logits range: [-0.987, 0.422]
Loss: 1.687
Predictions: tensor([0, 1, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.640, train_acc=0.216, val_loss=1.640, val_acc=0.181]
Logits range: [-1.675, 0.586]
Loss: 1.652
Predictions: tensor([1, 0, 1, 1, 0], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.640, train_acc=0.248, val_l


---------- batch_size_1024; lr_1e-05; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]                              Logits range: [-0.173, 0.229]
Loss: 1.798
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.169, val_loss=1.810, val_acc=0.156]        Logits range: [-0.165, 0.202]
Loss: 1.799
Predictions: tensor([4, 4, 4, 3, 4], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.174, val_loss=1.810, val_acc=0.156]        Logits range: [-0.173, 0.229]
Loss: 1.796
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.165, v


---------- batch_size_1024; lr_1e-05; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]                              Logits range: [-0.084, 0.131]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.169]        Logits range: [-0.083, 0.132]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.169]        Logits range: [-0.085, 0.132]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.167


---------- batch_size_1024; lr_1e-05; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]                              Logits range: [-0.227, 0.224]
Loss: 1.815
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.830, train_acc=0.160, val_loss=1.830, val_acc=0.143]        Logits range: [-0.215, 0.208]
Loss: 1.815
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.163, val_loss=1.830, val_acc=0.143]        Logits range: [-0.235, 0.209]
Loss: 1.812
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.176, va


---------- batch_size_1024; lr_0.0001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]                              Logits range: [-0.173, 0.229]
Loss: 1.798
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.169, val_loss=1.810, val_acc=0.156]        Logits range: [-0.165, 0.202]
Loss: 1.799
Predictions: tensor([4, 4, 4, 3, 4], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.174, val_loss=1.810, val_acc=0.156]        Logits range: [-0.173, 0.229]
Loss: 1.796
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.165, 


---------- batch_size_1024; lr_0.0001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]                              Logits range: [-0.084, 0.131]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.169]        Logits range: [-0.083, 0.132]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.169]        Logits range: [-0.085, 0.132]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.16


---------- batch_size_1024; lr_0.0001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]                              Logits range: [-0.227, 0.224]
Loss: 1.815
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.830, train_acc=0.160, val_loss=1.830, val_acc=0.146]        Logits range: [-0.215, 0.207]
Loss: 1.815
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.163, val_loss=1.830, val_acc=0.146]        Logits range: [-0.235, 0.209]
Loss: 1.812
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.176, v


---------- batch_size_1024; lr_0.001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]                              Logits range: [-0.173, 0.229]
Loss: 1.798
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.174, val_loss=1.810, val_acc=0.156]        Logits range: [-0.164, 0.202]
Loss: 1.798
Predictions: tensor([4, 4, 4, 3, 4], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.174, val_loss=1.800, val_acc=0.155]        Logits range: [-0.172, 0.228]
Loss: 1.794
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.162, v


---------- batch_size_1024; lr_0.001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]                              Logits range: [-0.084, 0.131]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.169]        Logits range: [-0.082, 0.132]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.169]        Logits range: [-0.085, 0.132]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.167


---------- batch_size_1024; lr_0.001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]                              Logits range: [-0.227, 0.224]
Loss: 1.815
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.165, val_loss=1.820, val_acc=0.153]        Logits range: [-0.213, 0.206]
Loss: 1.813
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.163, val_loss=1.820, val_acc=0.152]        Logits range: [-0.231, 0.206]
Loss: 1.809
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.176, va


---------- batch_size_1024; lr_0.01; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.229]
Loss: 1.798
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.172, val_loss=1.800, val_acc=0.164]        Logits range: [-0.155, 0.200]
Loss: 1.791
Predictions: tensor([4, 4, 4, 3, 4], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.201, val_loss=1.790, val_acc=0.154]        Logits range: [-0.180, 0.217]
Loss: 1.780
Predictions: tensor([4, 4, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.166, val


---------- batch_size_1024; lr_0.01; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]                              Logits range: [-0.084, 0.131]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.172, val_loss=1.810, val_acc=0.169]        Logits range: [-0.079, 0.129]
Loss: 1.801
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.167, val_loss=1.800, val_acc=0.169]        Logits range: [-0.078, 0.127]
Loss: 1.798
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.167,


---------- batch_size_1024; lr_0.01; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]                             Logits range: [-0.227, 0.224]
Loss: 1.815
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.165, val_loss=1.810, val_acc=0.165]        Logits range: [-0.196, 0.190]
Loss: 1.798
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.167, val_loss=1.790, val_acc=0.174]        Logits range: [-0.198, 0.180]
Loss: 1.781
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.187, val_


---------- batch_size_1024; lr_0.1; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]                              Logits range: [-0.173, 0.229]
Loss: 1.798
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.182, val_loss=1.740, val_acc=0.169]        Logits range: [-0.449, 0.188]
Loss: 1.736
Predictions: tensor([1, 0, 4, 1, 0], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.700, train_acc=0.196, val_loss=1.700, val_acc=0.173]        Logits range: [-0.785, 0.300]
Loss: 1.690
Predictions: tensor([1, 1, 1, 0, 1], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.690, train_acc=0.166, val


---------- batch_size_1024; lr_0.1; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]                              Logits range: [-0.084, 0.131]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.165, val_loss=1.780, val_acc=0.169]        Logits range: [-0.104, 0.106]
Loss: 1.779
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.160, val_loss=1.760, val_acc=0.200]        Logits range: [-0.212, 0.100]
Loss: 1.759
Predictions: tensor([1, 1, 1, 1, 4], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.171, 


---------- batch_size_1024; lr_0.1; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]                             Logits range: [-0.227, 0.224]
Loss: 1.815
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.720, train_acc=0.183, val_loss=1.710, val_acc=0.174]        Logits range: [-0.597, 0.296]
Loss: 1.714
Predictions: tensor([0, 1, 1, 1, 0], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.680, train_acc=0.227, val_loss=1.670, val_acc=0.177]        Logits range: [-1.082, 0.444]
Loss: 1.669
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.670, train_acc=0.216, val_l


---------- batch_size_2048; lr_1e-05; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
/home/linnsheng/Desktop/NTU/S3/Y1/NLP/SC4002/.venv/lib/python3.13/site-packages/lightning/pytorch/loops/fit_loop.py:310: The number of training batches (3) is smaller than the logging interval Trainer(log_every_n_steps=5). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.

Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]Logits range: [-0.173, 0.229]
Loss: 1.797
Predictions: tensor([4, 4, 3, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.169, val_loss=1.810, val_acc=0.157]        Logits range: [-0.171, 0.202]
Loss: 1.799
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.174, val_loss=1.810, val_acc=0.157]        Logits range: [-0.173, 0.229]
Loss: 1.799
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.165, val_loss=1.810, val_acc=0.157]        Logits range: [-0.171, 0.191]
Loss: 1.796
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([


---------- batch_size_2048; lr_1e-05; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.084, 0.134]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.167]        Logits range: [-0.082, 0.139]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.167]        Logits range: [-0.083, 0.131]
Loss: 1.805
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.167,


---------- batch_size_2048; lr_1e-05; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.227, 0.224]
Loss: 1.814
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.830, train_acc=0.160, val_loss=1.830, val_acc=0.145]        Logits range: [-0.227, 0.209]
Loss: 1.814
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.163, val_loss=1.830, val_acc=0.145]        Logits range: [-0.235, 0.224]
Loss: 1.815
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.176, val


---------- batch_size_2048; lr_0.0001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.229]
Loss: 1.797
Predictions: tensor([4, 4, 3, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.169, val_loss=1.810, val_acc=0.157]        Logits range: [-0.171, 0.202]
Loss: 1.799
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.174, val_loss=1.810, val_acc=0.157]        Logits range: [-0.173, 0.229]
Loss: 1.798
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.165, v


---------- batch_size_2048; lr_0.0001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.084, 0.134]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.167]        Logits range: [-0.082, 0.139]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.167]        Logits range: [-0.083, 0.131]
Loss: 1.805
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.167

---------- batch_size_2048; lr_0.0001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.227, 0.224]
Loss: 1.814
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.830, train_acc=0.160, val_loss=1.830, val_acc=0.146]        Logits range: [-0.227, 0.209]
Loss: 1.814
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.163, val_loss=1.830, val_acc=0.148]        Logits range: [-0.235, 0.224]
Loss: 1.815
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.176, va

---------- batch_size_2048; lr_0.001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.229]
Loss: 1.797
Predictions: tensor([4, 4, 3, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.174, val_loss=1.810, val_acc=0.157]        Logits range: [-0.171, 0.202]
Loss: 1.798
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.174, val_loss=1.810, val_acc=0.157]        Logits range: [-0.172, 0.229]
Loss: 1.797
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.162, va

---------- batch_size_2048; lr_0.001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.084, 0.134]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.167]        Logits range: [-0.082, 0.139]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.167]        Logits range: [-0.083, 0.131]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.167,

---------- batch_size_2048; lr_0.001; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.227, 0.224]
Loss: 1.814
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.830, train_acc=0.160, val_loss=1.820, val_acc=0.149]        Logits range: [-0.226, 0.208]
Loss: 1.813
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.163, val_loss=1.820, val_acc=0.157]        Logits range: [-0.232, 0.222]
Loss: 1.813
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.176, val

---------- batch_size_2048; lr_0.01; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.229]
Loss: 1.797
Predictions: tensor([4, 4, 3, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.174, val_loss=1.800, val_acc=0.156]        Logits range: [-0.167, 0.201]
Loss: 1.794
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.185, val_loss=1.800, val_acc=0.152]        Logits range: [-0.162, 0.222]
Loss: 1.788
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.780, train_acc=0.159, val

---------- batch_size_2048; lr_0.01; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.084, 0.134]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.167]        Logits range: [-0.080, 0.137]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.800, val_acc=0.167]        Logits range: [-0.079, 0.127]
Loss: 1.801
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.167, 

---------- batch_size_2048; lr_0.01; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.227, 0.224]
Loss: 1.814
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.165, val_loss=1.810, val_acc=0.162]        Logits range: [-0.216, 0.198]
Loss: 1.804
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.167, val_loss=1.800, val_acc=0.173]        Logits range: [-0.210, 0.204]
Loss: 1.794
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.173, val_

---------- batch_size_2048; lr_0.1; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.229]
Loss: 1.797
Predictions: tensor([4, 4, 3, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.142, val_loss=1.760, val_acc=0.170]        Logits range: [-0.317, 0.189]
Loss: 1.756
Predictions: tensor([1, 1, 0, 0, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.196, val_loss=1.730, val_acc=0.166]        Logits range: [-0.521, 0.214]
Loss: 1.723
Predictions: tensor([1, 1, 0, 1, 1], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_acc=0.163, val_

---------- batch_size_2048; lr_0.1; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.084, 0.134]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.177, val_loss=1.790, val_acc=0.167]        Logits range: [-0.062, 0.124]
Loss: 1.788
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.780, train_acc=0.167, val_loss=1.780, val_acc=0.165]        Logits range: [-0.129, 0.098]
Loss: 1.775
Predictions: tensor([4, 4, 3, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.178, v

---------- batch_size_2048; lr_0.1; optimizer_SGD; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.227, 0.224]
Loss: 1.814
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.187, val_loss=1.740, val_acc=0.170]        Logits range: [-0.405, 0.221]
Loss: 1.738
Predictions: tensor([0, 0, 0, 0, 1], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.720, train_acc=0.215, val_loss=1.700, val_acc=0.172]        Logits range: [-0.709, 0.334]
Loss: 1.699
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.700, train_acc=0.181, val_l

---------- batch_size_32; lr_1e-05; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.140, 0.144]
Loss: 1.793
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.0667, val_loss=1.800, val_acc=0.166]          Logits range: [-0.113, 0.119]
Loss: 1.778
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.200, val_loss=1.790, val_acc=0.166]           Logits range: [-0.143, 0.148]
Loss: 1.791
Predictions: tensor([4, 3, 4, 3, 4], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.730, trai

---------- batch_size_32; lr_1e-05; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.080, 0.122]
Loss: 1.806
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.000, val_loss=1.800, val_acc=0.185]          Logits range: [-0.079, 0.116]
Loss: 1.790
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.200, val_loss=1.800, val_acc=0.186]          Logits range: [-0.087, 0.132]
Loss: 1.799
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.740, tra

---------- batch_size_32; lr_1e-05; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.215, 0.176]
Loss: 1.808
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.200, val_loss=1.810, val_acc=0.164]          Logits range: [-0.182, 0.175]
Loss: 1.788
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.200, val_loss=1.810, val_acc=0.181]          Logits range: [-0.171, 0.144]
Loss: 1.801
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_a

---------- batch_size_32; lr_0.0001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.140, 0.144]
Loss: 1.793
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.000, val_loss=1.730, val_acc=0.204]          Logits range: [-0.482, 0.163]
Loss: 1.727
Predictions: tensor([1, 1, 1, 0, 1], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.700, train_acc=0.200, val_loss=1.710, val_acc=0.203]          Logits range: [-0.717, 0.241]
Loss: 1.709
Predictions: tensor([0, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train

---------- batch_size_32; lr_0.0001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.080, 0.122]
Loss: 1.806
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.433, val_loss=1.770, val_acc=0.242]          Logits range: [-0.317, 0.127]
Loss: 1.762
Predictions: tensor([1, 1, 3, 3, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.450, val_loss=1.750, val_acc=0.243]          Logits range: [-0.593, 0.229]
Loss: 1.751
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.700, tr

---------- batch_size_32; lr_0.0001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.215, 0.176]
Loss: 1.808
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.125, val_loss=1.730, val_acc=0.201]          Logits range: [-0.408, 0.191]
Loss: 1.722
Predictions: tensor([0, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.700, train_acc=0.400, val_loss=1.700, val_acc=0.205]          Logits range: [-0.589, 0.246]
Loss: 1.722
Predictions: tensor([0, 1, 0, 1, 1], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_

---------- batch_size_32; lr_0.001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.140, 0.144]
Loss: 1.793
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.260, train_acc=0.438, val_loss=1.410, val_acc=0.356]          Logits range: [-2.456, 1.965]
Loss: 1.457
Predictions: tensor([0, 0, 2, 2, 2], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.180, train_acc=0.700, val_loss=1.270, val_acc=0.493]          Logits range: [-2.878, 2.584]
Loss: 1.216
Predictions: tensor([0, 3, 0, 3, 4], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.860, train_

---------- batch_size_32; lr_0.001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.080, 0.122]
Loss: 1.806
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.220, train_acc=0.375, val_loss=1.450, val_acc=0.356]          Logits range: [-4.238, 3.065]
Loss: 1.528
Predictions: tensor([0, 0, 2, 2, 1], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.270, train_acc=0.400, val_loss=1.350, val_acc=0.455]          Logits range: [-5.233, 2.747]
Loss: 1.435
Predictions: tensor([0, 2, 0, 3, 1], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.160, tra

---------- batch_size_32; lr_0.001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.215, 0.176]
Loss: 1.808
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.310, train_acc=0.375, val_loss=1.390, val_acc=0.375]          Logits range: [-2.367, 1.989]
Loss: 1.389
Predictions: tensor([0, 1, 2, 2, 1], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.300, train_acc=0.600, val_loss=1.250, val_acc=0.531]          Logits range: [-2.926, 2.500]
Loss: 1.283
Predictions: tensor([0, 2, 0, 3, 4], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.050, train_a


---------- batch_size_32; lr_0.01; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]
Logits range: [-0.140, 0.144]
Loss: 1.793
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.188, val_loss=1.550, val_acc=0.272]
Logits range: [-2.966, 1.101]
Loss: 1.594
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.130, train_acc=0.700, val_loss=1.380, val_acc=0.506]
Logits range: [-4.606, 4.245]
Loss: 1.416
Predictions: tensor([0, 1, 4, 3, 3], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.592, train_a


---------- batch_size_32; lr_0.01; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]
Logits range: [-0.080, 0.122]
Loss: 1.806
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.777, train_acc=0.688, val_loss=0.990, val_acc=0.616]
Logits range: [-7.907, 6.064]
Loss: 0.849
Predictions: tensor([0, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.804, train_acc=0.750, val_loss=0.735, val_acc=0.737]
Logits range: [-12.004, 11.686]
Loss: 0.998
Predictions: tensor([1, 2, 1, 3, 4], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.446, tr


---------- batch_size_32; lr_0.01; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]
Logits range: [-0.215, 0.176]
Loss: 1.808
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.509, train_acc=0.812, val_loss=0.640, val_acc=0.743]
Logits range: [-4.664, 6.236]
Loss: 0.610
Predictions: tensor([0, 1, 2, 2, 2], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.702, train_acc=0.900, val_loss=0.472, val_acc=0.813]
Logits range: [-6.744, 7.478]
Loss: 0.331
Predictions: tensor([0, 2, 0, 3, 4], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.0597, train_


---------- batch_size_32; lr_0.1; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]
Logits range: [-0.140, 0.144]
Loss: 1.793
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.630, train_acc=0.250, val_loss=1.670, val_acc=0.187]
Logits range: [-1.046, 0.365]
Loss: 1.718
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.680, train_acc=0.200, val_loss=1.690, val_acc=0.198]
Logits range: [-5.776, 0.953]
Loss: 1.777
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.630, train_a


---------- batch_size_32; lr_0.1; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]
Logits range: [-0.080, 0.122]
Loss: 1.806
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.900, train_acc=0.812, val_loss=0.770, val_acc=0.654]
Logits range: [-10.234, 7.563]
Loss: 0.606
Predictions: tensor([0, 1, 1, 2, 1], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.784, train_acc=0.750, val_loss=0.748, val_acc=0.730]
Logits range: [-17.392, 16.372]
Loss: 0.220
Predictions: tensor([0, 2, 0, 3, 4], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.0152, t


---------- batch_size_32; lr_0.1; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]
Logits range: [-0.215, 0.176]
Loss: 1.808
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.580, train_acc=0.333, val_loss=1.380, val_acc=0.274]
Logits range: [-2.676, 8.554]
Loss: 1.368
Predictions: tensor([3, 2, 1, 2, 2], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.450, train_acc=0.350, val_loss=1.480, val_acc=0.418]
Logits range: [-13.544, 7.955]
Loss: 1.870
Predictions: tensor([0, 3, 2, 3, 2], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.280, train_ac


---------- batch_size_64; lr_1e-05; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]
Logits range: [-0.173, 0.149]
Loss: 1.794
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.0667, val_loss=1.800, val_acc=0.161]
Logits range: [-0.129, 0.135]
Loss: 1.790
Predictions: tensor([4, 4, 3, 4, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.300, val_loss=1.800, val_acc=0.159]
Logits range: [-0.143, 0.149]
Loss: 1.796
Predictions: tensor([3, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc


---------- batch_size_64; lr_1e-05; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]
Logits range: [-0.088, 0.123]
Loss: 1.801
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.000, val_loss=1.810, val_acc=0.175]
Logits range: [-0.079, 0.124]
Loss: 1.799
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.200, val_loss=1.800, val_acc=0.177]
Logits range: [-0.081, 0.126]
Loss: 1.809
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_ac


---------- batch_size_64; lr_1e-05; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]
Logits range: [-0.215, 0.182]
Loss: 1.806
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.830, train_acc=0.200, val_loss=1.820, val_acc=0.163]
Logits range: [-0.184, 0.176]
Loss: 1.805
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.200, val_loss=1.810, val_acc=0.172]
Logits range: [-0.181, 0.168]
Loss: 1.814
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.


---------- batch_size_64; lr_0.0001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]
Logits range: [-0.173, 0.149]
Loss: 1.794
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.0833, val_loss=1.750, val_acc=0.199]
Logits range: [-0.403, 0.160]
Loss: 1.739
Predictions: tensor([1, 3, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_acc=0.100, val_loss=1.720, val_acc=0.189]
Logits range: [-0.591, 0.204]
Loss: 1.718
Predictions: tensor([1, 0, 1, 0, 1], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_acc


---------- batch_size_64; lr_0.0001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]
Logits range: [-0.088, 0.123]
Loss: 1.801
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.780, train_acc=0.133, val_loss=1.780, val_acc=0.223]
Logits range: [-0.216, 0.111]
Loss: 1.775
Predictions: tensor([1, 1, 3, 4, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.250, val_loss=1.770, val_acc=0.247]
Logits range: [-0.406, 0.174]
Loss: 1.769
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_a


---------- batch_size_64; lr_0.0001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]
Logits range: [-0.215, 0.182]
Loss: 1.806
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.208, val_loss=1.740, val_acc=0.193]
Logits range: [-0.328, 0.149]
Loss: 1.738
Predictions: tensor([0, 3, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_acc=0.400, val_loss=1.720, val_acc=0.197]
Logits range: [-0.516, 0.209]
Loss: 1.723
Predictions: tensor([0, 1, 1, 0, 1], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.720, train_acc=0


---------- batch_size_64; lr_0.001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]
Logits range: [-0.173, 0.149]
Loss: 1.794
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.400, train_acc=0.375, val_loss=1.430, val_acc=0.336]
Logits range: [-2.216, 1.854]
Loss: 1.479
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.380, train_acc=0.500, val_loss=1.320, val_acc=0.445]
Logits range: [-2.637, 2.337]
Loss: 1.353
Predictions: tensor([4, 1, 4, 0, 2], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.100, train_acc=0


---------- batch_size_64; lr_0.001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]
Logits range: [-0.088, 0.123]
Loss: 1.801
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.280, train_acc=0.375, val_loss=1.510, val_acc=0.329]
Logits range: [-4.038, 2.960]
Loss: 1.540
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.310, train_acc=0.400, val_loss=1.400, val_acc=0.412]
Logits range: [-5.012, 3.538]
Loss: 1.453
Predictions: tensor([0, 0, 0, 0, 2], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.310, train_ac


---------- batch_size_64; lr_0.001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]
Logits range: [-0.215, 0.182]
Loss: 1.806
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.350, train_acc=0.375, val_loss=1.410, val_acc=0.353]
Logits range: [-2.369, 1.914]
Loss: 1.429
Predictions: tensor([0, 0, 1, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.350, train_acc=0.500, val_loss=1.260, val_acc=0.499]
Logits range: [-2.822, 2.508]
Loss: 1.289
Predictions: tensor([0, 0, 0, 0, 2], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.030, train_acc=0.


---------- batch_size_64; lr_0.01; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]
Logits range: [-0.173, 0.149]
Loss: 1.794
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.540, train_acc=0.312, val_loss=1.480, val_acc=0.330]
Logits range: [-3.942, 2.109]
Loss: 1.393
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.430, train_acc=0.550, val_loss=1.060, val_acc=0.530]
Logits range: [-4.447, 3.408]
Loss: 1.100
Predictions: tensor([1, 0, 0, 0, 1], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=0.554, train_acc=0.


---------- batch_size_64; lr_0.01; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]
Logits range: [-0.088, 0.123]
Loss: 1.801
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=0.594, train_acc=0.688, val_loss=1.130, val_acc=0.491]
Logits range: [-10.202, 7.742]
Loss: 1.135
Predictions: tensor([0, 0, 0, 0, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=0.912, train_acc=0.650, val_loss=0.632, val_acc=0.728]
Logits range: [-9.964, 10.208]
Loss: 0.604
Predictions: tensor([1, 1, 1, 0, 2], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=0.122, train_a


---------- batch_size_64; lr_0.01; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]
Logits range: [-0.215, 0.182]
Loss: 1.806
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=0.587, train_acc=0.812, val_loss=0.674, val_acc=0.699]
Logits range: [-4.213, 4.670]
Loss: 0.545
Predictions: tensor([0, 0, 1, 0, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=0.707, train_acc=0.750, val_loss=0.481, val_acc=0.778]
Logits range: [-6.526, 5.708]
Loss: 0.368
Predictions: tensor([1, 1, 1, 0, 2], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=0.0513, train_acc=1.


---------- batch_size_64; lr_0.1; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]
Logits range: [-0.173, 0.149]
Loss: 1.794
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.650, train_acc=0.250, val_loss=1.700, val_acc=0.177]
Logits range: [-0.967, 0.479]
Loss: 1.683
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.720, train_acc=0.200, val_loss=1.690, val_acc=0.177]
Logits range: [-1.296, 0.527]
Loss: 1.668
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.


---------- batch_size_64; lr_0.1; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]
Logits range: [-0.088, 0.123]
Loss: 1.801
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=0.754, train_acc=0.812, val_loss=1.130, val_acc=0.473]         Logits range: [-9.068, 11.713]
Loss: 1.073
Predictions: tensor([0, 2, 0, 0, 2], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.070, train_acc=0.650, val_loss=1.070, val_acc=0.608]         Logits range: [-14.368, 19.831]
Loss: 0.822
Predictions: tensor([3, 1, 3, 0, 2], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=0.415, train_a


---------- batch_size_64; lr_0.1; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]                             Logits range: [-0.215, 0.182]
Loss: 1.806
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.360, train_acc=0.500, val_loss=1.350, val_acc=0.410]         Logits range: [-3.824, 4.143]
Loss: 1.360
Predictions: tensor([1, 1, 1, 0, 1], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.620, train_acc=0.300, val_loss=1.300, val_acc=0.411]         Logits range: [-8.770, 6.445]
Loss: 1.297
Predictions: tensor([1, 1, 1, 0, 1], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.170, train_acc=0.55


---------- batch_size_128; lr_1e-05; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.182]
Loss: 1.798
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.0667, val_loss=1.800, val_acc=0.159]         Logits range: [-0.130, 0.142]
Loss: 1.793
Predictions: tensor([4, 4, 4, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.300, val_loss=1.800, val_acc=0.160]         Logits range: [-0.159, 0.152]
Loss: 1.792
Predictions: tensor([3, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc


---------- batch_size_128; lr_1e-05; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.087, 0.141]
Loss: 1.805
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.000, val_loss=1.810, val_acc=0.170]         Logits range: [-0.082, 0.125]
Loss: 1.800
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.200, val_loss=1.810, val_acc=0.171]         Logits range: [-0.082, 0.128]
Loss: 1.802
Predictions: tensor([4, 4, 4, 3, 4], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_a


---------- batch_size_128; lr_1e-05; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.215, 0.192]
Loss: 1.812
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.830, train_acc=0.200, val_loss=1.820, val_acc=0.166]         Logits range: [-0.192, 0.197]
Loss: 1.808
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.200, val_loss=1.810, val_acc=0.165]         Logits range: [-0.210, 0.177]
Loss: 1.804
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0


---------- batch_size_128; lr_0.0001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.182]
Loss: 1.798
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.0833, val_loss=1.760, val_acc=0.193]         Logits range: [-0.342, 0.136]
Loss: 1.751
Predictions: tensor([1, 3, 0, 3, 1], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.720, train_acc=0.150, val_loss=1.740, val_acc=0.194]         Logits range: [-0.488, 0.186]
Loss: 1.735
Predictions: tensor([1, 0, 1, 1, 3], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_ac


---------- batch_size_128; lr_0.0001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.087, 0.141]
Loss: 1.805
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.0667, val_loss=1.790, val_acc=0.197]         Logits range: [-0.154, 0.112]
Loss: 1.782
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.250, val_loss=1.780, val_acc=0.224]         Logits range: [-0.271, 0.115]
Loss: 1.774
Predictions: tensor([4, 1, 3, 0, 4], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.720, train


---------- batch_size_128; lr_0.0001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.215, 0.192]
Loss: 1.812
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.780, train_acc=0.167, val_loss=1.760, val_acc=0.207]         Logits range: [-0.251, 0.138]
Loss: 1.755
Predictions: tensor([0, 3, 0, 1, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.720, train_acc=0.300, val_loss=1.740, val_acc=0.185]         Logits range: [-0.401, 0.168]
Loss: 1.734
Predictions: tensor([0, 1, 1, 0, 0], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.720, train_acc=


---------- batch_size_128; lr_0.001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.182]
Loss: 1.798
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.400, train_acc=0.375, val_loss=1.470, val_acc=0.311]         Logits range: [-2.073, 1.705]
Loss: 1.491
Predictions: tensor([0, 0, 0, 0, 2], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.390, train_acc=0.400, val_loss=1.360, val_acc=0.415]         Logits range: [-2.474, 2.164]
Loss: 1.317
Predictions: tensor([0, 1, 0, 0, 0], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.130, train_acc=


---------- batch_size_128; lr_0.001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.087, 0.141]
Loss: 1.805
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.360, train_acc=0.375, val_loss=1.570, val_acc=0.305]         Logits range: [-3.811, 2.383]
Loss: 1.581
Predictions: tensor([0, 0, 0, 0, 2], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.390, train_acc=0.300, val_loss=1.460, val_acc=0.354]         Logits range: [-4.625, 3.303]
Loss: 1.435
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.450, train_a


---------- batch_size_128; lr_0.001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.215, 0.192]
Loss: 1.812
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.390, train_acc=0.375, val_loss=1.450, val_acc=0.320]         Logits range: [-2.123, 1.613]
Loss: 1.451
Predictions: tensor([0, 0, 0, 0, 2], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.400, train_acc=0.600, val_loss=1.330, val_acc=0.445]         Logits range: [-2.731, 2.172]
Loss: 1.308
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.160, train_acc=0


---------- batch_size_128; lr_0.01; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.182]
Loss: 1.798
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.660, train_acc=0.250, val_loss=1.620, val_acc=0.250]         Logits range: [-3.616, 1.857]
Loss: 1.497
Predictions: tensor([1, 2, 2, 1, 2], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.030, train_acc=0.750, val_loss=1.500, val_acc=0.390]         Logits range: [-4.645, 2.791]
Loss: 1.393
Predictions: tensor([3, 3, 3, 0, 1], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=0.879, train_acc=0


---------- batch_size_128; lr_0.01; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.087, 0.141]
Loss: 1.805
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.090, train_acc=0.562, val_loss=1.410, val_acc=0.365]         Logits range: [-5.061, 6.336]
Loss: 1.403
Predictions: tensor([2, 2, 2, 2, 2], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.600, val_loss=0.984, val_acc=0.616]         Logits range: [-11.821, 7.875]
Loss: 0.894
Predictions: tensor([1, 4, 1, 0, 0], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=0.451, train_a


---------- batch_size_128; lr_0.01; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.215, 0.192]
Loss: 1.812
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=0.878, train_acc=0.812, val_loss=1.030, val_acc=0.530]         Logits range: [-4.573, 3.878]
Loss: 0.946
Predictions: tensor([0, 0, 1, 0, 2], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=0.842, train_acc=0.650, val_loss=0.606, val_acc=0.714]         Logits range: [-5.619, 5.447]
Loss: 0.497
Predictions: tensor([1, 1, 1, 0, 0], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=0.135, train_acc=0.


---------- batch_size_128; lr_0.1; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.182]
Loss: 1.798
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=3.050, train_acc=0.250, val_loss=1.740, val_acc=0.173]         Logits range: [-0.713, 0.526]
Loss: 1.702
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_acc=0.200, val_loss=1.730, val_acc=0.173]         Logits range: [-0.962, 0.832]
Loss: 1.691
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.


---------- batch_size_128; lr_0.1; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.087, 0.141]
Loss: 1.805
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.240, train_acc=0.750, val_loss=1.370, val_acc=0.370]         Logits range: [-8.937, 9.148]
Loss: 1.342
Predictions: tensor([0, 2, 2, 2, 2], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.080, train_acc=0.550, val_loss=0.935, val_acc=0.579]         Logits range: [-14.924, 13.272]
Loss: 0.683
Predictions: tensor([1, 1, 1, 0, 0], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=0.058, train_a


---------- batch_size_128; lr_0.1; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/35 [00:00<?, ?it/s]                             Logits range: [-0.215, 0.192]
Loss: 1.812
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 0, 3, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.310, train_acc=0.500, val_loss=1.580, val_acc=0.286]         Logits range: [-2.875, 6.396]
Loss: 1.511
Predictions: tensor([2, 2, 2, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.300, val_loss=1.410, val_acc=0.377]         Logits range: [-2.960, 5.232]
Loss: 1.328
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 0], device='cuda:0')
Epoch 3:   0%|          | 0/35 [00:00<?, ?it/s, v_num=0, train_loss=1.350, train_acc=0.6


---------- batch_size_256; lr_1e-05; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.182]
Loss: 1.789
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.0667, val_loss=1.800, val_acc=0.158]         Logits range: [-0.149, 0.152]
Loss: 1.795
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.250, val_loss=1.800, val_acc=0.157]         Logits range: [-0.160, 0.163]
Loss: 1.787
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc


---------- batch_size_256; lr_1e-05; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]                             Logits range: [-0.087, 0.141]
Loss: 1.797
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.000, val_loss=1.810, val_acc=0.168]         Logits range: [-0.080, 0.120]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.200, val_loss=1.810, val_acc=0.168]         Logits range: [-0.084, 0.129]
Loss: 1.799
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_a


---------- batch_size_256; lr_1e-05; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]                             Logits range: [-0.215, 0.209]
Loss: 1.799
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.830, train_acc=0.200, val_loss=1.820, val_acc=0.146]         Logits range: [-0.196, 0.200]
Loss: 1.812
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.200, val_loss=1.820, val_acc=0.163]         Logits range: [-0.211, 0.181]
Loss: 1.802
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0


---------- batch_size_256; lr_0.0001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]
Logits range: [-0.173, 0.182]
Loss: 1.789
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.0667, val_loss=1.770, val_acc=0.188]
Logits range: [-0.277, 0.132]
Loss: 1.761
Predictions: tensor([1, 3, 3, 3, 0], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.150, val_loss=1.750, val_acc=0.190]
Logits range: [-0.390, 0.179]
Loss: 1.740
Predictions: tensor([1, 3, 0, 1, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.720, train_ac

---------- batch_size_256; lr_0.0001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]
Logits range: [-0.087, 0.141]
Loss: 1.797
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.000, val_loss=1.790, val_acc=0.183]
Logits range: [-0.096, 0.108]
Loss: 1.791
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.200, val_loss=1.790, val_acc=0.192]
Logits range: [-0.161, 0.107]
Loss: 1.779
Predictions: tensor([4, 4, 3, 4, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_a

---------- batch_size_256; lr_0.0001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]
Logits range: [-0.215, 0.209]
Loss: 1.799
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.200, val_loss=1.770, val_acc=0.201]
Logits range: [-0.190, 0.146]
Loss: 1.769
Predictions: tensor([3, 3, 0, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.400, val_loss=1.750, val_acc=0.199]
Logits range: [-0.295, 0.129]
Loss: 1.747
Predictions: tensor([3, 3, 1, 0, 0], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0

---------- batch_size_256; lr_0.001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]
Logits range: [-0.173, 0.182]
Loss: 1.789
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.450, train_acc=0.312, val_loss=1.540, val_acc=0.282]
Logits range: [-1.803, 1.404]
Loss: 1.537
Predictions: tensor([0, 0, 0, 2, 0], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.500, train_acc=0.400, val_loss=1.440, val_acc=0.357]
Logits range: [-2.168, 1.749]
Loss: 1.432
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.180, train_acc=

---------- batch_size_256; lr_0.001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]
Logits range: [-0.087, 0.141]
Loss: 1.797
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.620, train_acc=0.250, val_loss=1.660, val_acc=0.230]
Logits range: [-3.395, 1.132]
Loss: 1.667
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.600, train_acc=0.300, val_loss=1.640, val_acc=0.241]
Logits range: [-4.108, 2.233]
Loss: 1.633
Predictions: tensor([0, 2, 2, 2, 0], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.560, train_ac

---------- batch_size_256; lr_0.001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]
Logits range: [-0.215, 0.209]
Loss: 1.799
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.470, train_acc=0.375, val_loss=1.530, val_acc=0.292]
Logits range: [-1.734, 1.152]
Loss: 1.527
Predictions: tensor([0, 0, 0, 2, 0], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.480, train_acc=0.450, val_loss=1.420, val_acc=0.371]
Logits range: [-2.304, 1.725]
Loss: 1.400
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.300, train_acc=0.

---------- batch_size_256; lr_0.01; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]
Logits range: [-0.173, 0.182]
Loss: 1.789
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.910, train_acc=0.188, val_loss=1.650, val_acc=0.259]
Logits range: [-2.352, 1.597]
Loss: 1.622
Predictions: tensor([2, 2, 2, 2, 2], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.520, train_acc=0.300, val_loss=1.530, val_acc=0.310]
Logits range: [-3.496, 3.124]
Loss: 1.506
Predictions: tensor([3, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.370, train_acc=0

---------- batch_size_256; lr_0.01; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]
Logits range: [-0.087, 0.141]
Loss: 1.797
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.530, train_acc=0.375, val_loss=1.430, val_acc=0.402]
Logits range: [-3.837, 2.555]
Loss: 1.426
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.050, train_acc=0.550, val_loss=1.190, val_acc=0.462]
Logits range: [-8.427, 7.361]
Loss: 1.123
Predictions: tensor([0, 1, 0, 1, 0], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=0.488, train_ac

---------- batch_size_256; lr_0.01; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]
Logits range: [-0.215, 0.209]
Loss: 1.799
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.410, train_acc=0.500, val_loss=1.350, val_acc=0.384]
Logits range: [-3.544, 2.137]
Loss: 1.328
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=0.909, train_acc=0.700, val_loss=1.000, val_acc=0.592]
Logits range: [-4.226, 4.278]
Loss: 0.854
Predictions: tensor([1, 1, 1, 3, 0], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=0.337, train_acc=0.

---------- batch_size_256; lr_0.1; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]
Logits range: [-0.173, 0.182]
Loss: 1.789
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.680, train_acc=0.188, val_loss=2.880, val_acc=0.169]
Logits range: [-2.285, 7.531]
Loss: 2.987
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.200, val_loss=1.690, val_acc=0.171]
Logits range: [-0.720, 0.447]
Loss: 1.697
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.

---------- batch_size_256; lr_0.1; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]
Logits range: [-0.087, 0.141]
Loss: 1.797
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.080, train_acc=0.450, val_loss=1.660, val_acc=0.307]
Logits range: [-5.880, 4.791]
Loss: 1.538
Predictions: tensor([2, 2, 2, 2, 2], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.240, train_acc=0.600, val_loss=1.170, val_acc=0.463]
Logits range: [-11.786, 9.783]
Loss: 1.008
Predictions: tensor([3, 3, 1, 3, 1], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=0.426, train_ac

---------- batch_size_256; lr_0.1; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max ----------
Epoch 0:   0%|          | 0/18 [00:00<?, ?it/s]
Logits range: [-0.215, 0.209]
Loss: 1.799
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([1, 0, 3, 3, 0], device='cuda:0')
Epoch 1:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.660, train_acc=0.250, val_loss=1.840, val_acc=0.169]
Logits range: [-6.856, 6.751]
Loss: 1.766
Predictions: tensor([2, 2, 2, 2, 2], device='cuda:0')
True labels: tensor([0, 0, 0, 0, 1], device='cuda:0')
Epoch 2:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=2.150, train_acc=0.200, val_loss=1.480, val_acc=0.284]
Logits range: [-2.826, 5.354]
Loss: 1.458
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/18 [00:00<?, ?it/s, v_num=0, train_loss=1.460, train_acc=0.37

---------- batch_size_512; lr_1e-05; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.173, 0.182]
Loss: 1.795
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.180, val_loss=1.800, val_acc=0.152]
Logits range: [-0.151, 0.170]
Loss: 1.795
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.185, val_loss=1.800, val_acc=0.158]
Logits range: [-0.170, 0.165]
Loss: 1.790
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.152,

---------- batch_size_512; lr_1e-05; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.084, 0.131]
Loss: 1.802
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.168]
Logits range: [-0.081, 0.130]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.168]
Logits range: [-0.089, 0.137]
Loss: 1.801
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.1

---------- batch_size_512; lr_1e-05; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.227, 0.214]
Loss: 1.810
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.165, val_loss=1.820, val_acc=0.152]
Logits range: [-0.202, 0.201]
Loss: 1.811
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.163, val_loss=1.820, val_acc=0.155]
Logits range: [-0.233, 0.198]
Loss: 1.806
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.176,

---------- batch_size_512; lr_0.0001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.173, 0.182]
Loss: 1.795
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.157, val_loss=1.780, val_acc=0.173]
Logits range: [-0.246, 0.142]
Loss: 1.770
Predictions: tensor([4, 3, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.198, val_loss=1.760, val_acc=0.192]
Logits range: [-0.312, 0.161]
Loss: 1.752
Predictions: tensor([1, 3, 0, 1, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.19

---------- batch_size_512; lr_0.0001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.084, 0.131]
Loss: 1.802
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.172, val_loss=1.800, val_acc=0.170]
Logits range: [-0.076, 0.112]
Loss: 1.795
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.175, val_loss=1.790, val_acc=0.176]
Logits range: [-0.108, 0.117]
Loss: 1.786
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.780, train_acc=0

---------- batch_size_512; lr_0.0001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.227, 0.214]
Loss: 1.810
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.183, val_loss=1.790, val_acc=0.181]
Logits range: [-0.181, 0.153]
Loss: 1.779
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.256, val_loss=1.770, val_acc=0.202]
Logits range: [-0.224, 0.138]
Loss: 1.761
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.251

---------- batch_size_512; lr_0.001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.173, 0.182]
Loss: 1.795
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.610, train_acc=0.199, val_loss=1.620, val_acc=0.250]
Logits range: [-1.638, 0.821]
Loss: 1.622
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.460, train_acc=0.301, val_loss=1.510, val_acc=0.292]
Logits range: [-1.925, 1.448]
Loss: 1.489
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.460, train_acc=0.325

---------- batch_size_512; lr_0.001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.084, 0.131]
Loss: 1.802
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.660, train_acc=0.237, val_loss=1.690, val_acc=0.169]
Logits range: [-2.467, 1.164]
Loss: 1.687
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.660, train_acc=0.186, val_loss=1.650, val_acc=0.234]
Logits range: [-3.925, 1.593]
Loss: 1.622
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.610, train_acc=0.3


---------- batch_size_512; lr_0.001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.227, 0.214]
Loss: 1.810
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.590, train_acc=0.275, val_loss=1.600, val_acc=0.265]
Logits range: [-1.396, 0.717]
Loss: 1.600
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.430, train_acc=0.338, val_loss=1.470, val_acc=0.320]
Logits range: [-1.975, 1.448]
Loss: 1.453
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.420, train_acc=0.353,


---------- batch_size_512; lr_0.01; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.173, 0.182]
Loss: 1.795
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.590, train_acc=0.241, val_loss=1.670, val_acc=0.184]
Logits range: [-3.148, 1.778]
Loss: 1.671
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.450, train_acc=0.316, val_loss=1.550, val_acc=0.256]
Logits range: [-3.558, 2.019]
Loss: 1.524
Predictions: tensor([0, 1, 2, 1, 0], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.410, train_acc=0.316,


---------- batch_size_512; lr_0.01; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.084, 0.131]
Loss: 1.802
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.640, train_acc=0.244, val_loss=1.800, val_acc=0.169]
Logits range: [-5.389, 3.944]
Loss: 1.754
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.390, train_acc=0.386, val_loss=1.430, val_acc=0.337]
Logits range: [-6.226, 3.413]
Loss: 1.447
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.050, train_acc=0.5


---------- batch_size_512; lr_0.01; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.227, 0.214]
Loss: 1.810
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.720, train_acc=0.207, val_loss=1.560, val_acc=0.309]
Logits range: [-3.367, 1.191]
Loss: 1.534
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.210, train_acc=0.455, val_loss=1.320, val_acc=0.383]
Logits range: [-4.607, 2.084]
Loss: 1.266
Predictions: tensor([0, 1, 1, 1, 0], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.050, train_acc=0.529, 


---------- batch_size_512; lr_0.1; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.173, 0.182]
Loss: 1.795
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.175, val_loss=1.760, val_acc=0.169]
Logits range: [-0.135, 0.295]
Loss: 1.761
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.750, train_acc=0.167, val_loss=2.490, val_acc=0.169]
Logits range: [-1.856, 3.051]
Loss: 2.238
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_acc=0.167, 


---------- batch_size_512; lr_0.1; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.084, 0.131]
Loss: 1.802
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.400, train_acc=0.406, val_loss=1.400, val_acc=0.284]
Logits range: [-8.156, 5.908]
Loss: 1.385
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.010, train_acc=0.486, val_loss=0.999, val_acc=0.497]
Logits range: [-15.423, 14.456]
Loss: 0.859
Predictions: tensor([2, 1, 1, 1, 0], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=0.650, train_acc=0.


---------- batch_size_512; lr_0.1; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/9 [00:00<?, ?it/s]
Logits range: [-0.227, 0.214]
Loss: 1.810
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 0, 3, 3], device='cuda:0')
Epoch 1:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.910, train_acc=0.167, val_loss=1.710, val_acc=0.167]
Logits range: [-0.865, 1.487]
Loss: 1.685
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 0], device='cuda:0')
Epoch 2:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.650, train_acc=0.164, val_loss=1.690, val_acc=0.169]
Logits range: [-1.794, 1.678]
Loss: 1.673
Predictions: tensor([0, 2, 2, 2, 2], device='cuda:0')
True labels: tensor([2, 1, 1, 1, 0], device='cuda:0')
Epoch 3:   0%|          | 0/9 [00:00<?, ?it/s, v_num=0, train_loss=1.650, train_acc=0.167, v


---------- batch_size_1024; lr_1e-05; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]
Logits range: [-0.173, 0.229]
Loss: 1.798
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.174, val_loss=1.800, val_acc=0.155]
Logits range: [-0.163, 0.202]
Loss: 1.796
Predictions: tensor([4, 4, 4, 3, 4], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.185, val_loss=1.800, val_acc=0.155]
Logits range: [-0.171, 0.227]
Loss: 1.792
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.16


---------- batch_size_1024; lr_1e-05; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]
Logits range: [-0.084, 0.131]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.169]
Logits range: [-0.082, 0.131]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.169]
Logits range: [-0.086, 0.130]
Loss: 1.802
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.


---------- batch_size_1024; lr_1e-05; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]
Logits range: [-0.227, 0.224]
Loss: 1.815
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.165, val_loss=1.820, val_acc=0.155]
Logits range: [-0.215, 0.204]
Loss: 1.812
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.163, val_loss=1.820, val_acc=0.153]
Logits range: [-0.233, 0.203]
Loss: 1.808
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.176


---------- batch_size_1024; lr_0.0001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]
Logits range: [-0.173, 0.229]
Loss: 1.798
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.780, train_acc=0.132, val_loss=1.790, val_acc=0.159]
Logits range: [-0.208, 0.198]
Loss: 1.777
Predictions: tensor([4, 4, 4, 3, 4], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.170, val_loss=1.780, val_acc=0.182]
Logits range: [-0.260, 0.205]
Loss: 1.762
Predictions: tensor([4, 1, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.1


---------- batch_size_1024; lr_0.0001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]
Logits range: [-0.084, 0.131]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.172, val_loss=1.800, val_acc=0.168]
Logits range: [-0.080, 0.120]
Loss: 1.797
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.167, val_loss=1.800, val_acc=0.169]
Logits range: [-0.081, 0.115]
Loss: 1.793
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=


---------- batch_size_1024; lr_0.0001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]
Logits range: [-0.227, 0.224]
Loss: 1.815
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.168, val_loss=1.800, val_acc=0.179]
Logits range: [-0.202, 0.176]
Loss: 1.788
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.780, train_acc=0.194, val_loss=1.780, val_acc=0.187]
Logits range: [-0.212, 0.154]
Loss: 1.772
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.217


---------- batch_size_1024; lr_0.001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]
Logits range: [-0.173, 0.229]
Loss: 1.798
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.660, train_acc=0.220, val_loss=1.660, val_acc=0.204]
Logits range: [-1.210, 0.505]
Loss: 1.653
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.630, train_acc=0.179, val_loss=1.600, val_acc=0.212]
Logits range: [-1.667, 1.071]
Loss: 1.587
Predictions: tensor([2, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.590, train_acc=0.26


---------- batch_size_1024; lr_0.001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]
Logits range: [-0.084, 0.131]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.227, val_loss=1.730, val_acc=0.230]
Logits range: [-1.132, 0.477]
Loss: 1.726
Predictions: tensor([1, 1, 1, 0, 0], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.690, train_acc=0.175, val_loss=1.680, val_acc=0.213]
Logits range: [-2.799, 1.075]
Loss: 1.660
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.670, train_acc=0


---------- batch_size_1024; lr_0.001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]
Logits range: [-0.227, 0.224]
Loss: 1.815
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.650, train_acc=0.243, val_loss=1.650, val_acc=0.207]
Logits range: [-1.042, 0.474]
Loss: 1.644
Predictions: tensor([0, 1, 1, 0, 0], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.580, train_acc=0.283, val_loss=1.590, val_acc=0.259]
Logits range: [-1.478, 0.939]
Loss: 1.569
Predictions: tensor([2, 0, 2, 1, 0], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.520, train_acc=0.289


---------- batch_size_1024; lr_0.01; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]
Logits range: [-0.173, 0.229]
Loss: 1.798
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.660, train_acc=0.193, val_loss=1.730, val_acc=0.188]
Logits range: [-3.446, 1.401]
Loss: 1.696
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.970, train_acc=0.167, val_loss=1.660, val_acc=0.169]
Logits range: [-1.796, 0.687]
Loss: 1.650
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.830, train_acc=0.170


---------- batch_size_1024; lr_0.01; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]
Logits range: [-0.084, 0.131]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.640, train_acc=0.177, val_loss=1.700, val_acc=0.170]
Logits range: [-4.678, 3.708]
Loss: 1.682
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.680, train_acc=0.185, val_loss=1.650, val_acc=0.262]
Logits range: [-4.643, 3.735]
Loss: 1.583
Predictions: tensor([2, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.570, train_acc=0.


---------- batch_size_1024; lr_0.01; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]
Logits range: [-0.227, 0.224]
Loss: 1.815
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=2.230, train_acc=0.167, val_loss=1.910, val_acc=0.169]
Logits range: [-3.044, 1.952]
Loss: 1.831
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.390, train_acc=0.424, val_loss=2.260, val_acc=0.187]
Logits range: [-4.775, 2.996]
Loss: 2.312
Predictions: tensor([2, 2, 2, 2, 2], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.400, train_acc=0.307,


---------- batch_size_1024; lr_0.1; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]
Logits range: [-0.173, 0.229]
Loss: 1.798
Predictions: tensor([4, 3, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=8.390, train_acc=0.145, val_loss=5.870, val_acc=0.153]
Logits range: [-6.987, 16.398]
Loss: 6.000
Predictions: tensor([1, 1, 0, 1, 1], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=2.200, train_acc=0.167, val_loss=1.830, val_acc=0.183]        Logits range: [-3.517, 3.904]
Loss: 1.765
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.168


---------- batch_size_1024; lr_0.1; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]                              Logits range: [-0.084, 0.131]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=2.150, train_acc=0.223, val_loss=1.760, val_acc=0.186]        Logits range: [-4.293, 3.977]
Loss: 1.779
Predictions: tensor([2, 1, 3, 2, 2], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.330, train_acc=0.387, val_loss=1.340, val_acc=0.398]        Logits range: [-8.124, 9.050]
Loss: 1.275
Predictions: tensor([1, 1, 1, 3, 1], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.150, train_acc=0.3


---------- batch_size_1024; lr_0.1; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/5 [00:00<?, ?it/s]                              Logits range: [-0.227, 0.224]
Loss: 1.815
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 2, 1, 0, 0], device='cuda:0')
Epoch 1:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=19.90, train_acc=0.167, val_loss=4.520, val_acc=0.168]        Logits range: [-8.556, 9.430]
Loss: 4.688
Predictions: tensor([0, 2, 2, 2, 2], device='cuda:0')
True labels: tensor([0, 2, 3, 0, 0], device='cuda:0')
Epoch 2:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.167, val_loss=1.760, val_acc=0.167]        Logits range: [-5.410, 11.157]
Loss: 1.723
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 2, 3, 3, 1], device='cuda:0')
Epoch 3:   0%|          | 0/5 [00:00<?, ?it/s, v_num=0, train_loss=2.390, train_acc=0.167,


---------- batch_size_2048; lr_1e-05; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.229]
Loss: 1.797
Predictions: tensor([4, 4, 3, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.174, val_loss=1.800, val_acc=0.157]        Logits range: [-0.170, 0.202]
Loss: 1.797
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.178, val_loss=1.800, val_acc=0.156]        Logits range: [-0.171, 0.227]
Loss: 1.796
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.159


---------- batch_size_2048; lr_1e-05; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.084, 0.134]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.167]        Logits range: [-0.082, 0.138]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.167, val_loss=1.810, val_acc=0.167]        Logits range: [-0.083, 0.129]
Loss: 1.804
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.


---------- batch_size_2048; lr_1e-05; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.227, 0.224]
Loss: 1.814
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.165, val_loss=1.820, val_acc=0.153]        Logits range: [-0.226, 0.207]
Loss: 1.812
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.820, train_acc=0.163, val_loss=1.820, val_acc=0.157]        Logits range: [-0.233, 0.221]
Loss: 1.812
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.176,


---------- batch_size_2048; lr_0.0001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.229]
Loss: 1.797
Predictions: tensor([4, 4, 3, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.158, val_loss=1.790, val_acc=0.153]        Logits range: [-0.181, 0.199]
Loss: 1.782
Predictions: tensor([3, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.780, train_acc=0.188, val_loss=1.780, val_acc=0.169]        Logits range: [-0.217, 0.209]
Loss: 1.773
Predictions: tensor([1, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.17


---------- batch_size_2048; lr_0.0001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.084, 0.134]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.172, val_loss=1.800, val_acc=0.167]        Logits range: [-0.081, 0.131]
Loss: 1.800
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.800, train_acc=0.167, val_loss=1.800, val_acc=0.167]        Logits range: [-0.081, 0.116]
Loss: 1.798
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0


---------- batch_size_2048; lr_0.0001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.227, 0.224]
Loss: 1.814
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.810, train_acc=0.168, val_loss=1.810, val_acc=0.182]        Logits range: [-0.216, 0.185]
Loss: 1.795
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.790, train_acc=0.183, val_loss=1.790, val_acc=0.181]        Logits range: [-0.218, 0.194]
Loss: 1.784
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.780, train_acc=0.189


---------- batch_size_2048; lr_0.001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.229]
Loss: 1.797
Predictions: tensor([4, 4, 3, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_acc=0.172, val_loss=1.690, val_acc=0.179]        Logits range: [-0.794, 0.338]
Loss: 1.685
Predictions: tensor([1, 0, 1, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.640, train_acc=0.239, val_loss=1.650, val_acc=0.173]        Logits range: [-1.375, 0.755]
Loss: 1.644
Predictions: tensor([2, 2, 2, 2, 2], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.620, train_acc=0.294


---------- batch_size_2048; lr_0.001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.084, 0.134]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.770, train_acc=0.236, val_loss=1.770, val_acc=0.232]        Logits range: [-0.469, 0.218]
Loss: 1.762
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.224, val_loss=1.730, val_acc=0.211]        Logits range: [-1.560, 0.649]
Loss: 1.713
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.690, train_acc=0.


---------- batch_size_2048; lr_0.001; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.227, 0.224]
Loss: 1.814
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_acc=0.248, val_loss=1.690, val_acc=0.198]        Logits range: [-0.785, 0.344]
Loss: 1.681
Predictions: tensor([1, 0, 0, 1, 1], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.640, train_acc=0.252, val_loss=1.640, val_acc=0.249]        Logits range: [-1.202, 0.573]
Loss: 1.627
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.610, train_acc=0.325,


---------- batch_size_2048; lr_0.01; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.229]
Loss: 1.797
Predictions: tensor([4, 4, 3, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=2.070, train_acc=0.167, val_loss=1.830, val_acc=0.170]        Logits range: [-2.439, 1.175]
Loss: 1.936
Predictions: tensor([1, 1, 1, 1, 5], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_acc=0.214, val_loss=1.670, val_acc=0.169]        Logits range: [-2.119, 1.509]
Loss: 1.701
Predictions: tensor([2, 2, 2, 2, 2], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.640, train_acc=0.233,


---------- batch_size_2048; lr_0.01; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.084, 0.134]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_acc=0.202, val_loss=1.750, val_acc=0.167]        Logits range: [-3.866, 2.985]
Loss: 1.764
Predictions: tensor([2, 2, 2, 2, 2], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.630, train_acc=0.202, val_loss=1.620, val_acc=0.233]        Logits range: [-7.058, 4.407]
Loss: 1.577
Predictions: tensor([0, 0, 2, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.700, train_acc=0.1


---------- batch_size_2048; lr_0.01; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.227, 0.224]
Loss: 1.814
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.213, val_loss=1.510, val_acc=0.255]        Logits range: [-2.755, 1.443]
Loss: 1.509
Predictions: tensor([1, 1, 0, 1, 1], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.520, train_acc=0.214, val_loss=1.610, val_acc=0.263]        Logits range: [-3.605, 1.609]
Loss: 1.590
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.520, train_acc=0.296, 


---------- batch_size_2048; lr_0.1; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.229]
Loss: 1.797
Predictions: tensor([4, 4, 3, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=16.80, train_acc=0.135, val_loss=9.890, val_acc=0.163]        Logits range: [-62.734, 127.061]
Loss: 10.412
Predictions: tensor([1, 0, 1, 2, 2], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=2.460, train_acc=0.189, val_loss=1.870, val_acc=0.156]        Logits range: [-30.310, 134.136]
Loss: 1.809
Predictions: tensor([1, 3, 2, 0, 3], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.850, train_acc=


---------- batch_size_2048; lr_0.1; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.084, 0.134]
Loss: 1.803
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=2.950, train_acc=0.167, val_loss=1.920, val_acc=0.150]        Logits range: [-3.345, 2.654]
Loss: 1.988
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.640, train_acc=0.251, val_loss=1.620, val_acc=0.244]        Logits range: [-5.483, 2.966]
Loss: 1.619
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.480, train_acc=0.31


---------- batch_size_2048; lr_0.1; optimizer_Adagrad; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/3 [00:00<?, ?it/s]                             Logits range: [-0.227, 0.224]
Loss: 1.814
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 2, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=61.60, train_acc=0.167, val_loss=27.60, val_acc=0.167]        Logits range: [-10.474, 53.203]
Loss: 26.378
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 0, 2, 3], device='cuda:0')
Epoch 2:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=2.730, train_acc=0.167, val_loss=2.460, val_acc=0.165]        Logits range: [-10.998, 10.935]
Loss: 2.353
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 0, 3, 2, 3], device='cuda:0')
Epoch 3:   0%|          | 0/3 [00:00<?, ?it/s, v_num=0, train_loss=1.860, train_acc=0.1


---------- batch_size_32; lr_1e-05; optimizer_RMSprop; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.140, 0.144]
Loss: 1.793
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.720, train_acc=0.000, val_loss=1.730, val_acc=0.197]          Logits range: [-0.532, 0.180]
Loss: 1.722
Predictions: tensor([1, 1, 1, 0, 1], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.700, train_acc=0.200, val_loss=1.690, val_acc=0.201]          Logits range: [-0.875, 0.289]
Loss: 1.695
Predictions: tensor([0, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.700, train_


---------- batch_size_32; lr_1e-05; optimizer_RMSprop; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.080, 0.122]
Loss: 1.806
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.367, val_loss=1.760, val_acc=0.243]          Logits range: [-0.375, 0.148]
Loss: 1.758
Predictions: tensor([1, 1, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_acc=0.400, val_loss=1.730, val_acc=0.239]          Logits range: [-0.868, 0.328]
Loss: 1.740
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.690, tra


---------- batch_size_32; lr_1e-05; optimizer_RMSprop; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.215, 0.176]
Loss: 1.808
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.125, val_loss=1.720, val_acc=0.203]          Logits range: [-0.458, 0.214]
Loss: 1.716
Predictions: tensor([0, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.690, train_acc=0.400, val_loss=1.680, val_acc=0.209]          Logits range: [-0.719, 0.289]
Loss: 1.706
Predictions: tensor([0, 1, 0, 1, 1], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_a

---------- batch_size_32; lr_0.0001; optimizer_RMSprop; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.140, 0.144]
Loss: 1.793
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.290, train_acc=0.375, val_loss=1.370, val_acc=0.375]          Logits range: [-2.527, 2.145]
Loss: 1.406
Predictions: tensor([0, 0, 2, 2, 1], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.130, train_acc=0.700, val_loss=1.230, val_acc=0.523]          Logits range: [-3.084, 2.426]
Loss: 1.238
Predictions: tensor([0, 3, 0, 3, 4], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.467, train

---------- batch_size_32; lr_0.0001; optimizer_RMSprop; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.080, 0.122]
Loss: 1.806
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.200, train_acc=0.375, val_loss=1.430, val_acc=0.358]          Logits range: [-4.393, 3.297]
Loss: 1.516
Predictions: tensor([0, 0, 2, 2, 1], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.160, train_acc=0.600, val_loss=1.310, val_acc=0.458]          Logits range: [-5.943, 3.167]
Loss: 1.422
Predictions: tensor([0, 3, 0, 3, 1], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.030, tr

---------- batch_size_32; lr_0.0001; optimizer_RMSprop; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.215, 0.176]
Loss: 1.808
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.280, train_acc=0.625, val_loss=1.360, val_acc=0.388]          Logits range: [-2.499, 2.120]
Loss: 1.364
Predictions: tensor([0, 1, 2, 2, 1], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.200, train_acc=0.700, val_loss=1.130, val_acc=0.612]          Logits range: [-3.293, 2.719]
Loss: 1.209
Predictions: tensor([0, 2, 0, 3, 4], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.737, train_

---------- batch_size_32; lr_0.001; optimizer_RMSprop; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.140, 0.144]
Loss: 1.793
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.150, val_loss=1.470, val_acc=0.334]          Logits range: [-3.382, 1.467]
Loss: 1.497
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.310, train_acc=0.450, val_loss=1.180, val_acc=0.533]          Logits range: [-5.871, 3.783]
Loss: 1.321
Predictions: tensor([0, 1, 4, 3, 4], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.566, train_

---------- batch_size_32; lr_0.001; optimizer_RMSprop; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.080, 0.122]
Loss: 1.806
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=2.030, train_acc=0.521, val_loss=0.944, val_acc=0.643]          Logits range: [-9.933, 5.628]
Loss: 0.805
Predictions: tensor([0, 0, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.791, train_acc=0.800, val_loss=0.835, val_acc=0.706]          Logits range: [-12.151, 11.853]
Loss: 0.824
Predictions: tensor([0, 2, 1, 3, 4], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.216, t

---------- batch_size_32; lr_0.001; optimizer_RMSprop; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.215, 0.176]
Loss: 1.808
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.583, train_acc=0.875, val_loss=0.666, val_acc=0.719]          Logits range: [-4.566, 6.041]
Loss: 0.591
Predictions: tensor([0, 1, 1, 2, 2], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.658, train_acc=0.900, val_loss=0.552, val_acc=0.794]          Logits range: [-7.783, 7.647]
Loss: 0.356
Predictions: tensor([0, 2, 0, 3, 4], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.165, train_a

---------- batch_size_32; lr_0.01; optimizer_RMSprop; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.140, 0.144]
Loss: 1.793
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.630, train_acc=0.250, val_loss=1.660, val_acc=0.186]          Logits range: [-1.440, 0.507]
Loss: 1.704
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_acc=0.200, val_loss=1.660, val_acc=0.186]          Logits range: [-1.903, 0.518]
Loss: 1.693
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.730, train_a

---------- batch_size_32; lr_0.01; optimizer_RMSprop; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.080, 0.122]
Loss: 1.806
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.968, train_acc=0.812, val_loss=0.815, val_acc=0.638]          Logits range: [-8.976, 7.002]
Loss: 0.582
Predictions: tensor([0, 1, 1, 2, 2], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.758, train_acc=0.750, val_loss=0.714, val_acc=0.725]          Logits range: [-15.394, 14.224]
Loss: 0.452
Predictions: tensor([0, 2, 1, 3, 4], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=0.0573, t

---------- batch_size_32; lr_0.01; optimizer_RMSprop; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.215, 0.176]
Loss: 1.808
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.070, train_acc=0.500, val_loss=1.270, val_acc=0.473]          Logits range: [-6.851, 8.797]
Loss: 1.185
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.560, train_acc=0.300, val_loss=1.180, val_acc=0.495]          Logits range: [-10.975, 9.905]
Loss: 1.189
Predictions: tensor([0, 1, 1, 3, 1], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.120, train_a

---------- batch_size_32; lr_0.1; optimizer_RMSprop; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.140, 0.144]
Loss: 1.793
Predictions: tensor([4, 4, 4, 4, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.670, train_acc=0.250, val_loss=12.10, val_acc=0.188]          Logits range: [-1.710, 1.204]
Loss: 1.793
Predictions: tensor([2, 2, 2, 2, 2], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.720, train_acc=0.200, val_loss=8.550, val_acc=0.191]          Logits range: [-2.370, 0.873]
Loss: 1.695
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.660, train_ac

---------- batch_size_32; lr_0.1; optimizer_RMSprop; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.080, 0.122]
Loss: 1.806
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=15.40, train_acc=0.333, val_loss=2.150, val_acc=0.238]          Logits range: [-29.897, 108.245]
Loss: 2.316
Predictions: tensor([0, 0, 0, 0, 0], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.930, train_acc=0.200, val_loss=1.690, val_acc=0.229]          Logits range: [-26.919, 39.439]
Loss: 1.757
Predictions: tensor([2, 2, 2, 1, 2], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=2.100, 

---------- batch_size_32; lr_0.1; optimizer_RMSprop; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/137 [00:00<?, ?it/s]                            Logits range: [-0.215, 0.176]
Loss: 1.808
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 0], device='cuda:0')
Epoch 1:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.660, train_acc=0.250, val_loss=1.670, val_acc=0.186]          Logits range: [-1.920, 0.919]
Loss: 1.751
Predictions: tensor([2, 2, 2, 2, 2], device='cuda:0')
True labels: tensor([0, 1, 1, 1, 2], device='cuda:0')
Epoch 2:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.780, train_acc=0.200, val_loss=1.660, val_acc=0.187]          Logits range: [-2.323, 0.740]
Loss: 1.685
Predictions: tensor([2, 2, 2, 2, 2], device='cuda:0')
True labels: tensor([0, 5, 0, 3, 4], device='cuda:0')
Epoch 3:   0%|          | 0/137 [00:00<?, ?it/s, v_num=0, train_loss=1.660, train_acc

---------- batch_size_64; lr_1e-05; optimizer_RMSprop; hidden_dim_256; num_layers_1; sentence_representation_last  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]                             Logits range: [-0.173, 0.149]
Loss: 1.794
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.0833, val_loss=1.740, val_acc=0.197]         Logits range: [-0.423, 0.167]
Loss: 1.737
Predictions: tensor([1, 3, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_acc=0.200, val_loss=1.710, val_acc=0.188]         Logits range: [-0.656, 0.225]
Loss: 1.710
Predictions: tensor([1, 0, 1, 0, 1], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_acc=

---------- batch_size_64; lr_1e-05; optimizer_RMSprop; hidden_dim_256; num_layers_1; sentence_representation_average  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]                             Logits range: [-0.088, 0.123]
Loss: 1.801
Predictions: tensor([4, 4, 4, 4, 4], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.780, train_acc=0.133, val_loss=1.780, val_acc=0.225]         Logits range: [-0.234, 0.110]
Loss: 1.773
Predictions: tensor([1, 1, 3, 4, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.740, train_acc=0.550, val_loss=1.760, val_acc=0.243]         Logits range: [-0.482, 0.204]
Loss: 1.763
Predictions: tensor([1, 1, 1, 1, 1], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_ac

---------- batch_size_64; lr_1e-05; optimizer_RMSprop; hidden_dim_256; num_layers_1; sentence_representation_max  ----------
Epoch 0:   0%|          | 0/69 [00:00<?, ?it/s]                             Logits range: [-0.215, 0.182]
Loss: 1.806
Predictions: tensor([3, 3, 3, 3, 3], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 1], device='cuda:0')
Epoch 1:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.760, train_acc=0.208, val_loss=1.740, val_acc=0.193]         Logits range: [-0.347, 0.159]
Loss: 1.735
Predictions: tensor([0, 0, 1, 1, 1], device='cuda:0')
True labels: tensor([0, 0, 1, 0, 4], device='cuda:0')
Epoch 2:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.700, train_acc=0.400, val_loss=1.710, val_acc=0.197]         Logits range: [-0.574, 0.227]
Loss: 1.715
Predictions: tensor([0, 1, 1, 0, 1], device='cuda:0')
True labels: tensor([1, 1, 1, 0, 5], device='cuda:0')
Epoch 3:   0%|          | 0/69 [00:00<?, ?it/s, v_num=0, train_loss=1.710, train_acc=0.