# Earnings Call Project: Data Cleaning
<br>
CIS 831 Deep Learning – Term Project<br>
Kansas State University
<br><br>
James Chapman<br>
John Woods<br>
Nathan Diehl<br>
<br>

This notebook uses the transformer architecture from the [HTML paper](https://www.researchgate.net/publication/340385140_HTML_Hierarchical_Transformer-based_Multi-task_Learning_for_Volatility_Prediction). 

From the previous notebooks, we have audio and text features for each sentence of each meeting.<br>
- audio 
    - PRAAT (27 features)
- text
    - GLOVE (300 features)
    - ROBERTA (1024 features)
    - ROBERTA with averaging (1024 features)
    - FinLang investopedia (768 features) from huggingface sentencetransformers
    - BGE financial (1024 features) from huggingface sentencetransformers
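
The `(392, 523, 1051)` feature shape printed later in this notebook is consistent with one audio set and one text set being joined per sentence (27 + 1024 = 1051). Below is a minimal sketch of such a join, assuming the two arrays are aligned by meeting and sentence and concatenated along the last axis. The array names and the small sizes are hypothetical.

```python
import numpy as np

# Hypothetical toy sizes: 2 meetings, 5 sentences per meeting.
n_meetings, n_sentences = 2, 5
audio = np.zeros((n_meetings, n_sentences, 27))    # PRAAT features
text = np.ones((n_meetings, n_sentences, 1024))    # e.g. ROBERTA features

# Assumption: audio and text are joined per sentence along the feature axis.
combined = np.concatenate((audio, text), axis=-1)
print(combined.shape)  # (2, 5, 1051)
```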

This notebook runs three nested loops: for each audio/text feature pair and for each value of N_DAYS, we train models with 17 different alpha values. The alpha value with the lowest MSE on the validation set is then used to train a model on both the training and validation sets. This model is finally evaluated on the test set.
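
The search described above can be sketched as follows. The directory names, horizons, and alpha grid are copied from the configuration cell further down; the validation MSE values are made-up placeholders, and `pick_best_alpha` is a hypothetical helper rather than a function from this notebook.

```python
from itertools import product

directories = ['Roberta/', 'Roberta2/', 'investopedia/', 'bge/', 'glove/']
all_n_days = ['3', '7', '15', '30']
alphas = [round(0.05 * k, 2) for k in range(1, 18)]  # 0.05, 0.10, ..., 0.85

# One validation run per (features, horizon, alpha) combination,
# plus one final retrain per (features, horizon) pair.
n_validation_runs = len(list(product(directories, all_n_days, alphas)))
n_test_runs = len(directories) * len(all_n_days)
print(n_validation_runs, n_test_runs)  # 340 20


def pick_best_alpha(results):
    """Return the (alpha, mse) pair with the lowest validation MSE.

    On ties, min() keeps the first pair in the list.
    """
    return min(results, key=lambda pair: pair[1])


# Placeholder validation results, not taken from any real run.
toy_results = [(0.05, 0.30), (0.10, 0.25), (0.15, 0.25), (0.20, 0.40)]
best_alpha, best_mse = pick_best_alpha(toy_results)
print(best_alpha, best_mse)  # 0.1 0.25
```

Because ties go to the first entry, two alphas with the same rounded MSE resolve to the smaller alpha when the grid is in ascending order.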

These notebooks also give us a playground to work with the data and test the models. From here we can add sentiment detection, segmentation of the text, etc.

In [90]:
import pandas as pd
import numpy as np
from tqdm import tqdm
from sklearn.preprocessing import StandardScaler

import torch
import torch.nn.functional as F
from torch import nn
from torch.optim.lr_scheduler import LambdaLR
from torch.utils.tensorboard import SummaryWriter

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

np.random.seed(777)
torch.manual_seed(777)
torch.cuda.manual_seed_all(777)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

In [91]:
import shutil
import os

# fresh start: remove old TensorBoard runs and stale TensorBoard info files

directory_path = r"C:\Users\James\OneDrive\Documents\GitHub\Earnings_call_project\runs"
if os.path.exists(directory_path):
    shutil.rmtree(directory_path)

path = "C:/Users/James/AppData/Local/Temp/.tensorboard-info/"
if os.path.isdir(path):  # skip if TensorBoard has not created this folder yet
    for filename in os.listdir(path):
        filepath = os.path.join(path, filename)
        print(filepath)
        os.remove(filepath)

In [92]:
directories = ['Roberta/', 'Roberta2/', 'investopedia/', 'bge/', 'glove/']#, 'bge_base/'
all_n_days = ['3', '7', '15', '30']
alphas = [0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85]

# static hyper-parameters from paper
max_norm = 1.0
num_epochs = 10
warmup_steps = 1000
batch_size = 4
dropout = 0.5
heads = 2
depth = 2

lr = 0.000002  # 1/10 of the paper's learning rate
is_using_scaler = True  # whether or not to use StandardScaler

# This is the hierarchical transformer from the HTML paper.

## This is taken directly from the GitHub repository for the HTML paper.
- a few adaptations are highlighted with ##############################################

- [GitHub HTML - util.py](https://github.com/YangLinyi/HTML-Hierarchical-Transformer-based-Multi-task-Learning-for-Volatility-Prediction/blob/master/Model/Sentence-Level-Transformer/transformers/util/util.py)
- [GitHub HTML - transformers_gpu.py](https://github.com/YangLinyi/HTML-Hierarchical-Transformer-based-Multi-task-Learning-for-Volatility-Prediction/blob/master/Model/Sentence-Level-Transformer/transformers/transformers_gpu.py)

In [93]:

# https://github.com/YangLinyi/HTML-Hierarchical-Transformer-based-Multi-task-Learning-for-Volatility-Prediction/blob/master/Model/Sentence-Level-Transformer/transformers/transformers_gpu.py
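def mask_(matrices, maskval=0.0, mask_diagonal=True):
    """
    Masks out all values in the given batch of matrices where i <= j holds,
    i < j if mask_diagonal is false

    In place operation

    :param matrices: batch of matrices with shape (b, h, w)
    :param maskval: value written into the masked positions
    :param mask_diagonal: if True, the diagonal is masked as well
    :return: None (matrices is modified in place)
    """

    b, h, w = matrices.size()

    indices = torch.triu_indices(h, w, offset=0 if mask_diagonal else 1)
    # index rows with indices[0] and columns with indices[1]
    matrices[:, indices[0], indices[1]] = maskval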

def contains_nan(tensor):
    return bool((tensor != tensor).sum() > 0)

# https://github.com/YangLinyi/HTML-Hierarchical-Transformer-based-Multi-task-Learning-for-Volatility-Prediction/blob/master/Model/Sentence-Level-Transformer/transformers/transformers_gpu.py
class SelfAttention(nn.Module):
    def __init__(self, emb, heads=8, mask=False):
        """
        :param emb:
        :param heads:
        :param mask:
        """

        super().__init__()

        self.emb = emb
        self.heads = heads
        self.mask = mask

        self.tokeys = nn.Linear(emb, emb * heads, bias=False)
        self.toqueries = nn.Linear(emb, emb * heads, bias=False)
        self.tovalues = nn.Linear(emb, emb * heads, bias=False)

        self.unifyheads = nn.Linear(heads * emb, emb)

    def forward(self, x):

        b, t, e = x.size()
        h = self.heads
        assert e == self.emb

        keys    = self.tokeys(x)   .view(b, t, h, e)
        queries = self.toqueries(x).view(b, t, h, e)
        values  = self.tovalues(x) .view(b, t, h, e)

        # compute scaled dot-product self-attention

        # - fold heads into the batch dimension
        keys = keys.transpose(1, 2).contiguous().view(b * h, t, e)
        queries = queries.transpose(1, 2).contiguous().view(b * h, t, e)
        values = values.transpose(1, 2).contiguous().view(b * h, t, e)

        queries = queries / (e ** (1/4))
        keys    = keys / (e ** (1/4))
        # - Instead of dividing the dot products by sqrt(e), we scale the queries and keys.
        #   This should be more memory efficient

        # - get dot product of queries and keys, and scale
        dot = torch.bmm(queries, keys.transpose(1, 2))

        assert dot.size() == (b*h, t, t)

        if self.mask: # mask out the upper half of the dot matrix, excluding the diagonal
            mask_(dot, maskval=float('-inf'), mask_diagonal=False)

        dot = F.softmax(dot, dim=2) # dot now has row-wise self-attention probabilities

        ##############################################
        ##############################################
        # assert not util.contains_nan(dot[:, 1:, :]) # only the first row may contain nan
        assert not contains_nan(dot[:, 1:, :]) # only the first row may contain nan
        ##############################################
        ##############################################
        
        if self.mask == 'first':
            dot = dot.clone()
            dot[:, :1, :] = 0.0
            # - The first row of the first attention matrix is entirely masked out, so the softmax operation results
            #   in a division by zero. We set this row to zero by hand to get rid of the NaNs

        # apply the self attention to the values
        out = torch.bmm(dot, values).view(b, h, t, e)

        # swap h, t back, unify heads
        out = out.transpose(1, 2).contiguous().view(b, t, h * e)

        return self.unifyheads(out)

class TransformerBlock(nn.Module):
    def __init__(self, emb, heads, mask, seq_length, ff_hidden_mult=4, dropout=0.0):
        super().__init__()

        self.attention = SelfAttention(emb, heads=heads, mask=mask)
        self.mask = mask

        self.norm1 = nn.LayerNorm(emb)
        self.norm2 = nn.LayerNorm(emb)

        self.ff = nn.Sequential(
            nn.Linear(emb, ff_hidden_mult * emb),
            nn.ReLU(),
            nn.Linear(ff_hidden_mult * emb, emb)
        )

        self.do = nn.Dropout(dropout)

    def forward(self, x):

        attended = self.attention(x)

        x = self.norm1(attended + x)

        x = self.do(x)

        fedforward = self.ff(x)

        x = self.norm2(fedforward + x)

        x = self.do(x)

        return x

class RTransformer(nn.Module):
    """
    Transformer for sequences Regression    
    
    """

    def __init__(self, emb, heads, depth, seq_length, num_tokens, num_classes, max_pool=True, dropout=0.0):
        """
        emb: Embedding dimension
        heads: nr. of attention heads
        depth: Number of transformer blocks
        seq_length: Expected maximum sequence length
        num_tokens: Number of tokens (usually words) in the vocabulary
        num_classes: Number of classes.
        max_pool: If true, use global max pooling in the last layer. If false, use global
                         average pooling.
        """
        super().__init__()

        self.num_tokens, self.max_pool = num_tokens, max_pool

        #self.token_embedding = nn.Embedding(embedding_dim=emb, num_embeddings=num_tokens)
        self.pos_embedding = nn.Embedding(embedding_dim=emb, num_embeddings=seq_length)

        tblocks = []
        for i in range(depth):
            tblocks.append(
                TransformerBlock(emb=emb, heads=heads, seq_length=seq_length, mask=False, dropout=dropout))

        self.tblocks = nn.Sequential(*tblocks)

        self.toprobs = nn.Linear(emb, num_classes)
        self.toprobs_b = nn.Linear(emb, num_classes)
        self.do = nn.Dropout(dropout)

    def forward(self, x):
        """
        :param x: A (batch, sequence length, embedding) float tensor of sentence embeddings.
        :return: two regression outputs per meeting: the primary prediction (volatility)
                 and the secondary prediction.
        """
        sentences_emb = x
        b, t, e = x.size()

        ##############################################
        ##############################################
        # swap d() for device
        # positions = self.pos_embedding(torch.arange(t, device=d()))[None, :, :].expand(b, t, e)
        positions = self.pos_embedding(torch.arange(t, device=device))[None, :, :].expand(b, t, e)
        ##############################################
        ##############################################
        #positions = self.pos_embedding(torch.arange(t))[None, :, :].expand(b, t, e)
        #positions = torch.tensor(positions, dtype=torch.float32)
        x = sentences_emb.to(device) + positions
        x = self.do(x)

        x = self.tblocks(x)

        x = x.max(dim=1)[0] if self.max_pool else x.mean(dim=1) # pool over the time dimension
        
        
        x_a = self.toprobs(x)
        x_b = self.toprobs_b(x)
        # drop only the last dimension so a batch of size 1 keeps its batch axis
        x_a = x_a.squeeze(-1)
        x_b = x_b.squeeze(-1)
        #print('x shape: ',x.shape)
        return x_a, x_b


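In `SelfAttention`, queries and keys are each divided by `e ** (1/4)` before the dot product. This gives the same result as dividing the finished dot product by `sqrt(e)`, because the two factors multiply. Here is a small check in plain Python, with hypothetical vectors:

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical query and key vectors with embedding size e = 4.
query = [1.0, 2.0, 3.0, 4.0]
key = [0.5, -1.0, 2.0, 0.25]
e = len(query)

# Variant used in the notebook: scale queries and keys before the product.
scale = e ** (1 / 4)
pre_scaled = dot([q / scale for q in query], [k / scale for k in key])

# Textbook variant: scale the dot product afterwards.
post_scaled = dot(query, key) / math.sqrt(e)

print(math.isclose(pre_scaled, post_scaled))  # True
```
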
# Dataset builder and training loop

In [94]:
class Dataset(torch.utils.data.Dataset):
    def __init__(self, embeddings, labels, labels_b):
        self.embeddings = embeddings
        self.labels = labels
        self.labels_b = labels_b

    def __len__(self):
        return len(self.embeddings)

    def __getitem__(self, idx):
        emb = self.embeddings[idx]
        label = self.labels[idx]
        label_b = self.labels_b[idx]
        return emb, label, label_b 

def get_data(data_directory, n_days, no_validation_set, is_using_scaler, batch_size):
    # features
    train_features = np.load('data/{}/train_features.npy'.format(data_directory))
    val_features = np.load('data/{}/val_features.npy'.format(data_directory))
    test_features = np.load('data/{}/test_features.npy'.format(data_directory))
    # targets (n_days volatility)
    train_targets = np.load('data/{}/train_targets_{}.npy'.format(data_directory, n_days))
    val_targets = np.load('data/{}/val_targets_{}.npy'.format(data_directory, n_days))
    test_targets = np.load('data/{}/test_targets_{}.npy'.format(data_directory, n_days))
    # secondary targets (log percent change of day n)
    train_secondary_targets = np.load('data/{}/train_secondary_targets_{}.npy'.format(data_directory, n_days))
    val_secondary_targets = np.load('data/{}/val_secondary_targets_{}.npy'.format(data_directory, n_days))
    test_secondary_targets = np.load('data/{}/test_secondary_targets_{}.npy'.format(data_directory, n_days))

    # after hyperparameters are tuned on the validation set
    # add validation set to training set
    if no_validation_set:
        train_features = np.concatenate((train_features, val_features), axis=0)
        train_targets = np.concatenate((train_targets, val_targets), axis=0)
        train_secondary_targets = np.concatenate((train_secondary_targets, val_secondary_targets), axis=0)

    if is_using_scaler:
        # Scaling features
        feature_scaler = StandardScaler()
        train_features = feature_scaler.fit_transform(train_features.reshape(-1, train_features.shape[-1])).reshape(train_features.shape)
        val_features = feature_scaler.transform(val_features.reshape(-1, val_features.shape[-1])).reshape(val_features.shape)
        test_features = feature_scaler.transform(test_features.reshape(-1, test_features.shape[-1])).reshape(test_features.shape)
        # Scaling primary targets
        target_scaler = StandardScaler()
        train_targets = target_scaler.fit_transform(train_targets.reshape(-1, 1)).reshape(train_targets.shape)
        val_targets = target_scaler.transform(val_targets.reshape(-1, 1)).reshape(val_targets.shape)
        test_targets = target_scaler.transform(test_targets.reshape(-1, 1)).reshape(test_targets.shape)
        # Scaling secondary targets
        secondary_target_scaler = StandardScaler()
        train_secondary_targets = secondary_target_scaler.fit_transform(train_secondary_targets.reshape(-1, 1)).reshape(train_secondary_targets.shape)
        val_secondary_targets = secondary_target_scaler.transform(val_secondary_targets.reshape(-1, 1)).reshape(val_secondary_targets.shape)
        test_secondary_targets = secondary_target_scaler.transform(test_secondary_targets.reshape(-1, 1)).reshape(test_secondary_targets.shape)
    # Dataset & DataLoader
    training_set = Dataset(train_features, train_targets, train_secondary_targets) 
    val_set = Dataset(val_features, val_targets, val_secondary_targets)
    test_set = Dataset(test_features, test_targets, test_secondary_targets)
    trainloader = torch.utils.data.DataLoader(training_set, batch_size=batch_size, shuffle=False, num_workers=0)
    valloader = torch.utils.data.DataLoader(val_set, batch_size=len(val_set), shuffle=False, num_workers=0)
    testloader = torch.utils.data.DataLoader(test_set, batch_size=len(test_set), shuffle=False, num_workers=0)

    print(train_features.shape, train_targets.shape, train_secondary_targets.shape) 
    print(val_features.shape, val_targets.shape, val_secondary_targets.shape) 
    print(test_features.shape, test_targets.shape, test_secondary_targets.shape) 

    # train_features.shape[2] is the number of features, needed to instantiate the model
    # the target scaler is needed to inverse transform predictions
    # (target_scaler is only defined when is_using_scaler is True)
    return trainloader, valloader, testloader, train_features.shape[2], target_scaler


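`get_data` flattens the `(meetings, sentences, features)` array to two dimensions before scaling, so each feature column is standardized over all sentences of all training meetings, and the validation and test sets reuse the training statistics. Below is a minimal sketch of the same idea, with plain NumPy standing in for `StandardScaler`. The toy shapes and random data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(loc=5.0, scale=2.0, size=(8, 6, 3))  # (meetings, sentences, features)
val = rng.normal(loc=5.0, scale=2.0, size=(2, 6, 3))

# Fit: per-feature mean and standard deviation over all training sentences.
flat_train = train.reshape(-1, train.shape[-1])
mean = flat_train.mean(axis=0)
std = flat_train.std(axis=0)

# Transform: apply the training statistics to both sets, then restore the 3D shape.
train_scaled = ((flat_train - mean) / std).reshape(train.shape)
val_scaled = ((val.reshape(-1, val.shape[-1]) - mean) / std).reshape(val.shape)

# Inverse transform recovers the original values (used for reporting unscaled MSE).
train_restored = train_scaled * std + mean

print(train_scaled.shape, val_scaled.shape)
```

Fitting on the training split only keeps validation and test statistics out of the scaler.
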
In [95]:
# this training loop is reused for both the validation runs and the final test run
def training_loop(trainloader, loader, emb_dimensions, heads, depth, dropout, warmup_steps,
                  batch_size, num_epochs, alphas, max_norm, lr, run_name, target_scaler):
    alphas_results = []
    for alpha in alphas:
        model = RTransformer(emb=emb_dimensions, heads=heads, depth=depth, seq_length=523, 
                            num_tokens=4000, num_classes=1, max_pool=False, dropout=dropout).to(device)
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        # Linear LR warmup for the first #warmup_steps training examples
        scheduler = LambdaLR(opt, lr_lambda=lambda step: min(1.0, step / (warmup_steps/batch_size)))

        seen = 0
        min_loss = float('inf')
        val_loss_a_history =  []
        writer = SummaryWriter(log_dir= f'runs/{run_name}_{alpha}')
        progress_bar = tqdm(range(num_epochs), desc="Training Progress", unit="epoch")
        for epoch in progress_bar:
            model.train()
            epoch_seen = 0  # training examples seen in this epoch
            train_loss_total = 0
            for i, (inputs, labels, labels_b) in enumerate(trainloader): #training data
                seen += inputs.size(0)
                epoch_seen += inputs.size(0)
                inputs = inputs.to(device, dtype=torch.float32)
                labels = labels.to(device, dtype=torch.float32)
                labels_b = labels_b.to(device, dtype=torch.float32)
                out_a, out_b = model(inputs)

                # Compute the combined loss (multitask)
                loss_a = F.mse_loss(out_a, labels)
                loss_b = F.mse_loss(out_b, labels_b)
                loss = alpha * loss_a + (1 - alpha) * loss_b # Alpha parameter
            
                opt.zero_grad()
                loss.backward()
                torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=max_norm)
                opt.step()
                if seen < warmup_steps:
                    scheduler.step()

                # we only care about the primary target
                # although we train with the scaled loss, the reported volatility MSE must be unscaled
                out_a_unscaled = target_scaler.inverse_transform(out_a.detach().cpu().numpy().reshape(-1, 1)).reshape(out_a.shape)
                labels_unscaled = target_scaler.inverse_transform(labels.detach().cpu().numpy().reshape(-1, 1)).reshape(labels.shape)
                loss_a_unscaled = F.mse_loss(torch.tensor(out_a_unscaled, device=out_a.device), torch.tensor(labels_unscaled, device=labels.device))
                train_loss_total += loss_a_unscaled.item() * inputs.size(0) # MSE of volatility training

            # Epoch average of unscaled training loss
            train_loss_avg = train_loss_total / epoch_seen
            
            # Epoch Evaluation
            model.eval()
            val_loss_a_total = 0.0
            val_loss_b_total = 0.0
            epoch_seen = 0
            with torch.no_grad():
                for i, (inputs, labels, labels_b) in enumerate(loader):  # validation or testing
                    epoch_seen += inputs.size(0)
                    inputs = inputs.to(device, dtype=torch.float32)
                    labels = labels.to(device, dtype=torch.float32)
                    labels_b = labels_b.to(device, dtype=torch.float32)

                    out_a, out_b = model(inputs)
                    #loss_a = F.mse_loss(out_a, labels)
                    loss_b = F.mse_loss(out_b, labels_b)
                    
                    out_a_unscaled = target_scaler.inverse_transform(out_a.detach().cpu().numpy().reshape(-1, 1)).reshape(out_a.shape)
                    labels_unscaled = target_scaler.inverse_transform(labels.detach().cpu().numpy().reshape(-1, 1)).reshape(labels.shape)
                    loss_a_unscaled = F.mse_loss(torch.tensor(out_a_unscaled, device=out_a.device), torch.tensor(labels_unscaled, device=labels.device))
                    val_loss_a_total += loss_a_unscaled.item() * inputs.size(0) # MSE of volatility

                    val_loss_b_total += loss_b.item() * inputs.size(0) 
                ##############################################
                val_loss_a_avg = val_loss_a_total / epoch_seen # MSE of volatility of Val/test!!
                val_loss_b_avg = val_loss_b_total / epoch_seen
            # Epoch finished
            writer.add_scalar('Train Loss/train', train_loss_avg, epoch)
            writer.add_scalar('Loss A/val', val_loss_a_avg, epoch)
            writer.add_scalar('Loss B/val', val_loss_b_avg, epoch)
            # Update progress bar description and postfix
            progress_bar.set_description(f"Epoch {epoch+1}/{num_epochs}")
            progress_bar.set_postfix({
                "Train Loss": f"{train_loss_avg:.4f}",
                "Test Loss A": f"{val_loss_a_avg:.4f}",
                "Test Loss B": f"{val_loss_b_avg:.4f}"
            })
            # for alpha optimization, collect each Epoch's average loss
            val_loss_a_history.append(val_loss_a_avg)
        # all epochs finished for this alpha value
        # if testing, there will only be one alpha
        # alternative: a sliding window of 2, picking the alpha with the lowest average of 2 consecutive epoch losses
        # min_loss = min(
        #     (val_loss_a_history[i] + val_loss_a_history[i + 1]) / 2 # average of 2 consecutive epoch losses
        #     for i in range(len(val_loss_a_history) - 1)
        # )
        # currently used: the alpha with the lowest single-epoch loss
        # note: this is the minimum over epochs, so in test mode it reports the best epoch on the test set
        min_loss = round(min(val_loss_a_history), 3)
        alphas_results.append((alpha, min_loss))
        print(alpha, min_loss)
        writer.close()
    best_alpha, lowest_val_loss = min(alphas_results, key=lambda x: x[1])
    return best_alpha, lowest_val_loss # if testing, this is the MSE of the test set

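The warmup in `training_loop` multiplies the base learning rate by `min(1.0, step / (warmup_steps / batch_size))`, where `step` counts scheduler steps (one per batch). With `warmup_steps = 1000` and `batch_size = 4`, the multiplier would reach 1.0 at step 250. Here is a small sketch of that multiplier on its own; `warmup_factor` is a hypothetical name, not a function from this notebook.

```python
def warmup_factor(step, warmup_steps=1000, batch_size=4):
    """Learning-rate multiplier after `step` scheduler steps (one step per batch)."""
    return min(1.0, step / (warmup_steps / batch_size))


base_lr = 0.000002  # value of `lr` in the configuration cell

for step in (0, 125, 249, 250, 500):
    print(step, warmup_factor(step), base_lr * warmup_factor(step))
```

Two details follow from the loop as written. The multiplier at step 0 is 0, so the first batch is trained with a learning rate of 0. And since `scheduler.step()` is only called while `seen < warmup_steps`, with full batches of 4 it is called 249 times, which leaves the multiplier at 249/250 rather than 1.0.
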
In [96]:

def train_val_test(data_directory, n_days):
    run_name = data_directory + '_' + n_days
    # using training set and validation set, determine optimum hyper-parameters (just alpha)
    no_validation_set = False
    trainloader, valloader, testloader, emb_dimensions, target_scaler = get_data(
        data_directory, n_days, no_validation_set, is_using_scaler, batch_size)
    best_alpha, _ = training_loop(trainloader, valloader, emb_dimensions, heads, depth, dropout, warmup_steps,
                                  batch_size, num_epochs, alphas, max_norm, lr, run_name, target_scaler)
    # using the optimum alpha, retrain on the combined training/validation sets and test on the test set
    test_run_name = run_name + '_test'
    no_validation_set = True
    trainloader, valloader, testloader, emb_dimensions, target_scaler = get_data(
        data_directory, n_days, no_validation_set, is_using_scaler, batch_size)
    _, MSE_testset = training_loop(trainloader, testloader, emb_dimensions, heads, depth, dropout, warmup_steps,
                                   batch_size, num_epochs, [best_alpha], max_norm, lr, test_run_name, target_scaler)

    print('----------------------------------------')
    print(f'run_name-----------{run_name}')
    print(f'best_alpha---------{best_alpha}')
    print(f'MSE_testset--------{MSE_testset}')
    print('----------------------------------------')
    return best_alpha, MSE_testset

In [97]:
final_results = []
for data_directory in directories: # different feature engineering features
    for n_days in all_n_days: 
        best_alpha, MSE_testset = train_val_test(data_directory, n_days)
        final_results.append([data_directory, n_days, best_alpha, MSE_testset])

(392, 523, 1051) (392,) (392,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.53s/epoch, Train Loss=0.7474, Test Loss A=0.5626, Test Loss B=5.1637]


0.05 0.556


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.51s/epoch, Train Loss=0.7351, Test Loss A=0.6318, Test Loss B=5.0527]


0.1 0.6


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.50s/epoch, Train Loss=0.7449, Test Loss A=0.6438, Test Loss B=5.0161]


0.15 0.59


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.51s/epoch, Train Loss=0.7456, Test Loss A=0.6742, Test Loss B=5.1316]


0.2 0.576


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.49s/epoch, Train Loss=0.7359, Test Loss A=0.7346, Test Loss B=5.0295]


0.25 0.619


Epoch 10/10: 100%|██████████| 10/10 [00:26<00:00,  2.66s/epoch, Train Loss=0.7332, Test Loss A=0.7000, Test Loss B=5.0452]


0.3 0.556


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.58s/epoch, Train Loss=0.7375, Test Loss A=0.8301, Test Loss B=4.9928]


0.35 0.635


Epoch 10/10: 100%|██████████| 10/10 [00:26<00:00,  2.69s/epoch, Train Loss=0.7376, Test Loss A=0.8129, Test Loss B=5.1011]


0.4 0.619


Epoch 10/10: 100%|██████████| 10/10 [00:26<00:00,  2.63s/epoch, Train Loss=0.7348, Test Loss A=0.7924, Test Loss B=4.9932]


0.45 0.602


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.50s/epoch, Train Loss=0.7256, Test Loss A=0.8033, Test Loss B=4.9804]


0.5 0.593


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.55s/epoch, Train Loss=0.7308, Test Loss A=0.8125, Test Loss B=4.9856]


0.55 0.621


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.50s/epoch, Train Loss=0.7272, Test Loss A=0.8812, Test Loss B=5.1153]


0.6 0.602


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.45s/epoch, Train Loss=0.7333, Test Loss A=0.8623, Test Loss B=4.9983]


0.65 0.6


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.50s/epoch, Train Loss=0.7317, Test Loss A=0.8978, Test Loss B=5.0169]


0.7 0.599


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.44s/epoch, Train Loss=0.7282, Test Loss A=0.8183, Test Loss B=4.9829]


0.75 0.625


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.44s/epoch, Train Loss=0.7385, Test Loss A=0.8381, Test Loss B=4.9130]


0.8 0.629


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.45s/epoch, Train Loss=0.7341, Test Loss A=0.8933, Test Loss B=4.8391]


0.85 0.617
(448, 523, 1051) (448,) (448,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:29<00:00,  2.95s/epoch, Train Loss=0.7301, Test Loss A=0.7247, Test Loss B=0.2895]


0.05 0.725
----------------------------------------
run_name-----------
best_alpha---------0.05
MSE_testset--------0.725
----------------------------------------
(392, 523, 1051) (392,) (392,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.47s/epoch, Train Loss=0.4303, Test Loss A=0.2686, Test Loss B=2.3269]


0.05 0.268


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.44s/epoch, Train Loss=0.4312, Test Loss A=0.2685, Test Loss B=2.4284]


0.1 0.266


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.45s/epoch, Train Loss=0.4304, Test Loss A=0.2864, Test Loss B=2.3316]


0.15 0.271


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.46s/epoch, Train Loss=0.4333, Test Loss A=0.2835, Test Loss B=2.3538]


0.2 0.267


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.45s/epoch, Train Loss=0.4270, Test Loss A=0.2793, Test Loss B=2.3428]


0.25 0.261


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.45s/epoch, Train Loss=0.4270, Test Loss A=0.2970, Test Loss B=2.3734]


0.3 0.27


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.44s/epoch, Train Loss=0.4327, Test Loss A=0.2867, Test Loss B=2.3679]


0.35 0.265


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.45s/epoch, Train Loss=0.4305, Test Loss A=0.2958, Test Loss B=2.3795]


0.4 0.291


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.46s/epoch, Train Loss=0.4298, Test Loss A=0.2920, Test Loss B=2.3768]


0.45 0.261


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.49s/epoch, Train Loss=0.4311, Test Loss A=0.2894, Test Loss B=2.3341]


0.5 0.272


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.48s/epoch, Train Loss=0.4298, Test Loss A=0.3113, Test Loss B=2.3186]


0.55 0.263


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.42s/epoch, Train Loss=0.4308, Test Loss A=0.2915, Test Loss B=2.3738]


0.6 0.269


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.54s/epoch, Train Loss=0.4325, Test Loss A=0.3109, Test Loss B=2.2925]


0.65 0.28


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.46s/epoch, Train Loss=0.4350, Test Loss A=0.2911, Test Loss B=2.4522]


0.7 0.278


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.41s/epoch, Train Loss=0.4319, Test Loss A=0.2948, Test Loss B=2.3194]


0.75 0.265


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.40s/epoch, Train Loss=0.4315, Test Loss A=0.3037, Test Loss B=2.3720]


0.8 0.271


Epoch 10/10: 100%|██████████| 10/10 [00:23<00:00,  2.38s/epoch, Train Loss=0.4315, Test Loss A=0.2968, Test Loss B=2.3111]


0.85 0.27
(448, 523, 1051) (448,) (448,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:28<00:00,  2.89s/epoch, Train Loss=0.4092, Test Loss A=0.3622, Test Loss B=1.9909]


0.25 0.362
----------------------------------------
run_name-----------_7
best_alpha---------0.25
MSE_testset--------0.362
----------------------------------------
(392, 523, 1051) (392,) (392,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.43s/epoch, Train Loss=0.2636, Test Loss A=0.1810, Test Loss B=1.2284]


0.05 0.181


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.41s/epoch, Train Loss=0.2590, Test Loss A=0.1896, Test Loss B=1.2347]


0.1 0.187


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.42s/epoch, Train Loss=0.2590, Test Loss A=0.1871, Test Loss B=1.2705]


0.15 0.183


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.50s/epoch, Train Loss=0.2587, Test Loss A=0.1958, Test Loss B=1.2610]


0.2 0.191


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.50s/epoch, Train Loss=0.2586, Test Loss A=0.1952, Test Loss B=1.3040]


0.25 0.182


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.50s/epoch, Train Loss=0.2621, Test Loss A=0.2022, Test Loss B=1.2948]


0.3 0.189


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.54s/epoch, Train Loss=0.2601, Test Loss A=0.2125, Test Loss B=1.2167]


0.35 0.184


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.52s/epoch, Train Loss=0.2596, Test Loss A=0.2115, Test Loss B=1.2515]


0.4 0.185


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.51s/epoch, Train Loss=0.2584, Test Loss A=0.2227, Test Loss B=1.4535]


0.45 0.185


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.48s/epoch, Train Loss=0.2580, Test Loss A=0.2270, Test Loss B=1.2616]


0.5 0.186


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.49s/epoch, Train Loss=0.2575, Test Loss A=0.2254, Test Loss B=1.2825]


0.55 0.183


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.49s/epoch, Train Loss=0.2557, Test Loss A=0.2159, Test Loss B=1.2773]


0.6 0.183


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.49s/epoch, Train Loss=0.2577, Test Loss A=0.2143, Test Loss B=1.1963]


0.65 0.182


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.48s/epoch, Train Loss=0.2556, Test Loss A=0.2132, Test Loss B=1.1492]


0.7 0.181


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.50s/epoch, Train Loss=0.2580, Test Loss A=0.2164, Test Loss B=1.1481]


0.75 0.188


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.51s/epoch, Train Loss=0.2549, Test Loss A=0.2379, Test Loss B=1.1334]


0.8 0.184


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.47s/epoch, Train Loss=0.2574, Test Loss A=0.2324, Test Loss B=1.1310]


0.85 0.186
(448, 523, 1051) (448,) (448,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:29<00:00,  2.96s/epoch, Train Loss=0.2518, Test Loss A=0.2930, Test Loss B=0.6435]


0.05 0.28
----------------------------------------
run_name-----------_15
best_alpha---------0.05
MSE_testset--------0.28
----------------------------------------
(392, 523, 1051) (392,) (392,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.48s/epoch, Train Loss=0.1912, Test Loss A=0.1211, Test Loss B=0.2907]


0.05 0.117


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.47s/epoch, Train Loss=0.1929, Test Loss A=0.1216, Test Loss B=0.2957]


0.1 0.118


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.47s/epoch, Train Loss=0.1901, Test Loss A=0.1407, Test Loss B=0.3058]


0.15 0.118


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.47s/epoch, Train Loss=0.1907, Test Loss A=0.1246, Test Loss B=0.2976]


0.2 0.118


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.47s/epoch, Train Loss=0.1902, Test Loss A=0.1350, Test Loss B=0.2883]


0.25 0.12


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.45s/epoch, Train Loss=0.1889, Test Loss A=0.1201, Test Loss B=0.2833]


0.3 0.115


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.47s/epoch, Train Loss=0.1884, Test Loss A=0.1281, Test Loss B=0.2920]


0.35 0.119


Epoch 10/10: 100%|██████████| 10/10 [00:27<00:00,  2.72s/epoch, Train Loss=0.1890, Test Loss A=0.1287, Test Loss B=0.2846]


0.4 0.116


Epoch 10/10: 100%|██████████| 10/10 [00:26<00:00,  2.66s/epoch, Train Loss=0.1885, Test Loss A=0.1249, Test Loss B=0.2795]


0.45 0.113


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.57s/epoch, Train Loss=0.1900, Test Loss A=0.1277, Test Loss B=0.2919]


0.5 0.119


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.46s/epoch, Train Loss=0.1917, Test Loss A=0.1287, Test Loss B=0.2838]


0.55 0.116


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.54s/epoch, Train Loss=0.1892, Test Loss A=0.1335, Test Loss B=0.2806]


0.6 0.118


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.57s/epoch, Train Loss=0.1920, Test Loss A=0.1347, Test Loss B=0.2800]


0.65 0.118


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.52s/epoch, Train Loss=0.1920, Test Loss A=0.1261, Test Loss B=0.2772]


0.7 0.115


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.54s/epoch, Train Loss=0.1910, Test Loss A=0.1314, Test Loss B=0.2793]


0.75 0.118


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.50s/epoch, Train Loss=0.1910, Test Loss A=0.1350, Test Loss B=0.2896]


0.8 0.119


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.46s/epoch, Train Loss=0.1903, Test Loss A=0.1322, Test Loss B=0.2928]


0.85 0.116
(448, 523, 1051) (448,) (448,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:29<00:00,  2.96s/epoch, Train Loss=0.1795, Test Loss A=0.1913, Test Loss B=0.4571]


0.45 0.19
----------------------------------------
run_name-----------_30
best_alpha---------0.45
MSE_testset--------0.19
----------------------------------------
(392, 523, 1051) (392,) (392,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.50s/epoch, Train Loss=0.7496, Test Loss A=0.5640, Test Loss B=4.9113]


0.05 0.56


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.49s/epoch, Train Loss=0.7371, Test Loss A=0.6106, Test Loss B=5.0016]


0.1 0.593


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.47s/epoch, Train Loss=0.7375, Test Loss A=0.5780, Test Loss B=4.9170]


0.15 0.564


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.48s/epoch, Train Loss=0.7260, Test Loss A=0.6118, Test Loss B=5.1356]


0.2 0.58


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.50s/epoch, Train Loss=0.7276, Test Loss A=0.6408, Test Loss B=5.0350]


0.25 0.596


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.48s/epoch, Train Loss=0.7193, Test Loss A=0.7292, Test Loss B=5.0662]


0.3 0.595


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.50s/epoch, Train Loss=0.7217, Test Loss A=0.6560, Test Loss B=5.1278]


0.35 0.581


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.44s/epoch, Train Loss=0.7164, Test Loss A=0.7113, Test Loss B=5.0302]


0.4 0.614


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.46s/epoch, Train Loss=0.7118, Test Loss A=0.7561, Test Loss B=5.0359]


0.45 0.605


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.43s/epoch, Train Loss=0.7222, Test Loss A=0.6475, Test Loss B=4.9642]


0.5 0.592


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.43s/epoch, Train Loss=0.7166, Test Loss A=0.6558, Test Loss B=5.0661]


0.55 0.573


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.42s/epoch, Train Loss=0.7058, Test Loss A=0.7232, Test Loss B=5.1137]


0.6 0.605


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.40s/epoch, Train Loss=0.7195, Test Loss A=0.6422, Test Loss B=5.0764]


0.65 0.567


Epoch 10/10: 100%|██████████| 10/10 [00:23<00:00,  2.39s/epoch, Train Loss=0.7242, Test Loss A=0.7140, Test Loss B=4.9009]


0.7 0.596


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.43s/epoch, Train Loss=0.7205, Test Loss A=0.6374, Test Loss B=4.8919]


0.75 0.591


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.41s/epoch, Train Loss=0.7104, Test Loss A=0.7521, Test Loss B=4.9063]


0.8 0.576


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.44s/epoch, Train Loss=0.7147, Test Loss A=0.7598, Test Loss B=5.0931]


0.85 0.589
(448, 523, 1051) (448,) (448,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:29<00:00,  2.98s/epoch, Train Loss=0.7266, Test Loss A=0.7480, Test Loss B=0.2477]


0.05 0.734
----------------------------------------
run_name-----------Roberta2/
best_alpha---------0.05
MSE_testset--------0.734
----------------------------------------
(392, 523, 1051) (392,) (392,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.49s/epoch, Train Loss=0.4296, Test Loss A=0.2592, Test Loss B=2.4339]


0.05 0.259


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.51s/epoch, Train Loss=0.4304, Test Loss A=0.2561, Test Loss B=2.3692]


0.1 0.256


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.49s/epoch, Train Loss=0.4247, Test Loss A=0.2647, Test Loss B=2.4141]


0.15 0.265


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.49s/epoch, Train Loss=0.4236, Test Loss A=0.2579, Test Loss B=2.3633]


0.2 0.258


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.49s/epoch, Train Loss=0.4243, Test Loss A=0.2640, Test Loss B=2.3769]


0.25 0.264


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.47s/epoch, Train Loss=0.4239, Test Loss A=0.2618, Test Loss B=2.3796]


0.3 0.262


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.50s/epoch, Train Loss=0.4234, Test Loss A=0.2683, Test Loss B=2.3749]


0.35 0.268


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.50s/epoch, Train Loss=0.4216, Test Loss A=0.2530, Test Loss B=2.3207]


0.4 0.252


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.48s/epoch, Train Loss=0.4182, Test Loss A=0.2542, Test Loss B=2.3695]


0.45 0.252


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.49s/epoch, Train Loss=0.4178, Test Loss A=0.2889, Test Loss B=2.3438]


0.5 0.259


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.53s/epoch, Train Loss=0.4191, Test Loss A=0.2590, Test Loss B=2.3282]


0.55 0.257


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.53s/epoch, Train Loss=0.4171, Test Loss A=0.2538, Test Loss B=2.3180]


0.6 0.252


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.51s/epoch, Train Loss=0.4156, Test Loss A=0.2821, Test Loss B=2.3408]


0.65 0.265


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.56s/epoch, Train Loss=0.4168, Test Loss A=0.2601, Test Loss B=2.3093]


0.7 0.256


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.50s/epoch, Train Loss=0.4145, Test Loss A=0.2787, Test Loss B=2.2630]


0.75 0.265


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.49s/epoch, Train Loss=0.4168, Test Loss A=0.2597, Test Loss B=2.2878]


0.8 0.253


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.51s/epoch, Train Loss=0.4169, Test Loss A=0.2834, Test Loss B=2.3371]


0.85 0.252
(448, 523, 1051) (448,) (448,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:29<00:00,  2.96s/epoch, Train Loss=0.4002, Test Loss A=0.3431, Test Loss B=2.0490]


0.4 0.343
----------------------------------------
run_name-----------Roberta2/_7
best_alpha---------0.4
MSE_testset--------0.343
----------------------------------------
(392, 523, 1051) (392,) (392,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.51s/epoch, Train Loss=0.2607, Test Loss A=0.1803, Test Loss B=1.1937]


0.05 0.18


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.48s/epoch, Train Loss=0.2615, Test Loss A=0.1785, Test Loss B=1.2105]


0.1 0.178


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.48s/epoch, Train Loss=0.2577, Test Loss A=0.1780, Test Loss B=1.2273]


0.15 0.178


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.53s/epoch, Train Loss=0.2583, Test Loss A=0.1826, Test Loss B=1.2301]


0.2 0.183


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.51s/epoch, Train Loss=0.2582, Test Loss A=0.1810, Test Loss B=1.1778]


0.25 0.18


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.55s/epoch, Train Loss=0.2568, Test Loss A=0.1787, Test Loss B=1.2163]


0.3 0.179


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.50s/epoch, Train Loss=0.2548, Test Loss A=0.1778, Test Loss B=1.2228]


0.35 0.178


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.56s/epoch, Train Loss=0.2544, Test Loss A=0.1776, Test Loss B=1.2428]


0.4 0.178


Epoch 10/10: 100%|██████████| 10/10 [00:26<00:00,  2.60s/epoch, Train Loss=0.2515, Test Loss A=0.1772, Test Loss B=1.2583]


0.45 0.177


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.54s/epoch, Train Loss=0.2512, Test Loss A=0.1751, Test Loss B=1.1554]


0.5 0.175


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.51s/epoch, Train Loss=0.2520, Test Loss A=0.1794, Test Loss B=1.2273]


0.55 0.179


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.51s/epoch, Train Loss=0.2492, Test Loss A=0.1889, Test Loss B=1.1657]


0.6 0.184


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.59s/epoch, Train Loss=0.2455, Test Loss A=0.1790, Test Loss B=1.0903]


0.65 0.177


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.58s/epoch, Train Loss=0.2462, Test Loss A=0.1762, Test Loss B=1.0877]


0.7 0.174


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.52s/epoch, Train Loss=0.2469, Test Loss A=0.1744, Test Loss B=1.1488]


0.75 0.172


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.49s/epoch, Train Loss=0.2451, Test Loss A=0.1847, Test Loss B=1.0149]


0.8 0.175


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.54s/epoch, Train Loss=0.2428, Test Loss A=0.1888, Test Loss B=0.9151]


0.85 0.18
(448, 523, 1051) (448,) (448,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:31<00:00,  3.14s/epoch, Train Loss=0.2323, Test Loss A=0.3181, Test Loss B=0.4995]


0.75 0.276
----------------------------------------
run_name-----------Roberta2/_15
best_alpha---------0.75
MSE_testset--------0.276
----------------------------------------
(392, 523, 1051) (392,) (392,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.53s/epoch, Train Loss=0.1933, Test Loss A=0.1076, Test Loss B=0.3016]


0.05 0.108


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.57s/epoch, Train Loss=0.1927, Test Loss A=0.1090, Test Loss B=0.2996]


0.1 0.109


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.50s/epoch, Train Loss=0.1909, Test Loss A=0.1071, Test Loss B=0.2861]


0.15 0.107


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.53s/epoch, Train Loss=0.1899, Test Loss A=0.1069, Test Loss B=0.2816]


0.2 0.107


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.55s/epoch, Train Loss=0.1891, Test Loss A=0.1009, Test Loss B=0.2972]


0.25 0.101


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.54s/epoch, Train Loss=0.1886, Test Loss A=0.1003, Test Loss B=0.2942]


0.3 0.1


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.53s/epoch, Train Loss=0.1889, Test Loss A=0.1013, Test Loss B=0.2830]


0.35 0.101


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.51s/epoch, Train Loss=0.1865, Test Loss A=0.0967, Test Loss B=0.3112]


0.4 0.097


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.55s/epoch, Train Loss=0.1871, Test Loss A=0.1015, Test Loss B=0.2924]


0.45 0.102


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.59s/epoch, Train Loss=0.1869, Test Loss A=0.1055, Test Loss B=0.2740]


0.5 0.105


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.59s/epoch, Train Loss=0.1864, Test Loss A=0.1065, Test Loss B=0.3327]


0.55 0.106


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.52s/epoch, Train Loss=0.1854, Test Loss A=0.1005, Test Loss B=0.2708]


0.6 0.1


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.50s/epoch, Train Loss=0.1849, Test Loss A=0.1049, Test Loss B=0.2733]


0.65 0.105


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.51s/epoch, Train Loss=0.1842, Test Loss A=0.1116, Test Loss B=0.2657]


0.7 0.106


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.52s/epoch, Train Loss=0.1869, Test Loss A=0.1079, Test Loss B=0.2676]


0.75 0.108


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.55s/epoch, Train Loss=0.1839, Test Loss A=0.1095, Test Loss B=0.2718]


0.8 0.105


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.58s/epoch, Train Loss=0.1824, Test Loss A=0.1094, Test Loss B=0.2688]


0.85 0.107
(448, 523, 1051) (448,) (448,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:30<00:00,  3.05s/epoch, Train Loss=0.1724, Test Loss A=0.1867, Test Loss B=0.4995]


0.4 0.182
----------------------------------------
run_name-----------Roberta2/_30
best_alpha---------0.4
MSE_testset--------0.182
----------------------------------------
(392, 523, 795) (392,) (392,)
(56, 523, 795) (56,) (56,)
(117, 523, 795) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:18<00:00,  1.88s/epoch, Train Loss=0.7467, Test Loss A=0.5589, Test Loss B=4.9114]


0.05 0.557


Epoch 10/10: 100%|██████████| 10/10 [00:18<00:00,  1.85s/epoch, Train Loss=0.7411, Test Loss A=0.5791, Test Loss B=4.8575]


0.1 0.578


Epoch 10/10: 100%|██████████| 10/10 [00:18<00:00,  1.82s/epoch, Train Loss=0.7342, Test Loss A=0.5915, Test Loss B=4.9856]


0.15 0.586


Epoch 10/10: 100%|██████████| 10/10 [00:18<00:00,  1.82s/epoch, Train Loss=0.7389, Test Loss A=0.5675, Test Loss B=4.9401]


0.2 0.557


Epoch 10/10: 100%|██████████| 10/10 [00:17<00:00,  1.78s/epoch, Train Loss=0.7305, Test Loss A=0.6137, Test Loss B=4.9873]


0.25 0.602


Epoch 10/10: 100%|██████████| 10/10 [00:17<00:00,  1.75s/epoch, Train Loss=0.7318, Test Loss A=0.6271, Test Loss B=4.9193]


0.3 0.592


Epoch 10/10: 100%|██████████| 10/10 [00:17<00:00,  1.75s/epoch, Train Loss=0.7304, Test Loss A=0.6366, Test Loss B=4.8413]


0.35 0.587


Epoch 10/10: 100%|██████████| 10/10 [00:18<00:00,  1.85s/epoch, Train Loss=0.7293, Test Loss A=0.6351, Test Loss B=4.8983]


0.4 0.617


Epoch 10/10: 100%|██████████| 10/10 [00:19<00:00,  1.94s/epoch, Train Loss=0.7242, Test Loss A=0.6431, Test Loss B=4.8356]


0.45 0.616


Epoch 10/10: 100%|██████████| 10/10 [00:18<00:00,  1.87s/epoch, Train Loss=0.7210, Test Loss A=0.6734, Test Loss B=4.9430]


0.5 0.604


Epoch 10/10: 100%|██████████| 10/10 [00:18<00:00,  1.86s/epoch, Train Loss=0.7224, Test Loss A=0.6813, Test Loss B=4.9527]


0.55 0.595


Epoch 10/10: 100%|██████████| 10/10 [00:18<00:00,  1.84s/epoch, Train Loss=0.7231, Test Loss A=0.6526, Test Loss B=4.8893]


0.6 0.579


Epoch 10/10: 100%|██████████| 10/10 [00:17<00:00,  1.79s/epoch, Train Loss=0.7230, Test Loss A=0.6575, Test Loss B=4.8786]


0.65 0.586


Epoch 10/10: 100%|██████████| 10/10 [00:18<00:00,  1.88s/epoch, Train Loss=0.7193, Test Loss A=0.6702, Test Loss B=4.9247]


0.7 0.601


Epoch 10/10: 100%|██████████| 10/10 [00:18<00:00,  1.85s/epoch, Train Loss=0.7251, Test Loss A=0.6584, Test Loss B=5.0489]


0.75 0.59


Epoch 10/10: 100%|██████████| 10/10 [00:18<00:00,  1.86s/epoch, Train Loss=0.7215, Test Loss A=0.7051, Test Loss B=4.9726]


0.8 0.605


Epoch 10/10: 100%|██████████| 10/10 [00:18<00:00,  1.85s/epoch, Train Loss=0.7232, Test Loss A=0.6444, Test Loss B=4.9871]


0.85 0.554
(448, 523, 795) (448,) (448,)
(56, 523, 795) (56,) (56,)
(117, 523, 795) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:22<00:00,  2.21s/epoch, Train Loss=0.7030, Test Loss A=0.7012, Test Loss B=0.2769]


0.85 0.698
----------------------------------------
run_name-----------investopedia/
best_alpha---------0.85
MSE_testset--------0.698
----------------------------------------
(392, 523, 795) (392,) (392,)
(56, 523, 795) (56,) (56,)
(117, 523, 795) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:18<00:00,  1.82s/epoch, Train Loss=0.4272, Test Loss A=0.2696, Test Loss B=2.3352]


0.05 0.27


Epoch 10/10: 100%|██████████| 10/10 [00:18<00:00,  1.82s/epoch, Train Loss=0.4268, Test Loss A=0.2666, Test Loss B=2.2682]


0.1 0.266


Epoch 10/10: 100%|██████████| 10/10 [00:18<00:00,  1.85s/epoch, Train Loss=0.4276, Test Loss A=0.2646, Test Loss B=2.2854]


0.15 0.265


Epoch 10/10: 100%|██████████| 10/10 [00:18<00:00,  1.83s/epoch, Train Loss=0.4276, Test Loss A=0.2850, Test Loss B=2.2884]


0.2 0.284


Epoch 10/10: 100%|██████████| 10/10 [00:17<00:00,  1.79s/epoch, Train Loss=0.4242, Test Loss A=0.2595, Test Loss B=2.3360]


0.25 0.259


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.66s/epoch, Train Loss=0.4209, Test Loss A=0.2657, Test Loss B=2.3276]


0.3 0.266


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.70s/epoch, Train Loss=0.4240, Test Loss A=0.2626, Test Loss B=2.3495]


0.35 0.263


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.66s/epoch, Train Loss=0.4238, Test Loss A=0.2690, Test Loss B=2.3060]


0.4 0.269


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.62s/epoch, Train Loss=0.4216, Test Loss A=0.2641, Test Loss B=2.3629]


0.45 0.264


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.66s/epoch, Train Loss=0.4175, Test Loss A=0.2596, Test Loss B=2.3285]


0.5 0.26


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.65s/epoch, Train Loss=0.4212, Test Loss A=0.2703, Test Loss B=2.3360]


0.55 0.27


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.65s/epoch, Train Loss=0.4219, Test Loss A=0.2627, Test Loss B=2.3334]


0.6 0.261


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.70s/epoch, Train Loss=0.4195, Test Loss A=0.2660, Test Loss B=2.2880]


0.65 0.266


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.62s/epoch, Train Loss=0.4183, Test Loss A=0.2581, Test Loss B=2.3040]


0.7 0.258


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.69s/epoch, Train Loss=0.4192, Test Loss A=0.2462, Test Loss B=2.3152]


0.75 0.246


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.68s/epoch, Train Loss=0.4192, Test Loss A=0.2535, Test Loss B=2.3129]


0.8 0.254


Epoch 10/10: 100%|██████████| 10/10 [00:17<00:00,  1.72s/epoch, Train Loss=0.4160, Test Loss A=0.2559, Test Loss B=2.2770]


0.85 0.256
(448, 523, 795) (448,) (448,)
(56, 523, 795) (56,) (56,)
(117, 523, 795) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:19<00:00,  1.97s/epoch, Train Loss=0.3970, Test Loss A=0.3235, Test Loss B=2.0326]


0.75 0.324
----------------------------------------
run_name-----------investopedia/_7
best_alpha---------0.75
MSE_testset--------0.324
----------------------------------------
(392, 523, 795) (392,) (392,)
(56, 523, 795) (56,) (56,)
(117, 523, 795) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:17<00:00,  1.72s/epoch, Train Loss=0.2628, Test Loss A=0.1838, Test Loss B=1.1653]


0.05 0.184


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.67s/epoch, Train Loss=0.2608, Test Loss A=0.1846, Test Loss B=1.1952]


0.1 0.185


Epoch 10/10: 100%|██████████| 10/10 [00:17<00:00,  1.71s/epoch, Train Loss=0.2605, Test Loss A=0.1824, Test Loss B=1.1249]


0.15 0.182


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.69s/epoch, Train Loss=0.2580, Test Loss A=0.1774, Test Loss B=1.1282]


0.2 0.177


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.69s/epoch, Train Loss=0.2589, Test Loss A=0.1750, Test Loss B=1.1292]


0.25 0.175


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.67s/epoch, Train Loss=0.2559, Test Loss A=0.1838, Test Loss B=1.1956]


0.3 0.184


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.67s/epoch, Train Loss=0.2560, Test Loss A=0.1755, Test Loss B=1.1254]


0.35 0.175


Epoch 10/10: 100%|██████████| 10/10 [00:17<00:00,  1.71s/epoch, Train Loss=0.2540, Test Loss A=0.1733, Test Loss B=1.1022]


0.4 0.173


Epoch 10/10: 100%|██████████| 10/10 [00:17<00:00,  1.71s/epoch, Train Loss=0.2532, Test Loss A=0.1764, Test Loss B=1.1810]


0.45 0.176


Epoch 10/10: 100%|██████████| 10/10 [00:17<00:00,  1.70s/epoch, Train Loss=0.2550, Test Loss A=0.1743, Test Loss B=1.1616]


0.5 0.174


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.69s/epoch, Train Loss=0.2512, Test Loss A=0.1757, Test Loss B=1.1563]


0.55 0.176


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.63s/epoch, Train Loss=0.2515, Test Loss A=0.1802, Test Loss B=1.1024]


0.6 0.18


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.66s/epoch, Train Loss=0.2497, Test Loss A=0.1729, Test Loss B=1.0516]


0.65 0.173


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.67s/epoch, Train Loss=0.2519, Test Loss A=0.1756, Test Loss B=1.0242]


0.7 0.176


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.65s/epoch, Train Loss=0.2485, Test Loss A=0.1695, Test Loss B=1.0026]


0.75 0.17


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.67s/epoch, Train Loss=0.2494, Test Loss A=0.1723, Test Loss B=1.0977]


0.8 0.172


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.65s/epoch, Train Loss=0.2508, Test Loss A=0.1771, Test Loss B=1.0087]


0.85 0.177
(448, 523, 795) (448,) (448,)
(56, 523, 795) (56,) (56,)
(117, 523, 795) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:20<00:00,  2.00s/epoch, Train Loss=0.2401, Test Loss A=0.2633, Test Loss B=0.5573]


0.75 0.263
----------------------------------------
run_name-----------investopedia/_15
best_alpha---------0.75
MSE_testset--------0.263
----------------------------------------
(392, 523, 795) (392,) (392,)
(56, 523, 795) (56,) (56,)
(117, 523, 795) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.70s/epoch, Train Loss=0.1918, Test Loss A=0.1128, Test Loss B=0.2909]


0.05 0.113


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.70s/epoch, Train Loss=0.1917, Test Loss A=0.1129, Test Loss B=0.2766]


0.1 0.113


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.65s/epoch, Train Loss=0.1902, Test Loss A=0.1070, Test Loss B=0.2963]


0.15 0.107


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.67s/epoch, Train Loss=0.1915, Test Loss A=0.1108, Test Loss B=0.2794]


0.2 0.111


Epoch 10/10: 100%|██████████| 10/10 [00:17<00:00,  1.76s/epoch, Train Loss=0.1884, Test Loss A=0.1048, Test Loss B=0.2865]


0.25 0.105


Epoch 10/10: 100%|██████████| 10/10 [00:17<00:00,  1.74s/epoch, Train Loss=0.1912, Test Loss A=0.1060, Test Loss B=0.2847]


0.3 0.106


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.68s/epoch, Train Loss=0.1884, Test Loss A=0.1021, Test Loss B=0.3051]


0.35 0.102


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.66s/epoch, Train Loss=0.1880, Test Loss A=0.1040, Test Loss B=0.2851]


0.4 0.104


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.67s/epoch, Train Loss=0.1880, Test Loss A=0.1056, Test Loss B=0.2869]


0.45 0.106


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.65s/epoch, Train Loss=0.1892, Test Loss A=0.1002, Test Loss B=0.2798]


0.5 0.1


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.69s/epoch, Train Loss=0.1867, Test Loss A=0.1001, Test Loss B=0.2782]


0.55 0.1


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.66s/epoch, Train Loss=0.1869, Test Loss A=0.1032, Test Loss B=0.2815]


0.6 0.103


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.66s/epoch, Train Loss=0.1859, Test Loss A=0.1033, Test Loss B=0.2730]


0.65 0.103


Epoch 10/10: 100%|██████████| 10/10 [00:17<00:00,  1.71s/epoch, Train Loss=0.1863, Test Loss A=0.1024, Test Loss B=0.2860]


0.7 0.102


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.70s/epoch, Train Loss=0.1888, Test Loss A=0.1057, Test Loss B=0.2947]


0.75 0.106


Epoch 10/10: 100%|██████████| 10/10 [00:17<00:00,  1.70s/epoch, Train Loss=0.1880, Test Loss A=0.1085, Test Loss B=0.2816]


0.8 0.108


Epoch 10/10: 100%|██████████| 10/10 [00:16<00:00,  1.63s/epoch, Train Loss=0.1851, Test Loss A=0.1085, Test Loss B=0.2866]


0.85 0.108
(448, 523, 795) (448,) (448,)
(56, 523, 795) (56,) (56,)
(117, 523, 795) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:19<00:00,  2.00s/epoch, Train Loss=0.1768, Test Loss A=0.1883, Test Loss B=0.4765]


0.5 0.187
----------------------------------------
run_name-----------investopedia/_30
best_alpha---------0.5
MSE_testset--------0.187
----------------------------------------
(392, 523, 1051) (392,) (392,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.46s/epoch, Train Loss=0.7451, Test Loss A=0.5822, Test Loss B=4.8407]


0.05 0.578


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.47s/epoch, Train Loss=0.7441, Test Loss A=0.5770, Test Loss B=4.9290]


0.1 0.566


Epoch 10/10: 100%|██████████| 10/10 [00:23<00:00,  2.38s/epoch, Train Loss=0.7333, Test Loss A=0.5933, Test Loss B=4.9167]


0.15 0.571


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.43s/epoch, Train Loss=0.7353, Test Loss A=0.6200, Test Loss B=5.0370]


0.2 0.6


Epoch 10/10: 100%|██████████| 10/10 [00:23<00:00,  2.39s/epoch, Train Loss=0.7296, Test Loss A=0.6264, Test Loss B=4.8640]


0.25 0.597


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.44s/epoch, Train Loss=0.7305, Test Loss A=0.6174, Test Loss B=4.8668]


0.3 0.571


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.44s/epoch, Train Loss=0.7203, Test Loss A=0.6215, Test Loss B=4.9204]


0.35 0.588


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.45s/epoch, Train Loss=0.7212, Test Loss A=0.6583, Test Loss B=4.9506]


0.4 0.605


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.44s/epoch, Train Loss=0.7224, Test Loss A=0.6730, Test Loss B=4.8957]


0.45 0.602


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.42s/epoch, Train Loss=0.7174, Test Loss A=0.6848, Test Loss B=4.9041]


0.5 0.615


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.42s/epoch, Train Loss=0.7185, Test Loss A=0.7097, Test Loss B=4.8998]


0.55 0.637


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.44s/epoch, Train Loss=0.7199, Test Loss A=0.6664, Test Loss B=5.0604]


0.6 0.556


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.45s/epoch, Train Loss=0.7210, Test Loss A=0.7046, Test Loss B=4.9638]


0.65 0.578


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.44s/epoch, Train Loss=0.7196, Test Loss A=0.6928, Test Loss B=4.9840]


0.7 0.575


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.43s/epoch, Train Loss=0.7172, Test Loss A=0.7057, Test Loss B=4.8502]


0.75 0.597


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.42s/epoch, Train Loss=0.7226, Test Loss A=0.6428, Test Loss B=4.9434]


0.8 0.582


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.44s/epoch, Train Loss=0.7210, Test Loss A=0.6865, Test Loss B=4.9416]


0.85 0.587
(448, 523, 1051) (448,) (448,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:29<00:00,  2.94s/epoch, Train Loss=0.6982, Test Loss A=0.6932, Test Loss B=0.2565]


0.6 0.693
----------------------------------------
run_name-----------bge/
best_alpha---------0.6
MSE_testset--------0.693
----------------------------------------
(392, 523, 1051) (392,) (392,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.52s/epoch, Train Loss=0.4301, Test Loss A=0.2633, Test Loss B=2.4041]


0.05 0.263


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.49s/epoch, Train Loss=0.4313, Test Loss A=0.2824, Test Loss B=2.3856]


0.1 0.282


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.43s/epoch, Train Loss=0.4277, Test Loss A=0.2777, Test Loss B=2.3667]


0.15 0.278


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.44s/epoch, Train Loss=0.4243, Test Loss A=0.2653, Test Loss B=2.2893]


0.2 0.265


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.45s/epoch, Train Loss=0.4239, Test Loss A=0.2815, Test Loss B=2.3589]


0.25 0.28


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.45s/epoch, Train Loss=0.4245, Test Loss A=0.2708, Test Loss B=2.3234]


0.3 0.269


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.43s/epoch, Train Loss=0.4199, Test Loss A=0.2787, Test Loss B=2.3289]


0.35 0.271


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.46s/epoch, Train Loss=0.4288, Test Loss A=0.2770, Test Loss B=2.3707]


0.4 0.267


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.48s/epoch, Train Loss=0.4239, Test Loss A=0.2701, Test Loss B=2.4045]


0.45 0.27


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.46s/epoch, Train Loss=0.4194, Test Loss A=0.2767, Test Loss B=2.3439]


0.5 0.27


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.44s/epoch, Train Loss=0.4179, Test Loss A=0.2688, Test Loss B=2.2902]


0.55 0.262


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.45s/epoch, Train Loss=0.4160, Test Loss A=0.2705, Test Loss B=2.3460]


0.6 0.267


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.46s/epoch, Train Loss=0.4198, Test Loss A=0.2620, Test Loss B=2.3139]


0.65 0.259


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.52s/epoch, Train Loss=0.4206, Test Loss A=0.2533, Test Loss B=2.3412]


0.7 0.253


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.45s/epoch, Train Loss=0.4182, Test Loss A=0.2734, Test Loss B=2.3362]


0.75 0.264


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.52s/epoch, Train Loss=0.4151, Test Loss A=0.2578, Test Loss B=2.3069]


0.8 0.258


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.58s/epoch, Train Loss=0.4139, Test Loss A=0.2724, Test Loss B=2.2968]


0.85 0.268
(448, 523, 1051) (448,) (448,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:30<00:00,  3.02s/epoch, Train Loss=0.3943, Test Loss A=0.3242, Test Loss B=2.0494]


0.7 0.324
----------------------------------------
run_name-----------bge/_7
best_alpha---------0.7
MSE_testset--------0.324
----------------------------------------
(392, 523, 1051) (392,) (392,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.53s/epoch, Train Loss=0.2602, Test Loss A=0.1806, Test Loss B=1.2837]


0.05 0.181


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.52s/epoch, Train Loss=0.2621, Test Loss A=0.1822, Test Loss B=1.2283]


0.1 0.182


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.50s/epoch, Train Loss=0.2591, Test Loss A=0.1755, Test Loss B=1.2143]


0.15 0.176


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.53s/epoch, Train Loss=0.2578, Test Loss A=0.1800, Test Loss B=1.2010]


0.2 0.18


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.50s/epoch, Train Loss=0.2557, Test Loss A=0.1741, Test Loss B=1.2007]


0.25 0.174


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.53s/epoch, Train Loss=0.2585, Test Loss A=0.1817, Test Loss B=1.2760]


0.3 0.182


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.56s/epoch, Train Loss=0.2541, Test Loss A=0.1759, Test Loss B=1.2241]


0.35 0.176


Epoch 10/10: 100%|██████████| 10/10 [00:26<00:00,  2.65s/epoch, Train Loss=0.2539, Test Loss A=0.1779, Test Loss B=1.2536]


0.4 0.178


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.55s/epoch, Train Loss=0.2516, Test Loss A=0.1788, Test Loss B=1.1738]


0.45 0.179


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.51s/epoch, Train Loss=0.2536, Test Loss A=0.1759, Test Loss B=1.2146]


0.5 0.176


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.52s/epoch, Train Loss=0.2507, Test Loss A=0.1753, Test Loss B=1.1805]


0.55 0.175


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.56s/epoch, Train Loss=0.2521, Test Loss A=0.1760, Test Loss B=1.2313]


0.6 0.176


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.52s/epoch, Train Loss=0.2509, Test Loss A=0.1785, Test Loss B=1.1111]


0.65 0.179


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.58s/epoch, Train Loss=0.2514, Test Loss A=0.1817, Test Loss B=1.1504]


0.7 0.181


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.50s/epoch, Train Loss=0.2488, Test Loss A=0.1721, Test Loss B=1.1244]


0.75 0.172


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.48s/epoch, Train Loss=0.2503, Test Loss A=0.1910, Test Loss B=1.0346]


0.8 0.185


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.55s/epoch, Train Loss=0.2497, Test Loss A=0.1804, Test Loss B=0.9973]


0.85 0.178
(448, 523, 1051) (448,) (448,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:30<00:00,  3.06s/epoch, Train Loss=0.2357, Test Loss A=0.2651, Test Loss B=0.5462]


0.75 0.265
----------------------------------------
run_name-----------bge/_15
best_alpha---------0.75
MSE_testset--------0.265
----------------------------------------
(392, 523, 1051) (392,) (392,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.54s/epoch, Train Loss=0.1942, Test Loss A=0.1127, Test Loss B=0.3076]


0.05 0.113


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.49s/epoch, Train Loss=0.1910, Test Loss A=0.1062, Test Loss B=0.3011]


0.1 0.106


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.52s/epoch, Train Loss=0.1913, Test Loss A=0.1059, Test Loss B=0.2906]


0.15 0.106


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.55s/epoch, Train Loss=0.1890, Test Loss A=0.1067, Test Loss B=0.3071]


0.2 0.107


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.55s/epoch, Train Loss=0.1900, Test Loss A=0.1024, Test Loss B=0.2923]


0.25 0.102


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.52s/epoch, Train Loss=0.1879, Test Loss A=0.1062, Test Loss B=0.2859]


0.3 0.106


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.48s/epoch, Train Loss=0.1888, Test Loss A=0.1072, Test Loss B=0.2800]


0.35 0.107


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.52s/epoch, Train Loss=0.1887, Test Loss A=0.1057, Test Loss B=0.2886]


0.4 0.106


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.57s/epoch, Train Loss=0.1870, Test Loss A=0.1056, Test Loss B=0.2809]


0.45 0.106


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.53s/epoch, Train Loss=0.1855, Test Loss A=0.1041, Test Loss B=0.2761]


0.5 0.104


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.54s/epoch, Train Loss=0.1850, Test Loss A=0.1078, Test Loss B=0.2649]


0.55 0.108


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.51s/epoch, Train Loss=0.1865, Test Loss A=0.1068, Test Loss B=0.2738]


0.6 0.107


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.48s/epoch, Train Loss=0.1865, Test Loss A=0.1089, Test Loss B=0.2760]


0.65 0.109


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.52s/epoch, Train Loss=0.1877, Test Loss A=0.1085, Test Loss B=0.2805]


0.7 0.108


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.50s/epoch, Train Loss=0.1859, Test Loss A=0.1099, Test Loss B=0.2788]


0.75 0.11


Epoch 10/10: 100%|██████████| 10/10 [00:24<00:00,  2.49s/epoch, Train Loss=0.1877, Test Loss A=0.1094, Test Loss B=0.2710]


0.8 0.109


Epoch 10/10: 100%|██████████| 10/10 [00:25<00:00,  2.54s/epoch, Train Loss=0.1878, Test Loss A=0.1079, Test Loss B=0.2715]


0.85 0.108
(448, 523, 1051) (448,) (448,)
(56, 523, 1051) (56,) (56,)
(117, 523, 1051) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:29<00:00,  2.98s/epoch, Train Loss=0.1784, Test Loss A=0.1787, Test Loss B=0.4899]


0.25 0.179
----------------------------------------
run_name-----------bge/_30
best_alpha---------0.25
MSE_testset--------0.179
----------------------------------------
(392, 523, 327) (392,) (392,)
(56, 523, 327) (56,) (56,)
(117, 523, 327) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.42epoch/s, Train Loss=0.7512, Test Loss A=0.5535, Test Loss B=5.1305]


0.05 0.553


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.40epoch/s, Train Loss=0.7593, Test Loss A=0.5713, Test Loss B=5.0591]


0.1 0.571


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.45epoch/s, Train Loss=0.7508, Test Loss A=0.5763, Test Loss B=4.9367]


0.15 0.574


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.38epoch/s, Train Loss=0.7517, Test Loss A=0.5797, Test Loss B=4.9815]


0.2 0.58


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.51epoch/s, Train Loss=0.7558, Test Loss A=0.5705, Test Loss B=5.0130]


0.25 0.57


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.47epoch/s, Train Loss=0.7434, Test Loss A=0.5691, Test Loss B=5.0320]


0.3 0.567


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.51epoch/s, Train Loss=0.7559, Test Loss A=0.6273, Test Loss B=4.9976]


0.35 0.627


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.51epoch/s, Train Loss=0.7509, Test Loss A=0.5834, Test Loss B=5.0051]


0.4 0.583


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.44epoch/s, Train Loss=0.7440, Test Loss A=0.5899, Test Loss B=4.9374]


0.45 0.584


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.44epoch/s, Train Loss=0.7447, Test Loss A=0.5713, Test Loss B=4.9331]


0.5 0.566


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.48epoch/s, Train Loss=0.7478, Test Loss A=0.6112, Test Loss B=5.0277]


0.55 0.611


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.39epoch/s, Train Loss=0.7473, Test Loss A=0.5740, Test Loss B=5.0009]


0.6 0.574


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.46epoch/s, Train Loss=0.7466, Test Loss A=0.5981, Test Loss B=5.0970]


0.65 0.598


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.50epoch/s, Train Loss=0.7466, Test Loss A=0.5632, Test Loss B=5.0112]


0.7 0.558


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.51epoch/s, Train Loss=0.7418, Test Loss A=0.5561, Test Loss B=5.0844]


0.75 0.556


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.52epoch/s, Train Loss=0.7439, Test Loss A=0.5681, Test Loss B=5.0542]


0.8 0.568


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.44epoch/s, Train Loss=0.7473, Test Loss A=0.6212, Test Loss B=5.1214]


0.85 0.621
(448, 523, 327) (448,) (448,)
(56, 523, 327) (56,) (56,)
(117, 523, 327) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.29epoch/s, Train Loss=0.7322, Test Loss A=0.7625, Test Loss B=0.2902]


0.05 0.743
----------------------------------------
run_name-----------glove/
best_alpha---------0.05
MSE_testset--------0.743
----------------------------------------
(392, 523, 327) (392,) (392,)
(56, 523, 327) (56,) (56,)
(117, 523, 327) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.41epoch/s, Train Loss=0.4295, Test Loss A=0.2654, Test Loss B=2.3270]


0.05 0.263


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.41epoch/s, Train Loss=0.4343, Test Loss A=0.2988, Test Loss B=2.3691]


0.1 0.288


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.40epoch/s, Train Loss=0.4296, Test Loss A=0.2767, Test Loss B=2.4124]


0.15 0.263


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.34epoch/s, Train Loss=0.4325, Test Loss A=0.2832, Test Loss B=2.3403]


0.2 0.268


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.46epoch/s, Train Loss=0.4310, Test Loss A=0.2610, Test Loss B=2.3742]


0.25 0.259


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.51epoch/s, Train Loss=0.4307, Test Loss A=0.2825, Test Loss B=2.3706]


0.3 0.281


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.54epoch/s, Train Loss=0.4297, Test Loss A=0.2904, Test Loss B=2.3321]


0.35 0.29


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.47epoch/s, Train Loss=0.4324, Test Loss A=0.2747, Test Loss B=2.3912]


0.4 0.27


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.47epoch/s, Train Loss=0.4341, Test Loss A=0.2873, Test Loss B=2.3268]


0.45 0.277


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.48epoch/s, Train Loss=0.4315, Test Loss A=0.2898, Test Loss B=2.3626]


0.5 0.278


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.46epoch/s, Train Loss=0.4317, Test Loss A=0.2906, Test Loss B=2.3193]


0.55 0.278


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.53epoch/s, Train Loss=0.4308, Test Loss A=0.2884, Test Loss B=2.3233]


0.6 0.27


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.50epoch/s, Train Loss=0.4312, Test Loss A=0.2634, Test Loss B=2.3139]


0.65 0.263


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.51epoch/s, Train Loss=0.4320, Test Loss A=0.2612, Test Loss B=2.3238]


0.7 0.259


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.52epoch/s, Train Loss=0.4291, Test Loss A=0.2952, Test Loss B=2.3653]


0.75 0.295


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.54epoch/s, Train Loss=0.4326, Test Loss A=0.2784, Test Loss B=2.3326]


0.8 0.265


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.42epoch/s, Train Loss=0.4346, Test Loss A=0.2846, Test Loss B=2.3129]


0.85 0.277
(448, 523, 327) (448,) (448,)
(56, 523, 327) (56,) (56,)
(117, 523, 327) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:08<00:00,  1.18epoch/s, Train Loss=0.4093, Test Loss A=0.3729, Test Loss B=2.0087]


0.25 0.372
----------------------------------------
run_name-----------glove/_7
best_alpha---------0.25
MSE_testset--------0.372
----------------------------------------
(392, 523, 327) (392,) (392,)
(56, 523, 327) (56,) (56,)
(117, 523, 327) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.41epoch/s, Train Loss=0.2626, Test Loss A=0.1947, Test Loss B=1.0505]


0.05 0.184


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.41epoch/s, Train Loss=0.2623, Test Loss A=0.1837, Test Loss B=1.0084]


0.1 0.184


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.37epoch/s, Train Loss=0.2612, Test Loss A=0.1836, Test Loss B=1.0175]


0.15 0.183


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.39epoch/s, Train Loss=0.2603, Test Loss A=0.1846, Test Loss B=0.9358]


0.2 0.185


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.30epoch/s, Train Loss=0.2614, Test Loss A=0.1827, Test Loss B=1.0586]


0.25 0.183


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.27epoch/s, Train Loss=0.2603, Test Loss A=0.1839, Test Loss B=0.9390]


0.3 0.183


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.32epoch/s, Train Loss=0.2599, Test Loss A=0.1823, Test Loss B=0.9719]


0.35 0.182


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.37epoch/s, Train Loss=0.2620, Test Loss A=0.1863, Test Loss B=1.0456]


0.4 0.186


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.44epoch/s, Train Loss=0.2618, Test Loss A=0.1841, Test Loss B=0.9654]


0.45 0.184


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.38epoch/s, Train Loss=0.2585, Test Loss A=0.1836, Test Loss B=0.9775]


0.5 0.183


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.30epoch/s, Train Loss=0.2609, Test Loss A=0.1886, Test Loss B=0.9657]


0.55 0.189


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.49epoch/s, Train Loss=0.2607, Test Loss A=0.1809, Test Loss B=0.9543]


0.6 0.181


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.40epoch/s, Train Loss=0.2611, Test Loss A=0.1841, Test Loss B=0.9725]


0.65 0.184


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.50epoch/s, Train Loss=0.2592, Test Loss A=0.1819, Test Loss B=0.9409]


0.7 0.181


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.48epoch/s, Train Loss=0.2617, Test Loss A=0.1991, Test Loss B=0.8994]


0.75 0.199


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.41epoch/s, Train Loss=0.2590, Test Loss A=0.1826, Test Loss B=0.9732]


0.8 0.183


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.40epoch/s, Train Loss=0.2594, Test Loss A=0.1857, Test Loss B=0.9235]


0.85 0.186
(448, 523, 327) (448,) (448,)
(56, 523, 327) (56,) (56,)
(117, 523, 327) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.25epoch/s, Train Loss=0.2502, Test Loss A=0.2837, Test Loss B=0.5124]


0.6 0.284
----------------------------------------
run_name-----------glove/_15
best_alpha---------0.6
MSE_testset--------0.284
----------------------------------------
(392, 523, 327) (392,) (392,)
(56, 523, 327) (56,) (56,)
(117, 523, 327) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.35epoch/s, Train Loss=0.1916, Test Loss A=0.1180, Test Loss B=0.2953]


0.05 0.118


Epoch 10/10: 100%|██████████| 10/10 [00:08<00:00,  1.16epoch/s, Train Loss=0.1916, Test Loss A=0.1183, Test Loss B=0.2733]


0.1 0.118


Epoch 10/10: 100%|██████████| 10/10 [00:08<00:00,  1.23epoch/s, Train Loss=0.1926, Test Loss A=0.1153, Test Loss B=0.3131]


0.15 0.114


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.35epoch/s, Train Loss=0.1940, Test Loss A=0.1199, Test Loss B=0.2767]


0.2 0.12


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.41epoch/s, Train Loss=0.1920, Test Loss A=0.1189, Test Loss B=0.2809]


0.25 0.119


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.41epoch/s, Train Loss=0.1943, Test Loss A=0.1234, Test Loss B=0.2786]


0.3 0.123


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.37epoch/s, Train Loss=0.1921, Test Loss A=0.1180, Test Loss B=0.2782]


0.35 0.118


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.36epoch/s, Train Loss=0.1924, Test Loss A=0.1166, Test Loss B=0.2768]


0.4 0.117


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.30epoch/s, Train Loss=0.1916, Test Loss A=0.1206, Test Loss B=0.2836]


0.45 0.121


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.33epoch/s, Train Loss=0.1930, Test Loss A=0.1148, Test Loss B=0.2825]


0.5 0.115


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.38epoch/s, Train Loss=0.1926, Test Loss A=0.1163, Test Loss B=0.2864]


0.55 0.116


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.47epoch/s, Train Loss=0.1927, Test Loss A=0.1183, Test Loss B=0.3108]


0.6 0.118


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.51epoch/s, Train Loss=0.1928, Test Loss A=0.1164, Test Loss B=0.2773]


0.65 0.116


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.52epoch/s, Train Loss=0.1937, Test Loss A=0.1181, Test Loss B=0.2710]


0.7 0.118


Epoch 10/10: 100%|██████████| 10/10 [00:06<00:00,  1.51epoch/s, Train Loss=0.1941, Test Loss A=0.1159, Test Loss B=0.2825]


0.75 0.116


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.41epoch/s, Train Loss=0.1933, Test Loss A=0.1170, Test Loss B=0.2790]


0.8 0.117


Epoch 10/10: 100%|██████████| 10/10 [00:07<00:00,  1.34epoch/s, Train Loss=0.1917, Test Loss A=0.1257, Test Loss B=0.2758]


0.85 0.126
(448, 523, 327) (448,) (448,)
(56, 523, 327) (56,) (56,)
(117, 523, 327) (117,) (117,)


Epoch 10/10: 100%|██████████| 10/10 [00:08<00:00,  1.16epoch/s, Train Loss=0.1843, Test Loss A=0.2264, Test Loss B=0.4581]

0.15 0.226
----------------------------------------
run_name-----------glove/_30
best_alpha---------0.15
MSE_testset--------0.226
----------------------------------------
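
The log above follows the selection step described at the top of the notebook: one model per alpha, then the alpha with the lowest validation MSE is kept. Below is a minimal sketch of that selection. It assumes the second number on each `alpha value` line is the validation MSE for that alpha. The helper name `select_best_alpha` is hypothetical and does not come from the notebook; the numbers are copied from the `glove/_30` lines of the log.

```python
def select_best_alpha(val_mse_by_alpha):
    """Return (alpha, mse) for the entry with the lowest validation MSE.

    Ties go to the first alpha in insertion order.
    """
    best_alpha = min(val_mse_by_alpha, key=val_mse_by_alpha.get)
    return best_alpha, val_mse_by_alpha[best_alpha]


# (alpha, MSE) pairs copied from the glove/_30 log lines above.
glove_30 = {
    0.05: 0.118, 0.10: 0.118, 0.15: 0.114, 0.20: 0.120, 0.25: 0.119,
    0.30: 0.123, 0.35: 0.118, 0.40: 0.117, 0.45: 0.121, 0.50: 0.115,
    0.55: 0.116, 0.60: 0.118, 0.65: 0.116, 0.70: 0.118, 0.75: 0.116,
    0.80: 0.117, 0.85: 0.126,
}

best_alpha, best_mse = select_best_alpha(glove_30)
print(best_alpha, best_mse)
```

For these values the minimum is at alpha 0.15, which agrees with the `best_alpha` reported in the `glove/_30` summary.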





In [98]:
final_results = pd.DataFrame(final_results, columns=['data_directory', 'n_days', 'best_alpha', 'MSE_testset'])
final_results.to_csv('data/final_results.csv', index=False)
final_results

,0,1,2
0,,0.05,0.725
1,_7,0.25,0.362
2,_15,0.05,0.28
3,_30,0.45,0.19
4,Roberta2/,0.05,0.734
5,Roberta2/_7,0.4,0.343
6,Roberta2/_15,0.75,0.276
7,Roberta2/_30,0.4,0.182
8,investopedia/,0.85,0.698
9,investopedia/_7,0.75,0.324


In [None]:
stop  # undefined name on purpose: raises NameError so "Run All" halts here

In [101]:
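# Pivot to one row per data_directory, with a best_alpha and an MSE_testset column for each n_days value.
# pivot() returns MultiIndex columns of (metric, n_days); the list comprehension flattens them to "<n_days>_<metric>".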
final_results_pivoted = final_results.pivot(index='data_directory', columns='n_days', values=['best_alpha', 'MSE_testset'])
final_results_pivoted.columns = [f"{n_days}_{metric}" for metric, n_days in final_results_pivoted.columns]
final_results_pivoted.reset_index(inplace=True)
final_results_pivoted


model,alpha3,alpha_15,alpha_30,alpha_7,MSE3,MSE_15,MSE_30,MSE_7
,0.05,,,,0.725,,,
Roberta2,0.05,0.75,0.4,0.4,0.734,0.276,0.182,0.343
_15,0.05,,,,0.28,,,
_30,0.45,,,,0.19,,,
_7,0.25,,,,0.362,,,
bge,0.6,0.75,0.25,0.7,0.693,0.265,0.179,0.324
glove,0.05,0.6,0.15,0.25,0.743,0.284,0.226,0.372
investopedia,0.85,0.75,0.5,0.75,0.698,0.263,0.187,0.324


In [103]:
final_results

,run_name,alpha,MSE,model,n_days
0,,0.05,0.725,,3
1,_7,0.25,0.362,_7,3
2,_15,0.05,0.28,_15,3
3,_30,0.45,0.19,_30,3
4,Roberta2/,0.05,0.734,Roberta2,3
5,Roberta2/_7,0.4,0.343,Roberta2,_7
6,Roberta2/_15,0.75,0.276,Roberta2,_15
7,Roberta2/_30,0.4,0.182,Roberta2,_30
8,investopedia/,0.85,0.698,investopedia,3
9,investopedia/_7,0.75,0.324,investopedia,_7
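
In the table above, run names without a model prefix (`_7`, `_15`, `_30`) end up with the suffix in the `model` column and `n_days` set to 3, while the other rows keep the leading underscore in `n_days`. One way to split a run name consistently is sketched below. `parse_run_name` is a hypothetical helper, not code from the notebook, and treating a missing suffix as 3 days is an assumption read off the `n_days` column of the `Roberta2/` and `investopedia/` rows.

```python
def parse_run_name(run_name, default_days=3):
    """Split a run name such as 'Roberta2/_7' into (model, n_days).

    Assumptions (not confirmed by the notebook):
      - the model is everything before the last '/', possibly empty;
      - the suffix after it is '_<days>', and an empty suffix means default_days.
    """
    # rpartition returns ('', '', run_name) when there is no '/',
    # so a bare '_7' yields an empty model and the suffix '_7'.
    model, _, suffix = run_name.rpartition("/")
    n_days = int(suffix.lstrip("_")) if suffix else default_days
    return model, n_days


for name in ["", "_7", "Roberta2/", "Roberta2/_15", "bge/_30"]:
    print(repr(name), parse_run_name(name))
```

With this split, `n_days` is always an integer and the prefix-less runs share one (empty) model name, so a later pivot on `model` and `n_days` would not scatter them across separate rows.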


In [99]:
%load_ext tensorboard
%tensorboard --logdir runs 
