# Multilabel BERT Experiments

In this notebook we do some first multilabel experiments with BERT: we fine-tune a BERT model with a classifier on the pooled data of several prompts and evaluate the resulting classifier on held-out test data with 5-fold cross-validation.

For these experiments we use the `pytorch_transformers` package. It contains a variety of neural network architectures for transfer learning, together with pretrained models such as BERT and XLNet.

Two different BERT models are relevant for our experiments: 

- BERT-base-uncased: a relatively small BERT model that should already give reasonable results,
- BERT-large-uncased: a larger model for real state-of-the-art results.

In [1]:
PREFIXES = ["title9_but", "title9_because", "title9_so", "eatingmeat4_so", "eatingmeat4_because", "eatingmeat4_but"]  #"voting_so_automl"
BERT_MODEL = 'bert-base-uncased'
BATCH_SIZE = 16 if "base" in BERT_MODEL else 2
GRADIENT_ACCUMULATION_STEPS = 1 if "base" in BERT_MODEL else 8
MAX_SEQ_LENGTH = 100
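
The batch size and the number of gradient accumulation steps are chosen together. Assuming the training loop performs one optimizer update every `GRADIENT_ACCUMULATION_STEPS` batches, both models are updated on the same number of examples per step; the smaller batches for the large model presumably keep memory use manageable. The sketch below checks that arithmetic. The helper name `effective_batch_size` is introduced here for illustration only.

```python
def effective_batch_size(bert_model: str) -> int:
    """Number of examples that contribute to one optimizer update.

    Assumes one update per `gradient_accumulation_steps` batches.
    """
    batch_size = 16 if "base" in bert_model else 2
    gradient_accumulation_steps = 1 if "base" in bert_model else 8
    return batch_size * gradient_accumulation_steps


for name in ("bert-base-uncased", "bert-large-uncased"):
    # Both configurations give 16 examples per update.
    print(name, effective_batch_size(name))
```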

## Data

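We use the same data as for all our previous experiments. Here we load the data for every prompt in `PREFIXES` and pool it into one dataset. Each label is prefixed with the name of its prompt, so that labels from different prompts stay distinct. The split into training, development and test data is made later, during cross-validation.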

In [2]:
import sys
sys.path.append('../')

import ndjson
import glob
import numpy as np
from collections import Counter

from quillnlp.models.bert.preprocessing import preprocess, create_label_vocabulary

data = []
for prefix in PREFIXES:
    data_file = f"../data/interim/{prefix}_withprompt.ndjson"

    with open(data_file) as i:
        new_data = ndjson.load(i)
        for item in new_data:
            item["labels"] = [prefix + "_" + l for l in item["labels"] if len(l) > 1]
            item["label"] = item["labels"]

        data.extend(new_data)

label2idx = create_label_vocabulary(data)
print(label2idx)
idx2label = {v:k for k,v in label2idx.items()}
target_names = [idx2label[s] for s in range(len(idx2label))]
data_items = np.array(preprocess(data, BERT_MODEL, label2idx, MAX_SEQ_LENGTH))

labels = Counter()
for item in data:
    labels.update(item["label"])
    
lc = Counter()
for item in data_items:
    for j, v in enumerate(item.label_ids):
        if v == 1:
            lc.update([j])
        
print(labels)
print(lc)

I0406 12:43:14.593649 139997413660480 file_utils.py:41] PyTorch version 1.2.0+cu92 available.
I0406 12:43:15.604099 139997413660480 file_utils.py:57] TensorFlow version 2.1.0 available.


{'title9_but_Affected_More_Men': 0, 'title9_but_Cuts_to_mens_sports': 1, 'title9_but_Cuts_to_WSD': 2, 'title9_but_Cuts_to_WSD,': 3, 'title9_but_EMU_compliant': 4, 'title9_but_Miscellaneous': 5, 'title9_but_Limit_woms_opps': 6, 'title9_but_Court_Decision': 7, 'title9_but_Judge_reinstate': 8, 'title9_but_Teams_reinstated': 9, 'title9_but_Women_sued': 10, 'title9_but_Kept_other_ones': 11, 'title9_but_Did_not_cut_t_and_s': 12, 'title9_but_EMU_viol_TitleIX': 13, 'title9_but_EMU_cut_less_men': 14, 'title9_but_Fem_athletes_mad': 15, 'title9_because_Fin_Trouble': 16, 'title9_because_Lack_of_funds': 17, 'title9_because_To_Save_Money': 18, 'title9_because_Miscellaneous': 19, 'title9_because_Overspending': 20, 'title9_so_Ath_claim_TitleIX': 21, 'title9_so_Miscellaneous': 22, 'title9_so_Ath_fought': 23, 'title9_so_Teams_got_reinstated': 24, 'title9_so_Ath_sued': 25, 'title9_so_Court_EMU_viol': 26, 'title9_so_Judge_said_reinstate': 27, 'title9_so_Reduce_costs': 28, 'title9_so_Ath_sued_TitleIX': 29,

Counter({'eatingmeat4_but_Feedback_13': 358, 'eatingmeat4_so_Feedback_9': 309, 'eatingmeat4_because_Feedback_1': 282, 'title9_because_Overspending': 276, 'eatingmeat4_so_Feedback_10': 241, 'title9_because_To_Save_Money': 224, 'title9_so_Reduce_costs': 209, 'eatingmeat4_because_Feedback_5': 205, 'title9_so_Ath_sued_TitleIX': 179, 'title9_because_Fin_Trouble': 172, 'eatingmeat4_so_Feedback_2': 148, 'title9_but_Miscellaneous': 124, 'title9_but_Teams_reinstated': 116, 'title9_so_Miscellaneous': 113, 'title9_but_Women_sued': 107, 'title9_so_Ath_sued': 102, 'eatingmeat4_but_Feedback_9': 101, 'title9_but_Cuts_to_WSD,': 92, 'eatingmeat4_but_Feedback_7': 80, 'title9_because_Lack_of_funds': 57, 'title9_because_Miscellaneous': 57, 'title9_but_EMU_viol_TitleIX': 56, 'eatingmeat4_but_Feedback_3': 56, 'title9_but_Court_Decision': 49, 'title9_but_Cuts_to_mens_sports': 48, 'eatingmeat4_so_Feedback_8': 48, 'title9_but_Judge_reinstate': 45, 'eatingmeat4_because_Feedback_7': 43, 'title9_but_Affected_More
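
In the multilabel setting an item can carry several labels, so its target is a vector with one 0/1 entry per label rather than a single class index. The second counter in the cell above tallies the entries of `label_ids` that equal 1. A minimal sketch of such an encoding on a toy vocabulary follows. The helper `to_multi_hot` is hypothetical, and the real `preprocess` function may work differently.

```python
from typing import Dict, List


def to_multi_hot(labels: List[str], label2idx: Dict[str, int]) -> List[int]:
    """Encode a list of label names as a 0/1 vector indexed by label id."""
    vector = [0] * len(label2idx)
    for label in labels:
        vector[label2idx[label]] = 1
    return vector


# Toy vocabulary in the same "<prompt>_<label>" style as the labels above.
toy_label2idx = {
    "title9_so_Ath_sued": 0,
    "title9_so_Reduce_costs": 1,
    "title9_so_Miscellaneous": 2,
}

print(to_multi_hot(["title9_so_Ath_sued", "title9_so_Miscellaneous"], toy_label2idx))
# prints [1, 0, 1]
```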

## Model

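We load the pretrained model and put it on a GPU if one is available. During fine-tuning the model is in "training" mode, so that its internal parameters are updated on the basis of our data; for testing we reload the saved model and switch it to evaluation mode with `model.eval()`. We evaluate with 5-fold cross-validation, and first define a helper that prints the evaluation scores.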

In [3]:
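def evaluate_output(all_correct, all_predicted):
    """Print micro-averaged P/R/F, exact-match accuracy (A) and at-least-one accuracy (AL1).

    Both arguments are sequences of 0/1 label vectors that support
    element-wise comparison (for example NumPy arrays).
    """
    correct = 0        # items whose predicted vector equals the gold vector
    at_least_one = 0   # items with at least one correctly predicted label
    fp, fn, tp, tn = 0, 0, 0, 0
    for c, p in zip(all_correct, all_predicted):
        if sum(c == p) == len(c):
            correct += 1

        # Count the item once as soon as one gold label is also predicted.
        for ci, pi in zip(c, p):
            if pi == 1 and ci == 1:
                at_least_one += 1
                break

        # Update the label-level confusion counts.
        for ci, pi in zip(c, p):
            if pi == 1 and ci == 1:
                tp += 1
            elif pi == 1 and ci == 0:
                fp += 1
            elif pi == 0 and ci == 1:
                fn += 1
            else:
                tn += 1

    precision = tp/(tp+fp) if tp+fp > 0 else 0
    recall = tp/(tp+fn) if tp+fn > 0 else 0
    fscore = 2*precision*recall/(precision+recall) if (precision+recall) > 0 else 0
    print("Data size:", len(all_predicted))
    print("P:", tp, "/", tp+fp, "=", precision)
    print("R:", tp, "/", tp+fn, "=", recall)
    print("F:", fscore)
    print("A:", correct/len(all_correct))
    print("AL1:", at_least_one/len(all_correct))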
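
To make the printed scores concrete, the following sketch computes the same quantities on three toy items: precision, recall and F-score over all individual label decisions (micro-averaged), the share of items whose whole label vector is correct (A), and the share of items with at least one correctly predicted label (AL1). It uses plain lists instead of arrays, and the function name `multilabel_scores` is introduced here for illustration only.

```python
def multilabel_scores(gold, predicted):
    """Return micro P/R/F, exact-match accuracy and at-least-one accuracy."""
    tp = fp = fn = 0
    exact = at_least_one = 0
    for g, p in zip(gold, predicted):
        pairs = list(zip(g, p))
        tp += sum(1 for gi, pi in pairs if gi == 1 and pi == 1)
        fp += sum(1 for gi, pi in pairs if gi == 0 and pi == 1)
        fn += sum(1 for gi, pi in pairs if gi == 1 and pi == 0)
        exact += all(gi == pi for gi, pi in pairs)
        at_least_one += any(gi == 1 and pi == 1 for gi, pi in pairs)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    fscore = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
    n = len(gold)
    return {
        "precision": precision,
        "recall": recall,
        "fscore": fscore,
        "accuracy": exact / n if n else 0.0,
        "at_least_one": at_least_one / n if n else 0.0,
    }


toy_gold = [[1, 0, 1], [0, 1, 0], [1, 0, 0]]
toy_pred = [[1, 0, 1], [0, 1, 1], [0, 0, 0]]
scores = multilabel_scores(toy_gold, toy_pred)
print(scores)
```

Here the first item is fully correct, the second has one spurious label, and the third misses its only label. That gives 3 true positives, 1 false positive and 1 false negative, so precision, recall and F-score are all 0.75, while A is 1/3 and AL1 is 2/3.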

In [4]:
import torch

from quillnlp.models.bert.train import train, evaluate
from quillnlp.models.bert.models import get_multilabel_bert_classifier

from quillnlp.models.bert.preprocessing import get_data_loader
from sklearn.model_selection import KFold

import random
random.shuffle(data_items)

kf = KFold(n_splits=5, shuffle=True, random_state=1)
all_correct, all_predicted = [], []
all_test_data = []
for train_idx, test_idx in kf.split(data_items):

    train_and_dev_data, test_data = data_items[train_idx], data_items[test_idx]    
    cutoff = int(len(train_and_dev_data)/6*5)
    train_data = train_and_dev_data[:cutoff]
    dev_data = train_and_dev_data[cutoff:]
    
    print("Train size:", len(train_data))        
    print("Dev size:", len(dev_data))        
            
    train_dataloader = get_data_loader(train_data, BATCH_SIZE)
    dev_dataloader = get_data_loader(dev_data, BATCH_SIZE, shuffle=False)
    test_dataloader = get_data_loader(test_data, BATCH_SIZE, shuffle=False)
    
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = get_multilabel_bert_classifier(BERT_MODEL, len(label2idx), device=device)
    output_model_file = train(model, train_dataloader, dev_dataloader, 
                              BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, 
                              device, num_train_epochs=100)
    
    print("Loading model from", output_model_file)
    # Reload the saved model on the CPU for evaluation.
    device = "cpu"

    model = get_multilabel_bert_classifier(BERT_MODEL, len(label2idx), model_file=output_model_file, device=device)
    model.eval()
    
    _, _, test_correct, test_predicted = evaluate(model, test_dataloader, device)
    evaluate_output(test_correct, test_predicted)
    all_correct.extend(test_correct)
    all_predicted.extend(test_predicted)
    all_test_data.extend(test_data)

Train size: 2956
Dev size: 592
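
These sizes follow from the split logic in the loop: within each fold, five sixths of the non-test items are used for training and the remaining sixth for development. The sketch below reproduces the arithmetic. The figure of 3548 non-test items is inferred from the two printed sizes (2956 + 592), and the batch counts assume a data loader that keeps the last, incomplete batch.

```python
import math

BATCH_SIZE = 16


def split_sizes(n_train_and_dev: int):
    """Return (train size, dev size) for the 5/6 vs. 1/6 split used in the loop."""
    cutoff = int(n_train_and_dev / 6 * 5)
    return cutoff, n_train_and_dev - cutoff


train_size, dev_size = split_sizes(2956 + 592)
print(train_size, dev_size)  # prints 2956 592

# Number of batches per epoch for training and for evaluation on the dev data.
print(math.ceil(train_size / BATCH_SIZE), math.ceil(dev_size / BATCH_SIZE))  # prints 185 37
```

The batch counts agree with the 185 training and 37 evaluation iterations per epoch in the training output.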


HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: []
Dev loss: 0.49955145088402003


Epoch:   1%|          | 1/100 [00:40<1:07:38, 41.00s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003]
Dev loss: 0.3425517323854807


Epoch:   2%|▏         | 2/100 [01:22<1:07:10, 41.12s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807]
Dev loss: 0.22473320445498904


Epoch:   3%|▎         | 3/100 [02:03<1:06:41, 41.25s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904]
Dev loss: 0.15214085337278005


Epoch:   4%|▍         | 4/100 [02:45<1:06:10, 41.36s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005]
Dev loss: 0.11478856729494559


Epoch:   5%|▌         | 5/100 [03:27<1:05:38, 41.46s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559]
Dev loss: 0.09607695989512108


Epoch:   6%|▌         | 6/100 [04:09<1:05:05, 41.55s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108]
Dev loss: 0.08612215377994485


Epoch:   7%|▋         | 7/100 [04:50<1:04:30, 41.62s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485]
Dev loss: 0.08051841706037521


Epoch:   8%|▊         | 8/100 [05:32<1:03:52, 41.66s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521]
Dev loss: 0.07700466082708256


Epoch:   9%|▉         | 9/100 [06:14<1:03:14, 41.70s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256]
Dev loss: 0.07435202256247804


Epoch:  10%|█         | 10/100 [06:56<1:02:35, 41.72s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804]
Dev loss: 0.07212021685129888


Epoch:  11%|█         | 11/100 [07:37<1:01:54, 41.74s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888]
Dev loss: 0.06969393407170837


Epoch:  12%|█▏        | 12/100 [08:19<1:01:14, 41.75s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837]
Dev loss: 0.06702591448619559


Epoch:  13%|█▎        | 13/100 [09:01<1:00:33, 41.76s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559]
Dev loss: 0.06440388139437984


Epoch:  14%|█▍        | 14/100 [09:43<59:51, 41.77s/it]  

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984]
Dev loss: 0.06164820240558805


Epoch:  15%|█▌        | 15/100 [10:25<59:10, 41.77s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805]
Dev loss: 0.05886032855188524


Epoch:  16%|█▌        | 16/100 [11:06<58:29, 41.78s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524]
Dev loss: 0.056033102342405834


Epoch:  17%|█▋        | 17/100 [11:48<57:47, 41.78s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834]
Dev loss: 0.05334019399172551


Epoch:  18%|█▊        | 18/100 [12:30<57:06, 41.78s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551]
Dev loss: 0.050841758480748615


Epoch:  19%|█▉        | 19/100 [13:12<56:24, 41.79s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615]
Dev loss: 0.048417703525440114


Epoch:  20%|██        | 20/100 [13:53<55:42, 41.79s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114]
Dev loss: 0.04591028241289628


Epoch:  21%|██        | 21/100 [14:35<55:01, 41.79s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628]
Dev loss: 0.04385586445396011


Epoch:  22%|██▏       | 22/100 [15:17<54:19, 41.78s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011]
Dev loss: 0.04180658900657216


Epoch:  23%|██▎       | 23/100 [15:59<53:37, 41.78s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216]
Dev loss: 0.03987455725468494


Epoch:  24%|██▍       | 24/100 [16:41<52:55, 41.78s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494]
Dev loss: 0.03838468353087838


Epoch:  25%|██▌       | 25/100 [17:22<52:13, 41.79s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838]
Dev loss: 0.03683853496772212


Epoch:  26%|██▌       | 26/100 [18:04<51:32, 41.79s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212]
Dev loss: 0.03543759127323692


Epoch:  27%|██▋       | 27/100 [18:46<50:50, 41.79s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692]
Dev loss: 0.033991139432465706


Epoch:  28%|██▊       | 28/100 [19:28<50:08, 41.79s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706]
Dev loss: 0.03303859126124833


Epoch:  29%|██▉       | 29/100 [20:10<49:26, 41.79s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833]
Dev loss: 0.03167754693611248


Epoch:  30%|███       | 30/100 [20:51<48:45, 41.79s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248]
Dev loss: 0.030582019002050966


Epoch:  31%|███       | 31/100 [21:33<48:03, 41.78s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966]
Dev loss: 0.02985577375904934


Epoch:  32%|███▏      | 32/100 [22:15<47:21, 41.78s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934]
Dev loss: 0.029126298689358943


Epoch:  33%|███▎      | 33/100 [22:57<46:39, 41.78s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943]
Dev loss: 0.028383144253009075


Epoch:  34%|███▍      | 34/100 [23:38<45:57, 41.78s/it]

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075]
Dev loss: 0.027378388907055597


Epoch:  35%|███▌      | 35/100 [24:20<45:15, 41.78s/it]

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597]
Dev loss: 0.026652019722638903


Epoch:  36%|███▌      | 36/100 [25:02<44:34, 41.78s/it]

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903]
Dev loss: 0.026173862452442583


Epoch:  37%|███▋      | 37/100 [25:44<43:52, 41.78s/it]

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583]
Dev loss: 0.02563548906127343


Epoch:  38%|███▊      | 38/100 [26:26<43:10, 41.78s/it]

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343]
Dev loss: 0.025040321228270594


Epoch:  39%|███▉      | 39/100 [27:07<42:28, 41.78s/it]

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343, 0.025040321228270594]
Dev loss: 0.02458925751616826


Epoch:  40%|████      | 40/100 [27:49<41:46, 41.78s/it]

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343, 0.025040321228270594, 0.02458925751616826]
Dev loss: 0.02448661673209957


Epoch:  41%|████      | 41/100 [28:31<41:05, 41.78s/it]

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343, 0.025040321228270594, 0.02458925751616826, 0.02448661673209957]
Dev loss: 0.024136731476598495


Epoch:  42%|████▏     | 42/100 [29:13<40:23, 41.78s/it]

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343, 0.025040321228270594, 0.02458925751616826, 0.02448661673209957, 0.024136731476598495]
Dev loss: 0.023546071144173276


Epoch:  43%|████▎     | 43/100 [29:54<39:39, 41.75s/it]

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343, 0.025040321228270594, 0.02458925751616826, 0.02448661673209957, 0.024136731476598495, 0.023546071144173276]
Dev loss: 0.023421233469570004


Epoch:  44%|████▍     | 44/100 [30:36<38:58, 41.76s/it]

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343, 0.025040321228270594, 0.02458925751616826, 0.02448661673209957, 0.024136731476598495, 0.023546071144173276, 0.023421233469570004]
Dev loss: 0.022959853346283372


Epoch:  45%|████▌     | 45/100 [31:18<38:17, 41.77s/it]

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343, 0.025040321228270594, 0.02458925751616826, 0.02448661673209957, 0.024136731476598495, 0.023546071144173276, 0.023421233469570004, 0.022959853346283372]
Dev loss: 0.022601973064042425

Epoch:  46%|████▌     | 46/100 [32:00<37:35, 41.77s/it]

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343, 0.025040321228270594, 0.02458925751616826, 0.02448661673209957, 0.024136731476598495, 0.023546071144173276, 0.023421233469570004, 0.022959853346283372, 0.022601973064042425]
Dev

Epoch:  47%|████▋     | 47/100 [32:42<36:54, 41.78s/it]

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343, 0.025040321228270594, 0.02458925751616826, 0.02448661673209957, 0.024136731476598495, 0.023546071144173276, 0.023421233469570004, 0.022959853346283372, 0.022601973064042425, 0.0

Epoch:  48%|████▊     | 48/100 [33:23<36:12, 41.78s/it]

Epoch:  49%|████▉     | 49/100 [34:05<35:23, 41.64s/it]


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343, 0.025040321228270594, 0.02458925751616826, 0.02448661673209957, 0.024136731476598495, 0.023546071144173276, 0.023421233469570004, 0.022959853346283372, 0.022601973064042425, 0.0

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343, 0.025040321228270594, 0.02458925751616826, 0.02448661673209957, 0.024136731476598495, 0.023546071144173276, 0.023421233469570004, 0.022959853346283372, 0.022601973064042425, 0.0

Epoch:  50%|█████     | 50/100 [34:46<34:44, 41.68s/it]

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343, 0.025040321228270594, 0.02458925751616826, 0.02448661673209957, 0.024136731476598495, 0.023546071144173276, 0.023421233469570004, 0.022959853346283372, 0.022601973064042425, 0.0

Epoch:  51%|█████     | 51/100 [35:28<34:03, 41.71s/it]

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343, 0.025040321228270594, 0.02458925751616826, 0.02448661673209957, 0.024136731476598495, 0.023546071144173276, 0.023421233469570004, 0.022959853346283372, 0.022601973064042425, 0.0

Epoch:  52%|█████▏    | 52/100 [36:10<33:23, 41.73s/it]

Epoch:  53%|█████▎    | 53/100 [36:51<32:35, 41.60s/it]


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343, 0.025040321228270594, 0.02458925751616826, 0.02448661673209957, 0.024136731476598495, 0.023546071144173276, 0.023421233469570004, 0.022959853346283372, 0.022601973064042425, 0.0

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343, 0.025040321228270594, 0.02458925751616826, 0.02448661673209957, 0.024136731476598495, 0.023546071144173276, 0.023421233469570004, 0.022959853346283372, 0.022601973064042425, 0.0

Epoch:  54%|█████▍    | 54/100 [37:33<31:56, 41.66s/it]

Epoch:  55%|█████▌    | 55/100 [38:14<31:09, 41.55s/it]


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343, 0.025040321228270594, 0.02458925751616826, 0.02448661673209957, 0.024136731476598495, 0.023546071144173276, 0.023421233469570004, 0.022959853346283372, 0.022601973064042425, 0.0

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343, 0.025040321228270594, 0.02458925751616826, 0.02448661673209957, 0.024136731476598495, 0.023546071144173276, 0.023421233469570004, 0.022959853346283372, 0.022601973064042425, 0.0

Epoch:  56%|█████▌    | 56/100 [38:56<30:31, 41.62s/it]

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343, 0.025040321228270594, 0.02458925751616826, 0.02448661673209957, 0.024136731476598495, 0.023546071144173276, 0.023421233469570004, 0.022959853346283372, 0.022601973064042425, 0.0

Epoch:  57%|█████▋    | 57/100 [39:38<29:51, 41.67s/it]

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343, 0.025040321228270594, 0.02458925751616826, 0.02448661673209957, 0.024136731476598495, 0.023546071144173276, 0.023421233469570004, 0.022959853346283372, 0.022601973064042425, 0.0

Epoch:  58%|█████▊    | 58/100 [40:20<29:11, 41.70s/it]

Epoch:  59%|█████▉    | 59/100 [41:01<28:24, 41.58s/it]


Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343, 0.025040321228270594, 0.02458925751616826, 0.02448661673209957, 0.024136731476598495, 0.023546071144173276, 0.023421233469570004, 0.022959853346283372, 0.022601973064042425, 0.0

Loss history: [0.49955145088402003, 0.3425517323854807, 0.22473320445498904, 0.15214085337278005, 0.11478856729494559, 0.09607695989512108, 0.08612215377994485, 0.08051841706037521, 0.07700466082708256, 0.07435202256247804, 0.07212021685129888, 0.06969393407170837, 0.06702591448619559, 0.06440388139437984, 0.06164820240558805, 0.05886032855188524, 0.056033102342405834, 0.05334019399172551, 0.050841758480748615, 0.048417703525440114, 0.04591028241289628, 0.04385586445396011, 0.04180658900657216, 0.03987455725468494, 0.03838468353087838, 0.03683853496772212, 0.03543759127323692, 0.033991139432465706, 0.03303859126124833, 0.03167754693611248, 0.030582019002050966, 0.02985577375904934, 0.029126298689358943, 0.028383144253009075, 0.027378388907055597, 0.026652019722638903, 0.026173862452442583, 0.02563548906127343, 0.025040321228270594, 0.02458925751616826, 0.02448661673209957, 0.024136731476598495, 0.023546071144173276, 0.023421233469570004, 0.022959853346283372, 0.022601973064042425, 0.0

Epoch:  60%|██████    | 60/100 [41:43<27:45, 41.64s/it]




Epoch:  61%|██████    | 61/100 [42:25<27:05, 41.69s/it]




Epoch:  62%|██████▏   | 62/100 [43:06<26:25, 41.72s/it]


Epoch:  63%|██████▎   | 63/100 [43:48<25:38, 41.59s/it]




Epoch:  64%|██████▍   | 64/100 [44:29<24:54, 41.50s/it]




Epoch:  65%|██████▌   | 65/100 [45:10<24:10, 41.44s/it]




Epoch:  66%|██████▌   | 66/100 [45:52<23:27, 41.40s/it]







I0406 13:29:58.580018 139997413660480 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0406 13:29:58.581720 139997413660480 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embed



Data size: 887
P: 750 / 862 = 0.8700696055684455
R: 750 / 948 = 0.7911392405063291
F: 0.8287292817679558
A: 0.7677564825253664
AL1: 0.8060879368658399
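
The printed F value is consistent with the harmonic mean of the P and R lines above. The following is a minimal sketch of that arithmetic, not the notebook's own evaluation code: the function name `precision_recall_f1` is hypothetical, and it assumes that 750 is the number of correctly predicted labels, 862 the number of predicted labels, and 948 the number of gold labels.

```python
def precision_recall_f1(correct, predicted, gold):
    """Compute precision, recall and F1 from raw label counts.

    Assumption: `correct` counts labels that are both predicted and gold,
    `predicted` counts all predicted labels, `gold` counts all gold labels.
    """
    precision = correct / predicted if predicted else 0.0
    recall = correct / gold if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts taken from the P and R lines printed above.
p, r, f = precision_recall_f1(750, 862, 948)
print(f"P: {p:.4f}  R: {r:.4f}  F: {f:.4f}")
# P: 0.8701  R: 0.7911  F: 0.8287
```

Equivalently, F = 2 * 750 / (862 + 948) = 1500 / 1810, which gives the same value.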
Train size: 2956
Dev size: 592


I0406 13:30:23.468563 139997413660480 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0406 13:30:23.469714 139997413660480 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd



Loss history: []
Dev loss: 0.534661236647013


Epoch:   1%|          | 1/100 [00:41<1:08:31, 41.53s/it]



Loss history: [0.534661236647013]
Dev loss: 0.34460563836870967


Epoch:   2%|▏         | 2/100 [01:23<1:07:55, 41.58s/it]



Loss history: [0.534661236647013, 0.34460563836870967]
Dev loss: 0.22278829039754094


Epoch:   3%|▎         | 3/100 [02:04<1:07:18, 41.63s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094]
Dev loss: 0.1519625009717168


Epoch:   4%|▍         | 4/100 [02:46<1:06:40, 41.67s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168]
Dev loss: 0.11520464194787515


Epoch:   5%|▌         | 5/100 [03:28<1:06:01, 41.70s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515]
Dev loss: 0.09650587552302592


Epoch:   6%|▌         | 6/100 [04:10<1:05:21, 41.71s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592]
Dev loss: 0.08659176508317122


Epoch:   7%|▋         | 7/100 [04:52<1:04:40, 41.73s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122]
Dev loss: 0.08103243543489559


Epoch:   8%|▊         | 8/100 [05:33<1:03:59, 41.73s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559]
Dev loss: 0.07771772810736217


Epoch:   9%|▉         | 9/100 [06:15<1:03:17, 41.73s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217]
Dev loss: 0.07553859437639648


Epoch:  10%|█         | 10/100 [06:57<1:02:35, 41.73s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648]
Dev loss: 0.07381320623932658


Epoch:  11%|█         | 11/100 [07:38<1:01:54, 41.74s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658]
Dev loss: 0.07206979615462793


Epoch:  12%|█▏        | 12/100 [08:20<1:01:13, 41.74s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793]
Dev loss: 0.07032747768067024


Epoch:  13%|█▎        | 13/100 [09:02<1:00:31, 41.74s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024]
Dev loss: 0.06833026246041865


Epoch:  14%|█▍        | 14/100 [09:44<59:50, 41.75s/it]  



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865]
Dev loss: 0.06610027680525908


Epoch:  15%|█▌        | 15/100 [10:25<59:08, 41.75s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908]
Dev loss: 0.06310851398754765


Epoch:  16%|█▌        | 16/100 [11:07<58:27, 41.75s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765]
Dev loss: 0.06034158341385223


Epoch:  17%|█▋        | 17/100 [11:49<57:45, 41.75s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223]
Dev loss: 0.057755845624047356


Epoch:  18%|█▊        | 18/100 [12:31<57:03, 41.75s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356]
Dev loss: 0.05513654216318517


Epoch:  19%|█▉        | 19/100 [13:12<56:21, 41.75s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517]
Dev loss: 0.05268114986451897


Epoch:  20%|██        | 20/100 [13:54<55:39, 41.75s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897]
Dev loss: 0.050617762010645224


Epoch:  21%|██        | 21/100 [14:36<54:58, 41.75s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224]
Dev loss: 0.0483449045065287


Epoch:  22%|██▏       | 22/100 [15:18<54:16, 41.76s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287]
Dev loss: 0.04636664185169581


Epoch:  23%|██▎       | 23/100 [16:00<53:35, 41.76s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581]
Dev loss: 0.04477491330456089


Epoch:  24%|██▍       | 24/100 [16:41<52:53, 41.76s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089]
Dev loss: 0.042460377856686306


Epoch:  25%|██▌       | 25/100 [17:23<52:12, 41.76s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306]
Dev loss: 0.04079744244950849


Epoch:  26%|██▌       | 26/100 [18:05<51:30, 41.76s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849]
Dev loss: 0.039173439195429954


Epoch:  27%|██▋       | 27/100 [18:47<50:48, 41.76s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849, 0.039173439195429954]
Dev loss: 0.03782599761679366


Epoch:  28%|██▊       | 28/100 [19:28<50:06, 41.76s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849, 0.039173439195429954, 0.03782599761679366]
Dev loss: 0.036300643674425176


Epoch:  29%|██▉       | 29/100 [20:10<49:25, 41.76s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849, 0.039173439195429954, 0.03782599761679366, 0.036300643674425176]
Dev loss: 0.03519904865203677


Epoch:  30%|███       | 30/100 [20:52<48:43, 41.76s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849, 0.039173439195429954, 0.03782599761679366, 0.036300643674425176, 0.03519904865203677]
Dev loss: 0.033840059579626935


Epoch:  31%|███       | 31/100 [21:34<48:01, 41.76s/it]



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849, 0.039173439195429954, 0.03782599761679366, 0.036300643674425176, 0.03519904865203677, 0.033840059579626935]
Dev loss: 0.032753174671450176


Epoch:  32%|███▏      | 32/100 [22:15<47:19, 41.75s/it]


Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849, 0.039173439195429954, 0.03782599761679366, 0.036300643674425176, 0.03519904865203677, 0.033840059579626935, 0.032753174671450176]
Dev loss: 0.03172567597514874


Epoch:  33%|███▎      | 33/100 [22:57<46:37, 41.75s/it]


Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849, 0.039173439195429954, 0.03782599761679366, 0.036300643674425176, 0.03519904865203677, 0.033840059579626935, 0.032753174671450176, 0.03172567597514874]
Dev loss: 0.031172650287280213


Epoch:  34%|███▍      | 34/100 [23:39<45:55, 41.75s/it]


Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849, 0.039173439195429954, 0.03782599761679366, 0.036300643674425176, 0.03519904865203677, 0.033840059579626935, 0.032753174671450176, 0.03172567597514874, 0.031172650287280213]
Dev loss: 0.029896920546889305


Epoch:  35%|███▌      | 35/100 [24:21<45:13, 41.75s/it]


Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849, 0.039173439195429954, 0.03782599761679366, 0.036300643674425176, 0.03519904865203677, 0.033840059579626935, 0.032753174671450176, 0.03172567597514874, 0.031172650287280213, 0.029896920546889305]
Dev loss: 0.02939180249499308


Epoch:  36%|███▌      | 36/100 [25:02<44:31, 41.75s/it]


Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849, 0.039173439195429954, 0.03782599761679366, 0.036300643674425176, 0.03519904865203677, 0.033840059579626935, 0.032753174671450176, 0.03172567597514874, 0.031172650287280213, 0.029896920546889305, 0.02939180249499308]
Dev loss: 0.02836253629947031


Epoch:  37%|███▋      | 37/100 [25:44<43:50, 41.75s/it]


Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849, 0.039173439195429954, 0.03782599761679366, 0.036300643674425176, 0.03519904865203677, 0.033840059579626935, 0.032753174671450176, 0.03172567597514874, 0.031172650287280213, 0.029896920546889305, 0.02939180249499308, 0.02836253629947031]
Dev loss: 0.027540754778562365


Epoch:  38%|███▊      | 38/100 [26:26<43:08, 41.75s/it]


Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849, 0.039173439195429954, 0.03782599761679366, 0.036300643674425176, 0.03519904865203677, 0.033840059579626935, 0.032753174671450176, 0.03172567597514874, 0.031172650287280213, 0.029896920546889305, 0.02939180249499308, 0.02836253629947031, 0.027540754778562365]
Dev loss: 0.02749181677260109


Epoch:  39%|███▉      | 39/100 [27:08<42:26, 41.74s/it]


Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849, 0.039173439195429954, 0.03782599761679366, 0.036300643674425176, 0.03519904865203677, 0.033840059579626935, 0.032753174671450176, 0.03172567597514874, 0.031172650287280213, 0.029896920546889305, 0.02939180249499308, 0.02836253629947031, 0.027540754778562365, 0.02749181677260109]
Dev loss: 0.026679486866939713


Epoch:  40%|████      | 40/100 [27:49<41:44, 41.75s/it]


Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849, 0.039173439195429954, 0.03782599761679366, 0.036300643674425176, 0.03519904865203677, 0.033840059579626935, 0.032753174671450176, 0.03172567597514874, 0.031172650287280213, 0.029896920546889305, 0.02939180249499308, 0.02836253629947031, 0.027540754778562365, 0.02749181677260109, 0.026679486866939713]
Dev loss: 0.02606840690950284


Epoch:  41%|████      | 41/100 [28:31<41:03, 41.75s/it]


Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849, 0.039173439195429954, 0.03782599761679366, 0.036300643674425176, 0.03519904865203677, 0.033840059579626935, 0.032753174671450176, 0.03172567597514874, 0.031172650287280213, 0.029896920546889305, 0.02939180249499308, 0.02836253629947031, 0.027540754778562365, 0.02749181677260109, 0.026679486866939713, 0.02606840690950284]
Dev loss: 0.02570021107188753


Epoch:  42%|████▏     | 42/100 [29:13<40:21, 41.75s/it]


Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849, 0.039173439195429954, 0.03782599761679366, 0.036300643674425176, 0.03519904865203677, 0.033840059579626935, 0.032753174671450176, 0.03172567597514874, 0.031172650287280213, 0.029896920546889305, 0.02939180249499308, 0.02836253629947031, 0.027540754778562365, 0.02749181677260109, 0.026679486866939713, 0.02606840690950284, 0.02570021107188753]
Dev loss: 0.02524303352913341


Epoch:  43%|████▎     | 43/100 [29:55<39:39, 41.75s/it]

Epoch:  44%|████▍     | 44/100 [30:36<38:50, 41.61s/it]


Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849, 0.039173439195429954, 0.03782599761679366, 0.036300643674425176, 0.03519904865203677, 0.033840059579626935, 0.032753174671450176, 0.03172567597514874, 0.031172650287280213, 0.029896920546889305, 0.02939180249499308, 0.02836253629947031, 0.027540754778562365, 0.02749181677260109, 0.026679486866939713, 0.02606840690950284, 0.02570021107188753, 0.02524303352913341]
Dev loss: 0.025792617224962323



Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849, 0.039173439195429954, 0.03782599761679366, 0.036300643674425176, 0.03519904865203677, 0.033840059579626935, 0.032753174671450176, 0.03172567597514874, 0.031172650287280213, 0.029896920546889305, 0.02939180249499308, 0.02836253629947031, 0.027540754778562365, 0.02749181677260109, 0.026679486866939713, 0.02606840690950284, 0.02570021107188753, 0.02524303352913341, 0.025792617224962323]
Dev loss: 0.02466163459561161


Epoch:  45%|████▌     | 45/100 [31:18<38:10, 41.65s/it]


Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849, 0.039173439195429954, 0.03782599761679366, 0.036300643674425176, 0.03519904865203677, 0.033840059579626935, 0.032753174671450176, 0.03172567597514874, 0.031172650287280213, 0.029896920546889305, 0.02939180249499308, 0.02836253629947031, 0.027540754778562365, 0.02749181677260109, 0.026679486866939713, 0.02606840690950284, 0.02570021107188753, 0.02524303352913341, 0.025792617224962323, 0.02466163459561161]
Dev loss: 0.02434137763103118


Epoch:  46%|████▌     | 46/100 [31:59<37:30, 41.68s/it]

Epoch:  47%|████▋     | 47/100 [32:41<36:43, 41.57s/it]


Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849, 0.039173439195429954, 0.03782599761679366, 0.036300643674425176, 0.03519904865203677, 0.033840059579626935, 0.032753174671450176, 0.03172567597514874, 0.031172650287280213, 0.029896920546889305, 0.02939180249499308, 0.02836253629947031, 0.027540754778562365, 0.02749181677260109, 0.026679486866939713, 0.02606840690950284, 0.02570021107188753, 0.02524303352913341, 0.025792617224962323, 0.02466163459561161, 0.02434137763103118]
Dev loss: 


Loss history: [0.534661236647013, 0.34460563836870967, 0.22278829039754094, 0.1519625009717168, 0.11520464194787515, 0.09650587552302592, 0.08659176508317122, 0.08103243543489559, 0.07771772810736217, 0.07553859437639648, 0.07381320623932658, 0.07206979615462793, 0.07032747768067024, 0.06833026246041865, 0.06610027680525908, 0.06310851398754765, 0.06034158341385223, 0.057755845624047356, 0.05513654216318517, 0.05268114986451897, 0.050617762010645224, 0.0483449045065287, 0.04636664185169581, 0.04477491330456089, 0.042460377856686306, 0.04079744244950849, 0.039173439195429954, 0.03782599761679366, 0.036300643674425176, 0.03519904865203677, 0.033840059579626935, 0.032753174671450176, 0.03172567597514874, 0.031172650287280213, 0.029896920546889305, 0.02939180249499308, 0.02836253629947031, 0.027540754778562365, 0.02749181677260109, 0.026679486866939713, 0.02606840690950284, 0.02570021107188753, 0.02524303352913341, 0.025792617224962323, 0.02466163459561161, 0.02434137763103118, 0.02445359

Epoch:  48%|████▊     | 48/100 [33:22<36:04, 41.62s/it]

Epoch:  49%|████▉     | 49/100 [34:04<35:24, 41.65s/it]

Epoch:  50%|█████     | 50/100 [34:45<34:37, 41.54s/it]


Epoch:  51%|█████     | 51/100 [35:27<33:58, 41.60s/it]

Epoch:  52%|█████▏    | 52/100 [36:08<33:12, 41.51s/it]


Epoch:  53%|█████▎    | 53/100 [36:50<32:27, 41.45s/it]


Epoch:  54%|█████▍    | 54/100 [37:31<31:44, 41.40s/it]


Epoch:  55%|█████▌    | 55/100 [38:13<31:07, 41.50s/it]

Epoch:  56%|█████▌    | 56/100 [38:55<30:29, 41.57s/it]

Epoch:  57%|█████▋    | 57/100 [39:36<29:43, 41.49s/it]


Epoch:  58%|█████▊    | 58/100 [40:18<29:05, 41.57s/it]

Epoch:  59%|█████▉    | 59/100 [40:59<28:20, 41.49s/it]


Epoch:  60%|██████    | 60/100 [41:41<27:42, 41.57s/it]


Epoch:  61%|██████    | 61/100 [42:22<26:57, 41.49s/it]



Epoch:  62%|██████▏   | 62/100 [43:03<26:14, 41.43s/it]



Epoch:  63%|██████▎   | 63/100 [43:45<25:31, 41.39s/it]



Epoch:  64%|██████▍   | 64/100 [44:26<24:49, 41.37s/it]



Epoch:  65%|██████▌   | 65/100 [45:08<24:11, 41.49s/it]


Epoch:  66%|██████▌   | 66/100 [45:49<23:28, 41.43s/it]



Epoch:  67%|██████▋   | 67/100 [46:30<22:46, 41.40s/it]



Epoch:  68%|██████▊   | 68/100 [47:12<22:08, 41.51s/it]


Epoch:  69%|██████▉   | 69/100 [47:53<21:24, 41.45s/it]



Epoch:  70%|███████   | 70/100 [48:35<20:46, 41.54s/it]


Epoch:  71%|███████   | 71/100 [49:16<20:02, 41.47s/it]



Epoch:  72%|███████▏  | 72/100 [49:58<19:19, 41.42s/it]



Epoch:  73%|███████▎  | 73/100 [50:39<18:37, 41.38s/it]



Epoch:  74%|███████▍  | 74/100 [51:20<17:55, 41.36s/it]



I0406 14:22:28.555644 139997413660480 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0406 14:22:28.556930 139997413660480 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd



Data size: 887
P: 730 / 841 = 0.8680142687277052
R: 730 / 949 = 0.7692307692307693
F: 0.8156424581005587
A: 0.7474633596392334
AL1: 0.7857948139797069
Train size: 2956
Dev size: 592
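
The P, R and F values printed above can be reproduced from the reported counts. The sketch below assumes that 730 is the number of correctly predicted labels, 841 the total number of predicted labels, and 949 the total number of gold labels, which would make these micro-averaged scores. The helper name `micro_prf` is hypothetical and is not part of the notebook's code. A and AL1 are not recomputed because their definitions do not appear in this output.

```python
def micro_prf(correct, predicted, gold):
    """Micro-averaged precision, recall and F1 from raw label counts.

    Assumption: `correct` counts predicted labels that are also gold labels,
    `predicted` counts all predicted labels, `gold` counts all gold labels.
    """
    precision = correct / predicted if predicted else 0.0
    recall = correct / gold if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Counts taken from the evaluation output above.
precision, recall, f1 = micro_prf(correct=730, predicted=841, gold=949)
print(f"P: {precision:.4f}  R: {recall:.4f}  F: {f1:.4f}")
# prints: P: 0.8680  R: 0.7692  F: 0.8156
```

The gap between precision and recall (841 predicted labels against 949 gold labels) suggests the classifier under-predicts labels at its current decision threshold.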


I0406 14:22:52.980218 139997413660480 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685



Loss history: []
Dev loss: 0.49947572640470556



Epoch:   1%|          | 1/100 [00:41<1:08:34, 41.56s/it]


Loss history: [0.49947572640470556]
Dev loss: 0.3372103499399649



Epoch:   2%|▏         | 2/100 [01:23<1:07:58, 41.61s/it]


Loss history: [0.49947572640470556, 0.3372103499399649]
Dev loss: 0.22774101149391485



Epoch:   3%|▎         | 3/100 [02:05<1:07:21, 41.66s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485]
Dev loss: 0.16122778082216108



Epoch:   4%|▍         | 4/100 [02:46<1:06:42, 41.70s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108]
Dev loss: 0.12348401465931454



Epoch:   5%|▌         | 5/100 [03:28<1:06:02, 41.71s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454]
Dev loss: 0.10242036387727067



Epoch:   6%|▌         | 6/100 [04:10<1:05:23, 41.74s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067]
Dev loss: 0.0903673786166552



Epoch:   7%|▋         | 7/100 [04:52<1:04:42, 41.75s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552]
Dev loss: 0.08330176388089722



Epoch:   8%|▊         | 8/100 [05:33<1:04:01, 41.75s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722]
Dev loss: 0.07875724640246984



Epoch:   9%|▉         | 9/100 [06:15<1:03:19, 41.76s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984]
Dev loss: 0.0756732926175401



Epoch:  10%|█         | 10/100 [06:57<1:02:37, 41.76s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401]
Dev loss: 0.07336872171711277



Epoch:  11%|█         | 11/100 [07:39<1:01:56, 41.76s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277]
Dev loss: 0.07125654534713642



Epoch:  12%|█▏        | 12/100 [08:20<1:01:14, 41.76s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642]
Dev loss: 0.06898764581293673



Epoch:  13%|█▎        | 13/100 [09:02<1:00:33, 41.76s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673]
Dev loss: 0.06644145731587668



Epoch:  14%|█▍        | 14/100 [09:44<59:50, 41.75s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668]
Dev loss: 0.06381961320703095



Epoch:  15%|█▌        | 15/100 [10:26<59:08, 41.75s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095]
Dev loss: 0.06114578408163947



Epoch:  16%|█▌        | 16/100 [11:07<58:27, 41.75s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947]
Dev loss: 0.05852316219258953



Epoch:  17%|█▋        | 17/100 [11:49<57:44, 41.75s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953]
Dev loss: 0.05584628356469644



Epoch:  18%|█▊        | 18/100 [12:31<57:03, 41.75s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644]
Dev loss: 0.05331681044520559



Epoch:  19%|█▉        | 19/100 [13:13<56:21, 41.75s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559]
Dev loss: 0.05090313233636521



Epoch:  20%|██        | 20/100 [13:54<55:39, 41.75s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521]
Dev loss: 0.04857169192385029



Epoch:  21%|██        | 21/100 [14:36<54:57, 41.74s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029]
Dev loss: 0.04606484551284764



Epoch:  22%|██▏       | 22/100 [15:18<54:15, 41.74s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764]
Dev loss: 0.04416739850028141



Epoch:  23%|██▎       | 23/100 [16:00<53:33, 41.74s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141]
Dev loss: 0.04221395312531574



Epoch:  24%|██▍       | 24/100 [16:41<52:51, 41.73s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574]
Dev loss: 0.04051934368908405



Epoch:  25%|██▌       | 25/100 [17:23<52:09, 41.73s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405]
Dev loss: 0.039077337227157644



Epoch:  26%|██▌       | 26/100 [18:05<51:28, 41.73s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644]
Dev loss: 0.037376915280883376



Epoch:  27%|██▋       | 27/100 [18:47<50:46, 41.73s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376]
Dev loss: 0.036090085433947074



Epoch:  28%|██▊       | 28/100 [19:28<50:04, 41.73s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074]
Dev loss: 0.0343968545061511



Epoch:  29%|██▉       | 29/100 [20:10<49:22, 41.73s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074, 0.0343968545061511]
Dev loss: 0.033403698203934205



Epoch:  30%|███       | 30/100 [20:52<48:41, 41.73s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074, 0.0343968545061511, 0.033403698203934205]
Dev loss: 0.03223067859338748



Epoch:  31%|███       | 31/100 [21:33<47:59, 41.73s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074, 0.0343968545061511, 0.033403698203934205, 0.03223067859338748]
Dev loss: 0.031530785802248364



Epoch:  32%|███▏      | 32/100 [22:15<47:18, 41.74s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074, 0.0343968545061511, 0.033403698203934205, 0.03223067859338748, 0.031530785802248364]
Dev loss: 0.030672638702231483



Epoch:  33%|███▎      | 33/100 [22:57<46:36, 41.74s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074, 0.0343968545061511, 0.033403698203934205, 0.03223067859338748, 0.031530785802248364, 0.030672638702231483]
Dev loss: 0.029531257176721417



Epoch:  34%|███▍      | 34/100 [23:39<45:54, 41.74s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074, 0.0343968545061511, 0.033403698203934205, 0.03223067859338748, 0.031530785802248364, 0.030672638702231483, 0.029531257176721417]
Dev loss: 0.028758454212063068



Epoch:  35%|███▌      | 35/100 [24:20<45:13, 41.74s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074, 0.0343968545061511, 0.033403698203934205, 0.03223067859338748, 0.031530785802248364, 0.030672638702231483, 0.029531257176721417, 0.028758454212063068]
Dev loss: 0.02852991787162987



Epoch:  36%|███▌      | 36/100 [25:02<44:31, 41.74s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074, 0.0343968545061511, 0.033403698203934205, 0.03223067859338748, 0.031530785802248364, 0.030672638702231483, 0.029531257176721417, 0.028758454212063068, 0.02852991787162987]
Dev loss: 0.027867276875956637



Epoch:  37%|███▋      | 37/100 [25:44<43:49, 41.74s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074, 0.0343968545061511, 0.033403698203934205, 0.03223067859338748, 0.031530785802248364, 0.030672638702231483, 0.029531257176721417, 0.028758454212063068, 0.02852991787162987, 0.027867276875956637]
Dev loss: 0.027036773217086855



Epoch:  38%|███▊      | 38/100 [26:26<43:07, 41.74s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074, 0.0343968545061511, 0.033403698203934205, 0.03223067859338748, 0.031530785802248364, 0.030672638702231483, 0.029531257176721417, 0.028758454212063068, 0.02852991787162987, 0.027867276875956637, 0.027036773217086855]
Dev loss: 0.026158170623553766



Epoch:  39%|███▉      | 39/100 [27:07<42:26, 41.74s/it]


Epoch:  40%|████      | 40/100 [27:49<41:36, 41.61s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074, 0.0343968545061511, 0.033403698203934205, 0.03223067859338748, 0.031530785802248364, 0.030672638702231483, 0.029531257176721417, 0.028758454212063068, 0.02852991787162987, 0.027867276875956637, 0.027036773217086855, 0.026158170623553766]
Dev loss: 0.02624164556933416


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074, 0.0343968545061511, 0.033403698203934205, 0.03223067859338748, 0.031530785802248364, 0.030672638702231483, 0.029531257176721417, 0.028758454212063068, 0.02852991787162987, 0.027867276875956637, 0.027036773217086855, 0.026158170623553766, 0.02624164556933416]
Dev loss: 0.025601309220734482



Epoch:  41%|████      | 41/100 [28:30<40:57, 41.64s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074, 0.0343968545061511, 0.033403698203934205, 0.03223067859338748, 0.031530785802248364, 0.030672638702231483, 0.029531257176721417, 0.028758454212063068, 0.02852991787162987, 0.027867276875956637, 0.027036773217086855, 0.026158170623553766, 0.02624164556933416, 0.025601309220734482]
Dev loss: 0.025468723823291226



Epoch:  42%|████▏     | 42/100 [29:12<40:16, 41.67s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074, 0.0343968545061511, 0.033403698203934205, 0.03223067859338748, 0.031530785802248364, 0.030672638702231483, 0.029531257176721417, 0.028758454212063068, 0.02852991787162987, 0.027867276875956637, 0.027036773217086855, 0.026158170623553766, 0.02624164556933416, 0.025601309220734482, 0.025468723823291226]
Dev loss: 0.025086392019246076



Epoch:  43%|████▎     | 43/100 [29:54<39:36, 41.69s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074, 0.0343968545061511, 0.033403698203934205, 0.03223067859338748, 0.031530785802248364, 0.030672638702231483, 0.029531257176721417, 0.028758454212063068, 0.02852991787162987, 0.027867276875956637, 0.027036773217086855, 0.026158170623553766, 0.02624164556933416, 0.025601309220734482, 0.025468723823291226, 0.025086392019246076]
Dev loss: 0.024634593946708215



Epoch:  44%|████▍     | 44/100 [30:36<38:55, 41.71s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074, 0.0343968545061511, 0.033403698203934205, 0.03223067859338748, 0.031530785802248364, 0.030672638702231483, 0.029531257176721417, 0.028758454212063068, 0.02852991787162987, 0.027867276875956637, 0.027036773217086855, 0.026158170623553766, 0.02624164556933416, 0.025601309220734482, 0.025468723823291226, 0.025086392019246076, 0.024634593946708215]
Dev loss: 0.024366610658329887



Epoch:  45%|████▌     | 45/100 [31:17<38:14, 41.72s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074, 0.0343968545061511, 0.033403698203934205, 0.03223067859338748, 0.031530785802248364, 0.030672638702231483, 0.029531257176721417, 0.028758454212063068, 0.02852991787162987, 0.027867276875956637, 0.027036773217086855, 0.026158170623553766, 0.02624164556933416, 0.025601309220734482, 0.025468723823291226, 0.025086392019246076, 0.024634593946708215, 0.024366610658329887]
Dev loss: 0.024117613488153833


Epoch:  46%|████▌     | 46/100 [31:59<37:33, 41.73s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074, 0.0343968545061511, 0.033403698203934205, 0.03223067859338748, 0.031530785802248364, 0.030672638702231483, 0.029531257176721417, 0.028758454212063068, 0.02852991787162987, 0.027867276875956637, 0.027036773217086855, 0.026158170623553766, 0.02624164556933416, 0.025601309220734482, 0.025468723823291226, 0.025086392019246076, 0.024634593946708215, 0.024366610658329887, 0.024117613488153833]
Dev l


Epoch:  47%|████▋     | 47/100 [32:41<36:51, 41.72s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074, 0.0343968545061511, 0.033403698203934205, 0.03223067859338748, 0.031530785802248364, 0.030672638702231483, 0.029531257176721417, 0.028758454212063068, 0.02852991787162987, 0.027867276875956637, 0.027036773217086855, 0.026158170623553766, 0.02624164556933416, 0.025601309220734482, 0.025468723823291226, 0.025086392019246076, 0.024634593946708215, 0.024366610658329887, 0.024117613488153833, 0.023


Epoch:  48%|████▊     | 48/100 [33:23<36:09, 41.73s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074, 0.0343968545061511, 0.033403698203934205, 0.03223067859338748, 0.031530785802248364, 0.030672638702231483, 0.029531257176721417, 0.028758454212063068, 0.02852991787162987, 0.027867276875956637, 0.027036773217086855, 0.026158170623553766, 0.02624164556933416, 0.025601309220734482, 0.025468723823291226, 0.025086392019246076, 0.024634593946708215, 0.024366610658329887, 0.024117613488153833, 0.023


Epoch:  49%|████▉     | 49/100 [34:04<35:28, 41.73s/it]


Loss history: [0.49947572640470556, 0.3372103499399649, 0.22774101149391485, 0.16122778082216108, 0.12348401465931454, 0.10242036387727067, 0.0903673786166552, 0.08330176388089722, 0.07875724640246984, 0.0756732926175401, 0.07336872171711277, 0.07125654534713642, 0.06898764581293673, 0.06644145731587668, 0.06381961320703095, 0.06114578408163947, 0.05852316219258953, 0.05584628356469644, 0.05331681044520559, 0.05090313233636521, 0.04857169192385029, 0.04606484551284764, 0.04416739850028141, 0.04221395312531574, 0.04051934368908405, 0.039077337227157644, 0.037376915280883376, 0.036090085433947074, 0.0343968545061511, 0.033403698203934205, 0.03223067859338748, 0.031530785802248364, 0.030672638702231483, 0.029531257176721417, 0.028758454212063068, 0.02852991787162987, 0.027867276875956637, 0.027036773217086855, 0.026158170623553766, 0.02624164556933416, 0.025601309220734482, 0.025468723823291226, 0.025086392019246076, 0.024634593946708215, 0.024366610658329887, 0.024117613488153833, 0.023


Epoch:  50%|█████     | 50/100 [34:46<34:46, 41.74s/it]
Epoch:  51%|█████     | 51/100 [35:28<34:05, 41.74s/it]
Epoch:  52%|█████▏    | 52/100 [36:10<33:23, 41.74s/it]
Epoch:  53%|█████▎    | 53/100 [36:51<32:41, 41.73s/it]
Epoch:  54%|█████▍    | 54/100 [37:33<31:59, 41.73s/it]
Epoch:  55%|█████▌    | 55/100 [38:14<31:12, 41.60s/it]
Epoch:  56%|█████▌    | 56/100 [38:56<30:32, 41.64s/it]
Epoch:  57%|█████▋    | 57/100 [39:38<29:52, 41.67s/it]
Epoch:  58%|█████▊    | 58/100 [40:20<29:10, 41.69s/it]
Epoch:  59%|█████▉    | 59/100 [41:01<28:29, 41.70s/it]
Epoch:  60%|██████    | 60/100 [41:43<27:43, 41.58s/it]
Epoch:  61%|██████    | 61/100 [42:24<26:58, 41.50s/it]
Epoch:  62%|██████▏   | 62/100 [43:06<26:19, 41.57s/it]
Epoch:  63%|██████▎   | 63/100 [43:47<25:40, 41.62s/it]
Epoch:  64%|██████▍   | 64/100 [44:29<24:54, 41.53s/it]
Epoch:  65%|██████▌   | 65/100 [45:10<24:15, 41.59s/it]
Epoch:  66%|██████▌   | 66/100 [45:52<23:31, 41.51s/it]
Epoch:  67%|██████▋   | 67/100 [46:33<22:47, 41.44s/it]
Epoch:  68%|██████▊   | 68/100 [47:14<22:04, 41.40s/it]
Epoch:  69%|██████▉   | 69/100 [47:56<21:26, 41.50s/it]
Epoch:  70%|███████   | 70/100 [48:37<20:43, 41.44s/it]
Epoch:  71%|███████   | 71/100 [49:19<20:00, 41.39s/it]
Epoch:  72%|███████▏  | 72/100 [50:00<19:18, 41.36s/it]
Epoch:  73%|███████▎  | 73/100 [50:42<18:39, 41.48s/it]
Epoch:  74%|███████▍  | 74/100 [51:23<17:56, 41.42s/it]
Epoch:  75%|███████▌  | 75/100 [52:04<17:14, 41.39s/it]


Epoch:  76%|███████▌  | 76/100 [52:46<16:32, 41.36s/it]


Epoch:  77%|███████▋  | 77/100 [53:27<15:50, 41.35s/it]

I0406 15:17:04.536095 139997413660480 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0406 15:17:04.537767 139997413660480 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd



Data size: 887
P: 780 / 888 = 0.8783783783783784
R: 780 / 966 = 0.8074534161490683
F: 0.8414239482200646
A: 0.7835400225479143
AL1: 0.8275084554678692
Train size: 2956
Dev size: 592
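
The `P`, `R` and `F` lines above are consistent with micro-averaged precision, recall and F1 computed from raw counts: 780 correct label predictions out of 888 predicted labels and 966 gold labels. The sketch below reproduces that arithmetic. The function name `micro_prf` is hypothetical and is not taken from the `quillnlp` code, and the reading of the three counts is an assumption based only on the printed fractions.

```python
def micro_prf(correct, predicted, gold):
    """Precision, recall and F1 from raw counts (hypothetical helper).

    correct:   number of predicted labels that are also gold labels
    predicted: total number of predicted labels
    gold:      total number of gold labels
    """
    precision = correct / predicted if predicted else 0.0
    recall = correct / gold if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts copied from the evaluation output above.
p, r, f = micro_prf(correct=780, predicted=888, gold=966)
print(f"P: {p:.4f}  R: {r:.4f}  F: {f:.4f}")
```

Since the counts cover individual labels rather than whole items, the denominators (888 and 966) can differ from the reported data size.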


I0406 15:17:29.592978 139997413660480 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0406 15:17:29.594574 139997413660480 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd

Loss history: []
Dev loss: 0.5149836169706808




Epoch:   1%|          | 1/100 [00:41<1:08:30, 41.52s/it]


Loss history: [0.5149836169706808]
Dev loss: 0.3414199223389497




Epoch:   2%|▏         | 2/100 [01:23<1:07:55, 41.58s/it]


Loss history: [0.5149836169706808, 0.3414199223389497]
Dev loss: 0.22151915648499051




Epoch:   3%|▎         | 3/100 [02:05<1:07:18, 41.64s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051]
Dev loss: 0.1496102097066673




Epoch:   4%|▍         | 4/100 [02:46<1:06:40, 41.67s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673]
Dev loss: 0.11323500323939968




Epoch:   5%|▌         | 5/100 [03:28<1:06:01, 41.70s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968]
Dev loss: 0.09499084526622617




Epoch:   6%|▌         | 6/100 [04:10<1:05:21, 41.72s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617]
Dev loss: 0.08551248887906203




Epoch:   7%|▋         | 7/100 [04:52<1:04:41, 41.73s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203]
Dev loss: 0.08025593431414785




Epoch:   8%|▊         | 8/100 [05:33<1:04:00, 41.75s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785]
Dev loss: 0.07719727786811623




Epoch:   9%|▉         | 9/100 [06:15<1:03:19, 41.75s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623]
Dev loss: 0.07536589716737335




Epoch:  10%|█         | 10/100 [06:57<1:02:37, 41.75s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335]
Dev loss: 0.07407827232335065




Epoch:  11%|█         | 11/100 [07:39<1:01:55, 41.75s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065]
Dev loss: 0.07308865882254936




Epoch:  12%|█▏        | 12/100 [08:20<1:01:13, 41.75s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936]
Dev loss: 0.07176894956343882




Epoch:  13%|█▎        | 13/100 [09:02<1:00:32, 41.75s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882]
Dev loss: 0.07013430285292703




Epoch:  14%|█▍        | 14/100 [09:44<59:50, 41.75s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703]
Dev loss: 0.06827247988533329




Epoch:  15%|█▌        | 15/100 [10:26<59:08, 41.75s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329]
Dev loss: 0.06618624593357782




Epoch:  16%|█▌        | 16/100 [11:07<58:26, 41.75s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782]
Dev loss: 0.06393589310952134




Epoch:  17%|█▋        | 17/100 [11:49<57:44, 41.74s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134]
Dev loss: 0.06196139360199104




Epoch:  18%|█▊        | 18/100 [12:31<57:03, 41.75s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104]
Dev loss: 0.05915939536046337




Epoch:  19%|█▉        | 19/100 [13:13<56:21, 41.75s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337]
Dev loss: 0.05673292853139542




Epoch:  20%|██        | 20/100 [13:54<55:39, 41.75s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542]
Dev loss: 0.05428198020200472




Epoch:  21%|██        | 21/100 [14:36<54:58, 41.75s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472]
Dev loss: 0.0521855657366482




Epoch:  22%|██▏       | 22/100 [15:18<54:16, 41.75s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482]
Dev loss: 0.049792519392999446




Epoch:  23%|██▎       | 23/100 [16:00<53:34, 41.75s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446]
Dev loss: 0.047605392699306075




Epoch:  24%|██▍       | 24/100 [16:41<52:52, 41.75s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075]
Dev loss: 0.04559452257849075




Epoch:  25%|██▌       | 25/100 [17:23<52:11, 41.75s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075]
Dev loss: 0.04354281582542368




Epoch:  26%|██▌       | 26/100 [18:05<51:29, 41.75s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368]
Dev loss: 0.0417332415645187




Epoch:  27%|██▋       | 27/100 [18:47<50:47, 41.74s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187]
Dev loss: 0.040254368169887644




Epoch:  28%|██▊       | 28/100 [19:28<50:05, 41.74s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644]
Dev loss: 0.03846274286105826




Epoch:  29%|██▉       | 29/100 [20:10<49:23, 41.74s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826]
Dev loss: 0.03701911589785202




Epoch:  30%|███       | 30/100 [20:52<48:41, 41.74s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202]
Dev loss: 0.035473480568947016




Epoch:  31%|███       | 31/100 [21:34<48:00, 41.74s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016]
Dev loss: 0.034058156549125106




Epoch:  32%|███▏      | 32/100 [22:15<47:18, 41.74s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106]
Dev loss: 0.03301481031686873




Epoch:  33%|███▎      | 33/100 [22:57<46:36, 41.74s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873]
Dev loss: 0.03208547583907037




Epoch:  34%|███▍      | 34/100 [23:39<45:54, 41.73s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037]
Dev loss: 0.03082778138687482




Epoch:  35%|███▌      | 35/100 [24:20<45:12, 41.73s/it]


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482]
Dev loss: 0.03010273845614614




Epoch:  36%|███▌      | 36/100 [25:02<44:30, 41.73s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614]
Dev loss: 0.029236309935112257




Epoch:  37%|███▋      | 37/100 [25:44<43:49, 41.74s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257]
Dev loss: 0.028609317903583113




Epoch:  38%|███▊      | 38/100 [26:26<43:07, 41.74s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113]
Dev loss: 0.02767453226890113




Epoch:  39%|███▉      | 39/100 [27:07<42:25, 41.74s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113]
Dev loss: 0.02697559156631296




Epoch:  40%|████      | 40/100 [27:49<41:44, 41.73s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296]
Dev loss: 0.02621117654583744




Epoch:  41%|████      | 41/100 [28:31<41:02, 41.74s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744]
Dev loss: 0.025313521534003115




Epoch:  42%|████▏     | 42/100 [29:13<40:21, 41.75s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115]
Dev loss: 0.024899339504741335




Epoch:  43%|████▎     | 43/100 [29:54<39:39, 41.75s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335]
Dev loss: 0.024404904276535317




Epoch:  44%|████▍     | 44/100 [30:36<38:56, 41.72s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317]
Dev loss: 0.023967495616021996




Epoch:  45%|████▌     | 45/100 [31:18<38:14, 41.73s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996]
Dev loss: 0.023492925271794602




Epoch:  46%|████▌     | 46/100 [31:59<37:32, 41.72s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602]
Dev loss: 0



Epoch:  47%|████▋     | 47/100 [32:41<36:51, 41.73s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067



Epoch:  48%|████▊     | 48/100 [33:23<36:09, 41.73s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067



Epoch:  49%|████▉     | 49/100 [34:05<35:28, 41.73s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067



Epoch:  50%|█████     | 50/100 [34:46<34:46, 41.73s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067



Epoch:  51%|█████     | 51/100 [35:28<34:05, 41.74s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067



Epoch:  52%|█████▏    | 52/100 [36:10<33:23, 41.74s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067



Epoch:  53%|█████▎    | 53/100 [36:52<32:41, 41.74s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067



Epoch:  54%|█████▍    | 54/100 [37:33<32:00, 41.74s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067



Epoch:  55%|█████▌    | 55/100 [38:15<31:18, 41.74s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067



Epoch:  56%|█████▌    | 56/100 [38:57<30:36, 41.74s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067



Epoch:  57%|█████▋    | 57/100 [39:39<29:54, 41.74s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067



Epoch:  58%|█████▊    | 58/100 [40:20<29:13, 41.75s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067



Epoch:  59%|█████▉    | 59/100 [41:02<28:31, 41.74s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067



Epoch:  60%|██████    | 60/100 [41:44<27:49, 41.74s/it]

Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067



Epoch:  61%|██████    | 61/100 [42:26<27:07, 41.74s/it]
Epoch:  62%|██████▏   | 62/100 [43:07<26:26, 41.74s/it]
Epoch:  63%|██████▎   | 63/100 [43:49<25:44, 41.74s/it]
Epoch:  64%|██████▍   | 64/100 [44:30<24:57, 41.61s/it]
Epoch:  65%|██████▌   | 65/100 [45:12<24:17, 41.65s/it]
Epoch:  66%|██████▌   | 66/100 [45:54<23:36, 41.68s/it]
Epoch:  67%|██████▋   | 67/100 [46:35<22:51, 41.56s/it]
Epoch:  68%|██████▊   | 68/100 [47:16<22:07, 41.48s/it]
Epoch:  69%|██████▉   | 69/100 [47:58<21:28, 41.56s/it]
Epoch:  70%|███████   | 70/100 [48:40<20:48, 41.61s/it]
Epoch:  71%|███████   | 71/100 [49:22<20:07, 41.65s/it]
Epoch:  72%|███████▏  | 72/100 [50:03<19:26, 41.67s/it]
Epoch:  73%|███████▎  | 73/100 [50:45<18:41, 41.55s/it]
Epoch:  74%|███████▍  | 74/100 [51:26<17:58, 41.48s/it]
Epoch:  75%|███████▌  | 75/100 [52:08<17:18, 41.55s/it]
Epoch:  76%|███████▌  | 76/100 [52:49<16:35, 41.48s/it]
Epoch:  77%|███████▋  | 77/100 [53:31<15:55, 41.55s/it]
Epoch:  78%|███████▊  | 78/100 [54:12<15:15, 41.61s/it]
Epoch:  79%|███████▉  | 79/100 [54:54<14:31, 41.51s/it]
Epoch:  80%|████████  | 80/100 [55:35<13:48, 41.45s/it]
Epoch:  81%|████████  | 81/100 [56:16<13:06, 41.40s/it]
Epoch:  82%|████████▏ | 82/100 [56:58<12:27, 41.50s/it]
Epoch:  83%|████████▎ | 83/100 [57:39<11:44, 41.44s/it]
Epoch:  84%|████████▍ | 84/100 [58:21<11:02, 41.39s/it]
Epoch:  85%|████████▌ | 85/100 [59:02<10:22, 41.49s/it]
Epoch:  86%|████████▌ | 86/100 [59:44<09:41, 41.56s/it][A[A

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…



Epoch:  87%|████████▋ | 87/100 [1:00:25<08:59, 41.48s/it][A[A


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…



Epoch:  88%|████████▊ | 88/100 [1:01:07<08:17, 41.42s/it][A[A


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…



Epoch:  89%|████████▉ | 89/100 [1:01:48<07:35, 41.38s/it][A[A


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…



Epoch:  90%|█████████ | 90/100 [1:02:29<06:53, 41.36s/it][A[A


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067



Epoch:  91%|█████████ | 91/100 [1:03:11<06:13, 41.47s/it][A[A

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…



Epoch:  92%|█████████▏| 92/100 [1:03:52<05:31, 41.42s/it][A[A


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…



Epoch:  93%|█████████▎| 93/100 [1:04:34<04:49, 41.38s/it][A[A


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…



Epoch:  94%|█████████▍| 94/100 [1:05:15<04:08, 41.35s/it][A[A


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…



Epoch:  95%|█████████▌| 95/100 [1:05:56<03:26, 41.34s/it][A[A


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.5149836169706808, 0.3414199223389497, 0.22151915648499051, 0.1496102097066673, 0.11323500323939968, 0.09499084526622617, 0.08551248887906203, 0.08025593431414785, 0.07719727786811623, 0.07536589716737335, 0.07407827232335065, 0.07308865882254936, 0.07176894956343882, 0.07013430285292703, 0.06827247988533329, 0.06618624593357782, 0.06393589310952134, 0.06196139360199104, 0.05915939536046337, 0.05673292853139542, 0.05428198020200472, 0.0521855657366482, 0.049792519392999446, 0.047605392699306075, 0.04559452257849075, 0.04354281582542368, 0.0417332415645187, 0.040254368169887644, 0.03846274286105826, 0.03701911589785202, 0.035473480568947016, 0.034058156549125106, 0.03301481031686873, 0.03208547583907037, 0.03082778138687482, 0.03010273845614614, 0.029236309935112257, 0.028609317903583113, 0.02767453226890113, 0.02697559156631296, 0.02621117654583744, 0.025313521534003115, 0.024899339504741335, 0.024404904276535317, 0.023967495616021996, 0.023492925271794602, 0.023232067

I0406 16:24:10.353752 139997413660480 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0406 16:24:10.355189 139997413660480 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd



Data size: 887
P: 733 / 848 = 0.8643867924528302
R: 733 / 951 = 0.7707676130389064
F: 0.8148971650917176
A: 0.762119503945885
AL1: 0.7925591882750845
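
The P, R and F lines above are consistent with micro-averaged precision, recall and F1 computed from label counts: 733 correct labels out of 848 predicted and 951 gold. A minimal sketch of that arithmetic follows. The function name `micro_prf` and its argument names are hypothetical and are not taken from the notebook's own evaluation code.

```python
def micro_prf(correct, predicted, gold):
    """Return (precision, recall, F1) from raw label counts.

    Hypothetical helper for illustration only; the notebook's actual
    evaluation code may be organised differently.
    """
    precision = correct / predicted if predicted else 0.0
    recall = correct / gold if gold else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Counts copied from the P and R lines of the log above.
p, r, f = micro_prf(correct=733, predicted=848, gold=951)
print(f"P: {p:.4f}  R: {r:.4f}  F: {f:.4f}")
```

With these counts the sketch reproduces the logged values to the printed precision, which suggests (but does not prove) that the reported F is the harmonic mean of the reported P and R.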
Train size: 2956
Dev size: 592


I0406 16:24:35.200961 139997413660480 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0406 16:24:35.202509 139997413660480 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd



Loss history: []
Dev loss: 0.5213218041368433





Epoch:   1%|          | 1/100 [00:41<1:08:29, 41.51s/it]


Loss history: [0.5213218041368433]
Dev loss: 0.3512672838327047





Epoch:   2%|▏         | 2/100 [01:23<1:07:53, 41.57s/it]


Loss history: [0.5213218041368433, 0.3512672838327047]
Dev loss: 0.22659491606660792





Epoch:   3%|▎         | 3/100 [02:04<1:07:17, 41.62s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792]
Dev loss: 0.15170020992691452





Epoch:   4%|▍         | 4/100 [02:46<1:06:39, 41.66s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452]
Dev loss: 0.11356990703859844





Epoch:   5%|▌         | 5/100 [03:28<1:06:01, 41.69s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844]
Dev loss: 0.09469581637028102





Epoch:   6%|▌         | 6/100 [04:10<1:05:21, 41.72s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102]
Dev loss: 0.08492339019839829





Epoch:   7%|▋         | 7/100 [04:52<1:04:40, 41.73s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829]
Dev loss: 0.07952852808945887





Epoch:   8%|▊         | 8/100 [05:33<1:03:59, 41.74s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887]
Dev loss: 0.07629744426624195





Epoch:   9%|▉         | 9/100 [06:15<1:03:18, 41.74s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195]
Dev loss: 0.07419083468817375





Epoch:  10%|█         | 10/100 [06:57<1:02:36, 41.74s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375]
Dev loss: 0.0725745868843955





Epoch:  11%|█         | 11/100 [07:39<1:01:55, 41.74s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955]
Dev loss: 0.07107165012810682





Epoch:  12%|█▏        | 12/100 [08:20<1:01:13, 41.74s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682]
Dev loss: 0.06928105591922193





Epoch:  13%|█▎        | 13/100 [09:02<1:00:31, 41.74s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193]
Dev loss: 0.06736977509147413





Epoch:  14%|█▍        | 14/100 [09:44<59:49, 41.74s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413]
Dev loss: 0.06515877099858748





Epoch:  15%|█▌        | 15/100 [10:25<59:07, 41.74s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748]
Dev loss: 0.06267403186978521





Epoch:  16%|█▌        | 16/100 [11:07<58:26, 41.74s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521]
Dev loss: 0.05988547669069187





Epoch:  17%|█▋        | 17/100 [11:49<57:44, 41.74s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187]
Dev loss: 0.05735226267495671





Epoch:  18%|█▊        | 18/100 [12:31<57:02, 41.74s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671]
Dev loss: 0.0545267711418706





Epoch:  19%|█▉        | 19/100 [13:12<56:21, 41.74s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706]
Dev loss: 0.05180915575977918





Epoch:  20%|██        | 20/100 [13:54<55:39, 41.74s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918]
Dev loss: 0.049151497213421644





Epoch:  21%|██        | 21/100 [14:36<54:57, 41.74s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644]
Dev loss: 0.04689121790028907





Epoch:  22%|██▏       | 22/100 [15:18<54:15, 41.74s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907]
Dev loss: 0.04473428246942726





Epoch:  23%|██▎       | 23/100 [15:59<53:33, 41.74s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726]
Dev loss: 0.04255887372670947





Epoch:  24%|██▍       | 24/100 [16:41<52:51, 41.74s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947]
Dev loss: 0.04059563582209316





Epoch:  25%|██▌       | 25/100 [17:23<52:10, 41.74s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316]
Dev loss: 0.038926143048180116





Epoch:  26%|██▌       | 26/100 [18:05<51:28, 41.74s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116]
Dev loss: 0.037120581680052986





Epoch:  27%|██▋       | 27/100 [18:46<50:46, 41.73s/it]


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986]
Dev loss: 0.035901194539021804
Epoch:  28%|██▊       | 28/100 [19:28<50:04, 41.73s/it]

Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804]
Dev loss: 0.03453905779767681
Epoch:  29%|██▉       | 29/100 [20:10<49:22, 41.73s/it]

Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681]
Dev loss: 0.03356715910942168
Epoch:  30%|███       | 30/100 [20:52<48:42, 41.74s/it]

Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168]
Dev loss: 0.032434062327484824
Epoch:  31%|███       | 31/100 [21:33<48:00, 41.75s/it]

Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824]
Dev loss: 0.031215171958949114
Epoch:  32%|███▏      | 32/100 [22:15<47:18, 41.75s/it]

Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114]
Dev loss: 0.030425874919101998
Epoch:  33%|███▎      | 33/100 [22:57<46:37, 41.75s/it]

Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998]
Dev loss: 0.029301784982955135
Epoch:  34%|███▍      | 34/100 [23:39<45:55, 41.75s/it]

Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135]
Dev loss: 0.02863636497106101
Epoch:  35%|███▌      | 35/100 [24:20<45:13, 41.74s/it]

Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101]
Dev loss: 0.027768823368525184
Epoch:  36%|███▌      | 36/100 [25:02<44:31, 41.74s/it]

Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184]
Dev loss: 0.027125793635039717
Epoch:  37%|███▋      | 37/100 [25:44<43:49, 41.74s/it]

Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717]
Dev loss: 0.026067637888765014
Epoch:  38%|███▊      | 38/100 [26:25<43:07, 41.74s/it]

Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014]
Dev loss: 0.02596252937675328
Epoch:  39%|███▉      | 39/100 [27:07<42:25, 41.74s/it]

Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328]
Dev loss: 0.02543608532161326
Epoch:  40%|████      | 40/100 [27:49<41:44, 41.74s/it]

Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326]
Dev loss: 0.024767381466321042
Epoch:  41%|████      | 41/100 [28:31<41:02, 41.74s/it]

Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042]
Dev loss: 0.024453865213168634
Epoch:  42%|████▏     | 42/100 [29:12<40:21, 41.74s/it]

Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634]
Dev loss: 0.024056918063276523
Epoch:  43%|████▎     | 43/100 [29:54<39:39, 41.74s/it]

Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634, 0.024056918063276523]
Dev loss: 0.023693673740569
Epoch:  44%|████▍     | 44/100 [30:36<38:57, 41.74s/it]

Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634, 0.024056918063276523, 0.023693673740569]
Dev loss: 0.02324753919163266
Epoch:  45%|████▌     | 45/100 [31:18<38:15, 41.74s/it]

Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634, 0.024056918063276523, 0.023693673740569, 0.02324753919163266]
Dev loss: 0.022897354855730728
Epoch:  46%|████▌     | 46/100 [32:00<37:36, 41.78s/it]

Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634, 0.024056918063276523, 0.023693673740569, 0.02324753919163266, 0.022897354855730728]
Dev loss:
Epoch:  47%|████▋     | 47/100 [32:41<36:53, 41.77s/it]
Epoch:  48%|████▊     | 48/100 [33:23<36:04, 41.63s/it]

Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634, 0.024056918063276523, 0.023693673740569, 0.02324753919163266, 0.022897354855730728, 0.0228723

Epoch:  49%|████▉     | 49/100 [34:04<35:24, 41.66s/it]

Epoch:  50%|█████     | 50/100 [34:46<34:44, 41.68s/it]

Epoch:  51%|█████     | 51/100 [35:28<34:03, 41.70s/it]

Epoch:  52%|█████▏    | 52/100 [36:10<33:22, 41.71s/it]

Epoch:  53%|█████▎    | 53/100 [36:51<32:40, 41.72s/it]

Epoch:  54%|█████▍    | 54/100 [37:33<31:53, 41.59s/it]


Epoch:  55%|█████▌    | 55/100 [38:14<31:13, 41.63s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634, 0.024056918063276523, 0.023693673740569, 0.02324753919163266, 0.022897354855730728, 0.0228723




Epoch:  56%|█████▌    | 56/100 [38:56<30:33, 41.67s/it][A[A[A

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634, 0.024056918063276523, 0.023693673740569, 0.02324753919163266, 0.022897354855730728, 0.0228723




Epoch:  57%|█████▋    | 57/100 [39:38<29:52, 41.69s/it][A[A[A

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634, 0.024056918063276523, 0.023693673740569, 0.02324753919163266, 0.022897354855730728, 0.0228723




Epoch:  58%|█████▊    | 58/100 [40:19<29:11, 41.69s/it][A[A[A

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…




Epoch:  59%|█████▉    | 59/100 [41:01<28:24, 41.57s/it][A[A[A


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634, 0.024056918063276523, 0.023693673740569, 0.02324753919163266, 0.022897354855730728, 0.0228723

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…




Epoch:  60%|██████    | 60/100 [41:42<27:39, 41.49s/it][A[A[A


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634, 0.024056918063276523, 0.023693673740569, 0.02324753919163266, 0.022897354855730728, 0.0228723

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634, 0.024056918063276523, 0.023693673740569, 0.02324753919163266, 0.022897354855730728, 0.0228723




Epoch:  61%|██████    | 61/100 [42:24<27:00, 41.56s/it][A[A[A

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634, 0.024056918063276523, 0.023693673740569, 0.02324753919163266, 0.022897354855730728, 0.0228723




Epoch:  62%|██████▏   | 62/100 [43:06<26:21, 41.62s/it][A[A[A

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634, 0.024056918063276523, 0.023693673740569, 0.02324753919163266, 0.022897354855730728, 0.0228723




Epoch:  63%|██████▎   | 63/100 [43:47<25:41, 41.65s/it][A[A[A

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…




Epoch:  64%|██████▍   | 64/100 [44:29<24:55, 41.54s/it][A[A[A


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634, 0.024056918063276523, 0.023693673740569, 0.02324753919163266, 0.022897354855730728, 0.0228723

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634, 0.024056918063276523, 0.023693673740569, 0.02324753919163266, 0.022897354855730728, 0.0228723




Epoch:  65%|██████▌   | 65/100 [45:10<24:15, 41.60s/it][A[A[A

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634, 0.024056918063276523, 0.023693673740569, 0.02324753919163266, 0.022897354855730728, 0.0228723




Epoch:  66%|██████▌   | 66/100 [45:52<23:35, 41.63s/it][A[A[A

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…




Epoch:  67%|██████▋   | 67/100 [46:33<22:50, 41.53s/it][A[A[A


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634, 0.024056918063276523, 0.023693673740569, 0.02324753919163266, 0.022897354855730728, 0.0228723

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…




Epoch:  68%|██████▊   | 68/100 [47:15<22:06, 41.45s/it][A[A[A


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634, 0.024056918063276523, 0.023693673740569, 0.02324753919163266, 0.022897354855730728, 0.0228723

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…




Epoch:  69%|██████▉   | 69/100 [47:56<21:23, 41.40s/it][A[A[A


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634, 0.024056918063276523, 0.023693673740569, 0.02324753919163266, 0.022897354855730728, 0.0228723

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…




Epoch:  70%|███████   | 70/100 [48:37<20:41, 41.37s/it][A[A[A


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634, 0.024056918063276523, 0.023693673740569, 0.02324753919163266, 0.022897354855730728, 0.0228723

HBox(children=(IntProgress(value=0, description='Training iteration', max=185, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=37, style=ProgressStyle(descriptio…


Loss history: [0.5213218041368433, 0.3512672838327047, 0.22659491606660792, 0.15170020992691452, 0.11356990703859844, 0.09469581637028102, 0.08492339019839829, 0.07952852808945887, 0.07629744426624195, 0.07419083468817375, 0.0725745868843955, 0.07107165012810682, 0.06928105591922193, 0.06736977509147413, 0.06515877099858748, 0.06267403186978521, 0.05988547669069187, 0.05735226267495671, 0.0545267711418706, 0.05180915575977918, 0.049151497213421644, 0.04689121790028907, 0.04473428246942726, 0.04255887372670947, 0.04059563582209316, 0.038926143048180116, 0.037120581680052986, 0.035901194539021804, 0.03453905779767681, 0.03356715910942168, 0.032434062327484824, 0.031215171958949114, 0.030425874919101998, 0.029301784982955135, 0.02863636497106101, 0.027768823368525184, 0.027125793635039717, 0.026067637888765014, 0.02596252937675328, 0.02543608532161326, 0.024767381466321042, 0.024453865213168634, 0.024056918063276523, 0.023693673740569, 0.02324753919163266, 0.022897354855730728, 0.0228723

I0406 17:13:57.230393 139997413660480 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0406 17:13:57.231432 139997413660480 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd


Data size: 887
P: 730 / 833 = 0.8763505402160864
R: 730 / 949 = 0.7692307692307693
F: 0.8193041526374859
A: 0.7531003382187148
AL1: 0.7936865839909808


## Evaluation

In [5]:
evaluate_output(all_correct, all_predicted)

Data size: 4435
P: 3723 / 4272 = 0.8714887640449438
R: 3723 / 4763 = 0.7816502204492967
F: 0.8241283895960154
A: 0.7627959413754227
AL1: 0.8011273957158963
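
The figures above can be reproduced from the multi-hot label vectors. The sketch below is a minimal illustration, not the actual `evaluate_output` implementation, which is not shown in this notebook. The function name `micro_scores` is hypothetical. P, R and F are read as micro-averaged precision, recall and F1, which matches the printed fractions; the readings of A as exact-match accuracy and AL1 as the share of items with at least one correctly predicted label are assumptions.

```python
def micro_scores(all_correct, all_predicted):
    """Micro-averaged scores over multi-hot label vectors (hypothetical helper)."""
    tp = n_pred = n_gold = exact = at_least_one = 0
    for correct, predicted in zip(all_correct, all_predicted):
        # Labels that are both predicted and correct for this item.
        hits = sum(1 for c, p in zip(correct, predicted) if c == 1 and p == 1)
        tp += hits
        n_pred += sum(predicted)
        n_gold += sum(correct)
        # Assumption: "A" counts items whose predicted label set is exactly right.
        exact += int(list(correct) == list(predicted))
        # Assumption: "AL1" counts items with at least one correct predicted label.
        at_least_one += int(hits > 0)

    n_items = len(all_correct)
    precision = tp / n_pred if n_pred > 0 else 0.0
    recall = tp / n_gold if n_gold > 0 else 0.0
    f1 = 2 * tp / (n_pred + n_gold) if n_pred + n_gold > 0 else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "exact_match": exact / n_items if n_items > 0 else 0.0,
        "at_least_one": at_least_one / n_items if n_items > 0 else 0.0,
    }


# Toy example with three items and three labels.
toy_correct = [[1, 0, 0], [0, 1, 1], [0, 0, 1]]
toy_predicted = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
toy_scores = micro_scores(toy_correct, toy_predicted)
print(toy_scores)
```

As a consistency check on the reported numbers: F1 equals 2·TP / (predicted + gold), and 2·3723 / (4272 + 4763) ≈ 0.8241, which agrees with the printed F.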


In [6]:
# Per-label counts: true positives, false positives, false negatives and support.
scores = {}
for item, predicted, correct in zip(all_test_data, all_predicted, all_correct):
    # Convert the multi-hot vectors back to label names.
    correct_labels = [idx2label[i] for i, l in enumerate(correct) if l == 1]
    predicted_labels = [idx2label[i] for i, l in enumerate(predicted) if l == 1]
    # One line per test item: text#correct labels#predicted labels
    print("{}#{}#{}".format(item.text, ";".join(correct_labels), ";".join(predicted_labels)))

    for label in predicted_labels + correct_labels:
        scores.setdefault(label, {"tp": 0, "fp": 0, "fn": 0, "support": 0})

    for label in predicted_labels:
        if label in correct_labels:
            scores[label]["tp"] += 1
        else:
            scores[label]["fp"] += 1

    for label in correct_labels:
        scores[label]["support"] += 1
        if label not in predicted_labels:
            scores[label]["fn"] += 1

# Per-label precision, recall and F1; a score is 0 when its denominator is 0.
for label, s in scores.items():
    lp = s["tp"] / (s["tp"] + s["fp"]) if s["tp"] + s["fp"] > 0 else 0
    lr = s["tp"] / (s["tp"] + s["fn"]) if s["tp"] + s["fn"] > 0 else 0
    lf = 2 * lp * lr / (lp + lr) if lp + lr > 0 else 0

    print(label, lp, lr, lf, s["support"])

Methane from cow burps harms the environment, but some people still want to eat regular beef.#eatingmeat4_but_Feedback_7#eatingmeat4_but_Feedback_7
Methane from cow burps harms the environment, so Impossible foods wants to make non-meat food items that taste like meat in order to stop the raising of animals for food purposes.#eatingmeat4_so_Feedback_2#eatingmeat4_so_Feedback_2
Eastern Michigan University cut women's tennis and softball, but it was found to be in violation of Title IX.#title9_but_EMU_viol_TitleIX#title9_but_EMU_viol_TitleIX
Methane from cow burps harms the environment because each cow belches approximately 50 gallons of methane per year which contributes to air pollution as well as global warming.#eatingmeat4_because_Feedback_5#eatingmeat4_because_Feedback_5
Eastern Michigan University cut women's tennis and softball, so a student athlete who had a softball scholarship teamed up with someone to sue the university for violating Title IX.#title9_so_Ath_sued_TitleIX#title9

Eastern Michigan University cut women's tennis and softball, but they were forced to reinstate these sports after being successfully sued for being in violation of Title IX.#title9_but_Teams_reinstated;title9_but_Women_sued#title9_but_Teams_reinstated;title9_but_Women_sued
Eastern Michigan University cut women's tennis and softball because because wanted to save money#title9_because_To_Save_Money#title9_because_To_Save_Money
Methane from cow burps harms the environment because each day they burp out 30-50 gallons of the greenhouse gas methane which damages the ozone layer.#eatingmeat4_because_Feedback_9#eatingmeat4_because_Feedback_5
Methane from cow burps harms the environment because it increases the earth's temperature called Global Warming.#eatingmeat4_because_Feedback_1#eatingmeat4_because_Feedback_1
Methane from cow burps harms the environment, so scientists have been testing different diets to reduce the methane from cow burps#eatingmeat4_so_Feedback_3#
Eastern Michigan Universi

Eastern Michigan University cut women's tennis and softball, so it would be able to reduce it's deficit spending#title9_so_Reduce_costs#title9_so_Reduce_costs
Eastern Michigan University cut women's tennis and softball because because they had to find a way to save money because they were way over budget.#title9_because_To_Save_Money;title9_because_Overspending#title9_because_To_Save_Money;title9_because_Overspending
Methane from cow burps harms the environment because Not everyone.#eatingmeat4_because_Feedback_8#eatingmeat4_because_Feedback_8
Methane from cow burps harms the environment, but we can consume impossible foods to lower the emissions and dependency on cows.#eatingmeat4_but_Feedback_5#
Methane from cow burps harms the environment because methane gas causes Earth's temperature to rise in a process called global warming.#eatingmeat4_because_Feedback_1#eatingmeat4_because_Feedback_1
Methane from cow burps harms the environment, so we need to change our habits and how we handle

Methane from cow burps harms the environment, so by either eating less meat, or by having cows eat less gassy food, we can lower the amount of methane that's put into the atmosphere.#eatingmeat4_so_Feedback_10#eatingmeat4_so_Feedback_10
Eastern Michigan University cut women's tennis and softball, but they argued the cuts would effect more men than women.#title9_but_Affected_More_Men#title9_but_Affected_More_Men
Eastern Michigan University cut women's tennis and softball because because it needed to make budget cuts as it was deeply in debt.#title9_because_To_Save_Money;title9_because_Overspending#title9_because_To_Save_Money;title9_because_Overspending
Eastern Michigan University cut women's tennis and softball, but later reinstated them after a judge ruled that they must do so.#title9_but_Teams_reinstated#title9_but_Teams_reinstated
Eastern Michigan University cut women's tennis and softball because because it claims it did not have enough money in the budget.#title9_because_Lack_of_f

Methane from cow burps harms the environment, but to address this problem, the company Impossible Foods wants to end all animal agriculture by 2035#eatingmeat4_but_Feedback_8#eatingmeat4_but_Feedback_8
Methane from cow burps harms the environment, so the current agricultural practices of the world should be changed.#eatingmeat4_so_Feedback_9#eatingmeat4_so_Feedback_9
Methane from cow burps harms the environment because the gases released contain methane which is harmful to the environment.#eatingmeat4_because_Feedback_5#eatingmeat4_because_Feedback_5
Methane from cow burps harms the environment, so another solution lies in what farmers feed cows#eatingmeat4_so_Feedback_10#
Eastern Michigan University cut women's tennis and softball because because of costs to the university.#title9_because_Overspending#title9_because_Fin_Trouble
Methane from cow burps harms the environment, so the need to change what farmers feed their cows#eatingmeat4_so_Feedback_10#eatingmeat4_so_Feedback_10
Eastern 

In [7]:
print(len([p for p in all_predicted if sum(p) == 0]), "/", len(all_predicted))

389 / 4435
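
The count above means that about 8.8% of the test items (389 of 4435) received no predicted label at all. Every correct label on such an item is a false negative, so empty predictions may account for part of the gap between precision and recall. The sketch below shows one way to quantify that share from the same multi-hot vectors. The function name `empty_prediction_stats` is hypothetical and the helper is not part of the original notebook.

```python
def empty_prediction_stats(all_correct, all_predicted):
    """Count items without any predicted label and the correct labels they miss."""
    empty_items = 0
    missed_gold = 0
    for correct, predicted in zip(all_correct, all_predicted):
        if sum(predicted) == 0:
            empty_items += 1
            # Every correct label on an empty prediction is a false negative.
            missed_gold += sum(correct)

    total_gold = sum(sum(correct) for correct in all_correct)
    n_items = len(all_predicted)
    return {
        "empty_items": empty_items,
        "empty_rate": empty_items / n_items if n_items > 0 else 0.0,
        "missed_gold": missed_gold,
        "missed_gold_share": missed_gold / total_gold if total_gold > 0 else 0.0,
    }


# Toy example: two of the three items have an empty prediction.
toy_correct = [[1, 0], [1, 1], [0, 1]]
toy_predicted = [[1, 0], [0, 0], [0, 0]]
toy_stats = empty_prediction_stats(toy_correct, toy_predicted)
print(toy_stats)
```

Calling `empty_prediction_stats(all_correct, all_predicted)` on the test data would show how many of the missed correct labels fall on items with an empty prediction.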
