# Multilabel BERT Experiments

In this notebook we run a first set of experiments with BERT: we fine-tune a BERT model with a classification layer on each of our datasets separately and measure how well the resulting classifier performs on the test data.

For these experiments we use the `pytorch_transformers` package. It provides a variety of neural network architectures for transfer learning, together with pretrained models such as BERT and XLNet.

Two different BERT models are relevant for our experiments: 

- BERT-base-uncased: a relatively small BERT model that should already give reasonable results,
- BERT-large-uncased: a larger model for real state-of-the-art results.

In [1]:
PREFIX = "title9_but" #"voting_so_automl"
BERT_MODEL = 'bert-base-uncased'
BATCH_SIZE = 16 if "base" in BERT_MODEL else 2
GRADIENT_ACCUMULATION_STEPS = 1 if "base" in BERT_MODEL else 8
MAX_SEQ_LENGTH = 100
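
The batch size and the number of gradient accumulation steps are chosen together: the large model only fits small batches in memory, so it accumulates gradients over more steps. The snippet below is a small sketch of that arithmetic. `training_config` is a hypothetical helper that mirrors the rule in the cell above; it is not part of `quillnlp`.

```python
def training_config(bert_model: str) -> dict:
    """Mirror the notebook's rule for batch size and accumulation steps."""
    is_base = "base" in bert_model
    batch_size = 16 if is_base else 2
    accumulation_steps = 1 if is_base else 8
    return {
        "batch_size": batch_size,
        "gradient_accumulation_steps": accumulation_steps,
        # Examples that contribute to one optimizer update, assuming the
        # training loop steps the optimizer once per accumulation cycle.
        "effective_batch_size": batch_size * accumulation_steps,
    }


for name in ("bert-base-uncased", "bert-large-uncased"):
    print(name, training_config(name))
```

Under that assumption both models end up with the same effective batch size of 16 examples per update.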

## Data

We use the same data as in all our previous experiments. Here we load all labelled responses for a particular prompt; the split into training, development and test data happens later, during cross-validation.

In [2]:
import sys
sys.path.append('../')

import ndjson
import glob
import numpy as np
from collections import Counter

from quillnlp.models.bert.preprocessing import preprocess, create_label_vocabulary

data_file = f"../data/interim/{PREFIX}_withprompt.ndjson"

with open(data_file) as i:
    data = ndjson.load(i)
    
# Copy each item's label list to the "label" key, which is the key read below.
for item in data:
    item["label"] = item["labels"]
    
label2idx = create_label_vocabulary(data)
idx2label = {v:k for k,v in label2idx.items()}
target_names = [idx2label[s] for s in range(len(idx2label))]

data_items = preprocess(data, BERT_MODEL, label2idx, MAX_SEQ_LENGTH)
data_items = np.array(data_items)
    
labels = Counter()
for item in data:
    labels.update(item["label"])
print(labels)

I0404 18:53:29.534177 140043817379648 file_utils.py:41] PyTorch version 1.2.0+cu92 available.
I0404 18:53:30.516104 140043817379648 file_utils.py:57] TensorFlow version 2.1.0 available.
I0404 18:53:31.153071 140043817379648 tokenization_utils.py:501] loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /home/yves/.cache/torch/transformers/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084


Counter({'Miscellaneous': 124, 'Teams_reinstated': 116, 'Women_sued': 107, 'Cuts_to_WSD,': 92, 'EMU_viol_TitleIX': 56, 'Court_Decision': 49, 'Cuts_to_mens_sports': 48, 'Judge_reinstate': 45, 'Affected_More_Men': 42, 'Did_not_cut_t_and_s': 30, 'Fem_athletes_mad': 22, 'Kept_other_ones': 20, 'Limit_woms_opps': 17, 'EMU_cut_less_men': 17, 'EMU_compliant': 15, 'Cuts_to_WSD': 1})
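
`create_label_vocabulary` and `preprocess` come from `quillnlp` and are not shown in this notebook. As a rough sketch of what a label vocabulary and multi-hot targets look like in a multilabel setting, the following uses two hypothetical helpers, `build_label_vocabulary` and `to_multi_hot`; the actual implementation may differ.

```python
def build_label_vocabulary(items):
    """Map every label seen in the data to a stable integer index."""
    labels = sorted({label for item in items for label in item["label"]})
    return {label: idx for idx, label in enumerate(labels)}


def to_multi_hot(item_labels, label2idx):
    """Encode a list of labels as a 0/1 vector with one slot per known label."""
    vector = [0] * len(label2idx)
    for label in item_labels:
        vector[label2idx[label]] = 1
    return vector


# Toy items; the label names are taken from the counts printed above.
toy_data = [
    {"label": ["Women_sued", "Court_Decision"]},
    {"label": ["Teams_reinstated"]},
]
toy_label2idx = build_label_vocabulary(toy_data)
print(toy_label2idx)
print(to_multi_hot(["Women_sued", "Court_Decision"], toy_label2idx))
```

Note that the counts above contain both `Cuts_to_WSD,` (92 times) and `Cuts_to_WSD` (once). A vocabulary built from the raw strings, as in this sketch, treats them as two distinct labels.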


Next we read the synthetic data, if any is available.

In [3]:
from collections import defaultdict

synth_files = glob.glob(f"../data/interim/{PREFIX}_withprompt_*.ndjson")

synth_data = []
for synth_file in synth_files:
    with open(synth_file) as i:
        synth_data.extend(ndjson.load(i))

preprocessed_synth_data = preprocess(synth_data, BERT_MODEL, label2idx, MAX_SEQ_LENGTH)
        
synth_map = defaultdict(list)
for item, preprocessed_item in zip(synth_data, preprocessed_synth_data):
    synth_map[item["source_text"]].append(preprocessed_item)

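# NOTE: this resets the map, so no synthetic data is used in the run below.
# Remove this line to train with the synthetic examples.
synth_map = {}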

I0404 18:53:31.873625 140043817379648 tokenization_utils.py:501] loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /home/yves/.cache/torch/transformers/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
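
Keying the synthetic examples by their source text makes it possible to add only those examples whose original item is in the training portion of a fold, which keeps variants of development and test items out of training. A minimal sketch of that idea with toy records follows; only the `source_text` field is taken from the notebook, and the `text` field and the `augment` helper are assumptions for illustration.

```python
from collections import defaultdict

# Toy stand-ins for the synthetic records.
synthetic = [
    {"source_text": "a", "text": "a (variant 1)"},
    {"source_text": "a", "text": "a (variant 2)"},
    {"source_text": "b", "text": "b (variant 1)"},
]

# Group the synthetic variants by the original text they were derived from.
synth_by_source = defaultdict(list)
for record in synthetic:
    synth_by_source[record["source_text"]].append(record["text"])


def augment(train_texts, synth_by_source):
    """Return the training texts plus synthetic variants of those texts only."""
    extra = []
    for text in train_texts:
        # .get() does not create missing keys, so the map stays unchanged.
        extra.extend(synth_by_source.get(text, []))
    return list(train_texts) + extra


print(augment(["a"], synth_by_source))
```

Variants of `"b"` are not added here, because `"b"` is not among the training texts.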


## Model

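For every fold we load the pretrained model and put it on a GPU if one is available. During fine-tuning the model is in "training" mode, so that its internal parameters can be updated on the basis of our data sets; for testing we reload the saved model and switch it to evaluation mode. We first define a helper that scores the predictions.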

In [4]:
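def evaluate_output(all_correct, all_predicted):
    """Print micro-averaged P/R/F, exact-match accuracy and at-least-one accuracy."""
    correct = 0
    at_least_one = 0
    fp, fn, tp, tn = 0, 0, 0, 0
    for c, p in zip(all_correct, all_predicted):
        # Exact match: every label decision for this item is correct.
        if all(ci == pi for ci, pi in zip(c, p)):
            correct += 1

        # At least one gold label was also predicted.
        if any(ci == 1 and pi == 1 for ci, pi in zip(c, p)):
            at_least_one += 1

        # Label-level counts for the micro-averaged scores.
        for ci, pi in zip(c, p):
            if pi == 1 and ci == 1:
                tp += 1
            elif pi == 1 and ci == 0:
                fp += 1
            elif pi == 0 and ci == 1:
                fn += 1
            else:
                tn += 1

    # Guard against division by zero when there are no predicted or no gold labels.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    print("Data size:", len(all_predicted))
    print("P:", tp, "/", tp+fp, "=", precision)
    print("R:", tp, "/", tp+fn, "=", recall)
    print("F:", f_score)
    print("A:", correct/len(all_correct))
    print("AL1:", at_least_one/len(all_correct))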
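
To make the reported scores concrete, here is a small worked example on three toy items with three labels each. `multilabel_scores` is a hypothetical helper that returns the same quantities that `evaluate_output` prints: micro-averaged precision, recall and F-score, exact-match accuracy (`A`) and the share of items with at least one correctly predicted label (`AL1`).

```python
def multilabel_scores(gold, predicted):
    """Compute micro P/R/F, exact-match accuracy and at-least-one accuracy."""
    tp = fp = fn = 0
    exact = at_least_one = 0
    for g_row, p_row in zip(gold, predicted):
        exact += int(list(g_row) == list(p_row))
        at_least_one += int(any(g == 1 and p == 1 for g, p in zip(g_row, p_row)))
        for g, p in zip(g_row, p_row):
            tp += int(g == 1 and p == 1)
            fp += int(g == 0 and p == 1)
            fn += int(g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    n = len(gold)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "exact_match": exact / n,
        "at_least_one": at_least_one / n,
    }


gold = [[1, 0, 1], [0, 1, 0], [1, 0, 0]]
predicted = [[1, 0, 0], [0, 1, 0], [0, 1, 0]]
scores = multilabel_scores(gold, predicted)
print(scores)
```

Here there are 2 true positives, 1 false positive and 2 false negatives, so precision is 2/3, recall is 1/2 and the F-score is 4/7. Only the second item is fully correct (exact match 1/3), while the first two items have at least one correct label (2/3).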

In [5]:
import torch
import random

from quillnlp.models.bert.train import train, evaluate
from quillnlp.models.bert.models import get_multilabel_bert_classifier

from quillnlp.models.bert.preprocessing import get_data_loader
from sklearn.model_selection import KFold

random.shuffle(data_items)

kf = KFold(n_splits=5, shuffle=True, random_state=1)
all_correct, all_predicted = [], []
all_test_data = []
for train_idx, test_idx in kf.split(data_items):

    # Of the non-test data, use 5/6 for training and the remaining 1/6 as development data.
    train_and_dev_data = data_items[train_idx]
    cutoff = int(len(train_and_dev_data)/6*5)
    train_data = train_and_dev_data[:cutoff]
    dev_data = train_and_dev_data[cutoff:]
    test_data = data_items[test_idx]
    
    print("Train size:", len(train_data))
    
    # Add the synthetic examples derived from this fold's training items only.
    synth_data = []
    for item in train_data:
        synth_data.extend(synth_map.get(item.text, []))
                
    train_dataloader = get_data_loader(np.concatenate((train_data, synth_data)), BATCH_SIZE)
    dev_dataloader = get_data_loader(dev_data, BATCH_SIZE)
    test_dataloader = get_data_loader(test_data, BATCH_SIZE, shuffle=False)

    print("Final train dataloader length:", len(train_dataloader))
    
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = get_multilabel_bert_classifier(BERT_MODEL, len(label2idx), device=device)
    output_model_file = train(model, train_dataloader, dev_dataloader, BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, 
                              device, num_train_epochs=100)
    
    print("Loading model from", output_model_file)
    device="cpu"

    model = get_multilabel_bert_classifier(BERT_MODEL, len(label2idx), model_file=output_model_file, device=device)
    model.eval()
    
    _, _, test_correct, test_predicted = evaluate(model, test_dataloader, device)
    evaluate_output(test_correct, test_predicted)
    all_correct.extend(test_correct)
    all_predicted.extend(test_predicted)
    all_test_data.extend(test_data)

Train size: 460
Final train dataloader length: 29
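
The sizes reported here follow from the split arithmetic in the loop. The notebook does not print the total number of items, so the sketch below assumes 690, which is one of several dataset sizes consistent with the logged values. `split_sizes` is a hypothetical helper, not part of `quillnlp`.

```python
import math


def split_sizes(n_items, n_splits=5, batch_size=16):
    """Sizes for the first fold of a K-fold split followed by a 5:1 train/dev cut."""
    # In scikit-learn's KFold, the first (n_items % n_splits) folds get one extra item.
    test_size = n_items // n_splits + (1 if n_items % n_splits else 0)
    train_and_dev = n_items - test_size
    train_size = int(train_and_dev / 6 * 5)
    dev_size = train_and_dev - train_size
    return {
        "train": train_size,
        "dev": dev_size,
        "test": test_size,
        "train_batches": math.ceil(train_size / batch_size),
        "dev_batches": math.ceil(dev_size / batch_size),
    }


sizes = split_sizes(690)  # 690 is an assumed dataset size
print(sizes)
```

With these assumptions the first fold has 460 training items in 29 batches, which matches the output above, and 92 development items in 6 batches, which matches the length of the evaluation progress bar. A train dataloader of 29 batches also confirms that no synthetic examples were added in this run.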


I0404 18:53:32.406511 140043817379648 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0404 18:53:32.408109 140043817379648 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd

Loss history: []
Dev loss: 0.5807238121827444


Epoch:   1%|          | 1/100 [00:06<11:20,  6.87s/it]

Loss history: [0.5807238121827444]
Dev loss: 0.4713802585999171


Epoch:   2%|▏         | 2/100 [00:13<11:14,  6.88s/it]

Loss history: [0.5807238121827444, 0.4713802585999171]
Dev loss: 0.3920477976401647


Epoch:   3%|▎         | 3/100 [00:20<11:08,  6.89s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647]
Dev loss: 0.3334565609693527


Epoch:   4%|▍         | 4/100 [00:27<11:01,  6.89s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527]
Dev loss: 0.28837329645951587


Epoch:   5%|▌         | 5/100 [00:34<10:55,  6.90s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587]
Dev loss: 0.26220135887463886


Epoch:   6%|▌         | 6/100 [00:41<10:48,  6.90s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886]
Dev loss: 0.24571756521860758


Epoch:   7%|▋         | 7/100 [00:48<10:42,  6.90s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758]
Dev loss: 0.23376855005820593


Epoch:   8%|▊         | 8/100 [00:55<10:35,  6.91s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593]
Dev loss: 0.22130146125952402


Epoch:   9%|▉         | 9/100 [01:02<10:28,  6.91s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402]
Dev loss: 0.2088807945450147


Epoch:  10%|█         | 10/100 [01:09<10:22,  6.92s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147]
Dev loss: 0.1988967756430308


Epoch:  11%|█         | 11/100 [01:15<10:15,  6.92s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308]
Dev loss: 0.18800713370243707


Epoch:  12%|█▏        | 12/100 [01:22<10:09,  6.92s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707]
Dev loss: 0.176468292872111


Epoch:  13%|█▎        | 13/100 [01:29<10:02,  6.93s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111]
Dev loss: 0.16879013429085413


Epoch:  14%|█▍        | 14/100 [01:36<09:55,  6.93s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413]
Dev loss: 0.15945591777563095


Epoch:  15%|█▌        | 15/100 [01:43<09:51,  6.96s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095]
Dev loss: 0.15360741565624872


Epoch:  16%|█▌        | 16/100 [01:50<09:44,  6.95s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872]
Dev loss: 0.14622070640325546


Epoch:  17%|█▋        | 17/100 [01:57<09:36,  6.95s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546]
Dev loss: 0.14106962333122888


Epoch:  18%|█▊        | 18/100 [02:04<09:29,  6.95s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888]
Dev loss: 0.13842809945344925


Epoch:  19%|█▉        | 19/100 [02:11<09:22,  6.94s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925]
Dev loss: 0.13366339231530824


Epoch:  20%|██        | 20/100 [02:18<09:15,  6.94s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824]
Dev loss: 0.13091055179635683


Epoch:  21%|██        | 21/100 [02:25<09:08,  6.94s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683]
Dev loss: 0.12942353015144667


Epoch:  22%|██▏       | 22/100 [02:32<09:01,  6.94s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667]
Dev loss: 0.12504354616006216


Epoch:  23%|██▎       | 23/100 [02:39<08:54,  6.94s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216]
Dev loss: 0.12317216272155444


Epoch:  24%|██▍       | 24/100 [02:46<08:47,  6.94s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444]
Dev loss: 0.12026351317763329


Epoch:  25%|██▌       | 25/100 [02:53<08:40,  6.94s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329]
Dev loss: 0.11964867388208707


Epoch:  26%|██▌       | 26/100 [03:00<08:33,  6.94s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707]
Dev loss: 0.11491534238060315


Epoch:  27%|██▋       | 27/100 [03:07<08:26,  6.94s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315]
Dev loss: 0.113737386961778


Epoch:  28%|██▊       | 28/100 [03:14<08:19,  6.94s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778]
Dev loss: 0.11367260416348775


Epoch:  29%|██▉       | 29/100 [03:20<08:12,  6.94s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778, 0.11367260416348775]
Dev loss: 0.11274700860182445


Epoch:  30%|███       | 30/100 [03:27<08:05,  6.94s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778, 0.11367260416348775, 0.11274700860182445]
Dev loss: 0.11127892384926479


Epoch:  31%|███       | 31/100 [03:34<07:59,  6.94s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778, 0.11367260416348775, 0.11274700860182445, 0.11127892384926479]
Dev loss: 0.10855888575315475


Epoch:  32%|███▏      | 32/100 [03:41<07:52,  6.94s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778, 0.11367260416348775, 0.11274700860182445, 0.11127892384926479, 0.10855888575315475]
Dev loss: 0.10669782012701035


Epoch:  33%|███▎      | 33/100 [03:48<07:45,  6.94s/it]

Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778, 0.11367260416348775, 0.11274700860182445, 0.11127892384926479, 0.10855888575315475, 0.10669782012701035]
Dev loss: 0.10567709182699521


Epoch:  34%|███▍      | 34/100 [03:55<07:38,  6.94s/it]

Epoch:  35%|███▌      | 35/100 [04:02<07:21,  6.80s/it]


Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778, 0.11367260416348775, 0.11274700860182445, 0.11127892384926479, 0.10855888575315475, 0.10669782012701035, 0.10567709182699521]
Dev loss: 0.10577129075924556


Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778, 0.11367260416348775, 0.11274700860182445, 0.11127892384926479, 0.10855888575315475, 0.10669782012701035, 0.10567709182699521, 0.10577129075924556]
Dev loss: 0.10335046673814456


Epoch:  36%|███▌      | 36/100 [04:09<07:17,  6.84s/it]


Epoch:  37%|███▋      | 37/100 [04:15<07:03,  6.72s/it]


Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778, 0.11367260416348775, 0.11274700860182445, 0.11127892384926479, 0.10855888575315475, 0.10669782012701035, 0.10567709182699521, 0.10577129075924556, 0.10335046673814456]
Dev loss: 0.10435107350349426




Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778, 0.11367260416348775, 0.11274700860182445, 0.11127892384926479, 0.10855888575315475, 0.10669782012701035, 0.10567709182699521, 0.10577129075924556, 0.10335046673814456, 0.10435107350349426]
Dev loss: 0.1011144941051801


Epoch:  38%|███▊      | 38/100 [04:22<07:01,  6.79s/it]


Epoch:  39%|███▉      | 39/100 [04:28<06:48,  6.69s/it]


Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778, 0.11367260416348775, 0.11274700860182445, 0.11127892384926479, 0.10855888575315475, 0.10669782012701035, 0.10567709182699521, 0.10577129075924556, 0.10335046673814456, 0.10435107350349426, 0.1011144941051801]
Dev loss: 0.10139636819561322



Epoch:  40%|████      | 40/100 [04:35<06:37,  6.62s/it]


Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778, 0.11367260416348775, 0.11274700860182445, 0.11127892384926479, 0.10855888575315475, 0.10669782012701035, 0.10567709182699521, 0.10577129075924556, 0.10335046673814456, 0.10435107350349426, 0.1011144941051801, 0.10139636819561322]
Dev loss: 0.1020019253094991



Epoch:  41%|████      | 41/100 [04:41<06:27,  6.57s/it]


Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778, 0.11367260416348775, 0.11274700860182445, 0.11127892384926479, 0.10855888575315475, 0.10669782012701035, 0.10567709182699521, 0.10577129075924556, 0.10335046673814456, 0.10435107350349426, 0.1011144941051801, 0.10139636819561322, 0.1020019253094991]
Dev loss: 0.1023290641605854




Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778, 0.11367260416348775, 0.11274700860182445, 0.11127892384926479, 0.10855888575315475, 0.10669782012701035, 0.10567709182699521, 0.10577129075924556, 0.10335046673814456, 0.10435107350349426, 0.1011144941051801, 0.10139636819561322, 0.1020019253094991, 0.1023290641605854]
Dev loss: 0.09968179712692897


Epoch:  42%|████▏     | 42/100 [04:48<06:27,  6.68s/it]



Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778, 0.11367260416348775, 0.11274700860182445, 0.11127892384926479, 0.10855888575315475, 0.10669782012701035, 0.10567709182699521, 0.10577129075924556, 0.10335046673814456, 0.10435107350349426, 0.1011144941051801, 0.10139636819561322, 0.1020019253094991, 0.1023290641605854, 0.09968179712692897]
Dev loss: 0.099302318568031


Epoch:  43%|████▎     | 43/100 [04:55<06:25,  6.76s/it]



Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778, 0.11367260416348775, 0.11274700860182445, 0.11127892384926479, 0.10855888575315475, 0.10669782012701035, 0.10567709182699521, 0.10577129075924556, 0.10335046673814456, 0.10435107350349426, 0.1011144941051801, 0.10139636819561322, 0.1020019253094991, 0.1023290641605854, 0.09968179712692897, 0.099302318568031]
Dev loss: 0.09783929958939552


Epoch:  44%|████▍     | 44/100 [05:02<06:21,  6.82s/it]



Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778, 0.11367260416348775, 0.11274700860182445, 0.11127892384926479, 0.10855888575315475, 0.10669782012701035, 0.10567709182699521, 0.10577129075924556, 0.10335046673814456, 0.10435107350349426, 0.1011144941051801, 0.10139636819561322, 0.1020019253094991, 0.1023290641605854, 0.09968179712692897, 0.099302318568031, 0.09783929958939552]
Dev loss: 0.09740341827273369


Epoch:  45%|████▌     | 45/100 [05:09<06:17,  6.86s/it]


Epoch:  46%|████▌     | 46/100 [05:16<06:03,  6.74s/it]


Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778, 0.11367260416348775, 0.11274700860182445, 0.11127892384926479, 0.10855888575315475, 0.10669782012701035, 0.10567709182699521, 0.10577129075924556, 0.10335046673814456, 0.10435107350349426, 0.1011144941051801, 0.10139636819561322, 0.1020019253094991, 0.1023290641605854, 0.09968179712692897, 0.099302318568031, 0.09783929958939552, 0.09740341827273369]
Dev loss: 0.09800939758618672




Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778, 0.11367260416348775, 0.11274700860182445, 0.11127892384926479, 0.10855888575315475, 0.10669782012701035, 0.10567709182699521, 0.10577129075924556, 0.10335046673814456, 0.10435107350349426, 0.1011144941051801, 0.10139636819561322, 0.1020019253094991, 0.1023290641605854, 0.09968179712692897, 0.099302318568031, 0.09783929958939552, 0.09740341827273369, 0.09800939758618672]
Dev loss: 0.09697592879335086


Epoch:  47%|████▋     | 47/100 [05:23<06:00,  6.80s/it]


Epoch:  48%|████▊     | 48/100 [05:29<05:48,  6.70s/it]


Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778, 0.11367260416348775, 0.11274700860182445, 0.11127892384926479, 0.10855888575315475, 0.10669782012701035, 0.10567709182699521, 0.10577129075924556, 0.10335046673814456, 0.10435107350349426, 0.1011144941051801, 0.10139636819561322, 0.1020019253094991, 0.1023290641605854, 0.09968179712692897, 0.099302318568031, 0.09783929958939552, 0.09740341827273369, 0.09800939758618672, 0.09697592879335086]
Dev loss: 0.



Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778, 0.11367260416348775, 0.11274700860182445, 0.11127892384926479, 0.10855888575315475, 0.10669782012701035, 0.10567709182699521, 0.10577129075924556, 0.10335046673814456, 0.10435107350349426, 0.1011144941051801, 0.10139636819561322, 0.1020019253094991, 0.1023290641605854, 0.09968179712692897, 0.099302318568031, 0.09783929958939552, 0.09740341827273369, 0.09800939758618672, 0.09697592879335086, 0.0984523048

Epoch:  49%|████▉     | 49/100 [05:36<05:45,  6.77s/it]




Epoch:  50%|█████     | 50/100 [05:43<05:41,  6.83s/it]


Epoch:  51%|█████     | 51/100 [05:49<05:29,  6.72s/it]




Epoch:  52%|█████▏    | 52/100 [05:56<05:18,  6.64s/it]






Epoch:  53%|█████▎    | 53/100 [06:03<05:16,  6.73s/it]


Epoch:  54%|█████▍    | 54/100 [06:09<05:05,  6.65s/it]




Epoch:  55%|█████▌    | 55/100 [06:16<04:56,  6.59s/it]




Epoch:  56%|█████▌    | 56/100 [06:22<04:48,  6.55s/it]






Epoch:  57%|█████▋    | 57/100 [06:29<04:46,  6.67s/it]




Epoch:  58%|█████▊    | 58/100 [06:36<04:43,  6.75s/it]


Epoch:  59%|█████▉    | 59/100 [06:43<04:33,  6.66s/it]






Epoch:  60%|██████    | 60/100 [06:49<04:30,  6.75s/it]




Epoch:  61%|██████    | 61/100 [06:56<04:25,  6.81s/it]




Epoch:  62%|██████▏   | 62/100 [07:03<04:20,  6.85s/it]


Epoch:  63%|██████▎   | 63/100 [07:10<04:09,  6.73s/it]



Epoch:  64%|██████▍   | 64/100 [07:16<03:59,  6.65s/it]



Epoch:  65%|██████▌   | 65/100 [07:23<03:50,  6.59s/it]



Epoch:  66%|██████▌   | 66/100 [07:29<03:42,  6.55s/it]




Loss history: [0.5807238121827444, 0.4713802585999171, 0.3920477976401647, 0.3334565609693527, 0.28837329645951587, 0.26220135887463886, 0.24571756521860758, 0.23376855005820593, 0.22130146125952402, 0.2088807945450147, 0.1988967756430308, 0.18800713370243707, 0.176468292872111, 0.16879013429085413, 0.15945591777563095, 0.15360741565624872, 0.14622070640325546, 0.14106962333122888, 0.13842809945344925, 0.13366339231530824, 0.13091055179635683, 0.12942353015144667, 0.12504354616006216, 0.12317216272155444, 0.12026351317763329, 0.11964867388208707, 0.11491534238060315, 0.113737386961778, 0.11367260416348775, 0.11274700860182445, 0.11127892384926479, 0.10855888575315475, 0.10669782012701035, 0.10567709182699521, 0.10577129075924556, 0.10335046673814456, 0.10435107350349426, 0.1011144941051801, 0.10139636819561322, 0.1020019253094991, 0.1023290641605854, 0.09968179712692897, 0.099302318568031, 0.09783929958939552, 0.09740341827273369, 0.09800939758618672, 0.09697592879335086, 0.0984523048
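
The log above ends well before the 100 configured epochs, which is consistent with early stopping on the development loss. The training loop itself is not shown in this output, so the following is only a minimal sketch of one common patience-based rule. The function name `train_with_early_stopping`, the `run_epoch` callback and the `patience` value are hypothetical and not taken from the notebook.

```python
def train_with_early_stopping(run_epoch, max_epochs=100, patience=5):
    """Run up to max_epochs epochs and stop once the dev loss stalls.

    run_epoch(epoch) is assumed to train for one epoch and return the
    development loss as a float. Both the callback and the default
    patience are illustrative assumptions.
    """
    loss_history = []
    epochs_without_improvement = 0

    for epoch in range(max_epochs):
        dev_loss = run_epoch(epoch)

        if not loss_history or dev_loss < min(loss_history):
            # New best dev loss: a real loop would typically save the model here.
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1

        loss_history.append(dev_loss)

        if epochs_without_improvement >= patience:
            # The dev loss has not improved for `patience` consecutive epochs.
            break

    return loss_history


# Usage with simulated dev losses instead of a real model.
simulated = iter([0.5, 0.4, 0.3, 0.31, 0.32, 0.33, 0.2])
history = train_with_early_stopping(lambda epoch: next(simulated), patience=3)
print(f"Stopped after {len(history)} epochs")
```

With a rule like this, the number of epochs actually run depends on how noisy the dev loss is, so a larger `patience` trades extra training time for robustness against small fluctuations.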


I0404 19:01:15.196125 140043817379648 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0404 19:01:15.197483 140043817379648 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embed



Data size: 139
P: 123 / 133 = 0.924812030075188
R: 123 / 159 = 0.7735849056603774
F: 0.8424657534246577
A: 0.7482014388489209
AL1: 0.8201438848920863
Train size: 460
Final train dataloader length: 29
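
The P, R and F values above are consistent with micro-averaged precision, recall and F1 over individual label assignments: 123 correct labels out of 133 predicted and 159 gold labels. The evaluation code is not shown in this output, so the sketch below only reproduces that arithmetic. The function names `prf_from_counts` and `micro_prf` are hypothetical.

```python
def prf_from_counts(correct, predicted, gold):
    """Precision, recall and F1 from raw label counts."""
    precision = correct / predicted if predicted else 0.0
    recall = correct / gold if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


def micro_prf(y_true, y_pred):
    """Micro-averaged scores for multilabel data.

    y_true and y_pred are lists of rows, one row per item, with a 0/1
    entry per label. Counts are pooled over all items and labels before
    the scores are computed.
    """
    correct = predicted = gold = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            correct += int(bool(t) and bool(p))
            predicted += int(bool(p))
            gold += int(bool(t))
    return prf_from_counts(correct, predicted, gold)


# Counts taken from the log lines above.
precision, recall, f1 = prf_from_counts(123, 133, 159)
print(f"P: {precision:.4f}  R: {recall:.4f}  F: {f1:.4f}")
```

Because the counts are pooled, frequent labels weigh more heavily in these scores than rare ones; a macro average over labels would treat every label equally.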


I0404 19:01:21.061128 140043817379648 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0404 19:01:21.062078 140043817379648 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd



Loss history: []
Dev loss: 0.6063797970612844


Epoch:   1%|          | 1/100 [00:06<11:23,  6.90s/it]



Loss history: [0.6063797970612844]
Dev loss: 0.4430684943993886


Epoch:   2%|▏         | 2/100 [00:13<11:15,  6.89s/it]



Loss history: [0.6063797970612844, 0.4430684943993886]
Dev loss: 0.34838584065437317


Epoch:   3%|▎         | 3/100 [00:20<11:08,  6.89s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317]
Dev loss: 0.2993813951810201


Epoch:   4%|▍         | 4/100 [00:27<11:01,  6.90s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201]
Dev loss: 0.2680129408836365


Epoch:   5%|▌         | 5/100 [00:34<10:55,  6.90s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365]
Dev loss: 0.2480297933022181


Epoch:   6%|▌         | 6/100 [00:41<10:48,  6.90s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181]
Dev loss: 0.23315516610940298


Epoch:   7%|▋         | 7/100 [00:48<10:41,  6.90s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298]
Dev loss: 0.21830538660287857


Epoch:   8%|▊         | 8/100 [00:55<10:34,  6.90s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857]
Dev loss: 0.20367339750130972


Epoch:   9%|▉         | 9/100 [01:02<10:27,  6.90s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972]
Dev loss: 0.19177435586849848


Epoch:  10%|█         | 10/100 [01:08<10:21,  6.90s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848]
Dev loss: 0.18002596000830332


Epoch:  11%|█         | 11/100 [01:15<10:14,  6.90s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332]
Dev loss: 0.1708356315890948


Epoch:  12%|█▏        | 12/100 [01:22<10:07,  6.90s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948]
Dev loss: 0.16365011284748712


Epoch:  13%|█▎        | 13/100 [01:29<10:00,  6.90s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712]
Dev loss: 0.15577391038338342


Epoch:  14%|█▍        | 14/100 [01:36<09:53,  6.91s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342]
Dev loss: 0.15119144320487976


Epoch:  15%|█▌        | 15/100 [01:43<09:46,  6.91s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976]
Dev loss: 0.1452421322464943


Epoch:  16%|█▌        | 16/100 [01:50<09:40,  6.91s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943]
Dev loss: 0.1424237216512362


Epoch:  17%|█▋        | 17/100 [01:57<09:33,  6.91s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362]
Dev loss: 0.1370781809091568


Epoch:  18%|█▊        | 18/100 [02:04<09:26,  6.91s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568]
Dev loss: 0.13258896519740423


Epoch:  19%|█▉        | 19/100 [02:11<09:19,  6.91s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423]
Dev loss: 0.131445050239563


Epoch:  20%|██        | 20/100 [02:18<09:12,  6.91s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563]
Dev loss: 0.12771031384666762


Epoch:  21%|██        | 21/100 [02:24<09:05,  6.91s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762]
Dev loss: 0.12630400309960046


Epoch:  22%|██▏       | 22/100 [02:31<08:58,  6.91s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046]
Dev loss: 0.12201684340834618


Epoch:  23%|██▎       | 23/100 [02:38<08:51,  6.91s/it]


Epoch:  24%|██▍       | 24/100 [02:45<08:34,  6.77s/it]


Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618]
Dev loss: 0.1251875969270865




Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865]
Dev loss: 0.11935119827588399


Epoch:  25%|██▌       | 25/100 [02:52<08:31,  6.81s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399]
Dev loss: 0.11456630006432533


Epoch:  26%|██▌       | 26/100 [02:59<08:26,  6.84s/it]


Epoch:  27%|██▋       | 27/100 [03:05<08:11,  6.73s/it]


Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533]
Dev loss: 0.11606102188428243



Epoch:  28%|██▊       | 28/100 [03:11<07:58,  6.65s/it]


Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243]
Dev loss: 0.11616700390974681




Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681]
Dev loss: 0.11212937906384468


Epoch:  29%|██▉       | 29/100 [03:18<07:57,  6.73s/it]


Epoch:  30%|███       | 30/100 [03:25<07:45,  6.65s/it]


Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681, 0.11212937906384468]
Dev loss: 0.11368871852755547



Epoch:  31%|███       | 31/100 [03:31<07:34,  6.59s/it]


Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681, 0.11212937906384468, 0.11368871852755547]
Dev loss: 0.11232185612122218




Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681, 0.11212937906384468, 0.11368871852755547, 0.11232185612122218]
Dev loss: 0.11014917244513829


Epoch:  32%|███▏      | 32/100 [03:38<07:34,  6.68s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681, 0.11212937906384468, 0.11368871852755547, 0.11232185612122218, 0.11014917244513829]
Dev loss: 0.10988780111074448


Epoch:  33%|███▎      | 33/100 [03:45<07:32,  6.75s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681, 0.11212937906384468, 0.11368871852755547, 0.11232185612122218, 0.11014917244513829, 0.10988780111074448]
Dev loss: 0.10884868974486987


Epoch:  34%|███▍      | 34/100 [03:52<07:28,  6.80s/it]



Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681, 0.11212937906384468, 0.11368871852755547, 0.11232185612122218, 0.11014917244513829, 0.10988780111074448, 0.10884868974486987]
Dev loss: 0.1075655755897363


Epoch:  35%|███▌      | 35/100 [03:59<07:24,  6.83s/it]

Epoch:  36%|███▌      | 36/100 [04:05<07:09,  6.72s/it]


Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681, 0.11212937906384468, 0.11368871852755547, 0.11232185612122218, 0.11014917244513829, 0.10988780111074448, 0.10884868974486987, 0.1075655755897363]
Dev loss: 0.10782802974184354


Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681, 0.11212937906384468, 0.11368871852755547, 0.11232185612122218, 0.11014917244513829, 0.10988780111074448, 0.10884868974486987, 0.1075655755897363, 0.10782802974184354]
Dev loss: 0.10494858274857204


Epoch:  37%|███▋      | 37/100 [04:12<07:06,  6.78s/it]

Epoch:  38%|███▊      | 38/100 [04:19<06:54,  6.68s/it]


Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681, 0.11212937906384468, 0.11368871852755547, 0.11232185612122218, 0.11014917244513829, 0.10988780111074448, 0.10884868974486987, 0.1075655755897363, 0.10782802974184354, 0.10494858274857204]
Dev loss: 0.10647786408662796


Epoch:  39%|███▉      | 39/100 [04:25<06:43,  6.61s/it]


Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681, 0.11212937906384468, 0.11368871852755547, 0.11232185612122218, 0.11014917244513829, 0.10988780111074448, 0.10884868974486987, 0.1075655755897363, 0.10782802974184354, 0.10494858274857204, 0.10647786408662796]
Dev loss: 0.10702131936947505


Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681, 0.11212937906384468, 0.11368871852755547, 0.11232185612122218, 0.11014917244513829, 0.10988780111074448, 0.10884868974486987, 0.1075655755897363, 0.10782802974184354, 0.10494858274857204, 0.10647786408662796, 0.10702131936947505]
Dev loss: 0.09913409998019536


Epoch:  40%|████      | 40/100 [04:32<06:42,  6.70s/it]

Epoch:  41%|████      | 41/100 [04:39<06:30,  6.63s/it]


Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681, 0.11212937906384468, 0.11368871852755547, 0.11232185612122218, 0.11014917244513829, 0.10988780111074448, 0.10884868974486987, 0.1075655755897363, 0.10782802974184354, 0.10494858274857204, 0.10647786408662796, 0.10702131936947505, 0.09913409998019536]
Dev loss: 0.10386243090033531


Epoch:  42%|████▏     | 42/100 [04:45<06:21,  6.58s/it]


Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681, 0.11212937906384468, 0.11368871852755547, 0.11232185612122218, 0.11014917244513829, 0.10988780111074448, 0.10884868974486987, 0.1075655755897363, 0.10782802974184354, 0.10494858274857204, 0.10647786408662796, 0.10702131936947505, 0.09913409998019536, 0.10386243090033531]
Dev loss: 0.1019996168712775


Epoch:  43%|████▎     | 43/100 [04:51<06:12,  6.54s/it]


Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681, 0.11212937906384468, 0.11368871852755547, 0.11232185612122218, 0.11014917244513829, 0.10988780111074448, 0.10884868974486987, 0.1075655755897363, 0.10782802974184354, 0.10494858274857204, 0.10647786408662796, 0.10702131936947505, 0.09913409998019536, 0.10386243090033531, 0.1019996168712775]
Dev loss: 0.10209232941269875


Epoch:  44%|████▍     | 44/100 [04:58<06:04,  6.51s/it]


Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681, 0.11212937906384468, 0.11368871852755547, 0.11232185612122218, 0.11014917244513829, 0.10988780111074448, 0.10884868974486987, 0.1075655755897363, 0.10782802974184354, 0.10494858274857204, 0.10647786408662796, 0.10702131936947505, 0.09913409998019536, 0.10386243090033531, 0.1019996168712775, 0.10209232941269875]
Dev loss: 0.10242579504847527


Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681, 0.11212937906384468, 0.11368871852755547, 0.11232185612122218, 0.11014917244513829, 0.10988780111074448, 0.10884868974486987, 0.1075655755897363, 0.10782802974184354, 0.10494858274857204, 0.10647786408662796, 0.10702131936947505, 0.09913409998019536, 0.10386243090033531, 0.1019996168712775, 0.10209232941269875, 0.10242579504847527]
Dev loss: 0.09791096299886703


Epoch:  45%|████▌     | 45/100 [05:05<06:04,  6.63s/it]

Epoch:  46%|████▌     | 46/100 [05:11<05:55,  6.58s/it]


Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681, 0.11212937906384468, 0.11368871852755547, 0.11232185612122218, 0.11014917244513829, 0.10988780111074448, 0.10884868974486987, 0.1075655755897363, 0.10782802974184354, 0.10494858274857204, 0.10647786408662796, 0.10702131936947505, 0.09913409998019536, 0.10386243090033531, 0.1019996168712775, 0.10209232941269875, 0.10242579504847527, 0.09791096299886703]
Dev loss: 0.09901359180609386


Epoch:  47%|████▋     | 47/100 [05:18<05:46,  6.54s/it]


Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681, 0.11212937906384468, 0.11368871852755547, 0.11232185612122218, 0.11014917244513829, 0.10988780111074448, 0.10884868974486987, 0.1075655755897363, 0.10782802974184354, 0.10494858274857204, 0.10647786408662796, 0.10702131936947505, 0.09913409998019536, 0.10386243090033531, 0.1019996168712775, 0.10209232941269875, 0.10242579504847527, 0.09791096299886703, 0.09901359180609386]
Dev loss: 0.10092383374770482


Epoch:  48%|████▊     | 48/100 [05:24<05:38,  6.52s/it]


Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681, 0.11212937906384468, 0.11368871852755547, 0.11232185612122218, 0.11014917244513829, 0.10988780111074448, 0.10884868974486987, 0.1075655755897363, 0.10782802974184354, 0.10494858274857204, 0.10647786408662796, 0.10702131936947505, 0.09913409998019536, 0.10386243090033531, 0.1019996168712775, 0.10209232941269875, 0.10242579504847527, 0.09791096299886703, 0.09901359180609386, 0.10092383374770482]
Dev loss: 0

Epoch:  49%|████▉     | 49/100 [05:31<05:31,  6.50s/it]


Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681, 0.11212937906384468, 0.11368871852755547, 0.11232185612122218, 0.11014917244513829, 0.10988780111074448, 0.10884868974486987, 0.1075655755897363, 0.10782802974184354, 0.10494858274857204, 0.10647786408662796, 0.10702131936947505, 0.09913409998019536, 0.10386243090033531, 0.1019996168712775, 0.10209232941269875, 0.10242579504847527, 0.09791096299886703, 0.09901359180609386, 0.10092383374770482, 0.102020146

Loss history: [0.6063797970612844, 0.4430684943993886, 0.34838584065437317, 0.2993813951810201, 0.2680129408836365, 0.2480297933022181, 0.23315516610940298, 0.21830538660287857, 0.20367339750130972, 0.19177435586849848, 0.18002596000830332, 0.1708356315890948, 0.16365011284748712, 0.15577391038338342, 0.15119144320487976, 0.1452421322464943, 0.1424237216512362, 0.1370781809091568, 0.13258896519740423, 0.131445050239563, 0.12771031384666762, 0.12630400309960046, 0.12201684340834618, 0.1251875969270865, 0.11935119827588399, 0.11456630006432533, 0.11606102188428243, 0.11616700390974681, 0.11212937906384468, 0.11368871852755547, 0.11232185612122218, 0.11014917244513829, 0.10988780111074448, 0.10884868974486987, 0.1075655755897363, 0.10782802974184354, 0.10494858274857204, 0.10647786408662796, 0.10702131936947505, 0.09913409998019536, 0.10386243090033531, 0.1019996168712775, 0.10209232941269875, 0.10242579504847527, 0.09791096299886703, 0.09901359180609386, 0.10092383374770482, 0.102020146

I0404 19:07:01.744144 140043817379648 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0404 19:07:01.745219 140043817379648 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd

Data size: 139
P: 121 / 138 = 0.8768115942028986
R: 121 / 166 = 0.7289156626506024
F: 0.7960526315789472
A: 0.6546762589928058
AL1: 0.7697841726618705
Train size: 461
Final train dataloader length: 29
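
The evaluation figures above can be re-derived from the printed counts. The sketch below assumes that `P` and `R` are micro-averaged precision and recall over individual label decisions (121 correct labels out of 138 predicted and 166 gold labels), and that the dataloader keeps the last partial batch. The helper name `micro_prf` is hypothetical and not part of `quillnlp`.

```python
import math


def micro_prf(correct, predicted, gold):
    """Return (precision, recall, F1) from raw label counts.

    Hypothetical helper: assumes micro-averaging over label decisions.
    """
    precision = correct / predicted if predicted else 0.0
    recall = correct / gold if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts taken from the output above.
p, r, f = micro_prf(correct=121, predicted=138, gold=166)
print(f"P: {p:.4f}  R: {r:.4f}  F: {f:.4f}")

# 461 training items with BATCH_SIZE = 16 (the bert-base setting),
# assuming the last partial batch is not dropped.
steps_per_epoch = math.ceil(461 / 16)
print("Batches per epoch:", steps_per_epoch)
```

Under these assumptions the recomputed values agree with the logged `P`, `R`, `F` and with the reported dataloader length of 29.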


I0404 19:07:07.828162 140043817379648 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0404 19:07:07.829785 140043817379648 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd

Loss history: []
Dev loss: 0.605769564708074



Epoch:   1%|          | 1/100 [00:06<11:24,  6.92s/it]

Loss history: [0.605769564708074]
Dev loss: 0.4612632393836975



Epoch:   2%|▏         | 2/100 [00:13<11:17,  6.91s/it]

Loss history: [0.605769564708074, 0.4612632393836975]
Dev loss: 0.3484412282705307



Epoch:   3%|▎         | 3/100 [00:20<11:10,  6.91s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307]
Dev loss: 0.29717086255550385



Epoch:   4%|▍         | 4/100 [00:27<11:03,  6.92s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385]
Dev loss: 0.26731228331724805



Epoch:   5%|▌         | 5/100 [00:34<10:56,  6.91s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805]
Dev loss: 0.24957897265752158



Epoch:   6%|▌         | 6/100 [00:41<10:50,  6.92s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158]
Dev loss: 0.2357658992211024



Epoch:   7%|▋         | 7/100 [00:48<10:43,  6.92s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024]
Dev loss: 0.22482032080491385



Epoch:   8%|▊         | 8/100 [00:55<10:36,  6.92s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385]
Dev loss: 0.21327275782823563



Epoch:   9%|▉         | 9/100 [01:02<10:29,  6.92s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563]
Dev loss: 0.20361091196537018



Epoch:  10%|█         | 10/100 [01:09<10:22,  6.92s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018]
Dev loss: 0.19357392440239587



Epoch:  11%|█         | 11/100 [01:16<10:15,  6.92s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587]
Dev loss: 0.18363535155852637



Epoch:  12%|█▏        | 12/100 [01:22<10:08,  6.92s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637]
Dev loss: 0.1735388288895289



Epoch:  13%|█▎        | 13/100 [01:29<10:01,  6.92s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289]
Dev loss: 0.1645234872897466



Epoch:  14%|█▍        | 14/100 [01:36<09:54,  6.92s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466]
Dev loss: 0.15794453273216882



Epoch:  15%|█▌        | 15/100 [01:43<09:47,  6.92s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882]
Dev loss: 0.1493313138683637



Epoch:  16%|█▌        | 16/100 [01:50<09:40,  6.92s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637]
Dev loss: 0.1438723181684812



Epoch:  17%|█▋        | 17/100 [01:57<09:34,  6.92s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812]
Dev loss: 0.14142207925518355



Epoch:  18%|█▊        | 18/100 [02:04<09:27,  6.91s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355]
Dev loss: 0.13510849823554358



Epoch:  19%|█▉        | 19/100 [02:11<09:20,  6.92s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358]
Dev loss: 0.13173267617821693



Epoch:  20%|██        | 20/100 [02:18<09:13,  6.92s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693]
Dev loss: 0.1275815355281035



Epoch:  21%|██        | 21/100 [02:25<09:06,  6.92s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035]
Dev loss: 0.1236833209792773



Epoch:  22%|██▏       | 22/100 [02:32<08:59,  6.92s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773]
Dev loss: 0.12003363420565923



Epoch:  23%|██▎       | 23/100 [02:39<08:52,  6.92s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923]
Dev loss: 0.11891685674587886



Epoch:  24%|██▍       | 24/100 [02:45<08:45,  6.92s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886]
Dev loss: 0.11540696894129117



Epoch:  25%|██▌       | 25/100 [02:52<08:38,  6.92s/it]

Epoch:  26%|██▌       | 26/100 [02:59<08:21,  6.78s/it]


Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117]
Dev loss: 0.11576371019085248


Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248]
Dev loss: 0.1132194995880127



Epoch:  27%|██▋       | 27/100 [03:06<08:17,  6.82s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127]
Dev loss: 0.10920468717813492



Epoch:  28%|██▊       | 28/100 [03:13<08:13,  6.85s/it]

Epoch:  29%|██▉       | 29/100 [03:19<07:58,  6.73s/it]


Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492]
Dev loss: 0.10964834193388621


Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621]
Dev loss: 0.10718649377425511



Epoch:  30%|███       | 30/100 [03:26<07:55,  6.79s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511]
Dev loss: 0.10522439082463582



Epoch:  31%|███       | 31/100 [03:33<07:50,  6.83s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582]
Dev loss: 0.10399086276690166



Epoch:  32%|███▏      | 32/100 [03:40<07:45,  6.85s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166]
Dev loss: 0.10198018079002698



Epoch:  33%|███▎      | 33/100 [03:47<07:40,  6.87s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698]
Dev loss: 0.10100916276375453



Epoch:  34%|███▍      | 34/100 [03:54<07:34,  6.88s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453]
Dev loss: 0.09902210161089897



Epoch:  35%|███▌      | 35/100 [04:01<07:27,  6.89s/it]

Epoch:  36%|███▌      | 36/100 [04:07<07:12,  6.76s/it]


Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453, 0.09902210161089897]
Dev loss: 0.0992783469458421


Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453, 0.09902210161089897, 0.0992783469458421]
Dev loss: 0.09752156212925911



Epoch:  37%|███▋      | 37/100 [04:14<07:08,  6.81s/it]

Epoch:  38%|███▊      | 38/100 [04:20<06:55,  6.71s/it]


Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453, 0.09902210161089897, 0.0992783469458421, 0.09752156212925911]
Dev loss: 0.09848614285389583


Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453, 0.09902210161089897, 0.0992783469458421, 0.09752156212925911, 0.09848614285389583]
Dev loss: 0.09702384720245998



Epoch:  39%|███▉      | 39/100 [04:27<06:52,  6.77s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453, 0.09902210161089897, 0.0992783469458421, 0.09752156212925911, 0.09848614285389583, 0.09702384720245998]
Dev loss: 0.09578496466080348



Epoch:  40%|████      | 40/100 [04:34<06:48,  6.81s/it]

Epoch:  41%|████      | 41/100 [04:41<06:35,  6.71s/it]


Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453, 0.09902210161089897, 0.0992783469458421, 0.09752156212925911, 0.09848614285389583, 0.09702384720245998, 0.09578496466080348]
Dev loss: 0.09599502260486285


Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453, 0.09902210161089897, 0.0992783469458421, 0.09752156212925911, 0.09848614285389583, 0.09702384720245998, 0.09578496466080348, 0.09599502260486285]
Dev loss: 0.09521357342600822



Epoch:  42%|████▏     | 42/100 [04:48<06:32,  6.77s/it]

Epoch:  43%|████▎     | 43/100 [04:54<06:20,  6.68s/it]


Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453, 0.09902210161089897, 0.0992783469458421, 0.09752156212925911, 0.09848614285389583, 0.09702384720245998, 0.09578496466080348, 0.09599502260486285, 0.09521357342600822]
Dev loss: 0.09541725864013036


Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453, 0.09902210161089897, 0.0992783469458421, 0.09752156212925911, 0.09848614285389583, 0.09702384720245998, 0.09578496466080348, 0.09599502260486285, 0.09521357342600822, 0.09541725864013036]
Dev loss: 0.09348195667068164



Epoch:  44%|████▍     | 44/100 [05:01<06:17,  6.75s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453, 0.09902210161089897, 0.0992783469458421, 0.09752156212925911, 0.09848614285389583, 0.09702384720245998, 0.09578496466080348, 0.09599502260486285, 0.09521357342600822, 0.09541725864013036, 0.09348195667068164]
Dev loss: 0.09311617724597454



Epoch:  45%|████▌     | 45/100 [05:08<06:13,  6.80s/it]

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453, 0.09902210161089897, 0.0992783469458421, 0.09752156212925911, 0.09848614285389583, 0.09702384720245998, 0.09578496466080348, 0.09599502260486285, 0.09521357342600822, 0.09541725864013036, 0.09348195667068164, 0.09311617724597454]
Dev loss: 0.09050437373419602



Epoch:  46%|████▌     | 46/100 [05:15<06:09,  6.84s/it]

Epoch:  47%|████▋     | 47/100 [05:21<05:56,  6.72s/it]


Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453, 0.09902210161089897, 0.0992783469458421, 0.09752156212925911, 0.09848614285389583, 0.09702384720245998, 0.09578496466080348, 0.09599502260486285, 0.09521357342600822, 0.09541725864013036, 0.09348195667068164, 0.09311617724597454, 0.09050437373419602]
Dev loss: 0.09073082854350407


Epoch:  48%|████▊     | 48/100 [05:28<05:45,  6.65s/it]


Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453, 0.09902210161089897, 0.0992783469458421, 0.09752156212925911, 0.09848614285389583, 0.09702384720245998, 0.09578496466080348, 0.09599502260486285, 0.09521357342600822, 0.09541725864013036, 0.09348195667068164, 0.09311617724597454, 0.09050437373419602, 0.09073082854350407]
Dev loss: 

Epoch:  49%|████▉     | 49/100 [05:34<05:36,  6.59s/it]


Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453, 0.09902210161089897, 0.0992783469458421, 0.09752156212925911, 0.09848614285389583, 0.09702384720245998, 0.09578496466080348, 0.09599502260486285, 0.09521357342600822, 0.09541725864013036, 0.09348195667068164, 0.09311617724597454, 0.09050437373419602, 0.09073082854350407, 0.09055501

Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453, 0.09902210161089897, 0.0992783469458421, 0.09752156212925911, 0.09848614285389583, 0.09702384720245998, 0.09578496466080348, 0.09599502260486285, 0.09521357342600822, 0.09541725864013036, 0.09348195667068164, 0.09311617724597454, 0.09050437373419602, 0.09073082854350407, 0.09055501


Epoch:  50%|█████     | 50/100 [05:41<05:34,  6.69s/it]

Epoch:  51%|█████     | 51/100 [05:48<05:24,  6.62s/it]


Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453, 0.09902210161089897, 0.0992783469458421, 0.09752156212925911, 0.09848614285389583, 0.09702384720245998, 0.09578496466080348, 0.09599502260486285, 0.09521357342600822, 0.09541725864013036, 0.09348195667068164, 0.09311617724597454, 0.09050437373419602, 0.09073082854350407, 0.09055501

Epoch:  52%|█████▏    | 52/100 [05:54<05:15,  6.57s/it]


Epoch:  53%|█████▎    | 53/100 [06:01<05:13,  6.67s/it]

Epoch:  54%|█████▍    | 54/100 [06:08<05:04,  6.61s/it]


Epoch:  55%|█████▌    | 55/100 [06:14<05:01,  6.70s/it]

Epoch:  56%|█████▌    | 56/100 [06:21<04:51,  6.63s/it]


Epoch:  57%|█████▋    | 57/100 [06:28<04:48,  6.71s/it]

Epoch:  58%|█████▊    | 58/100 [06:35<04:44,  6.77s/it]

Epoch:  59%|█████▉    | 59/100 [06:41<04:33,  6.68s/it]


Epoch:  60%|██████    | 60/100 [06:48<04:30,  6.75s/it]

Epoch:  61%|██████    | 61/100 [06:55<04:25,  6.80s/it]

Epoch:  62%|██████▏   | 62/100 [07:01<04:14,  6.70s/it]


Epoch:  63%|██████▎   | 63/100 [07:08<04:10,  6.76s/it]

Epoch:  64%|██████▍   | 64/100 [07:15<04:04,  6.80s/it]

Epoch:  65%|██████▌   | 65/100 [07:22<03:54,  6.70s/it]


Epoch:  66%|██████▌   | 66/100 [07:28<03:45,  6.63s/it]


Epoch:  67%|██████▋   | 67/100 [07:35<03:37,  6.58s/it]


Epoch:  68%|██████▊   | 68/100 [07:41<03:29,  6.54s/it]


Epoch:  69%|██████▉   | 69/100 [07:48<03:26,  6.65s/it]

Epoch:  70%|███████   | 70/100 [07:54<03:17,  6.60s/it]


Epoch:  71%|███████   | 71/100 [08:01<03:10,  6.56s/it]


Epoch:  72%|███████▏  | 72/100 [08:07<03:02,  6.53s/it]


Epoch:  73%|███████▎  | 73/100 [08:14<03:00,  6.67s/it]

Epoch:  74%|███████▍  | 74/100 [08:21<02:55,  6.75s/it]

Epoch:  75%|███████▌  | 75/100 [08:28<02:46,  6.66s/it]


Epoch:  76%|███████▌  | 76/100 [08:34<02:38,  6.60s/it]


Epoch:  77%|███████▋  | 77/100 [08:41<02:30,  6.56s/it]


Epoch:  78%|███████▊  | 78/100 [08:48<02:26,  6.67s/it][A

HBox(children=(IntProgress(value=0, description='Training iteration', max=29, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=6, style=ProgressStyle(description…


Epoch:  79%|███████▉  | 79/100 [08:54<02:18,  6.61s/it][A


Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453, 0.09902210161089897, 0.0992783469458421, 0.09752156212925911, 0.09848614285389583, 0.09702384720245998, 0.09578496466080348, 0.09599502260486285, 0.09521357342600822, 0.09541725864013036, 0.09348195667068164, 0.09311617724597454, 0.09050437373419602, 0.09073082854350407, 0.09055501

HBox(children=(IntProgress(value=0, description='Training iteration', max=29, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=6, style=ProgressStyle(description…


Epoch:  80%|████████  | 80/100 [09:01<02:11,  6.56s/it][A


Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453, 0.09902210161089897, 0.0992783469458421, 0.09752156212925911, 0.09848614285389583, 0.09702384720245998, 0.09578496466080348, 0.09599502260486285, 0.09521357342600822, 0.09541725864013036, 0.09348195667068164, 0.09311617724597454, 0.09050437373419602, 0.09073082854350407, 0.09055501

HBox(children=(IntProgress(value=0, description='Training iteration', max=29, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=6, style=ProgressStyle(description…


Epoch:  81%|████████  | 81/100 [09:07<02:04,  6.53s/it][A


Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453, 0.09902210161089897, 0.0992783469458421, 0.09752156212925911, 0.09848614285389583, 0.09702384720245998, 0.09578496466080348, 0.09599502260486285, 0.09521357342600822, 0.09541725864013036, 0.09348195667068164, 0.09311617724597454, 0.09050437373419602, 0.09073082854350407, 0.09055501

HBox(children=(IntProgress(value=0, description='Training iteration', max=29, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=6, style=ProgressStyle(description…


Epoch:  82%|████████▏ | 82/100 [09:13<01:57,  6.51s/it][A


Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453, 0.09902210161089897, 0.0992783469458421, 0.09752156212925911, 0.09848614285389583, 0.09702384720245998, 0.09578496466080348, 0.09599502260486285, 0.09521357342600822, 0.09541725864013036, 0.09348195667068164, 0.09311617724597454, 0.09050437373419602, 0.09073082854350407, 0.09055501

HBox(children=(IntProgress(value=0, description='Training iteration', max=29, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=6, style=ProgressStyle(description…


Loss history: [0.605769564708074, 0.4612632393836975, 0.3484412282705307, 0.29717086255550385, 0.26731228331724805, 0.24957897265752158, 0.2357658992211024, 0.22482032080491385, 0.21327275782823563, 0.20361091196537018, 0.19357392440239587, 0.18363535155852637, 0.1735388288895289, 0.1645234872897466, 0.15794453273216882, 0.1493313138683637, 0.1438723181684812, 0.14142207925518355, 0.13510849823554358, 0.13173267617821693, 0.1275815355281035, 0.1236833209792773, 0.12003363420565923, 0.11891685674587886, 0.11540696894129117, 0.11576371019085248, 0.1132194995880127, 0.10920468717813492, 0.10964834193388621, 0.10718649377425511, 0.10522439082463582, 0.10399086276690166, 0.10198018079002698, 0.10100916276375453, 0.09902210161089897, 0.0992783469458421, 0.09752156212925911, 0.09848614285389583, 0.09702384720245998, 0.09578496466080348, 0.09599502260486285, 0.09521357342600822, 0.09541725864013036, 0.09348195667068164, 0.09311617724597454, 0.09050437373419602, 0.09073082854350407, 0.09055501

I0404 19:16:31.096741 140043817379648 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0404 19:16:31.097923 140043817379648 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd



Data size: 138
P: 118 / 136 = 0.8676470588235294
R: 118 / 155 = 0.7612903225806451
F: 0.8109965635738832
A: 0.7246376811594203
AL1: 0.782608695652174
Train size: 461
Final train dataloader length: 29
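
The figures printed above can be re-derived from the raw counts as a quick sanity check. The sketch below assumes that P and R are micro-averaged over all predicted and gold labels and that F is their harmonic mean; it also assumes the dataloader keeps the final, smaller batch. The variable names are illustrative and are not taken from the notebook's evaluation code.

```python
import math

# Counts copied from the evaluation output above.
correct = 118      # predicted labels that are also gold labels
predicted = 136    # all predicted labels
gold = 155         # all gold labels

precision = correct / predicted
recall = correct / gold
# Assumption: F is the harmonic mean of P and R (F1).
f1 = 2 * precision * recall / (precision + recall)
print(f"P: {precision:.4f}  R: {recall:.4f}  F: {f1:.4f}")

# Assumption: the last, incomplete batch is kept rather than dropped.
train_size = 461
batch_size = 16    # BATCH_SIZE for bert-base-uncased
batches_per_epoch = math.ceil(train_size / batch_size)
print("Batches per epoch:", batches_per_epoch)
```

Under these assumptions the script reproduces the logged F-score of about 0.811 and the 29 training batches per epoch.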


I0404 19:16:36.810857 140043817379648 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685



Loss history: []
Dev loss: 0.5897560318311056




Epoch:   1%|          | 1/100 [00:06<11:23,  6.90s/it]


Loss history: [0.5897560318311056]
Dev loss: 0.42438603440920514




Epoch:   2%|▏         | 2/100 [00:13<11:16,  6.90s/it]


Loss history: [0.5897560318311056, 0.42438603440920514]
Dev loss: 0.3408896128336589




Epoch:   3%|▎         | 3/100 [00:20<11:09,  6.90s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589]
Dev loss: 0.2958464175462723




Epoch:   4%|▍         | 4/100 [00:27<11:02,  6.91s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723]
Dev loss: 0.2669556587934494




Epoch:   5%|▌         | 5/100 [00:34<10:56,  6.91s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494]
Dev loss: 0.24936150014400482




Epoch:   6%|▌         | 6/100 [00:41<10:49,  6.91s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482]
Dev loss: 0.23706879963477454




Epoch:   7%|▋         | 7/100 [00:48<10:43,  6.91s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454]
Dev loss: 0.2241834501425425




Epoch:   8%|▊         | 8/100 [00:55<10:36,  6.92s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425]
Dev loss: 0.20994778722524643




Epoch:   9%|▉         | 9/100 [01:02<10:29,  6.92s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643]
Dev loss: 0.1965125153462092




Epoch:  10%|█         | 10/100 [01:09<10:22,  6.92s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092]
Dev loss: 0.18459401031335196




Epoch:  11%|█         | 11/100 [01:16<10:15,  6.92s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196]
Dev loss: 0.17228167752424875




Epoch:  12%|█▏        | 12/100 [01:22<10:08,  6.92s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875]
Dev loss: 0.1610290432969729




Epoch:  13%|█▎        | 13/100 [01:29<10:01,  6.92s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729]
Dev loss: 0.15713689724604288




Epoch:  14%|█▍        | 14/100 [01:36<09:54,  6.91s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288]
Dev loss: 0.1487144815425078




Epoch:  15%|█▌        | 15/100 [01:43<09:47,  6.91s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078]
Dev loss: 0.14096936086813608




Epoch:  16%|█▌        | 16/100 [01:50<09:40,  6.92s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608]
Dev loss: 0.1378570074836413




Epoch:  17%|█▋        | 17/100 [01:57<09:34,  6.92s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413]
Dev loss: 0.1331882749994596




Epoch:  18%|█▊        | 18/100 [02:04<09:27,  6.92s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596]
Dev loss: 0.12955481062332788




Epoch:  19%|█▉        | 19/100 [02:11<09:20,  6.91s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788]
Dev loss: 0.12415460248788197




Epoch:  20%|██        | 20/100 [02:18<09:13,  6.92s/it]



Epoch:  21%|██        | 21/100 [02:24<08:55,  6.78s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197]
Dev loss: 0.124529713143905




Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905]
Dev loss: 0.12019181996583939




Epoch:  22%|██▏       | 22/100 [02:31<08:52,  6.82s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939]
Dev loss: 0.1182880848646164




Epoch:  23%|██▎       | 23/100 [02:38<08:47,  6.85s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164]
Dev loss: 0.11627718185385068




Epoch:  24%|██▍       | 24/100 [02:45<08:42,  6.87s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068]
Dev loss: 0.11467846234639485




Epoch:  25%|██▌       | 25/100 [02:52<08:36,  6.89s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485]
Dev loss: 0.11249915137887001




Epoch:  26%|██▌       | 26/100 [02:59<08:30,  6.90s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001]
Dev loss: 0.11126840859651566




Epoch:  27%|██▋       | 27/100 [03:06<08:23,  6.90s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566]
Dev loss: 0.10758234187960625




Epoch:  28%|██▊       | 28/100 [03:13<08:17,  6.91s/it]



Epoch:  29%|██▉       | 29/100 [03:19<08:00,  6.77s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625]
Dev loss: 0.10868677869439125




Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125]
Dev loss: 0.10692776615420978




Epoch:  30%|███       | 30/100 [03:26<07:57,  6.82s/it]



Epoch:  31%|███       | 31/100 [03:33<07:43,  6.71s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978]
Dev loss: 0.10727763672669728




Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728]
Dev loss: 0.10536192605892818




Epoch:  32%|███▏      | 32/100 [03:39<07:40,  6.77s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818]
Dev loss: 0.10303825760881107




Epoch:  33%|███▎      | 33/100 [03:46<07:36,  6.82s/it]




Epoch:  34%|███▍      | 34/100 [03:53<07:22,  6.71s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107]
Dev loss: 0.1032723958293597




Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597]
Dev loss: 0.10134127736091614




Epoch:  35%|███▌      | 35/100 [04:00<07:20,  6.77s/it]



Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614]
Dev loss: 0.10091410701473554




Epoch:  36%|███▌      | 36/100 [04:07<07:16,  6.82s/it]



Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554]
Dev loss: 0.10060369223356247




Epoch:  37%|███▋      | 37/100 [04:14<07:11,  6.84s/it]



Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247]
Dev loss: 0.10051995515823364




Epoch:  38%|███▊      | 38/100 [04:20<07:05,  6.86s/it]



Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364]
Dev loss: 0.09876178453365962




Epoch:  39%|███▉      | 39/100 [04:27<06:59,  6.88s/it]



Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364, 0.09876178453365962]
Dev loss: 0.09869231780370076




Epoch:  40%|████      | 40/100 [04:34<06:53,  6.89s/it]



Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364, 0.09876178453365962, 0.09869231780370076]
Dev loss: 0.09831515947977702




Epoch:  41%|████      | 41/100 [04:41<06:46,  6.90s/it]



Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364, 0.09876178453365962, 0.09869231780370076, 0.09831515947977702]
Dev loss: 0.09729117155075073




Epoch:  42%|████▏     | 42/100 [04:48<06:40,  6.90s/it]




Epoch:  43%|████▎     | 43/100 [04:55<06:25,  6.77s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364, 0.09876178453365962, 0.09869231780370076, 0.09831515947977702, 0.09729117155075073]
Dev loss: 0.10025986408193906





Epoch:  44%|████▍     | 44/100 [05:01<06:13,  6.68s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364, 0.09876178453365962, 0.09869231780370076, 0.09831515947977702, 0.09729117155075073, 0.10025986408193906]
Dev loss: 0.09759494662284851





Epoch:  45%|████▌     | 45/100 [05:08<06:03,  6.61s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364, 0.09876178453365962, 0.09869231780370076, 0.09831515947977702, 0.09729117155075073, 0.10025986408193906, 0.09759494662284851]
Dev loss: 0.09765498836835225




Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364, 0.09876178453365962, 0.09869231780370076, 0.09831515947977702, 0.09729117155075073, 0.10025986408193906, 0.09759494662284851, 0.09765498836835225]
Dev loss: 0.09626502481599648




Epoch:  46%|████▌     | 46/100 [05:14<06:02,  6.70s/it]



Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364, 0.09876178453365962, 0.09869231780370076, 0.09831515947977702, 0.09729117155075073, 0.10025986408193906, 0.09759494662284851, 0.09765498836835225, 0.09626502481599648]
Dev loss: 0.09477770576874416




Epoch:  47%|████▋     | 47/100 [05:21<05:58,  6.77s/it]




Epoch:  48%|████▊     | 48/100 [05:28<05:47,  6.68s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364, 0.09876178453365962, 0.09869231780370076, 0.09831515947977702, 0.09729117155075073, 0.10025986408193906, 0.09759494662284851, 0.09765498836835225, 0.09626502481599648, 0.09477770576874416]
Dev loss: 0




Epoch:  49%|████▉     | 49/100 [05:34<05:37,  6.61s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364, 0.09876178453365962, 0.09869231780370076, 0.09831515947977702, 0.09729117155075073, 0.10025986408193906, 0.09759494662284851, 0.09765498836835225, 0.09626502481599648, 0.09477770576874416, 0.096774825



Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364, 0.09876178453365962, 0.09869231780370076, 0.09831515947977702, 0.09729117155075073, 0.10025986408193906, 0.09759494662284851, 0.09765498836835225, 0.09626502481599648, 0.09477770576874416, 0.096774825



Epoch:  50%|█████     | 50/100 [05:41<05:35,  6.70s/it]




Epoch:  51%|█████     | 51/100 [05:48<05:24,  6.63s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364, 0.09876178453365962, 0.09869231780370076, 0.09831515947977702, 0.09729117155075073, 0.10025986408193906, 0.09759494662284851, 0.09765498836835225, 0.09626502481599648, 0.09477770576874416, 0.096774825



Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364, 0.09876178453365962, 0.09869231780370076, 0.09831515947977702, 0.09729117155075073, 0.10025986408193906, 0.09759494662284851, 0.09765498836835225, 0.09626502481599648, 0.09477770576874416, 0.096774825



Epoch:  52%|█████▏    | 52/100 [05:55<05:22,  6.72s/it]



Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364, 0.09876178453365962, 0.09869231780370076, 0.09831515947977702, 0.09729117155075073, 0.10025986408193906, 0.09759494662284851, 0.09765498836835225, 0.09626502481599648, 0.09477770576874416, 0.096774825



Epoch:  53%|█████▎    | 53/100 [06:02<05:18,  6.78s/it]




Epoch:  54%|█████▍    | 54/100 [06:08<05:07,  6.68s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364, 0.09876178453365962, 0.09869231780370076, 0.09831515947977702, 0.09729117155075073, 0.10025986408193906, 0.09759494662284851, 0.09765498836835225, 0.09626502481599648, 0.09477770576874416, 0.096774825



Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364, 0.09876178453365962, 0.09869231780370076, 0.09831515947977702, 0.09729117155075073, 0.10025986408193906, 0.09759494662284851, 0.09765498836835225, 0.09626502481599648, 0.09477770576874416, 0.096774825



Epoch:  55%|█████▌    | 55/100 [06:15<05:03,  6.75s/it]




Epoch:  56%|█████▌    | 56/100 [06:21<04:53,  6.67s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364, 0.09876178453365962, 0.09869231780370076, 0.09831515947977702, 0.09729117155075073, 0.10025986408193906, 0.09759494662284851, 0.09765498836835225, 0.09626502481599648, 0.09477770576874416, 0.096774825




Epoch:  57%|█████▋    | 57/100 [06:28<04:43,  6.60s/it]


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364, 0.09876178453365962, 0.09869231780370076, 0.09831515947977702, 0.09729117155075073, 0.10025986408193906, 0.09759494662284851, 0.09765498836835225, 0.09626502481599648, 0.09477770576874416, 0.096774825



Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364, 0.09876178453365962, 0.09869231780370076, 0.09831515947977702, 0.09729117155075073, 0.10025986408193906, 0.09759494662284851, 0.09765498836835225, 0.09626502481599648, 0.09477770576874416, 0.096774825



Epoch:  58%|█████▊    | 58/100 [06:35<04:41,  6.70s/it]



Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364, 0.09876178453365962, 0.09869231780370076, 0.09831515947977702, 0.09729117155075073, 0.10025986408193906, 0.09759494662284851, 0.09765498836835225, 0.09626502481599648, 0.09477770576874416, 0.096774825



Epoch:  59%|█████▉    | 59/100 [06:42<04:37,  6.76s/it]




Epoch:  60%|██████    | 60/100 [06:48<04:26,  6.67s/it][A[A


Loss history: [0.5897560318311056, 0.42438603440920514, 0.3408896128336589, 0.2958464175462723, 0.2669556587934494, 0.24936150014400482, 0.23706879963477454, 0.2241834501425425, 0.20994778722524643, 0.1965125153462092, 0.18459401031335196, 0.17228167752424875, 0.1610290432969729, 0.15713689724604288, 0.1487144815425078, 0.14096936086813608, 0.1378570074836413, 0.1331882749994596, 0.12955481062332788, 0.12415460248788197, 0.124529713143905, 0.12019181996583939, 0.1182880848646164, 0.11627718185385068, 0.11467846234639485, 0.11249915137887001, 0.11126840859651566, 0.10758234187960625, 0.10868677869439125, 0.10692776615420978, 0.10727763672669728, 0.10536192605892818, 0.10303825760881107, 0.1032723958293597, 0.10134127736091614, 0.10091410701473554, 0.10060369223356247, 0.10051995515823364, 0.09876178453365962, 0.09869231780370076, 0.09831515947977702, 0.09729117155075073, 0.10025986408193906, 0.09759494662284851, 0.09765498836835225, 0.09626502481599648, 0.09477770576874416, 0.096774825

Epoch:  61%|██████    | 61/100 [06:55<04:17,  6.61s/it]

Epoch:  62%|██████▏   | 62/100 [07:01<04:09,  6.57s/it]

Epoch:  63%|██████▎   | 63/100 [07:07<04:01,  6.54s/it]


I0404 19:23:53.937473 140043817379648 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0404 19:23:53.939065 140043817379648 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd


Data size: 138
P: 119 / 136 = 0.875
R: 119 / 164 = 0.725609756097561
F: 0.7933333333333333
A: 0.6666666666666666
AL1: 0.7681159420289855
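
The precision, recall and F-score reported above are consistent with micro-averaged scores computed over individual labels. The following is a minimal sketch, assuming that 119 is the number of correctly predicted labels, 136 the total number of predicted labels and 164 the total number of gold labels. The function name is hypothetical and is not part of `quillnlp`.

```python
def precision_recall_f1(correct, predicted, gold):
    """Micro-averaged precision, recall and F1 from raw label counts.

    Assumption: `correct` counts labels that are both predicted and gold,
    `predicted` counts all predicted labels, `gold` counts all gold labels.
    """
    precision = correct / predicted if predicted else 0.0
    recall = correct / gold if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts taken from the "P:" and "R:" lines of the evaluation output.
p, r, f = precision_recall_f1(correct=119, predicted=136, gold=164)
print(f"P: {p}\nR: {r}\nF: {f}")
```

Under this reading, the F-score is the harmonic mean of P and R, which equals 2 * 119 / (136 + 164).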
Train size: 461
Final train dataloader length: 29
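
The dataloader length follows from the training set size and the batch size. The following is a minimal sketch, assuming `BATCH_SIZE = 16` from the configuration cell and assuming the last, incomplete batch is kept rather than dropped. The helper name is hypothetical.

```python
import math


def num_batches(n_examples, batch_size):
    """Number of batches when the final partial batch is kept (assumption)."""
    return math.ceil(n_examples / batch_size)


BATCH_SIZE = 16  # value used for bert-base-uncased in the configuration cell
train_batches = num_batches(461, BATCH_SIZE)
print(train_batches)
```

With 461 training items this gives 28 full batches plus one batch of 13 items.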


I0404 19:23:59.856125 140043817379648 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0404 19:23:59.857311 140043817379648 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd


Loss history: []
Dev loss: 0.5703889032204946





Epoch:   1%|          | 1/100 [00:06<11:23,  6.91s/it]

Loss history: [0.5703889032204946]
Dev loss: 0.45168088873227435





Epoch:   2%|▏         | 2/100 [00:13<11:17,  6.91s/it]

Loss history: [0.5703889032204946, 0.45168088873227435]
Dev loss: 0.3713988810777664





Epoch:   3%|▎         | 3/100 [00:20<11:10,  6.91s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664]
Dev loss: 0.31752805908521015





Epoch:   4%|▍         | 4/100 [00:27<11:03,  6.91s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015]
Dev loss: 0.2832884540160497





Epoch:   5%|▌         | 5/100 [00:34<10:56,  6.92s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497]
Dev loss: 0.2619638790686925





Epoch:   6%|▌         | 6/100 [00:41<10:50,  6.92s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925]
Dev loss: 0.2481021930774053





Epoch:   7%|▋         | 7/100 [00:48<10:42,  6.91s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053]
Dev loss: 0.23104599366585413





Epoch:   8%|▊         | 8/100 [00:55<10:36,  6.92s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413]
Dev loss: 0.20977324495712915





Epoch:   9%|▉         | 9/100 [01:02<10:29,  6.92s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915]
Dev loss: 0.19478107243776321





Epoch:  10%|█         | 10/100 [01:09<10:22,  6.92s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321]
Dev loss: 0.18068071703116098





Epoch:  11%|█         | 11/100 [01:16<10:15,  6.92s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098]
Dev loss: 0.17129809161027273





Epoch:  12%|█▏        | 12/100 [01:23<10:08,  6.92s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273]
Dev loss: 0.15991338839133581





Epoch:  13%|█▎        | 13/100 [01:29<10:02,  6.92s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581]
Dev loss: 0.15267540762821832





Epoch:  14%|█▍        | 14/100 [01:36<09:55,  6.92s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832]
Dev loss: 0.14439847196141878





Epoch:  15%|█▌        | 15/100 [01:43<09:48,  6.92s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878]
Dev loss: 0.14095059037208557





Epoch:  16%|█▌        | 16/100 [01:50<09:41,  6.92s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557]
Dev loss: 0.13442849616209665





Epoch:  17%|█▋        | 17/100 [01:57<09:34,  6.92s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665]
Dev loss: 0.1295055498679479





Epoch:  18%|█▊        | 18/100 [02:04<09:27,  6.92s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479]
Dev loss: 0.12575887019435564





Epoch:  19%|█▉        | 19/100 [02:11<09:20,  6.92s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564]
Dev loss: 0.12197569881876309





Epoch:  20%|██        | 20/100 [02:18<09:13,  6.92s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309]
Dev loss: 0.11888336762785912





Epoch:  21%|██        | 21/100 [02:25<09:06,  6.92s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912]
Dev loss: 0.11620859305063884





Epoch:  22%|██▏       | 22/100 [02:32<09:00,  6.92s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884]
Dev loss: 0.1141701266169548





Epoch:  23%|██▎       | 23/100 [02:39<08:53,  6.92s/it]
Epoch:  24%|██▍       | 24/100 [02:45<08:35,  6.78s/it]


Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548]
Dev loss: 0.11486967901388805



Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805]
Dev loss: 0.10919986541072528





Epoch:  25%|██▌       | 25/100 [02:52<08:31,  6.83s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528]
Dev loss: 0.10814833516875903





Epoch:  26%|██▌       | 26/100 [02:59<08:27,  6.85s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903]
Dev loss: 0.10605745017528534





Epoch:  27%|██▋       | 27/100 [03:06<08:21,  6.87s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534]
Dev loss: 0.10507416725158691





Epoch:  28%|██▊       | 28/100 [03:13<08:15,  6.89s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691]
Dev loss: 0.10250762601693471





Epoch:  29%|██▉       | 29/100 [03:20<08:09,  6.89s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471]
Dev loss: 0.10089514156182607





Epoch:  30%|███       | 30/100 [03:27<08:03,  6.90s/it]
Epoch:  31%|███       | 31/100 [03:33<07:47,  6.77s/it]


Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607]
Dev loss: 0.10192418346802394



Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394]
Dev loss: 0.09975786631306012





Epoch:  32%|███▏      | 32/100 [03:40<07:43,  6.81s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012]
Dev loss: 0.09757911786437035





Epoch:  33%|███▎      | 33/100 [03:47<07:38,  6.85s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035]
Dev loss: 0.09584837158521016





Epoch:  34%|███▍      | 34/100 [03:54<07:33,  6.87s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016]
Dev loss: 0.09579815591375034

Epoch:  35%|███▌      | 35/100 [04:01<07:27,  6.88s/it]

Epoch:  36%|███▌      | 36/100 [04:07<07:12,  6.76s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034]
Dev loss: 0.09720322241385777


Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777]
Dev loss: 0.0943191647529602

Epoch:  37%|███▋      | 37/100 [04:14<07:08,  6.81s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602]
Dev loss: 0.09252247959375381

Epoch:  38%|███▊      | 38/100 [04:21<07:04,  6.84s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381]
Dev loss: 0.09215047086278598

Epoch:  39%|███▉      | 39/100 [04:28<06:58,  6.87s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598]
Dev loss: 0.09151112847030163

Epoch:  40%|████      | 40/100 [04:35<06:52,  6.88s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163]
Dev loss: 0.09033517663677533

Epoch:  41%|████      | 41/100 [04:42<06:46,  6.90s/it]

Epoch:  42%|████▏     | 42/100 [04:48<06:32,  6.77s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163, 0.09033517663677533]
Dev loss: 0.09209948405623436

Epoch:  43%|████▎     | 43/100 [04:55<06:20,  6.68s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163, 0.09033517663677533, 0.09209948405623436]
Dev loss: 0.09305020049214363

Epoch:  44%|████▍     | 44/100 [05:01<06:10,  6.61s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163, 0.09033517663677533, 0.09209948405623436, 0.09305020049214363]
Dev loss: 0.09060696139931679


Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163, 0.09033517663677533, 0.09209948405623436, 0.09305020049214363, 0.09060696139931679]
Dev loss: 0.08979978784918785

Epoch:  45%|████▌     | 45/100 [05:08<06:08,  6.70s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163, 0.09033517663677533, 0.09209948405623436, 0.09305020049214363, 0.09060696139931679, 0.08979978784918785]
Dev loss: 0.08798108187814553

Epoch:  46%|████▌     | 46/100 [05:15<06:05,  6.77s/it]

Epoch:  47%|████▋     | 47/100 [05:22<05:54,  6.68s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163, 0.09033517663677533, 0.09209948405623436, 0.09305020049214363, 0.09060696139931679, 0.08979978784918785, 0.08798108187814553]
Dev loss: 0.08955100364983

Epoch:  48%|████▊     | 48/100 [05:28<05:44,  6.62s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163, 0.09033517663677533, 0.09209948405623436, 0.09305020049214363, 0.09060696139931679, 0.08979978784918785, 0.08798108187814553, 0.08955100364983082]
Dev l

Epoch:  49%|████▉     | 49/100 [05:34<05:35,  6.57s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163, 0.09033517663677533, 0.09209948405623436, 0.09305020049214363, 0.09060696139931679, 0.08979978784918785, 0.08798108187814553, 0.08955100364983082, 0.088

Epoch:  50%|█████     | 50/100 [05:41<05:26,  6.54s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163, 0.09033517663677533, 0.09209948405623436, 0.09305020049214363, 0.09060696139931679, 0.08979978784918785, 0.08798108187814553, 0.08955100364983082, 0.088

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163, 0.09033517663677533, 0.09209948405623436, 0.09305020049214363, 0.09060696139931679, 0.08979978784918785, 0.08798108187814553, 0.08955100364983082, 0.088




Epoch:  51%|█████     | 51/100 [05:48<05:26,  6.66s/it]

Epoch:  52%|█████▏    | 52/100 [05:54<05:16,  6.60s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163, 0.09033517663677533, 0.09209948405623436, 0.09305020049214363, 0.09060696139931679, 0.08979978784918785, 0.08798108187814553, 0.08955100364983082, 0.088

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163, 0.09033517663677533, 0.09209948405623436, 0.09305020049214363, 0.09060696139931679, 0.08979978784918785, 0.08798108187814553, 0.08955100364983082, 0.088




Epoch:  53%|█████▎    | 53/100 [06:01<05:14,  6.70s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163, 0.09033517663677533, 0.09209948405623436, 0.09305020049214363, 0.09060696139931679, 0.08979978784918785, 0.08798108187814553, 0.08955100364983082, 0.088




Epoch:  54%|█████▍    | 54/100 [06:08<05:11,  6.77s/it]

Epoch:  55%|█████▌    | 55/100 [06:15<05:00,  6.68s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163, 0.09033517663677533, 0.09209948405623436, 0.09305020049214363, 0.09060696139931679, 0.08979978784918785, 0.08798108187814553, 0.08955100364983082, 0.088

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163, 0.09033517663677533, 0.09209948405623436, 0.09305020049214363, 0.09060696139931679, 0.08979978784918785, 0.08798108187814553, 0.08955100364983082, 0.088




Epoch:  56%|█████▌    | 56/100 [06:22<04:56,  6.75s/it]

Epoch:  57%|█████▋    | 57/100 [06:28<04:46,  6.66s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163, 0.09033517663677533, 0.09209948405623436, 0.09305020049214363, 0.09060696139931679, 0.08979978784918785, 0.08798108187814553, 0.08955100364983082, 0.088

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163, 0.09033517663677533, 0.09209948405623436, 0.09305020049214363, 0.09060696139931679, 0.08979978784918785, 0.08798108187814553, 0.08955100364983082, 0.088




Epoch:  58%|█████▊    | 58/100 [06:35<04:43,  6.74s/it]

Epoch:  59%|█████▉    | 59/100 [06:41<04:32,  6.66s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163, 0.09033517663677533, 0.09209948405623436, 0.09305020049214363, 0.09060696139931679, 0.08979978784918785, 0.08798108187814553, 0.08955100364983082, 0.088

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163, 0.09033517663677533, 0.09209948405623436, 0.09305020049214363, 0.09060696139931679, 0.08979978784918785, 0.08798108187814553, 0.08955100364983082, 0.088




Epoch:  60%|██████    | 60/100 [06:48<04:29,  6.74s/it]

Epoch:  61%|██████    | 61/100 [06:55<04:19,  6.65s/it]

Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163, 0.09033517663677533, 0.09209948405623436, 0.09305020049214363, 0.09060696139931679, 0.08979978784918785, 0.08798108187814553, 0.08955100364983082, 0.088

HBox(children=(IntProgress(value=0, description='Training iteration', max=29, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=6, style=ProgressStyle(description…


Loss history: [0.5703889032204946, 0.45168088873227435, 0.3713988810777664, 0.31752805908521015, 0.2832884540160497, 0.2619638790686925, 0.2481021930774053, 0.23104599366585413, 0.20977324495712915, 0.19478107243776321, 0.18068071703116098, 0.17129809161027273, 0.15991338839133581, 0.15267540762821832, 0.14439847196141878, 0.14095059037208557, 0.13442849616209665, 0.1295055498679479, 0.12575887019435564, 0.12197569881876309, 0.11888336762785912, 0.11620859305063884, 0.1141701266169548, 0.11486967901388805, 0.10919986541072528, 0.10814833516875903, 0.10605745017528534, 0.10507416725158691, 0.10250762601693471, 0.10089514156182607, 0.10192418346802394, 0.09975786631306012, 0.09757911786437035, 0.09584837158521016, 0.09579815591375034, 0.09720322241385777, 0.0943191647529602, 0.09252247959375381, 0.09215047086278598, 0.09151112847030163, 0.09033517663677533, 0.09209948405623436, 0.09305020049214363, 0.09060696139931679, 0.08979978784918785, 0.08798108187814553, 0.08955100364983082, 0.088




Epoch:  62%|██████▏   | 62/100 [07:02<04:15,  6.73s/it][A[A[A

Epoch:  63%|██████▎   | 63/100 [07:08<04:06,  6.65s/it]
Epoch:  64%|██████▍   | 64/100 [07:15<03:57,  6.60s/it]
Epoch:  65%|██████▌   | 65/100 [07:22<03:54,  6.69s/it]

Epoch:  66%|██████▌   | 66/100 [07:28<03:45,  6.63s/it]
Epoch:  67%|██████▋   | 67/100 [07:35<03:41,  6.71s/it]
Epoch:  68%|██████▊   | 68/100 [07:42<03:36,  6.78s/it]

Epoch:  69%|██████▉   | 69/100 [07:48<03:27,  6.68s/it]
Epoch:  70%|███████   | 70/100 [07:55<03:18,  6.62s/it]
Epoch:  71%|███████   | 71/100 [08:01<03:10,  6.57s/it]


Epoch:  72%|███████▏  | 72/100 [08:08<03:03,  6.54s/it]

I0404 19:32:17.245449 140043817379648 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0404 19:32:17.247045 140043817379648 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd

Data size: 138
P: 119 / 133 = 0.8947368421052632
R: 119 / 157 = 0.7579617834394905
F: 0.8206896551724138
A: 0.7246376811594203
AL1: 0.8043478260869565


## Evaluation

In [6]:
evaluate_output(all_correct, all_predicted)

Data size: 692
P: 600 / 676 = 0.8875739644970414
R: 600 / 801 = 0.7490636704119851
F: 0.8124576844955991
A: 0.703757225433526
AL1: 0.7890173410404624
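
The summary above can be recomputed from the binary label vectors. The sketch below is not the notebook's `evaluate_output`, whose definition is not shown in this section. `micro_scores` is a hypothetical helper, and it assumes that A is exact-match accuracy (all labels of an item predicted correctly) and that AL1 is the share of items with at least one correctly predicted label. P, R and F are micro-averaged over all (item, label) pairs.

```python
def micro_scores(all_correct, all_predicted):
    """Micro-averaged scores over binary label vectors (hypothetical helper)."""
    tp = n_predicted = n_gold = exact = at_least_one = 0
    for correct, predicted in zip(all_correct, all_predicted):
        # Labels that are both gold and predicted for this item.
        hits = sum(1 for c, p in zip(correct, predicted) if c == 1 and p == 1)
        tp += hits
        n_predicted += sum(predicted)
        n_gold += sum(correct)
        # Assumption: "A" counts items whose full label vector is correct.
        exact += int(list(correct) == list(predicted))
        # Assumption: "AL1" counts items with at least one correct label.
        at_least_one += int(hits > 0)

    n_items = len(all_correct)
    precision = tp / n_predicted if n_predicted else 0.0
    recall = tp / n_gold if n_gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "P": precision,
        "R": recall,
        "F": f1,
        "A": exact / n_items if n_items else 0.0,
        "AL1": at_least_one / n_items if n_items else 0.0,
    }


# Toy example with three items and three labels.
gold = [[1, 0, 0], [0, 1, 1], [0, 0, 1]]
pred = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
toy = micro_scores(gold, pred)
```

The reported F is consistent with this definition: with 600 correct labels, 676 predicted labels and 801 gold labels, F = 2 * 600 / (676 + 801) = 1200 / 1477 ≈ 0.8125.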


In [7]:
scores = {}
for item, predicted, correct in zip(all_test_data, all_predicted, all_correct):
    # Convert the binary indicator vectors into lists of label names.
    correct_labels = [idx2label[i] for i, flag in enumerate(correct) if flag == 1]
    predicted_labels = [idx2label[i] for i, flag in enumerate(predicted) if flag == 1]
    # One line per item: text#gold labels#predicted labels.
    print("{}#{}#{}".format(item.text, ";".join(correct_labels), ";".join(predicted_labels)))

    for label in predicted_labels + correct_labels:
        if label not in scores:
            scores[label] = {"tp": 0, "fp": 0, "fn": 0, "support": 0}

    # A predicted label is a true positive if it is also a gold label,
    # and a false positive otherwise.
    for label in predicted_labels:
        if label in correct_labels:
            scores[label]["tp"] += 1
        else:
            scores[label]["fp"] += 1

    # Every gold label adds to the support; a missed gold label is a false negative.
    for label in correct_labels:
        scores[label]["support"] += 1
        if label not in predicted_labels:
            scores[label]["fn"] += 1

# Per-label precision, recall, F1 and support.
for label in scores:
    tp, fp, fn = scores[label]["tp"], scores[label]["fp"], scores[label]["fn"]
    lp = tp / (tp + fp) if tp + fp > 0 else 0
    lr = tp / (tp + fn) if tp + fn > 0 else 0
    lf = 2 * lp * lr / (lp + lr) if lp + lr > 0 else 0

    print(label, lp, lr, lf, scores[label]["support"])

Eastern Michigan University cut women's tennis and softball, but Chretien and Mayerova sued the University for violating Title IX.#Women_sued#Women_sued
Eastern Michigan University cut women's tennis and softball, but this is illegal according to Title IX.#Miscellaneous#
Eastern Michigan University cut women's tennis and softball, but students claimed they had broken the law specifically tittle IX#Women_sued#Women_sued
Eastern Michigan University cut women's tennis and softball, but I find a school with aviation but no scholarships or scholarships but no major."#Miscellaneous#Miscellaneous
Eastern Michigan University cut women's tennis and softball, but also cut several men's sports programs.#Cuts_to_mens_sports#Cuts_to_mens_sports
Eastern Michigan University cut women's tennis and softball, but also cut men's wrestling, as well as men's swimming and diving.#Cuts_to_WSD,#Cuts_to_WSD,
Eastern Michigan University cut women's tennis and softball, but had to reinstate the teams due to Titl
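
The per-label counts collected in `scores` can also be condensed into a single macro-averaged F1, which weights every label equally regardless of its support. The following is a minimal sketch under that assumption. `macro_f1` and the toy label names are hypothetical and not part of the notebook.

```python
def macro_f1(scores):
    """Unweighted mean of per-label F1, given tp/fp/fn counts per label."""
    f1_values = []
    for counts in scores.values():
        tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
        p = tp / (tp + fp) if tp + fp > 0 else 0.0
        r = tp / (tp + fn) if tp + fn > 0 else 0.0
        f1_values.append(2 * p * r / (p + r) if p + r > 0 else 0.0)
    return sum(f1_values) / len(f1_values) if f1_values else 0.0


# Toy counts in the same shape as the `scores` dictionary (made-up numbers).
toy_scores = {
    "Label_A": {"tp": 8, "fp": 2, "fn": 2, "support": 10},
    "Label_B": {"tp": 1, "fp": 0, "fn": 3, "support": 4},
}
toy_macro = macro_f1(toy_scores)
```

A macro average that is clearly lower than the micro-averaged F would suggest that rare labels are predicted less reliably than frequent ones.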

In [8]:
print(len([p for p in all_predicted if sum(p) == 0]), "/", len(all_predicted))

84 / 692
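
As a rate, 84 / 692 means that about 12.1% of the test items receive no label at all. Those items cannot contribute any true positives, so they lower recall directly. A small sketch that reports the count and the rate together is given below. `empty_prediction_rate` is a hypothetical helper, not a function from the notebook.

```python
def empty_prediction_rate(all_predicted):
    """Return (number of items with no predicted label, total items, rate)."""
    n_items = len(all_predicted)
    n_empty = sum(1 for p in all_predicted if sum(p) == 0)
    rate = n_empty / n_items if n_items else 0.0
    return n_empty, n_items, rate


# Toy example: two of four items have no predicted label.
toy_predictions = [[0, 0, 0], [1, 0, 0], [0, 0, 0], [0, 1, 1]]
n_empty, n_items, rate = empty_prediction_rate(toy_predictions)
```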
