# First BERT Experiments

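In this notebook we run some first experiments with BERT: we fine-tune a BERT model with a classifier on top and compute the accuracy of the resulting classifier on held-out test data. The datasets to use are listed in `PREFIXES`; when there are several, as in the configuration below, they are merged into one dataset and each label is prefixed with the name of its dataset.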

For these experiments we use the `pytorch_transformers` package (since renamed to `transformers`). It provides a variety of pretrained models and neural network architectures for transfer learning, including BERT and XLNet.

Two different BERT models are relevant for our experiments: 

- BERT-base-uncased: a relatively small BERT model that should already give reasonable results,
- BERT-large-uncased: a larger model for real state-of-the-art results.

In [1]:
BERT_MODEL = 'bert-base-uncased'
BATCH_SIZE = 16 if "base" in BERT_MODEL else 2
GRADIENT_ACCUMULATION_STEPS = 1 if "base" in BERT_MODEL else 8
MAX_SEQ_LENGTH = 100
PREFIXES = ["eatingmeat4_so", "eatingmeat4_because", "eatingmeat4_but"]
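
The two batch settings are chosen together. With gradient accumulation, the optimizer typically steps only after several small batches, so the effective batch size is the product of `BATCH_SIZE` and `GRADIENT_ACCUMULATION_STEPS`. A minimal sketch of that arithmetic follows; the helper names `batch_settings` and `effective_batch_size` are hypothetical and not part of `quillnlp`, and the sketch assumes the training loop accumulates gradients in the usual way.

```python
def batch_settings(bert_model: str) -> tuple:
    """Mirror the rule used in the configuration cell above."""
    is_base = "base" in bert_model
    batch_size = 16 if is_base else 2
    accumulation_steps = 1 if is_base else 8
    return batch_size, accumulation_steps


def effective_batch_size(bert_model: str) -> int:
    """Number of examples that contribute to one optimizer step."""
    batch_size, accumulation_steps = batch_settings(bert_model)
    return batch_size * accumulation_steps


for name in ("bert-base-uncased", "bert-large-uncased"):
    print(name, batch_settings(name), effective_batch_size(name))
```

Under this assumption both configurations update the weights on 16 examples at a time; the large model does so in eight smaller passes, presumably so that each pass fits in GPU memory.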

## Data

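We use the same data as for all our previous experiments. Here we load the data for every prefix in `PREFIXES`, prepend the prefix to each item's label, and merge everything into one dataset. The training, development and test splits are created later, during cross-validation.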
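
Each data file is newline-delimited JSON, with one JSON object per line. The sketch below illustrates the loading and relabeling step with the standard-library `json` module in place of `ndjson` and two made-up records. The `text` field, the label values and the helper `load_with_prefix` are assumptions for illustration; only the `labels` field is confirmed by the code in this notebook.

```python
import json

# Two hypothetical lines of an .ndjson file.
raw = "\n".join([
    json.dumps({"text": "example response one", "labels": ["LabelA"]}),
    json.dumps({"text": "example response two", "labels": ["LabelB"]}),
])


def load_with_prefix(ndjson_text: str, prefix: str) -> list:
    """Parse one JSON object per line and namespace its single label."""
    items = []
    for line in ndjson_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        item = json.loads(line)
        # Fail early on records that are not single-label.
        assert len(item["labels"]) == 1, "expected exactly one label"
        item["label"] = prefix + "_" + item["labels"][0]
        items.append(item)
    return items


items = load_with_prefix(raw, "eatingmeat4_so")
print([item["label"] for item in items])
```

Prefixing the label keeps identical label names from different files apart once the files are merged.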

In [2]:
import sys
sys.path.append('../')

import ndjson
import glob
import numpy as np

from quillnlp.models.bert.preprocessing import preprocess, create_label_vocabulary

data = []
for prefix in PREFIXES:
    data_file = f"../data/interim/{prefix}_withprompt.ndjson"

    with open(data_file) as f:
        new_data = ndjson.load(f)
        for item in new_data:
            # Prefix the label with the dataset name so labels stay distinct after merging
            item["label"] = prefix + "_" + item["labels"][0]
        data.extend(new_data)
        
# Make sure this is a single-label problem
label_lengths = [len(item["labels"]) for item in data]
assert max(label_lengths) == 1
        
label2idx = create_label_vocabulary(data)
idx2label = {v:k for k,v in label2idx.items()}
target_names = [idx2label[s] for s in range(len(idx2label))]

data_items = preprocess(data, BERT_MODEL, label2idx, MAX_SEQ_LENGTH)
data_items = np.array(data_items)

I0406 17:49:15.544050 140397457205056 file_utils.py:41] PyTorch version 1.2.0+cu92 available.
I0406 17:49:16.530257 140397457205056 file_utils.py:57] TensorFlow version 2.1.0 available.
I0406 17:49:17.201040 140397457205056 tokenization_utils.py:501] loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /home/yves/.cache/torch/transformers/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
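
`create_label_vocabulary` comes from `quillnlp` and its implementation is not shown here. As a rough illustration of what such a helper could do, the sketch below maps each distinct `label` string to an integer index and then inverts the mapping the same way the cell above does. The function `build_label_vocabulary` and its sorted ordering are assumptions, not the library's actual behavior, and the items are made up.

```python
def build_label_vocabulary(items: list) -> dict:
    """Hypothetical stand-in: map each distinct label to a stable index."""
    labels = sorted({item["label"] for item in items})
    return {label: idx for idx, label in enumerate(labels)}


# Made-up items; only the "label" key matters here.
items = [
    {"label": "eatingmeat4_so_B"},
    {"label": "eatingmeat4_but_A"},
    {"label": "eatingmeat4_so_B"},
]

label2idx = build_label_vocabulary(items)
idx2label = {idx: label for label, idx in label2idx.items()}
# List the names in index order, so position i names class i.
target_names = [idx2label[i] for i in range(len(idx2label))]
print(label2idx)
print(target_names)
```

Keeping `target_names` in index order means it can be passed directly to reporting functions that expect one name per class index.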


## Training
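
The cell below runs 5-fold cross-validation and, inside each fold, holds out the last quarter of the non-test data as a development set. Every fold therefore uses roughly 60% of the data for training, 20% for development and 20% for testing. The sketch below works through that arithmetic in plain Python. The helper `fold_sizes` is hypothetical; the fold-size rule follows scikit-learn's documented `KFold` behavior, where the first `n_samples % n_splits` folds get one extra sample.

```python
def fold_sizes(n_samples: int, n_splits: int = 5) -> list:
    """Return (train, dev, test) sizes per fold for the scheme in this notebook."""
    sizes = []
    for fold in range(n_splits):
        # KFold gives the first n_samples % n_splits folds one extra sample.
        extra = 1 if fold < n_samples % n_splits else 0
        test = n_samples // n_splits + extra
        train_and_dev = n_samples - test
        cutoff = int(train_and_dev / 4 * 3)  # same expression as in the notebook
        sizes.append((cutoff, train_and_dev - cutoff, test))
    return sizes


for train, dev, test in fold_sizes(1000):
    print(train, dev, test)
```

For 1000 items this gives 600 training, 200 development and 200 test items in every fold.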

In [3]:
import torch
import random

from quillnlp.models.bert.train import train, evaluate
from quillnlp.models.bert.models import get_bert_classifier

from quillnlp.models.bert.preprocessing import get_data_loader
from sklearn.model_selection import KFold

# Shuffle in place; np.random.shuffle is safe for NumPy arrays of any shape
np.random.shuffle(data_items)

kf = KFold(n_splits=5, shuffle=True, random_state=1)
all_correct, all_predicted = [], []
all_test_data = []
for train_idx, test_idx in kf.split(data_items):

    train_and_dev_data = data_items[train_idx]
    cutoff = int(len(train_and_dev_data)/4*3)
    
    train_data = train_and_dev_data[:cutoff]
    dev_data = train_and_dev_data[cutoff:]
    test_data = data_items[test_idx]

    train_dataloader = get_data_loader(train_data, BATCH_SIZE)
    dev_dataloader = get_data_loader(dev_data, BATCH_SIZE)
    test_dataloader = get_data_loader(test_data, BATCH_SIZE, shuffle=False)

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = get_bert_classifier(BERT_MODEL, len(label2idx), device=device)
    output_model_file = train(model, train_dataloader, dev_dataloader,
                              BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, device,
                              num_train_epochs=100)

    print("Loading model from", output_model_file)
    # Evaluate the reloaded model on the CPU
    device = "cpu"

    model = get_bert_classifier(BERT_MODEL, len(label2idx), model_file=output_model_file, device=device)
    model.eval()
    
    _, _, test_correct, test_predicted = evaluate(model, test_dataloader, device)
    all_correct.extend(test_correct)
    all_predicted.extend(test_predicted)
    all_test_data.extend(test_data)


I0406 17:49:18.523230 140397457205056 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0406 17:49:18.524959 140397457205056 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd

Fold 1: 89 training and 30 evaluation iterations per epoch, about 21 s per epoch.

Epoch 1/100 - Dev loss: 3.3455551783243815
Epoch 2/100 - Dev loss: 2.8631558179855348
Epoch 3/100 - Dev loss: 2.3468390862147013
Epoch 4/100 - Dev loss: 1.9122286677360534
Epoch 5/100 - Dev loss: 1.6211103220780692
Epoch 6/100 - Dev loss: 1.3957361658414205
Epoch 7/100 - Dev loss: 1.2302775005499522
Epoch 8/100 - Dev loss: 1.0958113213380178
Epoch 9/100 - Dev loss: 1.0362833758195242
Epoch 10/100 - Dev loss: 0.9327900499105454
Epoch 11/100 - Dev loss: 0.8444825987021128
Epoch 12/100 - Dev loss: 0.8053811371326447
Epoch 13/100 - Dev loss: 0.7851727654536566
Epoch 14/100 - Dev loss: 0.7391756981611252
Epoch 15/100 - Dev loss: 0.736555090546608
Epoch 16/100 - Dev loss: 0.7027463148037593
Epoch 17/100 - Dev loss: 0.7049637407064437
Epoch 18/100 - Dev loss: 0.6723833685119947
Epoch 19/100 - Dev loss: 0.6789553532997767
Epoch 20/100 - Dev loss: 0.697317898273468
Epoch 21/100 - Dev loss: 0.6771357759833336
Epoch 22/100 - Dev loss: 0.6777698397636414
Epoch 23/100 - Dev loss: 0.6570984582106273
Epoch 24/100 - Dev loss: 0.6700288708011309
Epoch 25/100 - Dev loss: 0.656030835956335
Epoch 26/100 - Dev loss: 0.6480815261602402
Epoch 27/100 - Dev loss: 0.6635347758730252
Epoch 28/100 - Dev loss: 0.6696336093048255
Epoch 29/100 - Dev loss: 0.6554589182138443
Epoch 30/100 - Dev loss: 0.6898500437537829
Epoch 31/100 - Dev loss: 0.6665377870202065
No improvement on development set. Finish training.
Loading model from /tmp/model.bin
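
Training stops long before the 100-epoch limit. The log is consistent with patience-based early stopping, where training ends once the development loss has failed to beat its best value for a fixed number of consecutive epochs. The actual rule lives in `quillnlp.models.bert.train` and is not shown here, so the sketch below is an assumption: the function `early_stopping_epoch` is hypothetical, and the patience of 5 is chosen only because it reproduces the stopping point of this run. The dev losses are rounded to four decimals from the log above.

```python
def early_stopping_epoch(dev_losses, patience=5):
    """Return (stop_epoch, best_epoch), 1-indexed; stop_epoch is None if never triggered."""
    best = float("inf")
    best_epoch = 0
    stale = 0  # consecutive epochs without a new best dev loss
    for epoch, loss in enumerate(dev_losses, start=1):
        if loss < best:
            best, best_epoch, stale = loss, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch, best_epoch
    return None, best_epoch


# Dev losses of the first fold, rounded from the training log.
dev_losses = [
    3.3456, 2.8632, 2.3468, 1.9122, 1.6211, 1.3957, 1.2303, 1.0958,
    1.0363, 0.9328, 0.8445, 0.8054, 0.7852, 0.7392, 0.7366, 0.7027,
    0.7050, 0.6724, 0.6790, 0.6973, 0.6771, 0.6778, 0.6571, 0.6700,
    0.6560, 0.6481, 0.6635, 0.6696, 0.6555, 0.6899, 0.6665,
]

print(early_stopping_epoch(dev_losses, patience=5))
print(early_stopping_epoch(dev_losses, patience=4))
```

Under this rule a patience of 5 stops after epoch 31 with the best dev loss at epoch 26, matching the log. A patience of 4 would already have stopped at epoch 22, so the run implies a patience of at least 5 if the library uses a rule of this kind.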




HBox(children=(IntProgress(value=0, description='Training iteration', max=89, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=30, style=ProgressStyle(descriptio…


Loss history: []
Dev loss: 3.134772276878357


Epoch:   1%|          | 1/100 [00:21<34:40, 21.01s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=89, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=30, style=ProgressStyle(descriptio…


Loss history: [3.134772276878357]
Dev loss: 2.8173413276672363


Epoch:   2%|▏         | 2/100 [00:42<34:20, 21.02s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=89, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=30, style=ProgressStyle(descriptio…


Loss history: [3.134772276878357, 2.8173413276672363]
Dev loss: 2.4336111227671307


Epoch:   3%|▎         | 3/100 [01:03<34:00, 21.04s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=89, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=30, style=ProgressStyle(descriptio…


Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307]
Dev loss: 2.0716185569763184


Epoch:   4%|▍         | 4/100 [01:24<33:40, 21.05s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=89, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=30, style=ProgressStyle(descriptio…


Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184]
Dev loss: 1.7996585289637248


Epoch:   5%|▌         | 5/100 [01:45<33:19, 21.05s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=89, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=30, style=ProgressStyle(descriptio…


Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248]
Dev loss: 1.5348845144112906


Epoch:   6%|▌         | 6/100 [02:06<33:00, 21.07s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=89, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=30, style=ProgressStyle(descriptio…


Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906]
Dev loss: 1.3654666105906168


Epoch:   7%|▋         | 7/100 [02:27<32:40, 21.08s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=89, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=30, style=ProgressStyle(descriptio…


Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168]
Dev loss: 1.2328002472718558


Epoch:   8%|▊         | 8/100 [02:48<32:20, 21.09s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=89, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=30, style=ProgressStyle(descriptio…


Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558]
Dev loss: 1.1408678909142813


Epoch:   9%|▉         | 9/100 [03:09<31:59, 21.09s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=89, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=30, style=ProgressStyle(descriptio…


Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813]
Dev loss: 1.0778270542621613


Epoch:  10%|█         | 10/100 [03:30<31:39, 21.11s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=89, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=30, style=ProgressStyle(descriptio…


Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613]
Dev loss: 1.000143470366796


Epoch:  11%|█         | 11/100 [03:51<31:19, 21.11s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=89, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=30, style=ProgressStyle(descriptio…


Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796]
Dev loss: 0.9588430921236674


Epoch:  12%|█▏        | 12/100 [04:13<30:58, 21.12s/it]



Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674]
Dev loss: 0.9254135469595591


Epoch:  13%|█▎        | 13/100 [04:34<30:37, 21.12s/it]



Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591]
Dev loss: 0.8678344517946244


Epoch:  14%|█▍        | 14/100 [04:55<30:16, 21.13s/it]



Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244]
Dev loss: 0.8399032185475032


Epoch:  15%|█▌        | 15/100 [05:16<29:55, 21.13s/it]



Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032]
Dev loss: 0.8395592356721561


Epoch:  16%|█▌        | 16/100 [05:37<29:35, 21.13s/it]



Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561]
Dev loss: 0.8383056263128916


Epoch:  17%|█▋        | 17/100 [05:58<29:14, 21.13s/it]



Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916]
Dev loss: 0.7866982777913412


Epoch:  18%|█▊        | 18/100 [06:19<28:53, 21.13s/it]



Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916, 0.7866982777913412]
Dev loss: 0.7659640570481618


Epoch:  19%|█▉        | 19/100 [06:41<28:31, 21.13s/it]



Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916, 0.7866982777913412, 0.7659640570481618]
Dev loss: 0.7529143343369166


Epoch:  20%|██        | 20/100 [07:02<28:10, 21.13s/it]



Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916, 0.7866982777913412, 0.7659640570481618, 0.7529143343369166]
Dev loss: 0.7375924408435821


Epoch:  21%|██        | 21/100 [07:23<27:49, 21.13s/it]



Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916, 0.7866982777913412, 0.7659640570481618, 0.7529143343369166, 0.7375924408435821]
Dev loss: 0.7371853212515513


Epoch:  22%|██▏       | 22/100 [07:44<27:28, 21.14s/it]



Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916, 0.7866982777913412, 0.7659640570481618, 0.7529143343369166, 0.7375924408435821, 0.7371853212515513]
Dev loss: 0.7232259918625156


Epoch:  23%|██▎       | 23/100 [08:05<27:07, 21.14s/it]



Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916, 0.7866982777913412, 0.7659640570481618, 0.7529143343369166, 0.7375924408435821, 0.7371853212515513, 0.7232259918625156]
Dev loss: 0.7198773210247358


Epoch:  24%|██▍       | 24/100 [08:26<26:46, 21.14s/it]



Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916, 0.7866982777913412, 0.7659640570481618, 0.7529143343369166, 0.7375924408435821, 0.7371853212515513, 0.7232259918625156, 0.7198773210247358]
Dev loss: 0.7129932448267937


Epoch:  25%|██▌       | 25/100 [08:47<26:25, 21.14s/it]


Epoch:  26%|██▌       | 26/100 [09:08<25:53, 21.00s/it]


Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916, 0.7866982777913412, 0.7659640570481618, 0.7529143343369166, 0.7375924408435821, 0.7371853212515513, 0.7232259918625156, 0.7198773210247358, 0.7129932448267937]
Dev loss: 0.7218929961323738



Epoch:  27%|██▋       | 27/100 [09:29<25:25, 20.90s/it]


Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916, 0.7866982777913412, 0.7659640570481618, 0.7529143343369166, 0.7375924408435821, 0.7371853212515513, 0.7232259918625156, 0.7198773210247358, 0.7129932448267937, 0.7218929961323738]
Dev loss: 0.7155647099018096




Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916, 0.7866982777913412, 0.7659640570481618, 0.7529143343369166, 0.7375924408435821, 0.7371853212515513, 0.7232259918625156, 0.7198773210247358, 0.7129932448267937, 0.7218929961323738, 0.7155647099018096]
Dev loss: 0.6994019741813342


Epoch:  28%|██▊       | 28/100 [09:50<25:10, 20.97s/it]


Epoch:  29%|██▉       | 29/100 [10:11<24:42, 20.89s/it]


Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916, 0.7866982777913412, 0.7659640570481618, 0.7529143343369166, 0.7375924408435821, 0.7371853212515513, 0.7232259918625156, 0.7198773210247358, 0.7129932448267937, 0.7218929961323738, 0.7155647099018096, 0.6994019741813342]
Dev loss: 0.7622968941926956



Epoch:  30%|███       | 30/100 [10:31<24:17, 20.82s/it]


Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916, 0.7866982777913412, 0.7659640570481618, 0.7529143343369166, 0.7375924408435821, 0.7371853212515513, 0.7232259918625156, 0.7198773210247358, 0.7129932448267937, 0.7218929961323738, 0.7155647099018096, 0.6994019741813342, 0.7622968941926956]
Dev loss: 0.7025269110997517



Epoch:  31%|███       | 31/100 [10:52<23:54, 20.78s/it]


Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916, 0.7866982777913412, 0.7659640570481618, 0.7529143343369166, 0.7375924408435821, 0.7371853212515513, 0.7232259918625156, 0.7198773210247358, 0.7129932448267937, 0.7218929961323738, 0.7155647099018096, 0.6994019741813342, 0.7622968941926956, 0.7025269110997517]
Dev loss: 0.708948037525018



Epoch:  32%|███▏      | 32/100 [11:13<23:31, 20.76s/it]


Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916, 0.7866982777913412, 0.7659640570481618, 0.7529143343369166, 0.7375924408435821, 0.7371853212515513, 0.7232259918625156, 0.7198773210247358, 0.7129932448267937, 0.7218929961323738, 0.7155647099018096, 0.6994019741813342, 0.7622968941926956, 0.7025269110997517, 0.708948037525018]
Dev loss: 0.6995967134833336




Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916, 0.7866982777913412, 0.7659640570481618, 0.7529143343369166, 0.7375924408435821, 0.7371853212515513, 0.7232259918625156, 0.7198773210247358, 0.7129932448267937, 0.7218929961323738, 0.7155647099018096, 0.6994019741813342, 0.7622968941926956, 0.7025269110997517, 0.708948037525018, 0.6995967134833336]
Dev loss: 0.6973910520474116


Epoch:  33%|███▎      | 33/100 [11:34<23:18, 20.87s/it]



Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916, 0.7866982777913412, 0.7659640570481618, 0.7529143343369166, 0.7375924408435821, 0.7371853212515513, 0.7232259918625156, 0.7198773210247358, 0.7129932448267937, 0.7218929961323738, 0.7155647099018096, 0.6994019741813342, 0.7622968941926956, 0.7025269110997517, 0.708948037525018, 0.6995967134833336, 0.6973910520474116]
Dev loss: 0.6869064050416152


Epoch:  34%|███▍      | 34/100 [11:55<23:02, 20.95s/it]


Epoch:  35%|███▌      | 35/100 [12:16<22:36, 20.87s/it]


Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916, 0.7866982777913412, 0.7659640570481618, 0.7529143343369166, 0.7375924408435821, 0.7371853212515513, 0.7232259918625156, 0.7198773210247358, 0.7129932448267937, 0.7218929961323738, 0.7155647099018096, 0.6994019741813342, 0.7622968941926956, 0.7025269110997517, 0.708948037525018, 0.6995967134833336, 0.6973910520474116, 0.6869064050416152]
Dev loss: 0.6895262313385805



Epoch:  36%|███▌      | 36/100 [12:36<22:11, 20.81s/it]


Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916, 0.7866982777913412, 0.7659640570481618, 0.7529143343369166, 0.7375924408435821, 0.7371853212515513, 0.7232259918625156, 0.7198773210247358, 0.7129932448267937, 0.7218929961323738, 0.7155647099018096, 0.6994019741813342, 0.7622968941926956, 0.7025269110997517, 0.708948037525018, 0.6995967134833336, 0.6973910520474116, 0.6869064050416152, 0.6895262313385805]
Dev loss: 0.709361732006073



Epoch:  37%|███▋      | 37/100 [12:57<21:48, 20.77s/it]


Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916, 0.7866982777913412, 0.7659640570481618, 0.7529143343369166, 0.7375924408435821, 0.7371853212515513, 0.7232259918625156, 0.7198773210247358, 0.7129932448267937, 0.7218929961323738, 0.7155647099018096, 0.6994019741813342, 0.7622968941926956, 0.7025269110997517, 0.708948037525018, 0.6995967134833336, 0.6973910520474116, 0.6869064050416152, 0.6895262313385805, 0.709361732006073]
Dev loss: 0.709451875090599



Epoch:  38%|███▊      | 38/100 [13:18<21:26, 20.74s/it]


Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916, 0.7866982777913412, 0.7659640570481618, 0.7529143343369166, 0.7375924408435821, 0.7371853212515513, 0.7232259918625156, 0.7198773210247358, 0.7129932448267937, 0.7218929961323738, 0.7155647099018096, 0.6994019741813342, 0.7622968941926956, 0.7025269110997517, 0.708948037525018, 0.6995967134833336, 0.6973910520474116, 0.6869064050416152, 0.6895262313385805, 0.709361732006073, 0.709451875090599]
Dev loss: 0.7271672228972117




Loss history: [3.134772276878357, 2.8173413276672363, 2.4336111227671307, 2.0716185569763184, 1.7996585289637248, 1.5348845144112906, 1.3654666105906168, 1.2328002472718558, 1.1408678909142813, 1.0778270542621613, 1.000143470366796, 0.9588430921236674, 0.9254135469595591, 0.8678344517946244, 0.8399032185475032, 0.8395592356721561, 0.8383056263128916, 0.7866982777913412, 0.7659640570481618, 0.7529143343369166, 0.7375924408435821, 0.7371853212515513, 0.7232259918625156, 0.7198773210247358, 0.7129932448267937, 0.7218929961323738, 0.7155647099018096, 0.6994019741813342, 0.7622968941926956, 0.7025269110997517, 0.708948037525018, 0.6995967134833336, 0.6973910520474116, 0.6869064050416152, 0.6895262313385805, 0.709361732006073, 0.709451875090599, 0.7271672228972117]
Dev loss: 0.760326394935449
No improvement on development set. Finish training.
Loading model from /tmp/model.bin


I0406 18:14:10.303222 140397457205056 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0406 18:14:10.304512 140397457205056 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd





I0406 18:14:25.246183 140397457205056 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0406 18:14:25.247345 140397457205056 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd



Loss history: []
Dev loss: 3.1493136644363404



Epoch:   1%|          | 1/100 [00:21<34:42, 21.03s/it]



Loss history: [3.1493136644363404]
Dev loss: 2.6439683278401693



Epoch:   2%|▏         | 2/100 [00:42<34:21, 21.04s/it]



Loss history: [3.1493136644363404, 2.6439683278401693]
Dev loss: 2.2259405652681985



Epoch:   3%|▎         | 3/100 [01:03<34:01, 21.05s/it]



Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985]
Dev loss: 1.8800074418385824



Epoch:   4%|▍         | 4/100 [01:24<33:41, 21.06s/it]



Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824]
Dev loss: 1.631400966644287



Epoch:   5%|▌         | 5/100 [01:45<33:21, 21.07s/it]



Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824, 1.631400966644287]
Dev loss: 1.410132739941279



Epoch:   6%|▌         | 6/100 [02:06<33:02, 21.09s/it]



Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824, 1.631400966644287, 1.410132739941279]
Dev loss: 1.2975632945696514



Epoch:   7%|▋         | 7/100 [02:27<32:42, 21.10s/it]



Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824, 1.631400966644287, 1.410132739941279, 1.2975632945696514]
Dev loss: 1.1311195453008016



Epoch:   8%|▊         | 8/100 [02:48<32:22, 21.12s/it]



Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824, 1.631400966644287, 1.410132739941279, 1.2975632945696514, 1.1311195453008016]
Dev loss: 1.1064495394627254



Epoch:   9%|▉         | 9/100 [03:09<32:02, 21.12s/it]



Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824, 1.631400966644287, 1.410132739941279, 1.2975632945696514, 1.1311195453008016, 1.1064495394627254]
Dev loss: 0.9857797712087631



Epoch:  10%|█         | 10/100 [03:31<31:41, 21.13s/it]



Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824, 1.631400966644287, 1.410132739941279, 1.2975632945696514, 1.1311195453008016, 1.1064495394627254, 0.9857797712087631]
Dev loss: 0.9509927789370219



Epoch:  11%|█         | 11/100 [03:52<31:20, 21.13s/it]



Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824, 1.631400966644287, 1.410132739941279, 1.2975632945696514, 1.1311195453008016, 1.1064495394627254, 0.9857797712087631, 0.9509927789370219]
Dev loss: 0.8965358873208363



Epoch:  12%|█▏        | 12/100 [04:13<30:59, 21.13s/it]



Epoch:  13%|█▎        | 13/100 [04:33<30:26, 21.00s/it]


Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824, 1.631400966644287, 1.410132739941279, 1.2975632945696514, 1.1311195453008016, 1.1064495394627254, 0.9857797712087631, 0.9509927789370219, 0.8965358873208363]
Dev loss: 0.9040395647287369




Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824, 1.631400966644287, 1.410132739941279, 1.2975632945696514, 1.1311195453008016, 1.1064495394627254, 0.9857797712087631, 0.9509927789370219, 0.8965358873208363, 0.9040395647287369]
Dev loss: 0.8781829277674357



Epoch:  14%|█▍        | 14/100 [04:55<30:09, 21.04s/it]



Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824, 1.631400966644287, 1.410132739941279, 1.2975632945696514, 1.1311195453008016, 1.1064495394627254, 0.9857797712087631, 0.9509927789370219, 0.8965358873208363, 0.9040395647287369, 0.8781829277674357]
Dev loss: 0.8221222887436549



Epoch:  15%|█▌        | 15/100 [05:16<29:50, 21.07s/it]



Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824, 1.631400966644287, 1.410132739941279, 1.2975632945696514, 1.1311195453008016, 1.1064495394627254, 0.9857797712087631, 0.9509927789370219, 0.8965358873208363, 0.9040395647287369, 0.8781829277674357, 0.8221222887436549]
Dev loss: 0.8082724273204803



Epoch:  16%|█▌        | 16/100 [05:37<29:30, 21.08s/it]
Epoch:  17%|█▋        | 17/100 [05:58<28:59, 20.96s/it]


Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824, 1.631400966644287, 1.410132739941279, 1.2975632945696514, 1.1311195453008016, 1.1064495394627254, 0.9857797712087631, 0.9509927789370219, 0.8965358873208363, 0.9040395647287369, 0.8781829277674357, 0.8221222887436549, 0.8082724273204803]
Dev loss: 0.8254878381888072


Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824, 1.631400966644287, 1.410132739941279, 1.2975632945696514, 1.1311195453008016, 1.1064495394627254, 0.9857797712087631, 0.9509927789370219, 0.8965358873208363, 0.9040395647287369, 0.8781829277674357, 0.8221222887436549, 0.8082724273204803, 0.8254878381888072]
Dev loss: 0.8020556211471558



Epoch:  18%|█▊        | 18/100 [06:19<28:42, 21.01s/it]


Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824, 1.631400966644287, 1.410132739941279, 1.2975632945696514, 1.1311195453008016, 1.1064495394627254, 0.9857797712087631, 0.9509927789370219, 0.8965358873208363, 0.9040395647287369, 0.8781829277674357, 0.8221222887436549, 0.8082724273204803, 0.8254878381888072, 0.8020556211471558]
Dev loss: 0.8017764161030452



Epoch:  19%|█▉        | 19/100 [06:40<28:24, 21.04s/it]


Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824, 1.631400966644287, 1.410132739941279, 1.2975632945696514, 1.1311195453008016, 1.1064495394627254, 0.9857797712087631, 0.9509927789370219, 0.8965358873208363, 0.9040395647287369, 0.8781829277674357, 0.8221222887436549, 0.8082724273204803, 0.8254878381888072, 0.8020556211471558, 0.8017764161030452]
Dev loss: 0.7919999281565349



Epoch:  20%|██        | 20/100 [07:01<28:05, 21.07s/it]


Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824, 1.631400966644287, 1.410132739941279, 1.2975632945696514, 1.1311195453008016, 1.1064495394627254, 0.9857797712087631, 0.9509927789370219, 0.8965358873208363, 0.9040395647287369, 0.8781829277674357, 0.8221222887436549, 0.8082724273204803, 0.8254878381888072, 0.8020556211471558, 0.8017764161030452, 0.7919999281565349]
Dev loss: 0.7748110542694727



Epoch:  21%|██        | 21/100 [07:22<27:45, 21.09s/it]
Epoch:  22%|██▏       | 22/100 [07:43<27:15, 20.97s/it]


Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824, 1.631400966644287, 1.410132739941279, 1.2975632945696514, 1.1311195453008016, 1.1064495394627254, 0.9857797712087631, 0.9509927789370219, 0.8965358873208363, 0.9040395647287369, 0.8781829277674357, 0.8221222887436549, 0.8082724273204803, 0.8254878381888072, 0.8020556211471558, 0.8017764161030452, 0.7919999281565349, 0.7748110542694727]
Dev loss: 0.7820528844992319


Epoch:  23%|██▎       | 23/100 [08:03<26:47, 20.88s/it]


Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824, 1.631400966644287, 1.410132739941279, 1.2975632945696514, 1.1311195453008016, 1.1064495394627254, 0.9857797712087631, 0.9509927789370219, 0.8965358873208363, 0.9040395647287369, 0.8781829277674357, 0.8221222887436549, 0.8082724273204803, 0.8254878381888072, 0.8020556211471558, 0.8017764161030452, 0.7919999281565349, 0.7748110542694727, 0.7820528844992319]
Dev loss: 0.8020291258891423


Epoch:  24%|██▍       | 24/100 [08:24<26:22, 20.82s/it]


Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824, 1.631400966644287, 1.410132739941279, 1.2975632945696514, 1.1311195453008016, 1.1064495394627254, 0.9857797712087631, 0.9509927789370219, 0.8965358873208363, 0.9040395647287369, 0.8781829277674357, 0.8221222887436549, 0.8082724273204803, 0.8254878381888072, 0.8020556211471558, 0.8017764161030452, 0.7919999281565349, 0.7748110542694727, 0.7820528844992319, 0.8020291258891423]
Dev loss: 0.8070658316214879


Epoch:  25%|██▌       | 25/100 [08:45<25:58, 20.78s/it]


Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824, 1.631400966644287, 1.410132739941279, 1.2975632945696514, 1.1311195453008016, 1.1064495394627254, 0.9857797712087631, 0.9509927789370219, 0.8965358873208363, 0.9040395647287369, 0.8781829277674357, 0.8221222887436549, 0.8082724273204803, 0.8254878381888072, 0.8020556211471558, 0.8017764161030452, 0.7919999281565349, 0.7748110542694727, 0.7820528844992319, 0.8020291258891423, 0.8070658316214879]
Dev loss: 0.8202590942382812


Loss history: [3.1493136644363404, 2.6439683278401693, 2.2259405652681985, 1.8800074418385824, 1.631400966644287, 1.410132739941279, 1.2975632945696514, 1.1311195453008016, 1.1064495394627254, 0.9857797712087631, 0.9509927789370219, 0.8965358873208363, 0.9040395647287369, 0.8781829277674357, 0.8221222887436549, 0.8082724273204803, 0.8254878381888072, 0.8020556211471558, 0.8017764161030452, 0.7919999281565349, 0.7748110542694727, 0.7820528844992319, 0.8020291258891423, 0.8070658316214879, 0.8202590942382812]
Dev loss: 0.792996808886528
No improvement on development set. Finish training.
Loading model from /tmp/model.bin
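
The stopping behaviour in the log above can be replayed with a small sketch. The training loop itself is not shown in this part of the notebook, so this is an assumption about how it works: training stops once the dev loss has failed to improve on the best value for `patience` consecutive epochs. The function name `replay_early_stopping` and the value `patience=5` are hypothetical; 5 is only what the logged dev losses appear consistent with.

```python
def replay_early_stopping(dev_losses, patience=5):
    """Replay per-epoch dev losses and report where training would stop.

    Returns (stop_epoch, best_epoch, best_loss), with epochs counted from 1.
    `patience` is an assumed value, not taken from the notebook's code.
    """
    best_loss = float("inf")
    best_epoch = 0
    no_improvement = 0
    for epoch, loss in enumerate(dev_losses, start=1):
        if loss < best_loss:
            # New best dev loss: a real loop would save the checkpoint here.
            best_loss, best_epoch = loss, epoch
            no_improvement = 0
        else:
            no_improvement += 1
        if no_improvement >= patience:
            # Give up after `patience` epochs without a new best.
            return epoch, best_epoch, best_loss
    return len(dev_losses), best_epoch, best_loss


# Last seven dev losses of the run logged above.
tail = [
    0.7919999281565349,
    0.7748110542694727,
    0.7820528844992319,
    0.8020291258891423,
    0.8070658316214879,
    0.8202590942382812,
    0.792996808886528,
]
stop_epoch, best_epoch, best_loss = replay_early_stopping(tail, patience=5)
print(stop_epoch, best_epoch, best_loss)
```

Under this assumption the best checkpoint is the one with dev loss 0.7748, which would be the model reloaded from `/tmp/model.bin` for the test evaluation.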


I0406 18:23:34.196781 140397457205056 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0406 18:23:34.198052 140397457205056 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd

I0406 18:23:48.347981 140397457205056 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0406 18:23:48.349199 140397457205056 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd

Loss history: []
Dev loss: 3.2492987155914306




Epoch:   1%|          | 1/100 [00:21<34:45, 21.06s/it]


Loss history: [3.2492987155914306]
Dev loss: 2.7014334201812744




Epoch:   2%|▏         | 2/100 [00:42<34:24, 21.06s/it]


Loss history: [3.2492987155914306, 2.7014334201812744]
Dev loss: 2.248048802216848




Epoch:   3%|▎         | 3/100 [01:03<34:03, 21.07s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848]
Dev loss: 1.8916106621424358




Epoch:   4%|▍         | 4/100 [01:24<33:43, 21.08s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358]
Dev loss: 1.6385515371958415




Epoch:   5%|▌         | 5/100 [01:45<33:23, 21.09s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415]
Dev loss: 1.3959590633710226




Epoch:   6%|▌         | 6/100 [02:06<33:03, 21.11s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226]
Dev loss: 1.2262824515501658




Epoch:   7%|▋         | 7/100 [02:27<32:43, 21.12s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658]
Dev loss: 1.1264400164286295




Epoch:   8%|▊         | 8/100 [02:48<32:24, 21.13s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658, 1.1264400164286295]
Dev loss: 1.0465689639250437




Epoch:   9%|▉         | 9/100 [03:10<32:03, 21.14s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658, 1.1264400164286295, 1.0465689639250437]
Dev loss: 1.028343033293883




Epoch:  10%|█         | 10/100 [03:31<31:42, 21.14s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658, 1.1264400164286295, 1.0465689639250437, 1.028343033293883]
Dev loss: 0.9858526443441709




Epoch:  11%|█         | 11/100 [03:52<31:21, 21.14s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658, 1.1264400164286295, 1.0465689639250437, 1.028343033293883, 0.9858526443441709]
Dev loss: 0.9138036211331685




Epoch:  12%|█▏        | 12/100 [04:13<31:00, 21.15s/it]
Epoch:  13%|█▎        | 13/100 [04:34<30:27, 21.01s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658, 1.1264400164286295, 1.0465689639250437, 1.028343033293883, 0.9858526443441709, 0.9138036211331685]
Dev loss: 0.9251658628384273


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658, 1.1264400164286295, 1.0465689639250437, 1.028343033293883, 0.9858526443441709, 0.9138036211331685, 0.9251658628384273]
Dev loss: 0.8946281830469768




Epoch:  14%|█▍        | 14/100 [04:55<30:10, 21.05s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658, 1.1264400164286295, 1.0465689639250437, 1.028343033293883, 0.9858526443441709, 0.9138036211331685, 0.9251658628384273, 0.8946281830469768]
Dev loss: 0.8715616991122563




Epoch:  15%|█▌        | 15/100 [05:16<29:51, 21.08s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658, 1.1264400164286295, 1.0465689639250437, 1.028343033293883, 0.9858526443441709, 0.9138036211331685, 0.9251658628384273, 0.8946281830469768, 0.8715616991122563]
Dev loss: 0.8634103685617447




Epoch:  16%|█▌        | 16/100 [05:37<29:31, 21.09s/it]
Epoch:  17%|█▋        | 17/100 [05:58<29:00, 20.97s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658, 1.1264400164286295, 1.0465689639250437, 1.028343033293883, 0.9858526443441709, 0.9138036211331685, 0.9251658628384273, 0.8946281830469768, 0.8715616991122563, 0.8634103685617447]
Dev loss: 0.8800227627158165


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658, 1.1264400164286295, 1.0465689639250437, 1.028343033293883, 0.9858526443441709, 0.9138036211331685, 0.9251658628384273, 0.8946281830469768, 0.8715616991122563, 0.8634103685617447, 0.8800227627158165]
Dev loss: 0.8408646027247111




Epoch:  18%|█▊        | 18/100 [06:19<28:44, 21.03s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658, 1.1264400164286295, 1.0465689639250437, 1.028343033293883, 0.9858526443441709, 0.9138036211331685, 0.9251658628384273, 0.8946281830469768, 0.8715616991122563, 0.8634103685617447, 0.8800227627158165, 0.8408646027247111]
Dev loss: 0.8400544270873069




Epoch:  19%|█▉        | 19/100 [06:40<28:26, 21.06s/it]
Epoch:  20%|██        | 20/100 [07:01<27:56, 20.95s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658, 1.1264400164286295, 1.0465689639250437, 1.028343033293883, 0.9858526443441709, 0.9138036211331685, 0.9251658628384273, 0.8946281830469768, 0.8715616991122563, 0.8634103685617447, 0.8800227627158165, 0.8408646027247111, 0.8400544270873069]
Dev loss: 0.8426856637001038


Epoch:  21%|██        | 21/100 [07:21<27:29, 20.87s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658, 1.1264400164286295, 1.0465689639250437, 1.028343033293883, 0.9858526443441709, 0.9138036211331685, 0.9251658628384273, 0.8946281830469768, 0.8715616991122563, 0.8634103685617447, 0.8800227627158165, 0.8408646027247111, 0.8400544270873069, 0.8426856637001038]
Dev loss: 0.850837238629659


Epoch:  22%|██▏       | 22/100 [07:42<27:04, 20.82s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658, 1.1264400164286295, 1.0465689639250437, 1.028343033293883, 0.9858526443441709, 0.9138036211331685, 0.9251658628384273, 0.8946281830469768, 0.8715616991122563, 0.8634103685617447, 0.8800227627158165, 0.8408646027247111, 0.8400544270873069, 0.8426856637001038, 0.850837238629659]
Dev loss: 0.8405556430419286


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658, 1.1264400164286295, 1.0465689639250437, 1.028343033293883, 0.9858526443441709, 0.9138036211331685, 0.9251658628384273, 0.8946281830469768, 0.8715616991122563, 0.8634103685617447, 0.8800227627158165, 0.8408646027247111, 0.8400544270873069, 0.8426856637001038, 0.850837238629659, 0.8405556430419286]
Dev loss: 0.838591180741787




Epoch:  23%|██▎       | 23/100 [08:03<26:50, 20.92s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658, 1.1264400164286295, 1.0465689639250437, 1.028343033293883, 0.9858526443441709, 0.9138036211331685, 0.9251658628384273, 0.8946281830469768, 0.8715616991122563, 0.8634103685617447, 0.8800227627158165, 0.8408646027247111, 0.8400544270873069, 0.8426856637001038, 0.850837238629659, 0.8405556430419286, 0.838591180741787]
Dev loss: 0.8242155452569325




Epoch:  24%|██▍       | 24/100 [08:24<26:35, 20.99s/it]
Epoch:  25%|██▌       | 25/100 [08:45<26:07, 20.90s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658, 1.1264400164286295, 1.0465689639250437, 1.028343033293883, 0.9858526443441709, 0.9138036211331685, 0.9251658628384273, 0.8946281830469768, 0.8715616991122563, 0.8634103685617447, 0.8800227627158165, 0.8408646027247111, 0.8400544270873069, 0.8426856637001038, 0.850837238629659, 0.8405556430419286, 0.838591180741787, 0.8242155452569325]
Dev loss: 0.843588254849116


Epoch:  26%|██▌       | 26/100 [09:06<25:42, 20.84s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658, 1.1264400164286295, 1.0465689639250437, 1.028343033293883, 0.9858526443441709, 0.9138036211331685, 0.9251658628384273, 0.8946281830469768, 0.8715616991122563, 0.8634103685617447, 0.8800227627158165, 0.8408646027247111, 0.8400544270873069, 0.8426856637001038, 0.850837238629659, 0.8405556430419286, 0.838591180741787, 0.8242155452569325, 0.843588254849116]
Dev loss: 0.8570955107609431


Epoch:  27%|██▋       | 27/100 [09:27<25:18, 20.80s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658, 1.1264400164286295, 1.0465689639250437, 1.028343033293883, 0.9858526443441709, 0.9138036211331685, 0.9251658628384273, 0.8946281830469768, 0.8715616991122563, 0.8634103685617447, 0.8800227627158165, 0.8408646027247111, 0.8400544270873069, 0.8426856637001038, 0.850837238629659, 0.8405556430419286, 0.838591180741787, 0.8242155452569325, 0.843588254849116, 0.8570955107609431]
Dev loss: 0.8476788947979609


Epoch:  28%|██▊       | 28/100 [09:47<24:55, 20.77s/it]


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658, 1.1264400164286295, 1.0465689639250437, 1.028343033293883, 0.9858526443441709, 0.9138036211331685, 0.9251658628384273, 0.8946281830469768, 0.8715616991122563, 0.8634103685617447, 0.8800227627158165, 0.8408646027247111, 0.8400544270873069, 0.8426856637001038, 0.850837238629659, 0.8405556430419286, 0.838591180741787, 0.8242155452569325, 0.843588254849116, 0.8570955107609431, 0.8476788947979609]
Dev loss: 0.843557799855868


Loss history: [3.2492987155914306, 2.7014334201812744, 2.248048802216848, 1.8916106621424358, 1.6385515371958415, 1.3959590633710226, 1.2262824515501658, 1.1264400164286295, 1.0465689639250437, 1.028343033293883, 0.9858526443441709, 0.9138036211331685, 0.9251658628384273, 0.8946281830469768, 0.8715616991122563, 0.8634103685617447, 0.8800227627158165, 0.8408646027247111, 0.8400544270873069, 0.8426856637001038, 0.850837238629659, 0.8405556430419286, 0.838591180741787, 0.8242155452569325, 0.843588254849116, 0.8570955107609431, 0.8476788947979609, 0.843557799855868]
Dev loss: 0.8450007796287536
No improvement on development set. Finish training.
Loading model from /tmp/model.bin


I0406 18:33:59.636682 140397457205056 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0406 18:33:59.638311 140397457205056 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd

I0406 18:34:13.588021 140397457205056 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0406 18:34:13.589633 140397457205056 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd

Loss history: []
Dev loss: 3.2670257647832233





Epoch:   1%|          | 1/100 [00:21<34:44, 21.06s/it]


Loss history: [3.2670257647832233]
Dev loss: 2.903889854749044





Epoch:   2%|▏         | 2/100 [00:42<34:24, 21.07s/it]


Loss history: [3.2670257647832233, 2.903889854749044]
Dev loss: 2.5840357065200807





Epoch:   3%|▎         | 3/100 [01:03<34:05, 21.08s/it]


Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807]
Dev loss: 2.2129236300786337





Epoch:   4%|▍         | 4/100 [01:24<33:45, 21.10s/it]


Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337]
Dev loss: 1.8371737043062846





Epoch:   5%|▌         | 5/100 [01:45<33:25, 21.11s/it]


Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846]
Dev loss: 1.5553256710370382





Epoch:   6%|▌         | 6/100 [02:06<33:05, 21.12s/it]

Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382]
Dev loss: 1.3585904439290364





Epoch:   7%|▋         | 7/100 [02:27<32:45, 21.13s/it]

Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364]
Dev loss: 1.2083387891451518





Epoch:   8%|▊         | 8/100 [02:48<32:24, 21.14s/it]

Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518]
Dev loss: 1.130923291047414





Epoch:   9%|▉         | 9/100 [03:10<32:04, 21.15s/it]

Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414]
Dev loss: 1.06386499106884





Epoch:  10%|█         | 10/100 [03:31<31:43, 21.15s/it]

Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884]
Dev loss: 1.0118305146694184





Epoch:  11%|█         | 11/100 [03:52<31:23, 21.16s/it]

Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184]
Dev loss: 0.9572323580582937





Epoch:  12%|█▏        | 12/100 [04:13<31:02, 21.16s/it]

Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184, 0.9572323580582937]
Dev loss: 0.9083395431439082





Epoch:  13%|█▎        | 13/100 [04:34<30:40, 21.16s/it]

Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184, 0.9572323580582937, 0.9083395431439082]
Dev loss: 0.8725145826737086





Epoch:  14%|█▍        | 14/100 [04:55<30:19, 21.16s/it]

Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184, 0.9572323580582937, 0.9083395431439082, 0.8725145826737086]
Dev loss: 0.863960995276769





Epoch:  15%|█▌        | 15/100 [05:17<29:57, 21.15s/it]

Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184, 0.9572323580582937, 0.9083395431439082, 0.8725145826737086, 0.863960995276769]
Dev loss: 0.8496012091636658





Epoch:  16%|█▌        | 16/100 [05:38<29:36, 21.15s/it]

Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184, 0.9572323580582937, 0.9083395431439082, 0.8725145826737086, 0.863960995276769, 0.8496012091636658]
Dev loss: 0.8438244213660558





Epoch:  17%|█▋        | 17/100 [05:59<29:15, 21.15s/it]

Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184, 0.9572323580582937, 0.9083395431439082, 0.8725145826737086, 0.863960995276769, 0.8496012091636658, 0.8438244213660558]
Dev loss: 0.8395298838615417





Epoch:  18%|█▊        | 18/100 [06:20<28:54, 21.15s/it]

Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184, 0.9572323580582937, 0.9083395431439082, 0.8725145826737086, 0.863960995276769, 0.8496012091636658, 0.8438244213660558, 0.8395298838615417]
Dev loss: 0.8296309490998586





Epoch:  19%|█▉        | 19/100 [06:41<28:33, 21.15s/it]
Epoch:  20%|██        | 20/100 [07:02<28:01, 21.02s/it]


Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184, 0.9572323580582937, 0.9083395431439082, 0.8725145826737086, 0.863960995276769, 0.8496012091636658, 0.8438244213660558, 0.8395298838615417, 0.8296309490998586]
Dev loss: 0.8324170902371406


Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184, 0.9572323580582937, 0.9083395431439082, 0.8725145826737086, 0.863960995276769, 0.8496012091636658, 0.8438244213660558, 0.8395298838615417, 0.8296309490998586, 0.8324170902371406]
Dev loss: 0.807247390349706





Epoch:  21%|██        | 21/100 [07:23<27:43, 21.06s/it]

Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184, 0.9572323580582937, 0.9083395431439082, 0.8725145826737086, 0.863960995276769, 0.8496012091636658, 0.8438244213660558, 0.8395298838615417, 0.8296309490998586, 0.8324170902371406, 0.807247390349706]
Dev loss: 0.7803093115488688





Epoch:  22%|██▏       | 22/100 [07:44<27:24, 21.09s/it]

Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184, 0.9572323580582937, 0.9083395431439082, 0.8725145826737086, 0.863960995276769, 0.8496012091636658, 0.8438244213660558, 0.8395298838615417, 0.8296309490998586, 0.8324170902371406, 0.807247390349706, 0.7803093115488688]
Dev loss: 0.7722235937913259





Epoch:  23%|██▎       | 23/100 [08:05<27:05, 21.11s/it]

Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184, 0.9572323580582937, 0.9083395431439082, 0.8725145826737086, 0.863960995276769, 0.8496012091636658, 0.8438244213660558, 0.8395298838615417, 0.8296309490998586, 0.8324170902371406, 0.807247390349706, 0.7803093115488688, 0.7722235937913259]
Dev loss: 0.7673863112926483





Epoch:  24%|██▍       | 24/100 [08:27<26:45, 21.12s/it]
Epoch:  25%|██▌       | 25/100 [08:47<26:14, 20.99s/it]


Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184, 0.9572323580582937, 0.9083395431439082, 0.8725145826737086, 0.863960995276769, 0.8496012091636658, 0.8438244213660558, 0.8395298838615417, 0.8296309490998586, 0.8324170902371406, 0.807247390349706, 0.7803093115488688, 0.7722235937913259, 0.7673863112926483]
Dev loss: 0.7769743020335833


Epoch:  26%|██▌       | 26/100 [09:08<25:46, 20.90s/it]


Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184, 0.9572323580582937, 0.9083395431439082, 0.8725145826737086, 0.863960995276769, 0.8496012091636658, 0.8438244213660558, 0.8395298838615417, 0.8296309490998586, 0.8324170902371406, 0.807247390349706, 0.7803093115488688, 0.7722235937913259, 0.7673863112926483, 0.7769743020335833]
Dev loss: 0.7831111192703247


Epoch:  27%|██▋       | 27/100 [09:29<25:21, 20.84s/it]


Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184, 0.9572323580582937, 0.9083395431439082, 0.8725145826737086, 0.863960995276769, 0.8496012091636658, 0.8438244213660558, 0.8395298838615417, 0.8296309490998586, 0.8324170902371406, 0.807247390349706, 0.7803093115488688, 0.7722235937913259, 0.7673863112926483, 0.7769743020335833, 0.7831111192703247]
Dev loss: 0.7778826514879863


Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184, 0.9572323580582937, 0.9083395431439082, 0.8725145826737086, 0.863960995276769, 0.8496012091636658, 0.8438244213660558, 0.8395298838615417, 0.8296309490998586, 0.8324170902371406, 0.807247390349706, 0.7803093115488688, 0.7722235937913259, 0.7673863112926483, 0.7769743020335833, 0.7831111192703247, 0.7778826514879863]
Dev loss: 0.7511599664886792





Epoch:  28%|██▊       | 28/100 [09:50<25:07, 20.94s/it]
Epoch:  29%|██▉       | 29/100 [10:10<24:41, 20.86s/it]


Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184, 0.9572323580582937, 0.9083395431439082, 0.8725145826737086, 0.863960995276769, 0.8496012091636658, 0.8438244213660558, 0.8395298838615417, 0.8296309490998586, 0.8324170902371406, 0.807247390349706, 0.7803093115488688, 0.7722235937913259, 0.7673863112926483, 0.7769743020335833, 0.7831111192703247, 0.7778826514879863, 0.7511599664886792]
Dev loss: 0.7967015504837036


Epoch:  30%|███       | 30/100 [10:31<24:16, 20.81s/it]


Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184, 0.9572323580582937, 0.9083395431439082, 0.8725145826737086, 0.863960995276769, 0.8496012091636658, 0.8438244213660558, 0.8395298838615417, 0.8296309490998586, 0.8324170902371406, 0.807247390349706, 0.7803093115488688, 0.7722235937913259, 0.7673863112926483, 0.7769743020335833, 0.7831111192703247, 0.7778826514879863, 0.7511599664886792, 0.7967015504837036]
Dev loss: 0.7773870507876078


Epoch:  31%|███       | 31/100 [10:52<23:53, 20.78s/it]


Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184, 0.9572323580582937, 0.9083395431439082, 0.8725145826737086, 0.863960995276769, 0.8496012091636658, 0.8438244213660558, 0.8395298838615417, 0.8296309490998586, 0.8324170902371406, 0.807247390349706, 0.7803093115488688, 0.7722235937913259, 0.7673863112926483, 0.7769743020335833, 0.7831111192703247, 0.7778826514879863, 0.7511599664886792, 0.7967015504837036, 0.7773870507876078]
Dev loss: 0.7894340097904206


Epoch:  32%|███▏      | 32/100 [11:13<23:31, 20.75s/it]


Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184, 0.9572323580582937, 0.9083395431439082, 0.8725145826737086, 0.863960995276769, 0.8496012091636658, 0.8438244213660558, 0.8395298838615417, 0.8296309490998586, 0.8324170902371406, 0.807247390349706, 0.7803093115488688, 0.7722235937913259, 0.7673863112926483, 0.7769743020335833, 0.7831111192703247, 0.7778826514879863, 0.7511599664886792, 0.7967015504837036, 0.7773870507876078, 0.7894340097904206]
Dev loss: 0.7840907235940298


Loss history: [3.2670257647832233, 2.903889854749044, 2.5840357065200807, 2.2129236300786337, 1.8371737043062846, 1.5553256710370382, 1.3585904439290364, 1.2083387891451518, 1.130923291047414, 1.06386499106884, 1.0118305146694184, 0.9572323580582937, 0.9083395431439082, 0.8725145826737086, 0.863960995276769, 0.8496012091636658, 0.8438244213660558, 0.8395298838615417, 0.8296309490998586, 0.8324170902371406, 0.807247390349706, 0.7803093115488688, 0.7722235937913259, 0.7673863112926483, 0.7769743020335833, 0.7831111192703247, 0.7778826514879863, 0.7511599664886792, 0.7967015504837036, 0.7773870507876078, 0.7894340097904206, 0.7840907235940298]
Dev loss: 0.7799115518728892
No improvement on development set. Finish training.
Loading model from /tmp/model.bin


I0406 18:45:50.138985 140397457205056 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0406 18:45:50.140305 140397457205056 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd

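
The training log ends with `No improvement on development set. Finish training.` after the dev loss stayed above its best value (about 0.751, reached in the 28th evaluation) for five evaluations in a row. The training loop itself is not shown in this section, so the following is only a minimal sketch of patience-based early stopping that reproduces this stopping point. The function name `replay_early_stopping` and the `patience` parameter are hypothetical, the value 5 is inferred from the log rather than read from the notebook's settings, and the dev losses are rounded to three decimals.

```python
def replay_early_stopping(dev_losses, patience=5):
    """Replay per-epoch dev losses and report where patience-based early stopping halts.

    Returns (stop_epoch, best_epoch, loss_history) with 1-based epoch numbers.
    `patience` is an assumed setting, not a value taken from the notebook.
    """
    loss_history = []
    best_epoch = None
    epochs_without_improvement = 0

    for epoch, dev_loss in enumerate(dev_losses, start=1):
        if not loss_history or dev_loss < min(loss_history):
            # New best dev loss: a real training loop would save a checkpoint here.
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1

        loss_history.append(dev_loss)

        if epochs_without_improvement >= patience:
            return epoch, best_epoch, loss_history

    # Ran out of epochs without triggering early stopping.
    return len(loss_history), best_epoch, loss_history


# Dev losses from the log above, rounded to three decimals.
DEV_LOSSES = [
    3.267, 2.904, 2.584, 2.213, 1.837, 1.555, 1.359, 1.208, 1.131, 1.064,
    1.012, 0.957, 0.908, 0.873, 0.864, 0.850, 0.844, 0.840, 0.830, 0.832,
    0.807, 0.780, 0.772, 0.767, 0.777, 0.783, 0.778, 0.751, 0.797, 0.777,
    0.789, 0.784, 0.780,
]

stop_epoch, best_epoch, history = replay_early_stopping(DEV_LOSSES, patience=5)
print(stop_epoch, best_epoch)  # 33 28
```

With these inputs the sketch stops in epoch 33 and reports epoch 28 as the best one, which agrees with the 32 entries in the final `Loss history` line plus the last `Dev loss` line.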




## Evaluation

In [4]:
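from sklearn.metrics import precision_recall_fscore_support, classification_report

# Micro-averaged precision, recall and F1 over all labels
# (in this single-label setup all three equal the accuracy).
print("Test performance:", precision_recall_fscore_support(all_correct, all_predicted, average="micro"))
# Per-label precision, recall, F1 and support.
print(classification_report(all_correct, all_predicted, target_names=target_names))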

Test performance: (0.8082595870206489, 0.8082595870206489, 0.8082595870206488, None)
                                 precision    recall  f1-score   support

     eatingmeat4_so_Feedback_11       0.61      0.69      0.65        16
      eatingmeat4_so_Feedback_1       0.50      0.67      0.57        12
      eatingmeat4_so_Feedback_2       0.86      0.95      0.90       148
      eatingmeat4_so_Feedback_3       0.36      0.25      0.30        16
      eatingmeat4_so_Feedback_4       0.33      0.10      0.15        10
      eatingmeat4_so_Feedback_5       0.60      0.38      0.46        16
      eatingmeat4_so_Feedback_6       1.00      0.30      0.46        10
      eatingmeat4_so_Feedback_7       0.92      0.92      0.92        24
      eatingmeat4_so_Feedback_8       0.82      0.48      0.61        48
      eatingmeat4_so_Feedback_9       0.85      0.89      0.87       309
     eatingmeat4_so_Feedback_10       0.86      0.90      0.88       241
 eatingmeat4_because_Feedback_1       

  _warn_prf(average, modifier, msg_start, len(result))
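
The three matching values in `Test performance` are expected. With exactly one gold label and one prediction per item, every error counts once as a false positive (for the predicted label) and once as a false negative (for the gold label), so micro-averaged precision, recall and F1 all reduce to accuracy. The sketch below checks this in plain Python on invented toy labels; `micro_prf` is a hypothetical helper written for illustration and is not scikit-learn's implementation.

```python
from collections import Counter


def micro_prf(gold, predicted):
    """Micro-averaged precision, recall and F1 for single-label classification."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # the predicted label gains a false positive
            fn[g] += 1  # the gold label gains a false negative

    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())

    # Guard against empty input to avoid division by zero.
    precision = total_tp / (total_tp + total_fp) if total_tp + total_fp else 0.0
    recall = total_tp / (total_tp + total_fn) if total_tp + total_fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy data (invented for this example): 4 of 6 predictions are correct.
gold = [0, 0, 1, 2, 2, 2]
predicted = [0, 1, 1, 2, 2, 0]

precision, recall, f1 = micro_prf(gold, predicted)
accuracy = sum(g == p for g, p in zip(gold, predicted)) / len(gold)
print(precision, recall, f1, accuracy)
```

F1 can differ from the other two values in the last digit because of floating-point rounding, as it does in the `Test performance` output above.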


In [5]:
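# Print every test item as text#gold label#predicted label and count the correct predictions.
c = 0
for item, predicted, correct in zip(all_test_data, all_predicted, all_correct):
    # Sanity check: the gold labels must be aligned with the test items.
    assert item.label_ids == correct
    c += (item.label_ids == predicted)
    print("{}#{}#{}".format(item.text, idx2label[correct], idx2label[predicted]))

print()
# Accuracy on the test set.
print(c, "/", len(all_test_data), "=", c/len(all_test_data))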

Methane from cow burps harms the environment, so either people have to stop eating meat and use plant based "meats" that don't harm the environment as much#eatingmeat4_so_Feedback_9#eatingmeat4_so_Feedback_9
Methane from cow burps harms the environment, but they only produce about 14.5% of greenhouse gas emissions.#eatingmeat4_but_Feedback_1#eatingmeat4_but_Feedback_1
Methane from cow burps harms the environment, but impossible Foods have created plant based meat that have the same qualities as real meat.#eatingmeat4_but_Feedback_8#eatingmeat4_but_Feedback_5
Methane from cow burps harms the environment, but when cows eat seaweed as part of their diet, the methane content in their burps decreases by 99%.#eatingmeat4_but_Feedback_13#eatingmeat4_but_Feedback_13
Methane from cow burps harms the environment because it exacerbates the process of global warming by increasing Earth's temperature.#eatingmeat4_because_Feedback_1#eatingmeat4_because_Feedback_1
Methane from cow burps harms the env