# First BERT Experiments

In this notebook we run our first experiments with BERT: we fine-tune a BERT model with a classifier on top on each of our datasets separately and compute the accuracy of the resulting classifier on the test data.

For these experiments we use the `pytorch_transformers` package. It provides a variety of neural network architectures and pretrained models for transfer learning, including BERT and XLNet.

Two different BERT models are relevant for our experiments: 

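- `bert-base-uncased`: a relatively small BERT model (12 layers, about 110M parameters) that should already give reasonable results,
- `bert-large-uncased`: a larger model (24 layers, about 340M parameters) that usually gives stronger results at a higher memory and compute cost.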

In [1]:
BERT_MODEL = 'bert-base-uncased'
BATCH_SIZE = 16 if "base" in BERT_MODEL else 2
GRADIENT_ACCUMULATION_STEPS = 1 if "base" in BERT_MODEL else 8
MAX_SEQ_LENGTH = 100
PREFIXES = ["junkfood_because", "junkfood_but"]
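The batch size and the number of gradient accumulation steps in the cell above are chosen together. A minimal sketch of the arithmetic, assuming `train` steps the optimizer once every `GRADIENT_ACCUMULATION_STEPS` batches (the notebook does not show this); the helper names `settings_for` and `effective_batch_size` are hypothetical:

```python
def settings_for(model_name: str) -> tuple:
    """Mirror the two conditional assignments from the configuration cell."""
    batch_size = 16 if "base" in model_name else 2
    accumulation_steps = 1 if "base" in model_name else 8
    return batch_size, accumulation_steps


def effective_batch_size(batch_size: int, accumulation_steps: int) -> int:
    """Number of examples that contribute to one optimizer step."""
    return batch_size * accumulation_steps


for name in ("bert-base-uncased", "bert-large-uncased"):
    batch_size, steps = settings_for(name)
    print(name, batch_size, steps, effective_batch_size(batch_size, steps))
```

Under that assumption both models see 16 examples per optimizer step; the large model simply processes them in smaller batches, which lowers peak memory use.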

## Data

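We use the same data as in all our previous experiments. Here we load the data for every prompt prefix listed in `PREFIXES` and build the label vocabulary; the split into training, development and test data happens in the training loop below.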

In [2]:
import sys
sys.path.append('../')  # add the parent directory to the import path

import ndjson
import glob
import numpy as np

from quillnlp.models.bert.preprocessing import preprocess, create_label_vocabulary

# Load the examples for every prompt prefix into one list.
data = []
for prefix in PREFIXES:
    data_file = f"../data/interim/{prefix}_withprompt.ndjson"

    with open(data_file) as f:
        data += ndjson.load(f)

# Map every label to an integer index and back.
label2idx = create_label_vocabulary(data)
idx2label = {v: k for k, v in label2idx.items()}
target_names = [idx2label[s] for s in range(len(idx2label))]

# Preprocess the examples for BERT_MODEL; a NumPy array lets us select folds by index.
data_items = preprocess(data, BERT_MODEL, label2idx, MAX_SEQ_LENGTH)
data_items = np.array(data_items)

I0327 14:11:32.511290 139807978702656 file_utils.py:41] PyTorch version 1.2.0+cu92 available.
I0327 14:11:33.492857 139807978702656 file_utils.py:57] TensorFlow version 2.1.0 available.
I0327 14:11:34.133068 139807978702656 tokenization_utils.py:501] loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /home/yves/.cache/torch/transformers/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
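The implementation of `create_label_vocabulary` is not shown in this notebook. As a rough illustration of the idea only, the sketch below builds a label-to-index mapping from newline-delimited JSON records. The records, the field names `text` and `label`, and the helper `build_label_vocabulary` are all made up for this example and may differ from what `quillnlp` does:

```python
import json

# Two made-up records in ndjson form (one JSON object per line).
ndjson_text = "\n".join([
    json.dumps({"text": "example response one", "label": "label_b"}),
    json.dumps({"text": "example response two", "label": "label_a"}),
])


def build_label_vocabulary(records, label_key="label"):
    """Assign a stable integer index to every distinct label (hypothetical helper)."""
    labels = sorted({record[label_key] for record in records})
    return {label: idx for idx, label in enumerate(labels)}


records = [json.loads(line) for line in ndjson_text.splitlines() if line.strip()]
label2idx = build_label_vocabulary(records)
idx2label = {idx: label for label, idx in label2idx.items()}
target_names = [idx2label[i] for i in range(len(idx2label))]
print(label2idx, target_names)
```

Sorting the labels before numbering them keeps the mapping identical across runs, which matters when a saved model is reloaded later.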


## Training

In [3]:
import torch
from sklearn.model_selection import KFold

from quillnlp.models.bert.models import get_bert_classifier
from quillnlp.models.bert.preprocessing import get_data_loader
from quillnlp.models.bert.train import train, evaluate

# 5-fold cross-validation: every item is used as test data exactly once.
kf = KFold(n_splits=5, shuffle=True, random_state=1)
all_correct, all_predicted = [], []
all_test_data = []
for train_idx, test_idx in kf.split(data_items):

    # Split the non-test items 3:1 into training and development data.
    train_and_dev_data = data_items[train_idx]
    cutoff = int(len(train_and_dev_data)/4*3)

    train_data = train_and_dev_data[:cutoff]
    dev_data = train_and_dev_data[cutoff:]
    test_data = data_items[test_idx]

    train_dataloader = get_data_loader(train_data, BATCH_SIZE)
    dev_dataloader = get_data_loader(dev_data, BATCH_SIZE)
    test_dataloader = get_data_loader(test_data, BATCH_SIZE, shuffle=False)

    # Train a fresh model for this fold, on the GPU if one is available.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = get_bert_classifier(BERT_MODEL, len(label2idx), device=device)
    output_model_file = train(model, train_dataloader, dev_dataloader, BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, device)

    # Reload the saved model and evaluate it on the test fold on the CPU.
    print("Loading model from", output_model_file)
    device = "cpu"

    model = get_bert_classifier(BERT_MODEL, len(label2idx), model_file=output_model_file, device=device)
    model.eval()

    # Pool gold labels, predictions and test items across all folds.
    _, _, test_correct, test_predicted = evaluate(model, test_dataloader, device)
    all_correct.extend(test_correct)
    all_predicted.extend(test_predicted)
    all_test_data.extend(test_data)
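The loop above uses one fifth of the items as test data in each fold and splits the rest 3:1 into training and development data. A small sketch of the resulting sizes, assuming a hypothetical dataset of 1000 items (the real dataset size is not stated here); `split_sizes` is a helper introduced only for illustration:

```python
def split_sizes(n_items: int, n_splits: int = 5) -> tuple:
    """Return approximate (train, dev, test) sizes for one fold.

    Ignores the remainder: when n_items is not divisible by n_splits,
    scikit-learn's KFold makes the first few test folds one item larger.
    """
    test_size = n_items // n_splits
    train_and_dev = n_items - test_size
    # Same expression as in the training loop: first 75% for training.
    cutoff = int(train_and_dev / 4 * 3)
    return cutoff, train_and_dev - cutoff, test_size


print(split_sizes(1000))  # (600, 200, 200)
```

In other words, each fold trains on roughly 60% of the data, tunes on 20% and tests on the remaining 20%.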


I0327 14:11:34.960264 139807978702656 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0327 14:11:34.962032 139807978702656 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd

Fold 1, dev loss after each epoch:
Epoch  1/20: 1.9920349319775899
Epoch  2/20: 1.5656807323296864
Epoch  3/20: 1.3153026004632313
Epoch  4/20: 1.0038234045108159
Epoch  5/20: 0.925411989291509
Epoch  6/20: 0.9044687648614248
Epoch  7/20: 0.7520791093508402
Epoch  8/20: 0.761494554579258
Epoch  9/20: 0.7450894365708033
Epoch 10/20: 0.7583351507782936
Epoch 11/20: 0.6547142465909322
Epoch 12/20: 0.6410361106197039
Epoch 13/20: 0.6676951063175997
Epoch 14/20: 0.6806474030017853
Epoch 15/20: 0.7074352788428465
Epoch 16/20: 0.671694427728653
Epoch 17/20: 0.6254908541838328
Epoch 18/20: 0.7344433491428694
Epoch 19/20: 0.6258217816551527
Epoch 20/20: 0.77571140229702

Epoch: 100%|██████████| 20/20 [02:42<00:00,  7.93s/it]
Loading model from /tmp/model.bin
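One way to read the dev-loss log above is to look for the epoch with the lowest dev loss. A minimal sketch using the 20 values logged for the first fold; whether `train` saves the checkpoint from that particular epoch is not shown in this notebook, so treat the result as an observation about the log rather than about the saved model:

```python
# Dev loss after each of the 20 epochs of the first fold, copied from the log.
dev_losses = [
    1.9920349319775899, 1.5656807323296864, 1.3153026004632313,
    1.0038234045108159, 0.925411989291509, 0.9044687648614248,
    0.7520791093508402, 0.761494554579258, 0.7450894365708033,
    0.7583351507782936, 0.6547142465909322, 0.6410361106197039,
    0.6676951063175997, 0.6806474030017853, 0.7074352788428465,
    0.671694427728653, 0.6254908541838328, 0.7344433491428694,
    0.6258217816551527, 0.77571140229702,
]

# Pair every loss with its 1-based epoch number and pick the smallest loss.
best_epoch, best_loss = min(enumerate(dev_losses, start=1), key=lambda pair: pair[1])
print(best_epoch, best_loss)  # 17 0.6254908541838328
```

The dev loss falls quickly during the first seven epochs and then fluctuates between roughly 0.63 and 0.78, so the last epoch is not the best one for this fold.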



Fold 2, dev loss after each epoch:
Epoch  1/20: 2.0257370869318643
Epoch  2/20: 1.3874606589476268
Epoch  3/20: 1.0838301380475361
Epoch  4/20: 0.9024003048737844
Epoch  5/20: 0.8600484977165858
Epoch  6/20: 0.8033195634682974
Epoch  7/20: 0.7161262954274813
Epoch  8/20: 0.7109564070900282
Epoch  9/20: 0.6807349051038424
Epoch 10/20: 0.6867677743236223
Epoch 11/20: 0.635239544014136
Epoch 12/20: 0.6550467883547147
Epoch 13/20: 0.6293700411915779
Epoch 14/20: 0.6471706889569759
Epoch 15/20: 0.7189600070317587
Epoch 16/20: 0.6632083666821321
Epoch 17/20: 0.689797893166542
Epoch 18/20: 0.6144026766220728
Epoch 19/20: 0.6226236335933208
Epoch 20/20: 0.6439148870607218

Epoch: 100%|██████████| 20/20 [02:42<00:00,  7.95s/it]
Loading model from /tmp/model.bin



HBox(children=(IntProgress(value=0, description='Training iteration', max=34, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=12, style=ProgressStyle(descriptio…


Loss history: []
Dev loss: 2.304175913333893


Epoch:   5%|▌         | 1/20 [00:08<02:37,  8.29s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=34, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=12, style=ProgressStyle(descriptio…


Loss history: [2.304175913333893]
Dev loss: 1.6379866898059845


Epoch:  10%|█         | 2/20 [00:16<02:29,  8.29s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=34, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=12, style=ProgressStyle(descriptio…


Loss history: [2.304175913333893, 1.6379866898059845]
Dev loss: 1.2700283229351044


Epoch:  15%|█▌        | 3/20 [00:24<02:20,  8.29s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=34, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=12, style=ProgressStyle(descriptio…


Loss history: [2.304175913333893, 1.6379866898059845, 1.2700283229351044]
Dev loss: 0.9931914011637369


Epoch:  20%|██        | 4/20 [00:33<02:12,  8.29s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=34, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=12, style=ProgressStyle(descriptio…


Loss history: [2.304175913333893, 1.6379866898059845, 1.2700283229351044, 0.9931914011637369]
Dev loss: 0.9007962346076965


Epoch:  25%|██▌       | 5/20 [00:41<02:04,  8.29s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=34, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=12, style=ProgressStyle(descriptio…


Loss history: [2.304175913333893, 1.6379866898059845, 1.2700283229351044, 0.9931914011637369, 0.9007962346076965]
Dev loss: 0.794106254975001


Epoch:  30%|███       | 6/20 [00:49<01:56,  8.29s/it]



Loss history: [2.304175913333893, 1.6379866898059845, 1.2700283229351044, 0.9931914011637369, 0.9007962346076965, 0.794106254975001]
Dev loss: 0.7710083400209745


Epoch:  35%|███▌      | 7/20 [00:58<01:47,  8.29s/it]



Loss history: [2.304175913333893, 1.6379866898059845, 1.2700283229351044, 0.9931914011637369, 0.9007962346076965, 0.794106254975001, 0.7710083400209745]
Dev loss: 0.68764000137647


Epoch:  40%|████      | 8/20 [01:06<01:39,  8.29s/it]



Loss history: [2.304175913333893, 1.6379866898059845, 1.2700283229351044, 0.9931914011637369, 0.9007962346076965, 0.794106254975001, 0.7710083400209745, 0.68764000137647]
Dev loss: 0.6267144319911798


Epoch:  45%|████▌     | 9/20 [01:14<01:31,  8.29s/it]


Epoch:  50%|█████     | 10/20 [01:22<01:21,  8.16s/it]


Loss history: [2.304175913333893, 1.6379866898059845, 1.2700283229351044, 0.9931914011637369, 0.9007962346076965, 0.794106254975001, 0.7710083400209745, 0.68764000137647, 0.6267144319911798]
Dev loss: 0.6494924699266752



Epoch:  55%|█████▌    | 11/20 [01:30<01:12,  8.07s/it]


Loss history: [2.304175913333893, 1.6379866898059845, 1.2700283229351044, 0.9931914011637369, 0.9007962346076965, 0.794106254975001, 0.7710083400209745, 0.68764000137647, 0.6267144319911798, 0.6494924699266752]
Dev loss: 0.6492921610673269



Epoch:  60%|██████    | 12/20 [01:38<01:04,  8.01s/it]


Loss history: [2.304175913333893, 1.6379866898059845, 1.2700283229351044, 0.9931914011637369, 0.9007962346076965, 0.794106254975001, 0.7710083400209745, 0.68764000137647, 0.6267144319911798, 0.6494924699266752, 0.6492921610673269]
Dev loss: 0.6371764615178108



Epoch:  65%|██████▌   | 13/20 [01:46<00:55,  7.97s/it]


Loss history: [2.304175913333893, 1.6379866898059845, 1.2700283229351044, 0.9931914011637369, 0.9007962346076965, 0.794106254975001, 0.7710083400209745, 0.68764000137647, 0.6267144319911798, 0.6494924699266752, 0.6492921610673269, 0.6371764615178108]
Dev loss: 0.658463254570961




Loss history: [2.304175913333893, 1.6379866898059845, 1.2700283229351044, 0.9931914011637369, 0.9007962346076965, 0.794106254975001, 0.7710083400209745, 0.68764000137647, 0.6267144319911798, 0.6494924699266752, 0.6492921610673269, 0.6371764615178108, 0.658463254570961]
Dev loss: 0.6821168437600136
No improvement on development set. Finish training.
Loading model from /tmp/model.bin


I0327 14:19:19.929467 139807978702656 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0327 14:19:19.930788 139807978702656 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd



Loss history: []
Dev loss: 1.9860019584496815



Epoch:   5%|▌         | 1/20 [00:08<02:37,  8.30s/it]


Loss history: [1.9860019584496815]
Dev loss: 1.4614042143026988



Epoch:  10%|█         | 2/20 [00:16<02:29,  8.29s/it]


Loss history: [1.9860019584496815, 1.4614042143026988]
Dev loss: 1.1656756053368251



Epoch:  15%|█▌        | 3/20 [00:24<02:20,  8.29s/it]


Loss history: [1.9860019584496815, 1.4614042143026988, 1.1656756053368251]
Dev loss: 0.9249066958824793



Epoch:  20%|██        | 4/20 [00:33<02:12,  8.30s/it]


Loss history: [1.9860019584496815, 1.4614042143026988, 1.1656756053368251, 0.9249066958824793]
Dev loss: 0.85404105981191



Epoch:  25%|██▌       | 5/20 [00:41<02:04,  8.30s/it]


Loss history: [1.9860019584496815, 1.4614042143026988, 1.1656756053368251, 0.9249066958824793, 0.85404105981191]
Dev loss: 0.7640060285727183



Epoch:  30%|███       | 6/20 [00:49<01:56,  8.30s/it]


Loss history: [1.9860019584496815, 1.4614042143026988, 1.1656756053368251, 0.9249066958824793, 0.85404105981191, 0.7640060285727183]
Dev loss: 0.7344999462366104



Epoch:  35%|███▌      | 7/20 [00:58<01:47,  8.30s/it]


Loss history: [1.9860019584496815, 1.4614042143026988, 1.1656756053368251, 0.9249066958824793, 0.85404105981191, 0.7640060285727183, 0.7344999462366104]
Dev loss: 0.6857321386535963



Epoch:  40%|████      | 8/20 [01:06<01:39,  8.30s/it]


Epoch:  45%|████▌     | 9/20 [01:14<01:29,  8.17s/it]


Loss history: [1.9860019584496815, 1.4614042143026988, 1.1656756053368251, 0.9249066958824793, 0.85404105981191, 0.7640060285727183, 0.7344999462366104, 0.6857321386535963]
Dev loss: 0.7599036047856013




Loss history: [1.9860019584496815, 1.4614042143026988, 1.1656756053368251, 0.9249066958824793, 0.85404105981191, 0.7640060285727183, 0.7344999462366104, 0.6857321386535963, 0.7599036047856013]
Dev loss: 0.6695963193972906



Epoch:  50%|█████     | 10/20 [01:22<01:22,  8.21s/it]


Loss history: [1.9860019584496815, 1.4614042143026988, 1.1656756053368251, 0.9249066958824793, 0.85404105981191, 0.7640060285727183, 0.7344999462366104, 0.6857321386535963, 0.7599036047856013, 0.6695963193972906]
Dev loss: 0.6585346683859825



Epoch:  55%|█████▌    | 11/20 [01:30<01:14,  8.23s/it]


Loss history: [1.9860019584496815, 1.4614042143026988, 1.1656756053368251, 0.9249066958824793, 0.85404105981191, 0.7640060285727183, 0.7344999462366104, 0.6857321386535963, 0.7599036047856013, 0.6695963193972906, 0.6585346683859825]
Dev loss: 0.6542998999357224



Epoch:  60%|██████    | 12/20 [01:39<01:06,  8.25s/it]


Loss history: [1.9860019584496815, 1.4614042143026988, 1.1656756053368251, 0.9249066958824793, 0.85404105981191, 0.7640060285727183, 0.7344999462366104, 0.6857321386535963, 0.7599036047856013, 0.6695963193972906, 0.6585346683859825, 0.6542998999357224]
Dev loss: 0.6435454736153284



Epoch:  65%|██████▌   | 13/20 [01:47<00:57,  8.27s/it]


Loss history: [1.9860019584496815, 1.4614042143026988, 1.1656756053368251, 0.9249066958824793, 0.85404105981191, 0.7640060285727183, 0.7344999462366104, 0.6857321386535963, 0.7599036047856013, 0.6695963193972906, 0.6585346683859825, 0.6542998999357224, 0.6435454736153284]
Dev loss: 0.6390317529439926



Epoch:  70%|███████   | 14/20 [01:55<00:49,  8.28s/it]


Loss history: [1.9860019584496815, 1.4614042143026988, 1.1656756053368251, 0.9249066958824793, 0.85404105981191, 0.7640060285727183, 0.7344999462366104, 0.6857321386535963, 0.7599036047856013, 0.6695963193972906, 0.6585346683859825, 0.6542998999357224, 0.6435454736153284, 0.6390317529439926]
Dev loss: 0.6326373666524887



Epoch:  75%|███████▌  | 15/20 [02:04<00:41,  8.29s/it]


Epoch:  80%|████████  | 16/20 [02:11<00:32,  8.16s/it]


Loss history: [1.9860019584496815, 1.4614042143026988, 1.1656756053368251, 0.9249066958824793, 0.85404105981191, 0.7640060285727183, 0.7344999462366104, 0.6857321386535963, 0.7599036047856013, 0.6695963193972906, 0.6585346683859825, 0.6542998999357224, 0.6435454736153284, 0.6390317529439926, 0.6326373666524887]
Dev loss: 0.6949272205432256




Epoch:  85%|████████▌ | 17/20 [02:19<00:24,  8.07s/it]


Loss history: [1.9860019584496815, 1.4614042143026988, 1.1656756053368251, 0.9249066958824793, 0.85404105981191, 0.7640060285727183, 0.7344999462366104, 0.6857321386535963, 0.7599036047856013, 0.6695963193972906, 0.6585346683859825, 0.6542998999357224, 0.6435454736153284, 0.6390317529439926, 0.6326373666524887, 0.6949272205432256]
Dev loss: 0.6756567222376665




Epoch:  90%|█████████ | 18/20 [02:27<00:16,  8.01s/it]


Loss history: [1.9860019584496815, 1.4614042143026988, 1.1656756053368251, 0.9249066958824793, 0.85404105981191, 0.7640060285727183, 0.7344999462366104, 0.6857321386535963, 0.7599036047856013, 0.6695963193972906, 0.6585346683859825, 0.6542998999357224, 0.6435454736153284, 0.6390317529439926, 0.6326373666524887, 0.6949272205432256, 0.6756567222376665]
Dev loss: 0.6450708719591299




Epoch:  95%|█████████▌| 19/20 [02:35<00:07,  7.97s/it]


Loss history: [1.9860019584496815, 1.4614042143026988, 1.1656756053368251, 0.9249066958824793, 0.85404105981191, 0.7640060285727183, 0.7344999462366104, 0.6857321386535963, 0.7599036047856013, 0.6695963193972906, 0.6585346683859825, 0.6542998999357224, 0.6435454736153284, 0.6390317529439926, 0.6326373666524887, 0.6949272205432256, 0.6756567222376665, 0.6450708719591299]
Dev loss: 0.7588294843832651




Loss history: [1.9860019584496815, 1.4614042143026988, 1.1656756053368251, 0.9249066958824793, 0.85404105981191, 0.7640060285727183, 0.7344999462366104, 0.6857321386535963, 0.7599036047856013, 0.6695963193972906, 0.6585346683859825, 0.6542998999357224, 0.6435454736153284, 0.6390317529439926, 0.6326373666524887, 0.6949272205432256, 0.6756567222376665, 0.6450708719591299, 0.7588294843832651]
Dev loss: 0.6329938371976217
No improvement on development set. Finish training.
Loading model from /tmp/model.bin




Loss history: []
Dev loss: 2.0948343674341836




Epoch:   5%|▌         | 1/20 [00:08<02:37,  8.29s/it]


Loss history: [2.0948343674341836]
Dev loss: 1.3785478621721268




Epoch:  10%|█         | 2/20 [00:16<02:29,  8.28s/it]


Loss history: [2.0948343674341836, 1.3785478621721268]
Dev loss: 1.111532673239708




Epoch:  15%|█▌        | 3/20 [00:24<02:20,  8.29s/it]


Loss history: [2.0948343674341836, 1.3785478621721268, 1.111532673239708]
Dev loss: 0.9135365237792333




Epoch:  20%|██        | 4/20 [00:33<02:12,  8.29s/it]


Loss history: [2.0948343674341836, 1.3785478621721268, 1.111532673239708, 0.9135365237792333]
Dev loss: 0.8139717479546865




Epoch:  25%|██▌       | 5/20 [00:41<02:04,  8.30s/it]


Loss history: [2.0948343674341836, 1.3785478621721268, 1.111532673239708, 0.9135365237792333, 0.8139717479546865]
Dev loss: 0.7882746507724127




Epoch:  30%|███       | 6/20 [00:49<01:56,  8.30s/it]


Loss history: [2.0948343674341836, 1.3785478621721268, 1.111532673239708, 0.9135365237792333, 0.8139717479546865, 0.7882746507724127]
Dev loss: 0.760879397392273




Epoch:  35%|███▌      | 7/20 [00:58<01:47,  8.30s/it]


Loss history: [2.0948343674341836, 1.3785478621721268, 1.111532673239708, 0.9135365237792333, 0.8139717479546865, 0.7882746507724127, 0.760879397392273]
Dev loss: 0.7029429202278455




Epoch:  40%|████      | 8/20 [01:06<01:39,  8.30s/it]


Loss history: [2.0948343674341836, 1.3785478621721268, 1.111532673239708, 0.9135365237792333, 0.8139717479546865, 0.7882746507724127, 0.760879397392273, 0.7029429202278455]
Dev loss: 0.6822322110335032




Epoch:  45%|████▌     | 9/20 [01:14<01:30,  8.27s/it]



Epoch:  50%|█████     | 10/20 [01:22<01:21,  8.15s/it]


Loss history: [2.0948343674341836, 1.3785478621721268, 1.111532673239708, 0.9135365237792333, 0.8139717479546865, 0.7882746507724127, 0.760879397392273, 0.7029429202278455, 0.6822322110335032]
Dev loss: 0.69319649040699




Loss history: [2.0948343674341836, 1.3785478621721268, 1.111532673239708, 0.9135365237792333, 0.8139717479546865, 0.7882746507724127, 0.760879397392273, 0.7029429202278455, 0.6822322110335032, 0.69319649040699]
Dev loss: 0.6449295605222384




Epoch:  55%|█████▌    | 11/20 [01:30<01:13,  8.19s/it]


Loss history: [2.0948343674341836, 1.3785478621721268, 1.111532673239708, 0.9135365237792333, 0.8139717479546865, 0.7882746507724127, 0.760879397392273, 0.7029429202278455, 0.6822322110335032, 0.69319649040699, 0.6449295605222384]
Dev loss: 0.6413164685169855




Epoch:  60%|██████    | 12/20 [01:39<01:05,  8.23s/it]



Epoch:  65%|██████▌   | 13/20 [01:46<00:56,  8.12s/it]


Loss history: [2.0948343674341836, 1.3785478621721268, 1.111532673239708, 0.9135365237792333, 0.8139717479546865, 0.7882746507724127, 0.760879397392273, 0.7029429202278455, 0.6822322110335032, 0.69319649040699, 0.6449295605222384, 0.6413164685169855]
Dev loss: 0.6772202675541242




Loss history: [2.0948343674341836, 1.3785478621721268, 1.111532673239708, 0.9135365237792333, 0.8139717479546865, 0.7882746507724127, 0.760879397392273, 0.7029429202278455, 0.6822322110335032, 0.69319649040699, 0.6449295605222384, 0.6413164685169855, 0.6772202675541242]
Dev loss: 0.6400696138540903




Epoch:  70%|███████   | 14/20 [01:55<00:49,  8.18s/it]



Epoch:  75%|███████▌  | 15/20 [02:03<00:40,  8.09s/it]


Loss history: [2.0948343674341836, 1.3785478621721268, 1.111532673239708, 0.9135365237792333, 0.8139717479546865, 0.7882746507724127, 0.760879397392273, 0.7029429202278455, 0.6822322110335032, 0.69319649040699, 0.6449295605222384, 0.6413164685169855, 0.6772202675541242, 0.6400696138540903]
Dev loss: 0.6450052062670389





Epoch:  80%|████████  | 16/20 [02:10<00:32,  8.02s/it]


Loss history: [2.0948343674341836, 1.3785478621721268, 1.111532673239708, 0.9135365237792333, 0.8139717479546865, 0.7882746507724127, 0.760879397392273, 0.7029429202278455, 0.6822322110335032, 0.69319649040699, 0.6449295605222384, 0.6413164685169855, 0.6772202675541242, 0.6400696138540903, 0.6450052062670389]
Dev loss: 0.6769409204522768




Loss history: [2.0948343674341836, 1.3785478621721268, 1.111532673239708, 0.9135365237792333, 0.8139717479546865, 0.7882746507724127, 0.760879397392273, 0.7029429202278455, 0.6822322110335032, 0.69319649040699, 0.6449295605222384, 0.6413164685169855, 0.6772202675541242, 0.6400696138540903, 0.6450052062670389, 0.6769409204522768]
Dev loss: 0.5762488270799319




Epoch:  85%|████████▌ | 17/20 [02:19<00:24,  8.11s/it]



Epoch:  90%|█████████ | 18/20 [02:27<00:16,  8.03s/it]


Loss history: [2.0948343674341836, 1.3785478621721268, 1.111532673239708, 0.9135365237792333, 0.8139717479546865, 0.7882746507724127, 0.760879397392273, 0.7029429202278455, 0.6822322110335032, 0.69319649040699, 0.6449295605222384, 0.6413164685169855, 0.6772202675541242, 0.6400696138540903, 0.6450052062670389, 0.6769409204522768, 0.5762488270799319]
Dev loss: 0.5996729085842768




Loss history: [2.0948343674341836, 1.3785478621721268, 1.111532673239708, 0.9135365237792333, 0.8139717479546865, 0.7882746507724127, 0.760879397392273, 0.7029429202278455, 0.6822322110335032, 0.69319649040699, 0.6449295605222384, 0.6413164685169855, 0.6772202675541242, 0.6400696138540903, 0.6450052062670389, 0.6769409204522768, 0.5762488270799319, 0.5996729085842768]
Dev loss: 0.5690595532457033




Epoch:  95%|█████████▌| 19/20 [02:35<00:08,  8.12s/it]



Epoch: 100%|██████████| 20/20 [02:43<00:00,  8.04s/it]


Loss history: [2.0948343674341836, 1.3785478621721268, 1.111532673239708, 0.9135365237792333, 0.8139717479546865, 0.7882746507724127, 0.760879397392273, 0.7029429202278455, 0.6822322110335032, 0.69319649040699, 0.6449295605222384, 0.6413164685169855, 0.6772202675541242, 0.6400696138540903, 0.6450052062670389, 0.6769409204522768, 0.5762488270799319, 0.5996729085842768, 0.5690595532457033]
Dev loss: 0.57136228804787
Loading model from /tmp/model.bin
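
The "No improvement on development set. Finish training." messages in the logs above are consistent with a patience-based early stopping rule: the best model is kept whenever the dev loss drops below every earlier value, and training stops after a fixed number of epochs without a new best. The sketch below is not the notebook's training loop. The function name `epochs_until_stop` and the patience value of 5 are assumptions inferred from the logged loss histories, and the dev losses are those of the first run in this section, rounded to four decimals.

```python
def epochs_until_stop(dev_losses, patience=5):
    """Replay a list of per-epoch dev losses under a patience rule.

    Returns (epochs_run, best_epoch), both 1-based. `patience` is an
    assumed value, inferred from the logs rather than read from the code.
    """
    loss_history = []
    no_improvement = 0
    best_epoch = None
    for epoch, dev_loss in enumerate(dev_losses, start=1):
        if not loss_history or dev_loss < min(loss_history):
            # A new best dev loss: this is where a checkpoint would be saved.
            best_epoch = epoch
            no_improvement = 0
        else:
            no_improvement += 1
        loss_history.append(dev_loss)
        if no_improvement >= patience:
            # No new best for `patience` consecutive epochs: stop training.
            return epoch, best_epoch
    return len(dev_losses), best_epoch


# Dev losses of the first run logged above, rounded to four decimals.
run1_dev_losses = [
    2.3042, 1.6380, 1.2700, 0.9932, 0.9008, 0.7941, 0.7710,
    0.6876, 0.6267, 0.6495, 0.6493, 0.6372, 0.6585, 0.6821,
]
print(epochs_until_stop(run1_dev_losses))  # (14, 9)
```

Under this rule the first run stops after epoch 14, and the model reloaded from `/tmp/model.bin` would be the checkpoint from epoch 9, the epoch with the lowest dev loss.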






## Evaluation

In [4]:
from sklearn.metrics import precision_recall_fscore_support, classification_report

print("Test performance:", precision_recall_fscore_support(all_correct, all_predicted, average="micro"))
print(classification_report(all_correct, all_predicted, target_names=target_names))

Test performance: (0.8677777777777778, 0.8677777777777778, 0.8677777777777778, None)
                                                     precision    recall  f1-score   support

        Unhealthy without Diabetes and Risk Factors       0.98      0.98      0.98       262
                          Diabetes and Risk Factors       1.00      0.99      0.99        94
Nutritional value without Diabetes and Risk Factors       0.89      0.89      0.89        18
                           Obesity without Diabetes       0.89      0.94      0.92        18
                             Unclassified Off-Topic       0.53      0.40      0.46        47
                    School without generating money       0.57      0.40      0.47        52
             Schools providing healthy alternatives       0.94      0.96      0.95       245
                                     Student choice       0.63      0.91      0.74        35
                            Students without choice       0.63      0.71     

  'precision', 'predicted', average, warn_for)
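
The three identical numbers in the "Test performance" tuple are expected: when every item has exactly one gold label and one predicted label, each misclassification counts once as a false positive (for the predicted class) and once as a false negative (for the gold class), so micro-averaged precision, recall and F1 all equal plain accuracy. The following is a small self-contained sketch of that identity on made-up toy labels. The helper names `micro_prf` and `accuracy` are hypothetical and not part of the notebook or of scikit-learn.

```python
def micro_prf(correct, predicted):
    """Micro-averaged precision, recall and F1 for single-label data."""
    labels = set(correct) | set(predicted)
    tp = fp = fn = 0
    # Pool true positives, false positives and false negatives over all classes.
    for label in labels:
        for gold, guess in zip(correct, predicted):
            if guess == label and gold == label:
                tp += 1
            elif guess == label:
                fp += 1
            elif gold == label:
                fn += 1
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def accuracy(correct, predicted):
    """Fraction of items whose predicted label matches the gold label."""
    hits = sum(gold == guess for gold, guess in zip(correct, predicted))
    return hits / len(correct)


# Toy labels for illustration only; they are not taken from the test set.
toy_correct = [0, 0, 1, 1, 2, 2]
toy_predicted = [0, 1, 1, 1, 2, 0]

print(micro_prf(toy_correct, toy_predicted))
print(accuracy(toy_correct, toy_predicted))
```

For a per-class view that does not hide weak classes behind frequent ones, the `classification_report` table is the more informative output.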


In [5]:
c = 0
for item, predicted, correct in zip(all_test_data, all_predicted, all_correct):
    # BertInputItem has no `label_id` attribute (the original assert on it raised
    # an AttributeError), so compare the prediction with the gold label from all_correct.
    c += int(predicted == correct)
    print("{}#{}#{}".format(item.text, idx2label[correct], idx2label[predicted]))

print()
print(c, "/", len(all_test_data), "=", c / len(all_test_data))
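
Beyond listing every test item, it can help to tally which gold label is most often confused with which predicted label. The sketch below assumes two parallel lists of label indices and an index-to-label dictionary shaped like `idx2label`. The function name `top_confusions` and the toy data are hypothetical and only serve to illustrate the idea.

```python
from collections import Counter


def top_confusions(correct, predicted, idx2label, n=3):
    """Return the n most frequent (gold label, predicted label) error pairs."""
    confusions = Counter(
        (idx2label[gold], idx2label[guess])
        for gold, guess in zip(correct, predicted)
        if gold != guess  # only count misclassified items
    )
    return confusions.most_common(n)


# Toy data for illustration only; these are not the notebook's labels.
toy_idx2label = {0: "A", 1: "B", 2: "C"}
toy_correct = [0, 0, 0, 1, 2]
toy_predicted = [1, 1, 0, 1, 0]

print(top_confusions(toy_correct, toy_predicted, toy_idx2label, n=1))
# [(('A', 'B'), 2)]
```

Applied to `all_correct` and `all_predicted`, a tally like this would show where the errors of the weaker classes in the report end up.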