# First BERT Experiments

In this notebook we run our first experiments with BERT: we fine-tune a BERT model with a classification layer on each of our datasets separately and measure the accuracy of the resulting classifier on the test data.

For these experiments we use the `pytorch_transformers` package. It provides a variety of pretrained models and neural network architectures for transfer learning, including BERT and XLNet.

Two different BERT models are relevant for our experiments: 

- BERT-base-uncased: a relatively small BERT model that should already give reasonable results,
- BERT-large-uncased: a larger model for real state-of-the-art results.

In [1]:
BERT_MODEL = 'bert-base-uncased'
BATCH_SIZE = 16 if "base" in BERT_MODEL else 2
GRADIENT_ACCUMULATION_STEPS = 1 if "base" in BERT_MODEL else 8
MAX_SEQ_LENGTH = 100
PREFIX = "eatingmeat4_because"
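The two derived settings above keep the number of examples per optimizer update the same for both models. A minimal sketch of that arithmetic, assuming `train` performs one optimizer step every `GRADIENT_ACCUMULATION_STEPS` batches (the helper name `effective_batch_size` is illustrative and not part of `quillnlp`):

```python
def effective_batch_size(model_name: str) -> int:
    """Number of examples that contribute to one optimizer update."""
    # Same rule as the configuration cell: the large model uses small
    # batches plus gradient accumulation, the base model one full batch.
    batch_size = 16 if "base" in model_name else 2
    accumulation_steps = 1 if "base" in model_name else 8
    return batch_size * accumulation_steps


for name in ("bert-base-uncased", "bert-large-uncased"):
    print(name, effective_batch_size(name))
```

Under this assumption both configurations update the weights on 16 examples at a time; the large model trades speed for a smaller memory footprint per batch.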

## Data

We use the same data as in all our previous experiments. Here we load the data for a particular prompt; the split into training, development and test data is made per fold in the training section.

In [2]:
import sys
sys.path.append('../')

import ndjson
import glob
import numpy as np

from quillnlp.models.bert.preprocessing import preprocess, create_label_vocabulary

data_file = f"../data/interim/{PREFIX}_withprompt.ndjson"

with open(data_file) as i:
    data = ndjson.load(i)
        
# Make sure this is a single-label problem
label_lengths = [len(item["labels"]) for item in data]
assert max(label_lengths) == 1

for item in data:
    item["label"] = item["labels"][0]
        
label2idx = create_label_vocabulary(data)
idx2label = {v:k for k,v in label2idx.items()}
target_names = [idx2label[s] for s in range(len(idx2label))]

data_items = preprocess(data, BERT_MODEL, label2idx, MAX_SEQ_LENGTH)
data_items = np.array(data_items)

I0404 09:56:08.656721 140621602613056 file_utils.py:41] PyTorch version 1.2.0+cu92 available.
I0404 09:56:09.640355 140621602613056 file_utils.py:57] TensorFlow version 2.1.0 available.
I0404 09:56:10.284965 140621602613056 tokenization_utils.py:501] loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /home/yves/.cache/torch/transformers/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
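`create_label_vocabulary` comes from `quillnlp` and its implementation is not shown in this notebook. The following is a minimal sketch of what such a helper might do, assuming it maps every distinct `item["label"]` to a consecutive integer index. The function name `build_label_vocabulary`, the sorted ordering and the toy records are all assumptions made for illustration.

```python
def build_label_vocabulary(items):
    """Map each distinct label to a consecutive integer index (assumed behaviour)."""
    labels = sorted({item["label"] for item in items})
    return {label: idx for idx, label in enumerate(labels)}


# Toy records in the same shape as the ndjson items (the labels are made up).
toy_data = [
    {"text": "first response", "labels": ["B"]},
    {"text": "second response", "labels": ["A"]},
    {"text": "third response", "labels": ["B"]},
]

# Same single-label check and flattening as in the cell above.
assert max(len(item["labels"]) for item in toy_data) == 1
for item in toy_data:
    item["label"] = item["labels"][0]

toy_label2idx = build_label_vocabulary(toy_data)
toy_idx2label = {v: k for k, v in toy_label2idx.items()}
toy_target_names = [toy_idx2label[i] for i in range(len(toy_idx2label))]
print(toy_label2idx, toy_target_names)
```

The inverse mapping and the ordered list of target names are built exactly as in the cell above, so the list index of a name matches the class index the classifier predicts.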


## Training

In [3]:
import torch
import random

from quillnlp.models.bert.train import train, evaluate
from quillnlp.models.bert.models import get_bert_classifier

from quillnlp.models.bert.preprocessing import get_data_loader
from sklearn.model_selection import KFold

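# Shuffle in place along the first axis; unlike random.shuffle, this is
# safe for NumPy arrays of any dimensionality.
np.random.shuffle(data_items)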

kf = KFold(n_splits=5, shuffle=True, random_state=1)
all_correct, all_predicted = [], []
all_test_data = []
for train_idx, test_idx in kf.split(data_items):

    train_and_dev_data = data_items[train_idx]
    cutoff = int(len(train_and_dev_data)/4*3)
    
    train_data = train_and_dev_data[:cutoff]
    dev_data = train_and_dev_data[cutoff:]
    test_data = data_items[test_idx]

    train_dataloader = get_data_loader(train_data, BATCH_SIZE)
    dev_dataloader = get_data_loader(dev_data, BATCH_SIZE)
    test_dataloader = get_data_loader(test_data, BATCH_SIZE, shuffle=False)

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = get_bert_classifier(BERT_MODEL, len(label2idx), device=device)
    output_model_file = train(model, train_dataloader, dev_dataloader, 
                              BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, device)
    
    print("Loading model from", output_model_file)
    # Reload the model saved by train() and evaluate it on the CPU
    device = "cpu"

    model = get_bert_classifier(BERT_MODEL, len(label2idx), model_file=output_model_file, device=device)
    model.eval()
    
    _, _, test_correct, test_predicted = evaluate(model, test_dataloader, device)
    all_correct.extend(test_correct)
    all_predicted.extend(test_predicted)
    all_test_data.extend(test_data)
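The loop above holds out one fifth of the data as test data in each fold and then cuts the remainder 3:1 into training and development data. A small sketch of the resulting split sizes, using only the arithmetic from the cell (the helper `split_sizes` and the item count of 700 are made up for illustration; the fold sizes follow the documented `KFold` rule that the first `n % n_splits` folds get one extra test item):

```python
def split_sizes(n_items: int, n_splits: int = 5):
    """Return (train, dev, test) sizes for each fold, mirroring the loop above."""
    sizes = []
    for fold in range(n_splits):
        # KFold gives the first n_items % n_splits folds one extra test item.
        n_test = n_items // n_splits + (1 if fold < n_items % n_splits else 0)
        n_train_and_dev = n_items - n_test
        cutoff = int(n_train_and_dev / 4 * 3)  # same cutoff as in the cell above
        sizes.append((cutoff, n_train_and_dev - cutoff, n_test))
    return sizes


for train_size, dev_size, test_size in split_sizes(700):
    print(train_size, dev_size, test_size)
```

In other words, each fold trains on roughly 60% of the data, selects the model on 20% and tests on the remaining 20%.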


I0404 09:56:11.153807 140621602613056 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0404 09:56:11.154999 140621602613056 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd

Fold 1 dev loss per epoch (27 training and 9 evaluation iterations per epoch, about 6.4 s per epoch):

Epoch  1/20: 1.6724319458007812
Epoch  2/20: 1.2261486517058477
Epoch  3/20: 1.0467679036988153
Epoch  4/20: 0.9637160334322188
Epoch  5/20: 0.8569456620348824
Epoch  6/20: 0.852675543891059
Epoch  7/20: 0.8658503558900621
Epoch  8/20: 0.8167597353458405
Epoch  9/20: 0.8027659638060464
Epoch 10/20: 0.8138176302115122
Epoch 11/20: 0.8340393420722749
Epoch 12/20: 0.8193564679887559
Epoch 13/20: 0.8329831510782242
Epoch 14/20: 0.8003248804145389
Epoch 15/20: 0.7860618498590257
Epoch 16/20: 0.7872317218118243
Epoch 17/20: 0.7895626607868407
Epoch 18/20: 0.7910694380601248
Epoch 19/20: 0.7961310479376051
Epoch 20/20: 0.8008312715424432
No improvement on development set. Finish training.
Loading model from /tmp/model.bin
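Once a fold's saved model has been reloaded, the loop pools that fold's gold and predicted test labels into `all_correct` and `all_predicted`. A minimal sketch of how accuracy over the pooled folds could be computed, assuming `evaluate` returns the labels as two parallel sequences of class indices (the helper `accuracy` and the toy per-fold labels are hypothetical, not results from this notebook):

```python
def accuracy(correct_labels, predicted_labels):
    """Fraction of positions where the predicted label equals the gold label."""
    if len(correct_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not correct_labels:
        raise ValueError("cannot compute accuracy on empty input")
    matches = sum(c == p for c, p in zip(correct_labels, predicted_labels))
    return matches / len(correct_labels)


# Made-up (gold, predicted) label lists for two folds.
toy_fold_results = [
    ([0, 1, 2, 1], [0, 1, 1, 1]),
    ([2, 0], [2, 0]),
]

pooled_correct, pooled_predicted = [], []
for fold_correct, fold_predicted in toy_fold_results:
    pooled_correct.extend(fold_correct)
    pooled_predicted.extend(fold_predicted)

print(accuracy(pooled_correct, pooled_predicted))
```

Pooling before scoring weights every test item equally, which differs slightly from averaging per-fold accuracies when the folds are not all the same size.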



Fold 2 dev loss per epoch (27 training and 9 evaluation iterations per epoch, about 6.4 s per epoch):

Epoch  1/20: 1.6564411852094862
Epoch  2/20: 1.2285301155514188
Epoch  3/20: 1.0432660712136164
Epoch  4/20: 0.926780770222346
Epoch  5/20: 0.8913482228914896
Epoch  6/20: 0.8772368696000841
Epoch  7/20: 0.8552145229445564
Epoch  8/20: 0.8723413497209549
Epoch  9/20: 0.9733612802293565
Epoch 10/20: 0.9280348453256819
Epoch 11/20: 0.9245101941956414
Epoch 12/20: 0.9199409468306435
No improvement on development set. Finish training.
Loading model from /tmp/model.bin
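The "No improvement on development set" message indicates that `train` stops early based on the development loss. The stopping rule itself is not shown in this notebook; the sketch below assumes a patience-based rule that stops once the loss has failed to improve on its best value for five consecutive epochs. The patience of five is inferred from the log above rather than read from the source of `train`, and the helper name `early_stopping_epoch` is illustrative.

```python
def early_stopping_epoch(dev_losses, patience=5):
    """Return (stop_epoch, best_epoch), both 1-based, under a patience rule."""
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(dev_losses, start=1):
        if loss < best_loss:
            # New best development loss: remember it and keep training.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs in a row: stop here.
            return epoch, best_epoch
    # Ran out of epochs without triggering the rule.
    return len(dev_losses), best_epoch


# Development losses of the training run logged directly above.
fold2_dev_losses = [
    1.6564411852094862, 1.2285301155514188, 1.0432660712136164,
    0.926780770222346, 0.8913482228914896, 0.8772368696000841,
    0.8552145229445564, 0.8723413497209549, 0.9733612802293565,
    0.9280348453256819, 0.9245101941956414, 0.9199409468306435,
]

stop_epoch, best_epoch = early_stopping_epoch(fold2_dev_losses)
print(stop_epoch, best_epoch)
```

With these losses the rule stops after epoch 12 with the best loss at epoch 7, which matches the point where the logged run ended.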


HBox(children=(IntProgress(value=0, description='Training iteration', max=27, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=9, style=ProgressStyle(description…


Loss history: []
Dev loss: 1.653434885872735



Epoch:   5%|▌         | 1/20 [00:06<02:05,  6.63s/it][A

HBox(children=(IntProgress(value=0, description='Training iteration', max=27, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=9, style=ProgressStyle(description…


Loss history: [1.653434885872735]
Dev loss: 1.4084867503907945



Epoch:  10%|█         | 2/20 [00:13<01:59,  6.62s/it][A

HBox(children=(IntProgress(value=0, description='Training iteration', max=27, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=9, style=ProgressStyle(description…


Loss history: [1.653434885872735, 1.4084867503907945]
Dev loss: 1.1605694360203214



Epoch:  15%|█▌        | 3/20 [00:19<01:52,  6.62s/it][A

HBox(children=(IntProgress(value=0, description='Training iteration', max=27, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=9, style=ProgressStyle(description…


Loss history: [1.653434885872735, 1.4084867503907945, 1.1605694360203214]
Dev loss: 0.9438766804006364



Epoch:  20%|██        | 4/20 [00:26<01:45,  6.62s/it][A

HBox(children=(IntProgress(value=0, description='Training iteration', max=27, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=9, style=ProgressStyle(description…


Loss history: [1.653434885872735, 1.4084867503907945, 1.1605694360203214, 0.9438766804006364]
Dev loss: 0.9253192179732852



Epoch:  25%|██▌       | 5/20 [00:33<01:39,  6.62s/it][A

HBox(children=(IntProgress(value=0, description='Training iteration', max=27, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=9, style=ProgressStyle(description…


Loss history: [1.653434885872735, 1.4084867503907945, 1.1605694360203214, 0.9438766804006364, 0.9253192179732852]
Dev loss: 0.8716648088561164



Epoch:  30%|███       | 6/20 [00:39<01:32,  6.62s/it][A

HBox(children=(IntProgress(value=0, description='Training iteration', max=27, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=9, style=ProgressStyle(description…


Loss history: [1.653434885872735, 1.4084867503907945, 1.1605694360203214, 0.9438766804006364, 0.9253192179732852, 0.8716648088561164]
Dev loss: 0.791989846362008



Epoch:  35%|███▌      | 7/20 [00:46<01:26,  6.62s/it][A

HBox(children=(IntProgress(value=0, description='Training iteration', max=27, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=9, style=ProgressStyle(description…


Epoch:  40%|████      | 8/20 [00:52<01:17,  6.49s/it][A


Loss history: [1.653434885872735, 1.4084867503907945, 1.1605694360203214, 0.9438766804006364, 0.9253192179732852, 0.8716648088561164, 0.791989846362008]
Dev loss: 0.8141893711354997


HBox(children=(IntProgress(value=0, description='Training iteration', max=27, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=9, style=ProgressStyle(description…


Epoch:  45%|████▌     | 9/20 [00:58<01:10,  6.39s/it][A


Loss history: [1.653434885872735, 1.4084867503907945, 1.1605694360203214, 0.9438766804006364, 0.9253192179732852, 0.8716648088561164, 0.791989846362008, 0.8141893711354997]
Dev loss: 0.8117420772711436


HBox(children=(IntProgress(value=0, description='Training iteration', max=27, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=9, style=ProgressStyle(description…


Loss history: [1.653434885872735, 1.4084867503907945, 1.1605694360203214, 0.9438766804006364, 0.9253192179732852, 0.8716648088561164, 0.791989846362008, 0.8141893711354997, 0.8117420772711436]
Dev loss: 0.7652361558543311



Epoch:  50%|█████     | 10/20 [01:05<01:04,  6.46s/it][A

HBox(children=(IntProgress(value=0, description='Training iteration', max=27, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=9, style=ProgressStyle(description…


Loss history: [1.653434885872735, 1.4084867503907945, 1.1605694360203214, 0.9438766804006364, 0.9253192179732852, 0.8716648088561164, 0.791989846362008, 0.8141893711354997, 0.8117420772711436, 0.7652361558543311]
Dev loss: 0.7441992676920361



Epoch:  55%|█████▌    | 11/20 [01:11<00:58,  6.51s/it][A

HBox(children=(IntProgress(value=0, description='Training iteration', max=27, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=9, style=ProgressStyle(description…


Epoch:  60%|██████    | 12/20 [01:18<00:51,  6.41s/it][A


Loss history: [1.653434885872735, 1.4084867503907945, 1.1605694360203214, 0.9438766804006364, 0.9253192179732852, 0.8716648088561164, 0.791989846362008, 0.8141893711354997, 0.8117420772711436, 0.7652361558543311, 0.7441992676920361]
Dev loss: 0.746176133553187


HBox(children=(IntProgress(value=0, description='Training iteration', max=27, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=9, style=ProgressStyle(description…


Loss history: [1.653434885872735, 1.4084867503907945, 1.1605694360203214, 0.9438766804006364, 0.9253192179732852, 0.8716648088561164, 0.791989846362008, 0.8141893711354997, 0.8117420772711436, 0.7652361558543311, 0.7441992676920361, 0.746176133553187]
Dev loss: 0.7230991290675269



Epoch:  65%|██████▌   | 13/20 [01:24<00:45,  6.47s/it][A

HBox(children=(IntProgress(value=0, description='Training iteration', max=27, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=9, style=ProgressStyle(description…


Epoch:  70%|███████   | 14/20 [01:30<00:38,  6.38s/it][A


Loss history: [1.653434885872735, 1.4084867503907945, 1.1605694360203214, 0.9438766804006364, 0.9253192179732852, 0.8716648088561164, 0.791989846362008, 0.8141893711354997, 0.8117420772711436, 0.7652361558543311, 0.7441992676920361, 0.746176133553187, 0.7230991290675269]
Dev loss: 0.7487855090035332


HBox(children=(IntProgress(value=0, description='Training iteration', max=27, style=ProgressStyle(description_…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=9, style=ProgressStyle(description…


Epoch:  75%|███████▌  | 15/20 [01:37<00:31,  6.32s/it][A


Loss history: [1.653434885872735, 1.4084867503907945, 1.1605694360203214, 0.9438766804006364, 0.9253192179732852, 0.8716648088561164, 0.791989846362008, 0.8141893711354997, 0.8117420772711436, 0.7652361558543311, 0.7441992676920361, 0.746176133553187, 0.7230991290675269, 0.7487855090035332]
Dev loss: 0.7410226331816779


Epoch:  80%|████████  | 16/20 [01:43<00:25,  6.28s/it]


Loss history: [1.653434885872735, 1.4084867503907945, 1.1605694360203214, 0.9438766804006364, 0.9253192179732852, 0.8716648088561164, 0.791989846362008, 0.8141893711354997, 0.8117420772711436, 0.7652361558543311, 0.7441992676920361, 0.746176133553187, 0.7230991290675269, 0.7487855090035332, 0.7410226331816779]
Dev loss: 0.7312250816159778


Epoch:  85%|████████▌ | 17/20 [01:49<00:18,  6.25s/it]


Loss history: [1.653434885872735, 1.4084867503907945, 1.1605694360203214, 0.9438766804006364, 0.9253192179732852, 0.8716648088561164, 0.791989846362008, 0.8141893711354997, 0.8117420772711436, 0.7652361558543311, 0.7441992676920361, 0.746176133553187, 0.7230991290675269, 0.7487855090035332, 0.7410226331816779, 0.7312250816159778]
Dev loss: 0.7256852471166186




Loss history: [1.653434885872735, 1.4084867503907945, 1.1605694360203214, 0.9438766804006364, 0.9253192179732852, 0.8716648088561164, 0.791989846362008, 0.8141893711354997, 0.8117420772711436, 0.7652361558543311, 0.7441992676920361, 0.746176133553187, 0.7230991290675269, 0.7487855090035332, 0.7410226331816779, 0.7312250816159778, 0.7256852471166186]
Dev loss: 0.7192803306712044



Epoch:  90%|█████████ | 18/20 [01:56<00:12,  6.36s/it]
Epoch:  95%|█████████▌| 19/20 [02:02<00:06,  6.30s/it]


Loss history: [1.653434885872735, 1.4084867503907945, 1.1605694360203214, 0.9438766804006364, 0.9253192179732852, 0.8716648088561164, 0.791989846362008, 0.8141893711354997, 0.8117420772711436, 0.7652361558543311, 0.7441992676920361, 0.746176133553187, 0.7230991290675269, 0.7487855090035332, 0.7410226331816779, 0.7312250816159778, 0.7256852471166186, 0.7192803306712044]
Dev loss: 0.7326894402503967


Epoch: 100%|██████████| 20/20 [02:08<00:00,  6.27s/it]


Loss history: [1.653434885872735, 1.4084867503907945, 1.1605694360203214, 0.9438766804006364, 0.9253192179732852, 0.8716648088561164, 0.791989846362008, 0.8141893711354997, 0.8117420772711436, 0.7652361558543311, 0.7441992676920361, 0.746176133553187, 0.7230991290675269, 0.7487855090035332, 0.7410226331816779, 0.7312250816159778, 0.7256852471166186, 0.7192803306712044, 0.7326894402503967]
Dev loss: 0.7385553105009927
Loading model from /tmp/model.bin


I0404 10:02:09.561176 140621602613056 configuration_utils.py:256] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json from cache at /home/yves/.cache/torch/transformers/4dad0251492946e18ac39290fcfe91b89d370fee250efe9521476438fe8ca185.8f56353af4a709bf5ff0fbc915d8f5b42bfff892cbb6ac98c3c45f481a03c685
I0404 10:02:09.562777 140621602613056 configuration_utils.py:292] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "do_sample": false,
  "eos_token_ids": null,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1
  },
  "layer_norm_eps": 1e-12,
  "length_penalty": 1.0,
  "max_length": 20,
  "max_position_embedd



Loss history: []
Dev loss: 1.924617264005873



Epoch:   5%|▌         | 1/20 [00:06<02:06,  6.63s/it]


Loss history: [1.924617264005873]
Dev loss: 1.588990052541097



Epoch:  10%|█         | 2/20 [00:13<01:59,  6.63s/it]


Loss history: [1.924617264005873, 1.588990052541097]
Dev loss: 1.5806211696730719



Epoch:  15%|█▌        | 3/20 [00:19<01:52,  6.63s/it]


Loss history: [1.924617264005873, 1.588990052541097, 1.5806211696730719]
Dev loss: 1.2724697788556416



Epoch:  20%|██        | 4/20 [00:26<01:46,  6.63s/it]


Loss history: [1.924617264005873, 1.588990052541097, 1.5806211696730719, 1.2724697788556416]
Dev loss: 1.0437237620353699



Epoch:  25%|██▌       | 5/20 [00:33<01:39,  6.63s/it]


Loss history: [1.924617264005873, 1.588990052541097, 1.5806211696730719, 1.2724697788556416, 1.0437237620353699]
Dev loss: 0.9297102888425192



Epoch:  30%|███       | 6/20 [00:39<01:32,  6.63s/it]


Loss history: [1.924617264005873, 1.588990052541097, 1.5806211696730719, 1.2724697788556416, 1.0437237620353699, 0.9297102888425192]
Dev loss: 0.8672048780653212



Epoch:  35%|███▌      | 7/20 [00:46<01:26,  6.63s/it]


Loss history: [1.924617264005873, 1.588990052541097, 1.5806211696730719, 1.2724697788556416, 1.0437237620353699, 0.9297102888425192, 0.8672048780653212]
Dev loss: 0.8122357428073883



Epoch:  40%|████      | 8/20 [00:53<01:19,  6.63s/it]


Loss history: [1.924617264005873, 1.588990052541097, 1.5806211696730719, 1.2724697788556416, 1.0437237620353699, 0.9297102888425192, 0.8672048780653212, 0.8122357428073883]
Dev loss: 0.7851364413897196



Epoch:  45%|████▌     | 9/20 [00:59<01:12,  6.62s/it]


Loss history: [1.924617264005873, 1.588990052541097, 1.5806211696730719, 1.2724697788556416, 1.0437237620353699, 0.9297102888425192, 0.8672048780653212, 0.8122357428073883, 0.7851364413897196]
Dev loss: 0.7586074504587386



Epoch:  50%|█████     | 10/20 [01:06<01:06,  6.63s/it]


Loss history: [1.924617264005873, 1.588990052541097, 1.5806211696730719, 1.2724697788556416, 1.0437237620353699, 0.9297102888425192, 0.8672048780653212, 0.8122357428073883, 0.7851364413897196, 0.7586074504587386]
Dev loss: 0.7412183052963681



Epoch:  55%|█████▌    | 11/20 [01:12<00:59,  6.63s/it]


Loss history: [1.924617264005873, 1.588990052541097, 1.5806211696730719, 1.2724697788556416, 1.0437237620353699, 0.9297102888425192, 0.8672048780653212, 0.8122357428073883, 0.7851364413897196, 0.7586074504587386, 0.7412183052963681]
Dev loss: 0.6999309377537833



Epoch:  60%|██████    | 12/20 [01:19<00:53,  6.63s/it]


Loss history: [1.924617264005873, 1.588990052541097, 1.5806211696730719, 1.2724697788556416, 1.0437237620353699, 0.9297102888425192, 0.8672048780653212, 0.8122357428073883, 0.7851364413897196, 0.7586074504587386, 0.7412183052963681, 0.6999309377537833]
Dev loss: 0.6976620621151395



Epoch:  65%|██████▌   | 13/20 [01:26<00:46,  6.63s/it]
Epoch:  70%|███████   | 14/20 [01:32<00:38,  6.49s/it]


Loss history: [1.924617264005873, 1.588990052541097, 1.5806211696730719, 1.2724697788556416, 1.0437237620353699, 0.9297102888425192, 0.8672048780653212, 0.8122357428073883, 0.7851364413897196, 0.7586074504587386, 0.7412183052963681, 0.6999309377537833, 0.6976620621151395]
Dev loss: 0.7007838487625122




Loss history: [1.924617264005873, 1.588990052541097, 1.5806211696730719, 1.2724697788556416, 1.0437237620353699, 0.9297102888425192, 0.8672048780653212, 0.8122357428073883, 0.7851364413897196, 0.7586074504587386, 0.7412183052963681, 0.6999309377537833, 0.6976620621151395, 0.7007838487625122]
Dev loss: 0.6808620227707757



Epoch:  75%|███████▌  | 15/20 [01:38<00:32,  6.53s/it]


Loss history: [1.924617264005873, 1.588990052541097, 1.5806211696730719, 1.2724697788556416, 1.0437237620353699, 0.9297102888425192, 0.8672048780653212, 0.8122357428073883, 0.7851364413897196, 0.7586074504587386, 0.7412183052963681, 0.6999309377537833, 0.6976620621151395, 0.7007838487625122, 0.6808620227707757]
Dev loss: 0.6734292035301527



Epoch:  80%|████████  | 16/20 [01:45<00:26,  6.56s/it]
Epoch:  85%|████████▌ | 17/20 [01:51<00:19,  6.45s/it]


Loss history: [1.924617264005873, 1.588990052541097, 1.5806211696730719, 1.2724697788556416, 1.0437237620353699, 0.9297102888425192, 0.8672048780653212, 0.8122357428073883, 0.7851364413897196, 0.7586074504587386, 0.7412183052963681, 0.6999309377537833, 0.6976620621151395, 0.7007838487625122, 0.6808620227707757, 0.6734292035301527]
Dev loss: 0.6887140191263623




Loss history: [1.924617264005873, 1.588990052541097, 1.5806211696730719, 1.2724697788556416, 1.0437237620353699, 0.9297102888425192, 0.8672048780653212, 0.8122357428073883, 0.7851364413897196, 0.7586074504587386, 0.7412183052963681, 0.6999309377537833, 0.6976620621151395, 0.7007838487625122, 0.6808620227707757, 0.6734292035301527, 0.6887140191263623]
Dev loss: 0.6731502711772919



Epoch:  90%|█████████ | 18/20 [01:58<00:13,  6.50s/it]
Epoch:  95%|█████████▌| 19/20 [02:04<00:06,  6.40s/it]


Loss history: [1.924617264005873, 1.588990052541097, 1.5806211696730719, 1.2724697788556416, 1.0437237620353699, 0.9297102888425192, 0.8672048780653212, 0.8122357428073883, 0.7851364413897196, 0.7586074504587386, 0.7412183052963681, 0.6999309377537833, 0.6976620621151395, 0.7007838487625122, 0.6808620227707757, 0.6734292035301527, 0.6887140191263623, 0.6731502711772919]
Dev loss: 0.6816315154234568


Epoch: 100%|██████████| 20/20 [02:10<00:00,  6.33s/it]


Loss history: [1.924617264005873, 1.588990052541097, 1.5806211696730719, 1.2724697788556416, 1.0437237620353699, 0.9297102888425192, 0.8672048780653212, 0.8122357428073883, 0.7851364413897196, 0.7586074504587386, 0.7412183052963681, 0.6999309377537833, 0.6976620621151395, 0.7007838487625122, 0.6808620227707757, 0.6734292035301527, 0.6887140191263623, 0.6731502711772919, 0.6816315154234568]
Dev loss: 0.6732065280278524
Loading model from /tmp/model.bin




Loss history: []
Dev loss: 1.573062194718255



Epoch:   5%|▌         | 1/20 [00:06<02:06,  6.64s/it]


Loss history: [1.573062194718255]
Dev loss: 1.3266699181662664



Epoch:  10%|█         | 2/20 [00:13<01:59,  6.63s/it]


Loss history: [1.573062194718255, 1.3266699181662664]
Dev loss: 1.15270553694831



Epoch:  15%|█▌        | 3/20 [00:19<01:52,  6.62s/it]


Loss history: [1.573062194718255, 1.3266699181662664, 1.15270553694831]
Dev loss: 0.9924946824709574



Epoch:  20%|██        | 4/20 [00:26<01:45,  6.62s/it]


Loss history: [1.573062194718255, 1.3266699181662664, 1.15270553694831, 0.9924946824709574]
Dev loss: 0.8813459873199463



Epoch:  25%|██▌       | 5/20 [00:33<01:39,  6.62s/it]


Loss history: [1.573062194718255, 1.3266699181662664, 1.15270553694831, 0.9924946824709574, 0.8813459873199463]
Dev loss: 0.8213360442055596



Epoch:  30%|███       | 6/20 [00:39<01:32,  6.62s/it]


Loss history: [1.573062194718255, 1.3266699181662664, 1.15270553694831, 0.9924946824709574, 0.8813459873199463, 0.8213360442055596]
Dev loss: 0.8199739389949374



Epoch:  35%|███▌      | 7/20 [00:46<01:26,  6.62s/it]


Loss history: [1.573062194718255, 1.3266699181662664, 1.15270553694831, 0.9924946824709574, 0.8813459873199463, 0.8213360442055596, 0.8199739389949374]
Dev loss: 0.7752813100814819



Epoch:  40%|████      | 8/20 [00:52<01:19,  6.62s/it]


Loss history: [1.573062194718255, 1.3266699181662664, 1.15270553694831, 0.9924946824709574, 0.8813459873199463, 0.8213360442055596, 0.8199739389949374, 0.7752813100814819]
Dev loss: 0.7582661079035865



Epoch:  45%|████▌     | 9/20 [00:59<01:12,  6.62s/it]


Loss history: [1.573062194718255, 1.3266699181662664, 1.15270553694831, 0.9924946824709574, 0.8813459873199463, 0.8213360442055596, 0.8199739389949374, 0.7752813100814819, 0.7582661079035865]
Dev loss: 0.7575248446729448



Epoch:  50%|█████     | 10/20 [01:06<01:06,  6.62s/it]
Epoch:  55%|█████▌    | 11/20 [01:12<00:58,  6.48s/it]


Loss history: [1.573062194718255, 1.3266699181662664, 1.15270553694831, 0.9924946824709574, 0.8813459873199463, 0.8213360442055596, 0.8199739389949374, 0.7752813100814819, 0.7582661079035865, 0.7575248446729448]
Dev loss: 0.7645473778247833




Loss history: [1.573062194718255, 1.3266699181662664, 1.15270553694831, 0.9924946824709574, 0.8813459873199463, 0.8213360442055596, 0.8199739389949374, 0.7752813100814819, 0.7582661079035865, 0.7575248446729448, 0.7645473778247833]
Dev loss: 0.7203055123488108



Epoch:  60%|██████    | 12/20 [01:18<00:52,  6.53s/it]
Epoch:  65%|██████▌   | 13/20 [01:25<00:44,  6.42s/it]


Loss history: [1.573062194718255, 1.3266699181662664, 1.15270553694831, 0.9924946824709574, 0.8813459873199463, 0.8213360442055596, 0.8199739389949374, 0.7752813100814819, 0.7582661079035865, 0.7575248446729448, 0.7645473778247833, 0.7203055123488108]
Dev loss: 0.7410715437597699


Epoch:  70%|███████   | 14/20 [01:31<00:38,  6.35s/it]


Loss history: [1.573062194718255, 1.3266699181662664, 1.15270553694831, 0.9924946824709574, 0.8813459873199463, 0.8213360442055596, 0.8199739389949374, 0.7752813100814819, 0.7582661079035865, 0.7575248446729448, 0.7645473778247833, 0.7203055123488108, 0.7410715437597699]
Dev loss: 0.733455134762658


Epoch:  75%|███████▌  | 15/20 [01:37<00:31,  6.30s/it]


Loss history: [1.573062194718255, 1.3266699181662664, 1.15270553694831, 0.9924946824709574, 0.8813459873199463, 0.8213360442055596, 0.8199739389949374, 0.7752813100814819, 0.7582661079035865, 0.7575248446729448, 0.7645473778247833, 0.7203055123488108, 0.7410715437597699, 0.733455134762658]
Dev loss: 0.7386853131983016


Epoch:  80%|████████  | 16/20 [01:43<00:25,  6.26s/it]


Loss history: [1.573062194718255, 1.3266699181662664, 1.15270553694831, 0.9924946824709574, 0.8813459873199463, 0.8213360442055596, 0.8199739389949374, 0.7752813100814819, 0.7582661079035865, 0.7575248446729448, 0.7645473778247833, 0.7203055123488108, 0.7410715437597699, 0.733455134762658, 0.7386853131983016]
Dev loss: 0.7507240325212479




Loss history: [1.573062194718255, 1.3266699181662664, 1.15270553694831, 0.9924946824709574, 0.8813459873199463, 0.8213360442055596, 0.8199739389949374, 0.7752813100814819, 0.7582661079035865, 0.7575248446729448, 0.7645473778247833, 0.7203055123488108, 0.7410715437597699, 0.733455134762658, 0.7386853131983016, 0.7507240325212479]
Dev loss: 0.744017438756095
No improvement on development set. Finish training.
Loading model from /tmp/model.bin
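
The `No improvement on development set. Finish training.` message in the log points to early stopping on the dev loss. The exact criterion is not shown in this part of the notebook. The sketch below assumes a patience of 5 epochs, a value only inferred from the logged losses, and the function names `epochs_since_best` and `should_stop` are hypothetical. The dev losses are the ones logged for the last run, rounded to four decimals.

```python
def epochs_since_best(dev_losses):
    """Number of evaluations that have passed since the lowest dev loss."""
    if not dev_losses:
        return 0
    # Index of the first occurrence of the minimum loss.
    best_epoch = min(range(len(dev_losses)), key=dev_losses.__getitem__)
    return len(dev_losses) - 1 - best_epoch


def should_stop(dev_losses, patience=5):
    """Stop once the dev loss has not improved for `patience` evaluations.

    The default patience is an assumption, not a value taken from the source.
    """
    return epochs_since_best(dev_losses) >= patience


# Dev losses of the last training run in the log above (rounded).
dev_losses = [
    1.5731, 1.3267, 1.1527, 0.9925, 0.8813, 0.8213, 0.8200, 0.7753, 0.7583,
    0.7575, 0.7645, 0.7203, 0.7411, 0.7335, 0.7387, 0.7507, 0.7440,
]

print("Stop after 16 evaluations:", should_stop(dev_losses[:-1]))
print("Stop after 17 evaluations:", should_stop(dev_losses))
```

Under this assumption the best dev loss (0.7203) is reached at the twelfth evaluation, and the five evaluations that follow do not improve on it, which matches the point where the log reports the end of training.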






## Evaluation

In [4]:
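from sklearn.metrics import precision_recall_fscore_support, classification_report

# Micro-averaged precision, recall and F1; in a single-label setting all three equal the accuracy.
print("Test performance:", precision_recall_fscore_support(all_correct, all_predicted, average="micro"))
# Per-label precision, recall and F1, with the label names taken from the label vocabulary.
print(classification_report(all_correct, all_predicted, target_names=target_names))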

Test performance: (0.8144475920679887, 0.8144475920679887, 0.8144475920679887, None)
              precision    recall  f1-score   support

  Feedback_1       0.91      0.96      0.93       282
 Feedback_10       0.00      0.00      0.00        15
  Feedback_2       0.84      0.95      0.89        22
  Feedback_3       0.67      0.15      0.25        13
 Feedback _6       0.63      0.79      0.70        42
  Feedback_4       0.65      0.52      0.58        33
  Feedback_5       0.79      0.91      0.85       205
  Feedback_7       0.64      0.79      0.71        43
  Feedback_8       0.69      0.33      0.45        33
  Feedback_9       1.00      0.06      0.11        18

    accuracy                           0.81       706
   macro avg       0.68      0.55      0.55       706
weighted avg       0.80      0.81      0.79       706
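
The three identical values in the `Test performance` tuple follow from micro-averaging in a single-label problem: every wrong prediction counts once as a false positive (for the predicted label) and once as a false negative (for the correct label), so micro precision, recall and F1 all reduce to accuracy. The macro average weights every label equally, which is why it is much lower here, where several rare labels are hardly ever predicted correctly. The sketch below illustrates this in plain Python on made-up toy labels; the helper names `micro_prf` and `macro_recall` are hypothetical and not part of the notebook.

```python
def micro_prf(gold, predicted):
    """Micro-averaged precision, recall and F1 for single-label data."""
    labels = set(gold) | set(predicted)
    pairs = list(zip(gold, predicted))
    # Sum true positives, false positives and false negatives over all labels.
    tp = sum(sum(g == l and p == l for g, p in pairs) for l in labels)
    fp = sum(sum(g != l and p == l for g, p in pairs) for l in labels)
    fn = sum(sum(g == l and p != l for g, p in pairs) for l in labels)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def macro_recall(gold, predicted):
    """Unweighted mean of the per-label recall."""
    recalls = []
    for label in sorted(set(gold)):
        hits = [p == label for g, p in zip(gold, predicted) if g == label]
        recalls.append(sum(hits) / len(hits))
    return sum(recalls) / len(recalls)


# Toy data (illustrative only): label 2 is rare and never predicted.
gold = [0, 0, 0, 1, 1, 2]
predicted = [0, 0, 1, 1, 1, 0]

accuracy = sum(g == p for g, p in zip(gold, predicted)) / len(gold)
print("accuracy:", accuracy)
print("micro P/R/F1:", micro_prf(gold, predicted))
print("macro recall:", macro_recall(gold, predicted))
```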





In [5]:
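# Print every test item with its correct and predicted label, and count the correct predictions.
num_correct = 0
for item, predicted, correct in zip(all_test_data, all_predicted, all_correct):
    # Sanity check: the predictions must be aligned with the test items.
    assert item.label_ids == correct
    num_correct += int(item.label_ids == predicted)
    print("{}#{}#{}".format(item.text, idx2label[correct], idx2label[predicted]))

print()
print(num_correct, "/", len(all_test_data), "=", num_correct / len(all_test_data))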

Methane from cow burps harms the environment because cows burp about 30-50 gallons of methane gas into the atmosphere each day.#Feedback_5#Feedback_5
Methane from cow burps harms the environment because it is a greenhouse gas which contributes to global warming#Feedback_1#Feedback_1
Methane from cow burps harms the environment because the gas increases earth temperature.#Feedback_1#Feedback_1
Methane from cow burps harms the environment because it adds carbon to the atmospehre#Feedback_10#Feedback_8
Methane from cow burps harms the environment because it adds to the global warming effect.#Feedback_1#Feedback_1
Methane from cow burps harms the environment because it produces 14.5% of greenhouse gases worldwide.#Feedback_7#Feedback_7
Methane from cow burps harms the environment because they're burps contain methane an overall create 30 to 50 gallons of methane released into the atmosphere.#Feedback_5#Feedback_5
Methane from cow burps harms the environment because it burps out 30-50 gallo