# Bert Experiments: One Model

In this notebook, we continue our BERT experiments. We try to finetune *one* BERT model on *several* of our data sets. This makes it easier to deploy our solution in production. 

As a first test, we'll just train a BERT model that takes as input a response from any of several data sets, and outputs probabilities for *all* labels in *all* data sets. This is slightly suboptimal (after all, we don't need probabilities for labels that are not relevant to a specific prompt), but as long as we're not working with thousands of different labels, I don't think this is very problematic.

The setup and preprocessing procedure is very similar to that in the first "Bert experiments" notebook. I'll highlight the areas where it is different.

In [1]:
import torch

from pytorch_transformers.tokenization_bert import BertTokenizer
from pytorch_transformers.modeling_bert import BertForSequenceClassification

BERT_MODEL = 'bert-large-uncased'

tokenizer = BertTokenizer.from_pretrained(BERT_MODEL)

## Data

As we build one "big" model, we combine the data from all of our input files. We keep the test files separate, because we want to be able to evaluate on every prompt separately.

In addition, we also remember which labels are relevant for every prompt, because in the prediction phase, we will only look at the probabilities of the relevant labels.

In [2]:
import ndjson

file_prefixes = ["eatingmeat_emma", "junkfood_but", "junkfood_because"]

train_data = []
dev_data = []
test_data = {}
label2idx = {}
target_names = {}

for prefix in file_prefixes:
    
    train_file = f"../data/interim/{prefix}_train_withprompt.ndjson"
    dev_file = f"../data/interim/{prefix}_dev_withprompt.ndjson"
    test_file = f"../data/interim/{prefix}_test_withprompt.ndjson"

    target_names[prefix] = []
    with open(train_file) as i:
        new_train_data = ndjson.load(i)
        for item in new_train_data:
            if item["label"] not in label2idx:
                target_names[prefix].append(item["label"])
                label2idx[item["label"]] = len(label2idx)
        train_data += new_train_data
                
    with open(dev_file) as i:
        dev_data += ndjson.load(i)

    with open(test_file) as i:
        test_data[prefix] = ndjson.load(i)

In [3]:
print(label2idx)
print(target_names)

{'Less meat consumption could harm economy and cut jobs': 0, 'Flexitarian w/o connection to environment or jobs': 1, 'The meat industry is important/thriving': 2, 'Meat creates jobs and benefits economy': 3, 'People will or should still eat meat': 4, "Outside of article's scope": 5, 'Flexitarians benefit environment': 6, 'Exports and demand are increasing': 7, 'Unclassified Off-Topic': 8, 'School without generating money': 9, 'Schools providing healthy alternatives': 10, 'Student choice': 11, 'Students without choice': 12, 'Schools generate money': 13, 'Students can still bring/access junk food': 14, 'Unhealthy without Diabetes and Risk Factors': 15, 'Diabetes and Risk Factors': 16, 'Nutritional value without Diabetes and Risk Factors': 17, 'Obesity without Diabetes': 18}
{'eatingmeat_emma': ['Less meat consumption could harm economy and cut jobs', 'Flexitarian w/o connection to environment or jobs', 'The meat industry is important/thriving', 'Meat creates jobs and benefits economy', '

## Model

In [4]:
model = BertForSequenceClassification.from_pretrained(BERT_MODEL, num_labels=len(label2idx))

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.train()

## Preprocessing

In [8]:
import logging
import numpy as np

logging.basicConfig(format = '%(asctime)s - %(levelname)s - %(name)s -   %(message)s',
                    datefmt = '%m/%d/%Y %H:%M:%S',
                    level = logging.INFO)
logger = logging.getLogger(__name__)

MAX_SEQ_LENGTH=100

class InputFeatures(object):
    """A single set of features of data."""

    def __init__(self, input_ids, input_mask, segment_ids, label_id):
        self.input_ids = input_ids
        self.input_mask = input_mask
        self.segment_ids = segment_ids
        self.label_id = label_id
        
        
def convert_examples_to_features(examples, label2idx, max_seq_length, tokenizer, verbose=0):
    """Loads a data file into a list of `InputBatch`s."""
    
    features = []
    for (ex_index, ex) in enumerate(examples):
        
        # TODO: should deal better with sentences > max tok length
        input_ids = tokenizer.encode("[CLS] " + ex["text"] + " [SEP]")
        segment_ids = [0] * len(input_ids)
            
        # The mask has 1 for real tokens and 0 for padding tokens. Only real
        # tokens are attended to.
        input_mask = [1] * len(input_ids)

        # Zero-pad up to the sequence length.
        padding = [0] * (max_seq_length - len(input_ids))
        input_ids += padding
        input_mask += padding
        segment_ids += padding

        assert len(input_ids) == max_seq_length
        assert len(input_mask) == max_seq_length
        assert len(segment_ids) == max_seq_length

        label_id = label2idx[ex["label"]]
        if verbose and ex_index == 0:
            logger.info("*** Example ***")
            logger.info("text: %s" % ex["text"])
            logger.info("input_ids: %s" % " ".join([str(x) for x in input_ids]))
            logger.info("input_mask: %s" % " ".join([str(x) for x in input_mask]))
            logger.info("segment_ids: %s" % " ".join([str(x) for x in segment_ids]))
            logger.info("label:" + str(ex["label"]) + " id: " + str(label_id))

        features.append(
                InputFeatures(input_ids=input_ids,
                              input_mask=input_mask,
                              segment_ids=segment_ids,
                              label_id=label_id))
    return features

train_features = convert_examples_to_features(train_data, label2idx, MAX_SEQ_LENGTH, tokenizer, verbose=0)
dev_features = convert_examples_to_features(dev_data, label2idx, MAX_SEQ_LENGTH, tokenizer)

test_features = {}
for prefix in test_data:
    test_features[prefix] = convert_examples_to_features(test_data[prefix], label2idx, MAX_SEQ_LENGTH, tokenizer, verbose=1)

07/19/2019 16:30:14 - INFO - __main__ -   *** Example ***
07/19/2019 16:30:14 - INFO - __main__ -   text: Large amounts of meat consumption are harming the environment, but it's hard to do anything about it because meat consumption is a large part of American culture and is quickly becoming a part of cultures around the world.
07/19/2019 16:30:14 - INFO - __main__ -   input_ids: 101 2312 8310 1997 6240 8381 2024 7386 2075 1996 4044 1010 2021 2009 1005 1055 2524 2000 2079 2505 2055 2009 2138 6240 8381 2003 1037 2312 2112 1997 2137 3226 1998 2003 2855 3352 1037 2112 1997 8578 2105 1996 2088 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
07/19/2019 16:30:14 - INFO - __main__ -   input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
07/19/2019 16:30:14 - INFO - __

In [9]:
import torch
from torch.utils.data import TensorDataset, DataLoader, SequentialSampler

def get_data_loader(features, max_seq_length, batch_size): 

    all_input_ids = torch.tensor([f.input_ids for f in features], dtype=torch.long)
    all_input_mask = torch.tensor([f.input_mask for f in features], dtype=torch.long)
    all_segment_ids = torch.tensor([f.segment_ids for f in features], dtype=torch.long)
    all_label_ids = torch.tensor([f.label_id for f in features], dtype=torch.long)
    data = TensorDataset(all_input_ids, all_input_mask, all_segment_ids, all_label_ids)
    sampler = SequentialSampler(data)
    dataloader = DataLoader(data, sampler=sampler, batch_size=batch_size)
    return dataloader

BATCH_SIZE = 2

train_dataloader = get_data_loader(train_features, MAX_SEQ_LENGTH, BATCH_SIZE)
dev_dataloader = get_data_loader(dev_features, MAX_SEQ_LENGTH, BATCH_SIZE)
test_dataloaders = {} 
for prefix in test_features:
    test_dataloaders[prefix] = get_data_loader(test_features[prefix], MAX_SEQ_LENGTH, BATCH_SIZE)

## Evaluation

In [10]:
def evaluate(model, dataloader):

    eval_loss = 0
    nb_eval_steps = 0
    predicted_labels, correct_labels = [], []

    for step, batch in enumerate(tqdm(dataloader, desc="Evaluation iteration")):
        batch = tuple(t.to(device) for t in batch)
        input_ids, input_mask, segment_ids, label_ids = batch

        with torch.no_grad():
            tmp_eval_loss, logits = model(input_ids, segment_ids, input_mask, label_ids)

        outputs = np.argmax(logits.to('cpu'), axis=1)
        label_ids = label_ids.to('cpu').numpy()
        
        predicted_labels += list(outputs)
        correct_labels += list(label_ids)
        
        eval_loss += tmp_eval_loss.mean().item()
        nb_eval_steps += 1

    eval_loss = eval_loss / nb_eval_steps
    
    correct_labels = np.array(correct_labels)
    predicted_labels = np.array(predicted_labels)
        
    return eval_loss, correct_labels, predicted_labels

## Training

In [11]:
from pytorch_transformers.optimization import AdamW, WarmupLinearSchedule

GRADIENT_ACCUMULATION_STEPS = 8
NUM_TRAIN_EPOCHS = 100
LEARNING_RATE = 1e-5
WARMUP_PROPORTION = 0.1

def warmup_linear(x, warmup=0.002):
    if x < warmup:
        return x/warmup
    return 1.0 - x

num_train_steps = int(len(train_data) / BATCH_SIZE / GRADIENT_ACCUMULATION_STEPS * NUM_TRAIN_EPOCHS)

param_optimizer = list(model.named_parameters())
no_decay = ['bias', 'LayerNorm.bias', 'LayerNorm.weight']
optimizer_grouped_parameters = [
    {'params': [p for n, p in param_optimizer if not any(nd in n for nd in no_decay)], 'weight_decay': 0.01},
    {'params': [p for n, p in param_optimizer if any(nd in n for nd in no_decay)], 'weight_decay': 0.0}
    ]

optimizer = AdamW(optimizer_grouped_parameters, lr=LEARNING_RATE, correct_bias=False)
scheduler = WarmupLinearSchedule(optimizer, warmup_steps=100, t_total=num_train_steps)

In [12]:
import os

OUTPUT_DIR = "/tmp/"
MODEL_FILE_NAME = "pytorch_model.bin"
output_model_file = os.path.join(OUTPUT_DIR, MODEL_FILE_NAME)

In [13]:
from tqdm import trange
from tqdm import tqdm_notebook as tqdm
from sklearn.metrics import classification_report, precision_recall_fscore_support


PATIENCE = 5

global_step = 0
model.train()
loss_history = []
best_epoch = 0
for epoch in trange(int(NUM_TRAIN_EPOCHS), desc="Epoch"):
    tr_loss = 0
    nb_tr_examples, nb_tr_steps = 0, 0
    for step, batch in enumerate(tqdm(train_dataloader, desc="Training iteration")):
        batch = tuple(t.to(device) for t in batch)
        input_ids, input_mask, segment_ids, label_ids = batch
        outputs = model(input_ids, segment_ids, input_mask, label_ids)
        loss = outputs[0]
        
        if GRADIENT_ACCUMULATION_STEPS > 1:
            loss = loss / GRADIENT_ACCUMULATION_STEPS

        loss.backward()

        tr_loss += loss.item()
        nb_tr_examples += input_ids.size(0)
        nb_tr_steps += 1
        if (step + 1) % GRADIENT_ACCUMULATION_STEPS == 0:
            lr_this_step = LEARNING_RATE * warmup_linear(global_step/num_train_steps, WARMUP_PROPORTION)
            for param_group in optimizer.param_groups:
                param_group['lr'] = lr_this_step
            optimizer.step()
            optimizer.zero_grad()
            global_step += 1

    dev_loss, _, _ = evaluate(model, dev_dataloader)
    
    print("Loss history:", loss_history)
    print("Dev loss:", dev_loss)
    
    if len(loss_history) == 0 or dev_loss < min(loss_history):
        model_to_save = model.module if hasattr(model, 'module') else model
        torch.save(model_to_save.state_dict(), output_model_file)
        best_epoch = epoch
    
    if epoch-best_epoch >= PATIENCE: 
        print("No improvement on development set. Finish training.")
        break
    
    loss_history.append(dev_loss)

Epoch:   0%|          | 0/100 [00:00<?, ?it/s]

HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…


Loss history: []
Dev loss: 2.7108593724437595


Epoch:   1%|          | 1/100 [00:49<1:21:56, 49.66s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…


Loss history: [2.7108593724437595]
Dev loss: 2.4951083426623


Epoch:   2%|▏         | 2/100 [01:39<1:21:09, 49.69s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…


Loss history: [2.7108593724437595, 2.4951083426623]
Dev loss: 2.3867563500846782


Epoch:   3%|▎         | 3/100 [02:29<1:20:29, 49.78s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…


Loss history: [2.7108593724437595, 2.4951083426623, 2.3867563500846782]
Dev loss: 2.1398411313283074


Epoch:   4%|▍         | 4/100 [03:19<1:19:45, 49.85s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…

Epoch:   5%|▌         | 5/100 [04:08<1:18:22, 49.50s/it]


Loss history: [2.7108593724437595, 2.4951083426623, 2.3867563500846782, 2.1398411313283074]
Dev loss: 2.240928824414912


HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…


Loss history: [2.7108593724437595, 2.4951083426623, 2.3867563500846782, 2.1398411313283074, 2.240928824414912]
Dev loss: 1.609696202671405


Epoch:   6%|▌         | 6/100 [04:58<1:17:47, 49.65s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…


Loss history: [2.7108593724437595, 2.4951083426623, 2.3867563500846782, 2.1398411313283074, 2.240928824414912, 1.609696202671405]
Dev loss: 1.236571031132924


Epoch:   7%|▋         | 7/100 [05:48<1:17:07, 49.75s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…


Loss history: [2.7108593724437595, 2.4951083426623, 2.3867563500846782, 2.1398411313283074, 2.240928824414912, 1.609696202671405, 1.236571031132924]
Dev loss: 1.0338673358111037


Epoch:   8%|▊         | 8/100 [06:38<1:16:24, 49.84s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…


Loss history: [2.7108593724437595, 2.4951083426623, 2.3867563500846782, 2.1398411313283074, 2.240928824414912, 1.609696202671405, 1.236571031132924, 1.0338673358111037]
Dev loss: 0.8588105163623377


Epoch:   9%|▉         | 9/100 [07:28<1:15:39, 49.89s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…

Epoch:  10%|█         | 10/100 [08:16<1:14:16, 49.52s/it]


Loss history: [2.7108593724437595, 2.4951083426623, 2.3867563500846782, 2.1398411313283074, 2.240928824414912, 1.609696202671405, 1.236571031132924, 1.0338673358111037, 0.8588105163623377]
Dev loss: 0.9929000372739182


HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…


Loss history: [2.7108593724437595, 2.4951083426623, 2.3867563500846782, 2.1398411313283074, 2.240928824414912, 1.609696202671405, 1.236571031132924, 1.0338673358111037, 0.8588105163623377, 0.9929000372739182]
Dev loss: 0.6890989787799796


Epoch:  11%|█         | 11/100 [09:06<1:13:40, 49.67s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…


Loss history: [2.7108593724437595, 2.4951083426623, 2.3867563500846782, 2.1398411313283074, 2.240928824414912, 1.609696202671405, 1.236571031132924, 1.0338673358111037, 0.8588105163623377, 0.9929000372739182, 0.6890989787799796]
Dev loss: 0.60732707043284


Epoch:  12%|█▏        | 12/100 [09:56<1:13:00, 49.77s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…


Loss history: [2.7108593724437595, 2.4951083426623, 2.3867563500846782, 2.1398411313283074, 2.240928824414912, 1.609696202671405, 1.236571031132924, 1.0338673358111037, 0.8588105163623377, 0.9929000372739182, 0.6890989787799796, 0.60732707043284]
Dev loss: 0.5589588014120909


Epoch:  13%|█▎        | 13/100 [10:46<1:12:16, 49.85s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…


Loss history: [2.7108593724437595, 2.4951083426623, 2.3867563500846782, 2.1398411313283074, 2.240928824414912, 1.609696202671405, 1.236571031132924, 1.0338673358111037, 0.8588105163623377, 0.9929000372739182, 0.6890989787799796, 0.60732707043284, 0.5589588014120909]
Dev loss: 0.5114151798572737


Epoch:  14%|█▍        | 14/100 [11:36<1:11:30, 49.90s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…


Loss history: [2.7108593724437595, 2.4951083426623, 2.3867563500846782, 2.1398411313283074, 2.240928824414912, 1.609696202671405, 1.236571031132924, 1.0338673358111037, 0.8588105163623377, 0.9929000372739182, 0.6890989787799796, 0.60732707043284, 0.5589588014120909, 0.5114151798572737]
Dev loss: 0.49940373602601673


Epoch:  15%|█▌        | 15/100 [12:26<1:10:43, 49.93s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…


Loss history: [2.7108593724437595, 2.4951083426623, 2.3867563500846782, 2.1398411313283074, 2.240928824414912, 1.609696202671405, 1.236571031132924, 1.0338673358111037, 0.8588105163623377, 0.9929000372739182, 0.6890989787799796, 0.60732707043284, 0.5589588014120909, 0.5114151798572737, 0.49940373602601673]
Dev loss: 0.44829600865078956


Epoch:  16%|█▌        | 16/100 [13:16<1:09:55, 49.95s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…

Epoch:  17%|█▋        | 17/100 [14:05<1:08:34, 49.57s/it]


Loss history: [2.7108593724437595, 2.4951083426623, 2.3867563500846782, 2.1398411313283074, 2.240928824414912, 1.609696202671405, 1.236571031132924, 1.0338673358111037, 0.8588105163623377, 0.9929000372739182, 0.6890989787799796, 0.60732707043284, 0.5589588014120909, 0.5114151798572737, 0.49940373602601673, 0.44829600865078956]
Dev loss: 0.45402995581479416


HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…


Loss history: [2.7108593724437595, 2.4951083426623, 2.3867563500846782, 2.1398411313283074, 2.240928824414912, 1.609696202671405, 1.236571031132924, 1.0338673358111037, 0.8588105163623377, 0.9929000372739182, 0.6890989787799796, 0.60732707043284, 0.5589588014120909, 0.5114151798572737, 0.49940373602601673, 0.44829600865078956, 0.45402995581479416]
Dev loss: 0.415479629310136


Epoch:  18%|█▊        | 18/100 [14:55<1:07:55, 49.70s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…


Loss history: [2.7108593724437595, 2.4951083426623, 2.3867563500846782, 2.1398411313283074, 2.240928824414912, 1.609696202671405, 1.236571031132924, 1.0338673358111037, 0.8588105163623377, 0.9929000372739182, 0.6890989787799796, 0.60732707043284, 0.5589588014120909, 0.5114151798572737, 0.49940373602601673, 0.44829600865078956, 0.45402995581479416, 0.415479629310136]
Dev loss: 0.38255221081763197


Epoch:  19%|█▉        | 19/100 [15:45<1:07:12, 49.79s/it]

HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…

Epoch:  20%|██        | 20/100 [16:34<1:05:56, 49.46s/it]


Loss history: [2.7108593724437595, 2.4951083426623, 2.3867563500846782, 2.1398411313283074, 2.240928824414912, 1.609696202671405, 1.236571031132924, 1.0338673358111037, 0.8588105163623377, 0.9929000372739182, 0.6890989787799796, 0.60732707043284, 0.5589588014120909, 0.5114151798572737, 0.49940373602601673, 0.44829600865078956, 0.45402995581479416, 0.415479629310136, 0.38255221081763197]
Dev loss: 0.43627593812254284


HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…

Epoch:  21%|██        | 21/100 [17:22<1:04:48, 49.22s/it]


Loss history: [2.7108593724437595, 2.4951083426623, 2.3867563500846782, 2.1398411313283074, 2.240928824414912, 1.609696202671405, 1.236571031132924, 1.0338673358111037, 0.8588105163623377, 0.9929000372739182, 0.6890989787799796, 0.60732707043284, 0.5589588014120909, 0.5114151798572737, 0.49940373602601673, 0.44829600865078956, 0.45402995581479416, 0.415479629310136, 0.38255221081763197, 0.43627593812254284]
Dev loss: 0.44740037082396833


HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…

Epoch:  22%|██▏       | 22/100 [18:11<1:03:46, 49.06s/it]


Loss history: [2.7108593724437595, 2.4951083426623, 2.3867563500846782, 2.1398411313283074, 2.240928824414912, 1.609696202671405, 1.236571031132924, 1.0338673358111037, 0.8588105163623377, 0.9929000372739182, 0.6890989787799796, 0.60732707043284, 0.5589588014120909, 0.5114151798572737, 0.49940373602601673, 0.44829600865078956, 0.45402995581479416, 0.415479629310136, 0.38255221081763197, 0.43627593812254284, 0.44740037082396833]
Dev loss: 0.43383764237472694


HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…

Epoch:  23%|██▎       | 23/100 [19:00<1:02:49, 48.95s/it]


Loss history: [2.7108593724437595, 2.4951083426623, 2.3867563500846782, 2.1398411313283074, 2.240928824414912, 1.609696202671405, 1.236571031132924, 1.0338673358111037, 0.8588105163623377, 0.9929000372739182, 0.6890989787799796, 0.60732707043284, 0.5589588014120909, 0.5114151798572737, 0.49940373602601673, 0.44829600865078956, 0.45402995581479416, 0.415479629310136, 0.38255221081763197, 0.43627593812254284, 0.44740037082396833, 0.43383764237472694]
Dev loss: 0.44486228706910436


HBox(children=(IntProgress(value=0, description='Training iteration', max=384, style=ProgressStyle(description…




HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=97, style=ProgressStyle(descriptio…


Loss history: [2.7108593724437595, 2.4951083426623, 2.3867563500846782, 2.1398411313283074, 2.240928824414912, 1.609696202671405, 1.236571031132924, 1.0338673358111037, 0.8588105163623377, 0.9929000372739182, 0.6890989787799796, 0.60732707043284, 0.5589588014120909, 0.5114151798572737, 0.49940373602601673, 0.44829600865078956, 0.45402995581479416, 0.415479629310136, 0.38255221081763197, 0.43627593812254284, 0.44740037082396833, 0.43383764237472694, 0.44486228706910436]
Dev loss: 0.43390876484900404
No improvement on development set. Finish training.





## Results

In [14]:
from tqdm import tqdm_notebook as tqdm
from sklearn.metrics import classification_report, precision_recall_fscore_support

device="cpu"
print("Loading model from", output_model_file)

model_state_dict = torch.load(output_model_file, map_location=lambda storage, loc: storage)
model = BertForSequenceClassification.from_pretrained(BERT_MODEL, state_dict=model_state_dict, num_labels=len(label2idx))
model.to(device)

model.eval()

#_, train_correct, train_predicted = evaluate(model, train_dataloader)
#_, dev_correct, dev_predicted = evaluate(model, dev_dataloader)

#print("Training performance:", precision_recall_fscore_support(train_correct, train_predicted, average="micro"))
#print("Development performance:", precision_recall_fscore_support(dev_correct, dev_predicted, average="micro"))

for prefix in test_dataloaders:
    print(prefix)
    _, test_correct, test_predicted = evaluate(model, test_dataloaders[prefix])

    print("Test performance:", precision_recall_fscore_support(test_correct, test_predicted, average="micro"))

    print(classification_report(test_correct, test_predicted, target_names=target_names[prefix]))

Loading model from /tmp/pytorch_model.bin


07/19/2019 16:50:05 - INFO - pytorch_transformers.modeling_utils -   loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-config.json from cache at /home/yves/.cache/torch/pytorch_transformers/6dfaed860471b03ab5b9acb6153bea82b6632fb9bbe514d3fff050fe1319ee6d.4c88e2dec8f8b017f319f6db2b157fee632c0860d9422e4851bd0d6999f9ce38
07/19/2019 16:50:05 - INFO - pytorch_transformers.modeling_utils -   Model config {
  "attention_probs_dropout_prob": 0.1,
  "finetuning_task": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "num_labels": 19,
  "output_attentions": false,
  "output_hidden_states": false,
  "torchscript": false,
  "type_vocab_size": 2,
  "vocab_size": 30522
}

07/19/2019 16:50:06 - INFO - pytorch_transformers.modeling_utils -   loadi

eatingmeat_emma


HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=71, style=ProgressStyle(descriptio…


Test performance: (0.9436619718309859, 0.9436619718309859, 0.9436619718309859, None)
                                                       precision    recall  f1-score   support

Less meat consumption could harm economy and cut jobs       1.00      1.00      1.00        42
    Flexitarian w/o connection to environment or jobs       1.00      1.00      1.00         8
              The meat industry is important/thriving       1.00      0.67      0.80         6
               Meat creates jobs and benefits economy       0.89      1.00      0.94        34
                 People will or should still eat meat       0.91      1.00      0.95        31
                           Outside of article's scope       1.00      0.56      0.71         9
                     Flexitarians benefit environment       0.88      1.00      0.93         7
                    Exports and demand are increasing       1.00      0.60      0.75         5

                                          avg / total    

HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=77, style=ProgressStyle(descriptio…


Test performance: (0.8366013071895425, 0.8366013071895425, 0.8366013071895425, None)
                                           precision    recall  f1-score   support

                   Unclassified Off-Topic       0.46      0.55      0.50        11
          School without generating money       0.78      0.44      0.56        16
   Schools providing healthy alternatives       0.97      0.96      0.97        75
                           Student choice       0.58      1.00      0.74         7
                  Students without choice       0.78      0.85      0.81        33
                   Schools generate money       0.89      1.00      0.94         8
Students can still bring/access junk food       0.00      0.00      0.00         3

                              avg / total       0.83      0.84      0.83       153

junkfood_because


  'precision', 'predicted', average, warn_for)


HBox(children=(IntProgress(value=0, description='Evaluation iteration', max=59, style=ProgressStyle(descriptio…


Test performance: (0.940677966101695, 0.940677966101695, 0.940677966101695, None)
                                                     precision    recall  f1-score   support

        Unhealthy without Diabetes and Risk Factors       0.97      0.96      0.97        75
                          Diabetes and Risk Factors       1.00      0.90      0.95        29
Nutritional value without Diabetes and Risk Factors       0.86      0.86      0.86         7
                           Obesity without Diabetes       0.64      1.00      0.78         7

                                        avg / total       0.95      0.94      0.94       118

