# Fine-tuning models
All notebooks for fine-tuning models were identical except for the model being trained. This notebook shows how we fine-tuned `flax-community/roberta-base-danish`, but the same steps apply to any other model.

The notebook requires GPU access; without it, training will either be very slow or crash.
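
Before running anything else, it can be worth checking that a GPU is actually visible. The following is a minimal sketch using only the standard library. The helper name `gpu_available` is hypothetical, and the check assumes an NVIDIA GPU whose driver provides the `nvidia-smi` tool (the same tool called in the next cell):

```python
import shutil
import subprocess

def gpu_available():
    """Return True if `nvidia-smi` exists and exits successfully (hypothetical helper)."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # the tool is not installed, so no NVIDIA driver is visible
    try:
        result = subprocess.run([exe], capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0

if not gpu_available():
    print("No GPU detected: training will be very slow or may crash.")
```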


# GPU, installing packages and logging in to WANDB

In [None]:
# Check which GPU is available
!nvidia-smi

Mon Nov 28 10:16:17 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   53C    P8    10W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

## Installing required packages

In [None]:
!pip install -q transformers transformers-interpret datasets evaluate wandb tensorflow spacy spacy_langdetect


# Importing packages, data and model

In [None]:
from datasets import load_dataset  # load_dataset caches the dataset, so it is not downloaded again the next time this cell runs
import datasets
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TrainingArguments, Trainer, EarlyStoppingCallback
import pandas as pd

# Loading data
dataset_1 = datasets.load_dataset('ScandEval/angry-tweets-mini', split='train+test+val')  # sentiment classification dataset: negative, neutral or positive

# Loading larger dataset
dataset_2 = load_dataset("DDSC/twitter-sent", split='train+test')


# Loading model and tokenizer
model_name = "flax-community/roberta-base-danish"
tokenizer = AutoTokenizer.from_pretrained(model_name)
num_labels = 3
# AutoModelForSequenceClassification adds a newly initialised classification head on top of the pretrained encoder, so we do not need to add one manually
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=num_labels)

Some weights of the model checkpoint at flax-community/roberta-base-danish were not used when initializing RobertaForSequenceClassification: ['lm_head.dense.bias', 'lm_head.layer_norm.weight', 'lm_head.layer_norm.bias', 'lm_head.bias', 'lm_head.dense.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at flax-community/roberta-base-danish and are newly initialized: ['classifier.dense.bias', 'classifier.out_proj.bias', 'classifier.dense.weight',

In [None]:
from datasets import concatenate_datasets
raw_dataset = concatenate_datasets([dataset_1, dataset_2])

In [None]:
from datasets import Dataset

pandas = pd.DataFrame(raw_dataset)

# harmonising label names: 'negativ'/'positiv' become 'negative'/'positive'
pandas.loc[pandas.label == 'negativ', 'label'] = "negative"
pandas.loc[pandas.label == 'positiv', 'label'] = "positive"

# Pre-processing data

**Removing duplicate rows**

In [None]:
# removing duplicate rows and renumbering the index
pandas = pandas.drop_duplicates().reset_index(drop=True)

**Removing specific words**

In [None]:
def remove_mystopwords(sentence, stopword_list):
    # splitting on single spaces and dropping every token that exactly matches a stopword
    tokens = sentence.split(" ")
    tokens_filtered = [word for word in tokens if word not in stopword_list]
    return " ".join(tokens_filtered)

# creating stopword list
stopwords = ["link", "rt", "amp", "@USER", "[LINK]"]

pandas.text = [remove_mystopwords(sentence, stopwords) for sentence in pandas.text]
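
To make the behaviour of this filter concrete, here is a small self-contained sketch that applies the same function to two made-up tweets (the example sentences are illustrative and do not come from the datasets). It shows that matching is exact: tokens with different casing or attached punctuation are left in place.

```python
def remove_mystopwords(sentence, stopword_list):
    # same logic as in the cell above
    tokens = sentence.split(" ")
    return " ".join(word for word in tokens if word not in stopword_list)

stopwords = ["link", "rt", "amp", "@USER", "[LINK]"]

# every listed token is removed when it matches exactly
cleaned = remove_mystopwords("rt @USER det er en god dag [LINK]", stopwords)
print(cleaned)  # det er en god dag

# "RT" (upper case) and "@USER:" (trailing colon) do not match and are kept
untouched = remove_mystopwords("RT @USER: fint link", stopwords)
print(untouched)  # RT @USER: fint
```

If upper-case variants should be removed as well, the comparison would have to be made case-insensitive; the notebook as written does not do this.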

**Removing sentences detected as English**

In [None]:
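# switches for the filtering steps (1 = on)
no_eng = 1          # detect the language so that English tweets can be dropped
no_below_4 = 1      # drop tweets with fewer than 4 words
no_ttr_below_3 = 1  # not used in the cells shown here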

if no_eng == 1:
  import spacy
  from spacy.language import Language
  from spacy_langdetect import LanguageDetector
  import numpy as np
  import pandas as pd

  def get_lang_detector(nlp, name):
    return LanguageDetector()
    
  # loading the language model instance that will be used for language detection
  nlp = spacy.load("en_core_web_sm")
  Language.factory("language_detector", func=get_lang_detector)
  nlp.add_pipe('language_detector', last=True)

  # applying the language detection to the data
  data = [nlp(text_i)._.language for text_i in pandas['text']]

  # transforming the data to a pandas dataframe with the columns 'language' and 'score'
  data_pd = pd.DataFrame.from_dict(data)
  data_pd["tweets"] = pandas['text'] # adding the tweets to the dataframe

  # tweets detected as English are removed in the next cell, after the length filter

**Removing sentences of length <4 words**

In [None]:
data_pd['label'] = pandas['label']

if no_below_4 == 1:

  # removing all tweets with fewer than 4 words
  data_pd['tweet_len'] = data_pd['tweets'].str.split().str.len()
  data_pd = data_pd[data_pd['tweet_len'] > 3]

# removing all tweets detected as English
data_pd_3 = data_pd[data_pd['language'] != 'en']

# dropping the helper columns and restoring the original column name
data_pd_4 = data_pd_3.drop(['score', 'language', 'tweet_len'], axis= 1)
data_pd_5 = data_pd_4.rename(columns = {'tweets': 'text'})
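
The two filters above can be sketched without pandas on a handful of made-up rows. The rows, the `keep` helper and its parameter names are illustrative assumptions; only the two rules (more than 3 words, language not `'en'`) come from the notebook.

```python
# made-up rows in the same shape as data_pd (text, detected language, label)
rows = [
    {"tweets": "det er en rigtig god dag", "language": "da", "label": "positive"},
    {"tweets": "this is an english tweet", "language": "en", "label": "neutral"},
    {"tweets": "nej tak", "language": "da", "label": "negative"},
]

def keep(row, min_words=4, excluded_language="en"):
    """Keep rows with at least `min_words` words that were not detected as English."""
    long_enough = len(row["tweets"].split()) >= min_words
    return long_enough and row["language"] != excluded_language

# keep only the surviving rows and rename 'tweets' back to 'text'
filtered = [{"text": r["tweets"], "label": r["label"]} for r in rows if keep(r)]
print(filtered)
```

Only the first row survives: the second is detected as English and the third has fewer than four words.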

In [None]:
raw_dataset = Dataset.from_pandas(data_pd_5, preserve_index=False)  # preserve_index=False keeps the filtered row index out of the dataset
raw_dataset

Dataset({
    features: ['text', 'label'],
    num_rows: 3806
})

**Changing predefined dataset splits**

In [None]:
from datasets import concatenate_datasets, dataset_dict, Dataset

# 60% train, 40% held out
# a fixed seed matters when comparing models: it guarantees that every model gets the same training and test data
train_test = raw_dataset.train_test_split(test_size=0.4, seed=42)
# splitting the held-out 40% in half: 20% validation, 20% test
test_valid = train_test['test'].train_test_split(test_size=0.5, seed=42)
# combining into train 60%, validation 20%, test 20%
dataset_recombined = datasets.DatasetDict({
    'train': train_test['train'],
    'valid': test_valid['train'],
    'test': test_valid['test']})
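
As a quick sanity check, the resulting split sizes can be worked out by hand. This sketch assumes that the test part of each split is rounded up, which is consistent with the 2283 training and 762 evaluation examples reported in the training log; `split_sizes` is a hypothetical helper, not part of the `datasets` library.

```python
import math

def split_sizes(n_rows, test_size):
    """Return (n_train, n_test), assuming the test share is rounded up."""
    n_test = math.ceil(n_rows * test_size)
    return n_rows - n_test, n_test

# 3806 rows remain after pre-processing
n_train, n_holdout = split_sizes(3806, 0.4)
n_valid, n_test = split_sizes(n_holdout, 0.5)
print(n_train, n_valid, n_test)  # 2283 761 762
```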

In [None]:
# creating a ClassLabel instance to use for mapping classes to integers (it is needed for creating tensors)
from datasets import ClassLabel
labels_cl = ClassLabel(num_classes=3, names=['negative', 'neutral', 'positive'])

# defining a function to tokenize the text and translate all labels into integers instead of strings
def tokenize_function(example):
  tokens = tokenizer(example["text"], padding="max_length", truncation=True, max_length=128)
  tokens['label'] = labels_cl.str2int(example['label'])
  return tokens

# batched=True speeds up tokenization by processing multiple rows at once
tokenized_datasets = dataset_recombined.map(tokenize_function, batched=True, remove_columns=dataset_recombined['train'].column_names)

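
What the tokenization step does to the labels and to the sequence length can be illustrated in plain Python. This is a simplified sketch: the real tokenizer also returns an attention mask and handles special tokens, and `pad_id=0` is a placeholder assumption rather than the model's actual padding id.

```python
# label names in the same order as in the ClassLabel above
names = ["negative", "neutral", "positive"]
str2int = {name: index for index, name in enumerate(names)}

def pad_or_truncate(token_ids, max_length, pad_id=0):
    """Cut the sequence at max_length, then pad it on the right to exactly max_length."""
    clipped = token_ids[:max_length]
    return clipped + [pad_id] * (max_length - len(clipped))

labels = [str2int[label] for label in ["positive", "negative", "neutral"]]
print(labels)  # [2, 0, 1]

short = pad_or_truncate([5, 6, 7], max_length=5)
long = pad_or_truncate([5, 6, 7, 8, 9, 10], max_length=5)
print(short)  # [5, 6, 7, 0, 0]
print(long)   # [5, 6, 7, 8, 9]
```

With `padding="max_length"` and `max_length=128`, every tokenized tweet ends up with the same length of 128, so a whole batch can be stacked into a single tensor.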

## Evaluation metrics

In [None]:
import numpy as np
import evaluate

# loading the metrics once, rather than on every evaluation call
accuracy_metric = evaluate.load("accuracy")
precision_metric = evaluate.load("precision")
recall_metric = evaluate.load("recall")
f1_metric = evaluate.load("f1")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    # the predicted class is the one with the highest logit
    predictions = np.argmax(logits, axis=-1)
    accuracy = accuracy_metric.compute(predictions=predictions, references=labels)["accuracy"]
    # precision, recall and F1 are computed per class and averaged, weighted by class support
    precision = precision_metric.compute(predictions=predictions, references=labels, average="weighted")["precision"]
    recall = recall_metric.compute(predictions=predictions, references=labels, average="weighted")["recall"]
    f1 = f1_metric.compute(predictions=predictions, references=labels, average="weighted")["f1"]
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
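
To show what `average="weighted"` means, here is a minimal pure-Python sketch of support-weighted precision, recall and F1 on a made-up example. The function name `weighted_scores` and the toy labels are assumptions for illustration; the notebook itself relies on the `evaluate` library. The sketch also shows why the Accuracy and Recall columns in the training logs are identical: support-weighted recall reduces to the share of correct predictions.

```python
from collections import Counter

def weighted_scores(y_true, y_pred):
    """Support-weighted precision, recall and F1, plus plain accuracy."""
    n = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    precision = recall = f1 = 0.0
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        predicted_c = sum(1 for p in y_pred if p == c)
        p_c = tp / predicted_c if predicted_c else 0.0
        r_c = tp / support[c] if support[c] else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = support[c] / n  # classes with more true examples count more
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

y_true = [0, 0, 1, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2, 2]
scores = weighted_scores(y_true, y_pred)
print(scores)
```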

## Early stopping

In [None]:
# Stops training when the evaluation loss has not improved for `early_stopping_patience` consecutive evaluations (here: one evaluation per epoch)
early_stop = EarlyStoppingCallback(early_stopping_patience = 8)
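
The patience rule can be sketched in a few lines of plain Python. This is a simplified illustration, not the library's implementation; `stopping_epoch` is a hypothetical helper. The loss values are the validation losses of one trial in the training log of this notebook, where the best loss occurs in epoch 1 and training ends after epoch 9.

```python
def stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training stops, or None if it never stops."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

losses = [0.80722, 0.857132, 0.858071, 0.865185, 0.877061,
          0.890857, 0.920534, 0.931244, 0.95715]
stopped_at = stopping_epoch(losses, patience=8)
print(stopped_at)  # 9
```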

## Defining hyperparameters

In [None]:
batch_size = 128 # stating batch size
epochs = 200
learning_rate = 2e-5
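
These settings determine the number of optimization steps. The following sketch reproduces the arithmetic, assuming a single device, no gradient accumulation and that the last, smaller batch of each epoch is kept; the 2283 training examples are the figure reported in the training log.

```python
import math

n_train = 2283    # training examples, as reported in the training log
batch_size = 128
epochs = 200

steps_per_epoch = math.ceil(n_train / batch_size)  # the last batch is smaller
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)  # 18 3600
```

This matches the log: a checkpoint is written every 18 steps (once per epoch), and the trainer reports 3600 total optimization steps. Early stopping usually ends a trial long before that upper bound is reached.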

## WANDB

In [None]:
import wandb
wandb.login()
wandb.init(project="bachelor_thesis_cogsci",
           tags=["HPsearch_nbailab"])

wandb.config.dropout = 0.2

ERROR:wandb.jupyter:Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.

wandb: Logging into wandb.ai. (Learn how to deploy a W&B server locally: https://wandb.me/wandb-server)
wandb: You can find your API key in your browser here: https://wandb.ai/authorize
wandb: Paste an API key from your profile and hit enter, or press ctrl+c to quit: 
wandb: Appending key for api.wandb.ai to your netrc file: /root/.netrc
wandb: Currently logged in as: jorgenhw (bachelor_thesis_cogsci). Use `wandb login --relogin` to force relogin


## Hyperparameter tuning

In [None]:
!pip install -q optuna ray[tune]



In [None]:
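def model_init():
    # returning a freshly loaded model for every trial; returning the global `model` would carry the fine-tuned weights over from one trial to the next
    return AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=num_labels)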

**Training parameters**

In [None]:
training_args = TrainingArguments(output_dir=model_name, 
                                  evaluation_strategy = "epoch",
                                  save_strategy = "epoch", 
                                  num_train_epochs = epochs, 
                                  per_device_train_batch_size = batch_size,
                                  per_device_eval_batch_size = batch_size,
                                  learning_rate = learning_rate,
                                  weight_decay=0.01,
                                  load_best_model_at_end=True,
                                  report_to="wandb",
                                  save_total_limit = 2)

In [None]:
trainer = Trainer(
    model_init=model_init,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
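    eval_dataset=tokenized_datasets["test"],  # note: evaluation, early stopping and the hyperparameter search all use the test split; the "valid" split is not used here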
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
    callbacks = [early_stop]
)

In [None]:
def my_hp_space(trial):
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-6, 1e-4, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 1e-4, 1e-2, log=True)
    }
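
Both hyperparameters are searched on a logarithmic scale, so that each order of magnitude is equally likely to be tried. The sketch below illustrates plain log-uniform sampling with the standard library. It is not Optuna's TPE sampler, which additionally adapts to earlier trials; `sample_log_uniform` is a hypothetical helper.

```python
import math
import random

def sample_log_uniform(low, high, rng):
    """Draw a value whose logarithm is uniformly distributed between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

rng = random.Random(0)
samples = [sample_log_uniform(1e-6, 1e-4, rng) for _ in range(1000)]

# roughly half of the draws fall below 1e-5, the midpoint on the log scale
share_below_1e5 = sum(s < 1e-5 for s in samples) / len(samples)
print(round(share_below_1e5, 2))
```

With a linear scale, only about 9% of the draws would fall below 1e-5, so small learning rates would rarely be explored.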

# Initializing hyperparameter tuning
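
The search in this section uses a median pruner, which stops unpromising trials early. The idea can be sketched as follows for a minimised objective. This is a simplified illustration rather than Optuna's implementation: `should_prune` is a hypothetical helper, the objective values are made up, and the number of start-up trials is assumed to be 5.

```python
from statistics import median

def should_prune(current_value, previous_values, step,
                 n_startup_trials=5, n_warmup_steps=10):
    """Simplified median rule: prune if the trial is worse than the median of earlier trials."""
    if len(previous_values) < n_startup_trials:
        return False  # too few finished trials to compare against
    if step < n_warmup_steps:
        return False  # still inside the warm-up period
    return current_value > median(previous_values)

# made-up objective values of earlier trials at the same step
earlier = [0.81, 0.80, 0.81, 0.82, 0.85]

print(should_prune(0.87, earlier, step=18))      # True: worse than the median of 0.81
print(should_prune(0.87, earlier[:4], step=18))  # False: only four earlier trials
print(should_prune(0.87, earlier, step=5))       # False: warm-up not finished
```

Under these assumptions no trial can be pruned before five trials have finished, which fits the log below, where "Trial 5" is the first one reported as pruned.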

In [None]:
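import optuna

sampler = optuna.samplers.TPESampler()
pruner = optuna.pruners.MedianPruner(n_warmup_steps=10)

best_run = trainer.hyperparameter_search(
    n_trials=10,
    direction="minimize",
    hp_space=my_hp_space,
    # minimizing the evaluation loss explicitly; when compute_metrics is set, the default objective is the sum of all returned metrics, which should not be minimized
    compute_objective=lambda metrics: metrics["eval_loss"],
    backend="optuna",
    sampler=sampler,
    pruner=pruner
    )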

[I 2022-11-28 10:24:12,072] A new study created in memory with name: no-name-17663d9a-56ae-44a0-abab-db7bc77e939c
Trial: {'learning_rate': 1.0211178508623191e-06, 'weight_decay': 0.0035668865310456033}
***** Running training *****
  Num examples = 2283
  Num Epochs = 200
  Instantaneous batch size per device = 128
  Total train batch size (w. parallel, distributed & accumulation) = 128
  Gradient Accumulation steps = 1
  Total optimization steps = 3600
  Number of trainable parameters = 355090435



Automatic Weights & Biases logging enabled, to disable set os.environ["WANDB_DISABLED"] = "true"


You're using a BertTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,1.125018,0.347769,0.359674,0.347769,0.35262
2,No log,1.10276,0.42126,0.412319,0.42126,0.41208
3,No log,1.089727,0.425197,0.409788,0.425197,0.39962
4,No log,1.077516,0.427822,0.420304,0.427822,0.391176
5,No log,1.072517,0.433071,0.426764,0.433071,0.383327
6,No log,1.06071,0.446194,0.438701,0.446194,0.397581
7,No log,1.057013,0.446194,0.438887,0.446194,0.400846
8,No log,1.05645,0.429134,0.429545,0.429134,0.393532
9,No log,1.040166,0.459318,0.452581,0.459318,0.420635
10,No log,1.02463,0.475066,0.471017,0.475066,0.433427


***** Running Evaluation *****
  Num examples = 762
  Batch size = 128


Saving model checkpoint to NbAiLab/nb-bert-large/run-0/checkpoint-18
Configuration saved in NbAiLab/nb-bert-large/run-0/checkpoint-18/config.json
Model weights saved in NbAiLab/nb-bert-large/run-0/checkpoint-18/pytorch_model.bin
tokenizer config file saved in NbAiLab/nb-bert-large/run-0/checkpoint-18/tokenizer_config.json
Special tokens file saved in NbAiLab/nb-bert-large/run-0/checkpoint-18/special_tokens_map.json
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/run-0/checkpoint-36
Configuration saved in NbAiLab/nb-bert-large/run-0/checkpoint-36/config.json
Model weights saved in NbAiLab/nb-bert-large/run-0/checkpoint-36/pytorch_model.bin
tokenizer config file saved in NbAiLab/nb-bert-large/run-0/checkpoint-36/tokenizer_config.json
Special tokens file saved in NbAiLab/nb-bert-large/run-0/checkpoint-36/special_tokens_map.json
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpo

Run history:
eval/accuracy,▁▃▃▃▃▃▃▄▄▅▅▅▅▅▆▆▆▆▇▇▇▇▇▇▇▇▇▇▇▇▇▇█▇▇█▇███
eval/f1,▁▂▂▂▂▂▃▃▃▄▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇▇▇▇▇▇▇█▇▇█▇███
eval/loss,██▇▇▇▇▆▆▆▅▅▅▄▄▃▃▃▃▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
eval/precision,▁▂▂▃▃▃▃▄▄▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇▇▇▇▇▇▇█▇▇█▇███
eval/recall,▁▃▃▃▃▃▃▄▄▅▅▅▅▅▆▆▆▆▇▇▇▇▇▇▇▇▇▇▇▇▇▇█▇▇█▇███
eval/runtime,█▁▁▁▁▄▁▂▁▁▁▁▁▁▁▁▂▂▁▁▁▁▁▁▂▁▁▁▁▂▁▁▁▁▁▂▁▁▁▁
eval/samples_per_second,▁████▅█▇████████▇▇██████▇████▇███▇█▇█▇▇█
eval/steps_per_second,▁████▅█▇████████▇▇██████▇████▇███▇█▇█▇▇█
train/epoch,▁▁▁▂▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇███
train/global_step,▁▁▁▂▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇███

Run summary:
eval/accuracy,0.64961
eval/f1,0.64754
eval/loss,0.81086
eval/precision,0.64918
eval/recall,0.64961
eval/runtime,5.0718
eval/samples_per_second,150.241
eval/steps_per_second,1.183
train/epoch,54.0
train/global_step,972.0




Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,0.868492,0.637795,0.638915,0.637795,0.635773
2,No log,0.800372,0.673228,0.695098,0.673228,0.673525
3,No log,0.863975,0.678478,0.684785,0.678478,0.672514
4,No log,0.961372,0.674541,0.677061,0.674541,0.674446
5,No log,1.312872,0.674541,0.687958,0.674541,0.676708
6,No log,1.40405,0.669291,0.67328,0.669291,0.670695
7,No log,1.405536,0.637795,0.697995,0.637795,0.635482
8,No log,1.40578,0.699475,0.700066,0.699475,0.697834
9,No log,1.462885,0.691601,0.689826,0.691601,0.686718
10,No log,1.44218,0.670604,0.668244,0.670604,0.668887


***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/run-1/checkpoint-18
Configuration saved in NbAiLab/nb-bert-large/run-1/checkpoint-18/config.json
Model weights saved in NbAiLab/nb-bert-large/run-1/checkpoint-18/pytorch_model.bin
tokenizer config file saved in NbAiLab/nb-bert-large/run-1/checkpoint-18/tokenizer_config.json
Special tokens file saved in NbAiLab/nb-bert-large/run-1/checkpoint-18/special_tokens_map.json
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/run-1/checkpoint-36
Configuration saved in NbAiLab/nb-bert-large/run-1/checkpoint-36/config.json
Model weights saved in NbAiLab/nb-bert-large/run-1/checkpoint-36/pytorch_model.bin
tokenizer config file saved in NbAiLab/nb-bert-large/run-1/checkpoint-36/tokenizer_config.json
Special tokens file saved in NbAiLab/nb-bert-large/run-1/checkpoint-36/special_tokens_map.json
***** Running Evalua

Run history:
eval/accuracy,▁▅▆▅▅▅▁█▇▅
eval/f1,▁▅▅▅▆▅▁█▇▅
eval/loss,▂▁▂▃▆▇▇▇██
eval/precision,▁▇▆▅▇▅██▇▄
eval/recall,▁▅▆▅▅▅▁█▇▅
eval/runtime,▇▄█▃▇▁█▁▆▂
eval/samples_per_second,▂▅▁▆▂█▁█▃▇
eval/steps_per_second,▂▅▁▆▂█▁▇▃▇
train/epoch,▁▂▃▃▄▅▆▆▇██
train/global_step,▁▂▃▃▄▅▆▆▇██

Run summary:
eval/accuracy,0.6706
eval/f1,0.66889
eval/loss,1.44218
eval/precision,0.66824
eval/recall,0.6706
eval/runtime,5.0652
eval/samples_per_second,150.439
eval/steps_per_second,1.185
train/epoch,10.0
train/global_step,180.0




Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,0.80722,0.700787,0.69985,0.700787,0.699231
2,No log,0.857132,0.67979,0.693428,0.67979,0.681615
3,No log,0.858071,0.685039,0.69034,0.685039,0.68578
4,No log,0.865185,0.681102,0.687618,0.681102,0.682449
5,No log,0.877061,0.683727,0.6894,0.683727,0.684939
6,No log,0.890857,0.69685,0.699576,0.69685,0.69707
7,No log,0.920534,0.683727,0.686316,0.683727,0.684666
8,No log,0.931244,0.67979,0.68678,0.67979,0.68102
9,No log,0.95715,0.69685,0.697568,0.69685,0.696596


***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/run-2/checkpoint-18
Configuration saved in NbAiLab/nb-bert-large/run-2/checkpoint-18/config.json
Model weights saved in NbAiLab/nb-bert-large/run-2/checkpoint-18/pytorch_model.bin
tokenizer config file saved in NbAiLab/nb-bert-large/run-2/checkpoint-18/tokenizer_config.json
Special tokens file saved in NbAiLab/nb-bert-large/run-2/checkpoint-18/special_tokens_map.json
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/run-2/checkpoint-36
Configuration saved in NbAiLab/nb-bert-large/run-2/checkpoint-36/config.json
Model weights saved in NbAiLab/nb-bert-large/run-2/checkpoint-36/pytorch_model.bin
tokenizer config file saved in NbAiLab/nb-bert-large/run-2/checkpoint-36/tokenizer_config.json
Special tokens file saved in NbAiLab/nb-bert-large/run-2/checkpoint-36/special_tokens_map.json
***** Running Evalua

Run history:
eval/accuracy,█▁▃▁▂▇▂▁▇
eval/f1,█▁▃▂▃▇▂▁▇
eval/loss,▁▃▃▄▄▅▆▇█
eval/precision,█▅▃▂▃█▁▁▇
eval/recall,█▁▃▁▂▇▂▁▇
eval/runtime,▅▅▅▃▃█▂▁▅
eval/samples_per_second,▄▄▄▆▆▁▇█▄
eval/steps_per_second,▄▄▅▆▆▁▇█▄
train/epoch,▁▂▃▄▅▅▆▇██
train/global_step,▁▂▃▄▅▅▆▇██

Run summary:
eval/accuracy,0.69685
eval/f1,0.6966
eval/loss,0.95715
eval/precision,0.69757
eval/recall,0.69685
eval/runtime,5.0964
eval/samples_per_second,149.517
eval/steps_per_second,1.177
train/epoch,9.0
train/global_step,162.0




Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,0.823002,0.688976,0.691427,0.688976,0.688998
2,No log,0.84058,0.685039,0.692376,0.685039,0.686315
3,No log,0.845481,0.686352,0.693955,0.686352,0.687448
4,No log,0.848472,0.686352,0.693991,0.686352,0.687581
5,No log,0.855913,0.686352,0.69192,0.686352,0.687366
6,No log,0.863987,0.682415,0.689315,0.682415,0.683891
7,No log,0.867894,0.691601,0.695479,0.691601,0.691967
8,No log,0.880469,0.686352,0.690256,0.686352,0.687336
9,No log,0.888521,0.685039,0.689093,0.685039,0.686018


***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/run-3/checkpoint-18
Configuration saved in NbAiLab/nb-bert-large/run-3/checkpoint-18/config.json
Model weights saved in NbAiLab/nb-bert-large/run-3/checkpoint-18/pytorch_model.bin
tokenizer config file saved in NbAiLab/nb-bert-large/run-3/checkpoint-18/tokenizer_config.json
Special tokens file saved in NbAiLab/nb-bert-large/run-3/checkpoint-18/special_tokens_map.json
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/run-3/checkpoint-36
Configuration saved in NbAiLab/nb-bert-large/run-3/checkpoint-36/config.json
Model weights saved in NbAiLab/nb-bert-large/run-3/checkpoint-36/pytorch_model.bin
tokenizer config file saved in NbAiLab/nb-bert-large/run-3/checkpoint-36/tokenizer_config.json
Special tokens file saved in NbAiLab/nb-bert-large/run-3/checkpoint-36/special_tokens_map.json
***** Running Evalua

Run history:
eval/accuracy,▆▃▄▄▄▁█▄▃
eval/f1,▅▃▄▄▄▁█▄▃
eval/loss,▁▃▃▄▅▅▆▇█
eval/precision,▄▅▆▆▄▁█▂▁
eval/recall,▆▃▄▄▄▁█▄▃
eval/runtime,▄▆▆█▃▆▃▁▄
eval/samples_per_second,▅▃▃▁▆▃▆█▅
eval/steps_per_second,▅▃▃▁▆▃▆█▅
train/epoch,▁▂▃▄▅▅▆▇██
train/global_step,▁▂▃▄▅▅▆▇██

Run summary:
eval/accuracy,0.68504
eval/f1,0.68602
eval/loss,0.88852
eval/precision,0.68909
eval/recall,0.68504
eval/runtime,5.0892
eval/samples_per_second,149.729
eval/steps_per_second,1.179
train/epoch,9.0
train/global_step,162.0




Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,0.848346,0.69685,0.69835,0.69685,0.696519
2,No log,0.885878,0.675853,0.689821,0.675853,0.677877
3,No log,0.880229,0.688976,0.694486,0.688976,0.689606
4,No log,0.889938,0.682415,0.689698,0.682415,0.683998
5,No log,0.897045,0.685039,0.68986,0.685039,0.686063
6,No log,0.909977,0.691601,0.694828,0.691601,0.692132
7,No log,0.930803,0.688976,0.691699,0.688976,0.689733
8,No log,0.942103,0.678478,0.684021,0.678478,0.679686
9,No log,0.963396,0.698163,0.700051,0.698163,0.698286


***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/run-4/checkpoint-18
Configuration saved in NbAiLab/nb-bert-large/run-4/checkpoint-18/config.json
Model weights saved in NbAiLab/nb-bert-large/run-4/checkpoint-18/pytorch_model.bin
tokenizer config file saved in NbAiLab/nb-bert-large/run-4/checkpoint-18/tokenizer_config.json
Special tokens file saved in NbAiLab/nb-bert-large/run-4/checkpoint-18/special_tokens_map.json
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/run-4/checkpoint-36
Configuration saved in NbAiLab/nb-bert-large/run-4/checkpoint-36/config.json
Model weights saved in NbAiLab/nb-bert-large/run-4/checkpoint-36/pytorch_model.bin
tokenizer config file saved in NbAiLab/nb-bert-large/run-4/checkpoint-36/tokenizer_config.json
Special tokens file saved in NbAiLab/nb-bert-large/run-4/checkpoint-36/special_tokens_map.json
***** Running Evalua

Run history:
eval/accuracy,█▁▅▃▄▆▅▂█
eval/f1,▇▁▅▃▄▆▅▂█
eval/loss,▁▃▃▄▄▅▆▇█
eval/precision,▇▄▆▃▄▆▄▁█
eval/recall,█▁▅▃▄▆▅▂█
eval/runtime,▃▃▅▁▆▄▂█▅
eval/samples_per_second,▆▆▄█▃▅▇▁▄
eval/steps_per_second,▆▆▃█▃▅▇▁▄
train/epoch,▁▂▃▄▅▅▆▇██
train/global_step,▁▂▃▄▅▅▆▇██

Run summary:
eval/accuracy,0.69816
eval/f1,0.69829
eval/loss,0.9634
eval/precision,0.70005
eval/recall,0.69816
eval/runtime,5.113
eval/samples_per_second,149.033
eval/steps_per_second,1.173
train/epoch,9.0
train/global_step,162.0




Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,0.871669,0.695538,0.696479,0.695538,0.695322


***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
[I 2022-11-28 11:41:28,023] Trial 5 pruned.
Trial: {'learning_rate': 1.6358357089468507e-05, 'weight_decay': 0.00027882917298134975}
***** Running training *****
  Num examples = 2283
  Num Epochs = 200
  Instantaneous batch size per device = 128
  Total train batch size (w. parallel, distributed & accumulation) = 128
  Gradient Accumulation steps = 1
  Total optimization steps = 3600
  Number of trainable parameters = 355090435


Run history:
eval/accuracy,▁
eval/f1,▁
eval/loss,▁
eval/precision,▁
eval/recall,▁
eval/runtime,▁
eval/samples_per_second,▁
eval/steps_per_second,▁
train/epoch,▁
train/global_step,▁

Run summary:
eval/accuracy,0.69554
eval/f1,0.69532
eval/loss,0.87167
eval/precision,0.69648
eval/recall,0.69554
eval/runtime,5.0657
eval/samples_per_second,150.424
eval/steps_per_second,1.184
train/epoch,1.0
train/global_step,18.0




Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,0.941349,0.681102,0.681839,0.681102,0.681259
2,No log,1.071323,0.67979,0.679285,0.67979,0.675749
3,No log,1.149028,0.688976,0.688823,0.688976,0.688799
4,No log,1.215278,0.677165,0.68644,0.677165,0.6796
5,No log,1.268815,0.674541,0.687835,0.674541,0.677114
6,No log,1.394922,0.687664,0.689565,0.687664,0.688384
7,No log,1.629775,0.65748,0.664344,0.65748,0.65484
8,No log,1.590895,0.682415,0.691583,0.682415,0.68493
9,No log,1.55267,0.690289,0.695629,0.690289,0.691705


***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/run-6/checkpoint-18
Configuration saved in NbAiLab/nb-bert-large/run-6/checkpoint-18/config.json
Model weights saved in NbAiLab/nb-bert-large/run-6/checkpoint-18/pytorch_model.bin
tokenizer config file saved in NbAiLab/nb-bert-large/run-6/checkpoint-18/tokenizer_config.json
Special tokens file saved in NbAiLab/nb-bert-large/run-6/checkpoint-18/special_tokens_map.json
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/run-6/checkpoint-36
Configuration saved in NbAiLab/nb-bert-large/run-6/checkpoint-36/config.json
Model weights saved in NbAiLab/nb-bert-large/run-6/checkpoint-36/pytorch_model.bin
tokenizer config file saved in NbAiLab/nb-bert-large/run-6/checkpoint-36/tokenizer_config.json
Special tokens file saved in NbAiLab/nb-bert-large/run-6/checkpoint-36/special_tokens_map.json
***** Running Evalua

Run history:
eval/accuracy,▆▆█▅▅▇▁▆█
eval/f1,▆▅▇▆▅▇▁▇█
eval/loss,▁▂▃▄▄▆██▇
eval/precision,▅▄▆▆▆▇▁▇█
eval/recall,▆▆█▅▅▇▁▆█
eval/runtime,▄█▂▁▃▂▃▃▅
eval/samples_per_second,▅▁▇█▆▇▆▆▄
eval/steps_per_second,▅▁▇█▆▇▆▅▄
train/epoch,▁▂▃▄▅▅▆▇██
train/global_step,▁▂▃▄▅▅▆▇██

Run summary:
eval/accuracy,0.69029
eval/f1,0.69171
eval/loss,1.55267
eval/precision,0.69563
eval/recall,0.69029
eval/runtime,5.1119
eval/samples_per_second,149.065
eval/steps_per_second,1.174
train/epoch,9.0
train/global_step,162.0




Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,1.006448,0.695538,0.69273,0.695538,0.693469


***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
[I 2022-11-28 11:49:50,906] Trial 7 pruned.
Trial: {'learning_rate': 8.924919766561136e-05, 'weight_decay': 0.00022652421826064277}
***** Running training *****
  Num examples = 2283
  Num Epochs = 200
  Instantaneous batch size per device = 128
  Total train batch size (w. parallel, distributed & accumulation) = 128
  Gradient Accumulation steps = 1
  Total optimization steps = 3600
  Number of trainable parameters = 355090435


Run history:
eval/accuracy,▁
eval/f1,▁
eval/loss,▁
eval/precision,▁
eval/recall,▁
eval/runtime,▁
eval/samples_per_second,▁
eval/steps_per_second,▁
train/epoch,▁
train/global_step,▁

Run summary:
eval/accuracy,0.69554
eval/f1,0.69347
eval/loss,1.00645
eval/precision,0.69273
eval/recall,0.69554
eval/runtime,5.1163
eval/samples_per_second,148.935
eval/steps_per_second,1.173
train/epoch,1.0
train/global_step,18.0




Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,0.920397,0.653543,0.675565,0.653543,0.65551
2,No log,0.951824,0.667979,0.67402,0.667979,0.660286
3,No log,1.123179,0.654856,0.655878,0.654856,0.65057
4,No log,1.306013,0.66273,0.659891,0.66273,0.658307
5,No log,1.308832,0.669291,0.685823,0.669291,0.664943
6,No log,1.253387,0.669291,0.675781,0.669291,0.664048
7,No log,1.662563,0.648294,0.690669,0.648294,0.649038
8,No log,1.622404,0.675853,0.672968,0.675853,0.67356
9,No log,2.201128,0.658793,0.666766,0.658793,0.643362


***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/run-8/checkpoint-18
Configuration saved in NbAiLab/nb-bert-large/run-8/checkpoint-18/config.json
Model weights saved in NbAiLab/nb-bert-large/run-8/checkpoint-18/pytorch_model.bin
tokenizer config file saved in NbAiLab/nb-bert-large/run-8/checkpoint-18/tokenizer_config.json
Special tokens file saved in NbAiLab/nb-bert-large/run-8/checkpoint-18/special_tokens_map.json
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/run-8/checkpoint-36
Configuration saved in NbAiLab/nb-bert-large/run-8/checkpoint-36/config.json
Model weights saved in NbAiLab/nb-bert-large/run-8/checkpoint-36/pytorch_model.bin
tokenizer config file saved in NbAiLab/nb-bert-large/run-8/checkpoint-36/tokenizer_config.json
Special tokens file saved in NbAiLab/nb-bert-large/run-8/checkpoint-36/special_tokens_map.json
***** Running Evalua


Metric,History
eval/accuracy,▂▆▃▅▆▆▁█▄
eval/f1,▄▅▃▄▆▆▂█▁
eval/loss,▁▁▂▃▃▃▅▅█
eval/precision,▅▅▁▂▇▅█▄▃
eval/recall,▂▆▃▅▆▆▁█▄
eval/runtime,▅▄▁▁█▃▅▇▁
eval/samples_per_second,▄▅██▁▆▄▂█
eval/steps_per_second,▄▅██▁▆▄▂▇
train/epoch,▁▂▃▄▅▅▆▇██
train/global_step,▁▂▃▄▅▅▆▇██

Metric,Value
eval/accuracy,0.65879
eval/f1,0.64336
eval/loss,2.20113
eval/precision,0.66677
eval/recall,0.65879
eval/runtime,5.0494
eval/samples_per_second,150.908
eval/steps_per_second,1.188
train/epoch,9.0
train/global_step,162.0


Automatic Weights & Biases logging enabled, to disable set os.environ["WANDB_DISABLED"] = "true"


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,1.210971,0.653543,0.663371,0.653543,0.654579
2,No log,1.044037,0.681102,0.69546,0.681102,0.683973
3,No log,1.123331,0.698163,0.701714,0.698163,0.698936
4,No log,1.363359,0.685039,0.68616,0.685039,0.676897
5,No log,1.325615,0.641732,0.676949,0.641732,0.644387
6,No log,1.509836,0.681102,0.679322,0.681102,0.678639
7,No log,1.484062,0.690289,0.694544,0.690289,0.682628
8,No log,1.476684,0.687664,0.686338,0.687664,0.686116
9,No log,1.563144,0.669291,0.675941,0.669291,0.67129
10,No log,1.726755,0.620735,0.654785,0.620735,0.623568


***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/run-9/checkpoint-18
Configuration saved in NbAiLab/nb-bert-large/run-9/checkpoint-18/config.json
Model weights saved in NbAiLab/nb-bert-large/run-9/checkpoint-18/pytorch_model.bin
tokenizer config file saved in NbAiLab/nb-bert-large/run-9/checkpoint-18/tokenizer_config.json
Special tokens file saved in NbAiLab/nb-bert-large/run-9/checkpoint-18/special_tokens_map.json
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/run-9/checkpoint-36
Configuration saved in NbAiLab/nb-bert-large/run-9/checkpoint-36/config.json
Model weights saved in NbAiLab/nb-bert-large/run-9/checkpoint-36/pytorch_model.bin
tokenizer config file saved in NbAiLab/nb-bert-large/run-9/checkpoint-36/tokenizer_config.json
Special tokens file saved in NbAiLab/nb-bert-large/run-9/checkpoint-36/special_tokens_map.json
***** Running Evalua

In [None]:
# Display the best run found by the hyperparameter search
best_run

BestRun(run_id='0', objective=2.595929793216958, hyperparameters={'learning_rate': 1.0211178508623191e-06, 'weight_decay': 0.0035668865310456033})
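
The `objective` of roughly 2.6 is larger than any single metric can be. That is consistent with the search having used the default objective of the Hugging Face `Trainer`, which sums the reported evaluation metrics after dropping the loss, the epoch and the speed statistics. Whether a custom `compute_objective` was passed is not visible here, so this is an assumption. The function below is a hypothetical, simplified re-implementation for illustration, applied to the single-epoch trial summary logged earlier in this output:

```python
def summed_objective(metrics):
    """Sum all reported metrics except loss, epoch and speed statistics.

    Hypothetical helper mirroring the assumed default objective.
    """
    kept = {
        name: value
        for name, value in metrics.items()
        if name not in ("eval_loss", "epoch")
        and not name.endswith(("_runtime", "_per_second"))
    }
    return sum(kept.values())

# Values copied from the single-epoch trial summary in the log above.
trial_metrics = {
    "eval_accuracy": 0.69554,
    "eval_f1": 0.69347,
    "eval_loss": 1.00645,
    "eval_precision": 0.69273,
    "eval_recall": 0.69554,
    "eval_runtime": 5.1163,
    "eval_samples_per_second": 148.935,
    "eval_steps_per_second": 1.173,
    "epoch": 1.0,
}

print(round(summed_objective(trial_metrics), 5))
```

Under the same assumption, an objective of 2.596 corresponds to an average of about 0.649 per metric. A summed score of this kind is better when higher, so the search should be run with `direction="maximize"`; the default direction of `hyperparameter_search` is `"minimize"`, and the direction used here is not shown in this output.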

# Initializing fine-tuning with parameters from hyperparameter search
The loop below trains the model 10 times and saves the results of each run. Fine-tuning language models is stochastic, so the average over 10 runs gives a more reliable performance estimate than a single run.

In [None]:
import evaluate
import numpy as np
import pandas as pd
import tensorflow as tf


def compute_metrics_end(preds, refs):
    """Return accuracy and weighted precision, recall and F1 for label ids."""
    scores = {"accuracy": evaluate.load("accuracy").compute(predictions=preds, references=refs)["accuracy"]}
    for name in ("precision", "recall", "f1"):
        metric = evaluate.load(name)
        scores[name] = metric.compute(predictions=preds, references=refs, average="weighted")[name]
    return scores


def label_name(label_id):
    # 0 = negative, 1 = neutral, anything else = positive
    return "negative" if label_id == 0 else "neutral" if label_id == 1 else "positive"


for i in range(10):
  # NOTE: `model` is the same object in every iteration, so each run continues
  # from the weights left by the previous run. Re-create the model here if the
  # ten runs are meant to be independent.
  trainer = Trainer(
      model=model,
      args=training_args,
      train_dataset=tokenized_datasets["train"],
      eval_dataset=tokenized_datasets["test"],
      compute_metrics=compute_metrics,
      callbacks=[early_stop])

  # run the experiment with the best hyperparameters from the hyperparameter search
  for n, v in best_run.hyperparameters.items():
      setattr(trainer.args, n, v)

  trainer.train() # argument trial can be used for hyperparameter search

  trainer.evaluate()

  # creating model predictions for the validation data
  predictions_val = trainer.predict(tokenized_datasets["valid"])

  # choosing the class with the highest logit (and therefore the highest probability)
  preds_val_val = np.argmax(predictions_val.predictions, axis=-1)

  # converting the logits to probabilities
  predictions_probabilities = tf.nn.softmax(predictions_val.predictions)

  metrics_val = compute_metrics_end(preds=preds_val_val, refs=predictions_val.label_ids)

  # creating model predictions for the test data
  predictions_test = trainer.predict(tokenized_datasets["test"])
  preds_test_test = np.argmax(predictions_test.predictions, axis=-1)
  predictions_probabilities_test = tf.nn.softmax(predictions_test.predictions)

  metrics_test = compute_metrics_end(preds=preds_test_test, refs=predictions_test.label_ids)

  # per-example report for the validation data
  # ('Misclassification' is "TRUE" for a correct prediction and "MISS" otherwise)
  data = {'Predicted Labels': [label_name(p) for p in preds_val_val],
          'True Labels': [label_name(t) for t in predictions_val.label_ids],
          'Misclassification': ["TRUE" if p == t else 'MISS' for p, t in zip(preds_val_val, predictions_val.label_ids)],
          'Text': dataset_recombined['valid']['text'],
          'Logit Values': [str(row) for row in predictions_val.predictions],
          'Probabilities': [str(row) for row in np.asarray(predictions_probabilities)]}
  df = pd.DataFrame(data)
  df_metrics_val = pd.DataFrame(metrics_val.items())
  df_metrics_test = pd.DataFrame(metrics_test.items())

  # one set of files per run, indexed by the run number i
  df.to_csv(f"/content/drive/MyDrive/BA_data/nbailab/df_classification_report{i}.csv")
  df_metrics_val.to_csv(f"/content/drive/MyDrive/BA_data/nbailab/df_classification_metrics_val{i}.csv")
  df_metrics_test.to_csv(f"/content/drive/MyDrive/BA_data/nbailab/df_classification_metrics_test{i}.csv")

wandb.finish()
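
The loop uses TensorFlow only to turn logits into probabilities. As an alternative sketch (not the notebook's method), the same softmax can be computed with NumPy alone, which avoids loading TensorFlow next to the PyTorch model. The logit values below are made up for illustration:

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row maximum keeps exp() from overflowing."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Illustrative logits for two texts and three classes.
logits = np.array([[2.0, 0.5, -1.0],
                   [-0.3, 0.1, 1.2]])

probabilities = softmax(logits)
predicted = np.argmax(probabilities, axis=-1)  # same result as argmax over the logits

print(probabilities.sum(axis=-1), predicted)
```

Softmax preserves the ordering of the logits within a row, so taking the argmax of the logits or of the probabilities selects the same class.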

***** Running training *****
  Num examples = 2283
  Num Epochs = 200
  Instantaneous batch size per device = 128
  Total train batch size (w. parallel, distributed & accumulation) = 128
  Gradient Accumulation steps = 1
  Total optimization steps = 3600
  Number of trainable parameters = 355090435
Automatic Weights & Biases logging enabled, to disable set os.environ["WANDB_DISABLED"] = "true"


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,1.616209,0.658793,0.670291,0.658793,0.661832
2,No log,1.611414,0.671916,0.673999,0.671916,0.672767
3,No log,1.638412,0.671916,0.671111,0.671916,0.671282
4,No log,1.674264,0.671916,0.671023,0.671916,0.670918
5,No log,1.70373,0.675853,0.675233,0.675853,0.675121
6,No log,1.726892,0.674541,0.673049,0.674541,0.673514
7,No log,1.745444,0.677165,0.675593,0.677165,0.676094
8,No log,1.774629,0.677165,0.675461,0.677165,0.675948
9,No log,1.793698,0.675853,0.674472,0.675853,0.674952
10,No log,1.819797,0.669291,0.668465,0.669291,0.668685


***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-18
Configuration saved in NbAiLab/nb-bert-large/checkpoint-18/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-18/pytorch_model.bin
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-36
Configuration saved in NbAiLab/nb-bert-large/checkpoint-36/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-36/pytorch_model.bin
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-54
Configuration saved in NbAiLab/nb-bert-large/checkpoint-54/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-54/pytorch_model.bin
Deleting older checkpoint [NbAiLab/nb-bert-large/checkpoint-18] due to args.save_total_limit
***** Running Evaluation *****
  Num examples = 762
  Batch s

***** Running Prediction *****
  Num examples = 761
  Batch size = 128
***** Running Prediction *****
  Num examples = 762
  Batch size = 128


***** Running training *****
  Num examples = 2283
  Num Epochs = 200
  Instantaneous batch size per device = 128
  Total train batch size (w. parallel, distributed & accumulation) = 128
  Gradient Accumulation steps = 1
  Total optimization steps = 3600
  Number of trainable parameters = 355090435
Automatic Weights & Biases logging enabled, to disable set os.environ["WANDB_DISABLED"] = "true"




***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-18
Configuration saved in NbAiLab/nb-bert-large/checkpoint-18/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-18/pytorch_model.bin
Deleting older checkpoint [NbAiLab/nb-bert-large/checkpoint-36] due to args.save_total_limit
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-36
Configuration saved in NbAiLab/nb-bert-large/checkpoint-36/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-36/pytorch_model.bin
Deleting older checkpoint [NbAiLab/nb-bert-large/checkpoint-180] due to args.save_total_limit


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,1.642041,0.678478,0.679075,0.678478,0.678756
2,No log,1.695572,0.674541,0.672987,0.674541,0.673445
3,No log,1.73827,0.674541,0.673263,0.674541,0.673573
4,No log,1.7785,0.667979,0.66772,0.667979,0.667131
5,No log,1.792992,0.671916,0.670668,0.671916,0.670814
6,No log,1.802612,0.678478,0.676864,0.678478,0.677385
7,No log,1.812596,0.678478,0.676836,0.678478,0.677396
8,No log,1.84016,0.675853,0.674441,0.675853,0.674865
9,No log,1.8566,0.674541,0.673284,0.674541,0.673733


***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-54
Configuration saved in NbAiLab/nb-bert-large/checkpoint-54/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-54/pytorch_model.bin
Deleting older checkpoint [NbAiLab/nb-bert-large/checkpoint-36] due to args.save_total_limit
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-72
Configuration saved in NbAiLab/nb-bert-large/checkpoint-72/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-72/pytorch_model.bin
Deleting older checkpoint [NbAiLab/nb-bert-large/checkpoint-54] due to args.save_total_limit
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-90
Configuration saved in NbAiLab/nb-bert-large/checkpoint-90/config.json
Model weights saved in NbAiLab/nb-bert-large/

***** Running Prediction *****
  Num examples = 761
  Batch size = 128
***** Running Prediction *****
  Num examples = 762
  Batch size = 128


***** Running training *****
  Num examples = 2283
  Num Epochs = 200
  Instantaneous batch size per device = 128
  Total train batch size (w. parallel, distributed & accumulation) = 128
  Gradient Accumulation steps = 1
  Total optimization steps = 3600
  Number of trainable parameters = 355090435
Automatic Weights & Biases logging enabled, to disable set os.environ["WANDB_DISABLED"] = "true"


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,1.708058,0.677165,0.677523,0.677165,0.677306
2,No log,1.768358,0.671916,0.671305,0.671916,0.671275
3,No log,1.79152,0.674541,0.672987,0.674541,0.673445
4,No log,1.825149,0.666667,0.666315,0.666667,0.665536
5,No log,1.836949,0.670604,0.669722,0.670604,0.66948
6,No log,1.837024,0.677165,0.675593,0.677165,0.676094
7,No log,1.843356,0.678478,0.676813,0.678478,0.677413
8,No log,1.869613,0.677165,0.675551,0.677165,0.676107
9,No log,1.885252,0.677165,0.675551,0.677165,0.676107


***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-18
Configuration saved in NbAiLab/nb-bert-large/checkpoint-18/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-18/pytorch_model.bin
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-36
Configuration saved in NbAiLab/nb-bert-large/checkpoint-36/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-36/pytorch_model.bin
Deleting older checkpoint [NbAiLab/nb-bert-large/checkpoint-162] due to args.save_total_limit
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-54
Configuration saved in NbAiLab/nb-bert-large/checkpoint-54/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-54/pytorch_model.bin
Deleting older checkpoint [NbAiLab/nb-bert-large/checkpoint-

***** Running Prediction *****
  Num examples = 761
  Batch size = 128
***** Running Prediction *****
  Num examples = 762
  Batch size = 128


***** Running training *****
  Num examples = 2283
  Num Epochs = 200
  Instantaneous batch size per device = 128
  Total train batch size (w. parallel, distributed & accumulation) = 128
  Gradient Accumulation steps = 1
  Total optimization steps = 3600
  Number of trainable parameters = 355090435
Automatic Weights & Biases logging enabled, to disable set os.environ["WANDB_DISABLED"] = "true"


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,1.792211,0.670604,0.672782,0.670604,0.67137
2,No log,1.833226,0.665354,0.665255,0.665354,0.664519
3,No log,1.837805,0.675853,0.674472,0.675853,0.674952
4,No log,1.868113,0.669291,0.668808,0.669291,0.668611
5,No log,1.878737,0.667979,0.667606,0.667979,0.667386
6,No log,1.875308,0.677165,0.675551,0.677165,0.676107
7,No log,1.877509,0.678478,0.676905,0.678478,0.677469
8,No log,1.90199,0.677165,0.675551,0.677165,0.676107
9,No log,1.916203,0.677165,0.675669,0.677165,0.676171


***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-18
Configuration saved in NbAiLab/nb-bert-large/checkpoint-18/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-18/pytorch_model.bin
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-36
Configuration saved in NbAiLab/nb-bert-large/checkpoint-36/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-36/pytorch_model.bin
Deleting older checkpoint [NbAiLab/nb-bert-large/checkpoint-162] due to args.save_total_limit
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-54
Configuration saved in NbAiLab/nb-bert-large/checkpoint-54/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-54/pytorch_model.bin
Deleting older checkpoint [NbAiLab/nb-bert-large/checkpoint-

***** Running Prediction *****
  Num examples = 761
  Batch size = 128
***** Running Prediction *****
  Num examples = 762
  Batch size = 128


***** Running training *****
  Num examples = 2283
  Num Epochs = 200
  Instantaneous batch size per device = 128
  Total train batch size (w. parallel, distributed & accumulation) = 128
  Gradient Accumulation steps = 1
  Total optimization steps = 3600
  Number of trainable parameters = 355090435
Automatic Weights & Biases logging enabled, to disable set os.environ["WANDB_DISABLED"] = "true"


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,1.841328,0.667979,0.670145,0.667979,0.668769
2,No log,1.87426,0.667979,0.66825,0.667979,0.667466
3,No log,1.879728,0.674541,0.67313,0.674541,0.673591
4,No log,1.905888,0.666667,0.666002,0.666667,0.665899
5,No log,1.916048,0.666667,0.666529,0.666667,0.666243
6,No log,1.909873,0.670604,0.669251,0.670604,0.66975
7,No log,1.907953,0.675853,0.674125,0.675853,0.674781
8,No log,1.928145,0.674541,0.672894,0.674541,0.673467
9,No log,1.94062,0.675853,0.674563,0.675853,0.675029


***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-18
Configuration saved in NbAiLab/nb-bert-large/checkpoint-18/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-18/pytorch_model.bin
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-36
Configuration saved in NbAiLab/nb-bert-large/checkpoint-36/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-36/pytorch_model.bin
Deleting older checkpoint [NbAiLab/nb-bert-large/checkpoint-162] due to args.save_total_limit
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-54
Configuration saved in NbAiLab/nb-bert-large/checkpoint-54/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-54/pytorch_model.bin
Deleting older checkpoint [NbAiLab/nb-bert-large/checkpoint-

***** Running Prediction *****
  Num examples = 761
  Batch size = 128
***** Running Prediction *****
  Num examples = 762
  Batch size = 128


***** Running training *****
  Num examples = 2283
  Num Epochs = 200
  Instantaneous batch size per device = 128
  Total train batch size (w. parallel, distributed & accumulation) = 128
  Gradient Accumulation steps = 1
  Total optimization steps = 3600
  Number of trainable parameters = 355090435
Automatic Weights & Biases logging enabled, to disable set os.environ["WANDB_DISABLED"] = "true"


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,1.878359,0.670604,0.672524,0.670604,0.671338
2,No log,1.909589,0.667979,0.668293,0.667979,0.667643
3,No log,1.91995,0.670604,0.669495,0.670604,0.669714
4,No log,1.9353,0.666667,0.666002,0.666667,0.665899
5,No log,1.942691,0.667979,0.667405,0.667979,0.667255
6,No log,1.936604,0.671916,0.670731,0.671916,0.671174
7,No log,1.931325,0.678478,0.676688,0.678478,0.677357
8,No log,1.949004,0.675853,0.674165,0.675853,0.674759
9,No log,1.960617,0.673228,0.671909,0.673228,0.672391


***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-18
Configuration saved in NbAiLab/nb-bert-large/checkpoint-18/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-18/pytorch_model.bin
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-36
Configuration saved in NbAiLab/nb-bert-large/checkpoint-36/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-36/pytorch_model.bin
Deleting older checkpoint [NbAiLab/nb-bert-large/checkpoint-162] due to args.save_total_limit
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-54
Configuration saved in NbAiLab/nb-bert-large/checkpoint-54/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-54/pytorch_model.bin
Deleting older checkpoint [NbAiLab/nb-bert-large/checkpoint-

***** Running Prediction *****
  Num examples = 761
  Batch size = 128
***** Running Prediction *****
  Num examples = 762
  Batch size = 128


***** Running training *****
  Num examples = 2283
  Num Epochs = 200
  Instantaneous batch size per device = 128
  Total train batch size (w. parallel, distributed & accumulation) = 128
  Gradient Accumulation steps = 1
  Total optimization steps = 3600
  Number of trainable parameters = 355090435
Automatic Weights & Biases logging enabled, to disable set os.environ["WANDB_DISABLED"] = "true"


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,1.909895,0.674541,0.67609,0.674541,0.675194
2,No log,1.937727,0.670604,0.671232,0.670604,0.670626
3,No log,1.955168,0.666667,0.666109,0.666667,0.666084
4,No log,1.960753,0.667979,0.667513,0.667979,0.66734
5,No log,1.967935,0.669291,0.668858,0.669291,0.668793
6,No log,1.963429,0.673228,0.671909,0.673228,0.672391
7,No log,1.957422,0.677165,0.675356,0.677165,0.676015
8,No log,1.973472,0.674541,0.672701,0.674541,0.673332
9,No log,1.986351,0.669291,0.668236,0.669291,0.668505


***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-18
Configuration saved in NbAiLab/nb-bert-large/checkpoint-18/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-18/pytorch_model.bin
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-36
Configuration saved in NbAiLab/nb-bert-large/checkpoint-36/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-36/pytorch_model.bin
Deleting older checkpoint [NbAiLab/nb-bert-large/checkpoint-162] due to args.save_total_limit
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-54
Configuration saved in NbAiLab/nb-bert-large/checkpoint-54/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-54/pytorch_model.bin
Deleting older checkpoint [NbAiLab/nb-bert-large/checkpoint-

***** Running Prediction *****
  Num examples = 761
  Batch size = 128
***** Running Prediction *****
  Num examples = 762
  Batch size = 128


***** Running training *****
  Num examples = 2283
  Num Epochs = 200
  Instantaneous batch size per device = 128
  Total train batch size (w. parallel, distributed & accumulation) = 128
  Gradient Accumulation steps = 1
  Total optimization steps = 3600
  Number of trainable parameters = 355090435
Automatic Weights & Biases logging enabled, to disable set os.environ["WANDB_DISABLED"] = "true"


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,1.937043,0.67979,0.681003,0.67979,0.680334
2,No log,1.960954,0.674541,0.675122,0.674541,0.674759
3,No log,1.984301,0.666667,0.666585,0.666667,0.666339
4,No log,1.984695,0.666667,0.666116,0.666667,0.665987
5,No log,1.988836,0.669291,0.66873,0.669291,0.668704
6,No log,1.983042,0.671916,0.670898,0.671916,0.671222
7,No log,1.971852,0.677165,0.675906,0.677165,0.676435
8,No log,1.985535,0.678478,0.677071,0.678478,0.677632
9,No log,1.998818,0.671916,0.671056,0.671916,0.671293


***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-18
Configuration saved in NbAiLab/nb-bert-large/checkpoint-18/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-18/pytorch_model.bin
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-36
Configuration saved in NbAiLab/nb-bert-large/checkpoint-36/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-36/pytorch_model.bin
Deleting older checkpoint [NbAiLab/nb-bert-large/checkpoint-162] due to args.save_total_limit
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-54
Configuration saved in NbAiLab/nb-bert-large/checkpoint-54/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-54/pytorch_model.bin
Deleting older checkpoint [NbAiLab/nb-bert-large/checkpoint-

***** Running Prediction *****
  Num examples = 761
  Batch size = 128
***** Running Prediction *****
  Num examples = 762
  Batch size = 128


***** Running training *****
  Num examples = 2283
  Num Epochs = 200
  Instantaneous batch size per device = 128
  Total train batch size (w. parallel, distributed & accumulation) = 128
  Gradient Accumulation steps = 1
  Total optimization steps = 3600
  Number of trainable parameters = 355090435
Automatic Weights & Biases logging enabled, to disable set os.environ["WANDB_DISABLED"] = "true"


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,1.968036,0.678478,0.67957,0.678478,0.678974
2,No log,1.991967,0.674541,0.675302,0.674541,0.674861
3,No log,2.010931,0.666667,0.666585,0.666667,0.666339
4,No log,2.009233,0.666667,0.66603,0.666667,0.665996
5,No log,2.019124,0.666667,0.666887,0.666667,0.666558
6,No log,2.012761,0.669291,0.668927,0.669291,0.669054
7,No log,1.997847,0.673228,0.672706,0.673228,0.672953
8,No log,2.00984,0.673228,0.672706,0.673228,0.672953
9,No log,2.034338,0.669291,0.670207,0.669291,0.669681


***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-18
Configuration saved in NbAiLab/nb-bert-large/checkpoint-18/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-18/pytorch_model.bin
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-36
Configuration saved in NbAiLab/nb-bert-large/checkpoint-36/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-36/pytorch_model.bin
Deleting older checkpoint [NbAiLab/nb-bert-large/checkpoint-162] due to args.save_total_limit
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-54
Configuration saved in NbAiLab/nb-bert-large/checkpoint-54/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-54/pytorch_model.bin
Deleting older checkpoint [NbAiLab/nb-bert-large/checkpoint-

***** Running Prediction *****
  Num examples = 761
  Batch size = 128
***** Running Prediction *****
  Num examples = 762
  Batch size = 128


***** Running training *****
  Num examples = 2283
  Num Epochs = 200
  Instantaneous batch size per device = 128
  Total train batch size (w. parallel, distributed & accumulation) = 128
  Gradient Accumulation steps = 1
  Total optimization steps = 3600
  Number of trainable parameters = 355090435
Automatic Weights & Biases logging enabled, to disable set os.environ["WANDB_DISABLED"] = "true"


Epoch,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
1,No log,1.993423,0.682415,0.683739,0.682415,0.682954
2,No log,2.01026,0.678478,0.67938,0.678478,0.678869
3,No log,2.031716,0.666667,0.666418,0.666667,0.666395
4,No log,2.03078,0.664042,0.663085,0.664042,0.663097
5,No log,2.047827,0.664042,0.66428,0.664042,0.663979
6,No log,2.047661,0.670604,0.671228,0.670604,0.670892
7,No log,2.043708,0.678478,0.680406,0.678478,0.679259
8,No log,2.049685,0.678478,0.679109,0.678478,0.678745
9,No log,2.070817,0.674541,0.675105,0.674541,0.674809


***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-18
Configuration saved in NbAiLab/nb-bert-large/checkpoint-18/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-18/pytorch_model.bin
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-36
Configuration saved in NbAiLab/nb-bert-large/checkpoint-36/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-36/pytorch_model.bin
Deleting older checkpoint [NbAiLab/nb-bert-large/checkpoint-162] due to args.save_total_limit
***** Running Evaluation *****
  Num examples = 762
  Batch size = 128
Saving model checkpoint to NbAiLab/nb-bert-large/checkpoint-54
Configuration saved in NbAiLab/nb-bert-large/checkpoint-54/config.json
Model weights saved in NbAiLab/nb-bert-large/checkpoint-54/pytorch_model.bin
Deleting older checkpoint [NbAiLab/nb-bert-large/checkpoint-

***** Running Prediction *****
  Num examples = 761
  Batch size = 128
***** Running Prediction *****
  Num examples = 762
  Batch size = 128



Metric,History
eval/accuracy,▁█▅▃▄▅▅▃▄▄▅▅▄▅▅▃▃▅▄▄▄▅▃▃▅▄▃▄▄▅▃▅▅▃▃▄▆▃▅▆
eval/f1,▁█▅▄▄▄▄▃▄▄▄▅▄▄▄▃▃▅▄▄▃▄▃▃▅▄▃▄▄▅▃▄▅▃▃▄▅▂▅▅
eval/loss,▂▁▄▄▅▅▆▆▅▆▆▅▆▆▇▆▇▇▆▇▇▇▇▇▇▇▇▇▇▇█▇▇███████
eval/precision,▁█▄▃▃▃▃▂▃▂▃▄▃▃▃▁▂▄▃▃▂▃▂▂▃▃▂▃▃▄▂▃▄▂▂▃▅▁▄▅
eval/recall,▁█▅▃▄▅▅▃▄▄▅▅▄▅▅▃▃▅▄▄▄▅▃▃▅▄▃▄▄▅▃▅▅▃▃▄▆▃▅▆
eval/runtime,▁▃▂▃▃▃▂▃▂▂▂▁▂▁▂▁▁▁▄▂▁▂▂█▁▂▃▂▁▃▁▂▃▂▂▂▄▂▂▄
eval/samples_per_second,█▆▇▆▆▆▇▆▇▇▇█▇█▇███▅▇█▇▇▁█▇▆▇█▆█▇▆▇▇▇▅▇▇▅
eval/steps_per_second,█▆▇▆▆▆▇▆▇▇▇█▇█▇███▅▇█▇▇▁█▇▆▇█▆█▇▆▇▇▇▅▇▇▅
train/epoch,▁▃▆█▃▅▇█▃▅▇▁▃▆▇▃▅▇▁▃▆▇▂▄▆▇▃▆▇▂▄▆▇▃▅▇▁▃▆▇
train/global_step,▁▃▆█▃▅▇█▃▅▇▁▃▆▇▃▅▇▁▃▆▇▂▄▆▇▃▆▇▂▄▆▇▃▅▇▁▃▆▇

Metric,Value
eval/accuracy,0.68241
eval/f1,0.68295
eval/loss,1.99342
eval/precision,0.68374
eval/recall,0.68241
eval/runtime,5.1456
eval/samples_per_second,148.087
eval/steps_per_second,1.166
train/epoch,9.0
train/global_step,162.0
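
To turn the ten per-run results into the averaged estimate described for the fine-tuning loop, the per-run metric dictionaries can be combined into a mean and a standard deviation. A minimal sketch using only the standard library; the numbers are placeholders for illustration, not results from these runs, and `summarize_runs` is a hypothetical helper:

```python
from statistics import mean, stdev

def summarize_runs(runs):
    """Return {metric: (mean, sample standard deviation)} over a list of metric dicts."""
    summary = {}
    for name in runs[0]:
        values = [run[name] for run in runs]
        summary[name] = (mean(values), stdev(values))
    return summary

# Placeholder values; in practice, build one dict per saved metrics file.
runs = [
    {"accuracy": 0.60, "f1": 0.50},
    {"accuracy": 0.70, "f1": 0.60},
    {"accuracy": 0.80, "f1": 0.70},
]

summary = summarize_runs(runs)
for name, (avg, sd) in summary.items():
    print(f"{name}: {avg:.3f} +/- {sd:.3f}")
```

Reporting the standard deviation next to the mean shows how much of a difference between models could be explained by run-to-run variation alone.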
