<a href="https://colab.research.google.com/github/goerlitz/nlp-classification/blob/main/notebooks/10kGNAD/colab/TransformersEvaluation10kGNAD.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Evaluation of Pre-trained Transformer Models for German Text Classification

This notebook uses [SimpleTransformers](https://simpletransformers.ai/) and the [Ten Thousand German News Articles Dataset](https://tblock.github.io/10kGNAD/) (10kGNAD) to evaluate the performance of different pre-trained Transformer models from the Hugging Face model hub. Three model families are covered:

* BERT
* DistilBERT
* ELECTRA

## Prerequisites

In [1]:
models_names = [
          "bert-base-german-cased",
          "distilbert-base-german-cased",
          "dbmdz/bert-base-german-cased",
          "dbmdz/bert-base-german-uncased",
          "dbmdz/bert-base-german-europeana-cased",
          "dbmdz/bert-base-german-europeana-uncased",
          "dbmdz/distilbert-base-german-europeana-cased",
          "deepset/gbert-base",
          "deepset/gbert-large",
          "deepset/gelectra-base",
          "deepset/gelectra-large",
          "german-nlp-group/electra-base-german-uncased",
          
          "bert-base-multilingual-cased",
          "distilbert-base-multilingual-cased",
]

In [2]:
gpu_info = !nvidia-smi
gpu_info = '\n'.join(gpu_info)
if gpu_info.find('failed') >= 0:
  print('Select the Runtime > "Change runtime type" menu to enable a GPU accelerator, ')
  print('and then re-execute this cell.')
else:
  print(gpu_info)

Sun May 16 07:29:55 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 465.19.01    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   35C    P0    26W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces


### Install Transformers

In [12]:
# install transformers
!pip install -q --upgrade tqdm==4.47.0 transformers simpletransformers >/dev/null

# check installed version
!pip freeze | grep transformers
# simpletransformers==0.61.4
# transformers==4.6.0

simpletransformers==0.61.4
transformers==4.6.0


In [10]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pathlib import Path
import time

from simpletransformers.classification import ClassificationModel
from transformers import AutoTokenizer

### Connect Google Drive

In [13]:
from google.colab import drive

In [14]:
drive.mount('/content/gdrive')

Mounted at /content/gdrive


### Download Data

Download the train and test splits of the 10k German News Articles Dataset (10kGNAD).

In [15]:
%env DIR=data

!mkdir -p $DIR
!wget -nc https://github.com/tblock/10kGNAD/blob/master/train.csv?raw=true -nv -O $DIR/train.csv
!wget -nc https://github.com/tblock/10kGNAD/blob/master/test.csv?raw=true -nv -O $DIR/test.csv
!ls -lAh $DIR | cut -d " " -f 5-

env: DIR=data
2021-05-16 07:34:29 URL:https://raw.githubusercontent.com/tblock/10kGNAD/master/train.csv [24405789/24405789] -> "data/train.csv" [1]
2021-05-16 07:34:30 URL:https://raw.githubusercontent.com/tblock/10kGNAD/master/test.csv [2755020/2755020] -> "data/test.csv" [1]

2.7M May 16 07:34 test.csv
 24M May 16 07:34 train.csv


## Import Data

In [16]:
data_dir = Path("data/")

train_file = data_dir / 'train.csv'
test_file = data_dir / 'test.csv'

In [17]:
def load_file(filepath: Path) -> pd.DataFrame:
    f = pd.read_csv(filepath, sep=";", quotechar="'", names=['labels', 'text'])
    return f
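
The `sep` and `quotechar` arguments matter because article texts can themselves contain semicolons. As a minimal, dependency-free sketch of the same parsing rules, the standard `csv` module can be pointed at a small in-memory sample. The two sample rows below are made up for illustration and are not taken from the dataset, and `parse_rows` is a hypothetical helper.

```python
import csv
import io

# Hypothetical sample in the same layout as train.csv: label;text,
# with single quotes protecting a text that contains a semicolon.
SAMPLE = "Sport;'Tor in der 90. Minute; Spiel gedreht'\nWeb;Neues Update verfuegbar\n"


def parse_rows(raw: str) -> list:
    """Parse 'label;text' lines into dicts with the keys 'labels' and 'text'."""
    reader = csv.reader(io.StringIO(raw), delimiter=";", quotechar="'")
    return [{"labels": label, "text": text} for label, text in reader]


rows = parse_rows(SAMPLE)
for row in rows:
    print(row["labels"], "->", row["text"])
```

Without the quote character, the first sample row would be split into three fields instead of two.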

In [18]:
# load training dataset
train_df = load_file(train_file)
print(train_df.shape[0], 'articles')
display(train_df.head())

9245 articles


Unnamed: 0,labels,text
0,Sport,21-Jähriger fällt wohl bis Saisonende aus. Wie...
1,Kultur,"Erfundene Bilder zu Filmen, die als verloren g..."
2,Web,Der frischgekürte CEO Sundar Pichai setzt auf ...
3,Wirtschaft,"Putin: ""Einigung, dass wir Menge auf Niveau vo..."
4,Inland,Estland sieht den künftigen österreichischen P...


In [19]:
# load test dataset
test_df = load_file(test_file)
print(test_df.shape[0], 'articles')
display(test_df.head())

1028 articles


Unnamed: 0,labels,text
0,Wirtschaft,"Die Gewerkschaft GPA-djp lanciert den ""All-in-..."
1,Sport,Franzosen verteidigen 2:1-Führung – Kritische ...
2,Web,Neues Video von Designern macht im Netz die Ru...
3,Sport,23-jähriger Brasilianer muss vier Spiele pausi...
4,International,Aufständische verwendeten Chemikalie bei Gefec...


## Prepare for Model Training

SimpleTransformers expects the training data in a specific shape:

* columns should be labeled `labels` and `text`
* labels must be int values starting at `0`
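
These two requirements can be met with a small encoding step. The sketch below is a plain-Python illustration, not part of the SimpleTransformers API: the helper names `build_label_mapping` and `encode` are hypothetical, and ids are assigned in order of first appearance, which is an assumption made for this example only.

```python
def build_label_mapping(labels):
    """Assign each distinct label an integer id, starting at 0, in order of first appearance."""
    mapping = {}
    for label in labels:
        if label not in mapping:
            mapping[label] = len(mapping)
    return mapping


def encode(labels, mapping):
    """Replace string labels by their integer ids."""
    return [mapping[label] for label in labels]


raw_labels = ["Sport", "Kultur", "Web", "Sport", "Web"]
label_to_id = build_label_mapping(raw_labels)
id_to_label = {idx: label for label, idx in label_to_id.items()}

encoded = encode(raw_labels, label_to_id)
decoded = [id_to_label[i] for i in encoded]

print(encoded)  # integer labels, usable as the `labels` column
print(decoded)  # decoded back to the original names
```

Keeping the inverse mapping around makes it easy to turn predicted ids back into readable category names.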

In [20]:
# compute weights for training (where least frequent class has weight 1.0)
train_weights_s = (train_df
                   .labels
                   .value_counts()
                   .pipe(lambda x: 1 / (x / x.min()))
)
train_weights_s

Panorama         0.321192
Web              0.321405
International    0.356618
Wirtschaft       0.381890
Sport            0.448659
Inland           0.531216
Etat             0.806988
Wissenschaft     0.939922
Kultur           1.000000
Name: labels, dtype: float64

In [21]:
train_weights_s.sum()

5.107889622672269
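
The weighting rule above divides the count of the rarest class by the count of each class, so the rarest class gets 1.0 and more frequent classes get proportionally less. A minimal sketch of the same arithmetic with made-up counts (the numbers below are illustrative, not the dataset's class counts):

```python
from collections import Counter

# Hypothetical label list; real counts would come from train_df.labels.
labels = ["Web"] * 4 + ["Sport"] * 2 + ["Kultur"] * 1

counts = Counter(labels)
min_count = min(counts.values())

# weight = min_count / count, equivalent to 1 / (count / min_count)
weights = {label: min_count / count for label, count in counts.items()}

for label, weight in sorted(weights.items(), key=lambda kv: kv[1]):
    print(f"{label:8s} {weight:.2f}")
```

Note that the `"weight"` entry in `train_args` is commented out in this notebook, so the weights are computed but not applied in the runs shown here.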

In [22]:
# map label to integers
mapping_s = pd.Series(train_weights_s.index)
mapping_s

0         Panorama
1              Web
2    International
3       Wirtschaft
4            Sport
5           Inland
6             Etat
7     Wissenschaft
8           Kultur
dtype: object

In [23]:
# replace labels with integers starting at 0
train_df.labels.replace(mapping_s.values, mapping_s.index, inplace=True)
test_df.labels.replace(mapping_s.values, mapping_s.index, inplace=True)
display(train_df.head())
display(test_df.head())

Unnamed: 0,labels,text
0,4,21-Jähriger fällt wohl bis Saisonende aus. Wie...
1,8,"Erfundene Bilder zu Filmen, die als verloren g..."
2,1,Der frischgekürte CEO Sundar Pichai setzt auf ...
3,3,"Putin: ""Einigung, dass wir Menge auf Niveau vo..."
4,5,Estland sieht den künftigen österreichischen P...


Unnamed: 0,labels,text
0,3,"Die Gewerkschaft GPA-djp lanciert den ""All-in-..."
1,4,Franzosen verteidigen 2:1-Führung – Kritische ...
2,1,Neues Video von Designern macht im Netz die Ru...
3,4,23-jähriger Brasilianer muss vier Spiele pausi...
4,2,Aufständische verwendeten Chemikalie bei Gefec...


## Evaluation Setup

Several German and multilingual language models are candidates for evaluation. Each entry in the list below pairs a SimpleTransformers model type with a Hugging Face model name.

In [24]:
model_list = [
              "bert", "bert-base-german-cased",
              "bert", "dbmdz/bert-base-german-cased",
              "bert", "dbmdz/bert-base-german-uncased",
              "bert", "deepset/gbert-base",
              "bert", "deepset/gbert-large",
              "bert", "bert-base-multilingual-cased",
              "distilbert", "distilbert-base-german-cased",
              "distilbert", "distilbert-base-multilingual-cased",
              "distilbert", "dbmdz/distilbert-base-german-europeana-cased",
              "electra", "deepset/gelectra-base",
              "electra", "deepset/gelectra-large",
              "electra", "german-nlp-group/electra-base-german-uncased",
]

In [30]:
model_df = pd.DataFrame(np.array(model_list).reshape(-1,2), columns=["type", "name"])

mdl = model_df.iloc[0]
print(f"using model: '{mdl['name']}'")

model_df

using model: 'bert-base-german-cased'


Unnamed: 0,type,name
0,bert,bert-base-german-cased
1,bert,dbmdz/bert-base-german-cased
2,bert,dbmdz/bert-base-german-uncased
3,bert,deepset/gbert-base
4,bert,deepset/gbert-large
5,bert,bert-base-multilingual-cased
6,distilbert,distilbert-base-german-cased
7,distilbert,distilbert-base-multilingual-cased
8,distilbert,dbmdz/distilbert-base-german-europeana-cased
9,electra,deepset/gelectra-base
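
`model_list` is a flat list that alternates model type and model name, and the `reshape(-1, 2)` call turns it into two columns. A dependency-free sketch of the same pairing, shown on a shortened list:

```python
# Flat list alternating (type, name), shortened for illustration.
flat = [
    "bert", "bert-base-german-cased",
    "distilbert", "distilbert-base-german-cased",
    "electra", "deepset/gelectra-base",
]

# Pair even-indexed entries (types) with odd-indexed entries (names).
pairs = list(zip(flat[0::2], flat[1::2]))

for model_type, model_name in pairs:
    print(f"{model_type:12s} {model_name}")
```

A list of explicit `(type, name)` tuples would avoid the reshape altogether; the flat layout is simply what this notebook uses.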


In [26]:
import wandb

# project name used for Weights & Biases logging
project_name = "german_news_article_classification2"

In [27]:
# define hyperparameters
train_args = {
    "reprocess_input_data": True,
    "fp16": False,
    "num_train_epochs": 4,
    # "weight": train_weights_s.values,
    "evaluate_during_training": True,
    "overwrite_output_dir": True,
    "wandb_project": project_name,
}

from sklearn.metrics import f1_score, accuracy_score, precision_score, recall_score

def f1_multiclass(labels, preds):
    return f1_score(labels, preds, average='macro')

def precision_multiclass(labels, preds):
    return precision_score(labels, preds, average='macro')

def recall_multiclass(labels, preds):
    return recall_score(labels, preds, average='macro')
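
With `average='macro'`, scikit-learn computes the score per class and then takes the unweighted mean, so small classes count as much as large ones. The following plain-Python sketch reproduces that calculation for F1 on a toy example. `macro_f1` is a hypothetical helper written for illustration, and a class with no true positives gets an F1 of 0 here.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for cls in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); guard against an empty denominator
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 1, 1, 1, 0]

score = macro_f1(y_true, y_pred)
print(round(score, 4))
```

In this toy example five of seven predictions are correct, yet the macro F1 is much lower because the single example of class 2 is missed entirely. That sensitivity to rare classes is the reason to report macro-averaged metrics next to accuracy on an imbalanced dataset.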

In [28]:
def init_classifier(model_type:str, model_name:str, num_labels, train_args) -> ClassificationModel:

    # need to create a tokenizer first and adjust train args with lower case setting
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    train_args = {**train_args, **{ "do_lower_case": tokenizer.do_lower_case }}

    # Create a ClassificationModel
    return ClassificationModel(model_type, model_name, tokenizer_name=model_name, num_labels=num_labels, args=train_args)

def train_model(model:ClassificationModel, train_df, eval_df):
    return model.train_model(train_df, eval_df=eval_df, verbose=False, f1=f1_multiclass, acc=accuracy_score, precision=precision_multiclass, recall=recall_multiclass)

def eval_model(model:ClassificationModel, eval_df):
    # evaluate on the DataFrame passed in as eval_df
    return model.eval_model(eval_df, wandb_log=False, f1=f1_multiclass, acc=accuracy_score, precision=precision_multiclass, recall=recall_multiclass)

def log_results(model_name, start, end, details):
    eval_df = pd.DataFrame(details)[-1:].reset_index(drop=True).round(4)
    result_df = pd.DataFrame({
        "start_time": [time.strftime('%Y-%m-%d %H:%M:%S %Z', time.gmtime(start))],
        "runtime": [int(end - start)],
        "model_name": [model_name],
        })
    output_df = pd.concat([result_df, eval_df], axis=1)

    eval_log = Path("/content/gdrive/My Drive/Colab Notebooks/nlp-classification/data") / "eval_log2.txt"

    if eval_log.exists():
        output_df.to_csv(eval_log, mode='a', header=False, index=False)
    else:
        output_df.to_csv(eval_log, index=False)
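
`log_results` appends one row per run and writes the header only when the log file does not exist yet, so repeated runs accumulate in a single CSV. A minimal sketch of that append pattern with the standard `csv` module, writing to a temporary directory instead of Google Drive (the helper name `append_row` and the sample values are illustrative, not results from this notebook):

```python
import csv
import tempfile
from pathlib import Path


def append_row(log_file: Path, row: dict) -> None:
    """Append one result row; write the header only for a new file."""
    write_header = not log_file.exists()
    with log_file.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(row))
        if write_header:
            writer.writeheader()
        writer.writerow(row)


with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "eval_log.csv"
    append_row(log_file, {"model_name": "model-a", "runtime": 10, "f1": 0.5})
    append_row(log_file, {"model_name": "model-a", "runtime": 12, "f1": 0.6})
    lines = log_file.read_text().splitlines()

print(lines)
```

The pattern assumes every row has the same columns in the same order; if the set of logged metrics changes between runs, the appended rows no longer line up with the header.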

In [31]:
# train the selected model num_runs times to capture run-to-run variation
num_runs = 10
for i in range(num_runs):

    model = init_classifier(mdl["type"], mdl["name"], len(mapping_s), train_args)
    
    start = time.time()
    steps, details = train_model(model, train_df, test_df)
    end = time.time()
    
    wandb.finish()
    log_results(mdl["name"], start, end, details)

Some weights of the model checkpoint at bert-base-german-cased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.LayerNorm.weight', 'cls.predictions.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoi

0,1
Training loss,0.00124
lr,0.0
global_step,4624.0
_runtime,691.0
_timestamp,1621156099.0
_step,97.0
mcc,0.87627
train_loss,0.00057
eval_loss,0.68136
f1,0.88735


0,1
Training loss,0.0482
lr,0.0
global_step,4624.0
_runtime,690.0
_timestamp,1621156816.0
_step,97.0
mcc,0.88299
train_loss,0.00051
eval_loss,0.63761
f1,0.8942


0,1
Training loss,0.48784
lr,0.0
global_step,4624.0
_runtime,693.0
_timestamp,1621157535.0
_step,97.0
mcc,0.86977
train_loss,0.00041
eval_loss,0.72382
f1,0.88538


0,1
Training loss,0.00045
lr,0.0
global_step,4624.0
_runtime,692.0
_timestamp,1621158252.0
_step,97.0
mcc,0.87414
train_loss,0.00087
eval_loss,0.65672
f1,0.88644


0,1
Training loss,0.00069
lr,0.0
global_step,4624.0
_runtime,716.0
_timestamp,1621158994.0
_step,97.0
mcc,0.88526
train_loss,0.00033
eval_loss,0.65417
f1,0.89692


0,1
Training loss,0.00212
lr,0.0
global_step,4624.0
_runtime,701.0
_timestamp,1621159721.0
_step,97.0
mcc,0.88409
train_loss,0.0007
eval_loss,0.63609
f1,0.89448


0,1
Training loss,0.0005
lr,0.0
global_step,4624.0
_runtime,696.0
_timestamp,1621160443.0
_step,97.0
mcc,0.88074
train_loss,0.00083
eval_loss,0.64539
f1,0.89477


| Metric | Final value |
|---|---|
| Training loss | 0.0071 |
| lr | 0.0 |
| global_step | 4624.0 |
| _runtime | 697.0 |
| _timestamp | 1621161166.0 |
| _step | 97.0 |
| mcc | 0.89305 |
| train_loss | 0.00054 |
| eval_loss | 0.58453 |
| f1 | 0.90527 |

| Metric | History |
|---|---|
| Training loss | █▃▂▃▁▂▃▂▁▃▄▄▁▁▁▁▂▁▁▁▂▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁ |
| lr | ▂▅▇███▇▇▇▇▇▆▆▆▆▆▅▅▅▅▅▄▄▄▄▄▄▃▃▃▃▃▂▂▂▂▂▁▁▁ |
| global_step | ▁▁▁▂▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▄▅▅▅▅▅▆▆▆▆▆▆▇▇▇▇▇▇███ |
| _runtime | ▁▁▁▁▂▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▆▇▇▇▇▇███ |
| _timestamp | ▁▁▁▁▂▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▆▇▇▇▇▇███ |
| _step | ▁▁▁▂▂▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇███ |
| mcc | ▁▄▆▇██ |
| train_loss | ▁▁▅█▁▁ |
| eval_loss | ▁▅▁▆██ |
| f1 | ▁▄▅▇██ |




| Metric | Final value |
|---|---|
| Training loss | 0.00081 |
| lr | 0.0 |
| global_step | 4624.0 |
| _runtime | 692.0 |
| _timestamp | 1621161885.0 |
| _step | 97.0 |
| mcc | 0.88968 |
| train_loss | 0.00057 |
| eval_loss | 0.6193 |
| f1 | 0.89666 |

| Metric | History |
|---|---|
| Training loss | █▃▃▃▂▃▂▇▁▂▂▂▁▃▂▁▄▅▁▁▁▁▁▁▂▁▃▁▁▁▁▁▁▁▁▁▄▁▁▁ |
| lr | ▂▅▇███▇▇▇▇▇▆▆▆▆▆▅▅▅▅▅▄▄▄▄▄▄▃▃▃▃▃▂▂▂▂▂▁▁▁ |
| global_step | ▁▁▁▂▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▄▅▅▅▅▅▆▆▆▆▆▆▇▇▇▇▇▇███ |
| _runtime | ▁▁▁▁▂▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▆▇▇▇▇▇███ |
| _timestamp | ▁▁▁▁▂▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▆▇▇▇▇▇███ |
| _step | ▁▁▁▂▂▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇███ |
| mcc | ▁▃▅▇▇█ |
| train_loss | ▂█▁▁▁▁ |
| eval_loss | ▁▆▂███ |
| f1 | ▁▃▆█▇█ |




| Metric | Final value |
|---|---|
| Training loss | 0.00051 |
| lr | 0.0 |
| global_step | 4624.0 |
| _runtime | 694.0 |
| _timestamp | 1621162606.0 |
| _step | 97.0 |
| mcc | 0.88644 |
| train_loss | 0.00052 |
| eval_loss | 0.62718 |
| f1 | 0.89704 |

| Metric | History |
|---|---|
| Training loss | █▅▅▁▃▃▃▁▂▂▃▁▂▁▂▂▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ |
| lr | ▂▅▇███▇▇▇▇▇▆▆▆▆▆▅▅▅▅▅▄▄▄▄▄▄▃▃▃▃▃▂▂▂▂▂▁▁▁ |
| global_step | ▁▁▁▂▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▄▅▅▅▅▅▆▆▆▆▆▆▇▇▇▇▇▇███ |
| _runtime | ▁▁▁▁▂▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▆▇▇▇▇▇███ |
| _timestamp | ▁▁▁▁▂▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▆▇▇▇▇▇███ |
| _step | ▁▁▁▂▂▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇███ |
| mcc | ▁▂▃▇██ |
| train_loss | █▁▁▁▁▁ |
| eval_loss | ▁▁▄▇▇█ |
| f1 | ▂▁▃▆██ |
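
Because the final `f1` and `mcc` values vary from run to run, it can help to aggregate them before comparing models. The following is a minimal sketch, not part of the original notebook: it assumes the three run summaries that follow a `bert-base-german-cased` initialization message in the output above are repeated runs of that same model, and it uses the sample standard deviation as a measure of spread. The helper name `aggregate` and the `runs` list are hypothetical and introduced only for illustration.

```python
from statistics import mean, stdev

# Final metrics copied by hand from the three run summaries above that follow
# a bert-base-german-cased initialization message (assumption: same model).
runs = [
    {"f1": 0.90527, "mcc": 0.89305},
    {"f1": 0.89666, "mcc": 0.88968},
    {"f1": 0.89704, "mcc": 0.88644},
]


def aggregate(runs, metric):
    """Return (mean, sample standard deviation) of one metric across runs.

    Hypothetical helper for illustration; needs at least two runs.
    """
    values = [run[metric] for run in runs]
    return mean(values), stdev(values)


for metric in ("f1", "mcc"):
    avg, sd = aggregate(runs, metric)
    print(f"{metric}: mean={avg:.4f} std={sd:.4f} (n={len(runs)})")
```

With only three runs the standard deviation is a rough indication of run-to-run variance, so small differences between models should be read with caution.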
