<a href="https://colab.research.google.com/github/Tempate/Prophet/blob/main/mbert.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Building a classifier with mBERT

We start by installing the Hugging Face `transformers` package.

In [1]:
!pip install transformers



## Preparing the data

We mount Google Drive and load the dataset, one entry per line.

In [2]:
from google.colab import drive
drive.mount('/content/drive')

with open('/content/drive/My Drive/dataset (1).csv') as file:
  dataset = file.read().splitlines()

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


We draw a random sample of entries, lowercase each one, keep only its runs of letters, and read the label from the entry's last character.

In [3]:
import regex as re
import random


SAMPLE_COUNT = 19

data = []

for entry in random.sample(dataset, SAMPLE_COUNT):
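    # Keep only runs of Unicode letters (\p{L} needs the third-party `regex` module)
    text = re.findall(r'\p{L}+', entry.lower())
    # The label is the last character of the line
    data.append((" ".join(text), int(entry[-1])))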

# Remove the dataset from memory
del dataset
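
To make the cleaning step concrete, here is a minimal sketch on a single made-up entry. It assumes the same line layout as above (text followed by a one-digit label as the last character). So that it runs without extra packages, it swaps `\p{L}+` for the standard-library pattern `[^\W\d_]+`, which is only an approximation of it. The sample line and the helper name `clean_entry` are hypothetical.

```python
import re

# Approximation of regex's \p{L}+ using only the standard library:
# word characters that are neither digits nor underscores.
LETTERS = re.compile(r"[^\W\d_]+")


def clean_entry(entry):
    """Return (letters-only lowercase text, integer label) for one line."""
    words = LETTERS.findall(entry.lower())
    label = int(entry[-1])  # assumes the label is the final character
    return " ".join(words), label


# Hypothetical dataset line: review text, a comma, then the label
sample = "¡Muy bueno, 10/10!,1"
text, label = clean_entry(sample)
print(text, label)
```

Punctuation and digits inside the text are dropped, while the trailing digit survives only as the label.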

## Setting up the model

The idea is to use the vector that mBERT produces for each review as the input to a small feed-forward classifier.

In [4]:
import torch.nn as nn


class Classifier(nn.Module):
    def __init__(self, input_size):
        super(Classifier, self).__init__()

        self.function1 = nn.Linear(input_size, 16)
        self.activation1 = nn.ReLU()

        self.function2 = nn.Linear(16, 16)
        self.activation2 = nn.ReLU()

        self.function3 = nn.Linear(16, 1)
        self.activation3 = nn.Sigmoid()

    def forward(self, vector):
        layer1 = self.activation1(self.function1(vector))
        layer2 = self.activation2(self.function2(layer1))
        layer3 = self.activation3(self.function3(layer2))

        # The batch size is 1, so we return the single row: a tensor of shape (1,)
        return (layer3[0]).float()
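
Because the last layer is a sigmoid, the classifier's output can be read as the probability of the positive class, which is what the binary cross-entropy loss used in the training section expects. The sketch below computes that loss by hand for a single prediction. The helper name `binary_cross_entropy` and the sample probabilities are illustrative assumptions, not values from the notebook.

```python
import math


def binary_cross_entropy(prob, target):
    """Loss for one prediction: -[t*log(p) + (1-t)*log(1-p)]."""
    return -(target * math.log(prob) + (1 - target) * math.log(1 - prob))


# A confident, correct prediction costs little...
print(round(binary_cross_entropy(0.9, 1), 4))
# ...an undecided one costs log(2)...
print(round(binary_cross_entropy(0.5, 1), 4))
# ...and a confident, wrong one is penalized heavily.
print(round(binary_cross_entropy(0.1, 1), 4))
```

With a single validation example, the loss is simply the negative log of the probability assigned to the correct class, so a loss near 0.69 means the model is undecided and lower values mean it leans the right way.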


We are also going to create a class for our dataset.

In [5]:
from torch.utils.data import Dataset

import torch


class ReviewDataset(Dataset):
  def __init__(self, review, target, tokenizer, max_length):
    self.review = review
    self.target = target
    self.tokenizer = tokenizer
    self.max_length = max_length

  def __len__(self):
    return len(self.review)

  def __getitem__(self, item):
    review = str(self.review[item])

    encoding = self.tokenizer.encode_plus(
      review,
      
      add_special_tokens=True,
      
      truncation=True,
      padding='max_length',
      max_length=self.max_length,
      
      return_token_type_ids=False,
      return_attention_mask=True,
      return_tensors='pt'
    )

    return {
      'text': review,
      'targets': torch.tensor(self.target[item], dtype=torch.long),

      'attention_mask': encoding['attention_mask'].flatten(),
      'input_ids': encoding['input_ids'].flatten()
    }
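
The `truncation`, `padding='max_length'` and `return_attention_mask` options together ensure that every example has exactly `max_length` token ids, plus a mask marking which positions hold real tokens. The toy function below mimics only that shape logic on made-up ids. It is not the tokenizer's implementation: `pad_or_truncate`, the ids, and the pad id of 0 are assumptions for illustration, and the sketch ignores the fact that the real tokenizer keeps the closing special token when it truncates.

```python
def pad_or_truncate(token_ids, max_length, pad_id=0):
    """Return (input_ids, attention_mask), both of length max_length."""
    kept = token_ids[:max_length]                  # truncation
    padding = [pad_id] * (max_length - len(kept))  # padding up to max_length
    input_ids = kept + padding
    attention_mask = [1] * len(kept) + [0] * len(padding)
    return input_ids, attention_mask


# Hypothetical ids for a short and a long review
short_ids, short_mask = pad_or_truncate([101, 7, 8, 102], max_length=6)
long_ids, long_mask = pad_or_truncate(list(range(1, 10)), max_length=6)

print(short_ids, short_mask)
print(long_ids, long_mask)
```

Fixed-length rows are what allow `DataLoader` to stack examples into a batch tensor, and the mask tells the model to ignore the padded positions.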


## Merging the classifier with mBERT

This class has two purposes: to vectorize the data by passing it through mBERT, and to wrap the data in a `DataLoader` built on our `ReviewDataset`.

In [6]:
from torch.utils.data import DataLoader

from transformers import BertTokenizer
from transformers import BertModel

import numpy as np
import torch


NUM_WORKERS = 2
MAX_LENGTH = 128
BATCH_SIZE = 1


class Vectorizer:
    def __init__(self, model_name):
        self.tokenizer = BertTokenizer.from_pretrained(model_name)
        self.model = BertModel.from_pretrained(model_name)

    def vectorize(self, data):
        self.vectors = {}

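        for entry in data:
            for batch in self.data_loader([entry]):

                with torch.no_grad():
                    # The review text doubles as the key of the cache
                    text = batch["text"][0]

                    # Index 1 of the output is the pooled [CLS] representation
                    self.vectors[text] = self.model(
                        input_ids=batch["input_ids"],
                        attention_mask=batch["attention_mask"]
                    )[1]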

        print("[+] Data vectorized correctly")

    def data_loader(self, data):
        texts, labels = zip(*data)

        entry = ReviewDataset(
            np.array(texts),
            np.array(labels),
            self.tokenizer,
            MAX_LENGTH
        )

        loader = DataLoader(
            entry, 
            batch_size=BATCH_SIZE, 
            num_workers=NUM_WORKERS
        )

        return loader


## Training the model

This class combines all the previous classes to train and validate our model.

In [7]:
from torch.utils.data import DataLoader
from sklearn.metrics import f1_score

import torch
import torch.nn as nn
import torch.optim as optim

import numpy as np


MODEL = 'bert-base-multilingual-uncased'

LEARNING_RATE = 5e-3


class Transformers:

    def __init__(self, data):
        self.vectorizer = Vectorizer(MODEL)
        self.vectorizer.vectorize(data)

        self.model = Classifier(self.vectorizer.model.config.hidden_size)

    def train(self, train_data, valid_data, epochs, patience):
        optimizer = optim.NAdam(self.model.parameters(), lr=LEARNING_RATE)
        loss_function = nn.BCELoss()

        losses = []

        # We pass several times over the training data.
        # Usually there are between 5 and 30 epochs.
        for epoch in range(epochs):
            for review in train_data:
              for batch in self.vectorizer.data_loader([review]):
                  
                  vector = self.vectorizer.vectors[batch["text"][0]]
                  target = batch["targets"].float()

                  # We clear the gradients before each instance
                  self.model.zero_grad()

                  # We run the forward pass; the sigmoid output is a probability
                  probs = self.model(vector)

                  # We compute the loss, gradients, and update the parameters
                  loss = loss_function(probs, target)
                  loss.backward()

                  optimizer.step()

            score, loss = self.validate(valid_data, loss_function)
            losses.append(loss)

            print(f"{epoch + 1}.\tLoss: {loss}\tF1-Score: {score}")

            # Early stopping: spend one unit of patience whenever the validation
            # loss fails to improve on the previous epoch
            if len(losses) >= 2 and losses[-1] >= losses[-2]:
                patience -= 1

            if patience <= 0:
                break

        return score, loss

    def validate(self, data, loss_function):
        targets = []
        guesses = []

        loss = 0

        batches = self.vectorizer.data_loader(data)
        
        for batch in batches:

            vector = self.vectorizer.vectors[batch["text"][0]]
            
            target = batch["targets"].float()
            targets.append(target)

            with torch.no_grad():
                # We run the forward pass
                output = self.model(vector)

                # We save our prediction
                guess = torch.round(torch.flatten(output))
                guesses.append(guess)
                
                # We calculate the loss
                loss += loss_function(output, target).item()

        score = f1_score(targets, guesses, zero_division=1)
        loss /= len(batches)

        return score, loss
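
The patience rule in `train` can be isolated into a few lines. Note that it compares each validation loss with the previous epoch's loss (not with the best loss so far) and never resets the counter, so training stops once `patience` non-improving epochs have accumulated in total. The helper `epochs_before_stop` is a hypothetical name, and the loss values are made up.

```python
def epochs_before_stop(validation_losses, patience):
    """Replay the stopping rule and return how many epochs would run."""
    losses = []
    for epoch, loss in enumerate(validation_losses, start=1):
        losses.append(loss)
        # Spend patience when the loss did not drop versus the previous epoch
        if len(losses) >= 2 and losses[-1] >= losses[-2]:
            patience -= 1
        if patience <= 0:
            return epoch
    return len(validation_losses)


made_up_losses = [0.9, 0.8, 0.85, 0.7, 0.75, 0.6, 0.65, 0.5]
print(epochs_before_stop(made_up_losses, patience=3))  # stops after epoch 7
```

In this example the loss is still trending downward when training stops, which is the trade-off of counting every small uptick against a shared budget.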


## Testing our model

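We evaluate with leave-one-out cross-validation: each entry in turn is held out for validation while the model is trained on all the others. This approach suits a tiny dataset; in our case we only have 19 entries.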
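
As a minimal sketch of the splitting scheme, the generator below yields every train/validation pair for a small list. The name `leave_one_out_splits` and the toy data are hypothetical; the cell that follows performs the same slicing inline.

```python
def leave_one_out_splits(data):
    """Yield (train, valid) pairs, holding out one entry at a time."""
    for i in range(len(data)):
        train = data[:i] + data[i + 1:]
        valid = [data[i]]
        yield train, valid


toy_data = [("good", 1), ("bad", 0), ("great", 1)]

for train, valid in leave_one_out_splits(toy_data):
    print(valid, train)
```

Each validation set holds a single example, so the per-fold F1 score can only be 0 or 1, and the average over folds is what carries information.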

In [None]:
NUMBER_OF_TIMES_TO_TEST = 5
EPOCHS = 100
PATIENCE = 10

def leave_one_out(data):
    total_score = 0

    for n in range(NUMBER_OF_TIMES_TO_TEST):
        score = 0

        for i in range(len(data)):
            train = data[:i] + data[i+1:]
            valid = [data[i]]

            model = Transformers(data)
            score += model.train(train, valid, EPOCHS, PATIENCE)[0]

        score /= len(data)
        total_score += score

        print("[%d]\t%.4f" % (n + 1, score))

    print("Average score: %.4f" % (total_score / NUMBER_OF_TIMES_TO_TEST))


leave_one_out(data)

Some weights of the model checkpoint at bert-base-multilingual-uncased were not used when initializing BertModel: ['cls.predictions.decoder.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


[+] Data vectorized correctly
1.	Loss: 0.6190440058708191	F1-Score: 1.0
2.	Loss: 0.6032919883728027	F1-Score: 1.0
3.	Loss: 0.5857675671577454	F1-Score: 1.0
4.	Loss: 0.5696262717247009	F1-Score: 1.0
5.	Loss: 0.5541063547134399	F1-Score: 1.0
6.	Loss: 0.542468786239624	F1-Score: 1.0
7.	Loss: 0.5332370400428772	F1-Score: 1.0
8.	Loss: 0.5258433222770691	F1-Score: 1.0
9.	Loss: 0.5199636816978455	F1-Score: 1.0
10.	Loss: 0.5150814652442932	F1-Score: 1.0
11.	Loss: 0.5110149383544922	F1-Score: 1.0
12.	Loss: 0.5076237916946411	F1-Score: 1.0
13.	Loss: 0.5047928690910339	F1-Score: 1.0
14.	Loss: 0.5024266839027405	F1-Score: 1.0
15.	Loss: 0.5004462599754333	F1-Score: 1.0
16.	Loss: 0.4987863004207611	F1-Score: 1.0
17.	Loss: 0.4973927140235901	F1-Score: 1.0
18.	Loss: 0.4962209463119507	F1-Score: 1.0
19.	Loss: 0.4952346682548523	F1-Score: 1.0
20.	Loss: 0.494403600692749	F1-Score: 1.0
21.	Loss: 0.49370276927948	F1-Score: 1.0
22.	Loss: 0.49311643838882446	F1-Score: 1.0
23.	Loss: 0.4926508963108063	F1-Scor

[+] Data vectorized correctly
1.	Loss: 0.6212506294250488	F1-Score: 1.0
2.	Loss: 0.5083303451538086	F1-Score: 1.0
3.	Loss: 0.48239952325820923	F1-Score: 1.0
4.	Loss: 0.48061835765838623	F1-Score: 1.0
5.	Loss: 0.481477826833725	F1-Score: 1.0
6.	Loss: 0.48208749294281006	F1-Score: 1.0
7.	Loss: 0.4824933707714081	F1-Score: 1.0
8.	Loss: 0.4827989637851715	F1-Score: 1.0
9.	Loss: 0.48301786184310913	F1-Score: 1.0
10.	Loss: 0.4831470549106598	F1-Score: 1.0
11.	Loss: 0.48273220658302307	F1-Score: 1.0
12.	Loss: 0.48249125480651855	F1-Score: 1.0
13.	Loss: 0.4823012351989746	F1-Score: 1.0
14.	Loss: 0.4820607602596283	F1-Score: 1.0
15.	Loss: 0.48172634840011597	F1-Score: 1.0
16.	Loss: 0.4817485511302948	F1-Score: 1.0
17.	Loss: 0.4814371168613434	F1-Score: 1.0
18.	Loss: 0.4802631735801697	F1-Score: 1.0
19.	Loss: 0.4793137013912201	F1-Score: 1.0
20.	Loss: 0.47931215167045593	F1-Score: 1.0
21.	Loss: 0.4780392646789551	F1-Score: 1.0
22.	Loss: 0.4768325090408325	F1-Score: 1.0
23.	Loss: 0.47566232085227

[+] Data vectorized correctly
1.	Loss: 1.059593677520752	F1-Score: 0.0
2.	Loss: 1.1606991291046143	F1-Score: 0.0
3.	Loss: 1.1247069835662842	F1-Score: 0.0
4.	Loss: 1.1170135736465454	F1-Score: 0.0
5.	Loss: 1.1134294271469116	F1-Score: 0.0
6.	Loss: 1.1099966764450073	F1-Score: 0.0
7.	Loss: 1.1082415580749512	F1-Score: 0.0
8.	Loss: 1.2058563232421875	F1-Score: 0.0
9.	Loss: 1.0986970663070679	F1-Score: 0.0
10.	Loss: 1.0987474918365479	F1-Score: 0.0
11.	Loss: 1.1028931140899658	F1-Score: 0.0
12.	Loss: 1.1029994487762451	F1-Score: 0.0
13.	Loss: 1.1090290546417236	F1-Score: 0.0
14.	Loss: 1.1011141538619995	F1-Score: 0.0
15.	Loss: 1.1078579425811768	F1-Score: 0.0
16.	Loss: 1.1076323986053467	F1-Score: 0.0
17.	Loss: 1.1014184951782227	F1-Score: 0.0
18.	Loss: 1.100975751876831	F1-Score: 0.0
19.	Loss: 1.1091190576553345	F1-Score: 0.0
20.	Loss: 1.10247802734375	F1-Score: 0.0
21.	Loss: 1.110443353652954	F1-Score: 0.0
22.	Loss: 1.1034506559371948	F1-Score: 0.0
23.	Loss: 1.1107606887817383	F1-Score:

[+] Data vectorized correctly
1.	Loss: 1.119490146636963	F1-Score: 0.0
2.	Loss: 1.0795447826385498	F1-Score: 0.0
3.	Loss: 1.0630652904510498	F1-Score: 0.0
4.	Loss: 1.0631237030029297	F1-Score: 0.0
5.	Loss: 1.063907504081726	F1-Score: 0.0
6.	Loss: 1.0643653869628906	F1-Score: 0.0
7.	Loss: 1.0656185150146484	F1-Score: 0.0
8.	Loss: 1.067048192024231	F1-Score: 0.0
9.	Loss: 1.067569375038147	F1-Score: 0.0
10.	Loss: 1.0696942806243896	F1-Score: 0.0
11.	Loss: 1.0706816911697388	F1-Score: 0.0
12.	Loss: 1.0804648399353027	F1-Score: 0.0
13.	Loss: 1.0861347913742065	F1-Score: 0.0


[+] Data vectorized correctly
1.	Loss: 1.1852777004241943	F1-Score: 0.0
2.	Loss: 1.234259843826294	F1-Score: 0.0
3.	Loss: 1.2202656269073486	F1-Score: 0.0
4.	Loss: 1.213104009628296	F1-Score: 0.0
5.	Loss: 1.209858775138855	F1-Score: 0.0
6.	Loss: 1.2074637413024902	F1-Score: 0.0
7.	Loss: 1.2053370475769043	F1-Score: 0.0
8.	Loss: 1.2035056352615356	F1-Score: 0.0
9.	Loss: 1.201987624168396	F1-Score: 0.0
10.	Loss: 1.20090651512146	F1-Score: 0.0
11.	Loss: 1.200021743774414	F1-Score: 0.0
12.	Loss: 1.1992323398590088	F1-Score: 0.0
13.	Loss: 1.1985770463943481	F1-Score: 0.0
14.	Loss: 1.1983046531677246	F1-Score: 0.0
15.	Loss: 1.1977170705795288	F1-Score: 0.0
16.	Loss: 1.1974787712097168	F1-Score: 0.0
17.	Loss: 1.1973206996917725	F1-Score: 0.0
18.	Loss: 1.1972334384918213	F1-Score: 0.0
19.	Loss: 1.1972534656524658	F1-Score: 0.0
20.	Loss: 1.1974053382873535	F1-Score: 0.0
21.	Loss: 1.1975464820861816	F1-Score: 0.0
22.	Loss: 1.1978708505630493	F1-Score: 0.0
23.	Loss: 1.1981847286224365	F1-Score: 0

[+] Data vectorized correctly
1.	Loss: 0.7806882858276367	F1-Score: 0.0
2.	Loss: 0.7368279099464417	F1-Score: 0.0
3.	Loss: 0.699065625667572	F1-Score: 0.0
4.	Loss: 0.6652102470397949	F1-Score: 1.0
5.	Loss: 0.6355254054069519	F1-Score: 1.0
6.	Loss: 0.6100040674209595	F1-Score: 1.0
7.	Loss: 0.5884284973144531	F1-Score: 1.0
8.	Loss: 0.5704309940338135	F1-Score: 1.0
9.	Loss: 0.5555670857429504	F1-Score: 1.0
10.	Loss: 0.5433767437934875	F1-Score: 1.0
11.	Loss: 0.5334252119064331	F1-Score: 1.0
12.	Loss: 0.5253244042396545	F1-Score: 1.0
13.	Loss: 0.5187841653823853	F1-Score: 1.0
14.	Loss: 0.5134801268577576	F1-Score: 1.0
15.	Loss: 0.5091602206230164	F1-Score: 1.0
16.	Loss: 0.505639910697937	F1-Score: 1.0
17.	Loss: 0.5027687549591064	F1-Score: 1.0
18.	Loss: 0.5004248023033142	F1-Score: 1.0
19.	Loss: 0.4985092282295227	F1-Score: 1.0
20.	Loss: 0.4969422221183777	F1-Score: 1.0
21.	Loss: 0.49565935134887695	F1-Score: 1.0
22.	Loss: 0.4946080446243286	F1-Score: 1.0
23.	Loss: 0.49374592304229736	F1-S

[+] Data vectorized correctly
1.	Loss: 0.7195601463317871	F1-Score: 0.0
2.	Loss: 0.9848086833953857	F1-Score: 0.0
3.	Loss: 1.0822395086288452	F1-Score: 0.0
4.	Loss: 1.0934683084487915	F1-Score: 0.0
5.	Loss: 1.0914844274520874	F1-Score: 0.0
6.	Loss: 1.0901715755462646	F1-Score: 0.0
7.	Loss: 1.0898025035858154	F1-Score: 0.0
8.	Loss: 1.0897811651229858	F1-Score: 0.0
9.	Loss: 1.0898922681808472	F1-Score: 0.0
10.	Loss: 1.0900721549987793	F1-Score: 0.0
11.	Loss: 1.090285301208496	F1-Score: 0.0
12.	Loss: 1.0905060768127441	F1-Score: 0.0
13.	Loss: 1.0907232761383057	F1-Score: 0.0
14.	Loss: 1.0909380912780762	F1-Score: 0.0
15.	Loss: 1.0911571979522705	F1-Score: 0.0


[+] Data vectorized correctly
1.	Loss: 0.6078998446464539	F1-Score: 1.0
2.	Loss: 0.5848588347434998	F1-Score: 1.0
3.	Loss: 0.5656788349151611	F1-Score: 1.0
4.	Loss: 0.5501943230628967	F1-Score: 1.0
5.	Loss: 0.5378090143203735	F1-Score: 1.0
6.	Loss: 0.5279470086097717	F1-Score: 1.0
7.	Loss: 0.5201225876808167	F1-Score: 1.0
8.	Loss: 0.5139334201812744	F1-Score: 1.0
9.	Loss: 0.5090491771697998	F1-Score: 1.0
10.	Loss: 0.5052000284194946	F1-Score: 1.0
11.	Loss: 0.5021693110466003	F1-Score: 1.0
12.	Loss: 0.49978283047676086	F1-Score: 1.0
13.	Loss: 0.49790331721305847	F1-Score: 1.0
14.	Loss: 0.49642205238342285	F1-Score: 1.0
15.	Loss: 0.4952540397644043	F1-Score: 1.0
16.	Loss: 0.4943326413631439	F1-Score: 1.0
17.	Loss: 0.49360552430152893	F1-Score: 1.0
18.	Loss: 0.4930320978164673	F1-Score: 1.0
19.	Loss: 0.492580384016037	F1-Score: 1.0
20.	Loss: 0.49222517013549805	F1-Score: 1.0
21.	Loss: 0.49194711446762085	F1-Score: 1.0
22.	Loss: 0.49173033237457275	F1-Score: 1.0
23.	Loss: 0.491562902927398

[+] Data vectorized correctly
1.	Loss: 0.7733896374702454	F1-Score: 0.0
2.	Loss: 0.7476636171340942	F1-Score: 0.0
3.	Loss: 0.7197152972221375	F1-Score: 0.0
4.	Loss: 0.6959806084632874	F1-Score: 0.0
5.	Loss: 0.6738236546516418	F1-Score: 1.0
6.	Loss: 0.6527543663978577	F1-Score: 1.0
7.	Loss: 0.6328756213188171	F1-Score: 1.0
8.	Loss: 0.6143747568130493	F1-Score: 1.0
9.	Loss: 0.5975285172462463	F1-Score: 1.0
10.	Loss: 0.5824818015098572	F1-Score: 1.0
11.	Loss: 0.5691022276878357	F1-Score: 1.0
12.	Loss: 0.557378351688385	F1-Score: 1.0
13.	Loss: 0.5472307801246643	F1-Score: 1.0
14.	Loss: 0.5385333895683289	F1-Score: 1.0
15.	Loss: 0.531135082244873	F1-Score: 1.0
16.	Loss: 0.5248762965202332	F1-Score: 1.0
17.	Loss: 0.5196020007133484	F1-Score: 1.0
18.	Loss: 0.5151686668395996	F1-Score: 1.0
19.	Loss: 0.5114479660987854	F1-Score: 1.0
20.	Loss: 0.5083276033401489	F1-Score: 1.0
21.	Loss: 0.5057113170623779	F1-Score: 1.0
22.	Loss: 0.5035169124603271	F1-Score: 1.0
23.	Loss: 0.501675546169281	F1-Scor

[+] Data vectorized correctly
1.	Loss: 1.2808622121810913	F1-Score: 0.0
2.	Loss: 1.2304704189300537	F1-Score: 0.0
3.	Loss: 1.1923195123672485	F1-Score: 0.0
4.	Loss: 1.1811188459396362	F1-Score: 0.0
5.	Loss: 1.1813764572143555	F1-Score: 0.0
6.	Loss: 1.1701995134353638	F1-Score: 0.0
7.	Loss: 1.1602649688720703	F1-Score: 0.0
8.	Loss: 1.152870535850525	F1-Score: 0.0
9.	Loss: 1.1476305723190308	F1-Score: 0.0
10.	Loss: 1.142870306968689	F1-Score: 0.0
11.	Loss: 1.138620138168335	F1-Score: 0.0
12.	Loss: 1.1404668092727661	F1-Score: 0.0
13.	Loss: 1.1372631788253784	F1-Score: 0.0
14.	Loss: 1.1307717561721802	F1-Score: 0.0
15.	Loss: 1.1275876760482788	F1-Score: 0.0
16.	Loss: 1.1255704164505005	F1-Score: 0.0
17.	Loss: 1.1237351894378662	F1-Score: 0.0
18.	Loss: 1.1219666004180908	F1-Score: 0.0
19.	Loss: 1.1203020811080933	F1-Score: 0.0
20.	Loss: 1.1187479496002197	F1-Score: 0.0
21.	Loss: 1.1189357042312622	F1-Score: 0.0
22.	Loss: 1.116318702697754	F1-Score: 0.0
23.	Loss: 1.11464262008667	F1-Score: 

[+] Data vectorized correctly
1.	Loss: 0.9867160320281982	F1-Score: 0.0
2.	Loss: 1.0502163171768188	F1-Score: 0.0
3.	Loss: 1.0807082653045654	F1-Score: 0.0
4.	Loss: 1.089363694190979	F1-Score: 0.0
5.	Loss: 1.092118263244629	F1-Score: 0.0
6.	Loss: 1.0909878015518188	F1-Score: 0.0
7.	Loss: 1.0898410081863403	F1-Score: 0.0
8.	Loss: 1.0889955759048462	F1-Score: 0.0
9.	Loss: 1.0882867574691772	F1-Score: 0.0
10.	Loss: 1.087626338005066	F1-Score: 0.0
11.	Loss: 1.0869929790496826	F1-Score: 0.0
12.	Loss: 1.0863877534866333	F1-Score: 0.0
13.	Loss: 1.0858138799667358	F1-Score: 0.0
14.	Loss: 1.0852733850479126	F1-Score: 0.0
15.	Loss: 1.0847665071487427	F1-Score: 0.0
16.	Loss: 1.0842927694320679	F1-Score: 0.0
17.	Loss: 1.0838508605957031	F1-Score: 0.0
18.	Loss: 1.0834389925003052	F1-Score: 0.0
19.	Loss: 1.0830557346343994	F1-Score: 0.0
20.	Loss: 1.0826995372772217	F1-Score: 0.0
21.	Loss: 1.0823687314987183	F1-Score: 0.0
22.	Loss: 1.082061767578125	F1-Score: 0.0
23.	Loss: 1.0817774534225464	F1-Score

[+] Data vectorized correctly
1.	Loss: 0.6329155564308167	F1-Score: 1.0
2.	Loss: 0.6438727378845215	F1-Score: 1.0
3.	Loss: 0.6188392639160156	F1-Score: 1.0
4.	Loss: 0.6109501123428345	F1-Score: 1.0
5.	Loss: 0.594868004322052	F1-Score: 1.0
6.	Loss: 0.5807433128356934	F1-Score: 1.0
7.	Loss: 0.5683759450912476	F1-Score: 1.0
8.	Loss: 0.5575895309448242	F1-Score: 1.0
9.	Loss: 0.5482246279716492	F1-Score: 1.0
10.	Loss: 0.5401304364204407	F1-Score: 1.0
11.	Loss: 0.5331645011901855	F1-Score: 1.0
12.	Loss: 0.5271919369697571	F1-Score: 1.0
13.	Loss: 0.5220872759819031	F1-Score: 1.0
14.	Loss: 0.5177361369132996	F1-Score: 1.0
15.	Loss: 0.5140346884727478	F1-Score: 1.0
16.	Loss: 0.5108909606933594	F1-Score: 1.0
17.	Loss: 0.5082233548164368	F1-Score: 1.0
18.	Loss: 0.5059611797332764	F1-Score: 1.0
19.	Loss: 0.5040428638458252	F1-Score: 1.0
20.	Loss: 0.5024157762527466	F1-Score: 1.0
21.	Loss: 0.5010347366333008	F1-Score: 1.0
22.	Loss: 0.49986162781715393	F1-Score: 1.0
23.	Loss: 0.4988640248775482	F1-S

[+] Data vectorized correctly
1.	Loss: 0.6926242113113403	F1-Score: 1.0
2.	Loss: 0.5805683135986328	F1-Score: 1.0
3.	Loss: 0.5386560559272766	F1-Score: 1.0
4.	Loss: 0.537747859954834	F1-Score: 1.0
5.	Loss: 0.5999608039855957	F1-Score: 1.0
6.	Loss: 0.5484752058982849	F1-Score: 1.0
7.	Loss: 0.5273371338844299	F1-Score: 1.0
8.	Loss: 0.5211060047149658	F1-Score: 1.0
9.	Loss: 0.5190329551696777	F1-Score: 1.0
10.	Loss: 0.5178297758102417	F1-Score: 1.0
11.	Loss: 0.516840934753418	F1-Score: 1.0
12.	Loss: 0.5159715414047241	F1-Score: 1.0
13.	Loss: 0.5151962041854858	F1-Score: 1.0
14.	Loss: 0.5144920349121094	F1-Score: 1.0
15.	Loss: 0.5138385891914368	F1-Score: 1.0
16.	Loss: 0.513220489025116	F1-Score: 1.0
17.	Loss: 0.5126268863677979	F1-Score: 1.0
18.	Loss: 0.5120500326156616	F1-Score: 1.0
19.	Loss: 0.5114844441413879	F1-Score: 1.0
20.	Loss: 0.5109255313873291	F1-Score: 1.0
21.	Loss: 0.510370135307312	F1-Score: 1.0
22.	Loss: 0.5098150968551636	F1-Score: 1.0
23.	Loss: 0.5094919204711914	F1-Score

[+] Data vectorized correctly
1.	Loss: 0.5384238958358765	F1-Score: 1.0
2.	Loss: 0.5320359468460083	F1-Score: 1.0
3.	Loss: 0.5263504981994629	F1-Score: 1.0
4.	Loss: 0.5233837366104126	F1-Score: 1.0
5.	Loss: 0.5226306319236755	F1-Score: 1.0
6.	Loss: 0.522712767124176	F1-Score: 1.0
7.	Loss: 0.5230034589767456	F1-Score: 1.0
8.	Loss: 0.5232959985733032	F1-Score: 1.0
9.	Loss: 0.5235204696655273	F1-Score: 1.0
10.	Loss: 0.5236433148384094	F1-Score: 1.0
11.	Loss: 0.5219243168830872	F1-Score: 1.0
12.	Loss: 0.5205708742141724	F1-Score: 1.0
13.	Loss: 0.5197998881340027	F1-Score: 1.0
14.	Loss: 0.5191365480422974	F1-Score: 1.0
15.	Loss: 0.5140162110328674	F1-Score: 1.0
16.	Loss: 0.5159494876861572	F1-Score: 1.0
17.	Loss: 0.5147928595542908	F1-Score: 1.0
18.	Loss: 0.5129404067993164	F1-Score: 1.0
19.	Loss: 0.5124520063400269	F1-Score: 1.0
20.	Loss: 0.5098947882652283	F1-Score: 1.0
21.	Loss: 0.5067028999328613	F1-Score: 1.0
22.	Loss: 0.509020984172821	F1-Score: 1.0
23.	Loss: 0.5064892172813416	F1-Sco

[+] Data vectorized correctly
1.	Loss: 0.536858856678009	F1-Score: 1.0
2.	Loss: 0.3583829700946808	F1-Score: 1.0
3.	Loss: 0.5113580226898193	F1-Score: 1.0
4.	Loss: 0.5353361368179321	F1-Score: 1.0
5.	Loss: 0.5377224683761597	F1-Score: 1.0
6.	Loss: 0.537451446056366	F1-Score: 1.0
7.	Loss: 0.5373520851135254	F1-Score: 1.0
8.	Loss: 0.5373882055282593	F1-Score: 1.0
9.	Loss: 0.5373116731643677	F1-Score: 1.0
10.	Loss: 0.5369946956634521	F1-Score: 1.0
11.	Loss: 0.5363818407058716	F1-Score: 1.0
12.	Loss: 0.535440981388092	F1-Score: 1.0
13.	Loss: 0.5336708426475525	F1-Score: 1.0
14.	Loss: 0.5290609002113342	F1-Score: 1.0
15.	Loss: 0.5231992602348328	F1-Score: 1.0
16.	Loss: 0.5063266158103943	F1-Score: 1.0
17.	Loss: 0.514647364616394	F1-Score: 1.0
18.	Loss: 0.5182891488075256	F1-Score: 1.0
19.	Loss: 0.5140293836593628	F1-Score: 1.0
20.	Loss: 0.5157413482666016	F1-Score: 1.0
21.	Loss: 0.48552918434143066	F1-Score: 1.0
22.	Loss: 0.48046061396598816	F1-Score: 1.0
23.	Loss: 0.48398450016975403	F1-Sc

[+] Data vectorized correctly
1.	Loss: 0.6757416725158691	F1-Score: 1.0
2.	Loss: 0.6587401628494263	F1-Score: 1.0
3.	Loss: 0.6424775719642639	F1-Score: 1.0
4.	Loss: 0.6275922060012817	F1-Score: 1.0
5.	Loss: 0.6139497756958008	F1-Score: 1.0
6.	Loss: 0.6013804078102112	F1-Score: 1.0
7.	Loss: 0.5897753834724426	F1-Score: 1.0
8.	Loss: 0.5790761709213257	F1-Score: 1.0
9.	Loss: 0.5692753791809082	F1-Score: 1.0
10.	Loss: 0.560649037361145	F1-Score: 1.0
11.	Loss: 0.5527797341346741	F1-Score: 1.0
12.	Loss: 0.5456520318984985	F1-Score: 1.0
13.	Loss: 0.5392430424690247	F1-Score: 1.0
14.	Loss: 0.5335187315940857	F1-Score: 1.0
15.	Loss: 0.5284363627433777	F1-Score: 1.0
16.	Loss: 0.5239466428756714	F1-Score: 1.0
17.	Loss: 0.5199976563453674	F1-Score: 1.0
18.	Loss: 0.5165367722511292	F1-Score: 1.0
19.	Loss: 0.5135123133659363	F1-Score: 1.0
20.	Loss: 0.5108516216278076	F1-Score: 1.0
21.	Loss: 0.5086008310317993	F1-Score: 1.0
22.	Loss: 0.5066385865211487	F1-Score: 1.0
23.	Loss: 0.5049295425415039	F1-Sc

[+] Data vectorized correctly
1.	Loss: 0.8104305863380432	F1-Score: 0.0
2.	Loss: 0.8180322647094727	F1-Score: 0.0
3.	Loss: 0.8504951596260071	F1-Score: 0.0
4.	Loss: 0.8808003067970276	F1-Score: 0.0
5.	Loss: 0.9095478057861328	F1-Score: 0.0
6.	Loss: 0.9364197254180908	F1-Score: 0.0
7.	Loss: 0.9610282182693481	F1-Score: 0.0
8.	Loss: 0.9830806851387024	F1-Score: 0.0
9.	Loss: 1.0024499893188477	F1-Score: 0.0
10.	Loss: 1.0191725492477417	F1-Score: 0.0
11.	Loss: 1.0334099531173706	F1-Score: 0.0


[+] Data vectorized correctly
1.	Loss: 0.6763008832931519	F1-Score: 1.0
2.	Loss: 0.6614552140235901	F1-Score: 1.0
3.	Loss: 0.6454972624778748	F1-Score: 1.0
4.	Loss: 0.629649817943573	F1-Score: 1.0
5.	Loss: 0.6143364310264587	F1-Score: 1.0
6.	Loss: 0.5998371243476868	F1-Score: 1.0
7.	Loss: 0.5863649845123291	F1-Score: 1.0
8.	Loss: 0.5740622878074646	F1-Score: 1.0
9.	Loss: 0.5629979968070984	F1-Score: 1.0
10.	Loss: 0.5531771183013916	F1-Score: 1.0
11.	Loss: 0.5445534586906433	F1-Score: 1.0
12.	Loss: 0.5370469093322754	F1-Score: 1.0
13.	Loss: 0.5305576920509338	F1-Score: 1.0
14.	Loss: 0.524977445602417	F1-Score: 1.0
15.	Loss: 0.5201981067657471	F1-Score: 1.0
16.	Loss: 0.5161166787147522	F1-Score: 1.0
17.	Loss: 0.5126383900642395	F1-Score: 1.0
18.	Loss: 0.5096782445907593	F1-Score: 1.0
19.	Loss: 0.5071610808372498	F1-Score: 1.0
20.	Loss: 0.5050216913223267	F1-Score: 1.0
21.	Loss: 0.503203272819519	F1-Score: 1.0
22.	Loss: 0.5016576051712036	F1-Score: 1.0
23.	Loss: 0.5003432035446167	F1-Scor

[+] Data vectorized correctly
1.	Loss: 0.6213222742080688	F1-Score: 1.0
2.	Loss: 0.6008532643318176	F1-Score: 1.0
3.	Loss: 0.5739682912826538	F1-Score: 1.0
4.	Loss: 0.5590644478797913	F1-Score: 1.0
5.	Loss: 0.5493889451026917	F1-Score: 1.0
6.	Loss: 0.5437943935394287	F1-Score: 1.0
7.	Loss: 0.5399001240730286	F1-Score: 1.0
8.	Loss: 0.5367963910102844	F1-Score: 1.0
9.	Loss: 0.5341479182243347	F1-Score: 1.0
10.	Loss: 0.5320068001747131	F1-Score: 1.0
11.	Loss: 0.5298838019371033	F1-Score: 1.0
12.	Loss: 0.5279141664505005	F1-Score: 1.0
13.	Loss: 0.5261229872703552	F1-Score: 1.0
14.	Loss: 0.5244774222373962	F1-Score: 1.0
15.	Loss: 0.5229482650756836	F1-Score: 1.0
16.	Loss: 0.5215147137641907	F1-Score: 1.0
17.	Loss: 0.5201054215431213	F1-Score: 1.0
18.	Loss: 0.5187103152275085	F1-Score: 1.0
19.	Loss: 0.5176761150360107	F1-Score: 1.0
20.	Loss: 0.5165876150131226	F1-Score: 1.0
21.	Loss: 0.5152949690818787	F1-Score: 1.0
22.	Loss: 0.5143185257911682	F1-Score: 1.0
23.	Loss: 0.5132827162742615	F1-S

[+] Data vectorized correctly
1.	Loss: 0.48258742690086365	F1-Score: 1.0
2.	Loss: 0.48115238547325134	F1-Score: 1.0
3.	Loss: 0.49554702639579773	F1-Score: 1.0
4.	Loss: 0.5009101033210754	F1-Score: 1.0
5.	Loss: 0.5021902918815613	F1-Score: 1.0
6.	Loss: 0.5023582577705383	F1-Score: 1.0
7.	Loss: 0.5022519826889038	F1-Score: 1.0
8.	Loss: 0.5020281076431274	F1-Score: 1.0
9.	Loss: 0.5016946196556091	F1-Score: 1.0
10.	Loss: 0.5012602210044861	F1-Score: 1.0
11.	Loss: 0.5006856918334961	F1-Score: 1.0
12.	Loss: 0.500078022480011	F1-Score: 1.0
13.	Loss: 0.4994048476219177	F1-Score: 1.0
14.	Loss: 0.49888426065444946	F1-Score: 1.0
15.	Loss: 0.4981716573238373	F1-Score: 1.0
16.	Loss: 0.4973415434360504	F1-Score: 1.0
17.	Loss: 0.496428906917572	F1-Score: 1.0
18.	Loss: 0.49544546008110046	F1-Score: 1.0
19.	Loss: 0.4943934381008148	F1-Score: 1.0
20.	Loss: 0.4932718276977539	F1-Score: 1.0
21.	Loss: 0.4813823103904724	F1-Score: 1.0
22.	Loss: 0.48774293065071106	F1-Score: 1.0
23.	Loss: 0.48953548073768616

[+] Data vectorized correctly
1.	Loss: 0.7161861062049866	F1-Score: 0.0
2.	Loss: 0.6240310668945312	F1-Score: 1.0
3.	Loss: 0.5573776960372925	F1-Score: 1.0
4.	Loss: 0.5243604779243469	F1-Score: 1.0
5.	Loss: 0.513725996017456	F1-Score: 1.0
6.	Loss: 0.5086737871170044	F1-Score: 1.0
7.	Loss: 0.5055197477340698	F1-Score: 1.0
8.	Loss: 0.5033363699913025	F1-Score: 1.0
9.	Loss: 0.5017808675765991	F1-Score: 1.0
10.	Loss: 0.5006521344184875	F1-Score: 1.0
11.	Loss: 0.4996947944164276	F1-Score: 1.0
12.	Loss: 0.4988867938518524	F1-Score: 1.0
13.	Loss: 0.4981301724910736	F1-Score: 1.0
14.	Loss: 0.4973703622817993	F1-Score: 1.0
15.	Loss: 0.49663376808166504	F1-Score: 1.0
16.	Loss: 0.4958978295326233	F1-Score: 1.0
17.	Loss: 0.49514734745025635	F1-Score: 1.0
18.	Loss: 0.494382381439209	F1-Score: 1.0
19.	Loss: 0.49359947443008423	F1-Score: 1.0
20.	Loss: 0.49279069900512695	F1-Score: 1.0
21.	Loss: 0.49194398522377014	F1-Score: 1.0
22.	Loss: 0.4911116361618042	F1-Score: 1.0
23.	Loss: 0.49019327759742737	

[+] Data vectorized correctly
1.	Loss: 1.1557189226150513	F1-Score: 0.0
2.	Loss: 1.1061012744903564	F1-Score: 0.0
3.	Loss: 1.0959160327911377	F1-Score: 0.0
4.	Loss: 1.0935441255569458	F1-Score: 0.0
5.	Loss: 1.0931360721588135	F1-Score: 0.0
6.	Loss: 1.0927983522415161	F1-Score: 0.0
7.	Loss: 1.0922973155975342	F1-Score: 0.0
8.	Loss: 1.091740369796753	F1-Score: 0.0
9.	Loss: 1.091235876083374	F1-Score: 0.0
10.	Loss: 1.0908429622650146	F1-Score: 0.0
11.	Loss: 1.0905792713165283	F1-Score: 0.0
12.	Loss: 1.0904399156570435	F1-Score: 0.0
13.	Loss: 1.0904111862182617	F1-Score: 0.0
14.	Loss: 1.090658187866211	F1-Score: 0.0
15.	Loss: 1.090651273727417	F1-Score: 0.0
16.	Loss: 1.0907680988311768	F1-Score: 0.0
17.	Loss: 1.0910921096801758	F1-Score: 0.0
18.	Loss: 1.0915478467941284	F1-Score: 0.0
19.	Loss: 1.0920542478561401	F1-Score: 0.0
20.	Loss: 1.0925920009613037	F1-Score: 0.0
21.	Loss: 1.0943083763122559	F1-Score: 0.0
22.	Loss: 1.0941569805145264	F1-Score: 0.0
23.	Loss: 1.0944794416427612	F1-Score

[+] Data vectorized correctly
1.	Loss: 1.1268491744995117	F1-Score: 0.0
2.	Loss: 1.0697057247161865	F1-Score: 0.0
3.	Loss: 1.0632083415985107	F1-Score: 0.0
4.	Loss: 1.0609697103500366	F1-Score: 0.0
5.	Loss: 1.0599595308303833	F1-Score: 0.0
6.	Loss: 1.0596683025360107	F1-Score: 0.0
7.	Loss: 1.0603586435317993	F1-Score: 0.0
8.	Loss: 1.0617074966430664	F1-Score: 0.0
9.	Loss: 1.0634552240371704	F1-Score: 0.0
10.	Loss: 1.065455436706543	F1-Score: 0.0
11.	Loss: 1.0676238536834717	F1-Score: 0.0
12.	Loss: 1.071238398551941	F1-Score: 0.0
13.	Loss: 1.0734748840332031	F1-Score: 0.0
14.	Loss: 1.0756858587265015	F1-Score: 0.0
15.	Loss: 1.0768437385559082	F1-Score: 0.0
16.	Loss: 1.0819377899169922	F1-Score: 0.0


[+] Data vectorized correctly
1.	Loss: 0.9993035197257996	F1-Score: 0.0
2.	Loss: 1.0209808349609375	F1-Score: 0.0
3.	Loss: 1.0301581621170044	F1-Score: 0.0
4.	Loss: 1.0352526903152466	F1-Score: 0.0
5.	Loss: 1.0377672910690308	F1-Score: 0.0
6.	Loss: 1.0389553308486938	F1-Score: 0.0
7.	Loss: 1.0395450592041016	F1-Score: 0.0
8.	Loss: 1.039900541305542	F1-Score: 0.0
9.	Loss: 1.0402034521102905	F1-Score: 0.0
10.	Loss: 1.0405515432357788	F1-Score: 0.0
11.	Loss: 1.0410047769546509	F1-Score: 0.0


[+] Data vectorized correctly
1.	Loss: 0.6101931929588318	F1-Score: 1.0
2.	Loss: 0.5656620264053345	F1-Score: 1.0
3.	Loss: 0.5535145401954651	F1-Score: 1.0
4.	Loss: 0.5472775101661682	F1-Score: 1.0
5.	Loss: 0.5441333651542664	F1-Score: 1.0
6.	Loss: 0.5407675504684448	F1-Score: 1.0
7.	Loss: 0.5388482809066772	F1-Score: 1.0
8.	Loss: 0.5373666882514954	F1-Score: 1.0
9.	Loss: 0.5358462929725647	F1-Score: 1.0
10.	Loss: 0.5342851281166077	F1-Score: 1.0
11.	Loss: 0.5327340960502625	F1-Score: 1.0
12.	Loss: 0.5312104821205139	F1-Score: 1.0
13.	Loss: 0.5297124981880188	F1-Score: 1.0
14.	Loss: 0.5282333493232727	F1-Score: 1.0
15.	Loss: 0.5267658829689026	F1-Score: 1.0
16.	Loss: 0.5253045558929443	F1-Score: 1.0
17.	Loss: 0.5238439440727234	F1-Score: 1.0
18.	Loss: 0.5223800539970398	F1-Score: 1.0
19.	Loss: 0.5209087133407593	F1-Score: 1.0
20.	Loss: 0.5170628428459167	F1-Score: 1.0
21.	Loss: 0.5155128240585327	F1-Score: 1.0
22.	Loss: 0.5144186615943909	F1-Score: 1.0
23.	Loss: 0.5131110548973083	F1-S

[+] Data vectorized correctly
1.	Loss: 0.9950699806213379	F1-Score: 0.0
2.	Loss: 1.0682618618011475	F1-Score: 0.0
3.	Loss: 1.0793074369430542	F1-Score: 0.0
4.	Loss: 1.0794786214828491	F1-Score: 0.0
5.	Loss: 1.0781326293945312	F1-Score: 0.0
6.	Loss: 1.0771210193634033	F1-Score: 0.0
7.	Loss: 1.0764422416687012	F1-Score: 0.0
8.	Loss: 1.0760343074798584	F1-Score: 0.0
9.	Loss: 1.0758816003799438	F1-Score: 0.0
10.	Loss: 1.0759687423706055	F1-Score: 0.0
11.	Loss: 1.0762697458267212	F1-Score: 0.0
12.	Loss: 1.0768322944641113	F1-Score: 0.0
13.	Loss: 1.0774534940719604	F1-Score: 0.0
14.	Loss: 1.078220248222351	F1-Score: 0.0
15.	Loss: 1.0790587663650513	F1-Score: 0.0
16.	Loss: 1.0800012350082397	F1-Score: 0.0


[+] Data vectorized correctly
1.	Loss: 0.638259768486023	F1-Score: 1.0
2.	Loss: 0.6008602380752563	F1-Score: 1.0
3.	Loss: 0.5868315696716309	F1-Score: 1.0
4.	Loss: 0.5763235688209534	F1-Score: 1.0
5.	Loss: 0.5668326020240784	F1-Score: 1.0
6.	Loss: 0.5582966208457947	F1-Score: 1.0
7.	Loss: 0.5506245493888855	F1-Score: 1.0
8.	Loss: 0.5437251925468445	F1-Score: 1.0
9.	Loss: 0.5375524759292603	F1-Score: 1.0
10.	Loss: 0.5320557951927185	F1-Score: 1.0
11.	Loss: 0.5271823406219482	F1-Score: 1.0
12.	Loss: 0.5228774547576904	F1-Score: 1.0
13.	Loss: 0.5190872550010681	F1-Score: 1.0
14.	Loss: 0.5157586932182312	F1-Score: 1.0
15.	Loss: 0.5128422379493713	F1-Score: 1.0
16.	Loss: 0.5102914571762085	F1-Score: 1.0
17.	Loss: 0.508063554763794	F1-Score: 1.0
18.	Loss: 0.506119966506958	F1-Score: 1.0
19.	Loss: 0.5044257044792175	F1-Score: 1.0
20.	Loss: 0.5029497146606445	F1-Score: 1.0
21.	Loss: 0.5016646981239319	F1-Score: 1.0
22.	Loss: 0.5005459189414978	F1-Score: 1.0
23.	Loss: 0.49957239627838135	F1-Sco

[+] Data vectorized correctly
1.	Loss: 0.5807408690452576	F1-Score: 1.0
2.	Loss: 0.5589325428009033	F1-Score: 1.0
3.	Loss: 0.5505329966545105	F1-Score: 1.0
4.	Loss: 0.5436917543411255	F1-Score: 1.0
5.	Loss: 0.5376797914505005	F1-Score: 1.0
6.	Loss: 0.5323426127433777	F1-Score: 1.0
7.	Loss: 0.5276052951812744	F1-Score: 1.0
8.	Loss: 0.5234106183052063	F1-Score: 1.0
9.	Loss: 0.519706130027771	F1-Score: 1.0
10.	Loss: 0.5164430737495422	F1-Score: 1.0
11.	Loss: 0.5135754346847534	F1-Score: 1.0
12.	Loss: 0.511060357093811	F1-Score: 1.0
13.	Loss: 0.5088582038879395	F1-Score: 1.0
14.	Loss: 0.5069324374198914	F1-Score: 1.0
15.	Loss: 0.5052504539489746	F1-Score: 1.0
16.	Loss: 0.5037827491760254	F1-Score: 1.0
17.	Loss: 0.5025025606155396	F1-Score: 1.0
18.	Loss: 0.5013865828514099	F1-Score: 1.0
19.	Loss: 0.5004141926765442	F1-Score: 1.0
20.	Loss: 0.49956679344177246	F1-Score: 1.0
21.	Loss: 0.4988286793231964	F1-Score: 1.0
22.	Loss: 0.49818548560142517	F1-Score: 1.0
23.	Loss: 0.4976252317428589	F1-S

[+] Data vectorized correctly
1.	Loss: 0.7810875773429871	F1-Score: 0.0
2.	Loss: 0.8054102659225464	F1-Score: 0.0
3.	Loss: 0.8293878436088562	F1-Score: 0.0
4.	Loss: 0.8536514639854431	F1-Score: 0.0
5.	Loss: 0.8780303001403809	F1-Score: 0.0
6.	Loss: 0.9021334648132324	F1-Score: 0.0
7.	Loss: 0.9254997968673706	F1-Score: 0.0
8.	Loss: 0.9476598501205444	F1-Score: 0.0
9.	Loss: 0.9682114720344543	F1-Score: 0.0
10.	Loss: 0.9868751764297485	F1-Score: 0.0
11.	Loss: 1.0035133361816406	F1-Score: 0.0


[+] Data vectorized correctly
1.	Loss: 0.9283021092414856	F1-Score: 0.0
2.	Loss: 1.0554107427597046	F1-Score: 0.0
3.	Loss: 1.1066395044326782	F1-Score: 0.0
4.	Loss: 1.1189769506454468	F1-Score: 0.0
5.	Loss: 1.1201611757278442	F1-Score: 0.0
6.	Loss: 1.1191504001617432	F1-Score: 0.0
7.	Loss: 1.1178277730941772	F1-Score: 0.0
8.	Loss: 1.11646568775177	F1-Score: 0.0
9.	Loss: 1.1151005029678345	F1-Score: 0.0
10.	Loss: 1.1137526035308838	F1-Score: 0.0
11.	Loss: 1.1124391555786133	F1-Score: 0.0
12.	Loss: 1.111170768737793	F1-Score: 0.0
13.	Loss: 1.1099525690078735	F1-Score: 0.0
14.	Loss: 1.1087863445281982	F1-Score: 0.0
15.	Loss: 1.1076716184616089	F1-Score: 0.0
16.	Loss: 1.1066069602966309	F1-Score: 0.0
17.	Loss: 1.1055909395217896	F1-Score: 0.0
18.	Loss: 1.104621171951294	F1-Score: 0.0
19.	Loss: 1.104184627532959	F1-Score: 0.0
20.	Loss: 1.1036103963851929	F1-Score: 0.0
21.	Loss: 1.1023576259613037	F1-Score: 0.0
22.	Loss: 1.1012845039367676	F1-Score: 0.0
23.	Loss: 1.1005494594573975	F1-Score:

[+] Data vectorized correctly
1.	Loss: 0.6689756512641907	F1-Score: 1.0
2.	Loss: 0.6508122086524963	F1-Score: 1.0
3.	Loss: 0.6387967467308044	F1-Score: 1.0
4.	Loss: 0.6277523636817932	F1-Score: 1.0
5.	Loss: 0.6174151301383972	F1-Score: 1.0
6.	Loss: 0.6076148152351379	F1-Score: 1.0
7.	Loss: 0.5989819169044495	F1-Score: 1.0
8.	Loss: 0.5907631516456604	F1-Score: 1.0
9.	Loss: 0.5828005075454712	F1-Score: 1.0
10.	Loss: 0.5751407742500305	F1-Score: 1.0
11.	Loss: 0.5678403973579407	F1-Score: 1.0
12.	Loss: 0.5609508156776428	F1-Score: 1.0
13.	Loss: 0.5545114278793335	F1-Score: 1.0
14.	Loss: 0.5485455989837646	F1-Score: 1.0
15.	Loss: 0.5430617332458496	F1-Score: 1.0
16.	Loss: 0.5380542874336243	F1-Score: 1.0
17.	Loss: 0.5335075259208679	F1-Score: 1.0
18.	Loss: 0.5293979048728943	F1-Score: 1.0
19.	Loss: 0.5256971120834351	F1-Score: 1.0
20.	Loss: 0.5223740339279175	F1-Score: 1.0
21.	Loss: 0.5193967819213867	F1-Score: 1.0
22.	Loss: 0.516733705997467	F1-Score: 1.0
23.	Loss: 0.5143545269966125	F1-Sc

[+] Data vectorized correctly
1.	Loss: 0.6413552165031433	F1-Score: 1.0
2.	Loss: 0.6220137476921082	F1-Score: 1.0
3.	Loss: 0.5911004543304443	F1-Score: 1.0
4.	Loss: 0.5996478796005249	F1-Score: 1.0
5.	Loss: 0.6062796711921692	F1-Score: 1.0
6.	Loss: 0.5937919616699219	F1-Score: 1.0
7.	Loss: 0.5833955407142639	F1-Score: 1.0
8.	Loss: 0.5642520189285278	F1-Score: 1.0
9.	Loss: 0.5616239309310913	F1-Score: 1.0
10.	Loss: 0.559734582901001	F1-Score: 1.0
11.	Loss: 0.5548226833343506	F1-Score: 1.0
12.	Loss: 0.5464950203895569	F1-Score: 1.0
13.	Loss: 0.5375149846076965	F1-Score: 1.0
14.	Loss: 0.5316533446311951	F1-Score: 1.0
15.	Loss: 0.5275078415870667	F1-Score: 1.0
16.	Loss: 0.522962749004364	F1-Score: 1.0
17.	Loss: 0.526034951210022	F1-Score: 1.0
18.	Loss: 0.522523820400238	F1-Score: 1.0
19.	Loss: 0.5202934741973877	F1-Score: 1.0
20.	Loss: 0.5190271735191345	F1-Score: 1.0
21.	Loss: 0.5136515498161316	F1-Score: 1.0
22.	Loss: 0.5105594992637634	F1-Score: 1.0
23.	Loss: 0.5089815258979797	F1-Score

[+] Data vectorized correctly
1.	Loss: 0.603203535079956	F1-Score: 1.0
2.	Loss: 0.5490844249725342	F1-Score: 1.0
3.	Loss: 0.5316634178161621	F1-Score: 1.0
4.	Loss: 0.5250886678695679	F1-Score: 1.0
5.	Loss: 0.5223501324653625	F1-Score: 1.0
6.	Loss: 0.5211005806922913	F1-Score: 1.0
7.	Loss: 0.5241883993148804	F1-Score: 1.0
8.	Loss: 0.5227986574172974	F1-Score: 1.0
9.	Loss: 0.5150394439697266	F1-Score: 1.0
10.	Loss: 0.5173685550689697	F1-Score: 1.0
11.	Loss: 0.5184325575828552	F1-Score: 1.0
12.	Loss: 0.5183020830154419	F1-Score: 1.0
13.	Loss: 0.5180824995040894	F1-Score: 1.0
14.	Loss: 0.5179082155227661	F1-Score: 1.0
15.	Loss: 0.5177087187767029	F1-Score: 1.0
16.	Loss: 0.5174579620361328	F1-Score: 1.0
17.	Loss: 0.5171632170677185	F1-Score: 1.0
18.	Loss: 0.5168378353118896	F1-Score: 1.0
19.	Loss: 0.5164913535118103	F1-Score: 1.0
20.	Loss: 0.5161295533180237	F1-Score: 1.0
21.	Loss: 0.5157564878463745	F1-Score: 1.0
22.	Loss: 0.5153751373291016	F1-Score: 1.0
23.	Loss: 0.514988362789154	F1-Sco

[+] Data vectorized correctly
1.	Loss: 0.5770494341850281	F1-Score: 1.0
2.	Loss: 0.5695927739143372	F1-Score: 1.0
3.	Loss: 0.5609109401702881	F1-Score: 1.0
4.	Loss: 0.5544156432151794	F1-Score: 1.0
5.	Loss: 0.5494416952133179	F1-Score: 1.0
6.	Loss: 0.5422344207763672	F1-Score: 1.0
7.	Loss: 0.5392012000083923	F1-Score: 1.0
8.	Loss: 0.5304893255233765	F1-Score: 1.0
9.	Loss: 0.5313767790794373	F1-Score: 1.0
10.	Loss: 0.5262435674667358	F1-Score: 1.0
11.	Loss: 0.5330838561058044	F1-Score: 1.0
12.	Loss: 0.5270067453384399	F1-Score: 1.0
13.	Loss: 0.509377658367157	F1-Score: 1.0
14.	Loss: 0.4886970520019531	F1-Score: 1.0
15.	Loss: 0.5368104577064514	F1-Score: 1.0
16.	Loss: 0.5305629372596741	F1-Score: 1.0
17.	Loss: 0.5253353714942932	F1-Score: 1.0
18.	Loss: 0.5210182666778564	F1-Score: 1.0
19.	Loss: 0.5172761082649231	F1-Score: 1.0
20.	Loss: 0.5140329003334045	F1-Score: 1.0
21.	Loss: 0.5112204551696777	F1-Score: 1.0
22.	Loss: 0.5087806582450867	F1-Score: 1.0
23.	Loss: 0.5066633224487305	F1-Sc

[+] Data vectorized correctly
1.	Loss: 0.6924469470977783	F1-Score: 1.0
2.	Loss: 0.5700799226760864	F1-Score: 1.0
3.	Loss: 0.5452776551246643	F1-Score: 1.0
4.	Loss: 0.554862380027771	F1-Score: 1.0
5.	Loss: 0.5436849594116211	F1-Score: 1.0
6.	Loss: 0.5369659662246704	F1-Score: 1.0
7.	Loss: 0.5333883166313171	F1-Score: 1.0
8.	Loss: 0.5309765338897705	F1-Score: 1.0
9.	Loss: 0.5287383794784546	F1-Score: 1.0
10.	Loss: 0.5273115038871765	F1-Score: 1.0
11.	Loss: 0.525796115398407	F1-Score: 1.0
12.	Loss: 0.5241981148719788	F1-Score: 1.0
13.	Loss: 0.5203336477279663	F1-Score: 1.0
14.	Loss: 0.5201218724250793	F1-Score: 1.0
15.	Loss: 0.5174732804298401	F1-Score: 1.0
16.	Loss: 0.5175632834434509	F1-Score: 1.0
17.	Loss: 0.5151618719100952	F1-Score: 1.0
18.	Loss: 0.5127975344657898	F1-Score: 1.0
19.	Loss: 0.51378333568573	F1-Score: 1.0
20.	Loss: 0.5141350626945496	F1-Score: 1.0
21.	Loss: 0.510828971862793	F1-Score: 1.0
22.	Loss: 0.5111789107322693	F1-Score: 1.0
23.	Loss: 0.5086866617202759	F1-Score:

[+] Data vectorized correctly
1.	Loss: 1.2833364009857178	F1-Score: 0.0
2.	Loss: 1.1057145595550537	F1-Score: 0.0
3.	Loss: 1.0677869319915771	F1-Score: 0.0
4.	Loss: 1.0709285736083984	F1-Score: 0.0
5.	Loss: 1.0760802030563354	F1-Score: 0.0
6.	Loss: 1.0792968273162842	F1-Score: 0.0
7.	Loss: 1.0818848609924316	F1-Score: 0.0
8.	Loss: 1.0845381021499634	F1-Score: 0.0
9.	Loss: 1.087375283241272	F1-Score: 0.0
10.	Loss: 1.0903739929199219	F1-Score: 0.0
11.	Loss: 1.0935131311416626	F1-Score: 0.0
12.	Loss: 1.096786379814148	F1-Score: 0.0
13.	Loss: 1.100197196006775	F1-Score: 0.0


[+] Data vectorized correctly
1.	Loss: 0.6081348657608032	F1-Score: 1.0
2.	Loss: 0.5853734016418457	F1-Score: 1.0
3.	Loss: 0.5721896290779114	F1-Score: 1.0
4.	Loss: 0.5616356134414673	F1-Score: 1.0
5.	Loss: 0.5527849197387695	F1-Score: 1.0
6.	Loss: 0.5452143549919128	F1-Score: 1.0
7.	Loss: 0.538685142993927	F1-Score: 1.0
8.	Loss: 0.5330355167388916	F1-Score: 1.0
9.	Loss: 0.5281402468681335	F1-Score: 1.0
10.	Loss: 0.5238953232765198	F1-Score: 1.0
11.	Loss: 0.5202117562294006	F1-Score: 1.0
12.	Loss: 0.5170131921768188	F1-Score: 1.0
13.	Loss: 0.5142337083816528	F1-Score: 1.0
14.	Loss: 0.511817216873169	F1-Score: 1.0
15.	Loss: 0.5097152590751648	F1-Score: 1.0
16.	Loss: 0.5078867077827454	F1-Score: 1.0
17.	Loss: 0.5062958598136902	F1-Score: 1.0
18.	Loss: 0.5049120187759399	F1-Score: 1.0
19.	Loss: 0.5037079453468323	F1-Score: 1.0
20.	Loss: 0.5026602149009705	F1-Score: 1.0
21.	Loss: 0.5017480850219727	F1-Score: 1.0
22.	Loss: 0.5009535551071167	F1-Score: 1.0
23.	Loss: 0.5002610087394714	F1-Sco

[+] Data vectorized correctly
1.	Loss: 0.5396748185157776	F1-Score: 1.0
2.	Loss: 0.5358312129974365	F1-Score: 1.0
3.	Loss: 0.5442080497741699	F1-Score: 1.0
4.	Loss: 0.5352064371109009	F1-Score: 1.0
5.	Loss: 0.5244330763816833	F1-Score: 1.0
6.	Loss: 0.5187471508979797	F1-Score: 1.0
7.	Loss: 0.5142234563827515	F1-Score: 1.0
8.	Loss: 0.5104933381080627	F1-Score: 1.0
9.	Loss: 0.5074087381362915	F1-Score: 1.0
10.	Loss: 0.5048593282699585	F1-Score: 1.0
11.	Loss: 0.5027533173561096	F1-Score: 1.0
12.	Loss: 0.5010145902633667	F1-Score: 1.0
13.	Loss: 0.4995792806148529	F1-Score: 1.0
14.	Loss: 0.4983944296836853	F1-Score: 1.0
15.	Loss: 0.4974164366722107	F1-Score: 1.0
16.	Loss: 0.4966091811656952	F1-Score: 1.0
17.	Loss: 0.4959428310394287	F1-Score: 1.0
18.	Loss: 0.4953928291797638	F1-Score: 1.0
19.	Loss: 0.4949389696121216	F1-Score: 1.0
20.	Loss: 0.49456456303596497	F1-Score: 1.0
21.	Loss: 0.494255930185318	F1-Score: 1.0
22.	Loss: 0.4940018355846405	F1-Score: 1.0
23.	Loss: 0.4937928020954132	F1-S

[+] Data vectorized correctly
1.	Loss: 0.7923469543457031	F1-Score: 0.0
2.	Loss: 0.7186577916145325	F1-Score: 0.0
3.	Loss: 0.5724894404411316	F1-Score: 1.0
4.	Loss: 0.5290055274963379	F1-Score: 1.0
5.	Loss: 0.5124011039733887	F1-Score: 1.0
6.	Loss: 0.50639408826828	F1-Score: 1.0
7.	Loss: 0.5039423704147339	F1-Score: 1.0
8.	Loss: 0.5026341676712036	F1-Score: 1.0
9.	Loss: 0.5013983845710754	F1-Score: 1.0
10.	Loss: 0.5009967684745789	F1-Score: 1.0
11.	Loss: 0.5001598000526428	F1-Score: 1.0
12.	Loss: 0.4989244043827057	F1-Score: 1.0
13.	Loss: 0.49895179271698	F1-Score: 1.0
14.	Loss: 0.49831998348236084	F1-Score: 1.0
15.	Loss: 0.49706539511680603	F1-Score: 1.0
16.	Loss: 0.4960215389728546	F1-Score: 1.0
17.	Loss: 0.49325355887413025	F1-Score: 1.0
18.	Loss: 0.49450016021728516	F1-Score: 1.0
19.	Loss: 0.49353715777397156	F1-Score: 1.0
20.	Loss: 0.4922468364238739	F1-Score: 1.0
21.	Loss: 0.4906269907951355	F1-Score: 1.0
22.	Loss: 0.48917356133461	F1-Score: 1.0
23.	Loss: 0.48827075958251953	F1-S

[+] Data vectorized correctly
1.	Loss: 0.565144419670105	F1-Score: 1.0
2.	Loss: 0.5559232831001282	F1-Score: 1.0
3.	Loss: 0.5472409725189209	F1-Score: 1.0
4.	Loss: 0.5397143959999084	F1-Score: 1.0
5.	Loss: 0.5332579612731934	F1-Score: 1.0
6.	Loss: 0.5277172327041626	F1-Score: 1.0
7.	Loss: 0.5229566693305969	F1-Score: 1.0
8.	Loss: 0.5188627243041992	F1-Score: 1.0
9.	Loss: 0.5153394937515259	F1-Score: 1.0
10.	Loss: 0.5123066902160645	F1-Score: 1.0
11.	Loss: 0.5096943378448486	F1-Score: 1.0
12.	Loss: 0.5074433088302612	F1-Score: 1.0
13.	Loss: 0.5055018067359924	F1-Score: 1.0
14.	Loss: 0.5038254261016846	F1-Score: 1.0
15.	Loss: 0.5023762583732605	F1-Score: 1.0
16.	Loss: 0.5011215806007385	F1-Score: 1.0
17.	Loss: 0.5000333786010742	F1-Score: 1.0
18.	Loss: 0.4990880489349365	F1-Score: 1.0
19.	Loss: 0.49826502799987793	F1-Score: 1.0
20.	Loss: 0.4975473880767822	F1-Score: 1.0
21.	Loss: 0.4969203770160675	F1-Score: 1.0
22.	Loss: 0.4963715076446533	F1-Score: 1.0
23.	Loss: 0.4958902895450592	F1-S

[+] Data vectorized correctly
1.	Loss: 0.8989920020103455	F1-Score: 0.0
2.	Loss: 0.9623714685440063	F1-Score: 0.0
3.	Loss: 0.987128496170044	F1-Score: 0.0
4.	Loss: 0.9956129193305969	F1-Score: 0.0
5.	Loss: 1.0097405910491943	F1-Score: 0.0
6.	Loss: 1.013956069946289	F1-Score: 0.0
7.	Loss: 1.0217586755752563	F1-Score: 0.0
8.	Loss: 1.0279719829559326	F1-Score: 0.0
9.	Loss: 1.028096079826355	F1-Score: 0.0
10.	Loss: 1.0275236368179321	F1-Score: 0.0
11.	Loss: 1.0292607545852661	F1-Score: 0.0
12.	Loss: 1.039193868637085	F1-Score: 0.0


[+] Data vectorized correctly
1.	Loss: 0.8677865862846375	F1-Score: 0.0
2.	Loss: 0.8940250873565674	F1-Score: 0.0
3.	Loss: 0.920167088508606	F1-Score: 0.0
4.	Loss: 0.9442647099494934	F1-Score: 0.0
5.	Loss: 0.9660167098045349	F1-Score: 0.0
6.	Loss: 0.9854254722595215	F1-Score: 0.0
7.	Loss: 1.0025646686553955	F1-Score: 0.0
8.	Loss: 1.017545223236084	F1-Score: 0.0
9.	Loss: 1.0305123329162598	F1-Score: 0.0
10.	Loss: 1.041638731956482	F1-Score: 0.0
11.	Loss: 1.0511140823364258	F1-Score: 0.0


[+] Data vectorized correctly
1.	Loss: 0.9193761348724365	F1-Score: 0.0
2.	Loss: 1.1423022747039795	F1-Score: 0.0
3.	Loss: 1.1723095178604126	F1-Score: 0.0
4.	Loss: 1.174831509590149	F1-Score: 0.0
5.	Loss: 1.1716314554214478	F1-Score: 0.0
6.	Loss: 1.169386625289917	F1-Score: 0.0
7.	Loss: 1.1640093326568604	F1-Score: 0.0
8.	Loss: 1.1658680438995361	F1-Score: 0.0
9.	Loss: 1.16620934009552	F1-Score: 0.0
10.	Loss: 1.1658371686935425	F1-Score: 0.0
11.	Loss: 1.1654120683670044	F1-Score: 0.0
12.	Loss: 1.1651052236557007	F1-Score: 0.0
13.	Loss: 1.1649165153503418	F1-Score: 0.0
14.	Loss: 1.1648259162902832	F1-Score: 0.0
15.	Loss: 1.1648192405700684	F1-Score: 0.0
16.	Loss: 1.1648881435394287	F1-Score: 0.0
17.	Loss: 1.1650265455245972	F1-Score: 0.0
18.	Loss: 1.1652299165725708	F1-Score: 0.0
19.	Loss: 1.1654949188232422	F1-Score: 0.0
20.	Loss: 1.1658191680908203	F1-Score: 0.0


[+] Data vectorized correctly
1.	Loss: 0.4270838499069214	F1-Score: 1.0
2.	Loss: 0.4862366020679474	F1-Score: 1.0
3.	Loss: 0.5092223286628723	F1-Score: 1.0
4.	Loss: 0.5223269462585449	F1-Score: 1.0
5.	Loss: 0.5262510180473328	F1-Score: 1.0
6.	Loss: 0.5265316963195801	F1-Score: 1.0
7.	Loss: 0.5263285040855408	F1-Score: 1.0
8.	Loss: 0.5266523957252502	F1-Score: 1.0
9.	Loss: 0.5251628756523132	F1-Score: 1.0
10.	Loss: 0.5232610106468201	F1-Score: 1.0
11.	Loss: 0.5263650417327881	F1-Score: 1.0
12.	Loss: 0.5230803489685059	F1-Score: 1.0
13.	Loss: 0.5026869773864746	F1-Score: 1.0
14.	Loss: 0.5098015069961548	F1-Score: 1.0
15.	Loss: 0.5175400972366333	F1-Score: 1.0
16.	Loss: 0.5637370944023132	F1-Score: 1.0


[+] Data vectorized correctly
1.	Loss: 0.680308997631073	F1-Score: 1.0
2.	Loss: 0.7173005938529968	F1-Score: 0.0
3.	Loss: 0.7520161867141724	F1-Score: 0.0
4.	Loss: 0.7850791811943054	F1-Score: 0.0
5.	Loss: 0.8171307444572449	F1-Score: 0.0
6.	Loss: 0.8483835458755493	F1-Score: 0.0
7.	Loss: 0.878616452217102	F1-Score: 0.0
8.	Loss: 0.907712459564209	F1-Score: 0.0
9.	Loss: 0.9350114464759827	F1-Score: 0.0
10.	Loss: 0.9599403142929077	F1-Score: 0.0
11.	Loss: 0.9821324348449707	F1-Score: 0.0


[+] Data vectorized correctly
1.	Loss: 0.7402384281158447	F1-Score: 0.0
2.	Loss: 0.7061464190483093	F1-Score: 0.0
3.	Loss: 0.6620606780052185	F1-Score: 1.0
4.	Loss: 0.6168215274810791	F1-Score: 1.0
5.	Loss: 0.58363276720047	F1-Score: 1.0
6.	Loss: 0.5637571215629578	F1-Score: 1.0
7.	Loss: 0.5523360371589661	F1-Score: 1.0
8.	Loss: 0.5453681349754333	F1-Score: 1.0
9.	Loss: 0.5407314896583557	F1-Score: 1.0
10.	Loss: 0.5375398993492126	F1-Score: 1.0
11.	Loss: 0.5345938205718994	F1-Score: 1.0
12.	Loss: 0.5321735143661499	F1-Score: 1.0
13.	Loss: 0.5301191806793213	F1-Score: 1.0
14.	Loss: 0.5283072590827942	F1-Score: 1.0
15.	Loss: 0.5266655087471008	F1-Score: 1.0
16.	Loss: 0.525151789188385	F1-Score: 1.0
17.	Loss: 0.52373868227005	F1-Score: 1.0
18.	Loss: 0.522406816482544	F1-Score: 1.0
19.	Loss: 0.5211406350135803	F1-Score: 1.0
20.	Loss: 0.5199275612831116	F1-Score: 1.0
21.	Loss: 0.5187563300132751	F1-Score: 1.0
22.	Loss: 0.5176182985305786	F1-Score: 1.0
23.	Loss: 0.5165053009986877	F1-Score: 

[+] Data vectorized correctly
1.	Loss: 0.5677382349967957	F1-Score: 1.0
2.	Loss: 0.5552058219909668	F1-Score: 1.0
3.	Loss: 0.5444692969322205	F1-Score: 1.0
4.	Loss: 0.5355585217475891	F1-Score: 1.0
5.	Loss: 0.5281702876091003	F1-Score: 1.0
6.	Loss: 0.5220270752906799	F1-Score: 1.0
7.	Loss: 0.5169088244438171	F1-Score: 1.0
8.	Loss: 0.512640118598938	F1-Score: 1.0
9.	Loss: 0.5090801119804382	F1-Score: 1.0
10.	Loss: 0.5061126351356506	F1-Score: 1.0
11.	Loss: 0.5036391019821167	F1-Score: 1.0
12.	Loss: 0.5015767216682434	F1-Score: 1.0
13.	Loss: 0.49985572695732117	F1-Score: 1.0
14.	Loss: 0.498418390750885	F1-Score: 1.0
15.	Loss: 0.49721699953079224	F1-Score: 1.0
16.	Loss: 0.4962121248245239	F1-Score: 1.0
17.	Loss: 0.4953717291355133	F1-Score: 1.0
18.	Loss: 0.49466875195503235	F1-Score: 1.0
19.	Loss: 0.4940812587738037	F1-Score: 1.0
20.	Loss: 0.49359115958213806	F1-Score: 1.0
21.	Loss: 0.4931831955909729	F1-Score: 1.0
22.	Loss: 0.4928445518016815	F1-Score: 1.0
23.	Loss: 0.4925650656223297	F1

[+] Data vectorized correctly
1.	Loss: 0.7689473032951355	F1-Score: 0.0
2.	Loss: 0.8197014331817627	F1-Score: 0.0
3.	Loss: 0.8700745701789856	F1-Score: 0.0
4.	Loss: 0.9160885214805603	F1-Score: 0.0
5.	Loss: 0.9560478329658508	F1-Score: 0.0
6.	Loss: 0.9894165992736816	F1-Score: 0.0
7.	Loss: 1.016437292098999	F1-Score: 0.0
8.	Loss: 1.0378139019012451	F1-Score: 0.0
9.	Loss: 1.0544410943984985	F1-Score: 0.0
10.	Loss: 1.0672202110290527	F1-Score: 0.0
11.	Loss: 1.0769610404968262	F1-Score: 0.0


[+] Data vectorized correctly
1.	Loss: 0.7673255801200867	F1-Score: 0.0
2.	Loss: 1.0496076345443726	F1-Score: 0.0
3.	Loss: 1.1559085845947266	F1-Score: 0.0
4.	Loss: 1.1728335618972778	F1-Score: 0.0
5.	Loss: 1.1682230234146118	F1-Score: 0.0
6.	Loss: 1.16293203830719	F1-Score: 0.0
7.	Loss: 1.1590763330459595	F1-Score: 0.0
8.	Loss: 1.1560777425765991	F1-Score: 0.0
9.	Loss: 1.1543501615524292	F1-Score: 0.0
10.	Loss: 1.1526316404342651	F1-Score: 0.0
11.	Loss: 1.150834321975708	F1-Score: 0.0
12.	Loss: 1.1498150825500488	F1-Score: 0.0
13.	Loss: 1.1470707654953003	F1-Score: 0.0
14.	Loss: 1.1444422006607056	F1-Score: 0.0
15.	Loss: 1.1429070234298706	F1-Score: 0.0
16.	Loss: 1.1417648792266846	F1-Score: 0.0
17.	Loss: 1.1406733989715576	F1-Score: 0.0
18.	Loss: 1.138949990272522	F1-Score: 0.0
19.	Loss: 1.1375139951705933	F1-Score: 0.0
20.	Loss: 1.1363050937652588	F1-Score: 0.0
21.	Loss: 1.1351933479309082	F1-Score: 0.0
22.	Loss: 1.1341392993927002	F1-Score: 0.0
23.	Loss: 1.1361501216888428	F1-Score

[+] Data vectorized correctly
1.	Loss: 0.6543229818344116	F1-Score: 1.0
2.	Loss: 0.6191815137863159	F1-Score: 1.0
3.	Loss: 0.6014785170555115	F1-Score: 1.0
4.	Loss: 0.5870475769042969	F1-Score: 1.0
5.	Loss: 0.5839497447013855	F1-Score: 1.0
6.	Loss: 0.5606147646903992	F1-Score: 1.0
7.	Loss: 0.5623072981834412	F1-Score: 1.0
8.	Loss: 0.5566054582595825	F1-Score: 1.0
9.	Loss: 0.5505304932594299	F1-Score: 1.0
10.	Loss: 0.5469557046890259	F1-Score: 1.0
11.	Loss: 0.5405360460281372	F1-Score: 1.0
12.	Loss: 0.5353084206581116	F1-Score: 1.0
13.	Loss: 0.5379177927970886	F1-Score: 1.0
14.	Loss: 0.5606394410133362	F1-Score: 1.0
15.	Loss: 0.5529047250747681	F1-Score: 1.0
16.	Loss: 0.546265184879303	F1-Score: 1.0
17.	Loss: 0.5402692556381226	F1-Score: 1.0
18.	Loss: 0.5348755717277527	F1-Score: 1.0
19.	Loss: 0.5300446152687073	F1-Score: 1.0
20.	Loss: 0.5257328152656555	F1-Score: 1.0
21.	Loss: 0.5218949913978577	F1-Score: 1.0
22.	Loss: 0.518486738204956	F1-Score: 1.0
23.	Loss: 0.5154654383659363	F1-Sco

[+] Data vectorized correctly
1.	Loss: 0.6010470390319824	F1-Score: 1.0
2.	Loss: 0.5833615660667419	F1-Score: 1.0
3.	Loss: 0.5758847594261169	F1-Score: 1.0
4.	Loss: 0.5696247816085815	F1-Score: 1.0
5.	Loss: 0.5629636645317078	F1-Score: 1.0
6.	Loss: 0.5570348501205444	F1-Score: 1.0
7.	Loss: 0.545028030872345	F1-Score: 1.0
8.	Loss: 0.5404378175735474	F1-Score: 1.0
9.	Loss: 0.5366401672363281	F1-Score: 1.0
10.	Loss: 0.5332765579223633	F1-Score: 1.0
11.	Loss: 0.530239462852478	F1-Score: 1.0
12.	Loss: 0.5274662375450134	F1-Score: 1.0
13.	Loss: 0.5249181985855103	F1-Score: 1.0
14.	Loss: 0.5225694179534912	F1-Score: 1.0
15.	Loss: 0.5204002857208252	F1-Score: 1.0
16.	Loss: 0.5183953642845154	F1-Score: 1.0
17.	Loss: 0.5165416598320007	F1-Score: 1.0
18.	Loss: 0.5148279666900635	F1-Score: 1.0
19.	Loss: 0.5132439136505127	F1-Score: 1.0
20.	Loss: 0.511780321598053	F1-Score: 1.0
21.	Loss: 0.5104277729988098	F1-Score: 1.0
22.	Loss: 0.5091784596443176	F1-Score: 1.0
23.	Loss: 0.5080251097679138	F1-Scor

[+] Data vectorized correctly
1.	Loss: 0.4657336175441742	F1-Score: 1.0
2.	Loss: 0.49224916100502014	F1-Score: 1.0
3.	Loss: 0.5086423754692078	F1-Score: 1.0
4.	Loss: 0.5171930193901062	F1-Score: 1.0
5.	Loss: 0.5210955739021301	F1-Score: 1.0
6.	Loss: 0.5246551632881165	F1-Score: 1.0
7.	Loss: 0.5255047678947449	F1-Score: 1.0
8.	Loss: 0.5253777503967285	F1-Score: 1.0
9.	Loss: 0.5246501564979553	F1-Score: 1.0
10.	Loss: 0.5235475897789001	F1-Score: 1.0
11.	Loss: 0.5222412347793579	F1-Score: 1.0
12.	Loss: 0.5208523273468018	F1-Score: 1.0
13.	Loss: 0.5130466222763062	F1-Score: 1.0
14.	Loss: 0.5074048638343811	F1-Score: 1.0
15.	Loss: 0.5059940814971924	F1-Score: 1.0
16.	Loss: 0.5023937821388245	F1-Score: 1.0
17.	Loss: 0.5050240159034729	F1-Score: 1.0
18.	Loss: 0.5010610222816467	F1-Score: 1.0
19.	Loss: 0.4980999529361725	F1-Score: 1.0
20.	Loss: 0.5031455755233765	F1-Score: 1.0
21.	Loss: 0.49906018376350403	F1-Score: 1.0
22.	Loss: 0.5031954646110535	F1-Score: 1.0
23.	Loss: 0.4958803057670593	F1

[+] Data vectorized correctly
1.	Loss: 0.5840712785720825	F1-Score: 1.0
2.	Loss: 0.5766488313674927	F1-Score: 1.0
3.	Loss: 0.5673382878303528	F1-Score: 1.0
4.	Loss: 0.558795690536499	F1-Score: 1.0
5.	Loss: 0.5510542988777161	F1-Score: 1.0
6.	Loss: 0.5440903306007385	F1-Score: 1.0
7.	Loss: 0.5378614664077759	F1-Score: 1.0
8.	Loss: 0.5323178768157959	F1-Score: 1.0
9.	Loss: 0.5274056196212769	F1-Score: 1.0
10.	Loss: 0.5230697393417358	F1-Score: 1.0
11.	Loss: 0.5192549228668213	F1-Score: 1.0
12.	Loss: 0.5159076452255249	F1-Score: 1.0
13.	Loss: 0.5129716396331787	F1-Score: 1.0
14.	Loss: 0.5104272961616516	F1-Score: 1.0
15.	Loss: 0.508202075958252	F1-Score: 1.0
16.	Loss: 0.5062583684921265	F1-Score: 1.0
17.	Loss: 0.504562497138977	F1-Score: 1.0
18.	Loss: 0.5030837655067444	F1-Score: 1.0
19.	Loss: 0.5017954707145691	F1-Score: 1.0
20.	Loss: 0.5006733536720276	F1-Score: 1.0
21.	Loss: 0.49969616532325745	F1-Score: 1.0
22.	Loss: 0.4988453686237335	F1-Score: 1.0
23.	Loss: 0.49810466170310974	F1-Sc

[+] Data vectorized correctly
1.	Loss: 0.6626095175743103	F1-Score: 1.0
2.	Loss: 0.6179826855659485	F1-Score: 1.0
3.	Loss: 0.6016381978988647	F1-Score: 1.0
4.	Loss: 0.5828850865364075	F1-Score: 1.0
5.	Loss: 0.5548301935195923	F1-Score: 1.0
6.	Loss: 0.5600900650024414	F1-Score: 1.0
7.	Loss: 0.5395510196685791	F1-Score: 1.0
8.	Loss: 0.5499265193939209	F1-Score: 1.0
9.	Loss: 0.5470162034034729	F1-Score: 1.0
10.	Loss: 0.5917562246322632	F1-Score: 1.0
11.	Loss: 0.573859453201294	F1-Score: 1.0
12.	Loss: 0.5593938827514648	F1-Score: 1.0
13.	Loss: 0.5473914742469788	F1-Score: 1.0
14.	Loss: 0.5374795794487	F1-Score: 1.0
15.	Loss: 0.5293230414390564	F1-Score: 1.0
16.	Loss: 0.5226285457611084	F1-Score: 1.0
17.	Loss: 0.517144501209259	F1-Score: 1.0
18.	Loss: 0.5126573443412781	F1-Score: 1.0
19.	Loss: 0.5089876651763916	F1-Score: 1.0
20.	Loss: 0.5059871077537537	F1-Score: 1.0
21.	Loss: 0.5035327672958374	F1-Score: 1.0
22.	Loss: 0.5015243291854858	F1-Score: 1.0
23.	Loss: 0.49987971782684326	F1-Score

[+] Data vectorized correctly
1.	Loss: 0.8710914254188538	F1-Score: 0.0
2.	Loss: 1.064164161682129	F1-Score: 0.0
3.	Loss: 1.0952396392822266	F1-Score: 0.0
4.	Loss: 1.0954208374023438	F1-Score: 0.0
5.	Loss: 1.096837043762207	F1-Score: 0.0
6.	Loss: 1.0993618965148926	F1-Score: 0.0
7.	Loss: 1.1023962497711182	F1-Score: 0.0
8.	Loss: 1.1053024530410767	F1-Score: 0.0
9.	Loss: 1.1087727546691895	F1-Score: 0.0
10.	Loss: 1.1124098300933838	F1-Score: 0.0
11.	Loss: 1.116377353668213	F1-Score: 0.0


[+] Data vectorized correctly
1.	Loss: 0.6957072615623474	F1-Score: 0.0
2.	Loss: 0.6810175776481628	F1-Score: 1.0
3.	Loss: 0.5989636182785034	F1-Score: 1.0
4.	Loss: 0.5874641537666321	F1-Score: 1.0
5.	Loss: 0.5785128474235535	F1-Score: 1.0
6.	Loss: 0.5702368021011353	F1-Score: 1.0
7.	Loss: 0.5637697577476501	F1-Score: 1.0
8.	Loss: 0.5589051246643066	F1-Score: 1.0
9.	Loss: 0.5551362633705139	F1-Score: 1.0
10.	Loss: 0.552054226398468	F1-Score: 1.0
11.	Loss: 0.5484191179275513	F1-Score: 1.0
12.	Loss: 0.5456592440605164	F1-Score: 1.0
13.	Loss: 0.5425922274589539	F1-Score: 1.0
14.	Loss: 0.538304328918457	F1-Score: 1.0
15.	Loss: 0.5358255505561829	F1-Score: 1.0
16.	Loss: 0.5319024324417114	F1-Score: 1.0
17.	Loss: 0.5298107266426086	F1-Score: 1.0
18.	Loss: 0.5259578824043274	F1-Score: 1.0
19.	Loss: 0.5236107110977173	F1-Score: 1.0
20.	Loss: 0.5193970799446106	F1-Score: 1.0
21.	Loss: 0.5203178524971008	F1-Score: 1.0
22.	Loss: 0.5172979831695557	F1-Score: 1.0
23.	Loss: 0.5006749629974365	F1-Sco

[+] Data vectorized correctly
1.	Loss: 0.6089116334915161	F1-Score: 1.0
2.	Loss: 0.5910782814025879	F1-Score: 1.0
3.	Loss: 0.578233540058136	F1-Score: 1.0
4.	Loss: 0.5672733783721924	F1-Score: 1.0
5.	Loss: 0.5578144192695618	F1-Score: 1.0
6.	Loss: 0.5495980381965637	F1-Score: 1.0
7.	Loss: 0.5424400568008423	F1-Score: 1.0
8.	Loss: 0.536198079586029	F1-Score: 1.0
9.	Loss: 0.5307551622390747	F1-Score: 1.0
10.	Loss: 0.5260113477706909	F1-Score: 1.0
11.	Loss: 0.5218795537948608	F1-Score: 1.0
12.	Loss: 0.5182822346687317	F1-Score: 1.0
13.	Loss: 0.5151510834693909	F1-Score: 1.0
14.	Loss: 0.5124257802963257	F1-Score: 1.0
15.	Loss: 0.5100533962249756	F1-Score: 1.0
16.	Loss: 0.5079872608184814	F1-Score: 1.0
17.	Loss: 0.5061869025230408	F1-Score: 1.0
18.	Loss: 0.504616916179657	F1-Score: 1.0
19.	Loss: 0.5032468438148499	F1-Score: 1.0
20.	Loss: 0.50204998254776	F1-Score: 1.0
21.	Loss: 0.5010034441947937	F1-Score: 1.0
22.	Loss: 0.500087559223175	F1-Score: 1.0
23.	Loss: 0.4992850124835968	F1-Score: 

[+] Data vectorized correctly
1.	Loss: 0.5767555832862854	F1-Score: 1.0
2.	Loss: 0.5685617923736572	F1-Score: 1.0
3.	Loss: 0.5565049052238464	F1-Score: 1.0
4.	Loss: 0.5464301705360413	F1-Score: 1.0
5.	Loss: 0.5380105972290039	F1-Score: 1.0
6.	Loss: 0.5309253334999084	F1-Score: 1.0
7.	Loss: 0.524932324886322	F1-Score: 1.0
8.	Loss: 0.5198465585708618	F1-Score: 1.0
9.	Loss: 0.5155197381973267	F1-Score: 1.0
10.	Loss: 0.5118308067321777	F1-Score: 1.0
11.	Loss: 0.5086797475814819	F1-Score: 1.0
12.	Loss: 0.5059827566146851	F1-Score: 1.0
13.	Loss: 0.5036706924438477	F1-Score: 1.0
14.	Loss: 0.5016852617263794	F1-Score: 1.0
15.	Loss: 0.49997857213020325	F1-Score: 1.0
16.	Loss: 0.49851012229919434	F1-Score: 1.0
17.	Loss: 0.4972456991672516	F1-Score: 1.0
18.	Loss: 0.4961562156677246	F1-Score: 1.0
19.	Loss: 0.4952171742916107	F1-Score: 1.0
20.	Loss: 0.4944070279598236	F1-Score: 1.0
21.	Loss: 0.49370765686035156	F1-Score: 1.0
22.	Loss: 0.4931032657623291	F1-Score: 1.0
23.	Loss: 0.49258068203926086	F

[+] Data vectorized correctly
1.	Loss: 0.7402632832527161	F1-Score: 0.0
2.	Loss: 0.7009713649749756	F1-Score: 0.0
3.	Loss: 0.6592661738395691	F1-Score: 1.0
4.	Loss: 0.5879050493240356	F1-Score: 1.0
5.	Loss: 0.5916413068771362	F1-Score: 1.0
6.	Loss: 0.6174758076667786	F1-Score: 1.0
7.	Loss: 0.5965861678123474	F1-Score: 1.0
8.	Loss: 0.5786774158477783	F1-Score: 1.0
9.	Loss: 0.5634016990661621	F1-Score: 1.0
10.	Loss: 0.5504641532897949	F1-Score: 1.0
11.	Loss: 0.5395963191986084	F1-Score: 1.0
12.	Loss: 0.5305356979370117	F1-Score: 1.0
13.	Loss: 0.5230275988578796	F1-Score: 1.0
14.	Loss: 0.5168338418006897	F1-Score: 1.0
15.	Loss: 0.5117397308349609	F1-Score: 1.0
16.	Loss: 0.5075575709342957	F1-Score: 1.0
17.	Loss: 0.5041268467903137	F1-Score: 1.0
18.	Loss: 0.5013130903244019	F1-Score: 1.0
19.	Loss: 0.4990043044090271	F1-Score: 1.0
20.	Loss: 0.497108519077301	F1-Score: 1.0
21.	Loss: 0.4955796003341675	F1-Score: 1.0
22.	Loss: 0.49439185857772827	F1-Score: 1.0
23.	Loss: 0.49340730905532837	F1-

[+] Data vectorized correctly
1.	Loss: 1.053387999534607	F1-Score: 0.0
2.	Loss: 1.1102609634399414	F1-Score: 0.0
3.	Loss: 1.0956162214279175	F1-Score: 0.0
4.	Loss: 1.0920945405960083	F1-Score: 0.0
5.	Loss: 1.094652533531189	F1-Score: 0.0
6.	Loss: 1.0907871723175049	F1-Score: 0.0
7.	Loss: 1.0899657011032104	F1-Score: 0.0
8.	Loss: 1.0902565717697144	F1-Score: 0.0
9.	Loss: 1.0907196998596191	F1-Score: 0.0
10.	Loss: 1.0912854671478271	F1-Score: 0.0
11.	Loss: 1.1064568758010864	F1-Score: 0.0
12.	Loss: 1.0955479145050049	F1-Score: 0.0
13.	Loss: 1.0970698595046997	F1-Score: 0.0
14.	Loss: 1.0985872745513916	F1-Score: 0.0
15.	Loss: 1.0990004539489746	F1-Score: 0.0
16.	Loss: 1.0991848707199097	F1-Score: 0.0


[+] Data vectorized correctly
1.	Loss: 0.9498424530029297	F1-Score: 0.0
2.	Loss: 1.015076756477356	F1-Score: 0.0
3.	Loss: 1.0304017066955566	F1-Score: 0.0
4.	Loss: 1.0343464612960815	F1-Score: 0.0
5.	Loss: 1.0367164611816406	F1-Score: 0.0
6.	Loss: 1.0390146970748901	F1-Score: 0.0
7.	Loss: 1.0413140058517456	F1-Score: 0.0
8.	Loss: 1.0436182022094727	F1-Score: 0.0
9.	Loss: 1.045960783958435	F1-Score: 0.0
10.	Loss: 1.0483750104904175	F1-Score: 0.0
11.	Loss: 1.0508846044540405	F1-Score: 0.0


[+] Data vectorized correctly
1.	Loss: 0.8396737575531006	F1-Score: 0.0
2.	Loss: 1.1887954473495483	F1-Score: 0.0
3.	Loss: 1.1957981586456299	F1-Score: 0.0
4.	Loss: 1.1862523555755615	F1-Score: 0.0
5.	Loss: 1.1949825286865234	F1-Score: 0.0
6.	Loss: 1.7412728071212769	F1-Score: 0.0
7.	Loss: 1.0904091596603394	F1-Score: 0.0
8.	Loss: 1.1601852178573608	F1-Score: 0.0
9.	Loss: 1.1831839084625244	F1-Score: 0.0
10.	Loss: 1.1815979480743408	F1-Score: 0.0
11.	Loss: 1.1800386905670166	F1-Score: 0.0
12.	Loss: 1.185484766960144	F1-Score: 0.0
13.	Loss: 1.1737347841262817	F1-Score: 0.0
14.	Loss: 1.1786404848098755	F1-Score: 0.0
15.	Loss: 1.1796715259552002	F1-Score: 0.0
16.	Loss: 1.1792218685150146	F1-Score: 0.0
17.	Loss: 1.1790810823440552	F1-Score: 0.0
18.	Loss: 1.1793512105941772	F1-Score: 0.0


[+] Data vectorized correctly
1.	Loss: 0.6071006059646606	F1-Score: 1.0
2.	Loss: 0.5898905992507935	F1-Score: 1.0
3.	Loss: 0.5787035226821899	F1-Score: 1.0
4.	Loss: 0.568732500076294	F1-Score: 1.0
5.	Loss: 0.5598263740539551	F1-Score: 1.0
6.	Loss: 0.551868736743927	F1-Score: 1.0
7.	Loss: 0.5447714924812317	F1-Score: 1.0
8.	Loss: 0.5384629964828491	F1-Score: 1.0
9.	Loss: 0.5328773856163025	F1-Score: 1.0
10.	Loss: 0.5279505848884583	F1-Score: 1.0
11.	Loss: 0.5236191749572754	F1-Score: 1.0
12.	Loss: 0.5198211073875427	F1-Score: 1.0
13.	Loss: 0.5164977312088013	F1-Score: 1.0
14.	Loss: 0.513593852519989	F1-Score: 1.0
15.	Loss: 0.511059582233429	F1-Score: 1.0
16.	Loss: 0.5088493824005127	F1-Score: 1.0
17.	Loss: 0.5069231390953064	F1-Score: 1.0
18.	Loss: 0.5052447319030762	F1-Score: 1.0
19.	Loss: 0.5037826299667358	F1-Score: 1.0
20.	Loss: 0.5025088787078857	F1-Score: 1.0
21.	Loss: 0.5013985633850098	F1-Score: 1.0
22.	Loss: 0.5004305243492126	F1-Score: 1.0
23.	Loss: 0.49958595633506775	F1-Scor

[+] Data vectorized correctly
1.	Loss: 0.8819653391838074	F1-Score: 0.0
2.	Loss: 1.0746209621429443	F1-Score: 0.0
3.	Loss: 1.101270318031311	F1-Score: 0.0
4.	Loss: 1.099574089050293	F1-Score: 0.0
5.	Loss: 1.1000518798828125	F1-Score: 0.0
6.	Loss: 1.1026333570480347	F1-Score: 0.0
7.	Loss: 1.1006648540496826	F1-Score: 0.0
8.	Loss: 1.0985887050628662	F1-Score: 0.0
9.	Loss: 1.0971416234970093	F1-Score: 0.0
10.	Loss: 1.0961401462554932	F1-Score: 0.0
11.	Loss: 1.095442771911621	F1-Score: 0.0
12.	Loss: 1.0949382781982422	F1-Score: 0.0
13.	Loss: 1.094617486000061	F1-Score: 0.0
14.	Loss: 1.0943593978881836	F1-Score: 0.0
15.	Loss: 1.0948278903961182	F1-Score: 0.0
16.	Loss: 1.0945258140563965	F1-Score: 0.0
17.	Loss: 1.0944856405258179	F1-Score: 0.0
18.	Loss: 1.0945732593536377	F1-Score: 0.0
19.	Loss: 1.0947494506835938	F1-Score: 0.0
20.	Loss: 1.0949630737304688	F1-Score: 0.0
21.	Loss: 1.0952180624008179	F1-Score: 0.0
22.	Loss: 1.095518708229065	F1-Score: 0.0


[+] Data vectorized correctly
1.	Loss: 0.4395131468772888	F1-Score: 1.0
2.	Loss: 0.5323485136032104	F1-Score: 1.0
3.	Loss: 0.5437688231468201	F1-Score: 1.0
4.	Loss: 0.5382479429244995	F1-Score: 1.0
5.	Loss: 0.5312632322311401	F1-Score: 1.0
6.	Loss: 0.5287757515907288	F1-Score: 1.0
7.	Loss: 0.5272478461265564	F1-Score: 1.0
8.	Loss: 0.526122510433197	F1-Score: 1.0
9.	Loss: 0.5248066782951355	F1-Score: 1.0
10.	Loss: 0.5237318277359009	F1-Score: 1.0
11.	Loss: 0.5224337577819824	F1-Score: 1.0
12.	Loss: 0.5205456018447876	F1-Score: 1.0
13.	Loss: 0.5189662575721741	F1-Score: 1.0
14.	Loss: 0.5176694989204407	F1-Score: 1.0
15.	Loss: 0.517448902130127	F1-Score: 1.0
16.	Loss: 0.5147578716278076	F1-Score: 1.0
17.	Loss: 0.5147565007209778	F1-Score: 1.0
18.	Loss: 0.5121819376945496	F1-Score: 1.0
19.	Loss: 0.5123520493507385	F1-Score: 1.0
20.	Loss: 0.5116482377052307	F1-Score: 1.0
21.	Loss: 0.5092971920967102	F1-Score: 1.0
22.	Loss: 0.5085961818695068	F1-Score: 1.0
23.	Loss: 0.5067105889320374	F1-Sco

[+] Data vectorized correctly
1.	Loss: 0.7620685696601868	F1-Score: 0.0
2.	Loss: 0.6323961019515991	F1-Score: 1.0
3.	Loss: 0.527485728263855	F1-Score: 1.0
4.	Loss: 0.500312328338623	F1-Score: 1.0
5.	Loss: 0.484281063079834	F1-Score: 1.0
6.	Loss: 0.491383820772171	F1-Score: 1.0
7.	Loss: 0.4945339560508728	F1-Score: 1.0
