## Set-up environment

First, we install 🤗 Transformers, as well as 🤗 Datasets and Seqeval (the latter is useful for evaluation metrics such as F1 on sequence labeling tasks).

In [1]:
!pip install -q git+https://github.com/huggingface/transformers.git

  Installing build dependencies ... [?25l[?25hdone
  Getting requirements to build wheel ... [?25l[?25hdone
  Preparing metadata (pyproject.toml) ... [?25l[?25hdone
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m7.6/7.6 MB[0m [31m65.5 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m190.3/190.3 KB[0m [31m24.0 MB/s[0m eta [36m0:00:00[0m
[?25h  Building wheel for transformers (pyproject.toml) ... [?25l[?25hdone


In [2]:
!pip install -q datasets seqeval

[?25l     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/469.0 KB[0m [31m?[0m eta [36m-:--:--[0m[2K     [91m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[90m╺[0m [32m460.8/469.0 KB[0m [31m20.7 MB/s[0m eta [36m0:00:01[0m[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m469.0/469.0 KB[0m [31m9.3 MB/s[0m eta [36m0:00:00[0m
[?25h[?25l     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/43.6 KB[0m [31m?[0m eta [36m-:--:--[0m[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m43.6/43.6 KB[0m [31m5.9 MB/s[0m eta [36m0:00:00[0m
[?25h  Preparing metadata (setup.py) ... [?25l[?25hdone
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m213.0/213.0 KB[0m [31m11.4 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m110.5/110.5 KB[0m [31m14.9 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m132.0/132.0 KB[0m [31m17.0 MB

## Load dataset

Next, we load a dataset from the 🤗 [hub](https://huggingface.co/datasets/nielsr/funsd-layoutlmv3). This one is the [FUNSD](https://guillaumejaume.github.io/FUNSD/) dataset, a collection of annotated forms.

In [3]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [4]:
import json
from datasets import load_dataset
from PIL import Image
import transformers
from datasets import Features, Sequence, ClassLabel, Value, Array2D, Array3D, load_metric
from transformers import LayoutLMv2Model, LayoutLMv2Config, LayoutLMv2Processor, LayoutXLMTokenizer
from transformers import LayoutXLMProcessor
import numpy as np
from transformers import LayoutLMv2ForTokenClassification, AdamW, TrainingArguments, Trainer,AutoTokenizer
import torch
from tqdm.notebook import tqdm
import pandas as pd

In [5]:
PATH = '/content/drive/MyDrive/educ/data/'

In [6]:
TRAIN_PATH = PATH + 'train.json'
VAL_PATH = PATH + 'validation.json'
TEST_PATH = PATH + 'test.json'

In [7]:
with open(TRAIN_PATH) as outfile:
  data = json.load(outfile)

In [8]:
features = Features({
    'id': Value(dtype='int64', id=None),
    'tokens': Sequence(feature=Value(dtype='string', id=None), length=-1, id=None),
    'bboxes': Sequence(feature=Sequence(feature=Value(dtype='int64', id=None), length=-1, id=None), length=-1, id=None),
    'ner_tags': Sequence(feature=ClassLabel(
        num_classes=7,
        names=['O', 'I-EDUC_DATE', 'I-EDUC_COURSE', 'I-EDUC_LOC', 'I-EDUC_GRADE', 'I-EDUC_DESCRIPTION', 'I-EDUC_SCHOOL']
        , id=None), 
        length=-1, id=None),
    'image': Value(dtype='string', id=None),
    })

In [9]:
def iob_to_label(label):
    """
    Changes the label input in case of there isnt one

    Args:
        label: label of word
        
    Returns:
        label
    """
    
    label = label[2:]
    if not label:
      return 'o'
    return label

In [10]:
train_val_dataset = load_dataset('json', data_files={'train':TRAIN_PATH, 'val': VAL_PATH, 'test': TEST_PATH},field="cvs",features=features)

Downloading and preparing dataset json/default to /root/.cache/huggingface/datasets/json/default-bd9705ca216c7c6a/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51...


Downloading data files:   0%|          | 0/3 [00:00<?, ?it/s]

Extracting data files:   0%|          | 0/3 [00:00<?, ?it/s]

Generating train split: 0 examples [00:00, ? examples/s]

Generating val split: 0 examples [00:00, ? examples/s]

Generating test split: 0 examples [00:00, ? examples/s]

Dataset json downloaded and prepared to /root/.cache/huggingface/datasets/json/default-bd9705ca216c7c6a/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51. Subsequent calls will reuse this data.


  0%|          | 0/3 [00:00<?, ?it/s]

As we can see, the dataset consists of 2 splits ("train" and "test"), and each example contains a list of words ("tokens") with corresponding boxes ("bboxes"), and the words are tagged ("ner_tags"). Each example also include the original image ("image").

In [11]:
train_val_dataset

DatasetDict({
    train: Dataset({
        features: ['id', 'tokens', 'bboxes', 'ner_tags', 'image'],
        num_rows: 63
    })
    val: Dataset({
        features: ['id', 'tokens', 'bboxes', 'ner_tags', 'image'],
        num_rows: 13
    })
    test: Dataset({
        features: ['id', 'tokens', 'bboxes', 'ner_tags', 'image'],
        num_rows: 14
    })
})

Let's check the features:

In [12]:
train_val_dataset["train"].features

{'id': Value(dtype='int64', id=None),
 'tokens': Sequence(feature=Value(dtype='string', id=None), length=-1, id=None),
 'bboxes': Sequence(feature=Sequence(feature=Value(dtype='int64', id=None), length=-1, id=None), length=-1, id=None),
 'ner_tags': Sequence(feature=ClassLabel(names=['O', 'I-EDUC_DATE', 'I-EDUC_COURSE', 'I-EDUC_LOC', 'I-EDUC_GRADE', 'I-EDUC_DESCRIPTION', 'I-EDUC_SCHOOL'], id=None), length=-1, id=None),
 'image': Value(dtype='string', id=None)}

Note that you can directly see the example in a notebook (as the "image" column is of type [Image](https://huggingface.co/docs/datasets/v2.2.1/en/package_reference/main_classes#datasets.Image)).

In [13]:
example = train_val_dataset["train"][0]
example["image"]

'/content/drive/MyDrive/cv_images/CVFP6-1.png'

In [14]:
labels = train_val_dataset['train'].features['ner_tags'].feature.names
labels

['O',
 'I-EDUC_DATE',
 'I-EDUC_COURSE',
 'I-EDUC_LOC',
 'I-EDUC_GRADE',
 'I-EDUC_DESCRIPTION',
 'I-EDUC_SCHOOL']

In [15]:
id2label = {v: k for v, k in enumerate(labels)}
label2id = {k: v for v, k in enumerate(labels)}
label2id

{'O': 0,
 'I-EDUC_DATE': 1,
 'I-EDUC_COURSE': 2,
 'I-EDUC_LOC': 3,
 'I-EDUC_GRADE': 4,
 'I-EDUC_DESCRIPTION': 5,
 'I-EDUC_SCHOOL': 6}

## Prepare dataset

Next, we prepare the dataset for the model. This can be done very easily using `LayoutLMv3Processor`, which internally wraps a `LayoutLMv3FeatureExtractor` (for the image modality) and a `LayoutLMv3Tokenizer` (for the text modality) into one.

Basically, the processor does the following internally:
* the feature extractor is used to resize + normalize each document image into `pixel_values`
* the tokenizer is used to turn the words, boxes and NER tags into token-level `input_ids`, `attention_mask` and `labels`.

The processor simply returns a dictionary that contains all these keys.

In [16]:
from transformers import AutoProcessor

# we'll use the Auto API here - it will load LayoutLMv3Processor behind the scenes,
# based on the checkpoint we provide from the hub
processor = AutoProcessor.from_pretrained("microsoft/layoutlmv3-base", apply_ocr=False)

Downloading (…)rocessor_config.json:   0%|          | 0.00/275 [00:00<?, ?B/s]

Downloading (…)okenizer_config.json:   0%|          | 0.00/1.14k [00:00<?, ?B/s]

Downloading (…)lve/main/config.json:   0%|          | 0.00/856 [00:00<?, ?B/s]

Downloading (…)olve/main/vocab.json:   0%|          | 0.00/899k [00:00<?, ?B/s]

Downloading (…)olve/main/merges.txt:   0%|          | 0.00/456k [00:00<?, ?B/s]

We'll first create `id2label` and label2id mappings, useful for inference. Note that `LayoutLMv3ForTokenClassification` (the model we'll use later on) will simply output an integer index for a particular class (for each token), so we still need to map it to an actual class name.

In [17]:
from datasets.features import ClassLabel

features = train_val_dataset["train"].features
column_names = train_val_dataset["train"].column_names
image_column_name = "image"
text_column_name = "tokens"
boxes_column_name = "bboxes"
label_column_name = "ner_tags"

# In the event the labels are not a `Sequence[ClassLabel]`, we will need to go through the dataset to get the
# unique labels.
def get_label_list(labels):
    unique_labels = set()
    for label in labels:
        unique_labels = unique_labels | set(label)
    label_list = list(unique_labels)
    label_list.sort()
    return label_list

if isinstance(features[label_column_name].feature, ClassLabel):
    label_list = features[label_column_name].feature.names
    # No need to convert the labels since they are already ints.
    id2label = {k: v for k,v in enumerate(label_list)}
    label2id = {v: k for k,v in enumerate(label_list)}
else:
    label_list = get_label_list(dataset["train"][label_column_name])
    id2label = {k: v for k,v in enumerate(label_list)}
    label2id = {v: k for k,v in enumerate(label_list)}
num_labels = len(label_list)

In [18]:
print(label_list)

['O', 'I-EDUC_DATE', 'I-EDUC_COURSE', 'I-EDUC_LOC', 'I-EDUC_GRADE', 'I-EDUC_DESCRIPTION', 'I-EDUC_SCHOOL']


In [19]:
print(id2label)

{0: 'O', 1: 'I-EDUC_DATE', 2: 'I-EDUC_COURSE', 3: 'I-EDUC_LOC', 4: 'I-EDUC_GRADE', 5: 'I-EDUC_DESCRIPTION', 6: 'I-EDUC_SCHOOL'}


Next, we'll define a function which we can apply on the entire dataset.

In [20]:
def prepare_examples(examples):
  images = [Image.open(path).convert("RGB") for path in examples['image']]
  words = examples[text_column_name]
  boxes = examples[boxes_column_name]
  word_labels = examples[label_column_name]

  encoding = processor(images, words, boxes=boxes, word_labels=word_labels,
                             return_overflowing_tokens=True,
                             return_offsets_mapping=True,
                       truncation=True, padding="max_length")
  
  sample_mapping = encoding.pop("overflow_to_sample_mapping")

  offset_mapping = encoding.pop("offset_mapping")


  return encoding

In [21]:
from datasets import Features, Sequence, ClassLabel, Value, Array2D, Array3D

# we need to define custom features for `set_format` (used later on) to work properly
features = Features({
    'pixel_values': Array3D(dtype="float32", shape=(3, 224, 224)),
    'input_ids': Sequence(feature=Value(dtype='int64')),
    'attention_mask': Sequence(Value(dtype='int64')),
    'bbox': Array2D(dtype="int64", shape=(512, 4)),
    'labels': Sequence(feature=Value(dtype='int64')),
})

train_dataset = train_val_dataset["train"].map(
    prepare_examples,
    batched=True,
    remove_columns=column_names,
    features=features,
)
eval_dataset = train_val_dataset["val"].map(
    prepare_examples,
    batched=True,
    remove_columns=column_names,
    features=features,
)
test_dataset = train_val_dataset["test"].map(
    prepare_examples,
    batched=True,
    remove_columns=column_names,
    features=features,
)

Map:   0%|          | 0/63 [00:00<?, ? examples/s]

Map:   0%|          | 0/13 [00:00<?, ? examples/s]

Map:   0%|          | 0/14 [00:00<?, ? examples/s]

In [22]:
train_dataset.features

{'pixel_values': Array3D(shape=(3, 224, 224), dtype='float32', id=None),
 'input_ids': Sequence(feature=Value(dtype='int64', id=None), length=-1, id=None),
 'attention_mask': Sequence(feature=Value(dtype='int64', id=None), length=-1, id=None),
 'bbox': Array2D(shape=(512, 4), dtype='int64', id=None),
 'labels': Sequence(feature=Value(dtype='int64', id=None), length=-1, id=None)}

In [23]:
example = train_dataset[0]
processor.tokenizer.decode(example["input_ids"])

'<s> EDUCATION Specialization,: MANAGEMENT, TECHNOLOGY AND SUSTAINABILITY, UniversityofValedoRiodosSinos–2010/11. MBA,BUSINESSSTRATEGY,UniversityofSouthernSantaCatarina–2006/08. BachelorofBUSINESSADMINISTRATION,UniversityofSinosValley,1998/03– Business Intelligence – Competitive Environment Monitoring – CLADEA congress,Lima(2003)/SLADEcongress,Camburiú–SC(2004).</s><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><p

Next, we set the format to PyTorch.

In [24]:
train_dataset.set_format("torch")

Let's verify that everything was created properly:

In [25]:
import torch

example = train_dataset[0]
for k,v in example.items():
    print(k,v.shape)

pixel_values torch.Size([3, 224, 224])
input_ids torch.Size([512])
attention_mask torch.Size([512])
bbox torch.Size([512, 4])
labels torch.Size([512])


In [26]:
eval_dataset

Dataset({
    features: ['pixel_values', 'input_ids', 'attention_mask', 'bbox', 'labels'],
    num_rows: 14
})

In [27]:
processor.tokenizer.decode(eval_dataset[0]["input_ids"])

'<s> Education: CAPM Exam Prep Seminar - PMBOK Guide, Sixth Edition – Apr 2019 - Udemy Agile Project Management – PMI-ACP Certification Program – Mar 2019 - Udemy Scrum Fundamentals – Feb 2019 - Pluralsight Project 2016 for Business Professionals – Feb 2019 - Pluralsight CompTIA Project+ series, covering the fundamentals of IT project management - Pluralsight – Feb 2019 CompTIA Project+ (PK0-004) Path – Pluralsight – Nov 2018 AgilePM Project Management – Pluralsight – Apr 2018 PM CompTIA Project+ (Part 1 / Part 2) – Pluralsight – Mar 2018 Computer Science Degree – ISTEC – class of 2009-2012</s><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad><pad

In [None]:
for id, label in zip(train_dataset[0]["input_ids"], train_dataset[0]["labels"]):
  print(processor.tokenizer.decode([id]), label.item())

## Define metrics

Next, we define a `compute_metrics` function, which is used by the Trainer to ... compute metrics.

This function should take a named tuple as input, and return a dictionary as output as stated in the [docs](https://huggingface.co/docs/transformers/main_classes/trainer).

In [29]:
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
device

device(type='cuda')

In [30]:
train_dataset.set_format(type="torch", device=device)
eval_dataset.set_format(type="torch", device=device)
test_dataset.set_format(type="torch", device=device)

In [31]:
from torch.utils.data import DataLoader

train_dataloader = DataLoader(train_dataset, batch_size=4, shuffle=True)
val_dataloader = DataLoader(eval_dataset, batch_size=2, shuffle=True)
test_dataloader = DataLoader(test_dataset, batch_size=2, shuffle=True)

In [32]:
from datasets import load_metric

metric = load_metric("seqeval")

  metric = load_metric("seqeval")


Downloading builder script:   0%|          | 0.00/2.47k [00:00<?, ?B/s]

In [33]:
import numpy as np

return_entity_level_metrics = False

def compute_metrics(p):
    predictions, labels = p
    predictions = np.argmax(predictions, axis=2)

    # Remove ignored index (special tokens)
    true_predictions = [
        [label_list[p] for (p, l) in zip(prediction, label) if l != -100]
        for prediction, label in zip(predictions, labels)
    ]
    true_labels = [
        [label_list[l] for (p, l) in zip(prediction, label) if l != -100]
        for prediction, label in zip(predictions, labels)
    ]

    results = metric.compute(predictions=true_predictions, references=true_labels)
    if return_entity_level_metrics:
        # Unpack nested dictionaries
        final_results = {}
        for key, value in results.items():
            if isinstance(value, dict):
                for n, v in value.items():
                    final_results[f"{key}_{n}"] = v
            else:
                final_results[key] = value
        return final_results
    else:
        return {
            "precision": results["overall_precision"],
            "recall": results["overall_recall"],
            "f1": results["overall_f1"],
            "accuracy": results["overall_accuracy"],
        }

## Define the model

Next we define the model: this is a Transformer encoder with pre-trained weights, and a randomly initialized head on top for token classification.

In [34]:
from torch import nn
from transformers import Trainer


class CustomTrainer(Trainer):
  pass
  """
    def compute_loss(self, model, inputs, return_outputs=False):
        labels = inputs.get("labels")
        # forward pass
        outputs = model(**inputs)
        logits = outputs.get("logits")
        # compute custom loss (suppose one has 3 labels with different weights)
        loss_fct = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 2.0, 3.0])) / biggest
        loss = loss_fct(logits.view(-1, self.model.config.num_labels), labels.view(-1))
        return (loss, outputs) if return_outputs else loss"""

In [None]:
from transformers import LayoutLMv3ForTokenClassification

model = LayoutLMv3ForTokenClassification.from_pretrained("microsoft/layoutlmv3-base",
                                                         id2label=id2label,
                                                         label2id=label2id)
model.to(device)

In [36]:
# Set id2label and label2id 
model.config.id2label = id2label
model.config.label2id = label2id

In [37]:
from torch.utils.tensorboard import SummaryWriter
# Writer will output to ./runs/ directory by default
writer = SummaryWriter()

In [38]:
import torch.nn as nn
import os
model.to(device)
optimizer = AdamW(model.parameters(), lr=5e-5)
loss_fct = nn.CrossEntropyLoss()
global_step = 0
num_train_epochs = 20
t_total = len(train_dataloader) * num_train_epochs # total number of training steps 

metric = load_metric("seqeval")
return_entity_level_metrics = True

min_valid_loss = np.inf

counter = 0
patience_counter = 0

for epoch in range(num_train_epochs):
  running_loss = 0
  correct=0
  total=0
  print("Epoch:", epoch)
  for batch in tqdm(train_dataloader):
    #put the model in training mode
    model.train() 
    # zero the parameter gradients
    optimizer.zero_grad()

    # forward + backward + optimize
    outputs = model(**batch) 

    predictions = outputs.logits.argmax(dim=2)

    true_predictions = [
      [id2label[p.item()] for (p, l) in zip(prediction, label) if l != -100]
      for prediction, label in zip(predictions,batch['labels'])
      ]
    true_labels = [
      [id2label[l.item()] for (p, l) in zip(prediction, label) if l != -100]
      for prediction, label in zip(predictions,  batch['labels'])
      ]
    metric.add_batch(predictions=true_predictions, references=true_labels)
    labels = batch['labels']
    logits = outputs.get("logits")
    loss = loss_fct(logits.view(-1, model.config.num_labels), labels.view(-1))
    writer.add_scalar("Loss/train", loss, epoch)
        
    # print loss every 100 steps
    if global_step % 100 == 0:
      print(f"Loss after {global_step} steps: {loss.item()}")
      final_score = metric.compute(predictions=true_predictions, references=true_labels)
      print(final_score)
      writer.add_scalar("overall_precision/train", final_score["overall_precision"], epoch)
      writer.add_scalar("overall_recall/train", final_score["overall_recall"], epoch)
      writer.add_scalar("overall_f1/train", final_score["overall_f1"], epoch)
      writer.add_scalar("overall_accuracy/train", final_score["overall_accuracy"], epoch)

    loss.backward()
    optimizer.step()

    # Incrementing loss
    running_loss += loss.item()

    global_step += 1

  valid_loss = 0.0
  model.eval()
  for batch in tqdm(val_dataloader, desc="Evaluating"):
    with torch.no_grad():
      # forward pass
      outputs = model(**batch) 
      
      labels = batch['labels']
      logits = outputs.get("logits")
      loss = loss_fct(logits.view(-1, model.config.num_labels), labels.view(-1))
      writer.add_scalar("Loss/val", loss, epoch)
      # Incrementing loss
      valid_loss += loss.item()
    
  # Averaging out loss over entire batch
  running_loss /= len(train_dataloader)
  valid_loss /= len(val_dataloader)

  print('Training loss: {} \t\t Validation Loss: {}'.format(running_loss, valid_loss))
  # predictions
  predictions = outputs.logits.argmax(dim=2)

  # Remove ignored index (special tokens)
  true_predictions = [
      [id2label[p.item()] for (p, l) in zip(prediction, label) if l != -100]
      for prediction, label in zip(predictions, batch['labels'])
  ]
  true_labels = [
      [id2label[l.item()] for (p, l) in zip(prediction, label) if l != -100]
      for prediction, label in zip(predictions, batch['labels'])
  ]

  metric.add_batch(predictions=true_predictions, references=true_labels)

  final_score = metric.compute(predictions=true_predictions, references=true_labels)
  print(final_score)
  writer.add_scalar("overall_precision/val", final_score["overall_precision"], epoch)
  writer.add_scalar("overall_recall/val", final_score["overall_recall"], epoch)
  writer.add_scalar("overall_f1/val", final_score["overall_f1"], epoch)
  writer.add_scalar("overall_accuracy/val", final_score["overall_accuracy"], epoch)

  if min_valid_loss > valid_loss :
    print(f'Validation Loss Decreased({min_valid_loss:^.6f}--->{valid_loss:^.6f})')
    min_valid_loss = valid_loss        

  accu = 0
  if final_score['overall_accuracy'] > accu:
      accu = final_score['overall_accuracy']
      name_model = f"/content/modelLMv3/model-{accu:.3f}"
      
      model.save_pretrained(name_model)
      model.save_pretrained("/content/drive/MyDrive/modelLMv3")
      torch.save(model.state_dict(), "/content/drive/MyDrive/modelLMv3.pt")
  else:
        patience_counter += 1
        if patience_counter >= 3:
            print(f'Validation Loss did not improve for {patience_counter} epochs. Stopping training.')
            break



Epoch: 0


  0%|          | 0/16 [00:00<?, ?it/s]



Loss after 0 steps: 1.9534953832626343
{'EDUC_COURSE': {'precision': 0.0, 'recall': 0.0, 'f1': 0.0, 'number': 38}, 'EDUC_DATE': {'precision': 0.0, 'recall': 0.0, 'f1': 0.0, 'number': 34}, 'EDUC_DESCRIPTION': {'precision': 0.0, 'recall': 0.0, 'f1': 0.0, 'number': 2}, 'EDUC_GRADE': {'precision': 0.0, 'recall': 0.0, 'f1': 0.0, 'number': 2}, 'EDUC_LOC': {'precision': 0.0, 'recall': 0.0, 'f1': 0.0, 'number': 2}, 'EDUC_SCHOOL': {'precision': 0.0, 'recall': 0.0, 'f1': 0.0, 'number': 38}, 'overall_precision': 0.0, 'overall_recall': 0.0, 'overall_f1': 0.0, 'overall_accuracy': 0.2081447963800905}


Evaluating:   0%|          | 0/7 [00:00<?, ?it/s]

Training loss: 1.5416271835565567 		 Validation Loss: 1.0936289855412074


  _warn_prf(average, modifier, msg_start, len(result))


{'EDUC_COURSE': {'precision': 0.09188034188034189, 'recall': 0.1853448275862069, 'f1': 0.12285714285714286, 'number': 232}, 'EDUC_DATE': {'precision': 0.43946188340807174, 'recall': 0.5157894736842106, 'f1': 0.47457627118644075, 'number': 190}, 'EDUC_DESCRIPTION': {'precision': 0.0036363636363636364, 'recall': 0.020833333333333332, 'f1': 0.006191950464396285, 'number': 48}, 'EDUC_GRADE': {'precision': 0.0, 'recall': 0.0, 'f1': 0.0, 'number': 19}, 'EDUC_LOC': {'precision': 0.0, 'recall': 0.0, 'f1': 0.0, 'number': 78}, 'EDUC_SCHOOL': {'precision': 0.125, 'recall': 0.1377551020408163, 'f1': 0.13106796116504854, 'number': 196}, 'overall_precision': 0.14297800338409475, 'overall_recall': 0.2214941022280472, 'overall_f1': 0.1737789203084833, 'overall_accuracy': 0.43329032258064515}
Validation Loss Decreased(inf--->1.093629)
Epoch: 1


  0%|          | 0/16 [00:00<?, ?it/s]

Evaluating:   0%|          | 0/7 [00:00<?, ?it/s]

Training loss: 0.8886784538626671 		 Validation Loss: 1.3759466069085258
{'EDUC_COURSE': {'precision': 0.27684964200477324, 'recall': 0.46586345381526106, 'f1': 0.34730538922155685, 'number': 249}, 'EDUC_DATE': {'precision': 0.5910780669144982, 'recall': 0.775609756097561, 'f1': 0.6708860759493671, 'number': 205}, 'EDUC_DESCRIPTION': {'precision': 0.022935779816513763, 'recall': 0.10204081632653061, 'f1': 0.03745318352059925, 'number': 49}, 'EDUC_GRADE': {'precision': 0.0, 'recall': 0.0, 'f1': 0.0, 'number': 20}, 'EDUC_LOC': {'precision': 0.41304347826086957, 'recall': 0.24050632911392406, 'f1': 0.304, 'number': 79}, 'EDUC_SCHOOL': {'precision': 0.2850877192982456, 'recall': 0.2995391705069124, 'f1': 0.2921348314606741, 'number': 217}, 'overall_precision': 0.30847457627118646, 'overall_recall': 0.4444444444444444, 'overall_f1': 0.3641820910455228, 'overall_accuracy': 0.6819066147859922}
Epoch: 2


  0%|          | 0/16 [00:00<?, ?it/s]

Evaluating:   0%|          | 0/7 [00:00<?, ?it/s]

Training loss: 0.7819925993680954 		 Validation Loss: 0.8070986930813108
{'EDUC_COURSE': {'precision': 0.34563758389261745, 'recall': 0.4309623430962343, 'f1': 0.3836126629422719, 'number': 239}, 'EDUC_DATE': {'precision': 0.7914691943127962, 'recall': 0.8564102564102564, 'f1': 0.8226600985221675, 'number': 195}, 'EDUC_DESCRIPTION': {'precision': 0.06040268456375839, 'recall': 0.1836734693877551, 'f1': 0.09090909090909091, 'number': 49}, 'EDUC_GRADE': {'precision': 0.0, 'recall': 0.0, 'f1': 0.0, 'number': 20}, 'EDUC_LOC': {'precision': 0.631578947368421, 'recall': 0.5925925925925926, 'f1': 0.6114649681528662, 'number': 81}, 'EDUC_SCHOOL': {'precision': 0.44171779141104295, 'recall': 0.34782608695652173, 'f1': 0.3891891891891892, 'number': 207}, 'overall_precision': 0.44481605351170567, 'overall_recall': 0.504424778761062, 'overall_f1': 0.47274881516587675, 'overall_accuracy': 0.7485}
Validation Loss Decreased(1.093629--->0.807099)
Epoch: 3


  0%|          | 0/16 [00:00<?, ?it/s]

Evaluating:   0%|          | 0/7 [00:00<?, ?it/s]

Training loss: 0.46106266789138317 		 Validation Loss: 0.954510258776801
{'EDUC_COURSE': {'precision': 0.5622895622895623, 'recall': 0.6600790513833992, 'f1': 0.6072727272727273, 'number': 253}, 'EDUC_DATE': {'precision': 0.762114537444934, 'recall': 0.8277511961722488, 'f1': 0.7935779816513763, 'number': 209}, 'EDUC_DESCRIPTION': {'precision': 0.14678899082568808, 'recall': 0.32653061224489793, 'f1': 0.20253164556962025, 'number': 49}, 'EDUC_GRADE': {'precision': 0.6666666666666666, 'recall': 0.1, 'f1': 0.1739130434782609, 'number': 20}, 'EDUC_LOC': {'precision': 0.47191011235955055, 'recall': 0.5316455696202531, 'f1': 0.4999999999999999, 'number': 79}, 'EDUC_SCHOOL': {'precision': 0.5022421524663677, 'recall': 0.5067873303167421, 'f1': 0.5045045045045045, 'number': 221}, 'overall_precision': 0.540084388185654, 'overall_recall': 0.6161251504211793, 'overall_f1': 0.5756042720629566, 'overall_accuracy': 0.8460240963855422}
Epoch: 4


  0%|          | 0/16 [00:00<?, ?it/s]

Evaluating:   0%|          | 0/7 [00:00<?, ?it/s]

Training loss: 0.32280629966408014 		 Validation Loss: 1.0345483933176314
{'EDUC_COURSE': {'precision': 0.7550200803212851, 'recall': 0.8068669527896996, 'f1': 0.7800829875518672, 'number': 233}, 'EDUC_DATE': {'precision': 0.9090909090909091, 'recall': 0.9424083769633508, 'f1': 0.9254498714652956, 'number': 191}, 'EDUC_DESCRIPTION': {'precision': 0.45614035087719296, 'recall': 0.5098039215686274, 'f1': 0.48148148148148145, 'number': 51}, 'EDUC_GRADE': {'precision': 0.5, 'recall': 0.3, 'f1': 0.37499999999999994, 'number': 20}, 'EDUC_LOC': {'precision': 0.6375, 'recall': 0.6296296296296297, 'f1': 0.6335403726708074, 'number': 81}, 'EDUC_SCHOOL': {'precision': 0.6568627450980392, 'recall': 0.6733668341708543, 'f1': 0.6650124069478909, 'number': 199}, 'overall_precision': 0.73125, 'overall_recall': 0.7548387096774194, 'overall_f1': 0.7428571428571429, 'overall_accuracy': 0.9206106870229007}
Epoch: 5


  0%|          | 0/16 [00:00<?, ?it/s]

Evaluating:   0%|          | 0/7 [00:00<?, ?it/s]

Training loss: 0.2779663628898561 		 Validation Loss: 0.7981693318911961
{'EDUC_COURSE': {'precision': 0.7558139534883721, 'recall': 0.8297872340425532, 'f1': 0.7910750507099391, 'number': 235}, 'EDUC_DATE': {'precision': 0.9170731707317074, 'recall': 0.9740932642487047, 'f1': 0.9447236180904522, 'number': 193}, 'EDUC_DESCRIPTION': {'precision': 0.3116883116883117, 'recall': 0.47058823529411764, 'f1': 0.37499999999999994, 'number': 51}, 'EDUC_GRADE': {'precision': 0.6666666666666666, 'recall': 0.6, 'f1': 0.631578947368421, 'number': 20}, 'EDUC_LOC': {'precision': 0.7294117647058823, 'recall': 0.7654320987654321, 'f1': 0.7469879518072289, 'number': 81}, 'EDUC_SCHOOL': {'precision': 0.7511737089201878, 'recall': 0.7804878048780488, 'f1': 0.7655502392344498, 'number': 205}, 'overall_precision': 0.7488317757009346, 'overall_recall': 0.8165605095541402, 'overall_f1': 0.7812309567336989, 'overall_accuracy': 0.9159136546184738}
Validation Loss Decreased(0.807099--->0.798169)
Epoch: 6


  0%|          | 0/16 [00:00<?, ?it/s]

Loss after 100 steps: 0.1823967844247818
{'EDUC_COURSE': {'precision': 0.7432432432432432, 'recall': 0.6875, 'f1': 0.7142857142857143, 'number': 80}, 'EDUC_DATE': {'precision': 1.0, 'recall': 1.0, 'f1': 1.0, 'number': 58}, 'EDUC_DESCRIPTION': {'precision': 0.5277777777777778, 'recall': 0.9047619047619048, 'f1': 0.6666666666666666, 'number': 21}, 'EDUC_GRADE': {'precision': 1.0, 'recall': 0.75, 'f1': 0.8571428571428571, 'number': 8}, 'EDUC_LOC': {'precision': 0.38235294117647056, 'recall': 0.43333333333333335, 'f1': 0.40625, 'number': 30}, 'EDUC_SCHOOL': {'precision': 0.7368421052631579, 'recall': 0.7368421052631579, 'f1': 0.7368421052631579, 'number': 57}, 'overall_precision': 0.7283018867924528, 'overall_recall': 0.7598425196850394, 'overall_f1': 0.7437379576107899, 'overall_accuracy': 0.9292364990689013}


Evaluating:   0%|          | 0/7 [00:00<?, ?it/s]

Training loss: 0.196627531433478 		 Validation Loss: 0.8716779404452869
{'EDUC_COURSE': {'precision': 0.8102564102564103, 'recall': 0.8926553672316384, 'f1': 0.8494623655913979, 'number': 177}, 'EDUC_DATE': {'precision': 0.9668874172185431, 'recall': 0.9605263157894737, 'f1': 0.9636963696369637, 'number': 152}, 'EDUC_DESCRIPTION': {'precision': 0.38095238095238093, 'recall': 0.5333333333333333, 'f1': 0.4444444444444444, 'number': 30}, 'EDUC_GRADE': {'precision': 0.36363636363636365, 'recall': 0.3333333333333333, 'f1': 0.34782608695652173, 'number': 12}, 'EDUC_LOC': {'precision': 0.7323943661971831, 'recall': 0.7222222222222222, 'f1': 0.7272727272727272, 'number': 72}, 'EDUC_SCHOOL': {'precision': 0.7853107344632768, 'recall': 0.8424242424242424, 'f1': 0.8128654970760234, 'number': 165}, 'overall_precision': 0.7959814528593508, 'overall_recall': 0.8470394736842105, 'overall_f1': 0.8207171314741036, 'overall_accuracy': 0.9117647058823529}
Epoch: 7


  0%|          | 0/16 [00:00<?, ?it/s]

Evaluating:   0%|          | 0/7 [00:00<?, ?it/s]

Training loss: 0.13870725757442415 		 Validation Loss: 0.8914924114942551
{'EDUC_COURSE': {'precision': 0.8508064516129032, 'recall': 0.8612244897959184, 'f1': 0.8559837728194727, 'number': 245}, 'EDUC_DATE': {'precision': 0.9512195121951219, 'recall': 0.9605911330049262, 'f1': 0.9558823529411765, 'number': 203}, 'EDUC_DESCRIPTION': {'precision': 0.7755102040816326, 'recall': 0.7755102040816326, 'f1': 0.7755102040816326, 'number': 49}, 'EDUC_GRADE': {'precision': 0.6470588235294118, 'recall': 0.55, 'f1': 0.5945945945945946, 'number': 20}, 'EDUC_LOC': {'precision': 0.71, 'recall': 0.7634408602150538, 'f1': 0.7357512953367875, 'number': 93}, 'EDUC_SCHOOL': {'precision': 0.7436974789915967, 'recall': 0.8309859154929577, 'f1': 0.7849223946784923, 'number': 213}, 'overall_precision': 0.8203033838973163, 'overall_recall': 0.8541919805589308, 'overall_f1': 0.8369047619047619, 'overall_accuracy': 0.9482840800762631}
Epoch: 8


  0%|          | 0/16 [00:00<?, ?it/s]

Evaluating:   0%|          | 0/7 [00:00<?, ?it/s]

Training loss: 0.10109345463570207 		 Validation Loss: 1.07657554852111
{'EDUC_COURSE': {'precision': 0.8571428571428571, 'recall': 0.9446808510638298, 'f1': 0.8987854251012146, 'number': 235}, 'EDUC_DATE': {'precision': 0.9323671497584541, 'recall': 0.9507389162561576, 'f1': 0.9414634146341464, 'number': 203}, 'EDUC_DESCRIPTION': {'precision': 0.7377049180327869, 'recall': 0.7377049180327869, 'f1': 0.7377049180327869, 'number': 61}, 'EDUC_GRADE': {'precision': 0.46153846153846156, 'recall': 0.3, 'f1': 0.3636363636363637, 'number': 20}, 'EDUC_LOC': {'precision': 0.7127659574468085, 'recall': 0.7362637362637363, 'f1': 0.7243243243243244, 'number': 91}, 'EDUC_SCHOOL': {'precision': 0.8237885462555066, 'recall': 0.8697674418604651, 'f1': 0.8461538461538461, 'number': 215}, 'overall_precision': 0.8362369337979094, 'overall_recall': 0.8727272727272727, 'overall_f1': 0.8540925266903914, 'overall_accuracy': 0.9595170454545454}
Epoch: 9


  0%|          | 0/16 [00:00<?, ?it/s]

Evaluating:   0%|          | 0/7 [00:00<?, ?it/s]

Training loss: 0.0709370577824302 		 Validation Loss: 1.112668263060706
{'EDUC_COURSE': {'precision': 0.9433198380566802, 'recall': 0.9357429718875502, 'f1': 0.9395161290322581, 'number': 249}, 'EDUC_DATE': {'precision': 0.9805825242718447, 'recall': 0.9853658536585366, 'f1': 0.9829683698296836, 'number': 205}, 'EDUC_DESCRIPTION': {'precision': 0.7692307692307693, 'recall': 0.8163265306122449, 'f1': 0.7920792079207921, 'number': 49}, 'EDUC_GRADE': {'precision': 0.6, 'recall': 0.6, 'f1': 0.6, 'number': 20}, 'EDUC_LOC': {'precision': 0.7872340425531915, 'recall': 0.9367088607594937, 'f1': 0.8554913294797688, 'number': 79}, 'EDUC_SCHOOL': {'precision': 0.9054726368159204, 'recall': 0.8387096774193549, 'f1': 0.8708133971291866, 'number': 217}, 'overall_precision': 0.9060975609756098, 'overall_recall': 0.9072039072039072, 'overall_f1': 0.9066503965832825, 'overall_accuracy': 0.9710603112840467}
Epoch: 10


  0%|          | 0/16 [00:00<?, ?it/s]

Evaluating:   0%|          | 0/7 [00:00<?, ?it/s]

Training loss: 0.05766146839596331 		 Validation Loss: 1.3000341483524867
{'EDUC_COURSE': {'precision': 0.9076305220883534, 'recall': 0.9617021276595744, 'f1': 0.933884297520661, 'number': 235}, 'EDUC_DATE': {'precision': 0.9563106796116505, 'recall': 0.9704433497536946, 'f1': 0.9633251833740832, 'number': 203}, 'EDUC_DESCRIPTION': {'precision': 0.847457627118644, 'recall': 0.819672131147541, 'f1': 0.8333333333333333, 'number': 61}, 'EDUC_GRADE': {'precision': 0.8, 'recall': 0.8, 'f1': 0.8000000000000002, 'number': 20}, 'EDUC_LOC': {'precision': 0.8791208791208791, 'recall': 0.8791208791208791, 'f1': 0.8791208791208791, 'number': 91}, 'EDUC_SCHOOL': {'precision': 0.8564814814814815, 'recall': 0.8604651162790697, 'f1': 0.8584686774941995, 'number': 215}, 'overall_precision': 0.896551724137931, 'overall_recall': 0.9139393939393939, 'overall_f1': 0.9051620648259304, 'overall_accuracy': 0.970407196969697}
Epoch: 11


  0%|          | 0/16 [00:00<?, ?it/s]

Evaluating:   0%|          | 0/7 [00:00<?, ?it/s]

Training loss: 0.04218552174279466 		 Validation Loss: 0.923192675624575
{'EDUC_COURSE': {'precision': 0.9585062240663901, 'recall': 0.9585062240663901, 'f1': 0.9585062240663901, 'number': 241}, 'EDUC_DATE': {'precision': 0.9949238578680203, 'recall': 0.9949238578680203, 'f1': 0.9949238578680203, 'number': 197}, 'EDUC_DESCRIPTION': {'precision': 0.8695652173913043, 'recall': 0.8163265306122449, 'f1': 0.8421052631578948, 'number': 49}, 'EDUC_GRADE': {'precision': 0.8947368421052632, 'recall': 0.85, 'f1': 0.8717948717948718, 'number': 20}, 'EDUC_LOC': {'precision': 0.9743589743589743, 'recall': 0.9382716049382716, 'f1': 0.9559748427672956, 'number': 81}, 'EDUC_SCHOOL': {'precision': 0.9519230769230769, 'recall': 0.9473684210526315, 'f1': 0.9496402877697842, 'number': 209}, 'overall_precision': 0.9607097591888466, 'overall_recall': 0.9510664993726474, 'overall_f1': 0.9558638083228246, 'overall_accuracy': 0.9895781637717121}
Epoch: 12


  0%|          | 0/16 [00:00<?, ?it/s]

Loss after 200 steps: 0.013623776845633984
{'EDUC_COURSE': {'precision': 0.9805194805194806, 'recall': 0.993421052631579, 'f1': 0.9869281045751634, 'number': 152}, 'EDUC_DATE': {'precision': 1.0, 'recall': 1.0, 'f1': 1.0, 'number': 132}, 'EDUC_DESCRIPTION': {'precision': 1.0, 'recall': 1.0, 'f1': 1.0, 'number': 28}, 'EDUC_GRADE': {'precision': 0.7777777777777778, 'recall': 0.7777777777777778, 'f1': 0.7777777777777778, 'number': 9}, 'EDUC_LOC': {'precision': 1.0, 'recall': 1.0, 'f1': 1.0, 'number': 52}, 'EDUC_SCHOOL': {'precision': 0.9784172661870504, 'recall': 0.9784172661870504, 'f1': 0.9784172661870504, 'number': 139}, 'overall_precision': 0.9844357976653697, 'overall_recall': 0.98828125, 'overall_f1': 0.98635477582846, 'overall_accuracy': 0.9954394693200663}


Evaluating:   0%|          | 0/7 [00:00<?, ?it/s]

Training loss: 0.028523065790068358 		 Validation Loss: 1.144701445741313
{'EDUC_COURSE': {'precision': 0.9811320754716981, 'recall': 0.9811320754716981, 'f1': 0.9811320754716981, 'number': 106}, 'EDUC_DATE': {'precision': 0.9761904761904762, 'recall': 0.9761904761904762, 'f1': 0.9761904761904762, 'number': 84}, 'EDUC_DESCRIPTION': {'precision': 0.9583333333333334, 'recall': 0.7931034482758621, 'f1': 0.8679245283018867, 'number': 29}, 'EDUC_GRADE': {'precision': 0.8461538461538461, 'recall': 0.8461538461538461, 'f1': 0.8461538461538461, 'number': 13}, 'EDUC_LOC': {'precision': 0.7058823529411765, 'recall': 0.6923076923076923, 'f1': 0.6990291262135924, 'number': 52}, 'EDUC_SCHOOL': {'precision': 0.8043478260869565, 'recall': 0.8505747126436781, 'f1': 0.8268156424581005, 'number': 87}, 'overall_precision': 0.8918918918918919, 'overall_recall': 0.889487870619946, 'overall_f1': 0.8906882591093117, 'overall_accuracy': 0.9389846297158826}
Epoch: 13


  0%|          | 0/16 [00:00<?, ?it/s]

Evaluating:   0%|          | 0/7 [00:00<?, ?it/s]

Training loss: 0.023669635033002123 		 Validation Loss: 1.0210979995982987
{'EDUC_COURSE': {'precision': 0.9612403100775194, 'recall': 0.9575289575289575, 'f1': 0.9593810444874276, 'number': 259}, 'EDUC_DATE': {'precision': 0.9629629629629629, 'recall': 0.9585253456221198, 'f1': 0.9607390300230947, 'number': 217}, 'EDUC_DESCRIPTION': {'precision': 0.9215686274509803, 'recall': 0.9591836734693877, 'f1': 0.9400000000000001, 'number': 49}, 'EDUC_GRADE': {'precision': 0.7727272727272727, 'recall': 0.85, 'f1': 0.8095238095238095, 'number': 20}, 'EDUC_LOC': {'precision': 0.7572815533980582, 'recall': 0.8387096774193549, 'f1': 0.7959183673469389, 'number': 93}, 'EDUC_SCHOOL': {'precision': 0.8908296943231441, 'recall': 0.8986784140969163, 'f1': 0.894736842105263, 'number': 227}, 'overall_precision': 0.9124004550625711, 'overall_recall': 0.9271676300578034, 'overall_f1': 0.9197247706422017, 'overall_accuracy': 0.9656998158379374}
Epoch: 14


  0%|          | 0/16 [00:00<?, ?it/s]

Evaluating:   0%|          | 0/7 [00:00<?, ?it/s]

Training loss: 0.02190876376698725 		 Validation Loss: 1.0012558017458235
{'EDUC_COURSE': {'precision': 0.9914529914529915, 'recall': 0.9872340425531915, 'f1': 0.9893390191897654, 'number': 235}, 'EDUC_DATE': {'precision': 0.9740932642487047, 'recall': 0.9842931937172775, 'f1': 0.9791666666666667, 'number': 191}, 'EDUC_DESCRIPTION': {'precision': 0.9787234042553191, 'recall': 0.9387755102040817, 'f1': 0.9583333333333333, 'number': 49}, 'EDUC_GRADE': {'precision': 0.7894736842105263, 'recall': 0.75, 'f1': 0.7692307692307692, 'number': 20}, 'EDUC_LOC': {'precision': 0.9746835443037974, 'recall': 0.9746835443037974, 'f1': 0.9746835443037974, 'number': 79}, 'EDUC_SCHOOL': {'precision': 0.9547738693467337, 'recall': 0.9547738693467337, 'f1': 0.9547738693467337, 'number': 199}, 'overall_precision': 0.9701686121919585, 'overall_recall': 0.9676584734799483, 'overall_f1': 0.9689119170984456, 'overall_accuracy': 0.9926395939086294}
Epoch: 15


  0%|          | 0/16 [00:00<?, ?it/s]

Evaluating:   0%|          | 0/7 [00:00<?, ?it/s]

Training loss: 0.02779653889592737 		 Validation Loss: 1.3116334932191032
{'EDUC_COURSE': {'precision': 0.9718875502008032, 'recall': 0.9877551020408163, 'f1': 0.9797570850202428, 'number': 245}, 'EDUC_DATE': {'precision': 0.9900497512437811, 'recall': 0.9900497512437811, 'f1': 0.9900497512437811, 'number': 201}, 'EDUC_DESCRIPTION': {'precision': 1.0, 'recall': 1.0, 'f1': 1.0, 'number': 49}, 'EDUC_GRADE': {'precision': 0.75, 'recall': 0.75, 'f1': 0.75, 'number': 20}, 'EDUC_LOC': {'precision': 0.9036144578313253, 'recall': 0.9493670886075949, 'f1': 0.9259259259259259, 'number': 79}, 'EDUC_SCHOOL': {'precision': 0.9036697247706422, 'recall': 0.9162790697674419, 'f1': 0.9099307159353349, 'number': 215}, 'overall_precision': 0.947560975609756, 'overall_recall': 0.9604449938195303, 'overall_f1': 0.9539594843462246, 'overall_accuracy': 0.9905707196029777}
Epoch: 16


  0%|          | 0/16 [00:00<?, ?it/s]

Evaluating:   0%|          | 0/7 [00:00<?, ?it/s]

Training loss: 0.02045657172857318 		 Validation Loss: 1.3303493548716818
{'EDUC_COURSE': {'precision': 0.9791666666666666, 'recall': 0.9832635983263598, 'f1': 0.9812108559498954, 'number': 239}, 'EDUC_DATE': {'precision': 0.9948717948717949, 'recall': 0.9948717948717949, 'f1': 0.9948717948717949, 'number': 195}, 'EDUC_DESCRIPTION': {'precision': 0.9803921568627451, 'recall': 0.9803921568627451, 'f1': 0.9803921568627451, 'number': 51}, 'EDUC_GRADE': {'precision': 0.8181818181818182, 'recall': 0.8181818181818182, 'f1': 0.8181818181818182, 'number': 22}, 'EDUC_LOC': {'precision': 0.9875, 'recall': 0.9753086419753086, 'f1': 0.9813664596273292, 'number': 81}, 'EDUC_SCHOOL': {'precision': 0.9577464788732394, 'recall': 0.9855072463768116, 'f1': 0.9714285714285714, 'number': 207}, 'overall_precision': 0.9737827715355806, 'overall_recall': 0.9811320754716981, 'overall_f1': 0.9774436090225566, 'overall_accuracy': 0.9950617283950617}
Epoch: 17


  0%|          | 0/16 [00:00<?, ?it/s]

Evaluating:   0%|          | 0/7 [00:00<?, ?it/s]

Training loss: 0.01844534432166256 		 Validation Loss: 1.5556584458265985
{'EDUC_COURSE': {'precision': 0.9512195121951219, 'recall': 0.970954356846473, 'f1': 0.9609856262833675, 'number': 241}, 'EDUC_DATE': {'precision': 0.9949238578680203, 'recall': 0.9949238578680203, 'f1': 0.9949238578680203, 'number': 197}, 'EDUC_DESCRIPTION': {'precision': 0.9782608695652174, 'recall': 0.8490566037735849, 'f1': 0.9090909090909092, 'number': 53}, 'EDUC_GRADE': {'precision': 0.8181818181818182, 'recall': 0.9, 'f1': 0.8571428571428572, 'number': 20}, 'EDUC_LOC': {'precision': 0.896551724137931, 'recall': 0.896551724137931, 'f1': 0.896551724137931, 'number': 87}, 'EDUC_SCHOOL': {'precision': 0.9704433497536946, 'recall': 0.9425837320574163, 'f1': 0.9563106796116505, 'number': 209}, 'overall_precision': 0.9588014981273408, 'overall_recall': 0.9516728624535316, 'overall_f1': 0.9552238805970148, 'overall_accuracy': 0.9771744353676117}
Epoch: 18


  0%|          | 0/16 [00:00<?, ?it/s]

Loss after 300 steps: 0.004100163001567125
{'EDUC_COURSE': {'precision': 1.0, 'recall': 0.9894179894179894, 'f1': 0.9946808510638299, 'number': 189}, 'EDUC_DATE': {'precision': 0.9935897435897436, 'recall': 0.9935897435897436, 'f1': 0.9935897435897436, 'number': 156}, 'EDUC_DESCRIPTION': {'precision': 1.0, 'recall': 1.0, 'f1': 1.0, 'number': 48}, 'EDUC_GRADE': {'precision': 0.8823529411764706, 'recall': 0.8823529411764706, 'f1': 0.8823529411764706, 'number': 17}, 'EDUC_LOC': {'precision': 0.9733333333333334, 'recall': 0.9733333333333334, 'f1': 0.9733333333333334, 'number': 75}, 'EDUC_SCHOOL': {'precision': 0.95625, 'recall': 0.9683544303797469, 'f1': 0.9622641509433963, 'number': 158}, 'overall_precision': 0.9813374805598756, 'overall_recall': 0.9813374805598756, 'overall_f1': 0.9813374805598756, 'overall_accuracy': 0.9963421496904896}


Evaluating:   0%|          | 0/7 [00:00<?, ?it/s]

Training loss: 0.011748855278710835 		 Validation Loss: 1.4933336185557502
{'EDUC_COURSE': {'precision': 0.9661016949152542, 'recall': 0.9661016949152542, 'f1': 0.9661016949152542, 'number': 59}, 'EDUC_DATE': {'precision': 0.9183673469387755, 'recall': 0.9574468085106383, 'f1': 0.9375000000000001, 'number': 47}, 'EDUC_DESCRIPTION': {'precision': 1.0, 'recall': 1.0, 'f1': 1.0, 'number': 7}, 'EDUC_GRADE': {'precision': 1.0, 'recall': 1.0, 'f1': 1.0, 'number': 3}, 'EDUC_LOC': {'precision': 0.8181818181818182, 'recall': 0.75, 'f1': 0.7826086956521738, 'number': 12}, 'EDUC_SCHOOL': {'precision': 0.9272727272727272, 'recall': 0.9444444444444444, 'f1': 0.9357798165137615, 'number': 54}, 'overall_precision': 0.9347826086956522, 'overall_recall': 0.945054945054945, 'overall_f1': 0.9398907103825138, 'overall_accuracy': 0.9820971867007673}
Epoch: 19


  0%|          | 0/16 [00:00<?, ?it/s]

Evaluating:   0%|          | 0/7 [00:00<?, ?it/s]

Training loss: 0.010653728051693179 		 Validation Loss: 1.568187300648008
{'EDUC_COURSE': {'precision': 0.968, 'recall': 0.9718875502008032, 'f1': 0.969939879759519, 'number': 249}, 'EDUC_DATE': {'precision': 1.0, 'recall': 1.0, 'f1': 1.0, 'number': 203}, 'EDUC_DESCRIPTION': {'precision': 0.8571428571428571, 'recall': 0.9795918367346939, 'f1': 0.9142857142857143, 'number': 49}, 'EDUC_GRADE': {'precision': 0.9, 'recall': 0.9, 'f1': 0.9, 'number': 20}, 'EDUC_LOC': {'precision': 0.8210526315789474, 'recall': 0.9873417721518988, 'f1': 0.896551724137931, 'number': 79}, 'EDUC_SCHOOL': {'precision': 0.9800995024875622, 'recall': 0.9078341013824884, 'f1': 0.9425837320574162, 'number': 217}, 'overall_precision': 0.9527272727272728, 'overall_recall': 0.9620563035495716, 'overall_f1': 0.9573690621193667, 'overall_accuracy': 0.9844886088221038}


In [39]:
train_state = {}

In [40]:
running_loss = 0.
running_acc = 0.
metric = load_metric("seqeval")
for batch_index, batch in enumerate(tqdm(test_dataloader)):
  outputs = model(**batch) 
  labels = batch['labels']
  logits = outputs.get("logits")
  loss = loss_fct(logits.view(-1, model.config.num_labels), labels.view(-1))
  loss_batch = loss.item()
  running_loss += (loss_batch - running_loss) / (batch_index + 1)
  writer.add_scalar("Loss/test", loss, batch_index)

  predictions = outputs.logits.argmax(dim=2)
  # Remove ignored index (special tokens)
  true_predictions = [
      [id2label[p.item()] for (p, l) in zip(prediction, label) if l != -100]
      for prediction, label in zip(predictions, batch['labels'])
  ]
  true_labels = [
      [id2label[l.item()] for (p, l) in zip(prediction, label) if l != -100]
      for prediction, label in zip(predictions, batch['labels'])
  ]
  final_score = metric.compute(predictions=true_predictions, references=true_labels)
  acc_batch = final_score["overall_accuracy"]
  running_acc += (acc_batch - running_acc) / (batch_index + 1)
  writer.add_scalar("overall_precision/test", final_score["overall_precision"], batch_index)
  writer.add_scalar("overall_recall/test", final_score["overall_recall"], batch_index)
  writer.add_scalar("overall_f1/test", final_score["overall_f1"], batch_index)
  writer.add_scalar("overall_accuracy/test", final_score["overall_accuracy"], batch_index)

  0%|          | 0/8 [00:00<?, ?it/s]

  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))


In [41]:
train_state['test_loss'] = running_loss
train_state['test_acc'] = running_acc
print("Test loss: {:.3f}".format(train_state['test_loss']))
print("Test Accuracy: {:.2f}".format(train_state['test_acc']))

Test loss: 0.810
Test Accuracy: 0.82


## Define TrainingArguments + Trainer

Next we define the `TrainingArguments`, which define all hyperparameters related to training. Note that there is a huge amount of parameters to tweak, check the [docs](https://huggingface.co/docs/transformers/main_classes/trainer#transformers.TrainingArguments) for more info.

In [None]:
%load_ext tensorboard

In [42]:
!zip -r /content/runs.zip /content/runs/

  adding: content/runs/ (stored 0%)
  adding: content/runs/Mar06_11-38-15_d363e5911561/ (stored 0%)
  adding: content/runs/Mar06_11-38-15_d363e5911561/events.out.tfevents.1678102695.d363e5911561.130.0 (deflated 68%)


In [43]:
from google.colab import files
files.download("/content/runs.zip")

<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

We can now instantiate a Trainer, with the model and args defined above. We also provide our datasets, as well as a "default data collator" - which will batch the examples using `torch.stack`. We also provide our `compute_metrics` function defined above.

## Inference

You can load the model for inference as follows:

In [None]:
from transformers import AutoModelForTokenClassification

model = AutoModelForTokenClassification.from_pretrained(f"{OUTPUT_DIR}checkpoint-3000")

Let's take an example of the training dataset to show inference.

In [None]:
example = dataset["val"][0]
print(example.keys())

We first prepare it for the model using the processor.

In [None]:
print(example['image'])

In [None]:
image = Image.open(example['image'])
words = example["tokens"]
boxes = example["bboxes"]
word_labels = example["ner_tags"]
print(len(words))
encoding = processor(image, words, boxes=boxes, word_labels=word_labels, return_tensors="pt")
for k,v in encoding.items():
  print(k,v.shape)

Next, we do a forward pass. We use torch.no_grad() as we don't require gradient computation.

In [None]:
next(model.parameters()).is_cuda

In [None]:
with torch.no_grad():
  outputs = model(**encoding.to("cuda"))

The model outputs logits of shape (batch_size, seq_len, num_labels).

In [None]:
logits = outputs.logits
logits.shape

We take the highest score for each token, using argmax. This serves as the predicted label for each token.

In [None]:
predictions = logits.argmax(-1).squeeze().tolist()
print(predictions)

Let's compare this to the ground truth: note that many labels are -100, as we're only labeling the first subword token of each word.

NOTE: at "true inference" time, you don't have access to labels, see the latest section of this notebook how you can use `offset_mapping` in that case.

In [None]:
labels = encoding.labels.squeeze().tolist()
print(labels)

So let's only compare predictions and labels at positions where the label isn't -100. We also want to have the bounding boxes of these (unnormalized):

In [None]:
def unnormalize_box(bbox, width, height):
     return [
         bbox[0],
         bbox[1],
         bbox[2],
         bbox[3],
     ]

token_boxes = encoding.bbox.squeeze().tolist()
width, height = image.size

true_predictions = [model.config.id2label[pred] for pred, label in zip(predictions, labels) if label != - 100]
true_labels = [model.config.id2label[label] for prediction, label in zip(predictions, labels) if label != -100]
true_boxes = [unnormalize_box(box, width, height) for box, label in zip(token_boxes, labels) if label != -100]

In [None]:
len(true_predictions)

In [None]:
'NONE','EDUC_DATE','EDUC_COURSE','EDUC_LOC',
               'EDUC_GRADE','EDUC_DESCRIPTION','EDUC_SCHOOL'

In [None]:
from PIL import ImageDraw, ImageFont

draw = ImageDraw.Draw(image)

font = ImageFont.load_default()

def iob_to_label(label):
    if label == 'NONE':
      return label
    else:
      label = label
      if not label:
        return 'other'
      return label

label2color = {'none': 'blue',
               'educ_loc': 'black', 
               'educ_date': 'green',
               'educ_course': 'orange',
               'educ_grade': 'red',
               'educ_description': 'purple',
               'educ_school': 'brown'
               }

for prediction, box in zip(true_predictions, true_boxes):
    predicted_label = iob_to_label(prediction).lower()
    draw.rectangle(box, outline=label2color[predicted_label])
    draw.text((box[0] + 10, box[1] - 10), text=predicted_label, fill=label2color[predicted_label], font=font)

image

Compare this to the ground truth:

In [None]:
image = example["image"]
image = Image.open(example['image'])

draw = ImageDraw.Draw(image)

for word, box, label in zip(example['tokens'], example['bboxes'], example['ner_tags']):
  actual_label = iob_to_label(id2label[label]).lower()
  box = unnormalize_box(box, width, height)
  draw.rectangle(box, outline=label2color[actual_label], width=2)
  draw.text((box[0] + 10, box[1] - 10), actual_label, fill=label2color[actual_label], font=font)

image

## Note: inference when you don't have labels

The code above used the `labels` to determine which tokens were at the start of a particular word or not. Of course, at inference time, you don't have access to any labels. In that case, you can leverage the `offset_mapping` returned by the tokenizer. I do have a notebook for that (for LayoutLMv2, but it's equivalent for LayoutLMv3) [here](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/LayoutLMv2/FUNSD/True_inference_with_LayoutLMv2ForTokenClassification_%2B_Gradio_demo.ipynb).