# 30 - Evaluation DAN

The aim of this notebook is to provide a detailed evaluation of the performance of the fine-tuned model.

Some requirements:
* Run the evaluation on a single GPU to avoid the drawbacks of parallelization.
* Don't reuse the JSON configuration file used for training. Prefer an adapted copy dedicated to the evaluation, with the following parameters:
    * *batch size*: 1 (if > 1, the metrics are computed over the whole batch and not for each individual image)
    * *load_epoch*: best (we want to evaluate the best model, not the last one)
* As the character **§** is used in the training dataset, be sure to replace § with £ (or any other character not present in the dataset) in *nerval/parse.py* (this should normally already have been done before training the model).
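
The evaluation-specific configuration described above could be derived from the training file with a few lines of Python. This is a minimal sketch: the key layout (`training.data.batch_size`, `training.load_epoch`) is an assumption about the configuration schema and should be checked against your own JSON file. The `make_eval_config` helper and the example dictionary are hypothetical.

```python
import copy
import json


def make_eval_config(train_config):
    """Return a copy of the training config adapted for evaluation.

    Assumed key layout (to be checked against the real file):
    training.data.batch_size and training.load_epoch.
    """
    eval_config = copy.deepcopy(train_config)  # leave the training config untouched
    training = eval_config.setdefault("training", {})
    training.setdefault("data", {})["batch_size"] = 1  # per-image metrics
    training["load_epoch"] = "best"  # evaluate the best model, not the last
    return eval_config


# Hypothetical excerpt of a training configuration
example_train_config = {"training": {"data": {"batch_size": 8}, "load_epoch": "last"}}
example_eval_config = make_eval_config(example_train_config)
print(json.dumps(example_eval_config, indent=2))
```

In practice, the dictionary would be loaded from the training JSON with `json.load` and the result written to a separate file with `json.dump`.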

In [8]:
import glob
import pandas as pd
import sys
import os
# Access to the utils directory
current_dir = os.getcwd()
utils_dir = os.path.join(current_dir, '..', 'utils')
sys.path.append(utils_dir)

In [5]:
ROOT = "/home/STual/DAN-cadastre"

## 1. Global evaluation
The global evaluation is performed with the dedicated teklia-dan command.
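
As an illustration, the command line could be assembled as follows. This is a sketch only: the `--config`, `--nerval-threshold` and `--output-json` options are assumptions about the `teklia-dan evaluate` interface and should be confirmed with `teklia-dan evaluate --help`. The file paths and the `build_evaluate_command` helper are hypothetical. The command is only printed here, not executed.

```python
import shlex


def build_evaluate_command(config_path, nerval_threshold=0.0, output_json=None):
    """Build the argument list for the evaluation command (flags are assumptions)."""
    command = [
        "teklia-dan", "evaluate",
        "--config", config_path,
        "--nerval-threshold", str(nerval_threshold),
    ]
    if output_json is not None:
        command += ["--output-json", output_json]
    return command


# Hypothetical paths, to be replaced with the real evaluation config and output file
command = build_evaluate_command("configs/eval.json", 0.0, "metrics/eval.json")
print(shlex.join(command))
# To actually run it: subprocess.run(command, check=True)
```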

#### DAN evaluation

| Split | CER (HTR-NER) | CER (HTR) | WER (HTR-NER) | WER (HTR) | WER (HTR no punct) |  NER  |
|:-----:|:-------------:|:---------:|:-------------:|:---------:|:------------------:|:-----:|
| train |      2.05     |    2.13   |      2.93     |    2.89   |        2.89        |  1.38 |
|  val  |     39.68     |   40.17   |     64.44     |   63.78   |       63.78        | 32.61 |
|  test |     33.67     |    35.3   |     54.78     |   54.36   |       54.36        | 16.19 |

#### Nerval evaluation

The distance threshold for matching gold and predicted entities during the Nerval evaluation is set to 0, which imposes perfect matches.

##### train

| tag                    | predicted | matched | Precision | Recall |    F1 | Support |
|:-----------------------|----------:|--------:|----------:|-------:|------:|--------:|
| ancien_numero_parcelle |       441 |     429 |     97.28 |  96.62 | 96.95 |     444 |
| ancienne_nature        |       427 |     419 |     98.13 |  96.99 | 97.56 |     432 |
| identite               |      1817 |    1796 |     98.84 |  99.01 | 98.93 |    1814 |
| lieu-dit               |      1515 |    1498 |     98.88 |  99.14 | 99.01 |    1511 |
| nature                 |      1813 |    1799 |     99.23 |  99.34 | 99.28 |    1811 |
| numero_parcelle        |      1817 |    1786 |     98.29 |  98.46 | 98.38 |    1814 |
| numero_proprietaire    |       996 |     977 |     98.09 |  97.99 | 98.04 |     997 |
| ALL                    |      8826 |    8704 |     98.62 |  98.65 | 98.63 |    8823 |

##### val

| tag                    | predicted | matched | Precision | Recall |    F1 | Support |
|:-----------------------|----------:|--------:|----------:|-------:|------:|--------:|
| ancien_numero_parcelle |        78 |      40 |     51.28 |  47.06 | 49.08 |      85 |
| ancienne_nature        |        80 |      35 |     43.75 |  41.67 | 42.68 |      84 |
| identite               |       347 |     206 |     59.37 |  58.69 | 59.03 |     351 |
| lieu-dit               |       289 |     155 |     53.63 |  48.74 | 51.07 |     318 |
| nature                 |       364 |     189 |     51.92 |   54.0 | 52.94 |     350 |
| numero_parcelle        |       363 |     204 |      56.2 |  58.29 | 57.22 |     350 |
| numero_proprietaire    |       232 |     118 |     50.86 |   47.2 | 48.96 |     250 |
| ALL                    |      1753 |     947 |     54.02 |  52.96 | 53.49 |    1788 |

##### test

| tag                    | predicted | matched | Precision | Recall |    F1 | Support |
|:-----------------------|----------:|--------:|----------:|-------:|------:|--------:|
| ancien_numero_parcelle |        95 |      49 |     51.58 |  37.12 | 43.17 |     132 |
| ancienne_nature        |        93 |      41 |     44.09 |  37.61 | 40.59 |     109 |
| identite               |       485 |     301 |     62.06 |  64.59 |  63.3 |     466 |
| lieu-dit               |       328 |     178 |     54.27 |  57.42 |  55.8 |     310 |
| nature                 |       490 |     335 |     68.37 |  71.58 | 69.94 |     468 |
| numero_parcelle        |       487 |     341 |     70.02 |  73.02 | 71.49 |     467 |
| numero_proprietaire    |       384 |     251 |     65.36 |  74.04 | 69.43 |     339 |
| ALL                    |      2362 |    1496 |     63.34 |   65.3 |  64.3 |    2291 |
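
The Precision, Recall and F1 columns of the Nerval tables can be re-derived from the raw counts, which is a convenient sanity check. The sketch below assumes the usual definitions (precision = matched / predicted, recall = matched / support). `prf` is a hypothetical helper, and the counts are those of the *ancien_numero_parcelle* row of the train table.

```python
def prf(predicted, matched, support):
    """Return (precision, recall, F1) as percentages rounded to 2 decimals."""
    precision = matched / predicted if predicted else 0.0
    recall = matched / support if support else 0.0
    # 2 * matched / (predicted + support) is the harmonic mean of precision and recall
    denominator = predicted + support
    f1 = 2 * matched / denominator if denominator else 0.0
    return tuple(round(100 * value, 2) for value in (precision, recall, f1))


train_row = prf(predicted=441, matched=429, support=444)
print(train_row)  # (97.28, 96.62, 96.95), as in the train table
```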

## 2. Additional evaluation

* Ranking by page
* CER/WER by entity types
* CER/WER on full transcription without the layout tokens
* Redo the evaluation after running hallucination detection?
* Per-page statistics (with communes, etc.)
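
For the third item, one possible approach is to strip the special tokens from both the ground truth and the prediction before computing the error rates. This is a minimal sketch, assuming the tokens are the seven start tokens Ⓐ to Ⓖ used in this dataset and using a plain Levenshtein distance normalized by the reference length; the metrics computed by DAN itself may be normalized differently. The helper names are hypothetical, and the example strings are the beginning of one test page from the ranking in section 2.1.

```python
import re

SPECIAL_TOKENS = "ⒶⒷⒸⒹⒺⒻⒼ"  # start tokens of the seven entity types


def strip_tokens(text):
    """Remove the special tokens and collapse runs of spaces (newlines are kept)."""
    text = re.sub(f"[{SPECIAL_TOKENS}]", "", text)
    return re.sub(r" +", " ", text).strip()


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    previous = list(range(len(hyp) + 1))
    for i, ref_item in enumerate(ref, start=1):
        current = [i]
        for j, hyp_item in enumerate(hyp, start=1):
            current.append(min(
                previous[j] + 1,                           # deletion
                current[j - 1] + 1,                        # insertion
                previous[j - 1] + (ref_item != hyp_item),  # substitution or match
            ))
        previous = current
    return previous[-1]


def cer(ref, hyp):
    return edit_distance(ref, hyp) / max(len(ref), 1)


def wer(ref, hyp):
    ref_words, hyp_words = ref.split(), hyp.split()
    return edit_distance(ref_words, hyp_words) / max(len(ref_words), 1)


gt = strip_tokens("Ⓖ70 ⒸNolo Louis Ⓕ402 ⒺTerre")
pred = strip_tokens("Ⓖ70 ⒸMolo Louis Ⓕ402 ⒺTerre")
print(f"CER: {cer(gt, pred):.4f}  WER: {wer(gt, pred):.4f}")
```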

### 2.1 Ranking by pages

In [1]:
import json
import pandas as pd

PAGES_EVAL = ROOT + "/scripts/DAN/metrics/training-12-03-2025-2000epochs.json"

with open(PAGES_EVAL, 'r') as file:
    data = json.load(file)

train = data["train"]
val = data["val"]
test = data["test"]

#### Test set ranking

In [2]:
test_df = pd.DataFrame(test,columns=["element","gt","pred","?","score"])
test_df.sort_values(by=['score'])

Unnamed: 0,element,gt,pred,?,score
6,9f05c03f-8178-4c88-8260-7934b59a1722.jpg,ⒹLes Garennes Ⓖ47 ⒸBourgeot N↑as↓ Denis Ⓕ373 Ⓔ...,ⒹLes Garennes Ⓖ47 ⒸBourgeot n↑as↓ denis Ⓕ373 Ⓔ...,,0.1967
24,ebb6c2e5-3e89-4fcf-9120-41fadf75cc02.jpg,Ⓖ70 ⒸNolo Louis Ⓕ402 ⒺTerre\nⒼ74 ⒸPetit P↑re↓ ...,Ⓖ70 ⒸMolo Louis Ⓕ402 ⒺTerre\nⒼ74 ⒸPetit P↑re↓ ...,,0.2347
17,0864252b-1fc6-4fa4-b71f-0d7813d1f8e0.jpg,Ⓖ67 ⒸMomet Ⓕ16 ⒺTerre\nⒼ83 ⒸSamson Jean Ⓕ17 Ⓔi...,Ⓖ67 ⒸMomet Ⓕ16 ⒺTerre\nⒼ83 ⒸSamson Jean Ⓕ17 Ⓔi...,,0.25
8,11f414bb-4f41-4a46-812e-07fd5d29cb80.jpg,Ⓖ42 ⒸGevaudan Ⓕ134 ⒺTerre\nⒼ23 Ⓒdame De Chavag...,Ⓖ42 ⒸGievaudan→Ⓕ134 ⒺTerre\nⒼ23 ⒸCainé De Chav...,,0.2673
23,dedb479b-00cd-486e-94c7-e2eb4b04bd0b.jpg,Ⓖ58 ⒸLe Gouvernement Ⓕ461 Ⓔéquediere\nⒼ58 ⒸLe ...,Ⓖ58 ⒸLe gouvernement Ⓕ461 ⒺCigne\nⒼ58 ⒸLe même...,,0.3077
21,d4ecb57d-1863-4a0e-9d4e-4c6675c39404.jpg,ⒹRue de→ Ⓖ35 ⒸDestouches louis Ⓕ153 ⒺMaison→et...,ⒹRue de→ Cour Ⓖ35 ⒸDestouches louis Ⓕ153 Ⓔmais...,,0.3305
13,29761c35-dd25-477c-beb6-60fe0cac7917.jpg,Ⓖ62 ⒸMarmontel Veuve Ⓕ24 ⒺTerre\nⒼ22 ⒸChasselo...,Ⓖ62 ⒸMarmontet Veuve Ⓕ24 ⒺTerre\nⒼ22 ⒸChapelou...,,0.3895
25,f11fae21-c23b-46d5-9324-26c52cc2f6e8.jpg,ⒹLe Village ⒸPithois Jean baptiste charles à P...,ⒹLe Village ⒸPithois Jean baptiste charles à P...,,0.4223
15,43534ab4-6452-4c18-8ffe-d135cfcfc99f.jpg,Ⓖ50 ⒸFoulon Veuve Ⓕ256 ⒺJardin\nⒼ124 ⒸPessoux ...,Ⓖ124 ⒸPe Moux Père Ⓕ257 Ⓔid\nⒼ57 ⒸGiot Veuve Ⓕ...,,0.4353
26,f25edeaf-de9b-4adc-a558-7405d57ee89c.jpg,Ⓓles larris ⒸPatureau hubert marie Ⓕ432 Ⓔjardi...,ⒹLes Larris ⒸPatureau hubert marie Ⓕ432 ⒺJardi...,,0.4383


#### Val set ranking

In [3]:
val_df = pd.DataFrame(val,columns=["element","gt","pred","?","score"])
val_df.sort_values(by=['score'])

Unnamed: 0,element,gt,pred,?,score
0,3bf5614d-9110-4b17-a136-1f89aafe668c.jpg,ⒹLa Jarry ⒸPréaus denis vincent Ⓕ1634 Ⓔterre\n...,ⒹLa Jarrig ⒸPréauf denis vincent Ⓕ1634 ⒺTerre\...,,0.3293
4,26a1a247-6ff1-453e-a8fe-16658a829ce2.jpg,ⒹEntre les deux→ruelles Ⓖ243 ⒸDefresne pierre ...,ⒹCutre les denx→Ruelles Ⓖ38 ⒸDefresne pierre J...,,0.3413
13,a47dc560-9b5a-44e0-b59c-94b7a95c06ba.jpg,ⒹLes Saint Méry→id Ⓖ708 ⒸThevenard louis marie...,ⒹLes saint Méry Ⓖ704 ⒸThevenard louis marie a ...,,0.3936
3,8d10f2cd-870d-44c9-b1c0-2f9963af6d55.jpg,ⒹLe Bordiau ⒸPitou Pierre fils à fontenay Ⓕ101...,ⒹLe Bordiau ⒸPilon Pierre fils à Fontenay Ⓕ101...,,0.4021
14,aee0a32e-378b-4103-aee8-319f918cb85e.jpg,Ⓓport de la→varenne Ⓖ90 ⒸKretz Ⓕ364 Ⓔpavillon ...,Ⓓport de la→varenne Ⓖ90 ⒸHretz Ⓕ364 Ⓔpavillon ...,,0.4286
8,679b6507-cec7-422f-8605-e1671d95fba3.jpg,ⒹLa tombe Ⓖ666 ⒸTalon gendarme Ⓕ841 Ⓔterre Ⓐ42...,ⒹLa tombe terre ⒸCalon Gendarnie Ⓕ841 Ⓔterre Ⓐ...,,0.5
10,81411b1c-9ab7-48a6-a641-194048d39d80.jpg,Ⓓle h champ des→moines Ⓖ32 ⒸCassin Ⓕ392 ⒺT. pl...,Ⓓle th champ des→moiher Ⓖ32 ⒸCaffin Ⓕ392 ⒺT. p...,,0.5672
7,668fb164-96be-4c84-99d9-1927808e9ec6.jpg,ⒹRut Grand Ⓖ154 ⒸCoulombier J↑ques↓ H↑ers↓ Ⓕ40...,ⒹRue GGrand Ⓖ154 ⒸCoulombier J↑ques↓ Ⓕ409 ⒺTer...,,0.5787
6,346be28d-98a8-4831-95f8-54d527c7c64c.jpg,ⒹVacher Ⓖ186 ⒸCretté J↑n↓ Cadet Prompt Ⓕ1732 Ⓔ...,Ⓖ306 ⒸCretté J↑n↓ Caset P1732 ⒺTer\nⒸ×77± ⒸBou...,,0.5879
5,89edf51e-a836-4359-adc8-efd759991e5f.jpg,Ⓓchemin→de Brie→ Ⓖ95 ⒸLefevre Ⓕ166 Ⓔterre\nⒹsu...,ⒹChemin→de Brie Ⓖ95 ⒸLefevte Ⓕ95 Ⓔterre\nⒹØ Ⓖ2...,,0.5921
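
Beyond sorting, a few summary statistics per split make the train/val/test comparison easier. This is a minimal sketch using only the standard library, assuming each entry of `train`, `val` and `test` is a five-item row whose last item is the score, as implied by the DataFrame columns above. `summarize_scores` is a hypothetical helper, and the example rows reuse three scores from the test ranking with the transcriptions elided.

```python
from statistics import mean, median


def summarize_scores(rows):
    """Summarize the per-page scores of one split (the score is the last item of each row)."""
    scores = [row[-1] for row in rows]
    return {
        "count": len(scores),
        "min": min(scores),
        "max": max(scores),
        "mean": round(mean(scores), 4),
        "median": median(scores),
    }


# Three rows taken from the test ranking; "..." stands for the elided transcriptions
example_rows = [
    ["9f05c03f-8178-4c88-8260-7934b59a1722.jpg", "...", "...", None, 0.1967],
    ["ebb6c2e5-3e89-4fcf-9120-41fadf75cc02.jpg", "...", "...", None, 0.2347],
    ["0864252b-1fc6-4fa4-b71f-0d7813d1f8e0.jpg", "...", "...", None, 0.25],
]
summary = summarize_scores(example_rows)
print(summary)
```

The same function can then be applied to `train`, `val` and `test` in turn.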


### 2.2 ie-eval evaluation
This is a test using the ie-eval library. It is not mandatory and requires a dedicated virtual environment with the ie-eval package installed.
The data must first be converted to IOB format.
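
To make the target format concrete, here is a simplified illustration of what an IOB conversion produces. It is not the `dan.bio.convert` implementation: it assumes that every entity starts with one of the special tokens at the beginning of a word and runs until the next token. `to_iob` is a hypothetical helper.

```python
TOKEN_TO_TAG = {
    "Ⓐ": "ancien_numero_parcelle",
    "Ⓑ": "ancienne_nature",
    "Ⓒ": "identite",
    "Ⓓ": "lieu-dit",
    "Ⓔ": "nature",
    "Ⓕ": "numero_parcelle",
    "Ⓖ": "numero_proprietaire",
}


def to_iob(text, token_to_tag):
    """Convert a tokenized transcription into a list of "word TAG" lines."""
    lines = []
    current_tag = None    # entity type currently open, if any
    at_beginning = False  # True until the first word of the entity is emitted
    for word in text.split():
        if word[0] in token_to_tag:
            current_tag = token_to_tag[word[0]]
            at_beginning = True
            word = word[1:]
            if not word:  # token followed by a space: the entity starts at the next word
                continue
        if current_tag is None:
            lines.append(f"{word} O")
        else:
            prefix = "B" if at_beginning else "I"
            lines.append(f"{word} {prefix}-{current_tag}")
            at_beginning = False
    return lines


iob_lines = to_iob("Ⓖ70 ⒸNolo Louis Ⓕ402 ⒺTerre", TOKEN_TO_TAG)
print("\n".join(iob_lines))
```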

In [None]:
import os
import subprocess
import json
import pandas as pd
from dan.bio import convert
from dan.utils import EntityType

ner_tokens = {
        "ancien_numero_parcelle": EntityType(start="Ⓐ"),
        "ancienne_nature": EntityType(start="Ⓑ"),
        "identite": EntityType(start="Ⓒ"),
        "lieu-dit": EntityType(start="Ⓓ"),
        "nature": EntityType(start="Ⓔ"),
        "numero_parcelle": EntityType(start="Ⓕ"),
        "numero_proprietaire": EntityType(start="Ⓖ"),
    }

GT_JSON = ROOT + "/dataset2/page_dataset/split.json"
GT_IOB_FOLDER = ROOT + "/scripts/DAN/iob/labels"
PRED_DAN_FOLDER = ROOT + "/inference/training120325_config2025_prod_2000epochs"
PRED_IOB_FOLDER = ROOT + "/inference/iob/predictions/test"
DAN_TOKENS = ROOT + "/dataset2/tokens.yml"

CONVERT = False

def convert_danlabels_to_iob(split_json_path, save_dir, ner_tokens, subsets=["train","val","test"]):
    """
    :param split_json_path: Path to the split.json file of a DAN dataset produced with teklia dan
    :param save_dir: Path to the folder where to save the data in iob format
    :param ner_tokens: Dict of the entity and their corresponding special token
    :param subsets: List of the subsets to convert to iob format. Default = ["train","val","test"]
    """

    with open(split_json_path, 'r') as file:
        data = json.load(file)
        
    # Create the output folder and one sub-folder per subset if they do not exist yet
    os.makedirs(save_dir, exist_ok=True)

    for subset in subsets:
        os.makedirs(os.path.join(save_dir, subset), exist_ok=True)
        gt = data[subset]
        
        ls = []
        for elem in gt:
            uuid = elem
            image = gt[elem]["image"]["iiif_url"]
            text = gt[elem]["text"]
            row = [uuid,image,text]
            ls.append(row)
        df_gt = pd.DataFrame(ls,columns=["uuid","image","text"])
        
        #for _, row in df_gt.iterrows():
            #page_convert = convert(row["text"], ner_tokens).split("\n")
            #with open(save_dir + '/' + subset + '/' + row["uuid"] + '.bio','w',encoding='utf8') as f:
                #f.write("\n".join(page_convert))
                
if __name__ == "__main__":
    if CONVERT:
        # Convert DAN labels in IOB
        convert_danlabels_to_iob(GT_JSON, GT_IOB_FOLDER, ner_tokens)
        # Convert DAN predictions as IOB
        # Pass the command as an argument list: a single string would require shell=True
        command_pred_test = ["teklia-dan", "convert", PRED_DAN_FOLDER, "--output", PRED_IOB_FOLDER, "--tokens", DAN_TOKENS]
        result = subprocess.run(command_pred_test, capture_output=True, text=True)
        print("Output:", result.stdout)
        print("Error:", result.stderr)

In [None]:
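# test_df_gt is assumed to be the ground-truth DataFrame of the test split
# (columns "uuid", "image", "text"), built like df_gt in convert_danlabels_to_iob.
for _, row in test_df_gt.iterrows():
    page_convert = convert(row["text"], ner_tokens).split("\n")
    # The output folder was left empty in the original cell: files go to the current directory
    with open(row["uuid"] + '.bio', 'w', encoding='utf8') as f:
        f.write("\n".join(page_convert))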

| Category | BoW-F1 (%) | BoTW-F1 (%) | BoE-F1 (%) | N documents |
|:---------|:----------:|:-----------:|:----------:|------------:|
| total    |   59.45    |    57.86    |   58.83    |          27 |

| Category               | BoW-F1 (%) | BoTW-F1 (%) | BoE-F1 (%) | N documents |
|:-----------------------|:----------:|:-----------:|:----------:|------------:|
| total                  |   59.45    |    57.86    |   58.83    |          27 |
| nature                 |   64.22    |    64.22    |   65.48    |          27 |
| numero_proprietaire    |   71.41    |    71.41    |   73.04    |          22 |
| numero_parcelle        |   76.67    |    76.67    |   76.97    |          27 |
| identite               |   46.59    |    46.59    |   31.58    |          27 |
| lieu-dit               |   60.17    |    60.17    |   59.56    |          20 |
| ancien_numero_parcelle |   35.42    |    35.42    |   38.77    |           9 |
| ancienne_nature        |   48.15    |    48.15    |   47.29    |           8 |
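
To give an intuition for the bag-of-words scores reported above, the sketch below computes an F1 between two multisets of words. It is a simplified reading of the metric, not the ie-eval implementation, and `bag_f1` is a hypothetical helper. Under that reading, BoTW and BoE would apply the same counting to (tag, word) pairs and to whole entities, respectively.

```python
from collections import Counter


def bag_f1(label_items, pred_items):
    """F1 between two bags (multisets) of hashable items."""
    label_bag, pred_bag = Counter(label_items), Counter(pred_items)
    # Multiset intersection: each item counts as many times as it appears in both bags
    true_positives = sum((label_bag & pred_bag).values())
    total = sum(label_bag.values()) + sum(pred_bag.values())
    return 2 * true_positives / total if total else 0.0


label_words = "70 Nolo Louis 402 Terre".split()
pred_words = "70 Molo Louis 402 Terre".split()
bow_f1 = bag_f1(label_words, pred_words)
print(f"BoW-F1: {100 * bow_f1:.2f}%")
```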