# SciBERT Fine-Tuning on Drug/ADE Corpus
#### Hugging Face Course Community Event
#### By Justin S. Lee 
#### November 15-19, 2021

---

In this notebook, we use the 🤗 `transformers` library to fine-tune the `allenai/scibert_scivocab_uncased` model on the dataset `ade_corpus_v2`. The goal is for the fine-tuned model to perform Named Entity Recognition by identifying Adverse Drug Reactions (ADRs) as well as Drug names. 

This was originally run on an `ml.p3.2xlarge` instance on AWS SageMaker.

In [1]:
! pip install datasets transformers seqeval



In [39]:
! pip install spacy 


In [2]:
from datasets import Dataset, ClassLabel, Sequence, load_dataset, load_metric
import numpy as np
import pandas as pd
from spacy import displacy
import transformers
from transformers import (AutoModelForTokenClassification, 
                          AutoTokenizer, 
                          DataCollatorForTokenClassification,
                          pipeline,
                          TrainingArguments, 
                          Trainer)

In [None]:
# confirm version > 4.11.0
print(transformers.__version__)

---
## Dataset Exploration

We use the `Ade_corpus_v2_drug_ade_relation` subset of the `ade_corpus_v2` dataset, which provides labeled spans for drug names and adverse effects.

See dataset page here: https://huggingface.co/datasets/ade_corpus_v2

In [6]:
datasets = load_dataset("ade_corpus_v2", "Ade_corpus_v2_drug_ade_relation")

Reusing dataset ade_corpus_v2 (/home/ec2-user/.cache/huggingface/datasets/ade_corpus_v2/Ade_corpus_v2_drug_ade_relation/1.0.0/940d61334dbfac6b01ac5d00286a2122608b8dc79706ee7e9206a1edb172c559)


  0%|          | 0/1 [00:00<?, ?it/s]

In [7]:
datasets

DatasetDict({
    train: Dataset({
        features: ['text', 'drug', 'effect', 'indexes'],
        num_rows: 6821
    })
})

In [8]:
datasets["train"][0]

{'text': 'Intravenous azithromycin-induced ototoxicity.',
 'drug': 'azithromycin',
 'effect': 'ototoxicity',
 'indexes': {'drug': {'start_char': [12], 'end_char': [24]},
  'effect': {'start_char': [33], 'end_char': [44]}}}
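The `indexes` field stores character offsets with an exclusive end, so slicing the text recovers each entity. A minimal check on the record shown above (the helper name `spans` is illustrative and not part of the dataset API):

```python
# Record copied from the output above.
row = {
    "text": "Intravenous azithromycin-induced ototoxicity.",
    "drug": "azithromycin",
    "effect": "ototoxicity",
    "indexes": {
        "drug": {"start_char": [12], "end_char": [24]},
        "effect": {"start_char": [33], "end_char": [44]},
    },
}


def spans(row, kind):
    """Return the substrings of row['text'] marked by the offsets for `kind`."""
    idx = row["indexes"][kind]
    return [row["text"][s:e] for s, e in zip(idx["start_char"], idx["end_char"])]


print(spans(row, "drug"))    # ['azithromycin']
print(spans(row, "effect"))  # ['ototoxicity']
```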

---
## Dataset Consolidation

Upon further examination of the dataset, we can see that sentences are often repeated, once for each pair of drug and adverse reaction they contain. For example, the following sentence appears in seven rows:
```
{'text': 'After therapy for diabetic coma with insulin (containing the preservative cresol) and electrolyte solutions was started, the patient complained of increasing myalgia, developed a high fever and respiratory and metabolic acidosis and lost consciousness.', 'drug': 'insulin', 'effect': 'increasing myalgia', 'indexes': {'drug': {'start_char': [37], 'end_char': [44]}, 'effect': {'start_char': [147], 'end_char': [165]}}}
{'text': 'After therapy for diabetic coma with insulin (containing the preservative cresol) and electrolyte solutions was started, the patient complained of increasing myalgia, developed a high fever and respiratory and metabolic acidosis and lost consciousness.', 'drug': 'cresol', 'effect': 'lost consciousness', 'indexes': {'drug': {'start_char': [74], 'end_char': [80]}, 'effect': {'start_char': [233], 'end_char': [251]}}}
{'text': 'After therapy for diabetic coma with insulin (containing the preservative cresol) and electrolyte solutions was started, the patient complained of increasing myalgia, developed a high fever and respiratory and metabolic acidosis and lost consciousness.', 'drug': 'cresol', 'effect': 'high fever', 'indexes': {'drug': {'start_char': [74], 'end_char': [80]}, 'effect': {'start_char': [179], 'end_char': [189]}}}
{'text': 'After therapy for diabetic coma with insulin (containing the preservative cresol) and electrolyte solutions was started, the patient complained of increasing myalgia, developed a high fever and respiratory and metabolic acidosis and lost consciousness.', 'drug': 'insulin', 'effect': 'high fever', 'indexes': {'drug': {'start_char': [37], 'end_char': [44]}, 'effect': {'start_char': [179], 'end_char': [189]}}}
{'text': 'After therapy for diabetic coma with insulin (containing the preservative cresol) and electrolyte solutions was started, the patient complained of increasing myalgia, developed a high fever and respiratory and metabolic acidosis and lost consciousness.', 'drug': 'insulin', 'effect': 'lost consciousness', 'indexes': {'drug': {'start_char': [37], 'end_char': [44]}, 'effect': {'start_char': [233], 'end_char': [251]}}}
{'text': 'After therapy for diabetic coma with insulin (containing the preservative cresol) and electrolyte solutions was started, the patient complained of increasing myalgia, developed a high fever and respiratory and metabolic acidosis and lost consciousness.', 'drug': 'insulin', 'effect': 'respiratory and metabolic acidosis', 'indexes': {'drug': {'start_char': [37], 'end_char': [44]}, 'effect': {'start_char': [194], 'end_char': [228]}}}
{'text': 'After therapy for diabetic coma with insulin (containing the preservative cresol) and electrolyte solutions was started, the patient complained of increasing myalgia, developed a high fever and respiratory and metabolic acidosis and lost consciousness.', 'drug': 'cresol', 'effect': 'respiratory and metabolic acidosis', 'indexes': {'drug': {'start_char': [74], 'end_char': [80]}, 'effect': {'start_char': [194], 'end_char': [228]}}}
```

This is not ideal in an NER setting. If we assigned one set of token labels per row as-is, the same tokens in the same sentence would receive different labels in different rows (for example, `cresol` would be tagged as a drug in one row and left unlabeled in another). This would confuse the model during fine-tuning, so we first consolidate all of the ranges provided for each unique sentence and then label all known entities in a single pass.

In [10]:
consolidated_dataset = {}

for row in datasets["train"]:
    if row["text"] in consolidated_dataset:
        consolidated_dataset[row["text"]]["drug_indices_start"].update(row["indexes"]["drug"]["start_char"])
        consolidated_dataset[row["text"]]["drug_indices_end"].update(row["indexes"]["drug"]["end_char"])
        consolidated_dataset[row["text"]]["effect_indices_start"].update(row["indexes"]["effect"]["start_char"])
        consolidated_dataset[row["text"]]["effect_indices_end"].update(row["indexes"]["effect"]["end_char"])
        consolidated_dataset[row["text"]]["drug"].append(row["drug"])
        consolidated_dataset[row["text"]]["effect"].append(row["effect"])
        
    else:
        consolidated_dataset[row["text"]] = {
            "text": row["text"],
            "drug": [row["drug"]],
            "effect": [row["effect"]],
            # use sets because the indices can repeat for various reasons
            "drug_indices_start": set(row["indexes"]["drug"]["start_char"]),
            "drug_indices_end": set(row["indexes"]["drug"]["end_char"]),
            "effect_indices_start": set(row["indexes"]["effect"]["start_char"]),
            "effect_indices_end": set(row["indexes"]["effect"]["end_char"])
        }
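To make the effect of consolidation concrete, here is a small sketch on three abbreviated rows that share one sentence. The function `consolidate`, the shortened key names, and the truncated text are illustrative assumptions; the offsets come from the example rows listed earlier.

```python
from collections import defaultdict

# Abbreviated stand-in for the repeated sentence; only the offsets matter here.
TEXT = "After therapy for diabetic coma with insulin ..."

rows = [
    {"text": TEXT, "drug": "insulin", "effect": "high fever",
     "indexes": {"drug": {"start_char": [37], "end_char": [44]},
                 "effect": {"start_char": [179], "end_char": [189]}}},
    {"text": TEXT, "drug": "cresol", "effect": "high fever",
     "indexes": {"drug": {"start_char": [74], "end_char": [80]},
                 "effect": {"start_char": [179], "end_char": [189]}}},
    {"text": TEXT, "drug": "cresol", "effect": "lost consciousness",
     "indexes": {"drug": {"start_char": [74], "end_char": [80]},
                 "effect": {"start_char": [233], "end_char": [251]}}},
]


def consolidate(rows):
    """Collect every drug/effect offset per unique sentence, dropping repeats."""
    merged = defaultdict(lambda: {"drug_starts": set(), "drug_ends": set(),
                                  "effect_starts": set(), "effect_ends": set()})
    for row in rows:
        entry = merged[row["text"]]
        entry["drug_starts"].update(row["indexes"]["drug"]["start_char"])
        entry["drug_ends"].update(row["indexes"]["drug"]["end_char"])
        entry["effect_starts"].update(row["indexes"]["effect"]["start_char"])
        entry["effect_ends"].update(row["indexes"]["effect"]["end_char"])
    return dict(merged)


merged = consolidate(rows)
print(len(merged))                          # one entry for the shared sentence
print(sorted(merged[TEXT]["drug_starts"]))  # [37, 74]
```

Three rows collapse into a single entry holding two drug spans and two effect spans, which is the shape the labeling step needs.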

---
With the dataset consolidated, we need to assign per-token labels to each sentence. First, we re-define our Python data structure as a Hugging Face Dataset object.

In [11]:
df = pd.DataFrame(list(consolidated_dataset.values()))

In [12]:
df.head()

| | text | drug | effect | drug_indices_start | drug_indices_end | effect_indices_start | effect_indices_end |
|---|---|---|---|---|---|---|---|
| 0 | Intravenous azithromycin-induced ototoxicity. | [azithromycin] | [ototoxicity] | {12} | {24} | {33} | {44} |
| 1 | Immobilization, while Paget's bone disease was... | [dihydrotachysterol] | [increased calcium-release] | {91} | {109} | {143} | {168} |
| 2 | Unaccountable severe hypercalcemia in a patien... | [dihydrotachysterol] | [hypercalcemia] | {84} | {102} | {21} | {34} |
| 3 | METHODS: We report two cases of pseudoporphyri... | [naproxen, oxaprozin] | [pseudoporphyria, pseudoporphyria] | {58, 71} | {80, 66} | {32} | {47} |
| 4 | Naproxen, the most common offender, has been a... | [Naproxen] | [erythropoietic protoporphyria] | {0} | {8} | {134} | {163} |


In [13]:
# Sets do not preserve insertion order, so sort each collection of indices.
# Because no spans overlap, the i-th sorted start pairs with the i-th sorted end.
# `sorted` already returns a list, so no separate list conversion is needed.

df["drug_indices_start"] = df["drug_indices_start"].apply(sorted)
df["drug_indices_end"] = df["drug_indices_end"].apply(sorted)
df["effect_indices_start"] = df["effect_indices_start"].apply(sorted)
df["effect_indices_end"] = df["effect_indices_end"].apply(sorted)
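The sorting step relies on spans not overlapping. The sketch below shows why sorting re-pairs starts and ends correctly in that case, using the drug indices from row 3 of the preview. The helper `pair_spans` and its overlap check are illustrative additions, not part of the original notebook.

```python
# Values from row 3 of the DataFrame preview; a set has no guaranteed order.
drug_starts = {58, 71}
drug_ends = {80, 66}


def pair_spans(starts, ends):
    """Pair sorted starts with sorted ends.

    This is only valid when spans neither overlap nor nest, so we verify it.
    """
    starts, ends = sorted(starts), sorted(ends)
    if len(starts) != len(ends):
        raise ValueError("mismatched number of starts and ends")
    pairs = list(zip(starts, ends))
    # Each span must close before the next one opens.
    for (_, end), (next_start, _) in zip(pairs, pairs[1:]):
        if next_start < end:
            raise ValueError("overlapping spans cannot be paired by sorting")
    return pairs


print(pair_spans(drug_starts, drug_ends))  # [(58, 66), (71, 80)]
```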

In [14]:
# save to JSON to then import into Dataset object
df.to_json("dataset.jsonl", orient="records", lines=True)

In [15]:
cons_dataset = load_dataset("json", data_files="dataset.jsonl")

Using custom data configuration default-4d50f1e083f6f7fa


Downloading and preparing dataset json/default to /home/ec2-user/.cache/huggingface/datasets/json/default-4d50f1e083f6f7fa/0.0.0/c2d554c3377ea79c7664b93dc65d0803b45e3279000f993c7bfd18937fd7f426...


  0%|          | 0/1 [00:00<?, ?it/s]

  0%|          | 0/1 [00:00<?, ?it/s]

Dataset json downloaded and prepared to /home/ec2-user/.cache/huggingface/datasets/json/default-4d50f1e083f6f7fa/0.0.0/c2d554c3377ea79c7664b93dc65d0803b45e3279000f993c7bfd18937fd7f426. Subsequent calls will reuse this data.


  0%|          | 0/1 [00:00<?, ?it/s]

In [16]:
# no train/test split is provided, so we create our own
# (train_test_split holds out 25% of the rows by default)
cons_dataset = cons_dataset["train"].train_test_split()

In [17]:
cons_dataset

DatasetDict({
    train: Dataset({
        features: ['text', 'drug', 'effect', 'drug_indices_start', 'drug_indices_end', 'effect_indices_start', 'effect_indices_end'],
        num_rows: 3203
    })
    test: Dataset({
        features: ['text', 'drug', 'effect', 'drug_indices_start', 'drug_indices_end', 'effect_indices_start', 'effect_indices_end'],
        num_rows: 1068
    })
})

---
## Token Labeling

Finally, we can label each token with its entity. We use BIO tagging on two entities, `DRUG` and `EFFECT`. This results in five possible classes for each token:

* `O` - outside any entity we care about
* `B-DRUG` - the beginning of a `DRUG` entity
* `I-DRUG` - inside a `DRUG` entity
* `B-EFFECT` - the beginning of an `EFFECT` entity
* `I-EFFECT` - inside an `EFFECT` entity
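As a simplified illustration of the scheme, the sketch below tags a sentence from character spans. It assumes a plain regex tokenizer in place of the SciBERT wordpiece tokenizer, and the function `bio_tags` and the example sentence are illustrative only.

```python
import re

label_list = ["O", "B-DRUG", "I-DRUG", "B-EFFECT", "I-EFFECT"]


def bio_tags(text, spans):
    """Tag regex tokens with BIO labels.

    `spans` is a list of (start, end, entity_type) character spans with an
    exclusive end, assumed not to overlap.
    """
    tags = []
    for match in re.finditer(r"\w+|[^\w\s]", text):
        tag = "O"
        for start, end, kind in spans:
            if match.start() == start:
                tag = f"B-{kind}"   # first token of the entity
            elif start < match.start() < end:
                tag = f"I-{kind}"   # later token inside the same entity
        tags.append((match.group(), tag))
    return tags


text = "Possible serotonin syndrome associated with clomipramine"
spans = [(9, 27, "EFFECT"), (44, 56, "DRUG")]

tagged = bio_tags(text, spans)
for token, tag in tagged:
    print(f"{token:<14}{tag}")

# Integer class ids, as the model expects them.
label_ids = [label_list.index(tag) for _, tag in tagged]
print(label_ids)  # [0, 3, 4, 0, 0, 1]
```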

In [18]:
label_list = ['O', 'B-DRUG', 'I-DRUG', 'B-EFFECT', 'I-EFFECT']

custom_seq = Sequence(feature=ClassLabel(num_classes=5, 
                                         names=label_list,
                                         names_file=None, id=None), length=-1, id=None)

cons_dataset["train"].features["ner_tags"] = custom_seq
cons_dataset["test"].features["ner_tags"] = custom_seq

In [19]:
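# define the checkpoint here so this cell also works when the notebook is run top to bottom
model_checkpoint = "allenai/scibert_scivocab_uncased"
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)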

In [20]:
def generate_row_labels(row, verbose=False):
    """ Given a row from the consolidated `Ade_corpus_v2_drug_ade_relation` dataset, 
    generates BIO tags for drug and effect entities. 
    
    """

    text = row["text"]

    labels = []
    label = "O"
    prefix = ""
    
    # while iterating through tokens, increment to traverse all drug and effect spans
    drug_index = 0
    effect_index = 0
    
    tokens = tokenizer(text, return_offsets_mapping=True)

    for n in range(len(tokens["input_ids"])):
        offset_start, offset_end = tokens["offset_mapping"][n]

        # should only happen for [CLS] and [SEP]
        if offset_end - offset_start == 0:
            labels.append(-100)
            continue
        
        if drug_index < len(row["drug_indices_start"]) and offset_start == row["drug_indices_start"][drug_index]:
            label = "DRUG"
            prefix = "B-"

        elif effect_index < len(row["effect_indices_start"]) and offset_start == row["effect_indices_start"][effect_index]:
            label = "EFFECT"
            prefix = "B-"
        
        labels.append(label_list.index(f"{prefix}{label}"))
            
        if drug_index < len(row["drug_indices_end"]) and offset_end == row["drug_indices_end"][drug_index]:
            label = "O"
            prefix = ""
            drug_index += 1
            
        elif effect_index < len(row["effect_indices_end"]) and offset_end == row["effect_indices_end"][effect_index]:
            label = "O"
            prefix = ""
            effect_index += 1

        # need to transition "inside" if we just entered an entity
        if prefix == "B-":
            prefix = "I-"
    
    if verbose:
        print(f"{row}\n")
        orig = tokenizer.convert_ids_to_tokens(tokens["input_ids"])
        for n in range(len(labels)):
            print(orig[n], labels[n])
    tokens["labels"] = labels
    
    return tokens


In [21]:
# testing out...

generate_row_labels(cons_dataset["train"][2], verbose=True)

{'text': 'Ampicillin-associated seizures.', 'drug': ['Ampicillin'], 'effect': ['seizures'], 'drug_indices_start': [0], 'drug_indices_end': [10], 'effect_indices_start': [22], 'effect_indices_end': [30]}

[CLS] -100
ampicillin 1
- 0
associated 0
seizures 3
. 0
[SEP] -100


{'input_ids': [102, 26728, 579, 1111, 12787, 205, 103], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1], 'offset_mapping': [(0, 0), (0, 10), (10, 11), (11, 21), (22, 30), (30, 31), (0, 0)], 'labels': [-100, 1, 0, 0, 3, 0, -100]}

In [22]:
labeled_dataset = cons_dataset.map(generate_row_labels)

  0%|          | 0/3203 [00:00<?, ?ex/s]

  0%|          | 0/1068 [00:00<?, ?ex/s]

---
## SciBERT Model Fine-Tuning

We are now ready to fine-tune the SciBERT model on our dataset. This section is adapted from the 🤗 token classification notebook: https://github.com/huggingface/notebooks/blob/master/examples/token_classification.ipynb


In [23]:
task = "ner" # Should be one of "ner", "pos" or "chunk"
model_checkpoint = "allenai/scibert_scivocab_uncased"
batch_size = 16

In [24]:
model = AutoModelForTokenClassification.from_pretrained(model_checkpoint, num_labels=len(label_list))

Some weights of the model checkpoint at allenai/scibert_scivocab_uncased were not used when initializing BertForTokenClassification: ['cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.decoder.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.bias', 'cls.predictions.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForTokenClassification were not initi

In [26]:
model_name = model_checkpoint.split("/")[-1]
args = TrainingArguments(
    f"{model_name}-finetuned-{task}",
    evaluation_strategy = "epoch",
    learning_rate=1e-5,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    num_train_epochs=5,
    weight_decay=0.05,
    logging_steps=1
)
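With these settings, the number of optimizer steps can be worked out ahead of time. The sketch below assumes a single device and the 3203-example training split reported earlier; the variable names are illustrative.

```python
import math

num_train_examples = 3203        # size of the train split reported earlier
batch_size = 16
num_train_epochs = 5
gradient_accumulation_steps = 1  # assumed: not set explicitly in TrainingArguments

# The last, partially filled batch of each epoch still counts as one step.
steps_per_epoch = math.ceil(
    num_train_examples / (batch_size * gradient_accumulation_steps)
)
total_steps = steps_per_epoch * num_train_epochs

print(steps_per_epoch, total_steps)  # 201 1005
```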

In [27]:
data_collator = DataCollatorForTokenClassification(tokenizer)

In [28]:
metric = load_metric("seqeval")

In [29]:
def compute_metrics(p):
    predictions, labels = p
    predictions = np.argmax(predictions, axis=2)

    # Remove ignored index (special tokens)
    true_predictions = [
        [label_list[p] for (p, l) in zip(prediction, label) if l != -100]
        for prediction, label in zip(predictions, labels)
    ]
    true_labels = [
        [label_list[l] for (p, l) in zip(prediction, label) if l != -100]
        for prediction, label in zip(predictions, labels)
    ]

    results = metric.compute(predictions=true_predictions, references=true_labels)
    return {
        "precision": results["overall_precision"],
        "recall": results["overall_recall"],
        "f1": results["overall_f1"],
        "accuracy": results["overall_accuracy"],
    }
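The filtering step in `compute_metrics` can be hard to read at first. Here is a pure-Python sketch of the same idea on one toy sequence: positions whose gold label is `-100` (special tokens) are dropped from both predictions and references before scoring. The predicted ids are made up for illustration, and `strip_ignored` is a hypothetical helper name.

```python
label_list = ["O", "B-DRUG", "I-DRUG", "B-EFFECT", "I-EFFECT"]
IGNORE_INDEX = -100  # label assigned to [CLS] and [SEP] during labeling


def strip_ignored(pred_ids, label_ids):
    """Drop positions with an ignored gold label and map ids to tag names."""
    kept = [(p, l) for p, l in zip(pred_ids, label_ids) if l != IGNORE_INDEX]
    preds = [label_list[p] for p, _ in kept]
    golds = [label_list[l] for _, l in kept]
    return preds, golds


# Gold labels from the 'Ampicillin-associated seizures.' example above.
gold = [-100, 1, 0, 0, 3, 0, -100]
# Hypothetical model output (already argmax-ed), including one mistake.
pred = [0, 1, 0, 0, 3, 4, 0]

preds, golds = strip_ignored(pred, gold)
print(preds)  # ['B-DRUG', 'O', 'O', 'B-EFFECT', 'I-EFFECT']
print(golds)  # ['B-DRUG', 'O', 'O', 'B-EFFECT', 'O']
```

Whatever the model predicts at the `[CLS]` and `[SEP]` positions never reaches the metric.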

In [30]:
trainer = Trainer(
    model,
    args,
    train_dataset=labeled_dataset["train"],
    eval_dataset=labeled_dataset["test"],
    data_collator=data_collator,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics, 

)

In [31]:
trainer.train()

The following columns in the training set  don't have a corresponding argument in `BertForTokenClassification.forward` and have been ignored: drug, effect_indices_end, offset_mapping, drug_indices_start, text, effect, drug_indices_end, effect_indices_start.
***** Running training *****
  Num examples = 3203
  Num Epochs = 5
  Instantaneous batch size per device = 16
  Total train batch size (w. parallel, distributed & accumulation) = 16
  Gradient Accumulation steps = 1
  Total optimization steps = 1005


| Epoch | Training Loss | Validation Loss | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|---|---|
| 1 | 0.0131 | 0.147024 | 0.810579 | 0.904463 | 0.854951 | 0.952808 |
| 2 | 0.0394 | 0.132503 | 0.838336 | 0.906677 | 0.871168 | 0.958644 |
| 3 | 0.0549 | 0.132246 | 0.854145 | 0.915898 | 0.883944 | 0.960842 |
| 4 | 0.0425 | 0.133181 | 0.853583 | 0.909627 | 0.880714 | 0.96203 |
| 5 | 0.0222 | 0.136563 | 0.860602 | 0.917743 | 0.888254 | 0.962823 |


The following columns in the evaluation set  don't have a corresponding argument in `BertForTokenClassification.forward` and have been ignored: drug, effect_indices_end, offset_mapping, drug_indices_start, text, effect, drug_indices_end, effect_indices_start.
***** Running Evaluation *****
  Num examples = 1068
  Batch size = 16
Saving model checkpoint to scibert_scivocab_uncased-finetuned-ner/checkpoint-500
Configuration saved in scibert_scivocab_uncased-finetuned-ner/checkpoint-500/config.json
Model weights saved in scibert_scivocab_uncased-finetuned-ner/checkpoint-500/pytorch_model.bin
tokenizer config file saved in scibert_scivocab_uncased-finetuned-ner/che

TrainOutput(global_step=1005, training_loss=0.12553799401699978, metrics={'train_runtime': 128.9417, 'train_samples_per_second': 124.203, 'train_steps_per_second': 7.794, 'total_flos': 438556142082630.0, 'train_loss': 0.12553799401699978, 'epoch': 5.0})

In [32]:
predictions, labels, _ = trainer.predict(labeled_dataset["test"])
predictions = np.argmax(predictions, axis=2)

# Remove ignored index (special tokens)
true_predictions = [
    [label_list[p] for (p, l) in zip(prediction, label) if l != -100]
    for prediction, label in zip(predictions, labels)
]
true_labels = [
    [label_list[l] for (p, l) in zip(prediction, label) if l != -100]
    for prediction, label in zip(predictions, labels)
]

results = metric.compute(predictions=true_predictions, references=true_labels)
results

The following columns in the test set  don't have a corresponding argument in `BertForTokenClassification.forward` and have been ignored: drug, effect_indices_end, offset_mapping, drug_indices_start, text, effect, drug_indices_end, effect_indices_start.
***** Running Prediction *****
  Num examples = 1068
  Batch size = 16


{'DRUG': {'precision': 0.9234731420161884,
  'recall': 0.9661277906081601,
  'f1': 0.9443190368698269,
  'number': 1299},
 'EFFECT': {'precision': 0.8048302872062664,
  'recall': 0.873229461756374,
  'f1': 0.8376358695652174,
  'number': 1412},
 'overall_precision': 0.8606018678657904,
 'overall_recall': 0.917742530431575,
 'overall_f1': 0.888254194930382,
 'overall_accuracy': 0.962822868258943}

---
## See Model Outputs

We load our fine-tuned model into a `pipeline` object to run arbitrary input against it.

In [33]:
effect_ner_model = pipeline(task="ner", model=model, tokenizer=tokenizer, device=0)

In [34]:
# an example from our held-out test split
effect_ner_model(labeled_dataset["test"][4]["text"])

Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.


[{'entity': 'LABEL_0',
  'score': 0.99228,
  'index': 1,
  'word': 'possible',
  'start': 0,
  'end': 8},
 {'entity': 'LABEL_3',
  'score': 0.9958305,
  'index': 2,
  'word': 'serotonin',
  'start': 9,
  'end': 18},
 {'entity': 'LABEL_4',
  'score': 0.99691534,
  'index': 3,
  'word': 'syndrome',
  'start': 19,
  'end': 27},
 {'entity': 'LABEL_0',
  'score': 0.9987228,
  'index': 4,
  'word': 'associated',
  'start': 28,
  'end': 38},
 {'entity': 'LABEL_0',
  'score': 0.9990265,
  'index': 5,
  'word': 'with',
  'start': 39,
  'end': 43},
 {'entity': 'LABEL_1',
  'score': 0.99770975,
  'index': 6,
  'word': 'clo',
  'start': 44,
  'end': 47},
 {'entity': 'LABEL_2',
  'score': 0.9988942,
  'index': 7,
  'word': '##mi',
  'start': 47,
  'end': 49},
 {'entity': 'LABEL_2',
  'score': 0.99920785,
  'index': 8,
  'word': '##pr',
  'start': 49,
  'end': 51},
 {'entity': 'LABEL_2',
  'score': 0.9991023,
  'index': 9,
  'word': '##amine',
  'start': 51,
  'end': 56},
 {'entity': 'LABEL_0',
  's

---
We try the first few examples from the Medications section of the Wikipedia page on adverse effects and visualize the predictions with the displaCy library:

https://en.wikipedia.org/wiki/Adverse_effect#Medications
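The pipeline returns one prediction per wordpiece (for example `clo`, `##mi`, `##pr`, `##amine`), labeled `LABEL_0` to `LABEL_4`. One way to read such output is to merge consecutive `B-`/`I-` tokens into entity spans using their character offsets. The sketch below does this on the token predictions shown earlier, with scores omitted. The function `merge_entities` is a hypothetical helper, and the sentence is cut off after the drug name because the full text is not shown in the output.

```python
label_list = ["O", "B-DRUG", "I-DRUG", "B-EFFECT", "I-EFFECT"]


def merge_entities(tokens, text):
    """Merge per-token LABEL_n predictions into contiguous entity spans."""
    spans = []
    current = None
    for token in tokens:
        tag = label_list[int(token["entity"].rsplit("_", 1)[-1])]
        if tag == "O":
            current = None
            continue
        prefix, kind = tag.split("-", 1)
        if prefix == "I" and current is not None and current["label"] == kind:
            current["end"] = token["end"]   # extend the open span
        else:
            current = {"start": token["start"], "end": token["end"], "label": kind}
            spans.append(current)
    for span in spans:
        span["text"] = text[span["start"]:span["end"]]
    return spans


# Assumed, truncated version of the test sentence.
text = "Possible serotonin syndrome associated with clomipramine"
tokens = [
    {"entity": "LABEL_0", "word": "possible", "start": 0, "end": 8},
    {"entity": "LABEL_3", "word": "serotonin", "start": 9, "end": 18},
    {"entity": "LABEL_4", "word": "syndrome", "start": 19, "end": 27},
    {"entity": "LABEL_0", "word": "associated", "start": 28, "end": 38},
    {"entity": "LABEL_0", "word": "with", "start": 39, "end": 43},
    {"entity": "LABEL_1", "word": "clo", "start": 44, "end": 47},
    {"entity": "LABEL_2", "word": "##mi", "start": 47, "end": 49},
    {"entity": "LABEL_2", "word": "##pr", "start": 49, "end": 51},
    {"entity": "LABEL_2", "word": "##amine", "start": 51, "end": 56},
]

entities = merge_entities(tokens, text)
for entity in entities:
    print(entity["label"], repr(entity["text"]))
# EFFECT 'serotonin syndrome'
# DRUG 'clomipramine'
```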

In [96]:
def visualize_entities(sentence):
    tokens = effect_ner_model(sentence)
    entities = []
    
    for token in tokens:
        label = int(token["entity"][-1])
        if label != 0:
            token["label"] = label_list[label]
            entities.append(token)
    
    params = [{"text": sentence,
               "ents": entities,
               "title": None}]
    
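    # inside a notebook, displacy.render displays the markup directly,
    # so there is no return value worth keeping
    displacy.render(params, style="ent", manual=True, options={
        "colors": {
            "B-DRUG": "#f08080",
            "I-DRUG": "#f08080",
            "B-EFFECT": "#9bddff",
            "I-EFFECT": "#9bddff",
        },
    })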
    

In [103]:
examples = [
    "Abortion, miscarriage or uterine hemorrhage associated with misoprostol (Cytotec), a labor-inducing drug.",
    "Addiction to many sedatives and analgesics, such as diazepam, morphine, etc.",
    "Birth defects associated with thalidomide",
    "Bleeding of the intestine associated with aspirin therapy",
    "Cardiovascular disease associated with COX-2 inhibitors (i.e. Vioxx)",
    "Deafness and kidney failure associated with gentamicin (an antibiotic)"
]

for example in examples:
    visualize_entities(example)
    print(f"{'*' * 50}\n")
