# This notebook will attempt to show how to compare two models

We often end up with several different models without being sure which one to focus on, or start from, for a particular task.
This notebook aims to introduce some tools that (hopefully) help us do that.

## Initial input - models and data

There are two different workflows this notebook can handle:
1. Compare two different model packs
  - Provide 2 model pack paths
  - Provide a documents file
2. Compare a model pack with and without supervised training
  - Provide 1 model pack path
  - Provide a file path to a MedCATtrainer (MCT) export
  - Provide a documents file

A model pack can be given either as a `.zip` file (which will be unzipped automatically) or as the unpacked folder.

The documents file is expected in `.csv` format with two columns (`id` and `text`).
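
As a quick illustration of that layout, the sketch below writes two made-up documents into the `id`/`text` format and reads them back. It uses an in-memory buffer in place of a real file, and the document contents are invented for the example.

```python
import csv
import io

# Two hypothetical documents; only the column names come from the text above.
rows = [
    {"id": "doc_0", "text": "Patient reports a headache since yesterday."},
    {"id": "doc_1", "text": "History of acute bronchitis, now resolved."},
]

# Write the rows in the expected two-column layout.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "text"])
writer.writeheader()
writer.writerows(rows)

# Read the data back and check that the header is what we expect.
buffer.seek(0)
reader = csv.DictReader(buffer)
assert reader.fieldnames == ["id", "text"]
documents = {row["id"]: row["text"] for row in reader}
print(f"Loaded {len(documents)} documents")
```

Running the same header check against your own file before starting the comparison can catch a misnamed column early.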

The MCT export is expected in the format given by MedCATtrainer.

The two workflows differ slightly internally.
In the second workflow, the second model chooser is used to select the MCT export.
Other than that and ticking the checkbox, the process is the same for the user.

In [9]:
from ipyfilechooser import FileChooser
from ipywidgets import widgets
import os
_def_path = '../../models/modelpack'
_def_path = _def_path if os.path.exists(_def_path) else '.'
model1_chooser = FileChooser(_def_path)
model2_chooser = FileChooser(_def_path)
documents_chooser = FileChooser(".")
display(model1_chooser)
display(model2_chooser)
display(documents_chooser)
ckbox = widgets.Checkbox(description="MCT export compare")
# show the checkbox so that the MCT export workflow can actually be selected
display(ckbox)

FileChooser(path='/Users/martratas/Documents/CogStack/.MedCAT.nosync/working_with_cogstack/models/modelpack', …

FileChooser(path='/Users/martratas/Documents/CogStack/.MedCAT.nosync/working_with_cogstack/models/modelpack', …

FileChooser(path='/Users/martratas/Documents/CogStack/.MedCAT.nosync/working_with_cogstack/medcat/compare_mode…

### CUI filter settings

These are optional.

You can restrict the comparison to a subset of CUIs.
Either list the CUIs in the text box or provide a file that lists them; in both cases, separate them with commas.

You can also include the children of the selected CUIs. The default is not to do so (any value that is not positive, such as the default `-1`, disables it).
But you can opt to include children up to a certain order (i.e. `1` means direct children only, `2` means children of children as well, and so on).
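
To make the "order" of children concrete, here is a minimal sketch of how a comma-separated CUI list could be parsed and expanded through a parent-to-children mapping. The function names and the toy hierarchy are assumptions made for this illustration; they are not the functions the comparison code uses internally.

```python
from typing import Dict, Set


def parse_cui_list(raw: str) -> Set[str]:
    """Split a comma-separated string of CUIs, ignoring blanks."""
    return {part.strip() for part in raw.split(",") if part.strip()}


def add_children(cuis: Set[str], pt2ch: Dict[str, Set[str]], order: int) -> Set[str]:
    """Return `cuis` plus their descendants up to `order` levels down."""
    result = set(cuis)
    frontier = set(cuis)
    for _ in range(order):
        # Children of the current level that we have not seen yet.
        frontier = {ch for cui in frontier for ch in pt2ch.get(cui, ())} - result
        if not frontier:
            break
        result |= frontier
    return result


# Toy hierarchy with made-up identifiers: A -> B -> C
toy_pt2ch = {"A": {"B"}, "B": {"C"}}
selected = parse_cui_list("A, ")
print(sorted(add_children(selected, toy_pt2ch, 0)))  # the selection itself
print(sorted(add_children(selected, toy_pt2ch, 1)))  # plus direct children
print(sorted(add_children(selected, toy_pt2ch, 2)))  # plus children of children
```

With order `0` only the listed CUIs remain, while each additional level pulls in one more generation of children.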

In [22]:
from ipywidgets import widgets
cui_filter_chooser = FileChooser(".", description="The CUI filter file")
cui_filter_box = widgets.Textarea(description="CUI list")
cui_children = widgets.IntText(description="Children", value=-1)
display(cui_filter_chooser)
display(cui_filter_box)
display(cui_children)

FileChooser(path='/Users/martratas/Documents/CogStack/.MedCAT.nosync/working_with_cogstack/medcat/compare_mode…

Textarea(value='', description='CUI list')

IntText(value=-1, description='Children')

In [30]:
model_path_1 = model1_chooser.selected
model_path_2 = model2_chooser.selected
documents_file = documents_chooser.selected
is_mct_export_compare = ckbox.value
if not is_mct_export_compare:
    print(f"For models, selected:\nModel1: {model_path_1}\nModel2: {model_path_2}"
          f"\nDocuments: {documents_file}")
else:
    print(f"Selected:\nModel: {model_path_1}\nMCT export: {model_path_2}"
          f"\nDocuments: {documents_file}")
# CUI filter
cui_filter = None
filter_children = None
if cui_filter_chooser.selected:
    cui_filter = cui_filter_chooser.selected
elif cui_filter_box.value:
    cui_filter = cui_filter_box.value
if cui_children.value and cui_children.value > 0:
    filter_children = cui_children.value
print(f"For CUI filter, selected:\nFilter: {cui_filter}\nChildren: {filter_children}")

For models, selected:
Model1: /Users/martratas/Documents/CogStack/.MedCAT.nosync/working_with_cogstack/models/modelpack/KCH2024_snomed_no_enrichment.zip
Model2: /Users/martratas/Documents/CogStack/.MedCAT.nosync/working_with_cogstack/models/modelpack/SNOMED2024_UK_FINAL_0c0de303b6dc0020.zip
Documents: /Users/martratas/Documents/CogStack/.MedCAT.nosync/working_with_cogstack/medcat/compare_models/data/some_synthetic_data.csv
For CUI filter, selected:
Filter: None
Children: None


### Running the difference finder

Now that we have the inputs, we can find out how the models behave and where they differ.
We use the `get_diffs_for` function, which loads both models, runs `CAT.get_entities` on each document with each model, and returns the results.

These results describe the differences in the raw CDB (i.e. the number of concepts (joint and unique), the amount of training, and so on), the total differences in the entities extracted (i.e. the number of recognitions and forms per CUI), and the per-document differences (i.e. the number of identical and differing entity recognitions).

We will look into the details later.
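
To give an intuition for what "joint" and "unique" mean in a CDB comparison, the sketch below compares the key sets of two small dictionaries. This is only an illustration with invented data and an invented helper (`compare_keys`); it is not the implementation behind `get_diffs_for`.

```python
from typing import Dict, Hashable, Iterable


def compare_keys(first: Iterable[Hashable], second: Iterable[Hashable]) -> Dict[str, int]:
    """Count shared keys and keys missing from either side."""
    keys1, keys2 = set(first), set(second)
    return {
        "joint": len(keys1 & keys2),
        "total_1": len(keys1),
        "total_2": len(keys2),
        "not_in_1": len(keys2 - keys1),
        "not_in_2": len(keys1 - keys2),
    }


# Toy data: made-up names mapped to made-up concept identifiers.
names_1 = {"headache": {"C1"}, "migraine": {"C2"}, "oak": {"C3"}}
names_2 = {"headache": {"C1"}, "migraine": {"C2"}, "year": {"C4"}, "cc": {"C5"}}
summary = compare_keys(names_1, names_2)
print(summary)

# Each total equals the joint count plus the keys the other side is missing.
assert summary["total_1"] == summary["joint"] + summary["not_in_2"]
assert summary["total_2"] == summary["joint"] + summary["not_in_1"]
```

The `names.keys` rows of the CDB overview in the output satisfy the same identity (752042 + 8241 = 760283 and 752042 + 33868 = 785910), which suggests that the first `not_in_` value counts keys missing from the first model and the second counts keys missing from the second.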

In [31]:
from compare import get_diffs_for
from output import parse_and_show, show_dict_deep, compare_dicts

cdb_comp, tally1, tally2, ann_diffs = get_diffs_for(
    model_path_1, model_path_2, documents_file,
    cui_filter=cui_filter, include_children_in_filter=filter_children,
    supervised_train_comparison_model=is_mct_export_compare)

Loading [1] ../../../MedCAT/temp/model_packs/20230227__kch_gstt_trained_model_494c3717f637bb89.zip




Loading [2] ../../../MedCAT/temp/model_packs/snomed2024_kch_trained_d4092ab9f5360973.zip
Per annotations diff finding


100%|██████████| 60/60 [00:09<00:00,  6.53it/s]


Counting [1&2]


100%|██████████| 60/60 [00:00<00:00, 10632.40it/s]


CDB compare


keys: 100%|██████████| 794151/794151 [00:01<00:00, 557600.58it/s]
keys: 100%|██████████| 794151/794151 [00:02<00:00, 308384.42it/s]


For now, we'll use the common parser/display method to display an overview of the results.
We can look at more granular details later.

In [3]:
# show results
parse_and_show(cdb_comp, tally1, tally2, ann_diffs)

CDB overall differences:


| Path | Value | [Optional] Comparison |
| ----- | ----- | ----- |
| names.keys.joint                         | 752042                                   |                                          |
| names.keys.total                         | 760283                                   | 785910                                   |
| names.keys.not_in_                       | 33868                                    | 8241                                     |
| names.values.joint                       | 2327941                                  |                                          |
| names.values.total                       | 3149859                                  | 2510372                                  |
| names.values.unique_in_                  | 752906                                   | 152108                                   |
| names.values.not_in_                     | 170834                                   | 810321                                   |
| snames.keys.joint                        | 752042                                   |                                          |
| snames.keys.total                        | 760283                                   | 785910                                   |
| snames.keys.not_in_                      | 33868                                    | 8241                                     |
| snames.values.joint                      | 5094031                                  |                                          |
| snames.values.total                      | 13486640                                 | 11958247                                 |
| snames.values.unique_in_                 | 1565939                                  | 349022                                   |
| snames.values.not_in_                    | 670099                                   | 2198492                                  |

Now tally differences


| Path | First | Second |
| ----- | ----- | ----- |
| pt2ch (Dict[str, Set])                   | 352226 keys (mean 2.0 values per key)    | 147466 keys (mean 2.0 values per key)    |
| cat_data                                 | {'Number of concepts': 760283, 'Number of names': 3080845, 'Number of concepts that received training': 38460, 'Number of seen training examples in total': 153875883, 'Average training examples per concept': 4000.932995319813} | {'Number of concepts': 785910, 'Number of names': 2480049, 'Number of concepts that received training': 373727, 'Number of seen training examples in total': 1474910653, 'Average training examples per concept': 3946.492099848285} |
| per_cui_count (Dict[str, int])           | 621 keys (total 2220 in value)           | 584 keys (total 2162 in value)           |
| per_cui_acc (Dict[str, float])           | 621 keys (mean 0.9029113037725474 in value) | 584 keys (mean 0.963999005716541 in value) |
| per_cui_forms (Dict[str, Set])           | 621 keys (mean 2.0 values per key)       | 584 keys (mean 2.0 values per key)       |
| per_type_counts (Dict[str, int])         | 25 keys (total 2220 in value)            | 24 keys (total 2162 in value)            |
| total_count                              | 2220                                     | 2162                                     |

Now per-annotation differences:


| Path | Value | [Optional] Comparison |
| ----- | ----- | ----- |
| IDENTICAL                                | 1406                                     |                                          |
| FIRST_HAS                                | 419                                      |                                          |
| SECOND_HAS                               | 361                                      |                                          |
| SAME_SPAN_DIFF_CONCEPT                   | 183                                      |                                          |
| SAME_GRANDPARENT                         | 1                                        |                                          |
| OVERLAPP_1ST_LARGER_DIFF_CONCEPT         | 129                                      |                                          |
| SAME_SPAN_CONCEPT_NOT_IN_2ND             | 18                                       |                                          |
| SAME_PARENT                              | 38                                       |                                          |
| OVERLAPP_2ND_LARGER_DIFF_CONCEPT         | 14                                       |                                          |
| OVERLAPP_1ST_LARGER_SAME_CONCEPT         | 9                                        |                                          |
| SAME_SPAN_CONCEPT_NOT_IN_1ST             | 3                                        |                                          |

## More granular details (per document view)

The above does not give us all the information we need.
For instance, we may also want to compare the performance across individual documents.
We can do so as follows.

In [4]:
# you can inspect individual parts of the results as well.
# for example, show the comparison counts for the first 10 documents
for key in list(ann_diffs.per_doc_results.keys())[:10]:
    print('='*20,f'\n{key}', f'\n{"="*20}')
    show_dict_deep(ann_diffs.per_doc_results[key].nr_of_comparisons)

doc_0 


| Path | Value | [Optional] Comparison |
| ----- | ----- | ----- |
| IDENTICAL                                | 41                                       |                                          |
| FIRST_HAS                                | 6                                        |                                          |
| SECOND_HAS                               | 6                                        |                                          |
| SAME_SPAN_DIFF_CONCEPT                   | 3                                        |                                          |
| SAME_GRANDPARENT                         | 1                                        |                                          |
| OVERLAPP_1ST_LARGER_DIFF_CONCEPT         | 4                                        |                                          |

doc_1 


| Path | Value | [Optional] Comparison |
| ----- | ----- | ----- |
| IDENTICAL                                | 28                                       |                                          |
| FIRST_HAS                                | 10                                       |                                          |
| SECOND_HAS                               | 5                                        |                                          |
| SAME_SPAN_CONCEPT_NOT_IN_2ND             | 1                                        |                                          |
| OVERLAPP_1ST_LARGER_DIFF_CONCEPT         | 3                                        |                                          |
| SAME_SPAN_DIFF_CONCEPT                   | 3                                        |                                          |
| SAME_PARENT                              | 2                                        |                                          |
| OVERLAPP_2ND_LARGER_DIFF_CONCEPT         | 1                                        |                                          |

doc_2 


| Path | Value | [Optional] Comparison |
| ----- | ----- | ----- |
| IDENTICAL                                | 33                                       |                                          |
| FIRST_HAS                                | 6                                        |                                          |
| SECOND_HAS                               | 6                                        |                                          |
| OVERLAPP_1ST_LARGER_DIFF_CONCEPT         | 2                                        |                                          |
| SAME_SPAN_DIFF_CONCEPT                   | 2                                        |                                          |
| SAME_PARENT                              | 1                                        |                                          |

doc_3 


| Path | Value | [Optional] Comparison |
| ----- | ----- | ----- |
| IDENTICAL                                | 26                                       |                                          |
| FIRST_HAS                                | 5                                        |                                          |
| OVERLAPP_1ST_LARGER_SAME_CONCEPT         | 2                                        |                                          |
| SECOND_HAS                               | 10                                       |                                          |
| SAME_SPAN_CONCEPT_NOT_IN_2ND             | 1                                        |                                          |
| SAME_SPAN_DIFF_CONCEPT                   | 10                                       |                                          |
| OVERLAPP_1ST_LARGER_DIFF_CONCEPT         | 3                                        |                                          |

doc_4 


| Path | Value | [Optional] Comparison |
| ----- | ----- | ----- |
| IDENTICAL                                | 25                                       |                                          |
| FIRST_HAS                                | 6                                        |                                          |
| OVERLAPP_1ST_LARGER_SAME_CONCEPT         | 1                                        |                                          |
| SECOND_HAS                               | 6                                        |                                          |
| SAME_SPAN_CONCEPT_NOT_IN_2ND             | 1                                        |                                          |
| SAME_SPAN_CONCEPT_NOT_IN_1ST             | 1                                        |                                          |
| SAME_SPAN_DIFF_CONCEPT                   | 2                                        |                                          |
| OVERLAPP_1ST_LARGER_DIFF_CONCEPT         | 3                                        |                                          |

doc_5 


| Path | Value | [Optional] Comparison |
| ----- | ----- | ----- |
| IDENTICAL                                | 29                                       |                                          |
| FIRST_HAS                                | 6                                        |                                          |
| SECOND_HAS                               | 6                                        |                                          |
| OVERLAPP_2ND_LARGER_DIFF_CONCEPT         | 1                                        |                                          |
| OVERLAPP_1ST_LARGER_DIFF_CONCEPT         | 1                                        |                                          |
| SAME_SPAN_DIFF_CONCEPT                   | 2                                        |                                          |

doc_6 


| Path | Value | [Optional] Comparison |
| ----- | ----- | ----- |
| IDENTICAL                                | 28                                       |                                          |
| FIRST_HAS                                | 9                                        |                                          |
| SECOND_HAS                               | 6                                        |                                          |
| SAME_SPAN_DIFF_CONCEPT                   | 2                                        |                                          |
| SAME_SPAN_CONCEPT_NOT_IN_2ND             | 1                                        |                                          |
| SAME_PARENT                              | 1                                        |                                          |
| OVERLAPP_1ST_LARGER_DIFF_CONCEPT         | 1                                        |                                          |

doc_7 


| Path | Value | [Optional] Comparison |
| ----- | ----- | ----- |
| IDENTICAL                                | 28                                       |                                          |
| FIRST_HAS                                | 7                                        |                                          |
| OVERLAPP_1ST_LARGER_SAME_CONCEPT         | 1                                        |                                          |
| SECOND_HAS                               | 4                                        |                                          |
| SAME_SPAN_CONCEPT_NOT_IN_2ND             | 1                                        |                                          |
| OVERLAPP_1ST_LARGER_DIFF_CONCEPT         | 1                                        |                                          |
| SAME_SPAN_DIFF_CONCEPT                   | 2                                        |                                          |
| OVERLAPP_2ND_LARGER_DIFF_CONCEPT         | 1                                        |                                          |

doc_8 


| Path | Value | [Optional] Comparison |
| ----- | ----- | ----- |
| IDENTICAL                                | 24                                       |                                          |
| FIRST_HAS                                | 6                                        |                                          |
| SECOND_HAS                               | 7                                        |                                          |
| OVERLAPP_1ST_LARGER_DIFF_CONCEPT         | 2                                        |                                          |
| SAME_SPAN_DIFF_CONCEPT                   | 3                                        |                                          |

doc_9 


| Path | Value | [Optional] Comparison |
| ----- | ----- | ----- |
| IDENTICAL                                | 12                                       |                                          |
| FIRST_HAS                                | 4                                        |                                          |
| SECOND_HAS                               | 6                                        |                                          |
| SAME_SPAN_DIFF_CONCEPT                   | 1                                        |                                          |
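
The per-document counts above can also be condensed into a single number per document, for example the share of identical annotations, to find the documents where the models disagree most. The sketch below uses plain dictionaries with counts copied from the `doc_0` and `doc_9` tables above; it assumes that `nr_of_comparisons` can be read as a mapping from comparison type to count, which may not match the real object exactly.

```python
from typing import Dict

# Counts copied from the doc_0 and doc_9 tables above.
per_doc_counts: Dict[str, Dict[str, int]] = {
    "doc_0": {"IDENTICAL": 41, "FIRST_HAS": 6, "SECOND_HAS": 6,
              "SAME_SPAN_DIFF_CONCEPT": 3, "SAME_GRANDPARENT": 1,
              "OVERLAPP_1ST_LARGER_DIFF_CONCEPT": 4},
    "doc_9": {"IDENTICAL": 12, "FIRST_HAS": 4, "SECOND_HAS": 6,
              "SAME_SPAN_DIFF_CONCEPT": 1},
}


def identical_share(counts: Dict[str, int]) -> float:
    """Fraction of comparisons in a document that were identical."""
    total = sum(counts.values())
    return counts.get("IDENTICAL", 0) / total if total else 0.0


# Documents with the lowest agreement come first.
ranked = sorted(per_doc_counts, key=lambda doc: identical_share(per_doc_counts[doc]))
for doc in ranked:
    print(doc, round(identical_share(per_doc_counts[doc]), 3))
```

Here `doc_9` ranks first because only 12 of its 23 comparisons are identical, against 41 of 61 for `doc_0`.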

## Saving annotation output to a CSV file
You can also save the annotation output to a `.csv` file. That file includes the following columns:
```
doc_id  text    ann1    ann2
```
where `doc_id` is the ID of the document in question, `text` is the relevant text around the specific annotation, `ann1` is the annotation JSON for model 1 (if present), and `ann2` is the annotation JSON for model 2 (if present).

*Note:* One of the annotations may not be present. This is the case if one of the models did not annotate that specific span.
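
A saved file with these columns can then be processed outside the notebook. The sketch below builds a small stand-in for such a file and separates the annotations that only one model produced. It assumes that a missing annotation is stored as an empty cell and that the annotation columns hold valid JSON; check both against your actual export. The two CUIs are taken from the `doc_2` output shown later in this notebook, and the `text` values are placeholders.

```python
import csv
import io
import json
from typing import Optional

# A stand-in for the exported file; the real export may quote or
# serialise the annotation columns differently.
sample = io.StringIO()
writer = csv.writer(sample)
writer.writerow(["doc_id", "text", "ann1", "ann2"])
writer.writerow(["doc_2", "... oak ...", json.dumps({"cui": "53347009"}), ""])
writer.writerow(["doc_2", "... year ...", "", json.dumps({"cui": "258707000"})])
sample.seek(0)


def load_ann(cell: str) -> Optional[dict]:
    """Parse an annotation cell, treating an empty cell as 'no annotation'."""
    return json.loads(cell) if cell.strip() else None


only_first, only_second = [], []
for row in csv.DictReader(sample):
    ann1, ann2 = load_ann(row["ann1"]), load_ann(row["ann2"])
    if ann1 is not None and ann2 is None:
        only_first.append(ann1["cui"])
    elif ann2 is not None and ann1 is None:
        only_second.append(ann2["cui"])

print(only_first, only_second)
```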

In [5]:
ann_diffs.to_csv("23vs24_annotations.csv")

## More granular details (per CUI view)

We may also want to look at how we did for a specific CUI.
This is how we can do that.

In [6]:
# cui = '37151006'  # Erythromelalgia
cui = '25064002'  # headache
per_cui1 = tally1.get_for_cui(cui, include_children=2)
per_cui2 = tally2.get_for_cui(cui, include_children=2)
compare_dicts(per_cui1, per_cui2)

| Path | First | Second |
| ----- | ----- | ----- |
| name                                     | Headache (and 76 children)               | Headache (and 96 children)               |
| count                                    | 12                                       | 18                                       |
| acc                                      | 3.0                                      | 3.0                                      |
| forms                                    | 3                                        | 3                                        |

## More granular details (per annotation view)
Sometimes we may want to look at things on a per-annotation basis as well.
That is, we want to pick individual annotations and compare them between the two models.

In [7]:
# we can iterate over annotation pairs.
# we may optionally specify the documents we wish to look at
# we will specify one document here so as to not generate too much output
docs = ['doc_2']
# by default, this will omit identical annotations
# but this can be changed by setting omit_identical=False
for doc_name, pair in ann_diffs.iter_ann_pairs(docs=docs, omit_identical=True):
    print('='*20,f'\n{doc_name} ({pair.comparison_type})', f'\n{"="*20}')
    # NOTE: if only one of the two has an annotation, the other one will be None
    #       the following will deal with that automatically, though
    compare_dicts(pair.one, pair.two)

doc_2 (AnnotationComparisonType.FIRST_HAS) 


| Path | First | Second |
| ----- | ----- | ----- |
| pretty_name                              | Genus Quercus                            |                                          |
| cui                                      | 53347009                                 |                                          |
| type_ids                                 | ['81102976']                             |                                          |
| types                                    | ['']                                     |                                          |
| source_value                             | Oak                                      |                                          |
| detected_name                            | oak                                      |                                          |
| acc                                      | 0.6368384509248382                       |                                          |
| context_similarity                       | 0.6368384509248382                       |                                          |
| start                                    | 43                                       |                                          |
| end                                      | 46                                       |                                          |
| icd10                                    | []                                       |                                          |
| ontologies                               | ['20220803_SNOMED_UK_CLINICAL_EXT']      |                                          |
| snomed                                   | []                                       |                                          |
| id                                       | 3                                        |                                          |
| meta_anns (Dict[str, dict])              | 3                                        | 0                                        |
| start-raw                                | 43                                       |                                          |
| end-raw                                  | 46                                       |                                          |

doc_2 (AnnotationComparisonType.FIRST_HAS) 


| Path | First | Second |
| ----- | ----- | ----- |
| pretty_name                              | Milliliter                               |                                          |
| cui                                      | 258773002                                |                                          |
| type_ids                                 | ['7882689']                              |                                          |
| types                                    | ['']                                     |                                          |
| source_value                             | CC                                       |                                          |
| detected_name                            | cc                                       |                                          |
| acc                                      | 0.5504460208011586                       |                                          |
| context_similarity                       | 0.5504460208011586                       |                                          |
| start                                    | 68                                       |                                          |
| end                                      | 70                                       |                                          |
| icd10                                    | []                                       |                                          |
| ontologies                               | ['20220803_SNOMED_UK_CLINICAL_EXT']      |                                          |
| snomed                                   | []                                       |                                          |
| id                                       | 5                                        |                                          |
| meta_anns (Dict[str, dict])              | 3                                        | 0                                        |
| start-raw                                | 68                                       |                                          |
| end-raw                                  | 70                                       |                                          |

doc_2 (AnnotationComparisonType.SECOND_HAS) 


| Path | First | Second |
| ----- | ----- | ----- |
| pretty_name                              |                                          | Acute bronchitis                         |
| cui                                      |                                          | 10509002                                 |
| type_ids                                 |                                          | ['9090192']                              |
| types                                    |                                          | ['disorder']                             |
| source_value                             |                                          | Acute bronchitis                         |
| detected_name                            |                                          | acute~bronchitis                         |
| acc                                      |                                          | 1.0                                      |
| context_similarity                       |                                          | 1.0                                      |
| start                                    |                                          | 72                                       |
| end                                      |                                          | 88                                       |
| icd10                                    |                                          | ['J205', 'J206', 'J208', 'J202', 'J207', 'J200', 'J201', 'J700', 'J209', 'J203', 'J204', 'J680'] |
| ontologies                               |                                          | ['SNOMED-CT']                            |
| snomed                                   |                                          | []                                       |
| id                                       |                                          | 6                                        |
| meta_anns (Dict[str, dict])              | 0                                        | 0                                        |
| start-raw                                |                                          | 72                                       |
| end-raw                                  |                                          | 88                                       |

doc_2 (AnnotationComparisonType.FIRST_HAS) 


| Path | First | Second |
| ----- | ----- | ----- |
| pretty_name                              | History of (contextual qualifier)        |                                          |
| cui                                      | 392521001                                |                                          |
| type_ids                                 | ['7882689']                              |                                          |
| types                                    | ['']                                     |                                          |
| source_value                             | HX                                       |                                          |
| detected_name                            | hx                                       |                                          |
| acc                                      | 1.0                                      |                                          |
| context_similarity                       | 1.0                                      |                                          |
| start                                    | 90                                       |                                          |
| end                                      | 92                                       |                                          |
| icd10                                    | []                                       |                                          |
| ontologies                               | ['20220803_SNOMED_UK_CLINICAL_EXT']      |                                          |
| snomed                                   | []                                       |                                          |
| id                                       | 9                                        |                                          |
| meta_anns (Dict[str, dict])              | 3                                        | 0                                        |
| start-raw                                | 90                                       |                                          |
| end-raw                                  | 92                                       |                                          |

doc_2 (AnnotationComparisonType.SECOND_HAS) 


| Path | First | Second |
| ----- | ----- | ----- |
| pretty_name                              |                                          | year                                     |
| cui                                      |                                          | 258707000                                |
| type_ids                                 |                                          | ['7882689']                              |
| types                                    |                                          | ['qualifier value']                      |
| source_value                             |                                          | year                                     |
| detected_name                            |                                          | year                                     |
| acc                                      |                                          | 0.99                                     |
| context_similarity                       |                                          | 0.99                                     |
| start                                    |                                          | 114                                      |
| end                                      |                                          | 118                                      |
| icd10                                    |                                          | []                                       |
| ontologies                               |                                          | ['SNOMED-CT']                            |
| snomed                                   |                                          | []                                       |
| id                                       |                                          | 9                                        |
| meta_anns (Dict[str, dict])              | 0                                        | 0                                        |
| start-raw                                |                                          | 114                                      |
| end-raw                                  |                                          | 118                                      |

doc_2 (AnnotationComparisonType.SECOND_HAS) 


| Path | First | Second |
| ----- | ----- | ----- |
| pretty_name                              |                                          | Old episode                              |
| cui                                      |                                          | 272131007                                |
| type_ids                                 |                                          | ['7882689']                              |
| types                                    |                                          | ['qualifier value']                      |
| source_value                             |                                          | old                                      |
| detected_name                            |                                          | old                                      |
| acc                                      |                                          | 0.9644956622075471                       |
| context_similarity                       |                                          | 0.9644956622075471                       |
| start                                    |                                          | 119                                      |
| end                                      |                                          | 122                                      |
| icd10                                    |                                          | []                                       |
| ontologies                               |                                          | ['SNOMED-CT']                            |
| snomed                                   |                                          | []                                       |
| id                                       |                                          | 10                                       |
| meta_anns (Dict[str, dict])              | 0                                        | 0                                        |
| start-raw                                |                                          | 119                                      |
| end-raw                                  |                                          | 122                                      |

doc_2 (AnnotationComparisonType.SECOND_HAS) 


| Path | First | Second |
| ----- | ----- | ----- |
| pretty_name                              |                                          | Male                                     |
| cui                                      |                                          | 248153007                                |
| type_ids                                 |                                          | ['67667581']                             |
| types                                    |                                          | ['finding']                              |
| source_value                             |                                          | male                                     |
| detected_name                            |                                          | male                                     |
| acc                                      |                                          | 0.99                                     |
| context_similarity                       |                                          | 0.99                                     |
| start                                    |                                          | 123                                      |
| end                                      |                                          | 127                                      |
| icd10                                    |                                          | ['#NC']                                  |
| ontologies                               |                                          | ['SNOMED-CT']                            |
| snomed                                   |                                          | []                                       |
| id                                       |                                          | 11                                       |
| meta_anns (Dict[str, dict])              | 0                                        | 0                                        |
| start-raw                                |                                          | 123                                      |
| end-raw                                  |                                          | 127                                      |

doc_2 (AnnotationComparisonType.FIRST_HAS) 


| Path | First | Second |
| ----- | ----- | ----- |
| pretty_name                              | Presentation                             |                                          |
| cui                                      | 246105001                                |                                          |
| type_ids                                 | ['43039974']                             |                                          |
| types                                    | ['']                                     |                                          |
| source_value                             | presents                                 |                                          |
| detected_name                            | present                                  |                                          |
| acc                                      | 0.4530222896013254                       |                                          |
| context_similarity                       | 0.4530222896013254                       |                                          |
| start                                    | 132                                      |                                          |
| end                                      | 140                                      |                                          |
| icd10                                    | []                                       |                                          |
| ontologies                               | ['20220803_SNOMED_UK_CLINICAL_EXT']      |                                          |
| snomed                                   | []                                       |                                          |
| id                                       | 15                                       |                                          |
| meta_anns (Dict[str, dict])              | 3                                        | 0                                        |
| start-raw                                | 132                                      |                                          |
| end-raw                                  | 140                                      |                                          |

doc_2 (AnnotationComparisonType.SECOND_HAS) 


| Path | First | Second |
| ----- | ----- | ----- |
| pretty_name                              |                                          | Report                                   |
| cui                                      |                                          | 229059009                                |
| type_ids                                 |                                          | ['90170645']                             |
| types                                    |                                          | ['record artifact']                      |
| source_value                             |                                          | reports                                  |
| detected_name                            |                                          | report                                   |
| acc                                      |                                          | 1.0                                      |
| context_similarity                       |                                          | 1.0                                      |
| start                                    |                                          | 179                                      |
| end                                      |                                          | 186                                      |
| icd10                                    |                                          | []                                       |
| ontologies                               |                                          | ['SNOMED-CT']                            |
| snomed                                   |                                          | []                                       |
| id                                       |                                          | 15                                       |
| meta_anns (Dict[str, dict])              | 0                                        | 0                                        |
| start-raw                                |                                          | 179                                      |
| end-raw                                  |                                          | 186                                      |

doc_2 (AnnotationComparisonType.OVERLAPP_1ST_LARGER_DIFF_CONCEPT) 


| Path | First | Second |
| ----- | ----- | ----- |
| pretty_name                              | Productive cough                         | Cough                                    |
| cui                                      | 28743005                                 | 49727002                                 |
| type_ids                                 | ['67667581']                             | ['67667581']                             |
| types                                    | ['']                                     | ['finding']                              |
| source_value                             | cough productive                         | cough                                    |
| detected_name                            | cough~productive                         | cough                                    |
| acc                                      | 1.0                                      | 1.0                                      |
| context_similarity                       | 1.0                                      | 1.0                                      |
| start                                    | 189                                      | 189                                      |
| end                                      | 205                                      | 194                                      |
| icd10                                    | ['R05']                                  | ['R05X', 'J410', 'J111', 'F453', 'R042'] |
| ontologies                               | ['20220803_SNOMED_UK_CLINICAL_EXT']      | ['SNOMED-CT']                            |
| snomed                                   | []                                       | []                                       |
| id                                       | 21                                       | 16                                       |
| meta_anns (Dict[str, dict])              | 3                                        | 0                                        |
| start-raw                                | 189                                      | 189                                      |
| end-raw                                  | 205                                      | 194                                      |

doc_2 (AnnotationComparisonType.SAME_SPAN_DIFF_CONCEPT) 


| Path | First | Second |
| ----- | ----- | ----- |
| pretty_name                              | In the past                              | Past                                     |
| cui                                      | 410513005                                | 716861000000108                          |
| type_ids                                 | ['7882689']                              | ['90170645']                             |
| types                                    | ['']                                     | ['record artifact']                      |
| source_value                             | past                                     | past                                     |
| detected_name                            | past                                     | past                                     |
| acc                                      | 0.915658888260423                        | 0.99                                     |
| context_similarity                       | 0.915658888260423                        | 0.99                                     |
| start                                    | 200                                      | 200                                      |
| end                                      | 204                                      | 204                                      |
| icd10                                    | []                                       | []                                       |
| ontologies                               | ['20220803_SNOMED_UK_CLINICAL_EXT']      | ['SNOMED-CT']                            |
| snomed                                   | []                                       | []                                       |
| id                                       | 31                                       | 26                                       |
| meta_anns (Dict[str, dict])              | 3                                        | 0                                        |
| start-raw                                | 278                                      | 278                                      |
| end-raw                                  | 282                                      | 282                                      |

doc_2 (AnnotationComparisonType.FIRST_HAS) 


| Path | First | Second |
| ----- | ----- | ----- |
| pretty_name                              | Mitral valve regurgitation               |                                          |
| cui                                      | 48724000                                 |                                          |
| type_ids                                 | ['9090192']                              |                                          |
| types                                    | ['']                                     |                                          |
| source_value                             | Mr                                       |                                          |
| detected_name                            | mr                                       |                                          |
| acc                                      | 0.3131859075574164                       |                                          |
| context_similarity                       | 0.3131859075574164                       |                                          |
| start                                    | 200                                      |                                          |
| end                                      | 202                                      |                                          |
| icd10                                    | ['I34.0']                                |                                          |
| ontologies                               | ['20220803_SNOMED_UK_CLINICAL_EXT']      |                                          |
| snomed                                   | []                                       |                                          |
| id                                       | 36                                       |                                          |
| meta_anns (Dict[str, dict])              | 3                                        | 0                                        |
| start-raw                                | 341                                      |                                          |
| end-raw                                  | 343                                      |                                          |

doc_2 (AnnotationComparisonType.SECOND_HAS) 


| Path | First | Second |
| ----- | ----- | ----- |
| pretty_name                              |                                          | Medical                                  |
| cui                                      |                                          | 74188005                                 |
| type_ids                                 |                                          | ['7882689']                              |
| types                                    |                                          | ['qualifier value']                      |
| source_value                             |                                          | medical                                  |
| detected_name                            |                                          | medical                                  |
| acc                                      |                                          | 1.0                                      |
| context_similarity                       |                                          | 1.0                                      |
| start                                    |                                          | 200                                      |
| end                                      |                                          | 207                                      |
| icd10                                    |                                          | []                                       |
| ontologies                               |                                          | ['SNOMED-CT']                            |
| snomed                                   |                                          | []                                       |
| id                                       |                                          | 31                                       |
| meta_anns (Dict[str, dict])              | 0                                        | 0                                        |
| start-raw                                |                                          | 360                                      |
| end-raw                                  |                                          | 367                                      |

doc_2 (AnnotationComparisonType.FIRST_HAS) 


| Path | First | Second |
| ----- | ----- | ----- |
| pretty_name                              | Respiratory rate                         |                                          |
| cui                                      | 86290005                                 |                                          |
| type_ids                                 | ['2680757']                              |                                          |
| types                                    | ['']                                     |                                          |
| source_value                             | respiratory                              |                                          |
| detected_name                            | respiratory                              |                                          |
| acc                                      | 0.370196740030693                        |                                          |
| context_similarity                       | 0.370196740030693                        |                                          |
| start                                    | 200                                      |                                          |
| end                                      | 211                                      |                                          |
| icd10                                    | []                                       |                                          |
| ontologies                               | ['20220803_SNOMED_UK_CLINICAL_EXT']      |                                          |
| snomed                                   | []                                       |                                          |
| id                                       | 50                                       |                                          |
| meta_anns (Dict[str, dict])              | 3                                        | 0                                        |
| start-raw                                | 497                                      |                                          |
| end-raw                                  | 508                                      |                                          |

doc_2 (AnnotationComparisonType.SAME_SPAN_DIFF_CONCEPT) 


| Path | First | Second |
| ----- | ----- | ----- |
| pretty_name                              | Disease                                  | Condition                                |
| cui                                      | 64572001                                 | 260905004                                |
| type_ids                                 | ['9090192']                              | ['43039974']                             |
| types                                    | ['']                                     | ['attribute']                            |
| source_value                             | conditions                               | conditions                               |
| detected_name                            | condition                                | condition                                |
| acc                                      | 0.5839914477394028                       | 1.0                                      |
| context_similarity                       | 0.5839914477394028                       | 1.0                                      |
| start                                    | 200                                      | 200                                      |
| end                                      | 210                                      | 210                                      |
| icd10                                    | ['']                                     | []                                       |
| ontologies                               | ['20220803_SNOMED_UK_CLINICAL_EXT']      | ['SNOMED-CT']                            |
| snomed                                   | []                                       | []                                       |
| id                                       | 51                                       | 44                                       |
| meta_anns (Dict[str, dict])              | 3                                        | 0                                        |
| start-raw                                | 509                                      | 509                                      |
| end-raw                                  | 519                                      | 519                                      |

doc_2 (AnnotationComparisonType.OVERLAPP_1ST_LARGER_DIFF_CONCEPT) 


| Path | First | Second |
| ----- | ----- | ----- |
| pretty_name                              | Alcoholic beverage intake                | Substance with alcohol structure         |
| cui                                      | 897148007                                | 53041004                                 |
| type_ids                                 | ['2680757']                              | ['91187746']                             |
| types                                    | ['']                                     | ['substance']                            |
| source_value                             | alcohol consumption                      | alcohol                                  |
| detected_name                            | alcohol~consumption                      | alcohol                                  |
| acc                                      | 1.0                                      | 1.0                                      |
| context_similarity                       | 1.0                                      | 1.0                                      |
| start                                    | 200                                      | 200                                      |
| end                                      | 219                                      | 207                                      |
| icd10                                    | []                                       | []                                       |
| ontologies                               | ['20220803_SNOMED_UK_CLINICAL_EXT']      | ['SNOMED-CT']                            |
| snomed                                   | []                                       | []                                       |
| id                                       | 64                                       | 55                                       |
| meta_anns (Dict[str, dict])              | 3                                        | 0                                        |
| start-raw                                | 596                                      | 596                                      |
| end-raw                                  | 615                                      | 603                                      |

doc_2 (AnnotationComparisonType.SAME_PARENT) 


| Path | First | Second |
| ----- | ----- | ----- |
| pretty_name                              | Physical examination procedure           | Examination - action                     |
| cui                                      | 5880005                                  | 302199004                                |
| type_ids                                 | ['28321150']                             | ['7882689']                              |
| types                                    | ['']                                     | ['qualifier value']                      |
| source_value                             | examination                              | examination                              |
| detected_name                            | examination                              | examination                              |
| acc                                      | 0.99                                     | 1.0                                      |
| context_similarity                       | 0.99                                     | 1.0                                      |
| start                                    | 200                                      | 200                                      |
| end                                      | 211                                      | 211                                      |
| icd10                                    | []                                       | []                                       |
| ontologies                               | ['20220803_SNOMED_UK_CLINICAL_EXT']      | ['SNOMED-CT']                            |
| snomed                                   | []                                       | []                                       |
| id                                       | 66                                       | 57                                       |
| meta_anns (Dict[str, dict])              | 3                                        | 0                                        |
| start-raw                                | 628                                      | 628                                      |
| end-raw                                  | 639                                      | 639                                      |
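
The labels in the tables above can be roughly reproduced from the `start`, `end` and `cui` fields alone. The following is a minimal sketch under that assumption, not the comparison code the notebook actually runs: `classify_pair` is a hypothetical helper, the `"OTHER"` label is a placeholder for every case not shown above, and the ontology lookup needed for `SAME_PARENT` is left out (such a pair falls through to `SAME_SPAN_DIFF_CONCEPT` here).

```python
from typing import Optional


def classify_pair(first: Optional[dict], second: Optional[dict]) -> str:
    """Hypothetical helper: label an annotation pair using span and CUI only."""
    if first is None and second is None:
        raise ValueError("at least one annotation is required")
    if second is None:
        return "FIRST_HAS"   # only the first model annotated this span
    if first is None:
        return "SECOND_HAS"  # only the second model annotated this span

    same_cui = first["cui"] == second["cui"]
    s1, e1 = first["start"], first["end"]
    s2, e2 = second["start"], second["end"]

    if (s1, e1) == (s2, e2):
        # Identical spans: only the concept can differ.
        return "OTHER" if same_cui else "SAME_SPAN_DIFF_CONCEPT"

    overlapping = s1 < e2 and s2 < e1
    first_larger = (e1 - s1) > (e2 - s2)
    if overlapping and first_larger and not same_cui:
        return "OVERLAPP_1ST_LARGER_DIFF_CONCEPT"
    return "OTHER"  # placeholder for cases not shown in the tables above


# Values copied from the "Productive cough" / "Cough" table above.
cough_first = {"cui": "28743005", "start": 189, "end": 205}
cough_second = {"cui": "49727002", "start": 189, "end": 194}
print(classify_pair(cough_first, cough_second))

# Values copied from the "Presentation" table above (first model only).
presentation = {"cui": "246105001", "start": 132, "end": 140}
print(classify_pair(presentation, None))
```

The sketch compares the `start` and `end` columns rather than `start-raw` and `end-raw`; for the rows used here both pairs describe the same span lengths, so either choice gives the same label.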