# FindingModel Index

In [1]:
from pprint import pprint

from findingmodel import ChoiceAttributeIded, ChoiceValueIded, FindingModelFull, NumericAttributeIded
from findingmodel.index import Index

`Index` is a list of the basic metadata about each model. It can be loaded from a JSONL file,
or it can be built from a directory of `*.fm.json` files containing model definitions.
An index is created from the path to a base directory that contains a `defs` directory holding the
model definitions. The base directory may also contain an `index.jsonl` file alongside the `defs`
directory. The [findingmodels repository](https://github.com/openimagingdata/findingmodels) follows this layout.

In [2]:
index = Index("data")

In [3]:
len(index)

6
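
To make the directory layout described above concrete, here is a minimal standard-library sketch that scans a `defs` directory for `*.fm.json` files and writes their basic metadata to an `index.jsonl` file. It does not use or reproduce the `findingmodel` implementation: the helper names `scan_definitions` and `write_jsonl` are hypothetical, and the set of fields kept per entry is an assumption based on the entry output shown in this notebook.

```python
import json
import tempfile
from pathlib import Path


def scan_definitions(base_dir: Path) -> list[dict]:
    """Collect basic metadata from every *.fm.json file in base_dir/defs.

    Hypothetical helper: the real Index may keep more fields than these.
    """
    entries = []
    for path in sorted((base_dir / "defs").glob("*.fm.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        entries.append(
            {
                "oifm_id": data["oifm_id"],
                "name": data["name"],
                "synonyms": data.get("synonyms"),
                "filename": path.name,
            }
        )
    return entries


def write_jsonl(entries: list[dict], target: Path) -> None:
    """Write one JSON object per line, which is the JSONL convention."""
    with target.open("w", encoding="utf-8") as fh:
        for entry in entries:
            fh.write(json.dumps(entry) + "\n")


# Demonstration in a throwaway directory with a single stub definition.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "defs").mkdir()
    stub = {
        "oifm_id": "OIFM_MSFT_134126",
        "name": "abdominal aortic aneurysm",
        "synonyms": ["AAA"],
    }
    (base / "defs" / "abdominal_aortic_aneurysm.fm.json").write_text(
        json.dumps(stub), encoding="utf-8"
    )

    scanned = scan_definitions(base)
    write_jsonl(scanned, base / "index.jsonl")

    # Reading the JSONL file back yields the same metadata records.
    lines = (base / "index.jsonl").read_text(encoding="utf-8").splitlines()
    reloaded = [json.loads(line) for line in lines]
```

Keeping the metadata in a single JSONL file means the index can be loaded without opening every definition file, which is presumably why the optional `index.jsonl` sits next to `defs`.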

Get an entry from the index using its ID, its name, or one of its synonyms. An entry contains metadata only; use `load_model()` to get the full finding model.

In [4]:
entry = index["abdominal aortic aneurysm"]
if entry:
    pprint(entry.model_dump())
else:
    print("Entry not found.")

{'attributes': [{'attribute_id': 'OIFMA_MSFT_898601',
                 'name': 'presence',
                 'type': 'choice'},
                {'attribute_id': 'OIFMA_MSFT_783072',
                 'name': 'change from prior',
                 'type': 'choice'}],
 'contributors': ['HeatherChase', 'MSFT'],
 'description': 'An abdominal aortic aneurysm (AAA) is a localized dilation of '
                'the abdominal aorta, typically defined as a diameter greater '
                'than 3 cm, which can lead to rupture and significant '
                'morbidity or mortality.',
 'filename': 'abdominal_aortic_aneurysm.fm.json',
 'name': 'abdominal aortic aneurysm',
 'oifm_id': 'OIFM_MSFT_134126',
 'synonyms': ['AAA'],
 'tags': None}


In [5]:
pprint([entry.name for entry in index.entries])

['Ventricular diameters',
 'Mammographic malignancy assessment',
 'pulmonary embolism',
 'abdominal aortic aneurysm',
 'Breast density',
 'aortic dissection']


## Add/Remove Model

Note that adding a model performs a number of checks, notably for duplicate IDs, duplicate names, and duplicate synonyms.

In [6]:
new_model = FindingModelFull(
    oifm_id="OIFM_TEST_123456",
    name="Test Model",
    description="A simple test finding model.",
    synonyms=["Test Synonym"],
    tags=["tag1", "tag2"],
    attributes=[
        ChoiceAttributeIded(
            oifma_id="OIFMA_TEST_123456",
            name="Severity",
            description="How severe is the finding?",
            values=[
                ChoiceValueIded(value_code="OIFMA_TEST_123456.0", name="Mild"),
                ChoiceValueIded(value_code="OIFMA_TEST_123456.1", name="Severe"),
            ],
            required=True,
            max_selected=1,
        ),
        NumericAttributeIded(
            oifma_id="OIFMA_TEST_654321",
            name="Size",
            description="Size of the finding.",
            minimum=1,
            maximum=10,
            unit="cm",
            required=False,
        ),
    ],
)

In [7]:
index.add_entry(new_model, "test_model.fm.json")

In [8]:
pprint([entry.name for entry in index.entries])

['Ventricular diameters',
 'Mammographic malignancy assessment',
 'pulmonary embolism',
 'abdominal aortic aneurysm',
 'Breast density',
 'aortic dissection',
 'Test Model']


In [9]:
index.remove_entry("Test Model")
print(index["Test Model"])

None
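
The duplicate checks mentioned at the start of this section can be sketched as follows. This is a rough illustration under stated assumptions, not the checks that `add_entry()` actually runs: `find_conflicts` is a hypothetical helper, entries are plain dictionaries, and comparing names and synonyms case-insensitively is an assumption.

```python
def find_conflicts(existing: list[dict], candidate: dict) -> list[str]:
    """Describe any ID, name, or synonym clashes (hypothetical helper)."""
    used_ids = {e["oifm_id"] for e in existing}
    # Assumption: names and synonyms are compared case-insensitively.
    used_names = {e["name"].casefold() for e in existing}
    used_synonyms = {
        s.casefold() for e in existing for s in (e.get("synonyms") or [])
    }

    problems = []
    if candidate["oifm_id"] in used_ids:
        problems.append(f"duplicate ID: {candidate['oifm_id']}")
    if candidate["name"].casefold() in used_names:
        problems.append(f"duplicate name: {candidate['name']}")
    for synonym in candidate.get("synonyms") or []:
        if synonym.casefold() in used_synonyms:
            problems.append(f"duplicate synonym: {synonym}")
    return problems


existing = [
    {
        "oifm_id": "OIFM_MSFT_134126",
        "name": "abdominal aortic aneurysm",
        "synonyms": ["AAA"],
    }
]
ok = {
    "oifm_id": "OIFM_TEST_123456",
    "name": "Test Model",
    "synonyms": ["Test Synonym"],
}
clash = {
    "oifm_id": "OIFM_TEST_654321",
    "name": "Another Model",
    "synonyms": ["aaa"],
}

ok_problems = find_conflicts(existing, ok)
clash_problems = find_conflicts(existing, clash)
```

Returning a list of problems rather than raising on the first one lets a caller report every clash at once; whether the library raises, returns, or logs is not shown in this notebook.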


## Get Full Model from File

In [10]:
model = index.load_model("abdominal aortic aneurysm")
if model:
    print(model.model_dump_json(indent=2, exclude_none=True))
else:
    print("Model not found.")

{
  "oifm_id": "OIFM_MSFT_134126",
  "name": "abdominal aortic aneurysm",
  "description": "An abdominal aortic aneurysm (AAA) is a localized dilation of the abdominal aorta, typically defined as a diameter greater than 3 cm, which can lead to rupture and significant morbidity or mortality.",
  "synonyms": [
    "AAA"
  ],
  "contributors": [
    {
      "github_username": "HeatherChase",
      "email": "heatherchase@microsoft.com",
      "name": "Heather Chase",
      "organization_code": "MSFT",
      "url": "https://www.linkedin.com/in/heatherwalkerchase/"
    },
    {
      "name": "Microsoft",
      "code": "MSFT",
      "url": "https://microsoft.com/"
    }
  ],
  "attributes": [
    {
      "oifma_id": "OIFMA_MSFT_898601",
      "name": "presence",
      "description": "Presence or absence of abdominal aortic aneurysm",
      "type": "choice",
      "values": [
        {
          "value_code": "OIFMA_MSFT_898601.0",
          "name": "absent",
          "description": "Abdomina

## Name Search

Find names that fuzzily match a target string. This can be useful for spotting potential duplicates before inserting a model, or
simply as a quick general search. Hits can come from names or synonyms, but not from descriptions.

In [11]:
results = index.find_similar_names("abdomen")
pprint(results)

[('abdominal aortic aneurysm', 77.14285714285715),
 ('Breast density', 51.42857142857142),
 ('Mammographic density', 51.300000000000004)]


In [12]:
results = index.find_similar_names("breast")
pprint(results)

[('Breast density', 90.0),
 ('Risk of breast cancer', 90.0),
 ('Breast cancer risk assessment', 90.0)]


In [13]:
results = index.find_similar_names("mammogram")
pprint(results)

[('Mammographic malignancy assessment', 84.70588235294117),
 ('Mammographic density', 84.70588235294117),
 ('aortic dissection', 45.0)]
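
For intuition about how fuzzy name scoring works, here is a small stand-in built on the standard library's `difflib.SequenceMatcher`. It is not the scorer behind `find_similar_names()`, so its scores will differ from the outputs above; the `similar_names` function and its 0 to 100 scale are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similar_names(
    query: str, candidates: list[str], limit: int = 3
) -> list[tuple[str, float]]:
    """Rank candidates by similarity to query on a 0-100 scale.

    Hypothetical stand-in: uses difflib, not the library's own scorer.
    """
    q = query.casefold()
    scored = [
        (name, round(100 * SequenceMatcher(None, q, name.casefold()).ratio(), 1))
        for name in candidates
    ]
    # Highest score first; ties keep their original order (sort is stable).
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


names = [
    "Ventricular diameters",
    "pulmonary embolism",
    "abdominal aortic aneurysm",
    "Breast density",
    "aortic dissection",
]

exact = similar_names("breast density", names)
typo = similar_names("abdominal aortic aneurism", names)
```

An exact match (ignoring case) scores 100, and a one-letter misspelling still ranks the intended name first, which is the behavior that makes this kind of search useful for catching near-duplicates.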
