<H1>ANKH</H1>

In [1]:
import phenopackets as php
from google.protobuf.json_format import MessageToDict, MessageToJson
from google.protobuf.json_format import Parse, ParseDict
import pandas as pd
pd.set_option('display.max_colwidth', None) # show entire column contents, important!
from collections import defaultdict
import os
import sys

sys.path.insert(0, os.path.abspath('../../pyphetools'))
from pyphetools import *

<h3>Import HPO Data</h3>

In [2]:
# Use a distinct name so the HPO parser is not shadowed by the CaseParser created below
hpo_parser = HpoParser()
hpo_cr = hpo_parser.get_hpo_concept_recognizer()

<H1>Importing a single case report</H1>
<p>Here, we use functions of the pyphetools package to import data from a typical case report: <a href="https://pubmed.ncbi.nlm.nih.gov/33748234/" target="_blank">Wu JL, et al.</a> A three-year clinical investigation of a Chinese child with craniometaphyseal dysplasia caused by a mutated ANKH gene. World J Clin Cases. 2021 Mar 16;9(8):1853-1862.</p>
<p>The case report consists of several sections to which we can apply text mining and add some corrections for cases in which text mining fails to capture an HPO term or calls a false-positive term.</p>
<p>The basic strategy is to call the <tt>add_vignette</tt> function on each section, inspect the results manually, and add any missed terms through a custom dictionary that maps a phrase in the text to an HPO label (see examples below).</p>
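<p>To make the role of the custom dictionary concrete, here is a minimal, self-contained sketch. It is not the pyphetools implementation: <tt>find_terms</tt> and its plain substring matching are assumptions made for illustration only. It shows how a phrase-to-label mapping can recover a term that exact label matching misses.</p>

```python
def find_terms(text, known_labels, custom_d=None):
    """Return labels whose trigger phrase occurs in text (case-insensitive).

    Hypothetical helper for illustration; real concept recognition is more involved.
    """
    custom_d = custom_d or {}
    lowered = text.lower()
    hits = []
    # Custom phrases map free text to a label that plain label matching would miss.
    for phrase, label in custom_d.items():
        if phrase.lower() in lowered and label not in hits:
            hits.append(label)
    # Plain matching: the label itself must appear in the text.
    for label in known_labels:
        if label.lower() in lowered and label not in hits:
            hits.append(label)
    return hits


text = "The patient showed low bone mineral density and a prominent forehead."
labels = ["Prominent forehead", "Snoring"]
custom = {"low bone mineral density": "Reduced bone mineral density"}

print(find_terms(text, labels))          # ['Prominent forehead']
print(find_terms(text, labels, custom))  # ['Reduced bone mineral density', 'Prominent forehead']
```

<p>Without the custom entry, "low bone mineral density" goes unmatched because the label is worded differently from the text.</p>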

In [3]:
pmid = "PMID:33748234"
age = "P1Y5M"
parser = CaseParser(concept_recognizer=hpo_cr, pmid=pmid, age_at_last_exam=age)
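<p>The age strings used in this notebook ("P1Y5M" for 17 months) follow the ISO 8601 duration format. A small helper such as the one below can derive them from an age in months; <tt>months_to_iso8601</tt> is a hypothetical name and is not part of pyphetools.</p>

```python
def months_to_iso8601(months):
    """Convert an age in whole months to an ISO 8601 duration such as 'P1Y5M'."""
    if months < 0:
        raise ValueError("age in months must be non-negative")
    years, rest = divmod(months, 12)
    duration = "P"
    if years:
        duration += f"{years}Y"
    # Emit the month part when it is non-zero, or when there is no year part at all.
    if rest or not years:
        duration += f"{rest}M"
    return duration


print(months_to_iso8601(17))  # P1Y5M
print(months_to_iso8601(20))  # P1Y8M
```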

<h2>Chief complaints</h2>
<p>A 17-mo-old boy presented with progressive nasal obstruction, snoring and hearing loss symptoms when referred to the hospital.</p>

In [4]:
vignette = "A 17-mo-old boy presented with progressive nasal obstruction, snoring and hearing loss symptoms when referred to the hospital."
results = parser.add_vignette(vignette=vignette)

In [5]:
results

Unnamed: 0,id,label,observed,measured
0,HP:0000365,Hearing impairment,True,True
1,HP:0001742,Nasal congestion,True,True
2,HP:0025267,Snoring,True,True


<h2>History of present illness</h2>

In [6]:
v2 = """
The patient’s medical history was first reviewed before the diagnosis. His head circumference was 45.5 cm, 
46.5 cm and 49.5 cm at age 3 mo, 6 mo and 12 mo, respectively. When he was 6 mo old, fiber nasopharyngoscopy 
revealed a double choanal stenosis. The patient was found to have a serious nasal obstruction at the age of 
12 mo due to a wide nasal bridge. Occasionally, he resorted to mouth breathing, especially at night. 
The patient was examined at the local hospital, showed low bone mineral density and commenced 
oral calcium supplements. His head circumference increased to 51 cm (standard value 45.2 cm). 
Consequently, the patient developed a prominent forehead, prognathism and occipital protuberance. 
At the age of 16 mo, the patient presented with mild hearing loss. He had been receiving calcium and 
vitamin D supplementation for 4 mo prior to examination at other hospital; however, the patient’s symptoms 
developed progressively. 
"""
d = {'low bone mineral density':'Reduced bone mineral density',
    'head circumference increased to 51 cm':'Macrocephaly'}
results = parser.add_vignette(vignette=v2, custom_d=d)

In [7]:
results


Unnamed: 0,id,label,observed,measured
0,HP:0004349,Reduced bone mineral density,True,True
1,HP:0000256,Macrocephaly,True,True
2,HP:0000303,Mandibular prognathia,True,True
3,HP:0000365,Hearing impairment,True,True
4,HP:0000431,Wide nasal bridge,True,True
5,HP:0000452,Choanal stenosis,True,True
6,HP:0001742,Nasal congestion,True,True
7,HP:0011220,Prominent forehead,True,True


<h2>Physical examination</h2>

In [8]:
v3 = """
A wide nasal bridge, paranasal bossing, widely spaced eyes with an increased bizygomatic width, and 
prominent mandible (Figure 1) were noted. However, hypertelorism was not obviously discernible. 
Additionally, the patient’s frontal and maxillary sinuses were severely obstructed. He had 20 teeth 
with wide spacing between the teeth. His teeth appeared small. He exhibited no facial nerve palsy or 
limb muscle tension. His pain perception and muscular strength appeared normal. Nasal laryngeal mirror 
showed serious choanal stenosis on both sides. The bottom of the patient’s nose exhibited bossing and 
his palatine bone appeared thickened. The patient’s parents and his elder brother had completely normal 
features.
"""
d = {
    'wide spacing between the teeth': 'Widely spaced teeth'
}
results = parser.add_vignette(vignette=v3, custom_d=d)
results

Unnamed: 0,id,label,observed,measured
0,HP:0000687,Widely spaced teeth,True,True
1,HP:0000303,Mandibular prognathia,True,True
2,HP:0000316,Hypertelorism,True,True
3,HP:0000316,Hypertelorism,True,True
4,HP:0000431,Wide nasal bridge,True,True
5,HP:0000452,Choanal stenosis,True,True
6,HP:0010628,Facial palsy,True,True
7,HP:0012531,Pain,True,True
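<p>The table above lists Facial palsy as observed, while the vignette reads "no facial nerve palsy", so results like this need manual review. As a rough aid, the sketch below flags a phrase when a negation cue appears shortly before it. The cue list, the three-word window, and the name <tt>is_negated</tt> are assumptions for illustration; this is not how pyphetools handles negation.</p>

```python
import re

# Assumed, deliberately short list of negation cues.
NEGATION_CUES = ("no", "not", "without", "denied")


def is_negated(sentence, phrase, window=3):
    """Return True if a negation cue occurs within `window` words before `phrase`."""
    words = re.findall(r"[a-z]+", sentence.lower())
    target = re.findall(r"[a-z]+", phrase.lower())
    n = len(target)
    for i in range(len(words) - n + 1):
        if words[i:i + n] == target:
            preceding = words[max(0, i - window):i]
            if any(cue in preceding for cue in NEGATION_CUES):
                return True
    return False


sentence = "He exhibited no facial nerve palsy or limb muscle tension."
print(is_negated(sentence, "facial nerve palsy"))                         # True
print(is_negated("A wide nasal bridge was noted.", "wide nasal bridge"))  # False
```

<p>A word-window heuristic like this misses many constructions, so it should only prompt a closer look at the source sentence.</p>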


<h2>Radiograph</h2>

In [9]:
v4 = """
Radiograph and facial appearance of the child aged 17 mo. A and B: Cranial computed tomography (CT) scan 
shows significantly increased bone density and thickened bone plate of the skull. The sinus cavity was small 
without inflation, and the nasal cavity was obviously narrowed. The nasal bone was thickened with abnormal 
morphology; C: CT scan shows that the middle ear cavities were narrowed; the lumen of the labyrinth 
(vestibular, semicircular canal and cochlear) was sclerotic and the ossicular chain was thickened. The width 
of the left optic canal was 3.93 mm, the width of the right optic canal was 4.17 mm; D: CT scan shows 
sclerosis of the clavicles and ribs; E: X-ray image shows pronounced metaphyseal flaring in the distal 
femora and “Flask deformation” of the proximal metaphysis on both sides (Erlenmeyer flask configuration); 
F: Facial appearance of the patient shows a wide nasal bridge, paranasal bossing, widely spaced eyes with 
an increased bizygomatic width, and a prominent mandible. 
"""
d4 = {}
results = parser.add_vignette(vignette=v4, custom_d=d4)
results

Unnamed: 0,id,label,observed,measured
0,HP:0000303,Mandibular prognathia,True,True
1,HP:0000316,Hypertelorism,True,True
2,HP:0000431,Wide nasal bridge,True,True
3,HP:0003015,Flared metaphysis,True,True
4,HP:0011001,Increased bone mineral density,True,True


<h2>Laboratory examinations</h2>

In [10]:
v5 = """
Laboratory examinations

The patient’s blood test results are shown in Table 1. On the patient’s first visit to the hospital, 
he underwent laboratory tests of serum alkaline, calcium and others. After genetic diagnosis, more 
related tests were performed. Given that the diagnosis of AD-CMD was established, related biochemical 
tests were followed up (Table 1). The results showed that the patient’s serum concentration of 
alkaline phosphatase (ALP) decreased after 3 mo of dietary intervention. After 14 mo of low-calcium 
diet at the age of 2 years and 7 mo, his ALP continuously decreased to within the normal range. His parents 
then changed his diet to an intermittent low-calcium diet to include milk and eggs. 
At the age of 4 years and 2 mo, his ALP was slightly higher than normal but still close to the normal range. 
His serum osteocalcin (OC) was higher than normal even after 8 mo of low-calcium diet. However, it began 
to continuously drop after dietary restrictions for 14 mo and then reached normal levels. His serum 
combined beta C-terminal telopeptide of type I collagen (β-CTX) also decreased after 33 mo of the 
nutritional intervention, but still remained slightly higher than normal. Other biochemical test 
results were normal.
"""
d5 = {}
results = parser.add_vignette(vignette=v5, custom_d=d5)
results

Unnamed: 0,Col1,Col2,Col3


In [11]:
term = "Increased circulating osteocalcin level"  # HP:0031428
age = "P1Y8M"  # 20 months
results = parser.add_term(label=term, custom_age=age)
results

Unnamed: 0,id,label,observed,measured
0,HP:0031428,Increased circulating osteocalcin level,False,True


In [12]:
# Combined β-CTX (ng/mL)
term = "Increased circulating beta-C-terminal telopeptide concentration" # HP:0031425
results = parser.add_term(label=term, custom_age="P1Y8M")
results

Unnamed: 0,id,label,observed,measured
0,HP:0031425,Increased circulating beta-C-terminal telopeptide concentration,False,True


In [13]:
# low 25-hydroxyvitamin D
term = "Decreased circulating calcifediol concentration" # HP:0012053
results = parser.add_term(label=term, custom_age="P1Y8M")
results

Unnamed: 0,id,label,observed,measured
0,HP:0012053,Decreased circulating calcifediol concentration,False,True


In [14]:
term = "Elevated circulating alkaline phosphatase concentration"
results = parser.add_term(label=term, custom_age="P1Y8M")
results

Unnamed: 0,id,label,observed,measured
0,HP:0003155,Elevated circulating alkaline phosphatase concentration,False,True


In [15]:
term = "Abnormal circulating calcium concentration"
results = parser.add_term(label=term, excluded=True, custom_age="P1Y8M")
results

Unnamed: 0,id,label,observed,measured
0,HP:0004363,Abnormal circulating calcium concentration,False,True
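<p>The laboratory cells above repeat the same <tt>add_term</tt> call with a different label each time. A small loop can express this more compactly. The sketch assumes only the keyword arguments already used above (<tt>label</tt>, <tt>custom_age</tt>, <tt>excluded</tt>); <tt>add_terms</tt> and <tt>RecordingParser</tt> are hypothetical names, and the stand-in class exists only so that the example runs without pyphetools.</p>

```python
def add_terms(parser, specs, age):
    """Call parser.add_term once per (label, excluded) pair and collect the results."""
    results = []
    for label, excluded in specs:
        kwargs = {"label": label, "custom_age": age}
        if excluded:
            # Passed only when needed, matching how the cells above call add_term.
            kwargs["excluded"] = True
        results.append(parser.add_term(**kwargs))
    return results


class RecordingParser:
    """Hypothetical stand-in that records calls instead of building phenotypic features."""

    def __init__(self):
        self.calls = []

    def add_term(self, **kwargs):
        self.calls.append(kwargs)
        return kwargs


lab_terms = [
    ("Increased circulating osteocalcin level", False),
    ("Elevated circulating alkaline phosphatase concentration", False),
    ("Abnormal circulating calcium concentration", True),
]
demo = RecordingParser()
add_terms(demo, lab_terms, age="P1Y8M")
print(len(demo.calls))  # 3
print(demo.calls[-1])
```

<p>In the notebook itself, the <tt>CaseParser</tt> instance created earlier would take the place of <tt>demo</tt>.</p>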
