<H1>FBN1: acromicric and geleophysic dysplasia (Le Goff, 2011)</H1>
<p>Extract phenopackets from the clinical data in <a href="https://pubmed.ncbi.nlm.nih.gov/21683322/" target="_blank">Le Goff et al. (2011)</a>.</p>

In [1]:
import phenopackets as php
from google.protobuf.json_format import MessageToDict, MessageToJson
from google.protobuf.json_format import Parse, ParseDict
import pandas as pd
pd.set_option('display.max_colwidth', None) # show entire column contents, important!
from collections import defaultdict
from pyphetools.creation import *
# last tested with pyphetools version 0.2.20

In [2]:
parser = HpoParser()
hpo_cr = parser.get_hpo_concept_recognizer()
hpo_version = parser.get_version()
metadata = MetaData(created_by="ORCID:0000-0002-0736-9199")
metadata.default_versions_with_hpo(version=hpo_version)

In [16]:
df = pd.read_excel('input/LeGoff_FBN1_AD_GD.xltx')

<H2>Geleophysic dysplasia and acromicric dysplasia: Clinical description</H2>
<p>According to Le Goff et al., geleophysic dysplasia (GD, [MIM 231050]) and acromicric dysplasia (AD, [MIM 102370]) belong to the acromelic dysplasia group and are both characterized by severe short stature (&lt;−3 standard deviations [SD]), short hands and feet, joint limitations, and skin thickening.
Radiological manifestations include delayed bone age, cone-shaped epiphyses, shortened long tubular bones, and ovoid vertebral bodies. GD is distinguished from AD by its autosomal-recessive mode of inheritance; characteristic facial features (a “happy” face with full cheeks, a shortened nose, hypertelorism, a long and flat philtrum, and a thin upper lip); progressive cardiac valvular thickening, often leading to early death; toe walking; tracheal stenosis; respiratory insufficiency; and lysosomal-like storage vacuoles in various tissues.</p>
<p>In this notebook, we will first extract the data for Geleophysic dysplasia.</p>

<H3>Geleophysic dysplasia</H3>
<p>According to the authors, "Nineteen GD cases were included in the study, and they all fulfilled the diagnostic criteria, namely short stature &lt;−3 SD, short hands and feet, restricted joint mobility, characteristic facial features, and progressive cardiac involvement (Table 1, Figure 1)."</p>
<p>Detailed clinical data are not available, but based on this description we will assume that each proband had "progressive dilation and thickening of the pulmonary, aortic, and mitral valves, with stenosis of these three valves" (see the <a href="https://www.ncbi.nlm.nih.gov/books/NBK11168/">GeneReview</a> entry for geleophysic dysplasia), as well as the facial features named above.</p>

In [17]:
# Take an explicit copy so that later in-place edits do not act on a view of df
gd_df = df[df['Diagnosis'] == "GD"].copy()

In [18]:
gd_df.set_index("Family", inplace=True)
#dft = df.transpose()

#dft.columns = dft.iloc[0]
#dft.drop(dft.index[0], inplace=True)
#dft.head()
gd_df

Family,Origin,Diagnosis,Age (Years),Height,Cardiac Involvement,Other
1,Belgium,GD,Death at 9,<−6 SD (80 cm),mitral stenosis and insufficiency,tracheotomy at 3
2,France,GD,18,<−6 SD (112 cm),mitral stenosis,"HTAP, respiratory insufficiency,\n\nhepatomegaly, laryngeal stenosis"
3,Russia,GD,12,<−6 SD (106 cm),no,hepatomegaly
4,Switzerland,GD,21,<−6 SD (116 cm),"tricuspid stenosis, mild aortic insufficiency",
5,Russia,GD,8,−4 SD (103.5 cm),no,
6,France,GD,5.7,−4 SD (97 cm),no,laryngeal and respiratory insufficiency
7,U.K.,GD,Death at 3,−5 SD (75 cm),no,"respiratory insufficiency, HTAP,\n\nSleep apnea"
8,Turkey,GD,4.5,−4 SD (85 cm),mitral and tricuspide stenosis,"respiratory insufficiency, hepatomegaly, spleep apnea"
9,Algeria,GD,Death at 4,<−6 SD (60 cm),mitral and tricuspide stenosis,"laryngeal and respiratory insufficiency, HTAP"
10,Lebanon,GD,14,−3.5 SD (133 cm),no,–


In [19]:
column_mapper_d = defaultdict(ColumnMapper)

Family,Origin,Diagnosis,Age (Years),Height,Cardiac Involvement,Other
18,USA,GD,?,?,,


In [29]:
# Remove the proband from family 18 because no clinical data are available
gd_df = gd_df.drop(index=18)
gd_df

Family,Origin,Diagnosis,Age (Years),Height,Cardiac Involvement,Other
1,Belgium,GD,Death at 9,<−6 SD (80 cm),mitral stenosis and insufficiency,tracheotomy at 3
2,France,GD,18,<−6 SD (112 cm),mitral stenosis,"HTAP, respiratory insufficiency,\n\nhepatomegaly, laryngeal stenosis"
3,Russia,GD,12,<−6 SD (106 cm),no,hepatomegaly
4,Switzerland,GD,21,<−6 SD (116 cm),"tricuspid stenosis, mild aortic insufficiency",
5,Russia,GD,8,−4 SD (103.5 cm),no,
6,France,GD,5.7,−4 SD (97 cm),no,laryngeal and respiratory insufficiency
7,U.K.,GD,Death at 3,−5 SD (75 cm),no,"respiratory insufficiency, HTAP,\n\nSleep apnea"
8,Turkey,GD,4.5,−4 SD (85 cm),mitral and tricuspide stenosis,"respiratory insufficiency, hepatomegaly, spleep apnea"
9,Algeria,GD,Death at 4,<−6 SD (60 cm),mitral and tricuspide stenosis,"laryngeal and respiratory insufficiency, HTAP"
10,Lebanon,GD,14,−3.5 SD (133 cm),no,–


In [30]:
column_mapper_d = defaultdict(ColumnMapper)

In [32]:
# Assume that all patients have short stature to some degree, even though
# details are not recorded for two individuals
statureMapper = ConstantColumnMapper(hpo_id="HP:0004322", hpo_label="Short stature")
#statureMapper.preview_column(gd_df["Height"])
column_mapper_d["Height"] = statureMapper
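<p>The constant mapping assigns the same HPO term to every row and does not use the SD score or the measured height recorded in the "Height" column. If those values are wanted later, they could be pulled out with a regular expression. The following is only a minimal sketch: <code>parse_height</code> is a hypothetical helper that is not part of pyphetools, and it assumes cells look like the ones in the table (an optional "&lt;", an SD value written with a Unicode minus sign, and a height in cm in parentheses).</p>

```python
import re
from typing import Optional, Tuple

# Hypothetical helper (not part of pyphetools): parse cells such as
# "<−6 SD (80 cm)" or "−3.5 SD (133 cm)" from the "Height" column.
# The character class accepts both the Unicode minus (U+2212) and ASCII "-".
_HEIGHT_RE = re.compile(
    r"^\s*(?P<lt><)?\s*(?P<sd>[\u2212-]?\d+(?:\.\d+)?)\s*SD"
    r"\s*\(\s*(?P<cm>\d+(?:\.\d+)?)\s*cm\s*\)\s*$"
)


def parse_height(cell) -> Optional[Tuple[float, float, bool]]:
    """Return (sd, height_cm, is_upper_bound), or None if the cell does not match.

    is_upper_bound is True when the cell starts with "<", i.e. the SD score
    is reported only as being below the given value.
    """
    if not isinstance(cell, str):
        return None  # NaN or other non-string cells
    match = _HEIGHT_RE.match(cell)
    if match is None:
        return None  # e.g. "?" for a proband without data
    sd = float(match.group("sd").replace("\u2212", "-"))
    height_cm = float(match.group("cm"))
    return sd, height_cm, match.group("lt") is not None
```

<p>Applied as <code>gd_df["Height"].map(parse_height)</code>, this would give one tuple (or <code>None</code>) per proband and leave the column mapper defined above unchanged.</p>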

In [39]:
# Cardiac Involvement
# Note: Change entries to enable text mining
gd_df.at[1,"Cardiac Involvement"] = "Mitral stenosis; Mitral regurgitation" # was mitral stenosis and insufficiency
gd_df.at[8,"Cardiac Involvement"] = "Mitral stenosis; Tricuspid stenosis"  # was mitral and tricuspide stenosis
gd_df.at[9,"Cardiac Involvement"] = "Mitral stenosis; Tricuspid stenosis"  # was mitral and tricuspide stenosis
gd_df.at[15,"Cardiac Involvement"] = "Aortic stenosis; Mitral regurgitation; Aortic regurgitation"

#mitral and aortic valve insufficiencies
cardiacMapper = CustomColumnMapper(concept_recognizer=hpo_cr)
cardiacMapper.preview_column(gd_df["Cardiac Involvement"])
column_mapper_d["Cardiac Involvement"] = cardiacMapper

In [40]:
# Other
otherMap = CustomColumnMapper(concept_recognizer=hpo_cr)
otherMap.preview_column(gd_df["Other"])

,column,terms
0,tracheotomy at 3,
1,"HTAP, respiratory insufficiency,\n\nhepatomegaly, laryngeal stenosis",Respiratory insufficiency (HP:0002093); Hepatomegaly (HP:0002240); Laryngeal stenosis (HP:0001602)
2,hepatomegaly,Hepatomegaly (HP:0002240)
3,,
4,laryngeal and respiratory insufficiency,Respiratory insufficiency (HP:0002093)
5,"respiratory insufficiency, HTAP,\n\nSleep apnea",Respiratory insufficiency (HP:0002093); Sleep apnea (HP:0010535)
6,"respiratory insufficiency, hepatomegaly, spleep apnea",Respiratory insufficiency (HP:0002093); Hepatomegaly (HP:0002240); Apnea (HP:0002104)
7,"laryngeal and respiratory insufficiency, HTAP",Respiratory insufficiency (HP:0002093)
8,–,
9,pyloric stenosis,Pyloric stenosis (HP:0002021)
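
<p>In this preview, "HTAP" yields no term and the misspelled "spleep apnea" is matched only as Apnea. As with the cardiac column, the cell text can be rewritten before text mining. The sketch below uses only the standard library. <code>REPLACEMENTS</code> and <code>normalize_cell</code> are hypothetical names, and expanding "HTAP" to "Pulmonary arterial hypertension" is an assumption (it reads the abbreviation as the French <i>hypertension artérielle pulmonaire</i>) that should be checked against the paper.</p>

```python
import re

# Hypothetical replacement table (an editorial assumption, not from the paper):
# keys are strings seen in the "Other" column, values are the rewritten text.
REPLACEMENTS = {
    "spleep apnea": "sleep apnea",              # typo in the input table
    "HTAP": "Pulmonary arterial hypertension",  # assumed expansion of the abbreviation
}


def normalize_cell(cell):
    """Collapse whitespace and apply REPLACEMENTS; non-string cells pass through."""
    if not isinstance(cell, str):
        return cell  # leave NaN / None untouched
    # The table contains embedded "\n\n"; collapse any whitespace run to one space
    text = re.sub(r"\s+", " ", cell).strip()
    for old, new in REPLACEMENTS.items():
        # Whole-word, case-insensitive replacement
        pattern = r"\b" + re.escape(old) + r"\b"
        text = re.sub(pattern, new, text, flags=re.IGNORECASE)
    return text
```

<p>One possible use is <code>gd_df["Other"] = gd_df["Other"].map(normalize_cell)</code>, followed by a second call to <code>otherMap.preview_column(gd_df["Other"])</code> to see whether the rewritten cells are recognized.</p>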
