### POLR1A

[Acrofacial dysostosis, Cincinnati type - OMIM:616462](https://omim.org/entry/616462) is caused by variants in the POLR1A gene.

Data were extracted from [Smallwood K, et al. POLR1A variants underlie phenotypic heterogeneity in craniofacial, neural, and cardiac anomalies. Am J Hum Genet. 2023;110(5):809-825](https://pubmed.ncbi.nlm.nih.gov/37075751/).

In [1]:
import pandas as pd
pd.set_option('display.max_colwidth', None) # show entire column contents, important!
from IPython.display import display, HTML
from pyphetools.creation import *
from pyphetools.visualization import *
from pyphetools.validation import *
import pyphetools
print(f"Using pyphetools version {pyphetools.__version__}")

Using pyphetools version 0.9.66


In [2]:
PMID = "PMID:37075751"
title = "POLR1A variants underlie phenotypic heterogeneity in craniofacial, neural, and cardiac anomalies"
cite = Citation(pmid=PMID, title=title)
parser = HpoParser(hpo_json_file="../hp.json")
hpo_cr = parser.get_hpo_concept_recognizer()
hpo_version = parser.get_version()
hpo_ontology = parser.get_ontology()
metadata = MetaData(created_by="ORCID:0000-0001-7941-2961", citation=cite)
metadata.default_versions_with_hpo(version=hpo_version)
print(f"HPO version {hpo_version}")

HPO version 2024-03-06


In [3]:
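df = pd.read_excel("input/POLR1A_Smallwood_2023.xlsx")
# Transpose so that each row describes one individual and each column one feature.
dft = df.transpose()
dft.columns = dft.iloc[0]  # the first row after transposition holds the feature names
dft.drop(dft.index[0], inplace=True)  # remove the row that was promoted to the header
dft['person'] = dft.index
dft['person_id'] = dft['person'].apply(lambda x: f"Individual {x}")
dft.head(2)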

individual_id,Test,NM_015425.6,Sex,Onset,Age (at most recent assessment),Deceased,Protein,Inheritance,ACMG classification,ACMG criteria,...,IUGR,NaN,head circumference (cm),length (cm),weight (kg),Any other features,Abbreviations: U/L unilateral; B/L bilateral; n/a not available; VUS variant of uncertain significance; BP4 lack of segregation in affected members of a family; PM2 absent from controls; PP3 computational evidence supports deleterious effect; PS2 de novo; PP5 reputable source reports variant as pathogenic; PM4 protein length changes due to in frame deletion in a non-repeat region,SD calculated using CDC growth charts,person,person_id
1,Targeted variant,c.176A>T,male,Congenital onset,P3M,yes,p.(Asp59Val),paternal,VUS,"PM2, PP3",...,no,,,,,,,,1,Individual 1
2,Exome trio,c.176A>T,male,Congenital onset,P5Y,no,p.(Asp59Val),paternal,VUS,"PM2, PP3",...,no,,,,,,,,2,Individual 2


In [4]:
generator = SimpleColumnMapperGenerator(df=dft,
                                  observed='yes',
                                  excluded='no',
                                  hpo_cr=hpo_cr)
column_mapper_list = generator.try_mapping_columns()
display(HTML(generator.to_html()))

Result,Columns
Mapped,Macrocephaly; Microcephaly; Craniosynostosis; Ptosis; Hypertelorism; Cleft lip; Cleft palate; Micrognathia; Facial Asymmetry; Hypodontia; Ventriculomegaly; Hypotonia; Infantile spasms; Epilepsy; Developmental delay
Unmapped,"Test; NM_015425.6; Sex; Onset; Age (at most recent assessment); Deceased; Protein; Inheritance; ACMG classification; ACMG criteria; gnomAD allele count (homozygotes); Ears-low set; Microtia (unilateral or bilateral); Other craniofacial; Brain imaging (CT, MRI, Ultrasound); Structural Brain anomaly; Contractures; Regression; Other neuro; Limb defects; Echocardiogram (Y/N); PFO/ASD; VSD; Other structural heart; IUGR; nan; head circumference (cm); length (cm); weight (kg); Any other features; Abbreviations: U/L unilateral; B/L bilateral; n/a not available; VUS variant of uncertain significance; BP4 lack of segregation in affected members of a family; PM2 absent from controls; PP3 computational evidence supports deleterious effect; PS2 de novo; PP5 reputable source reports variant as pathogenic; PM4 protein length changes due to in frame deletion in a non-repeat region ; SD calculated using CDC growth charts; person; person_id"
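The split between mapped and unmapped columns above is consistent with matching column headers against HPO term labels. The following is a minimal sketch of that idea using a small hand-written label table. The `HPO_LABELS` dictionary and the `split_columns` helper are hypothetical and are not part of pyphetools, whose concept recognizer may work differently (for example, by also using synonyms).

```python
# Hypothetical label table: lower-cased HPO label -> HPO id
# (ids copied from the column previews in this notebook).
HPO_LABELS = {
    "macrocephaly": "HP:0000256",
    "ptosis": "HP:0000508",
    "facial asymmetry": "HP:0000324",
    "low-set ears": "HP:0000369",
}

def split_columns(columns):
    """Return (mapped, unmapped), where mapped is {column name: HPO id}."""
    mapped, unmapped = {}, []
    for col in columns:
        key = col.strip().lower()  # case-insensitive exact match on the label
        if key in HPO_LABELS:
            mapped[col] = HPO_LABELS[key]
        else:
            unmapped.append(col)
    return mapped, unmapped

mapped, unmapped = split_columns(["Macrocephaly", "Facial Asymmetry", "Ears-low set", "Sex"])
print(mapped)
print(unmapped)
```

Under this assumption, a header such as `Ears-low set` stays unmapped because it does not match the label `Low-set ears`, which is why that column is mapped by hand further down.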


In [5]:
#output = OptionColumnMapper.autoformat(df=dft, hpo_cr=hpo_cr, delimiter=";")
#print(output)

In [6]:
# No cell in this column reports macrocephaly as observed, so only the excluded mapping is needed.
macrocephaly_d = {}
excluded = {'no': 'Macrocephaly', }
macrocephalyMapper = OptionColumnMapper(column_name="Macrocephaly", concept_recognizer=hpo_cr, option_d=macrocephaly_d, excluded_d=excluded)
column_mapper_list.append(macrocephalyMapper)
macrocephalyMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Macrocephaly (HP:0000256) (excluded),13


In [7]:
microcephaly_d = {#'nan': 'PLACEHOLDER',
 'yes': 'Microcephaly',
 'yes (presumed)': 'Microcephaly',
}
excluded = {'no': 'Microcephaly',}
microcephalyMapper = OptionColumnMapper(column_name="Microcephaly", concept_recognizer=hpo_cr, option_d=microcephaly_d, excluded_d=excluded)
column_mapper_list.append(microcephalyMapper)
microcephalyMapper.preview_column(dft)


Unnamed: 0,mapping,count
0,Microcephaly (HP:0000252) (observed),5
1,Microcephaly (HP:0000252) (excluded),8
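The preview counts above can be read as a tally of cell values routed through two dictionaries, one for observed terms and one for excluded terms. Below is a minimal sketch of that bookkeeping, assuming exact string matching on the cell contents. `tally_options` and the sample `cells` list are hypothetical, and the real `OptionColumnMapper` may handle delimiters and partial matches differently.

```python
from collections import Counter

def tally_options(cells, option_d, excluded_d):
    """Count (label, status) pairs for one column of free-text cells."""
    counts = Counter()
    for cell in cells:
        text = str(cell).strip()
        if text in option_d:
            counts[(option_d[text], "observed")] += 1
        elif text in excluded_d:
            counts[(excluded_d[text], "excluded")] += 1
        # Any other value (e.g. an empty cell) is left uncounted, i.e. not measured.
    return counts

# Made-up cell contents, for illustration only.
cells = ["yes", "no", "yes (presumed)", "no", "n/a"]
option_d = {"yes": "Microcephaly", "yes (presumed)": "Microcephaly"}
excluded_d = {"no": "Microcephaly"}
counts = tally_options(cells, option_d, excluded_d)
print(counts)
```

Values that appear in neither dictionary contribute nothing, which is one way a column with 18 individuals can produce fewer than 18 counted mappings.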


In [8]:
craniosynostosis_d = {'metopic': 'Metopic synostosis',
 'Metopic': 'Metopic synostosis'}
excluded = {'no': 'Metopic synostosis'}
craniosynostosisMapper = OptionColumnMapper(column_name="Craniosynostosis", concept_recognizer=hpo_cr, option_d=craniosynostosis_d, excluded_d=excluded)
column_mapper_list.append(craniosynostosisMapper)
craniosynostosisMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Metopic synostosis (HP:0011330) (observed),3
1,Metopic synostosis (HP:0011330) (excluded),15


In [9]:
ptosis_d = {#'no': 'PLACEHOLDER',
 'yes': 'Ptosis'}
excluded = {'no': 'Ptosis'}
ptosisMapper = OptionColumnMapper(column_name="Ptosis", concept_recognizer=hpo_cr, option_d=ptosis_d, excluded_d=excluded)
column_mapper_list.append(ptosisMapper)
ptosisMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Ptosis (HP:0000508) (excluded),13
1,Ptosis (HP:0000508) (observed),5


In [10]:
hypertelorism_d = {#'no': 'PLACEHOLDER',
 'yes': 'Hypertelorism',
 'mild': 'Hypertelorism',
}
excluded = {'no': 'Hypertelorism'}
hypertelorismMapper = OptionColumnMapper(column_name="Hypertelorism", concept_recognizer=hpo_cr, option_d=hypertelorism_d, excluded_d=excluded)
column_mapper_list.append(hypertelorismMapper)
hypertelorismMapper.preview_column(dft)


Unnamed: 0,mapping,count
0,Hypertelorism (HP:0000316) (excluded),7
1,Hypertelorism (HP:0000316) (observed),9


In [11]:
cleft_lip_d = {'yes': 'Cleft lip'}
excluded = {'no': 'Cleft lip'}
cleft_lipMapper = OptionColumnMapper(column_name="Cleft lip", concept_recognizer=hpo_cr, option_d=cleft_lip_d, excluded_d=excluded)
column_mapper_list.append(cleft_lipMapper)
cleft_lipMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Cleft lip (HP:0410030) (observed),2
1,Cleft lip (HP:0410030) (excluded),16


In [12]:
cleft_palate_d = {'yes': 'Cleft palate',
 'no (high palate)': 'High palate'}
excluded = {'no': 'Cleft palate'}
cleft_palateMapper = OptionColumnMapper(column_name="Cleft palate", concept_recognizer=hpo_cr, option_d=cleft_palate_d, excluded_d=excluded)
column_mapper_list.append(cleft_palateMapper)
cleft_palateMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Cleft palate (HP:0000175) (observed),5
1,Cleft palate (HP:0000175) (excluded),12
2,High palate (HP:0000218) (observed),1


In [13]:
micrognathia_d = {'yes': 'Micrognathia'}
excluded = {'no': 'Micrognathia'}
micrognathiaMapper = OptionColumnMapper(column_name="Micrognathia", concept_recognizer=hpo_cr, option_d=micrognathia_d, excluded_d=excluded)
column_mapper_list.append(micrognathiaMapper)
micrognathiaMapper.preview_column(dft)


Unnamed: 0,mapping,count
0,Micrognathia (HP:0000347) (excluded),11
1,Micrognathia (HP:0000347) (observed),7


In [14]:
ears_low_set_d = {'yes': 'Low-set ears',}
excluded = {'no': 'Low-set ears'}
ears_low_setMapper = OptionColumnMapper(column_name="Ears-low set", concept_recognizer=hpo_cr, option_d=ears_low_set_d, excluded_d=excluded)
column_mapper_list.append(ears_low_setMapper)
ears_low_setMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Low-set ears (HP:0000369) (excluded),11
1,Low-set ears (HP:0000369) (observed),6


In [15]:
microtia_d = {
 'yes': 'Microtia',
 'small ears': 'Microtia',
 'U/L': 'Microtia',
 'unilateral': 'Microtia'}
excluded = {'no': 'Microtia'}
microtiaMapper = OptionColumnMapper(column_name="Microtia (unilateral or bilateral)", concept_recognizer=hpo_cr, option_d=microtia_d, excluded_d=excluded)
column_mapper_list.append(microtiaMapper)
microtiaMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Microtia (HP:0008551) (excluded),14
1,Microtia (HP:0008551) (observed),4


In [16]:
facial_asymmetry_d = {'yes': 'Facial asymmetry'}
excluded = {'no': 'Facial asymmetry'}
facial_asymmetryMapper = OptionColumnMapper(column_name="Facial Asymmetry", concept_recognizer=hpo_cr, option_d=facial_asymmetry_d, excluded_d=excluded)
column_mapper_list.append(facial_asymmetryMapper)
facial_asymmetryMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Facial asymmetry (HP:0000324) (excluded),15
1,Facial asymmetry (HP:0000324) (observed),3


In [17]:
hypodontia_d = {'yes': 'Hypodontia',}
excluded = {'no': 'Hypodontia'}
hypodontiaMapper = OptionColumnMapper(column_name="Hypodontia", concept_recognizer=hpo_cr, option_d=hypodontia_d, excluded_d=excluded)
column_mapper_list.append(hypodontiaMapper)
hypodontiaMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Hypodontia (HP:0000668) (observed),2
1,Hypodontia (HP:0000668) (excluded),5


In [18]:
other_craniofacial_d = {
 'Congenital hearing loss': 'Hearing impairment',
 'B/L vocal cord paralysis. Tall and wide palpebral fissures. Short nose with upturned tip.': ['Vocal cord paralysis', 'Long palpebral fissure', 'Anteverted nares'],
 'U/L vocal cord paralysis': 'Vocal cord paralysis',
 'bilateral choanal atresia': 'Bilateral choanal atresia',
 'partial acalvaria, bilateral upper eyelid colobomas': ['Upper eyelid coloboma', 'Calvarial skull defect'],
 'wide forehead, full eyebrows, deep set eyes': ['Broad forehead', 'Thick eyebrow', 'Deeply set eye'],
 'strabismus (Duane anomaly)': 'Duane anomaly',
 'L enophthalmia with ptosis. L partial hearing loss': ['Deeply set eye', 'Hearing impairment', 'Ptosis', 'Mild hearing impairment'],
 'B/l epicanthal folds': 'Epicanthus',
 'broad, flat nasal bridge, mild ala nasi deficiency': ['Depressed nasal bridge', 'Underdeveloped nasal alae'],
 'short upturned nose': 'Anteverted nares',
 'upslanting palpebral fissures': 'Upslanted palpebral fissure',
 'midline pseudocleft of upper lip':  'Median pseudocleft lip',
 'airway malacia, hearing loss': ['Tracheobronchomalacia','Hearing impairment'],
 's/p Md distraction': 'Abnormal mandible morphology',  # NOTE: not sure this is the correct term
 'laryngomalacia': 'Laryngomalacia'}
excluded = {'no': 'Abnormality of the face'}
other_craniofacialMapper = OptionColumnMapper(column_name="Other craniofacial", concept_recognizer=hpo_cr, option_d=other_craniofacial_d, excluded_d=excluded)
column_mapper_list.append(other_craniofacialMapper)
other_craniofacialMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Abnormality of the face (HP:0000271) (excluded),6
1,Hearing impairment (HP:0000365) (observed),3
2,Vocal cord paralysis (HP:0001605) (observed),2
3,Short nose (HP:0003196) (observed),1
4,Long palpebral fissure (HP:0000637) (observed),1
5,Bilateral choanal atresia (HP:0004502) (observed),1
6,Upper eyelid coloboma (HP:0000636) (observed),1
7,Calvarial skull defect (HP:0001362) (observed),1
8,Broad forehead (HP:0000337) (observed),1
9,Thick eyebrow (HP:0000574) (observed),1


In [19]:
# NOTE: the mappings in this cell are the least certain in the notebook
brain_imaging_d = {
 #'MRI': 'PLACEHOLDER',
 # NOTE: 'Hydrocephalus' may be too general here; a child term could be more appropriate
 'MRI: hydrocephaly with junctional stenosis C0-C1 and syringomyelia':['Hydrocephalus', 'Syringomyelia'],
 'MRI Punctate restricted diffusion in the right parietal white matter and periatrial white matter representing ischemia or infarctions with minimal petechial hemorrhage':['Brain ischemia', 'Encephalomalacia', 'Cerebral contusions'],
 #'CT': 'PLACEHOLDER',
 'yes': 'Brain imaging abnormality'}
excluded = {'no': 'Brain imaging abnormality','normal': 'Brain imaging abnormality',}
brain_imagingMapper = OptionColumnMapper(column_name="Brain imaging (CT, MRI, Ultrasound)", concept_recognizer=hpo_cr, option_d=brain_imaging_d, excluded_d=excluded)
column_mapper_list.append(brain_imagingMapper)
brain_imagingMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Brain imaging abnormality (HP:0410263) (excluded),7
1,Hydrocephalus (HP:0000238) (observed),1
2,Syringomyelia (HP:0003396) (observed),1
3,Brain imaging abnormality (HP:0410263) (observed),5


In [20]:
ventriculomegaly_d = { 
 'yes': 'Ventriculomegaly',
 'mild prominent 4th ventricle and infracerebellar space': 'Dandy-Walker malformation',
}
excluded = {'no': 'Ventriculomegaly',}
ventriculomegalyMapper = OptionColumnMapper(column_name="Ventriculomegaly", concept_recognizer=hpo_cr, option_d=ventriculomegaly_d, excluded_d=excluded)
column_mapper_list.append(ventriculomegalyMapper)
ventriculomegalyMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Ventriculomegaly (HP:0002119) (excluded),9
1,Ventriculomegaly (HP:0002119) (observed),1
2,Dandy-Walker malformation (HP:0001305) (observed),1


In [21]:
structural_brain_anomaly_d = { #'nan': 'PLACEHOLDER',
 'cavum septum pellucidum': 'Cavum septum pellucidum',
 'aqueductal stenosis': 'Aqueductal stenosis'}
excluded = { 'no': 'Abnormal brain morphology',}
structural_brain_anomalyMapper = OptionColumnMapper(column_name="Structural Brain anomaly", concept_recognizer=hpo_cr, option_d=structural_brain_anomaly_d, excluded_d=excluded)
column_mapper_list.append(structural_brain_anomalyMapper)
structural_brain_anomalyMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Abnormal brain morphology (HP:0012443) (excluded),8
1,Cavum septum pellucidum (HP:0002389) (observed),1
2,Aqueductal stenosis (HP:0002410) (observed),1


In [22]:
hypotonia_d = {'yes': 'Hypotonia'}
excluded = {'no': 'Hypotonia',}
hypotoniaMapper = OptionColumnMapper(column_name="Hypotonia", concept_recognizer=hpo_cr, option_d=hypotonia_d, excluded_d=excluded)
column_mapper_list.append(hypotoniaMapper)
hypotoniaMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Hypotonia (HP:0001252) (excluded),7
1,Hypotonia (HP:0001252) (observed),10


In [23]:
contractures_d = {'yes': 'Joint contracture'}
excluded = {'no': 'Joint contracture',}
contracturesMapper = OptionColumnMapper(column_name="Contractures", concept_recognizer=hpo_cr, option_d=contractures_d, excluded_d=excluded)
column_mapper_list.append(contracturesMapper)
contracturesMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Joint contracture (HP:0034392) (excluded),14
1,Joint contracture (HP:0034392) (observed),2


In [24]:
infantile_spasms_d = {'yes': 'Infantile spasms'}
excluded = {'no': 'Infantile spasms',}
infantile_spasmsMapper = OptionColumnMapper(column_name="Infantile spasms", concept_recognizer=hpo_cr, option_d=infantile_spasms_d, excluded_d=excluded)
column_mapper_list.append(infantile_spasmsMapper)
infantile_spasmsMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Infantile spasms (HP:0012469) (excluded),17
1,Infantile spasms (HP:0012469) (observed),1


In [25]:
epilepsy_d = {
 'yes': 'Seizure',
 'yes, epileptic encephalopathy, myoclonia, atonic seizures': ['Epileptic encephalopathy', 'Myoclonus', 'Atonic seizure'],
}
excluded = {'no': 'Seizure',}
epilepsyMapper = OptionColumnMapper(column_name="Epilepsy", concept_recognizer=hpo_cr, option_d=epilepsy_d, excluded_d=excluded)
column_mapper_list.append(epilepsyMapper)
epilepsyMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Seizure (HP:0001250) (excluded),11
1,Seizure (HP:0001250) (observed),6
2,Epileptic encephalopathy (HP:0200134) (observed),1
3,Myoclonus (HP:0001336) (observed),1
4,Atonic seizure (HP:0010819) (observed),1


In [26]:
developmental_delay_d = {
 'yes': 'Neurodevelopmental delay',
 'slight motor': 'Motor delay'}
excluded = {'no': 'Neurodevelopmental delay',}
developmental_delayMapper = OptionColumnMapper(column_name="Developmental delay", concept_recognizer=hpo_cr, option_d=developmental_delay_d, excluded_d=excluded)
column_mapper_list.append(developmental_delayMapper)
developmental_delayMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Neurodevelopmental delay (HP:0012758) (excluded),6
1,Neurodevelopmental delay (HP:0012758) (observed),8
2,Seizure (HP:0001250) (observed),1
3,Motor delay (HP:0001270) (observed),1


In [27]:
regression_d = {'yes after onset epilepsy': 'Developmental regression'}
excluded = {'no': 'Developmental regression',}
regressionMapper = OptionColumnMapper(column_name="Regression", concept_recognizer=hpo_cr, option_d=regression_d, excluded_d=excluded)
column_mapper_list.append(regressionMapper)
regressionMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Developmental regression (HP:0002376) (excluded),15
1,Developmental regression (HP:0002376) (observed),1


In [28]:
other_neuro_d = {
 'possible ataxia': 'Abnormality of coordination',
 'leg spasticity, spastic dystonia, mild learning difficulties': ['Lower limb spasticity', 'Laryngeal dystonia', 'Mild global developmental delay'], }
excluded = {'no': 'Abnormality of coordination',}
other_neuroMapper = OptionColumnMapper(column_name="Other neuro", concept_recognizer=hpo_cr, option_d=other_neuro_d, excluded_d=excluded)
column_mapper_list.append(other_neuroMapper)
other_neuroMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Abnormality of coordination (HP:0011443) (excluded),15
1,Abnormality of coordination (HP:0011443) (observed),1
2,Lower limb spasticity (HP:0002061) (observed),1
3,Laryngeal dystonia (HP:0012049) (observed),1
4,Mild global developmental delay (HP:0011342) (observed),1


In [29]:
limb_defects_d = {'yes': 'Abnormality of limbs'}
excluded = {'no': 'Abnormality of limbs',}
limb_defectsMapper = OptionColumnMapper(column_name="Limb defects", concept_recognizer=hpo_cr, option_d=limb_defects_d, excluded_d=excluded)
column_mapper_list.append(limb_defectsMapper)
limb_defectsMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Abnormality of limbs (HP:0040064) (excluded),12
1,Abnormality of limbs (HP:0040064) (observed),6


In [30]:
pfo_asd_d = {
 'yes': ['Patent foramen ovale', 'Atrial septal defect'],
 'PFO': 'Patent foramen ovale'}
excluded = {'no': ['Patent foramen ovale', 'Atrial septal defect'],}
pfo_asdMapper = OptionColumnMapper(column_name="PFO/ASD", concept_recognizer=hpo_cr, option_d=pfo_asd_d, excluded_d=excluded)
column_mapper_list.append(pfo_asdMapper)
pfo_asdMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Patent foramen ovale (HP:0001655) (excluded),10
1,Atrial septal defect (HP:0001631) (excluded),10
2,Patent foramen ovale (HP:0001655) (observed),4
3,Atrial septal defect (HP:0001631) (observed),3


In [31]:
vsd_d = {'yes': 'Ventricular septal defect'}
excluded = {'no': 'Ventricular septal defect',}
vsdMapper = OptionColumnMapper(column_name="VSD", concept_recognizer=hpo_cr, option_d=vsd_d, excluded_d=excluded)
column_mapper_list.append(vsdMapper)
vsdMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Ventricular septal defect (HP:0001629) (excluded),11
1,Ventricular septal defect (HP:0001629) (observed),3


In [32]:
other_structural_heart_d = {
 'Biventricular hypertrophy': 'Biventricular hypertrophy',
 'anomalous origin of right coronary artery': 'Anomalous origin of right coronary artery from the pulmonary artery',
 'pulmonary artery stenosis': 'Pulmonary artery stenosis',
 'yes (see text)': ['Bicuspid aortic valve', 'Aortic aneurysm', 'Pulmonary artery aneurysm', 'Partial atrioventricular canal defect', 'Cleft anterior mitral valve leaflet'] }
excluded = {'no': 'Abnormal heart morphology',}
other_structural_heartMapper = OptionColumnMapper(column_name="Other structural heart", concept_recognizer=hpo_cr, option_d=other_structural_heart_d, excluded_d=excluded)
column_mapper_list.append(other_structural_heartMapper)
other_structural_heartMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Abnormal heart morphology (HP:0001627) (excluded),10
1,Biventricular hypertrophy (HP:0200128) (observed),1
2,Anomalous origin of right coronary artery from the pulmonary artery (HP:0011639) (observed),1
3,Pulmonary artery stenosis (HP:0004415) (observed),1
4,Bicuspid aortic valve (HP:0001647) (observed),1
5,Aortic aneurysm (HP:0004942) (observed),1
6,Pulmonary artery aneurysm (HP:0004937) (observed),1
7,Partial atrioventricular canal defect (HP:0011577) (observed),1
8,Cleft anterior mitral valve leaflet (HP:0011569) (observed),1


In [33]:
iugr_d = { 'yes': 'Intrauterine growth retardation'}
excluded = {'no': 'Intrauterine growth retardation',}
iugrMapper = OptionColumnMapper(column_name="IUGR", concept_recognizer=hpo_cr, option_d=iugr_d, excluded_d=excluded)
column_mapper_list.append(iugrMapper)
iugrMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Intrauterine growth retardation (HP:0001511) (excluded),11
1,Intrauterine growth retardation (HP:0001511) (observed),3


In [34]:
any_other_features_d = { #'nan': 'PLACEHOLDER',
 'Large scalp congenital naevus. Persistently raised lactates. Muscle and liver biopsies consistent with combined respiratory chain enzyme defect affecting primarily  complexes I and IV in muscle and borderline IV in the liver.': ['Nevus', 'Increased circulating lactate concentration', 'Muscle abnormality related to mitochondrial dysfunction'],
 'fifth finger clinodactyly, single palmar crease, mild toenail hypoplasia': ['Clinodactyly of the 5th finger', 'Single transverse palmar crease', 'Hypoplastic toenails'],
 'bilateral cryptorchidism,dysphagia and aspiration requiring GJ dependence, OSA, right hydronephrosis': ['Bilateral cryptorchidism', 'Dysphagia', 'Aspiration', 'Gastrojejunal tube feeding in infancy', 'Obstructive sleep apnea', 'Hydronephrosis'], 
 'recurrent otitis media': 'Recurrent otitis media',
 'L diaphragmatic eventration. L partial hearing loss': ['Diaphragmatic eventration', 'Hearing impairment'],
 'R inguinal hernia': 'Inguinal hernia',
 'L cryptorchidism': 'Cryptorchidism',
 'FTT requiring G tube placemenet': ['Failure to thrive', 'Gastrostomy tube feeding in infancy'],
 'G-tube for feeding.': 'Gastrostomy tube feeding in infancy',
 'G-tube': 'Gastrostomy tube feeding in infancy',
 'Scoliosis': 'Scoliosis',
 'Eye skin growth': 'Pterygium', 
'hearing loss (b/l)': 'Hearing impairment'}
excluded = {'no intellectual disability': 'Intellectual disability'}
any_other_featuresMapper = OptionColumnMapper(column_name="Any other features", concept_recognizer=hpo_cr, option_d=any_other_features_d, excluded_d=excluded)
column_mapper_list.append(any_other_featuresMapper)
any_other_featuresMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Nevus (HP:0003764) (observed),1
1,Clinodactyly of the 5th finger (HP:0004209) (observed),1
2,Single transverse palmar crease (HP:0000954) (observed),1
3,Hypoplastic toenails (HP:0001800) (observed),1
4,Intellectual disability (HP:0001249) (excluded),1
5,Bilateral cryptorchidism (HP:0008689) (observed),1
6,Aspiration (HP:0002835) (observed),1
7,Dysphagia (HP:0002015) (observed),1
8,Hydronephrosis (HP:0000126) (observed),1
9,Recurrent otitis media (HP:0000403) (observed),1


In [35]:
def correct_hgvs(hgvs):
    """Rewrite deletions that spell out the deleted bases, leaving all other variants unchanged."""
    if hgvs == "c.3649delC":
        return "c.3649del"
    elif hgvs == "c.3988_3990delGAG":
        return "c.3988_3990del"
    else:
        return hgvs

dft["allele_1"] = dft['NM_015425.6'].apply(correct_hgvs)
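The two hard-coded corrections above follow one pattern: the deleted bases after `del` are dropped, in line with current HGVS recommendations. A regex-based sketch that applies the same rule to any coding deletion is shown here. `strip_deleted_bases` is a hypothetical helper, not part of pyphetools, and it deliberately leaves every other string unchanged.

```python
import re

# Matches e.g. "c.3649delC" or "c.3988_3990delGAG":
# a position or range, the literal "del", then one or more bases.
_DEL_WITH_BASES = re.compile(r"^(c\.[\d_+\-*]+del)[ACGT]+$")

def strip_deleted_bases(hgvs):
    """Drop explicit bases after 'del'; return anything else unchanged."""
    if not isinstance(hgvs, str):
        return hgvs  # e.g. NaN from an empty spreadsheet cell
    match = _DEL_WITH_BASES.match(hgvs.strip())
    return match.group(1) if match else hgvs

for raw in ["c.3649delC", "c.3988_3990delGAG", "c.176A>T", "c.190del"]:
    print(raw, "->", strip_deleted_bases(raw))
```

For a table with only two affected variants the explicit function is arguably easier to audit; the pattern-based version mainly helps when the same cleanup is reused across several cohorts.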

In [36]:
POLR1A_transcript = "NM_015425.6"
vman = VariantManager(df=dft, 
                      individual_column_name="person_id", 
                      gene_symbol="POLR1A",
                      allele_1_column_name="allele_1", 
                      transcript=POLR1A_transcript)

In [37]:
sexMapper = SexColumnMapper(column_name='Sex',male_symbol="male", female_symbol="female")
#sexMapper.preview_column(dft)
onsetMapper = AgeColumnMapper.hpo_onset(column_name='Onset')
#onsetMapper.preview_column(dft)
encounterMapper = AgeColumnMapper.iso8601(column_name='Age (at most recent assessment)')
#encounterMapper.preview_column(dft)

In [38]:
AFDCIN = Disease(disease_id="OMIM:616462", disease_label="Acrofacial dysostosis, Cincinnati type")
varMapper = VariantColumnMapper(variant_d=vman.get_variant_d(),
                                variant_column_name="allele_1", 
                                default_genotype="heterozygous")

In [39]:
encoder = CohortEncoder(df=dft, 
                        hpo_cr=hpo_cr, 
                        column_mapper_list=column_mapper_list, 
                        individual_column_name="person_id", 
                        age_of_onset_mapper=onsetMapper, 
                        age_at_last_encounter_mapper=encounterMapper,
                        sexmapper=sexMapper,
                        variant_mapper=varMapper,
                        metadata=metadata)
encoder.set_disease(AFDCIN)

In [40]:
individuals = encoder.get_individuals()

Could not parse the following as ISO8601 ages: na (n=1)
Could not parse the following as ISO8601 ages: adult (n=1)
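The two warnings above arise because `na` and `adult` are not ISO 8601 durations, unlike values such as `P3M` or `P1M26D`. A small check of the kind sketched here can flag such cells before encoding. The regular expression covers only the year/month/week/day forms seen in this table, and `is_iso8601_age` is a hypothetical helper rather than a pyphetools function.

```python
import re

# Years, months, weeks and days only; at least one component is required.
_ISO_AGE = re.compile(r"^P(?=\d)(\d+Y)?(\d+M)?(\d+W)?(\d+D)?$")

def is_iso8601_age(value):
    """True if value looks like an ISO 8601 duration such as 'P5Y' or 'P1M26D'."""
    return isinstance(value, str) and _ISO_AGE.match(value.strip()) is not None

ages = ["P3M", "P5Y", "P1M26D", "P10Y4M", "na", "adult"]
unparsable = [a for a in ages if not is_iso8601_age(a)]
print(unparsable)
```

Cells flagged this way could be corrected in the input spreadsheet or simply left without an age, as happens in this notebook.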


In [41]:
cvalidator = CohortValidator(cohort=individuals, ontology=hpo_ontology, min_hpo=1, allelic_requirement=AllelicRequirement.MONO_ALLELIC)
qc = QcVisualizer(cohort_validator=cvalidator)
display(HTML(qc.to_summary_html()))

Level,Error category,Count
ERROR,CONFLICT,2
WARNING,REDUNDANT,65
INFORMATION,NOT_MEASURED,43
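The categories in the summary can be illustrated with a toy ontology. In this sketch, an annotation is treated as redundant when an observed term is an ancestor of another observed term, and as a conflict when a term is observed while one of its ancestors is excluded. The `PARENTS` table and the helper functions are hypothetical simplifications and are not meant to reproduce the `CohortValidator` logic exactly.

```python
# Hypothetical child -> parent table. This is a simplification:
# real HPO terms can have several parents and intermediate levels.
PARENTS = {
    "Metopic synostosis": "Craniosynostosis",
    "Atonic seizure": "Seizure",
}

def ancestors(term):
    """Return the set of all ancestors of term in the toy table."""
    found = set()
    while term in PARENTS:
        term = PARENTS[term]
        found.add(term)
    return found

def find_redundant(observed):
    """Observed terms that are ancestors of another observed term."""
    return {t for t in observed for other in observed if t in ancestors(other)}

def find_conflicts(observed, excluded):
    """Pairs of (observed term, excluded ancestor)."""
    return {(t, a) for t in observed for a in ancestors(t) if a in excluded}

observed = {"Atonic seizure", "Seizure", "Metopic synostosis"}
excluded = {"Craniosynostosis"}
redundant = find_redundant(observed)
conflicts = find_conflicts(observed, excluded)
print(redundant)
print(conflicts)
```

Under these assumptions, mapping a general column value such as `yes` to a parent term while a free-text column yields a more specific child term is a typical source of redundant annotations.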


In [42]:
individuals = cvalidator.get_error_free_individual_list()
table = PhenopacketTable(individual_list=individuals, metadata=metadata)
display(HTML(table.to_html()))

Individual,Disease,Genotype,Phenotypic features
Individual 1 (MALE; P3M),"Acrofacial dysostosis, Cincinnati type (OMIM:616462)",NM_203290.4:c.176A>T (heterozygous),Cleft palate (HP:0000175); Metopic synostosis (HP:0011330); Cleft lip (HP:0410030); excluded: Abnormality of limbs (HP:0040064); excluded: Patent foramen ovale (HP:0001655); excluded: Microtia (HP:0008551); excluded: Low-set ears (HP:0000369); excluded: Hypotonia (HP:0001252); excluded: Hypertelorism (HP:0000316); excluded: Ptosis (HP:0000508); excluded: Ventricular septal defect (HP:0001629); excluded: Brain imaging abnormality (HP:0410263); excluded: Micrognathia (HP:0000347); excluded: Intrauterine growth retardation (HP:0001511); excluded: Joint contracture (HP:0034392); excluded: Global developmental delay (HP:0001263); excluded: Developmental regression (HP:0002376); excluded: Facial asymmetry (HP:0000324); excluded: Abnormality of coordination (HP:0011443); excluded: Infantile spasms (HP:0012469)
Individual 2 (MALE; P5Y),"Acrofacial dysostosis, Cincinnati type (OMIM:616462)",NM_203290.4:c.176A>T (heterozygous),Cleft palate (HP:0000175); Metopic synostosis (HP:0011330); Cleft lip (HP:0410030); excluded: Abnormality of limbs (HP:0040064); excluded: Patent foramen ovale (HP:0001655); excluded: Microtia (HP:0008551); excluded: Low-set ears (HP:0000369); excluded: Hypotonia (HP:0001252); excluded: Hypertelorism (HP:0000316); excluded: Ptosis (HP:0000508); excluded: Ventricular septal defect (HP:0001629); excluded: Brain imaging abnormality (HP:0410263); excluded: Micrognathia (HP:0000347); excluded: Intrauterine growth retardation (HP:0001511); excluded: Joint contracture (HP:0034392); excluded: Global developmental delay (HP:0001263); excluded: Developmental regression (HP:0002376); excluded: Facial asymmetry (HP:0000324); excluded: Abnormality of coordination (HP:0011443); excluded: Infantile spasms (HP:0012469)
Individual 3 (FEMALE; P1M26D),"Acrofacial dysostosis, Cincinnati type (OMIM:616462)",NM_015425.6:c.190del (heterozygous),Global developmental delay (HP:0001263); Hearing impairment (HP:0000365); Biventricular hypertrophy (HP:0200128); Nevus (HP:0003764); Microcephaly (HP:0000252); Hypotonia (HP:0001252); excluded: Abnormality of limbs (HP:0040064); excluded: Patent foramen ovale (HP:0001655); excluded: Microtia (HP:0008551); excluded: Low-set ears (HP:0000369); excluded: Macrocephaly (HP:0000256); excluded: Craniosynostosis (HP:0001363); excluded: Cleft lip (HP:0410030); excluded: Hypertelorism (HP:0000316); excluded: Ptosis (HP:0000508); excluded: Ventricular septal defect (HP:0001629); excluded: Micrognathia (HP:0000347); excluded: Intrauterine growth retardation (HP:0001511); excluded: Cleft palate (HP:0000175); excluded: Joint contracture (HP:0034392); excluded: Metopic synostosis (HP:0011330); excluded: Developmental regression (HP:0002376); excluded: Ventriculomegaly (HP:0002119); excluded: Facial asymmetry (HP:0000324); excluded: Abnormality of coordination (HP:0011443); excluded: Infantile spasms (HP:0012469)
Individual 4 (MALE; P11M),"Acrofacial dysostosis, Cincinnati type (OMIM:616462)",NM_015425.6:c.1178G>A (heterozygous),Single transverse palmar crease (HP:0000954); Low-set ears (HP:0000369); Long palpebral fissure (HP:0000637); Anomalous origin of right coronary artery from the pulmonary artery (HP:0011639); Vocal cord paralysis (HP:0001605); Hypoplastic toenails (HP:0001800); Clinodactyly of the 5th finger (HP:0004209); Short nose (HP:0003196); Hypotonia (HP:0001252); excluded: Patent foramen ovale (HP:0001655); excluded: Microtia (HP:0008551); excluded: Microcephaly (HP:0000252); excluded: Macrocephaly (HP:0000256); excluded: Craniosynostosis (HP:0001363); excluded: Cleft lip (HP:0410030); excluded: Hypertelorism (HP:0000316); excluded: Ptosis (HP:0000508); excluded: Ventricular septal defect (HP:0001629); excluded: Brain imaging abnormality (HP:0410263); excluded: Micrognathia (HP:0000347); excluded: Intrauterine growth retardation (HP:0001511); excluded: Cleft palate (HP:0000175); excluded: Joint contracture (HP:0034392); excluded: Metopic synostosis (HP:0011330); excluded: Facial asymmetry (HP:0000324); excluded: Abnormality of coordination (HP:0011443); excluded: Infantile spasms (HP:0012469)
Individual 5 (FEMALE; n/a),"Acrofacial dysostosis, Cincinnati type (OMIM:616462)",NM_015425.6:c.1178G>A (heterozygous),Ptosis (HP:0000508); Ventricular septal defect (HP:0001629); Vocal cord paralysis (HP:0001605); Hypotonia (HP:0001252); excluded: Craniosynostosis (HP:0001363); excluded: Cleft lip (HP:0410030); excluded: Micrognathia (HP:0000347); excluded: Hypertelorism (HP:0000316); excluded: Abnormality of limbs (HP:0040064); excluded: Metopic synostosis (HP:0011330); excluded: Joint contracture (HP:0034392); excluded: Patent foramen ovale (HP:0001655); excluded: Facial asymmetry (HP:0000324); excluded: Microtia (HP:0008551); excluded: Low-set ears (HP:0000369); excluded: Brain imaging abnormality (HP:0410263); excluded: Cleft palate (HP:0000175); excluded: Infantile spasms (HP:0012469)
Individual 6 (FEMALE; P10Y4M),"Acrofacial dysostosis, Cincinnati type (OMIM:616462)",NM_015425.6:c.1442G>A (heterozygous),Hypodontia (HP:0000668); Hydrocephalus (HP:0000238); Hypertelorism (HP:0000316); Cleft palate (HP:0000175); Low-set ears (HP:0000369); Facial asymmetry (HP:0000324); Ventriculomegaly (HP:0002119); Bilateral choanal atresia (HP:0004502); Micrognathia (HP:0000347); Syringomyelia (HP:0003396); Intrauterine growth retardation (HP:0001511); excluded: Abnormality of limbs (HP:0040064); excluded: Patent foramen ovale (HP:0001655); excluded: Microtia (HP:0008551); excluded: Microcephaly (HP:0000252); excluded: Macrocephaly (HP:0000256); excluded: Craniosynostosis (HP:0001363); excluded: Hypotonia (HP:0001252); excluded: Cleft lip (HP:0410030); excluded: Ptosis (HP:0000508); excluded: Ventricular septal defect (HP:0001629); excluded: Joint contracture (HP:0034392); excluded: Global developmental delay (HP:0001263); excluded: Intellectual disability (HP:0001249); excluded: Metopic synostosis (HP:0011330); excluded: Developmental regression (HP:0002376); excluded: Abnormality of coordination (HP:0011443); excluded: Infantile spasms (HP:0012469)
Individual 7 (FEMALE; P44Y),"Acrofacial dysostosis, Cincinnati type (OMIM:616462)",NM_015425.6:c.1442G>A (heterozygous),Cleft palate (HP:0000175); Micrognathia (HP:0000347); Low-set ears (HP:0000369); Hypertelorism (HP:0000316); excluded: Abnormality of limbs (HP:0040064); excluded: Patent foramen ovale (HP:0001655); excluded: Microtia (HP:0008551); excluded: Microcephaly (HP:0000252); excluded: Macrocephaly (HP:0000256); excluded: Craniosynostosis (HP:0001363); excluded: Hypotonia (HP:0001252); excluded: Cleft lip (HP:0410030); excluded: Ptosis (HP:0000508); excluded: Ventricular septal defect (HP:0001629); excluded: Brain imaging abnormality (HP:0410263); excluded: Joint contracture (HP:0034392); excluded: Global developmental delay (HP:0001263); excluded: Metopic synostosis (HP:0011330); excluded: Hypodontia (HP:0000668); excluded: Developmental regression (HP:0002376); excluded: Ventriculomegaly (HP:0002119); excluded: Facial asymmetry (HP:0000324); excluded: Abnormality of coordination (HP:0011443); excluded: Infantile spasms (HP:0012469)
Individual 8 (MALE; P2Y),"Acrofacial dysostosis, Cincinnati type (OMIM:616462)",NM_015425.6:c.1488G>T (heterozygous),Calvarial skull defect (HP:0001362); Global developmental delay (HP:0001263); Microtia (HP:0008551); Seizure (HP:0001250); Patent foramen ovale (HP:0001655); Hypertelorism (HP:0000316); Hydronephrosis (HP:0000126); Dysphagia (HP:0002015); Low-set ears (HP:0000369); Abnormality of limbs (HP:0040064); Upper eyelid coloboma (HP:0000636); Micrognathia (HP:0000347); Bilateral cryptorchidism (HP:0008689); Microcephaly (HP:0000252); Aspiration (HP:0002835); Hypotonia (HP:0001252); excluded: Craniosynostosis (HP:0001363); excluded: Cleft lip (HP:0410030); excluded: Metopic synostosis (HP:0011330); excluded: Intrauterine growth retardation (HP:0001511); excluded: Ptosis (HP:0000508); excluded: Joint contracture (HP:0034392); excluded: Ventricular septal defect (HP:0001629); excluded: Developmental regression (HP:0002376); excluded: Ventriculomegaly (HP:0002119); excluded: Facial asymmetry (HP:0000324); excluded: Macrocephaly (HP:0000256); excluded: Cleft palate (HP:0000175); excluded: Abnormality of coordination (HP:0011443); excluded: Infantile spasms (HP:0012469)
Individual 9 (MALE; P4Y),"Acrofacial dysostosis, Cincinnati type (OMIM:616462)",NM_015425.6:c.2583_2586del (heterozygous),Broad forehead (HP:0000337); Microtia (HP:0008551); Hypertelorism (HP:0000316); Deeply set eye (HP:0000490); Developmental regression (HP:0002376); Abnormality of coordination (HP:0011443); Dandy-Walker malformation (HP:0001305); Myoclonus (HP:0001336); Epileptic encephalopathy (HP:0200134); Atonic seizure (HP:0010819); Thick eyebrow (HP:0000574); excluded: Craniosynostosis (HP:0001363); excluded: Hypotonia (HP:0001252); excluded: Cleft lip (HP:0410030); excluded: Micrognathia (HP:0000347); excluded: Low-set ears (HP:0000369); excluded: Abnormality of limbs (HP:0040064); excluded: Metopic synostosis (HP:0011330); excluded: Intrauterine growth retardation (HP:0001511); excluded: Hypodontia (HP:0000668); excluded: Ptosis (HP:0000508); excluded: Joint contracture (HP:0034392); excluded: Facial asymmetry (HP:0000324); excluded: Microcephaly (HP:0000252); excluded: Macrocephaly (HP:0000256); excluded: Cleft palate (HP:0000175); excluded: Infantile spasms (HP:0012469)
Individual 10 (FEMALE; P7Y7M),"Acrofacial dysostosis, Cincinnati type (OMIM:616462)",NM_015425.6:c.3649del (heterozygous),Hypodontia (HP:0000668); Duane anomaly (HP:0009921); Ptosis (HP:0000508); Recurrent otitis media (HP:0000403); Hypertelorism (HP:0000316); Cleft palate (HP:0000175); Abnormality of limbs (HP:0040064); Micrognathia (HP:0000347); excluded: Patent foramen ovale (HP:0001655); excluded: Microtia (HP:0008551); excluded: Low-set ears (HP:0000369); excluded: Microcephaly (HP:0000252); excluded: Macrocephaly (HP:0000256); excluded: Craniosynostosis (HP:0001363); excluded: Hypotonia (HP:0001252); excluded: Cleft lip (HP:0410030); excluded: Ventricular septal defect (HP:0001629); excluded: Intrauterine growth retardation (HP:0001511); excluded: Joint contracture (HP:0034392); excluded: Global developmental delay (HP:0001263); excluded: Metopic synostosis (HP:0011330); excluded: Developmental regression (HP:0002376); excluded: Ventriculomegaly (HP:0002119); excluded: Facial asymmetry (HP:0000324); excluded: Abnormality of coordination (HP:0011443); excluded: Infantile spasms (HP:0012469)


In [43]:
Individual.output_individuals_as_phenopackets(individual_list=individuals,
                                              metadata=metadata)

We output 18 GA4GH phenopackets to the directory phenopackets


In [44]:
ingestor = PhenopacketIngestor(indir="phenopackets")
ppkt_d = ingestor.get_phenopacket_dictionary()
ppkt_list = list(ppkt_d.values())

[pyphetools] Ingested 18 GA4GH phenopackets.
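
As an optional sanity check on the exported files, the sketch below tallies observed and excluded HPO terms per phenopacket using only the standard library. The helper `summarize_phenopackets` is hypothetical and not part of pyphetools. It assumes one phenopacket per `.json` file, with the `id`, `phenotypicFeatures`, and per-feature `excluded` fields of the GA4GH Phenopacket JSON representation.

```python
import json
from pathlib import Path


def summarize_phenopackets(indir):
    """Map each phenopacket id to (n_observed, n_excluded) HPO terms.

    Hypothetical helper, not a pyphetools function. Assumes every
    *.json file in `indir` holds exactly one GA4GH phenopacket.
    """
    summary = {}
    for path in sorted(Path(indir).glob("*.json")):
        with open(path, encoding="utf-8") as fh:
            ppkt = json.load(fh)
        features = ppkt.get("phenotypicFeatures", [])
        # A feature is excluded only when it carries "excluded": true.
        n_excluded = sum(1 for f in features if f.get("excluded", False))
        # Fall back to the file name if the phenopacket has no id.
        key = ppkt.get("id", path.stem)
        summary[key] = (len(features) - n_excluded, n_excluded)
    return summary
```

Calling `summarize_phenopackets("phenopackets")` after the export step should return one entry per ingested file, which can be compared by eye against the observed and excluded terms listed in the table above.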


In [45]:
builder = HpoaTableBuilder(phenopacket_list=ppkt_list)

In [46]:
PMID = "PMID:37075751"  # same PMID as defined in cell 2
creator = builder.autosomal_dominant(PMID).build()
df = creator.get_dataframe()
creator.write_data_frame()

We found a total of 83 unique HPO terms
Extracted disease: Acrofacial dysostosis, Cincinnati type (OMIM:616462)
Wrote HPOA disease file to OMIM-616462.tab
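
To cross-check the written file against the term count reported above, a minimal sketch such as the following could count the distinct values in one column of the tab-separated output. The helper `count_unique_terms` is hypothetical, and the default column name `phenotypeID` is an assumption about the file header that should be verified against the actual `OMIM-616462.tab` file before relying on it.

```python
import csv


def count_unique_terms(tsv_path, column="phenotypeID"):
    """Count distinct non-empty values in one column of a TSV file.

    Hypothetical helper, not a pyphetools function. The default
    column name is an assumption about the HPOA file header.
    """
    with open(tsv_path, newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh, delimiter="\t")
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise ValueError(f"Column {column!r} not found in {tsv_path}")
        # Collect into a set so repeated terms are counted once.
        return len({row[column] for row in reader if row[column]})
```

For example, `count_unique_terms("OMIM-616462.tab")` gives the number of distinct term identifiers in the file, which can then be compared with the count printed by the builder.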
