<h1>Genotype–phenotype correlation at codon 1740 of SETD2</h1>
<p>This notebook generates GA4GH phenopackets from the data reported in <a href="https://pubmed.ncbi.nlm.nih.gov/32710489/">Rabin et al. (2020), Genotype-phenotype correlation at codon 1740 of SETD2</a>.</p>

In [1]:
import pandas as pd
from IPython.display import HTML, display
from pyphetools.creation import *
from pyphetools.visualization import PhenopacketTable, QcVisualizer
from pyphetools.validation import *
import pyphetools

# Show entire column contents; otherwise the long free-text cells are truncated in the previews
pd.set_option('display.max_colwidth', None)
print(f"Using pyphetools version {pyphetools.__version__}")

Using pyphetools version 0.9.66


In [2]:
PMID = "PMID:32710489"
title = "Genotype-phenotype correlation at codon 1740 of SETD2"
cite = Citation(pmid=PMID, title=title)
parser = HpoParser(hpo_json_file="../hp.json")
hpo_cr = parser.get_hpo_concept_recognizer()
hpo_version = parser.get_version()
hpo_ontology = parser.get_ontology()
metadata = MetaData(created_by="ORCID:0000-0002-0736-9199", citation=cite)
metadata.default_versions_with_hpo(version=hpo_version)
print(f"HPO version {hpo_version}")

HPO version 2024-03-06


In [3]:
df = pd.read_excel("input/RabinSupplementaryTable1-SETD2.xlsx")

In [4]:
df.head()

Unnamed: 0,Feature,Group 1 Patient 1,Group 1 Patient 2,Group 1 Patient 3,Group 1 Patient 4,Group 1 Patient 5,Group 1 Patient 6,Group 1 Patient 7,Group 1 Patient 8,Group 1 Patient 9,Group 1 Patient 10,Group 1 Patient 11,Group 1 Patient 12,Group 2 Patient 1,Group 2 Patient 2,Group 2 Patient 3
0,Variant,p.(Arg1740Trp),p.(Arg1740Trp),p.(Arg1740Trp),p.(Arg1740Trp),p.(Arg1740Trp),p.(Arg1740Trp),p.(Arg1740Trp),p.(Arg1740Trp),p.(Arg1740Trp),p.(Arg1740Trp),p.(Arg1740Trp),p.(Arg1740Trp),p.(Arg1740Gln),p.(Arg1740Gln),p.(Arg1740Gln)
1,Sex,Male,Male,Male,Female,Female,Female,Male,Female,Female,Male,Female,Female,Male,Male,Female
2,Prenatal complications,"Extra fluid in the back of the cerebellum at 35 weeks; fetal MRI at 35 weeks showed VSD, small cerebellum, and agenesis of the corpus callosum; pre-eclampsia; \nIUGR","Polyhydramnios, maternal asthma, Maternal MVP, maternal cholelithiasis",Pre-eclampsia; fetal ventriculomegaly,,"Fetal cerebellar hypoplasia, ventriculomegaly, intraventricular hemorrhage",Preterm labor,"Twin gestation; pre-eclampsia, heart defect (VSD, cardiomegaly), urogenital anomaly, suspected toxemia of pregnancy",Perterm labor,IVF pregnancy conceived with frozen sperm; increased nuchal translucency; pre-eclampsia,Ambiguous genitalia; enlarged cisterna magna; right dysplastic multi cystic kidney; dandy walker variant; cardiac defect,Polyhydramnios noted at 20 wks,,,Polyhydramnios,
3,Gestational age,36 weeks,36 4/7 weeks,36 weeks,full term,33 weeks,32 2/7 weeks,30 6/7 weeks,35 5/7 weeks,35 weeks,34 3/7 weeks,40 weeks,39 2/7 weeks,Full term,39 weeks,40 weels
4,Perinatal complications,Caesarean,,Caesarean,,,,,,,,Emergency caesarean for fetal decelerations,,,,


In [5]:
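# Transpose so that each row is one patient and each column is one feature
dft = df.transpose()
# The first row of the transposed frame holds the feature names: promote it to the header
dft.columns = dft.iloc[0]
dft.drop(dft.index[0], inplace=True)
# Keep the patient identifier as a regular column for the CohortEncoder
dft['patient_id'] = dft.index
dft.head(2)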

Feature,Variant,Sex,Prenatal complications,Gestational age,Perinatal complications,Birth weight,Birth length,Birth head circumference,Growth,Weight,...,Cardiac,Gastrointestinal,Renal / urinary tract,Genital,Skeletal,Neuromuscular,Neuroimaging,Other genetic findings,Other,patient_id
Group 1 Patient 1,p.(Arg1740Trp),Male,"Extra fluid in the back of the cerebellum at 35 weeks; fetal MRI at 35 weeks showed VSD, small cerebellum, and agenesis of the corpus callosum; pre-eclampsia; \nIUGR",36 weeks,Caesarean,2126 grams (10%)1,47 cm (50%)1,32.5 cm (45%)1,,7.6 kg at 6 months (40%)2,...,VSD with narrow LVOT; hypoplastic aortic valve; transverse arch hypoplasia and coarctation of the aorta; PDA; ASD,GTT at 4 months,Dilated collecting system; malrotation of right kidney,Cryptorchidism; incomplete foreskin; shawl scrotum,Hip dysplasia at birth,Hypotonia; seizure onset at 5 months; neuromuscular scoliosis,"Widening of SS; enlargement of cisterna magna / extra fuid around the cerebellum; dysgenesis of CC; small pons,",,,Group 1 Patient 1
Group 1 Patient 2,p.(Arg1740Trp),Male,"Polyhydramnios, maternal asthma, Maternal MVP, maternal cholelithiasis",36 4/7 weeks,,3175 grams (80%)1,,33 cm (50%)1,,26.4 kg at 10 years (15%)2,...,Normal echocardiogram,GGT in infancy; GE reflux; constipation,Bilateral duplicated kidneys; hydronephrosis; left VCU reflux,Cryptorchidism; penoscrotal transposition,Hip dysplasia,Scoliosis; seizure onset at 4 months (seizure free after age 6); spastic paraplegia,At 4 days old showed microcephaly with simplified gyral pattern; inferior cerebellar hypoplasia; mega cisterna magna; and hypogensis of the genu and rostrum of the corpus callosum,"SMPD1 paternal LP and maternal VUS, CEP290 paternal and maternal VUS; positive Factor V Leiden",Polycythemia at birth; blood clot in IVC,Group 1 Patient 2
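
<p>The transposition step can be tried out on a toy table. The sketch below is not part of the curation workflow: the two patients and their values are made up, and it uses <code>set_index("Feature")</code> before transposing as an alternative to promoting and dropping the header row by hand.</p>

```python
import pandas as pd

# Toy stand-in for the supplementary table: one row per feature, one column per patient.
wide = pd.DataFrame({
    "Feature": ["Variant", "Sex"],
    "Patient A": ["p.(Arg1740Trp)", "Male"],
    "Patient B": ["p.(Arg1740Gln)", "Female"],
})

# Using the feature names as the index before transposing turns them into the
# column labels of the result, so no header row has to be promoted afterwards.
by_patient = wide.set_index("Feature").T
by_patient["patient_id"] = by_patient.index

print(by_patient)
```

<p>Either approach yields one row per patient, which is the layout the column mappers below operate on.</p>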


In [6]:
column_mapper_list = list()

In [7]:
prenatal_custom_map = {'agenesis of the corpus callosum': 'Agenesis of corpus callosum',
                       '\nIUGR': 'Intrauterine growth retardation',
                       'small cerebellum': 'Cerebellar hypoplasia',
                       'vsd': 'Ventricular septal defect',
                       #'pre-eclampsia': 'Preeclampsia',
                       'right dysplastic multi cystic kidney': 'Multicystic kidney dysplasia'
                       }
excluded = {'maternal asthma', 'heart defect', 'maternal cholelithiasis'}
prenatalMapper = OptionColumnMapper(column_name='Prenatal complications',
                                    concept_recognizer=hpo_cr, option_d=prenatal_custom_map, omitSet=excluded)
column_mapper_list.append(prenatalMapper)
prenatalMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Ventricular septal defect (HP:0001629) (observed),2
1,Cerebellar hypoplasia (HP:0001321) (observed),2
2,Agenesis of corpus callosum (HP:0001274) (observed),1
3,Polyhydramnios (HP:0001561) (observed),3
4,Ventriculomegaly (HP:0002119) (observed),2
5,Intraventricular hemorrhage (HP:0030746) (observed),1
6,Cardiomegaly (HP:0001640) (observed),1
7,Increased nuchal translucency (HP:0010880) (observed),1
8,Ambiguous genitalia (HP:0000062) (observed),1
9,Enlarged cisterna magna (HP:0002280) (observed),1
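
<p>To make the roles of the custom map and the omit set concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the pyphetools implementation: <code>map_cell</code> is a hypothetical helper, the cell text is made up, and the real mapper additionally runs HPO concept recognition on text that the custom map does not cover.</p>

```python
import re

def map_cell(cell, custom_map, omit=frozenset()):
    """Return the labels found in one free-text cell (hypothetical helper)."""
    labels = []
    # Split on the separators used in the table: semicolons and commas.
    for chunk in re.split(r"[;,]", cell):
        text = chunk.strip().lower()
        if not text or text in omit:
            continue  # skip empty chunks and explicitly excluded phrases
        for key, label in custom_map.items():
            # Case-insensitive substring match; an assumption made for this sketch.
            if key.lower() in text and label not in labels:
                labels.append(label)
    return labels

custom_map = {"small cerebellum": "Cerebellar hypoplasia",
              "vsd": "Ventricular septal defect"}
omit = {"maternal asthma"}

print(map_cell("fetal MRI showed VSD, small cerebellum; maternal asthma", custom_map, omit))
```

<p>Phrases in the omit set are dropped before any matching, which is how maternal findings are kept out of the patient's phenotype.</p>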


In [8]:
dev_custom_map = {'Severe global developmental delay': 'Severe global developmental delay'}
devMapper = OptionColumnMapper(column_name='Development',concept_recognizer=hpo_cr, option_d=dev_custom_map)
column_mapper_list.append(devMapper)
devMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Severe global developmental delay (HP:0011344) (observed),9
1,Global developmental delay (HP:0001263) (observed),1


In [9]:
walking_custom_map = {'No': 'Inability to walk',  
                         'No; wheelchair bound at 10 years': 'Inability to walk',
                         'No at 3.5 years and could not stand at 3.5 years':'Delayed ability to walk',
                         'No at 13 years': 'Inability to walk',  
                         'Able to take a few steps at 7 years': 'Inability to walk',  
                       'No at 6 years':'Inability to walk',  
                        }
walkingMapper = OptionColumnMapper(column_name='Walking independently',concept_recognizer=hpo_cr, option_d=walking_custom_map)
column_mapper_list.append(walkingMapper)
walkingMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Inability to walk (HP:0002540) (observed),6
1,Delayed ability to walk (HP:0031936) (observed),1


In [10]:
sitting_custom_map = {'No': 'Delayed ability to sit',
                      'at 2.5 years': 'Delayed ability to sit',
                      'No at 3.5 years and could not stand at 3.5 years': 'Delayed ability to sit',
                      'No at 10 years': 'Delayed ability to sit',
                      'Attempting to sit at 6 years': 'Delayed ability to sit',
                      'No at 6 years': 'Delayed ability to sit',
                      }
sittingMapper = OptionColumnMapper(column_name='Sitting independently',concept_recognizer=hpo_cr, option_d=sitting_custom_map)
column_mapper_list.append(sittingMapper)
sittingMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Delayed ability to sit (HP:0025336) (observed),6


In [11]:
speech_custom_map = {'At 16 months making sounds': 'Delayed speech and language development',  
                         'No speech; only babbling at 10 years': 'Absent speech',
                         'No speech; only cooing at 3.5 years':'Delayed speech and language development',  
                         'Cccasional vocalizations at 7 years': 'Absent speech',
                         '15 months had single words; 4 years 6 months spoke in short sentences with pronunciation difficulties':'Delayed speech and language development',  
                        }
speechMapper = OptionColumnMapper(column_name='speech',concept_recognizer=hpo_cr, option_d=speech_custom_map)
column_mapper_list.append(speechMapper)
speechMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Delayed speech and language development (HP:0000750) (observed),3
1,Absent speech (HP:0001344) (observed),2


In [12]:
skull_map = {'Metopic ridge': 'Prominent metopic ridge'}
skullMapper= OptionColumnMapper(column_name='Fontanelle/ skull',concept_recognizer=hpo_cr, option_d=skull_map)
column_mapper_list.append(skullMapper)
skullMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Large fontanelles (HP:0000239) (observed),1
1,Dolichocephaly (HP:0000268) (observed),1
2,Frontal bossing (HP:0002007) (observed),1
3,Flat occiput (HP:0005469) (observed),1
4,Wide anterior fontanel (HP:0000260) (observed),1
5,Prominent metopic ridge (HP:0005487) (observed),2
6,Brachycephaly (HP:0000248) (observed),1


In [13]:
items = {
    'midface hypoplasia/maxillary hypoplasia': ["Midface retrusion","HP:0011800"],
    'wide nasal bridge': ['Wide nasal bridge', 'HP:0000431'],
    'broad nasal tip': ['Broad nasal tip', 'HP:0000455'],
    'Low hanging columella': ['Low hanging columella', 'HP:0009765'],
    'upslanted palbebral fissures': ['Upslanted palpebral fissure', 'HP:0000582'], 
    'narrow/short palbebral fissures': ['Short palpebral fissure','HP:0012745'],
    'Periorbital fullness': ['Periorbital fullness', 'HP:0000629'],
    'arched eyebrows': ['Highly arched eyebrow', 'HP:0002553'],
    'hypertelorism': ['Hypertelorism',  'HP:0000316'],
    'micrognathia': ['Micrognathia', 'HP:0000347'],
  }
item_column_mapper_d = hpo_cr.initialize_simple_column_maps(column_name_to_hpo_label_map=items, observed='Present',
    excluded='no')
# Add the resulting simple column mappers to column_mapper_list
column_mapper_list.extend(item_column_mapper_d.values())
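
<p>The idea behind these simple column maps is that each column stands for exactly one HPO term and each cell only states whether the term was observed or excluded. A minimal sketch of that classification follows; <code>status_for</code> is a hypothetical helper, the patient values are made up, and the case-insensitive comparison is an assumption of this sketch rather than documented pyphetools behaviour.</p>

```python
def status_for(cell, observed="Present", excluded="no"):
    """Classify one cell as 'observed', 'excluded' or None (not recorded)."""
    if not isinstance(cell, str):
        return None  # empty cells (NaN in pandas) carry no information
    value = cell.strip().lower()
    if value == observed.lower():
        return "observed"
    if value == excluded.lower():
        return "excluded"
    return None  # any other text is treated as not recorded

# One toy column, e.g. 'hypertelorism', for three made-up patients.
column = {"Patient A": "Present", "Patient B": "no", "Patient C": float("nan")}
result = {patient: status_for(value) for patient, value in column.items()}
print(result)
```

<p>Keeping "excluded" distinct from "not recorded" matters because a phenopacket can state explicitly that a feature was looked for and found to be absent.</p>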

In [14]:
handsMapper = OptionColumnMapper(column_name='Minor malfromations of hands and feet',
                                 concept_recognizer=hpo_cr, option_d={})
column_mapper_list.append(handsMapper)
handsMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Single transverse palmar crease (HP:0000954) (observed),4
1,2-3 toe syndactyly (HP:0004691) (observed),2
2,Brachydactyly (HP:0001156) (observed),3
3,Fair hair (HP:0002286) (observed),1
4,Triphalangeal thumb (HP:0001199) (observed),1
5,Rocker bottom foot (HP:0001838) (observed),1
6,Camptodactyly (HP:0012385) (observed),2
7,Short distal phalanx of finger (HP:0009882) (observed),1
8,Small nail (HP:0001792) (observed),1
9,Broad thumb (HP:0011304) (observed),2


In [15]:
ears_d = {'low set': "Low-set ears",
             'Attached ear-lobes':"Attached earlobe",
          'earlobes attached to side':"Attached earlobe"
         }
excluded = {'malformed ears'}

earsMapper = OptionColumnMapper(column_name='Malformations of the ears',
                                concept_recognizer=hpo_cr, option_d=ears_d, omitSet=excluded)
column_mapper_list.append(earsMapper)
earsMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Low-set ears (HP:0000369) (observed),3
1,Posteriorly rotated ears (HP:0000358) (observed),2
2,Preauricular skin tag (HP:0000384) (observed),1
3,Stenosis of the external auditory canal (HP:0000402) (observed),1
4,Auricular pit (HP:0030025) (observed),1
5,Macrotia (HP:0000400) (observed),1
6,Attached earlobe (HP:0009907) (observed),2


In [16]:
other_d = {'down turned corners of the mouth': 'Downturned corners of mouth',
          ' low set nipples': 'Low-set nipples',
          'Inverted': 'Inverted nipples',
          'Synophyrs': 'Synophrys'}
otherMapper = OptionColumnMapper(column_name='Other malformations',concept_recognizer=hpo_cr, option_d=other_d)
column_mapper_list.append(otherMapper)
otherMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Downturned corners of mouth (HP:0002714) (observed),1
1,Low-set nipples (HP:0002562) (observed),3
2,Sparse hair (HP:0008070) (observed),1
3,Blepharophimosis (HP:0000581) (observed),1
4,Hirsutism (HP:0001007) (observed),1
5,Narrow forehead (HP:0000341) (observed),2
6,Smooth philtrum (HP:0000319) (observed),1
7,Narrow mouth (HP:0000160) (observed),2
8,Everted lower lip vermilion (HP:0000232) (observed),1
9,Inverted nipples (HP:0003186) (observed),1


In [17]:
eyeMapper = OptionColumnMapper(column_name='Ophthalmology',concept_recognizer=hpo_cr, option_d={})
column_mapper_list.append(eyeMapper)
eyeMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Optic nerve hypoplasia (HP:0000609) (observed),2
1,Retinal detachment (HP:0000541) (observed),5
2,Ptosis (HP:0000508) (observed),1
3,Cataract (HP:0000518) (observed),2
4,Iris coloboma (HP:0000612) (observed),1
5,Retinal hemorrhage (HP:0000573) (observed),1
6,Retinopathy of prematurity (HP:0500049) (observed),1
7,Retinal telangiectasia (HP:0007763) (observed),1
8,Retinal dysplasia (HP:0007973) (observed),2
9,Glaucoma (HP:0000501) (observed),1


In [18]:
#eyeMapper.preview_column(dft['Audiology'])
ear_d = {'(mixed) hearing loss': 'Mixed hearing impairment',
         'Mixed hearing loss':  'Mixed hearing impairment',
          'Severe mixed hearing loss': 'Mixed hearing impairment',
          'Conductive hearing loss': 'Conductive hearing impairment',
          'Sensorineural hearing loss': 'Sensorineural hearing impairment'}
earMapper = OptionColumnMapper(column_name='Audiology',concept_recognizer=hpo_cr, option_d=ear_d)
column_mapper_list.append(earMapper)
earMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Mixed hearing impairment (HP:0000410) (observed),3
1,Conductive hearing impairment (HP:0000405) (observed),1
2,Hearing impairment (HP:0000365) (observed),3
3,Sensorineural hearing impairment (HP:0000407) (observed),1


In [19]:
endoMapper = OptionColumnMapper(column_name='Endocrine',concept_recognizer=hpo_cr, option_d={})
column_mapper_list.append(endoMapper)
endoMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Hyponatremia (HP:0002902) (observed),8
1,Inappropriate antidiuretic hormone secretion (HP:0031218) (observed),3
2,Hyperkalemia (HP:0002153) (observed),1
3,Hypothyroidism (HP:0000821) (observed),1
4,Bronchiolitis (HP:0011950) (observed),1
5,Neonatal hypoglycemia (HP:0001998) (observed),1
6,Hamartoma (HP:0010566) (observed),1
7,Hirsutism (HP:0001007) (observed),1


In [20]:
respiratory_d = {'trachemalacea': "Tracheomalacia"}
respMapper = OptionColumnMapper(column_name='Respiratory',concept_recognizer=hpo_cr, option_d=respiratory_d)
column_mapper_list.append(respMapper)
respMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Bradycardia (HP:0001662) (observed),1
1,Tracheomalacia (HP:0002779) (observed),3
2,Hypoventilation (HP:0002791) (observed),1
3,Aspiration pneumonia (HP:0011951) (observed),1
4,Restrictive ventilatory defect (HP:0002091) (observed),1
5,Respiratory distress (HP:0002098) (observed),2
6,Central apnea (HP:0002871) (observed),1
7,Apnea (HP:0002104) (observed),1
8,Sleep apnea (HP:0010535) (observed),1
9,Obstructive sleep apnea (HP:0002870) (observed),1


In [21]:
cord_d = {'PFO':'Patent foramen ovale',
            'VSD': 'Ventricular septal defect',
         'transverse arch hypoplasia': 'Hypoplastic aortic arch',
         'LVOT': 'Left ventricular outflow tract obstruction',
         'PDA':'Patent ductus arteriosus',
          'DORV':'Double outlet right ventricle',
          'Persistent LSVC':'Persistent left superior vena cava',
         'ASD':'Atrial septal defect'}
corMapper =  OptionColumnMapper(column_name='Cardiac',concept_recognizer=hpo_cr, option_d=cord_d)
column_mapper_list.append(corMapper)
corMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Ventricular septal defect (HP:0001629) (observed),7
1,Left ventricular outflow tract obstruction (HP:0032092) (observed),1
2,Coarctation of aorta (HP:0001680) (observed),1
3,Hypoplastic aortic arch (HP:0012304) (observed),1
4,Patent ductus arteriosus (HP:0001643) (observed),4
5,Atrial septal defect (HP:0001631) (observed),2
6,Pulmonary artery hypoplasia (HP:0004971) (observed),1
7,Tetralogy of Fallot (HP:0001636) (observed),1
8,Heart block (HP:0012722) (observed),1
9,Patent foramen ovale (HP:0001655) (observed),2


In [22]:
gi_d = {'GTT': 'Feeding difficulties',
       'GGT': 'Feeding difficulties',
       'PEG': 'Feeding difficulties',
       'reflux': 'Gastroesophageal reflux'}
giMapper =  OptionColumnMapper(column_name='Gastrointestinal',concept_recognizer=hpo_cr, option_d=gi_d)
column_mapper_list.append(giMapper)
giMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Feeding difficulties (HP:0011968) (observed),12
1,Gastroesophageal reflux (HP:0002020) (observed),2
2,Constipation (HP:0002019) (observed),2
3,Tube feeding (HP:0033454) (observed),1
4,Intestinal malrotation (HP:0002566) (observed),1
5,Dysphagia (HP:0002015) (observed),1


In [23]:
guMapper =  OptionColumnMapper(column_name='Renal / urinary tract',concept_recognizer=hpo_cr, option_d={})
column_mapper_list.append(guMapper)
guMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Collectionism (HP:0030212) (observed),1
1,Intestinal malrotation (HP:0002566) (observed),1
2,Hydronephrosis (HP:0000126) (observed),1
3,Multicystic kidney dysplasia (HP:0000003) (observed),1
4,Renal dysplasia (HP:0000110) (observed),1
5,Hydroureter (HP:0000072) (observed),1
6,Polycystic kidney dysplasia (HP:0000113) (observed),1
7,Cystic renal dysplasia (HP:0000800) (observed),1


In [24]:
genitalMapper =  OptionColumnMapper(column_name='Genital',concept_recognizer=hpo_cr, option_d={})
column_mapper_list.append(genitalMapper)
genitalMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Cryptorchidism (HP:0000028) (observed),5
1,Shawl scrotum (HP:0000049) (observed),3
2,Penoscrotal transposition (HP:0100600) (observed),1
3,Hernia (HP:0100790) (observed),1
4,Micropenis (HP:0000054) (observed),2
5,Cervical agenesis (HP:0030008) (observed),1


In [25]:
skelMapper =  OptionColumnMapper(column_name='Skeletal',concept_recognizer=hpo_cr, option_d={})
column_mapper_list.append(skelMapper)
skelMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Hip dysplasia (HP:0001385) (observed),3
1,Narrow chest (HP:0000774) (observed),1
2,Sacral dimple (HP:0000960) (observed),2
3,Osteopenia (HP:0000938) (observed),1
4,Dislocated radial head (HP:0003083) (observed),1
5,Hip subluxation (HP:0030043) (observed),1
6,Tethered cord (HP:0002144) (observed),1
7,Talipes equinovarus (HP:0001762) (observed),1
8,Thoracic dysplasia (HP:0006644) (observed),1
9,Accelerated skeletal maturation (HP:0005616) (observed),1


In [26]:
nMapper =  OptionColumnMapper(column_name='Neuromuscular',concept_recognizer=hpo_cr, option_d={})
column_mapper_list.append(nMapper)
nMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Hypotonia (HP:0001252) (observed),4
1,Seizure (HP:0001250) (observed),6
2,Scoliosis (HP:0002650) (observed),6
3,Spastic paraplegia (HP:0001258) (observed),1
4,EEG abnormality (HP:0002353) (observed),1
5,Poor suck (HP:0002033) (observed),1
6,Fever (HP:0001945) (observed),1
7,Generalized myoclonic seizure (HP:0002123) (observed),1
8,Generalized non-motor (absence) seizure (HP:0002121) (observed),1
9,Myoclonus (HP:0001336) (observed),1


In [27]:
imaging_d = {'small pons': 'Hypoplasia of the pons',
            'Dandy Walker malformation': 'Dandy-Walker malformation',
            'hypoplasia of cerebellar vermis': 'Cerebellar vermis hypoplasia',
            'corpus callosum is thinned': 'Thin corpus callosum',
            'Thin CC':'Thin corpus callosum',
            'generalised atrophy particularly brainstem': 'Atrophy/Degeneration affecting the brainstem'}
imagingMapper =  OptionColumnMapper(column_name='Neuroimaging',concept_recognizer=hpo_cr, option_d=imaging_d)
column_mapper_list.append(imagingMapper)
imagingMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Hypoplasia of the pons (HP:0012110) (observed),2
1,Simplified gyral pattern (HP:0009879) (observed),1
2,Microcephaly (HP:0000252) (observed),1
3,Cerebellar hypoplasia (HP:0001321) (observed),3
4,Enlarged cisterna magna (HP:0002280) (observed),4
5,Thin corpus callosum (HP:0033725) (observed),6
6,Dandy-Walker malformation (HP:0001305) (observed),2
7,Cerebellar vermis hypoplasia (HP:0001320) (observed),1
8,Enlarged posterior fossa (HP:0005445) (observed),1
9,Delayed myelination (HP:0012448) (observed),1


<h3>Variants</h3>
<p>Each individual in this study carries one of two missense variants at codon 1740: p.(Arg1740Trp) (c.5218C&gt;T) or p.(Arg1740Gln) (c.5219G&gt;A).</p>
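
<p>As a sanity check, the coding positions used in this notebook (c.5218C&gt;T and c.5219G&gt;A) can be related to codon 1740 with simple arithmetic. The sketch below assumes the reference codon is CGG. This is inferred from the two substitutions (Trp is encoded only by TGG, so a C&gt;T change at the first position implies CGG), not read from the transcript sequence.</p>

```python
# Minimal codon table: only the three codons needed here.
CODON_TO_AA = {"CGG": "Arg", "TGG": "Trp", "CAG": "Gln"}

def codon_of(c_pos):
    """Return (codon number, 1-based position within the codon) for a coding position."""
    return (c_pos - 1) // 3 + 1, (c_pos - 1) % 3 + 1

def apply_snv(codon, pos_in_codon, ref, alt):
    """Substitute one base in a codon after checking the reference base."""
    assert codon[pos_in_codon - 1] == ref, "reference base mismatch"
    return codon[:pos_in_codon - 1] + alt + codon[pos_in_codon:]

REF_CODON = "CGG"  # assumption: inferred from the substitutions, not looked up in NM_014159.7

for c_pos, ref, alt in [(5218, "C", "T"), (5219, "G", "A")]:
    number, offset = codon_of(c_pos)
    mutant = apply_snv(REF_CODON, offset, ref, alt)
    print(f"c.{c_pos}{ref}>{alt}: codon {number}, position {offset}, "
          f"{CODON_TO_AA[REF_CODON]} -> {CODON_TO_AA[mutant]}")
```

<p>Both changes fall in codon 1740 (positions 1 and 2), consistent with the protein-level descriptions in the Variant row of the table.</p>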

In [28]:
genome = 'hg38'
default_genotype = 'heterozygous'
setd2_transcript='NM_014159.7'
vvalidator = VariantValidator(genome_build=genome, transcript=setd2_transcript)

In [29]:
variant_5218 = vvalidator.encode_hgvs('c.5218C>T')
variant_5219 = vvalidator.encode_hgvs('c.5219G>A')
variant_d = {"p.(Arg1740Trp)": variant_5218, 'p.(Arg1740Gln)': variant_5219}
varMapper = VariantColumnMapper(variant_column_name='Variant', variant_d=variant_d, default_genotype=default_genotype)

https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_014159.7%3Ac.5218C>T/NM_014159.7?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_014159.7%3Ac.5219G>A/NM_014159.7?content-type=application%2Fjson
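
<p>The two URLs echoed above follow a regular pattern: genome build, transcript-qualified HGVS description, and transcript again. The sketch below rebuilds that pattern with the standard library. It only reproduces the strings shown in the output; <code>variantvalidator_url</code> is a hypothetical helper, no request is sent, and how pyphetools assembles the URL internally is not shown here.</p>

```python
from urllib.parse import quote, urlencode

BASE = "https://rest.variantvalidator.org/VariantValidator/variantvalidator"

def variantvalidator_url(genome, transcript, hgvs_c):
    """Build a request URL in the same shape as the ones printed above."""
    description = f"{transcript}:{hgvs_c}"
    # Percent-encode ':' but keep '>' literal, matching the echoed URLs.
    path = quote(description, safe=">")
    query = urlencode({"content-type": "application/json"})
    return f"{BASE}/{genome}/{path}/{transcript}?{query}"

print(variantvalidator_url("hg38", "NM_014159.7", "c.5218C>T"))
```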


In [30]:
# No age mapper is defined: age is not provided in the source table
sexMapper = SexColumnMapper(male_symbol='Male', female_symbol='Female', column_name='Sex')
individual_column_name = 'patient_id'

encoder = CohortEncoder(df=dft, 
                        hpo_cr=hpo_cr, 
                        column_mapper_list=column_mapper_list, 
                        individual_column_name=individual_column_name,
                        sexmapper=sexMapper,
                        variant_mapper=varMapper,
                        metadata=metadata)

rabin_omim = "OMIM:620155"
rabin_label = "Rabin-Pappas syndrome"
rabin = Disease(disease_id=rabin_omim, disease_label=rabin_label)
idd_ad70_omim = "OMIM:620157"
idd_ad70_label = "Intellectual developmental disorder, autosomal dominant 70"
idd_ad70 = Disease(disease_id=idd_ad70_omim,  disease_label=idd_ad70_label)
# Create map from patient id to labels
disease_map = {}
for i in range(1, 13):
    pat_id = f"Group 1 Patient {i}"
    disease_map[pat_id] = rabin
for i in range(1, 4):
    pat_id = f"Group 2 Patient {i}"
    disease_map[pat_id] = idd_ad70
encoder.set_disease_dictionary(disease_map)

In [31]:
individuals = encoder.get_individuals()
cvalidator = CohortValidator(cohort=individuals, ontology=hpo_ontology, min_hpo=1, allelic_requirement=AllelicRequirement.MONO_ALLELIC)
qc = QcVisualizer(cohort_validator=cvalidator)
display(HTML(qc.to_summary_html()))

Level,Error category,Count
WARNING,REDUNDANT,6
INFORMATION,NOT_MEASURED,69
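
<p>A REDUNDANT warning presumably means that an individual was annotated with both a term and one of its ancestors; for example, the Respiratory column yielded Apnea, Sleep apnea and Obstructive sleep apnea. The sketch below shows the idea on a hand-written fragment of the hierarchy. The parent links are assumptions made for illustration, and the actual check reads the relationships from <code>hp.json</code>.</p>

```python
# Hand-written child -> parent links for illustration only.
PARENT = {
    "Obstructive sleep apnea": "Sleep apnea",
    "Sleep apnea": "Apnea",
    "Mixed hearing impairment": "Hearing impairment",
}

def ancestors(term):
    """Collect all ancestors of a term by walking up the toy hierarchy."""
    found = set()
    while term in PARENT:
        term = PARENT[term]
        found.add(term)
    return found

def redundant_terms(observed):
    """Return observed terms that are ancestors of another observed term."""
    observed = set(observed)
    redundant = set()
    for term in observed:
        redundant |= ancestors(term) & observed
    return redundant

print(redundant_terms(["Sleep apnea", "Apnea", "Seizure"]))
```

<p>Dropping the less specific term loses no information, because the more specific term implies it.</p>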


In [32]:
individuals = cvalidator.get_error_free_individual_list()
table = PhenopacketTable(individual_list=individuals, metadata=metadata)
display(HTML(table.to_html()))

Individual,Disease,Genotype,Phenotypic features
Group 1 Patient 1 (MALE; n/a),Rabin-Pappas syndrome (OMIM:620155),NM_014159.7:c.5218C>T (heterozygous),Ptosis (HP:0000508); Delayed speech and language development (HP:0000750); Cerebellar hypoplasia (HP:0001321); Agenesis of corpus callosum (HP:0001274); Upslanted palpebral fissure (HP:0000582); Patent ductus arteriosus (HP:0001643); Highly arched eyebrow (HP:0002553); Coarctation of aorta (HP:0001680); Hypotonia (HP:0001252); Micrognathia (HP:0000347); Low-set ears (HP:0000369); Downturned corners of mouth (HP:0002714); Bradycardia (HP:0001662); Collectionism (HP:0030212); Hypoplastic aortic arch (HP:0012304); Feeding difficulties (HP:0011968); Delayed ability to sit (HP:0025336); 2-3 toe syndactyly (HP:0004691); Retinal detachment (HP:0000541); Hip dysplasia (HP:0001385); Hyperkalemia (HP:0002153); Scoliosis (HP:0002650); Large fontanelles (HP:0000239); Hyponatremia (HP:0002902); Inappropriate antidiuretic hormone secretion (HP:0031218); Mixed hearing impairment (HP:0000410); Optic nerve hypoplasia (HP:0000609); Hypoplasia of the pons (HP:0012110); Brachydactyly (HP:0001156); Hypertelorism (HP:0000316); Single transverse palmar crease (HP:0000954); Severe global developmental delay (HP:0011344); Short palpebral fissure (HP:0012745); Cryptorchidism (HP:0000028); Seizure (HP:0001250); Ventricular septal defect (HP:0001629); Intestinal malrotation (HP:0002566); Inability to walk (HP:0002540); Left ventricular outflow tract obstruction (HP:0032092); Atrial septal defect (HP:0001631); Shawl scrotum (HP:0000049)
Group 1 Patient 2 (MALE; n/a),Rabin-Pappas syndrome (OMIM:620155),NM_014159.7:c.5218C>T (heterozygous),Polyhydramnios (HP:0001561); Severe global developmental delay (HP:0011344); Inability to walk (HP:0002540); Delayed ability to sit (HP:0025336); Absent speech (HP:0001344); Midface retrusion (HP:0011800); Wide nasal bridge (HP:0000431); Broad nasal tip (HP:0000455); Low hanging columella (HP:0009765); Short palpebral fissure (HP:0012745); Periorbital fullness (HP:0000629); Highly arched eyebrow (HP:0002553); Hypertelorism (HP:0000316); Single transverse palmar crease (HP:0000954); Fair hair (HP:0002286); Triphalangeal thumb (HP:0001199); Optic nerve hypoplasia (HP:0000609); Cataract (HP:0000518); Mixed hearing impairment (HP:0000410); Tracheomalacia (HP:0002779); Feeding difficulties (HP:0011968); Gastroesophageal reflux (HP:0002020); Constipation (HP:0002019); Hydronephrosis (HP:0000126); Cryptorchidism (HP:0000028); Penoscrotal transposition (HP:0100600); Hip dysplasia (HP:0001385); Scoliosis (HP:0002650); Seizure (HP:0001250); Spastic paraplegia (HP:0001258); Simplified gyral pattern (HP:0009879); Microcephaly (HP:0000252); Cerebellar hypoplasia (HP:0001321); Enlarged cisterna magna (HP:0002280)
Group 1 Patient 3 (MALE; n/a),Rabin-Pappas syndrome (OMIM:620155),NM_014159.7:c.5218C>T (heterozygous),Ventriculomegaly (HP:0002119); Hypertelorism (HP:0000316); Rocker bottom foot (HP:0001838); Low-set nipples (HP:0002562); Hyponatremia (HP:0002902); Hypoventilation (HP:0002791); Tracheomalacia (HP:0002779); Ventricular septal defect (HP:0001629); Pulmonary artery hypoplasia (HP:0004971); Cryptorchidism (HP:0000028); Shawl scrotum (HP:0000049); EEG abnormality (HP:0002353); Seizure (HP:0001250); Thin corpus callosum (HP:0033725)
Group 1 Patient 4 (FEMALE; n/a),Rabin-Pappas syndrome (OMIM:620155),NM_014159.7:c.5218C>T (heterozygous),Dolichocephaly (HP:0000268); Frontal bossing (HP:0002007); Midface retrusion (HP:0011800); Wide nasal bridge (HP:0000431); Broad nasal tip (HP:0000455); Low hanging columella (HP:0009765); Upslanted palpebral fissure (HP:0000582); Short palpebral fissure (HP:0012745); Periorbital fullness (HP:0000629); Highly arched eyebrow (HP:0002553); Hypertelorism (HP:0000316); Camptodactyly (HP:0012385); Low-set ears (HP:0000369); Posteriorly rotated ears (HP:0000358); Low-set nipples (HP:0002562); Sparse hair (HP:0008070); Iris coloboma (HP:0000612); Retinal detachment (HP:0000541); Hyponatremia (HP:0002902); Inappropriate antidiuretic hormone secretion (HP:0031218); Aspiration pneumonia (HP:0011951); Tetralogy of Fallot (HP:0001636); Tube feeding (HP:0033454); Multicystic kidney dysplasia (HP:0000003); Narrow chest (HP:0000774); Sacral dimple (HP:0000960); Poor suck (HP:0002033); Dandy-Walker malformation (HP:0001305)
Group 1 Patient 5 (FEMALE; n/a),Rabin-Pappas syndrome (OMIM:620155),NM_014159.7:c.5218C>T (heterozygous),Delayed speech and language development (HP:0000750); Cerebellar hypoplasia (HP:0001321); Wide nasal bridge (HP:0000431); Cataract (HP:0000518); Highly arched eyebrow (HP:0002553); Thin corpus callosum (HP:0033725); Osteopenia (HP:0000938); Small nail (HP:0001792); Hypotonia (HP:0001252); Everted lower lip vermilion (HP:0000232); Low hanging columella (HP:0009765); Enlarged cisterna magna (HP:0002280); Stenosis of the external auditory canal (HP:0000402); Flat occiput (HP:0005469); Dislocated radial head (HP:0003083); Feeding difficulties (HP:0011968); Delayed ability to sit (HP:0025336); Midface retrusion (HP:0011800); Hirsutism (HP:0001007); Scoliosis (HP:0002650); Ventriculomegaly (HP:0002119); Broad nasal tip (HP:0000455); Intraventricular hemorrhage (HP:0030746); Heart block (HP:0012722); Preauricular skin tag (HP:0000384); Delayed ability to walk (HP:0031936); Hypertelorism (HP:0000316); Single transverse palmar crease (HP:0000954); Fever (HP:0001945); Severe global developmental delay (HP:0011344); Short palpebral fissure (HP:0012745); Smooth philtrum (HP:0000319); Short foot (HP:0001773); Seizure (HP:0001250); Broad thumb (HP:0011304); Narrow mouth (HP:0000160); Blepharophimosis (HP:0000581); Narrow forehead (HP:0000341); Conductive hearing impairment (HP:0000405); Short distal phalanx of finger (HP:0009882); Periorbital fullness (HP:0000629)
Group 1 Patient 6 (FEMALE; n/a),Rabin-Pappas syndrome (OMIM:620155),NM_014159.7:c.5218C>T (heterozygous),Severe global developmental delay (HP:0011344); Inability to walk (HP:0002540); Delayed ability to sit (HP:0025336); Wide anterior fontanel (HP:0000260); Midface retrusion (HP:0011800); Wide nasal bridge (HP:0000431); Broad nasal tip (HP:0000455); Low hanging columella (HP:0009765); Short palpebral fissure (HP:0012745); Periorbital fullness (HP:0000629); Highly arched eyebrow (HP:0002553); Hypertelorism (HP:0000316); 2-3 toe syndactyly (HP:0004691); Broad hallux (HP:0010055); Auricular pit (HP:0030025); Sensorineural hearing impairment (HP:0000407); Inappropriate antidiuretic hormone secretion (HP:0031218); Hyponatremia (HP:0002902); Tracheomalacia (HP:0002779); Restrictive ventilatory defect (HP:0002091); Respiratory distress (HP:0002098); Central apnea (HP:0002871); Ventricular septal defect (HP:0001629); Patent ductus arteriosus (HP:0001643); Patent foramen ovale (HP:0001655); Feeding difficulties (HP:0011968); Intestinal malrotation (HP:0002566); Hip subluxation (HP:0030043); Sacral dimple (HP:0000960); Tethered cord (HP:0002144); Seizure (HP:0001250); Thin corpus callosum (HP:0033725); Hypoplasia of the pons (HP:0012110); Delayed myelination (HP:0012448)
Group 1 Patient 7 (MALE; n/a),Rabin-Pappas syndrome (OMIM:620155),NM_014159.7:c.5218C>T (heterozygous),Posteriorly rotated ears (HP:0000358); Gastroesophageal reflux (HP:0002020); Retinopathy of prematurity (HP:0500049); Wide nasal bridge (HP:0000431); Highly arched eyebrow (HP:0002553); Hernia (HP:0100790); Talipes equinovarus (HP:0001762); Low hanging columella (HP:0009765); Low-set ears (HP:0000369); Enlarged cisterna magna (HP:0002280); Feeding difficulties (HP:0011968); Midface retrusion (HP:0011800); Micropenis (HP:0000054); Cardiomegaly (HP:0001640); Broad nasal tip (HP:0000455); Hyponatremia (HP:0002902); Apnea (HP:0002104); Hypothyroidism (HP:0000821); Hypertelorism (HP:0000316); Single transverse palmar crease (HP:0000954); Renal dysplasia (HP:0000110); Retinal hemorrhage (HP:0000573); Severe global developmental delay (HP:0011344); Short palpebral fissure (HP:0012745); Cryptorchidism (HP:0000028); Seizure (HP:0001250); Ventricular septal defect (HP:0001629); Narrow mouth (HP:0000160); Persistent left superior vena cava (HP:0005301); Quadricuspid aortic valve (HP:0031655); Shawl scrotum (HP:0000049); Periorbital fullness (HP:0000629)
Group 1 Patient 8 (FEMALE; n/a),Rabin-Pappas syndrome (OMIM:620155),NM_014159.7:c.5218C>T (heterozygous),Severe global developmental delay (HP:0011344); Inability to walk (HP:0002540); Delayed ability to sit (HP:0025336); Midface retrusion (HP:0011800); Wide nasal bridge (HP:0000431); Broad nasal tip (HP:0000455); Low hanging columella (HP:0009765); Short palpebral fissure (HP:0012745); Periorbital fullness (HP:0000629); Highly arched eyebrow (HP:0002553); Hypertelorism (HP:0000316); Micrognathia (HP:0000347); Retinal telangiectasia (HP:0007763); Retinal detachment (HP:0000541); Mixed hearing impairment (HP:0000410); Pulmonary artery stenosis (HP:0004415); Dysphagia (HP:0002015); Feeding difficulties (HP:0011968); Hydroureter (HP:0000072); Generalized non-motor (absence) seizure (HP:0002121); Generalized myoclonic seizure (HP:0002123); Scoliosis (HP:0002650); Thin corpus callosum (HP:0033725)
Group 1 Patient 9 (FEMALE; n/a),Rabin-Pappas syndrome (OMIM:620155),NM_014159.7:c.5218C>T (heterozygous),Increased nuchal translucency (HP:0010880); Severe global developmental delay (HP:0011344); Inability to walk (HP:0002540); Absent speech (HP:0001344); Midface retrusion (HP:0011800); Wide nasal bridge (HP:0000431); Broad nasal tip (HP:0000455); Low hanging columella (HP:0009765); Periorbital fullness (HP:0000629); Highly arched eyebrow (HP:0002553); Hypertelorism (HP:0000316); Brachydactyly (HP:0001156); Broad thumb (HP:0011304); Short foot (HP:0001773); Short toe (HP:0001831); Inverted nipples (HP:0003186); Low-set nipples (HP:0002562); Retinal dysplasia (HP:0007973); Glaucoma (HP:0000501); Hearing impairment (HP:0000365); Bronchiolitis (HP:0011950); Hyponatremia (HP:0002902); Sleep apnea (HP:0010535); Ventricular septal defect (HP:0001629); Double outlet right ventricle (HP:0001719); Thin corpus callosum (HP:0033725); Atrophy/Degeneration affecting the brainstem (HP:0007366)
Group 1 Patient 10 (MALE; n/a),Rabin-Pappas syndrome (OMIM:620155),NM_014159.7:c.5218C>T (heterozygous),Respiratory distress (HP:0002098); Cerebellar hypoplasia (HP:0001321); Wide nasal bridge (HP:0000431); Highly arched eyebrow (HP:0002553); Ambiguous genitalia (HP:0000062); Multicystic kidney dysplasia (HP:0000003); Double outlet right ventricle (HP:0001719); Pulmonic stenosis (HP:0001642); Low hanging columella (HP:0009765); Enlarged cisterna magna (HP:0002280); Polycystic kidney dysplasia (HP:0000113); Feeding difficulties (HP:0011968); Midface retrusion (HP:0011800); Micropenis (HP:0000054); Broad nasal tip (HP:0000455); Hyponatremia (HP:0002902); Hypertelorism (HP:0000316); Cryptorchidism (HP:0000028); Short palpebral fissure (HP:0012745); Ventricular septal defect (HP:0001629); Myoclonus (HP:0001336); Periorbital fullness (HP:0000629)


In [33]:
Individual.output_individuals_as_phenopackets(individual_list=individuals, metadata=metadata)

We output 15 GA4GH phenopackets to the directory phenopackets
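
<p>The written files can be checked with a few lines of standard-library code. The sketch below assumes that each phenopacket is stored as a separate <code>.json</code> file in the output directory and uses the <code>id</code> and <code>phenotypicFeatures</code> fields of the GA4GH Phenopacket JSON representation; <code>summarize_phenopackets</code> is a hypothetical helper.</p>

```python
import json
from pathlib import Path

def summarize_phenopackets(directory):
    """Map each phenopacket id to its number of phenotypic features."""
    summary = {}
    for path in sorted(Path(directory).glob("*.json")):
        with open(path, encoding="utf-8") as handle:
            packet = json.load(handle)
        # Fall back to the file name if a packet has no 'id' field.
        summary[packet.get("id", path.stem)] = len(packet.get("phenotypicFeatures", []))
    return summary
```

<p>Calling <code>summarize_phenopackets("phenopackets")</code> after the cell above should return one entry per individual if the file-layout assumption holds.</p>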
