# KDM6B, Rots D et al.

Data taken from  [Rots D, The clinical and molecular spectrum of the KDM6B-related neurodevelopmental disorder. Am J Hum Genet. 2023 ](https://pubmed.ncbi.nlm.nih.gov/37196654/)
Data extracted from Table S1. Detailed clinical information of the cases with the (likely) pathogenic KDM6B variants.

In [1]:
import pandas as pd
pd.set_option('display.max_colwidth', None) # show entire column contents, important!
from collections import defaultdict
from IPython.display import HTML, display
from pyphetools.creation import *
from pyphetools.validation import *
from pyphetools.visualization import *
import pyphetools
print(f"Using pyphetools version {pyphetools.__version__}")

Using pyphetools version 0.9.65


In [2]:
PMID = "PMID:37196654"
title = "The clinical and molecular spectrum of the KDM6B-related neurodevelopmental disorder"
cite = Citation(pmid=PMID, title=title)
parser = HpoParser(hpo_json_file="../hp.json")
hpo_cr = parser.get_hpo_concept_recognizer()
hpo_version = parser.get_version()
hpo_ontology = parser.get_ontology()
metadata = MetaData(created_by="ORCID:0000-0002-0736-9199", citation=cite)
metadata.default_versions_with_hpo(version=hpo_version)
print(f"HPO version {hpo_version}")

HPO version 2024-02-27


In [3]:
df = pd.read_excel('input/Rots_2023_PMID_37196654.xlsx')

In [4]:
df.head()

Unnamed: 0,Field,Individual 1,Individual 2,Individual 3,Individual 5,Individual 6,Individual 7,Individual 8,Individual 9,Individual 11,...,Individual 4,Individual 10,Individual 34,Individual 38,Individual 44 (DDD_286674),Individual 49 (DEASD_0146_001),Individual 50 (DEASD_0129_001),Individual 54 (SSC_13675.p1),Individual 58 (DDD_305030),Individual 59 (DDD_306396)
0,Sex,F,F,M,M,M,F,M,M,F,...,M,M,M,M,F,M,M,M,M,M
1,"Age, years",16,10,9,25,13y2m,9y6m,10,6y6m,19,...,14,4,11y,6,3,7y3m,8y7m,,,
2,Cohort type,Clinical testing,Clinical testing,Clinical testing,Clinical testing,Clinical testing,Clinical testing,Clinical testing,Clinical testing,Clinical testing,...,Clinical testing,Clinical testing,Clinical testing,Clinical testing,Research and clinical testing,Research cohort,Research cohort,Research cohort,Research cohort,Research cohort
3,Mutation (NM_),1,2,3,4,5,6,7,8,9,...,64,65,66,67,68,69,70,71,72,73
4,cDNA change (ENST00000254846.9 or NM_001080424.2),c.1014delC,c.1085_1088del,c.654_655del,c.1439dup,c.2598delC,c.4500C>A,c.403C>T,c.4737+1G>A,c.3288_3291delTGAG,...,c.4696C>A,c.3762_3764del,c.4118T>C,c.4193C>A,c.4724G>C,c.4174G>A,c.4186T>A,c.4187_4189del,c.4187_4189del,c.4222T>C


In [5]:
dft = df.transpose()
dft.columns = dft.iloc[0]
dft.drop(dft.index[0], inplace=True)
dft.head()

Field,Sex,"Age, years",Cohort type,Mutation (NM_),cDNA change (ENST00000254846.9 or NM_001080424.2),Amino acid change,Variant Type (PTV or PAV),Inheritance,Heterozygous/Homozygous,Additional findings of genetic testing,...,Constipation,Other_gi,Skin hyperlaxity,Genitourinary abnormalities,Cryptorchidism,Other medication received,Other,NaN,NaN.1,NaN.2
Individual 1,F,16,Clinical testing,1,c.1014delC,p.(Arg340Alafs*147),PTV,Maternal,Heterozygous,No,...,No,,No,No,,Not reported,Nasal speech,,,
Individual 2,F,10,Clinical testing,2,c.1085_1088del,p.(Glu362Alafs*124),PTV,Paternal,Heterozygous,No,...,No,,No,No,,No,2 Cafe-au-lait spots,Night incontinence,commonly head and abdominal pain,
Individual 3,M,9,Clinical testing,3,c.654_655del,p.(Glu220Glyfs*16),PTV,de novo,Heterozygous,Beta-thalasemia carrier,...,Yes,,No,No,No,"Melatonine(sleep problems); Prednisolon, budesonide, salbutamol (asthma); macrogol (constipations); esomeprazole (GERD)",1 Cafe-au-lait spot,Adenotomy due to the hyperplasia,Bronchial asthma,Verry common airway infections
Individual 5,M,25,Clinical testing,4,c.1439dup,p.(Pro481Thrfs*29),PTV,de novo,Heterozygous,No,...,No,"Eats/drinks no cow's milk, no gluten and little soya. No allergy but seems sensitive to these products",No,No,No,Vitamins and feeding supplements through alternative doctor,At metabolic screening increased essential amino acids alanin amongst others,,,
Individual 6,M,13y2m,Clinical testing,5,c.2598delC,p.(Ser867Argfs*27),PTV,de novo,Heterozygous,No,...,No,,No,phimosis,No,No,tongue frenulum IQ because of dyslalia. Double appical hair whorl,,,


In [6]:
dft['individual_id'] = dft.index  # Set the new column 'patient_id' to be identical to the contents of the index
dft.head(2)

Field,Sex,"Age, years",Cohort type,Mutation (NM_),cDNA change (ENST00000254846.9 or NM_001080424.2),Amino acid change,Variant Type (PTV or PAV),Inheritance,Heterozygous/Homozygous,Additional findings of genetic testing,...,Other_gi,Skin hyperlaxity,Genitourinary abnormalities,Cryptorchidism,Other medication received,Other,NaN,NaN.1,NaN.2,individual_id
Individual 1,F,16,Clinical testing,1,c.1014delC,p.(Arg340Alafs*147),PTV,Maternal,Heterozygous,No,...,,No,No,,Not reported,Nasal speech,,,,Individual 1
Individual 2,F,10,Clinical testing,2,c.1085_1088del,p.(Glu362Alafs*124),PTV,Paternal,Heterozygous,No,...,,No,No,,No,2 Cafe-au-lait spots,Night incontinence,commonly head and abdominal pain,,Individual 2


In [7]:
generator = SimpleColumnMapperGenerator(df=dft, observed="Yes", excluded="No", hpo_cr=hpo_cr)
column_mapper_list = generator.try_mapping_columns()
display(HTML(generator.to_html()))

Result,Columns
Mapped,Motor delay; Intellectual disability; Autism spectrum disorder; Sleep disturbances; Hypotonia; Spasticity; Joint hypermobility; Syndactyly; Pectus excavatum; Strabismus; Recurrent ear infections; Constipation; Cryptorchidism
Unmapped,"Sex; Age, years; Cohort type; Mutation (NM_); cDNA change (ENST00000254846.9 or NM_001080424.2); Amino acid change; Variant Type (PTV or PAV); Inheritance; Heterozygous/Homozygous; Additional findings of genetic testing; Other affected relatives; Pregnancy/delivery; Complications of Pregnancy/Delivery; Gestational age, weeks; Birth weight, g (SD); Growth; Height, cm (SD); Weight, kg(SD); Head circumference, cm(SD); Age at folow-up/measurements, years; Neurodevelopment; Language/speech delay; First words, months; First steps, months; IQ profile; nan; nan; Behavior problems; Psychosis / Schizophrenia; Use of psychiatric drugs; Other_neurodev; Neurological; Seizures / Epilepsy; Dystonia, if present - type and age of onset; Other neurological/movement issues; Brain MRI findings; Musculoskeletal/extremities; Vertebral abnormalities (Scoliosis, kyphosis etc).; Hand /foot/ finger abnormalities; Other_musculoskel; Dysmorphism; Dysmorphic features; Lip/palate cleft; Eyes/visual problems; Hypermetropia/myopia; Other_eye; Ear/ hearing problems; Hearing; Other_ear; Cardiovascular; Congenital heart disease; Other_cv; Gastrointestinal; Neonatal feeding difficulties; Yes; Other_gi; Skin hyperlaxity; Genitourinary abnormalities; Other medication received; Other; nan; nan; nan; individual_id"


In [8]:
option_d = {'Bossy, Agressive (verbally)':'Aggressive behavior',
           'Yes': 'Atypical behavior'}
excluded_d = {'No':'Atypical behavior'}
behaviorMapper = OptionColumnMapper(column_name="Behavior problems",concept_recognizer=hpo_cr, option_d=option_d, excluded_d=excluded_d)
column_mapper_list.append(behaviorMapper)
behaviorMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Aggressive behavior (HP:0000718) (observed),8
1,Atypical behavior (HP:0000708) (excluded),24
2,Irritability (HP:0000737) (observed),1
3,Anxiety (HP:0000739) (observed),5
4,Obstipation (HP:0034782) (observed),1
5,Atypical behavior (HP:0000708) (observed),7
6,Impulsivity (HP:0100710) (observed),2
7,Agitation (HP:0000713) (observed),1
8,Depression (HP:0000716) (observed),1
9,Attention deficit hyperactivity disorder (HP:0007018) (observed),3


In [9]:
# only needed to generate suggestions for mappers
#result = OptionColumnMapper.autoformat(df=dft, concept_recognizer=hpo_cr, delimiter=";,")

In [10]:
complications_of_pregnancy_d = {
 'Gestational diabetes. Requiring insulin in 3rd term': 'Maternal diabetes',
 'Premature rupture of amnion': 'Premature rupture of membranes',
 'No; cesarian section for breech presentation': 'Breech presentation',
 'Pregnancy uncomplicate; Prolonged delivery and vacuum extraction': 'Ventouse delivery',
 'Small for gestational age, c-section': 'Small for gestational age',
 'Failed induction, fetal distress': 'Fetal distress',
 'Vacuum extraction': 'Ventouse delivery',
 'Intrauterine growth restriction': 'Intrauterine growth retardation',
 'Gestational diabetes': 'Maternal diabetes',
 'Shoulder dystocia': 'Shoulder dystocia',
 #'Maternal alcohol abuse': 'Fetal alcohol exposure',
 'Moderate IUGR': 'Moderate intrauterine growth retardation',
 'Polyhydramnios, elective c-section': 'Polyhydramnios',
 'Ventouse; Induced': 'Ventouse delivery',
 'induction of labor at 40^ wg for reduced fetal movements': 'Decreased fetal movement',
 'induction of labor at 37/40 for reduced fetal movements; born in good condition (Apgar scores 7 & 8). Deteriorated in first day of life - severe pulmonary HTn (see cardiac)': 'Decreased fetal movement'
 }
complications_of_pregnancyMapper = OptionColumnMapper(column_name="Complications of Pregnancy/Delivery",
                    concept_recognizer=hpo_cr, option_d=complications_of_pregnancy_d)
column_mapper_list.append(complications_of_pregnancyMapper)
complications_of_pregnancyMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Maternal diabetes (HP:0009800) (observed),3
1,Premature rupture of membranes (HP:0001788) (observed),1
2,Breech presentation (HP:0001623) (observed),1
3,Ventouse delivery (HP:0011412) (observed),3
4,Small for gestational age (HP:0001518) (observed),1
5,Fetal distress (HP:0025116) (observed),1
6,Intrauterine growth retardation (HP:0001511) (observed),1
7,Hydrocephalus (HP:0000238) (observed),1
8,Shoulder dystocia (HP:0011413) (observed),1
9,Hypertension (HP:0000822) (observed),2


In [11]:
birth_weight_g_d = {'4370 (>2)': 'Large for gestational age',
 '4490 (>2)': 'Large for gestational age',
 '3595 (>2)': 'Large for gestational age',
 '1814 (<-2)': 'Small for gestational age',
 '2414 (>2)': 'Large for gestational age',
 '3814 (>2)': 'Large for gestational age',
 '4000 (>2)': 'Large for gestational age',
 '4140 (>2)': 'Large for gestational age',
 '4150  (>2)': 'Large for gestational age',
 '1644 (-2.60)': 'Small for gestational age',
 '2100 (< -2)': 'Small for gestational age',
 '2270 (<-2)': 'Small for gestational age',
 '3800 (>2)': 'Large for gestational age',}
birth_weight_g_Mapper = OptionColumnMapper(column_name='Birth weight, g (SD)',concept_recognizer=hpo_cr, option_d=birth_weight_g_d)
column_mapper_list.append(birth_weight_g_Mapper)
birth_weight_g_Mapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Large for gestational age (HP:0001520) (observed),9
1,Small for gestational age (HP:0001518) (observed),4


In [12]:
height_cm_d = {
 '149.5 (+3)': 'Tall stature',
 '130 (+2.6)': 'Tall stature',
 '149 (+3.5)': 'Tall stature',
 }
height_cm_Mapper = OptionColumnMapper(column_name='Height, cm (SD)',concept_recognizer=hpo_cr, option_d=height_cm_d)
column_mapper_list.append(height_cm_Mapper)
height_cm_Mapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Tall stature (HP:0000098) (observed),3


In [13]:
weight_kg_d = {'60 (>+2 weight to length)': 'Obesity',
 '70 (+4.1)': 'Obesity',
 '83.1 (+2.5)': 'Obesity',
 '31 (+2.4)': 'Obesity',
 '22.1 (+2.4)': 'Obesity',
 '149.6 (>3)': 'Obesity'}
weight_kgMapper = OptionColumnMapper(column_name='Weight, kg(SD)',concept_recognizer=hpo_cr, option_d=weight_kg_d)
column_mapper_list.append(weight_kgMapper)
weight_kgMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Obesity (HP:0001513) (observed),6


In [14]:
head_circumference_cm_d = {
 '58 (>+2.5)': 'Macrocephaly',
 '60.5 (>+2)': 'Macrocephaly',
 '57 (>+2.5)': 'Macrocephaly',
 '55.3 (>2.5)': 'Macrocephaly',
 '55.5 (+2.3)': 'Macrocephaly',
 '56 (>+2)': 'Macrocephaly',
 '55 (+3.4)': 'Macrocephaly',
 '56.5 (+3.3)': 'Macrocephaly',
 '54.5 (>+2)': 'Macrocephaly',
 '59 (+2.7)': 'Macrocephaly',
 '55 (+2.2)': 'Macrocephaly',
 '59.4 (+3)': 'Macrocephaly'}
head_circumference_cmMapper = OptionColumnMapper(column_name='Head circumference, cm(SD)',
                                                 concept_recognizer=hpo_cr, option_d=head_circumference_cm_d)
column_mapper_list.append(head_circumference_cmMapper)
head_circumference_cmMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Macrocephaly (HP:0000256) (observed),12


In [15]:
language_speech_delay_d = {'Yes': 'Delayed speech and language development',
 'Yes, mild': 'Delayed speech and language development',
 'Yes, Mild': 'Delayed speech and language development'}
excluded = { 'No': 'Delayed speech and language development'}
language_speech_delayMapper = OptionColumnMapper(column_name='Language/speech delay',concept_recognizer=hpo_cr, option_d=language_speech_delay_d, excluded_d=excluded)
column_mapper_list.append(language_speech_delayMapper)
language_speech_delayMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Delayed speech and language development (HP:0000750) (observed),61
1,Delayed speech and language development (HP:0000750) (excluded),5


In [16]:
motor_delay_d = {'Yes': 'Motor delay',
 'Yes, mild': 'Motor delay'}
excluded = { 'No': 'Motor delay'}
motor_delayMapper = OptionColumnMapper(column_name='Motor delay',concept_recognizer=hpo_cr, option_d=motor_delay_d, excluded_d=excluded)
column_mapper_list.append(motor_delayMapper)
motor_delayMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Motor delay (HP:0001270) (observed),57
1,Motor delay (HP:0001270) (excluded),7


In [17]:
id_d = {'No; learning problems': 'Specific learning disability',
 "Yes, Mild": 'Intellectual disability, mild',
 'Yes, moderate': 'Intellectual disability, moderate',
 'Yes': 'Intellectual disability',
 'learning difficulties': 'Specific learning disability',
 'Yes, mild': 'Intellectual disability, mild',
 'Yes, severe (contributed by Pathogenic HNRNPU variant)': 'Intellectual disability, severe',
 'No, learning difficulties': 'Specific learning disability',
 'Yes, severe': 'Intellectual disability, severe',
 'Yes, mild/borderline; learning difficulties': 'Intellectual disability, borderline',
 'learning difficulty': 'Specific learning disability',
 'Learning disability in special education classes in school': 'Specific learning disability',
 'Yes, Moderate': 'Intellectual disability, moderate'}
excluded = { 'No': 'Intellectual disability',}
idMapper = OptionColumnMapper(column_name='Intellectual disability',concept_recognizer=hpo_cr, option_d=id_d)
column_mapper_list.append(idMapper)
idMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Specific learning disability (HP:0001328) (observed),8
1,"Intellectual disability, mild (HP:0001256) (observed)",13
2,"Intellectual disability, moderate (HP:0002342) (observed)",3
3,Intellectual disability (HP:0001249) (observed),10
4,"Intellectual disability, severe (HP:0010864) (observed)",2
5,"Intellectual disability, borderline (HP:0006889) (observed)",1


In [18]:
asd_d = {'Yes': 'Autistic behavior',
 'Autistic-like features - hand flapping, mouthing and repetitive mannerisms. Difficulties with language and socialising but also some features not in keeping with ASD - desire to include other people in her experiences and demonstration of empathy': 'Autistic behavior',
 'Yes, moderate-severe': 'Autistic behavior'}
excluded = {'No': 'Autistic behavior'}
asdMapper = OptionColumnMapper(column_name='Autism spectrum disorder',concept_recognizer=hpo_cr, option_d=asd_d, excluded_d=excluded)
column_mapper_list.append(asdMapper)
asdMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Autistic behavior (HP:0000729) (observed),39
1,Autistic behavior (HP:0000729) (excluded),25


In [19]:
behavior_d = {'Bossy, Agressive (verbally)': 'Aggressive behavior',
 'Irritability, Anger, anxiety (associated with obstipation periods); No contact except parents and physician': ['Irritability', "Anxiety"],
 'Yes': 'Atypical behavior',
 'ADHD': 'Attention deficit hyperactivity disorder',
 'AHDS; Aggression, problems in social interaction': ['Aggressive behavior', 'Attention deficit hyperactivity disorder'],
 'Tantrums and inattention': 'Severe temper tantrums',
 'ADHD, aggression, problems in social interaction': ['Aggressive behavior', 'Attention deficit hyperactivity disorder'],
 'AHDS, aggressive, impulsive behaivior': ['Impulsivity', 'Aggressive behavior', 'Attention deficit hyperactivity disorder'],
 'Anxiety, aggression': ['Anxiety','Aggressive behavior'],
 'probable ADHD': 'Attention deficit hyperactivity disorder',
 'Agitation, agressivity': ['Agitation','Aggressive behavior'],
 'Aggressive behavior, noncompliance, physical aggression, poor play skills': ['Aggressive behavior','Delay in the acquisition of play skills'],
 'Anxiety': 'Anxiety',
 'Early and atypical depression, attention deficit, anxiety, atypical sensory': ['Depression', 'Attention deficit hyperactivity disorder'],
 'Yes, Aggression': 'Aggressive behavior',
 'ADHD, anxiety': ['Aggressive behavior','Anxiety'],
 'Poor social skills, stereotypic behaviour.': 'Abnormal repetitive mannerisms',
 'stubborn, aggressive, tantrums': 'Aggressive behavior',
 'Short attention span': 'Short attention span',
 'ADHD, aggressive behavior':['Aggressive behavior', 'Attention deficit hyperactivity disorder'],
 'hyperactivity': 'Hyperactivity',
 'attention deficit': 'Attention deficit hyperactivity disorder',
 'impulsive': 'Impulsivity',
 'Hyperactivity': 'Hyperactivity'}
excluded = {'No': 'Aggressive behavior'}
behaviorMapper = OptionColumnMapper(column_name='Behavior problems',concept_recognizer=hpo_cr, option_d=behavior_d, excluded_d=excluded)
column_mapper_list.append(behaviorMapper)
behaviorMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Aggressive behavior (HP:0000718) (observed),10
1,Aggressive behavior (HP:0000718) (excluded),24
2,Irritability (HP:0000737) (observed),1
3,Anxiety (HP:0000739) (observed),4
4,Atypical behavior (HP:0000708) (observed),6
5,Attention deficit hyperactivity disorder (HP:0007018) (observed),13
6,Severe temper tantrums (HP:0025162) (observed),1
7,Impulsivity (HP:0100710) (observed),2
8,Agitation (HP:0000713) (observed),1
9,Depression (HP:0000716) (observed),1


In [20]:
psychosisMapper = SimpleColumnMapper(column_name='Psychosis / Schizophrenia',hpo_id="HP:0000709", hpo_label="Psychosis", observed="Yes", excluded="No")
column_mapper_list.append(psychosisMapper)
psychosisMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""No"" -> HP: Psychosis (HP:0000709) (excluded)",44
1,"original value: ""Yes"" -> HP: Psychosis (HP:0000709) (observed)",4
2,"original value: ""nan"" -> HP: Psychosis (HP:0000709) (not measured)",24
3,"original value: ""No "" -> HP: Psychosis (HP:0000709) (excluded)",1


In [21]:
sleep_d = {
 'Yes': 'Sleep abnormality',
 'History of obstructive sleep apnea': 'Obstructive sleep apnea',
 'Yes, sleep apnea': 'Sleep apnea',
 'Yes, delayed sleep initiation and maintenance': 'Sleep abnormality',
 'Yes (previously)': 'Sleep abnormality',
 'Yes, in the first year': 'Sleep abnormality',
 'Yes, in the first year of life': 'Sleep abnormality',
 'Yes, uses melatonine':'Sleep abnormality',
 'obstructive and central sleep apnea': 'Central sleep apnea',
 'Yes, poor sleep, frequent waking for prolonged periods of time': 'Sleep abnormality',
 'Yes, uses melatonin': 'Sleep abnormality',}
excluded = {'No':  'Sleep abnormality',}
sleepMapper = OptionColumnMapper(column_name='Sleep disturbances',concept_recognizer=hpo_cr, option_d=sleep_d, excluded_d=excluded)
column_mapper_list.append(sleepMapper)
sleepMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Sleep abnormality (HP:0002360) (excluded),35
1,Sleep abnormality (HP:0002360) (observed),15
2,Obstructive sleep apnea (HP:0002870) (observed),1
3,Sleep apnea (HP:0010535) (observed),1
4,Central sleep apnea (HP:0010536) (observed),1


In [22]:
other_d = {
 'Social developmental delay': 'Global developmental delay',
 'GDD': 'Global developmental delay',
 'hyperphagia, problems in social interaction': 'Polyphagia',
 'Regression - loss of words at 8m': 'Developmental regression',
 'DD': 'Global developmental delay',
 'Bruxism': 'Bruxism',
 'drooling': 'Drooling',
 'Moderate GDD':  'Moderate global developmental delay',
 'Moderate GDD, short attention span': ['Moderate global developmental delay','Short attention span'],
 'Developmental regression': 'Developmental regression',
 'problems in social interaction': 'Abnormal social behavior',
 'Depression ; abnormal eating behaviour; no eye contact at age of 3 years; no interest in socializing with others; echolalia;  OCD': 'Depression',
 'stereotypic behaviors (flapping, rubbing ears, hitting his thighs)': 'Recurrent hand flapping',
 'inconsistently responds to own name at 17 months (normal audiology). Sensory seeking. Mostly happy baby.': 'Sensory seeking',
 'GDD; Bruxism': ['Global developmental delay','Bruxism'],
 'enuresis': 'Enuresis',
 'Mild GDD': 'Mild global developmental delay'}
otherMapper = OptionColumnMapper(column_name='Other_neurodev',concept_recognizer=hpo_cr, option_d=other_d)
column_mapper_list.append(otherMapper)
otherMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Global developmental delay (HP:0001263) (observed),5
1,Polyphagia (HP:0002591) (observed),1
2,Developmental regression (HP:0002376) (observed),2
3,Bruxism (HP:0003763) (observed),2
4,Drooling (HP:0002307) (observed),1
5,Moderate global developmental delay (HP:0011343) (observed),2
6,Short attention span (HP:0000736) (observed),1
7,Abnormal social behavior (HP:0012433) (observed),1
8,Depression (HP:0000716) (observed),1
9,Recurrent hand flapping (HP:0100023) (observed),1


In [23]:
seizures_d = {
 'Febrile seizure after BMR vaccination (around 14 months), thereafter stagnation of social emotional and speech development': 'Seizure',
 'Yes': 'Seizure',
 'Yes (absence seizures and GTC)': ['Generalized non-motor (absence) seizure', 'Bilateral tonic-clonic seizure'],
 'Yes, At 5y 6 m onset. Long lasting complex partial seizures with hospitalization in the intensive care unit, partial seizures during sleep, focal epilepsy with opercular seizures; Current therapy: Ethosuximide, Clobazam': 'Focal impaired awareness seizure',
 'Yes; myoclonic onset 12m': 'Myoclonic seizure',

 'Yes, drug resistant epilepsy: general and focal seizures; with currently 1 event per month on oxacarbazepine + clobazam': 'Seizure',
 'Yes. In newborn period after neurological insult from hypoperfusion. Has been off AEDs for > 1 year and remains seizure free.': 'Seizure'}
excluded = {'No; normal EEG': 'Seizure', 'No': 'Seizure'}
seizuresMapper = OptionColumnMapper(column_name='Seizures / Epilepsy',concept_recognizer=hpo_cr, option_d=seizures_d, excluded_d=excluded)
column_mapper_list.append(seizuresMapper)
seizuresMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Seizure (HP:0001250) (excluded),48
1,Seizure (HP:0001250) (observed),10
2,Generalized non-motor (absence) seizure (HP:0002121) (observed),1
3,Bilateral tonic-clonic seizure (HP:0002069) (observed),1
4,Focal impaired awareness seizure (HP:0002384) (observed),1
5,Focal-onset seizure (HP:0007359) (observed),1
6,Myoclonic seizure (HP:0032794) (observed),1


In [24]:
hypotonia_d = {
 'Yes': 'Hypotonia',
 'Yes, neonatal': 'Hypotonia',
 'Yes, core': 'Hypotonia',
 'Yes, muscle issues in core and hands': 'Hypotonia',}
excluded = {'No': 'Hypotonia',}
hypotoniaMapper = OptionColumnMapper(column_name='Hypotonia',concept_recognizer=hpo_cr, option_d=hypotonia_d, excluded_d=excluded)
column_mapper_list.append(hypotoniaMapper)
hypotoniaMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Hypotonia (HP:0001252) (excluded),28
1,Hypotonia (HP:0001252) (observed),32


In [25]:
dystonia_d = {
 'Dystonic type episodes': 'Dystonia',
 'Yes - dystonic posture of the upper limbs': 'Dystonia',}
excluded = {'No': 'Dystonia',}
dystoniaMapper = OptionColumnMapper(column_name='Dystonia, if present - type and age of onset',
                                    concept_recognizer=hpo_cr, option_d=dystonia_d, excluded_d=excluded)
column_mapper_list.append(dystoniaMapper)
dystoniaMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Dystonia (HP:0001332) (excluded),51
1,Dystonia (HP:0001332) (observed),2


In [26]:
spasticity_d = {
 'No but always tendency tip-toe walking.  botulinum toxin type A (BTX-A) injections ere performed at the calfs - Triceps surae muscle': 'Tip-toe gait',
 'Increased tonus in legs. Tip-toe walking': 'Tip-toe gait'}
excluded = {'No': 'Spasticity'}
spasticityMapper = OptionColumnMapper(column_name='Spasticity',concept_recognizer=hpo_cr, option_d=spasticity_d, excluded_d=excluded)
column_mapper_list.append(spasticityMapper)
spasticityMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Spasticity (HP:0001257) (excluded),49
1,Tip-toe gait (HP:0030051) (observed),2


In [27]:
other_neurological_d = {
 'Mild intention tremor; "wooden" motoric skills; poor fine motoric skills': 'Intention tremor',
 'Tics in stressfull situation': 'Tics',
 'Ataxia': 'Ataxia',
 'Neurogenic bladder; tethered cord': ['Neurogenic bladder',"Tethered cord"],
 'Tics': 'Tics',
 'mild dysartria. Neurological evaluation Jun 2020 (10y5m): cerebellar and extrapriamidal involvement with dystonic postures. Slight piramidal signs.': 'Dysarthria',
 'tethered spinal cord s/p 1/2019': "Tethered cord",
 'Getting tired quickly': 'Fatigue',
 'Getting tired quickly; hyporeflexia': 'Fatigue',
 'Occasional enuresis': 'Enuresis',
 'Poor coordination skills': 'Poor coordination',
 'still unsteady at 9 years': 'Unsteady gait',
 'PVL - due to prematurity (born at 32 6/7)': 'Periventricular leukomalacia',
 'Coordination issues': 'Poor coordination',
 'gross motor impairment, slight balance disturbance': 'Poor gross motor coordination',
 'initial very broad based gait': 'Broad-based gait',
 'Brachycephaly': 'Brachycephaly',
 'Dolichocephaly, unsteady gait': ['Dolichocephaly', "Unsteady gait"]}
other_neurologicalMapper = OptionColumnMapper(column_name='Other neurological/movement issues',
                                              concept_recognizer=hpo_cr, option_d=other_neurological_d)
column_mapper_list.append(other_neurologicalMapper)
other_neurologicalMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Intention tremor (HP:0002080) (observed),1
1,Hypertonia (HP:0001276) (observed),1
2,Involuntary movements (HP:0004305) (observed),1
3,Tics (HP:0100033) (observed),3
4,Ataxia (HP:0001251) (observed),1
5,Neurogenic bladder (HP:0000011) (observed),1
6,Tethered cord (HP:0002144) (observed),2
7,Dysarthria (HP:0001260) (observed),1
8,Fatigue (HP:0012378) (observed),2
9,Enuresis (HP:0000805) (observed),1


In [28]:
brain_mri_d = {
 'cerebellar mild cortical atrophy': 'Cerebellar cortical atrophy',
 'Cerebral atrophy, multiple lesions including glial lesions, atrophy of the cerebellar vermis, corpus callosum agenesis': ['Cerebral atrophy',"Agenesis of corpus callosum"],
 'platybasia, small foramen magnum': ['Platybasia',"Small foramen magnum"],
 'multiple focal areas of altered signal, mostly subcortical, especially in the bilateral frontal area. Thinning of corpus callosum. Dilation of the Virchow-Robin perivascular spaces': 'Dilation of Virchow-Robin spaces',
 'external hydrocephalus': 'Hydrocephalus',
 'Assymetric hyppocampus; lightly delayed myelinisation': 'Delayed CNS myelination',
 }
brain_mri_Mapper = OptionColumnMapper(column_name='Brain MRI findings',concept_recognizer=hpo_cr, option_d=brain_mri_d)
column_mapper_list.append(brain_mri_Mapper)
brain_mri_Mapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Cerebellar cortical atrophy (HP:0008278) (observed),1
1,Cerebral atrophy (HP:0002059) (observed),1
2,Agenesis of corpus callosum (HP:0001274) (observed),1
3,Platybasia (HP:0002691) (observed),1
4,Small foramen magnum (HP:0002677) (observed),1
5,Dilation of Virchow-Robin spaces (HP:0012520) (observed),1
6,Hydrocephalus (HP:0000238) (observed),1
7,Delayed CNS myelination (HP:0002188) (observed),1


In [29]:
hypermobility_d = {
 'Yes (Breighton score 6/8)': 'Joint hypermobility',
 'Yes': 'Joint hypermobility',
 'Yes, at knees': 'Knee joint hypermobility',
 'Yes, mild': 'Joint hypermobility',
 'Yes, Mild hypermobility in hands': 'Hyperextensible hand joints',
 'Yes (distal)': 'Joint hypermobility',
 'Yes (recurvatum knees and elbows)': 'Joint hypermobility'}
excluded = {'No': 'Joint hypermobility',}
hypermobilityMapper = OptionColumnMapper(column_name='Joint hypermobility',concept_recognizer=hpo_cr, option_d=hypermobility_d)
column_mapper_list.append(hypermobilityMapper)
hypermobilityMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Joint hypermobility (HP:0001382) (observed),20
1,Knee joint hypermobility (HP:0045086) (observed),1
2,Hyperextensible hand joints (HP:0005639) (observed),1


In [30]:
syndactyly_d = {
 'Yes, slight bilateral II, III, IV toe syndactyly;': '2-4 toe syndactyly',
 '2-3 toe syndactyly': '2-3 toe syndactyly'}
excluded = {'No': 'Syndactyly',}
syndactylyMapper = OptionColumnMapper(column_name='Syndactyly',concept_recognizer=hpo_cr, option_d=syndactyly_d, excluded_d=excluded)
column_mapper_list.append(syndactylyMapper)
syndactylyMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Syndactyly (HP:0001159) (excluded),53
1,Toe syndactyly (HP:0001770) (observed),1
2,2-3 toe syndactyly (HP:0004691) (observed),1


In [31]:
vertebral_d = {
 'Kyphosis': 'Kyphosis',
 'Scoliosis': 'Scoliosis',
 'Scoliosis, dorsolumbar': 'Scoliosis',
 'Hyperlordosis': 'Hyperlordosis',
 'thoracic kyphosis without vertebral defect': 'Thoracic kyphosis',
 'kyphosis': 'Kyphosis'}
excluded = {'No': 'Scoliosis',}
vertebralMapper = OptionColumnMapper(column_name='Vertebral abnormalities (Scoliosis, kyphosis etc).',
                                                   concept_recognizer=hpo_cr, option_d=vertebral_d, excluded_d=excluded)
column_mapper_list.append(vertebralMapper)
vertebralMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Scoliosis (HP:0002650) (excluded),46
1,Kyphosis (HP:0002808) (observed),2
2,Scoliosis (HP:0002650) (observed),3
3,Hyperlordosis (HP:0003307) (observed),1
4,Thoracic kyphosis (HP:0002942) (observed),1


In [32]:
hand_foot_finger_d = {
 'finger clubbing; clinodactyly IV and V bilateral': 'Clubbing of fingers',
 'clinodactyly IV and V bilateral;short and broad feet;': 'Clinodactyly',
 'Flat feet, broad finger tips, curls up toes in shoes.': 'Pes planus',
 'unilateral varus foot; clinodactyly 5th finger, short hands': 'Clinodactyly',
 'PIP joints prominent': 'Prominent interphalangeal joints',
 'Broad fingertips': 'Broad fingertip',
 'pes planus': 'Pes planus',
 'Brachydactyly; broad toes': 'Brachydactyly',
 'Hands: broad. Feet: broad feet, short toes, sandal gap, mild clinodactyly dig 3-4.': 'Broad foot',
 'Prominent fingertip pads': 'Prominent fingertip pads',
 'flat feet with a broad base but otherwise normal gait': 'Pes planus',
 'yes - one hand very enlarged; pes planus': 'Pes planus',
 'Flat feet': 'Pes planus',
 'flat feet': 'Pes planus',
 'simian crease, short thumbs, fetal pads, pes planovalgus': 'Single transverse palmar crease',
 'Genu valgum, short toe nails, Pes planus et valgus': ['Genu valgum', 'Pes planus'],
 'Talipes': 'Talipes'}
hand_foot_finger_Mapper = OptionColumnMapper(column_name='Hand /foot/ finger abnormalities',
                                                          concept_recognizer=hpo_cr, option_d=hand_foot_finger_d)
column_mapper_list.append(hand_foot_finger_Mapper)
hand_foot_finger_Mapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Brachydactyly (HP:0001156) (observed),3
1,Clinodactyly (HP:0030084) (observed),4
2,Broad foot (HP:0001769) (observed),3
3,Clubbing of fingers (HP:0100759) (observed),1
4,Pes planus (HP:0001763) (observed),11
5,Broad finger (HP:0001500) (observed),1
6,Prominent interphalangeal joints (HP:0006237) (observed),1
7,Broad fingertip (HP:0011300) (observed),1
8,Small hand (HP:0200055) (observed),1
9,Prominent fingertip pads (HP:0001212) (observed),1


In [33]:
other_musculoskel_d = {
 'Hip dysplasia': 'Hip dysplasia',
 'bilateral external tibial torsion': 'External tibial torsion',
 'extra rib on each side; cervical vertebral fusion suspected': 'Fused cervical vertebrae',
 'Downward sloping shoulders, proportionate tall stature, talipes': ['Down-sloping shoulders', "Tall stature", "Talipes"],
 'Proportionate tall stature': 'Proportionate tall stature',
 'A strawberry neavus was present at the back of his neck with no other skin lesions or freckling.': 'Freckling',
 'coccygeal dimple': 'Sacral dimple',
 'hip dysplasia': 'Hip dysplasia',
 'recurrent patella luxation, temporary hemiepiphysiodesis distal femur medial left': 'Patellar subluxation',
 'torticollis, lingual frenulum': 'Torticollis',
 'Soft skin': 'Soft skin'}
otherMapper = OptionColumnMapper(column_name='Other_musculoskel',concept_recognizer=hpo_cr, option_d=other_musculoskel_d)
column_mapper_list.append(otherMapper)
otherMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Hip dysplasia (HP:0001385) (observed),2
1,External tibial torsion (HP:0034373) (observed),1
2,Fused cervical vertebrae (HP:0002949) (observed),1
3,Down-sloping shoulders (HP:0200021) (observed),1
4,Tall stature (HP:0000098) (observed),1
5,Talipes (HP:0001883) (observed),1
6,Proportionate tall stature (HP:0011407) (observed),1
7,Freckling (HP:0001480) (observed),1
8,Sacral dimple (HP:0000960) (observed),1
9,Patellar subluxation (HP:0010499) (observed),1


In [34]:
dysmorphic_featuresMapper = OptionColumnMapper(column_name='Dysmorphic features',
                                               concept_recognizer=hpo_cr, option_d={})
column_mapper_list.append(dysmorphic_featuresMapper)
dysmorphic_featuresMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Depressed nasal bridge (HP:0005280) (observed),8
1,Epicanthus (HP:0000286) (observed),3
2,Broad chin (HP:0011822) (observed),1
3,Anteverted nares (HP:0000463) (observed),8
4,Thin vermilion border (HP:0000233) (observed),1
...,...,...
74,Glossoptosis (HP:0000162) (observed),1
75,Broad forehead (HP:0000337) (observed),1
76,Long uvula (HP:0010810) (observed),1
77,Malar flattening (HP:0000272) (observed),1


In [35]:
cleft_d = {
 'Cleft palate': 'Cleft palate'}
excluded = {"No": 'Cleft palate'}
cleftMapper = OptionColumnMapper(column_name='Lip/palate cleft',
                                 concept_recognizer=hpo_cr, option_d=cleft_d, excluded_d=excluded)
column_mapper_list.append(cleftMapper)
cleftMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Cleft palate (HP:0000175) (excluded),54
1,Cleft palate (HP:0000175) (observed),2


In [36]:
myopia_d = {'Hypermetropia, mild': 'Mild hypermetropia',
 'Hypermetropia': 'Hypermetropia',
 'Yes': 'Abnormality of refraction',
 'Myopia': 'Myopia',
 'Myopia + astigmatism': 'Myopia',
 'Unilateral myopia causing right esotropia': 'Myopia',
 'Hypermetropia and astigmatism': 'Hypermetropia',
 'Myopia, mild': 'Mild myopia',
 'Mild hypermetropic astigmatism': 'Astigmatism'}
excluded = {"No": "Abnormality of refraction"}
myopiaMapper = OptionColumnMapper(column_name='Hypermetropia/myopia',concept_recognizer=hpo_cr, option_d=myopia_d, excluded_d=excluded)
column_mapper_list.append(myopiaMapper)
myopiaMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Mild hypermetropia (HP:0031728) (observed),2
1,Hypermetropia (HP:0000540) (observed),9
2,Abnormality of refraction (HP:0000539) (excluded),32
3,Abnormality of refraction (HP:0000539) (observed),2
4,Myopia (HP:0000545) (observed),3
5,Mild myopia (HP:0025573) (observed),1
6,Astigmatism (HP:0000483) (observed),1


In [37]:
strabismus_d = {
 'Yes (exotropia)': 'Exotropia',
 'Yes': 'Strabismus'}
excluded = {'No':"Strabismus"}
strabismusMapper = OptionColumnMapper(column_name='Strabismus',concept_recognizer=hpo_cr, option_d=strabismus_d, excluded_d=excluded)
column_mapper_list.append(strabismusMapper)
strabismusMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Strabismus (HP:0000486) (excluded),46
1,Exotropia (HP:0000577) (observed),1
2,Strabismus (HP:0000486) (observed),6


In [38]:
other_eye_d = {
 'astigmatism (R=-2, 00;10º; L=-2, 00;0º)': 'Astigmatism',
 'persistent nystagmus': 'Nystagmus',
 "vision20/400 right eye and 20/30 causing amblyopia and right esotropia-doesn't use righ eye": 'Esotropia',
 'Astigmatism.': 'Astigmatism',
 'left ptosis': 'Ptosis',
 'horizontal nystagmus': 'Horizontal nystagmus',
 'astigmatism': 'Astigmatism'}
otherMapper = OptionColumnMapper(column_name='Other_eye',concept_recognizer=hpo_cr, option_d=other_d)
column_mapper_list.append(otherMapper)
otherMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Astigmatism (HP:0000483) (observed),3
1,Nystagmus (HP:0000639) (observed),1
2,Amblyopia (HP:0000646) (observed),1
3,Esotropia (HP:0000565) (observed),1
4,Ptosis (HP:0000508) (observed),1
5,Horizontal nystagmus (HP:0000666) (observed),1


In [39]:
hearing_d = {
 'Hearing loss': 'Hearing impairment',}
excluded = {'Normal': 'Hearing impairment'}
hearingMapper = OptionColumnMapper(column_name='Hearing',concept_recognizer=hpo_cr, option_d=hearing_d, excluded_d=excluded)
column_mapper_list.append(hearingMapper)
hearingMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Hearing impairment (HP:0000365) (excluded),51
1,Hearing impairment (HP:0000365) (observed),1


In [40]:
ear_infections_d = {'Yes, Ear tubes': 'Recurrent otitis media',
 'Yes': 'Recurrent otitis media',
 }
excluded = {'No': 'Recurrent otitis media'}
ear_infectionsMapper = OptionColumnMapper(column_name='Recurrent ear infections',
                                                    concept_recognizer=hpo_cr, option_d=ear_infections_d, excluded_d=excluded)
column_mapper_list.append(ear_infectionsMapper)
ear_infectionsMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Recurrent otitis media (HP:0000403) (observed),6
1,Recurrent otitis media (HP:0000403) (excluded),38


In [41]:
other_ear_d = {
 'vestibular aqueduct dilation': 'Enlarged vestibular aqueduct',
 'Rhinitis sicca': 'Rhinitis',
 'grommets/Ts and As removed, bilateral preauricular pits': 'Preauricular pit',
 'Tinnitus': 'Tinnitus'}
otherEarMapper = OptionColumnMapper(column_name='Other_ear',concept_recognizer=hpo_cr, option_d=other_ear_d)
column_mapper_list.append(otherEarMapper)
otherEarMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Enlarged vestibular aqueduct (HP:0011387) (observed),1
1,Rhinitis (HP:0012384) (observed),1
2,Preauricular pit (HP:0004467) (observed),1
3,Tinnitus (HP:0000360) (observed),1


In [42]:
chd_d = {
 'PDA': 'Patent ductus arteriosus',
 'ASD II': 'Secundum atrial septal defect',
 'pulmonary stenosis that resolved by age 3': 'Pulmonic stenosis',
 }
chdMapper = OptionColumnMapper(column_name='Congenital heart disease',concept_recognizer=hpo_cr, option_d=chd_d)
column_mapper_list.append(chdMapper)
chdMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Patent ductus arteriosus (HP:0001643) (observed),3
1,Secundum atrial septal defect (HP:0001684) (observed),1
2,Pulmonic stenosis (HP:0001642) (observed),1
3,Bicuspid aortic valve (HP:0001647) (observed),1
4,Ventricular septal defect (HP:0001629) (observed),1
5,Subvalvular aortic stenosis (HP:0001682) (observed),1
6,Interrupted aortic arch (HP:0011611) (observed),1
7,Atrial septal defect (HP:0001631) (observed),1


In [43]:
feeding_d = {'slow weight gain in first month due to the breastfeeding problems. Resolved after switching to bottle feeding.': 'Feeding difficulties',
 'Yes': 'Feeding difficulties',
 'Yes, G-tube': 'Feeding difficulties',
 'Yes, admitted to NICU for 8d for feeding difficulties': 'Feeding difficulties',
 'Yes-lethargy interfered with taking a bottle well': 'Feeding difficulties',
 'Yes NG fed 3 days': 'Feeding difficulties',
 'Yes NG fed 5 days': 'Feeding difficulties',
 'Yes NG fed for 5 days': 'Feeding difficulties',
 'Difficulties with breast feeding': 'Feeding difficulties',
 'NG fed for 6 weeks and 84 day SCBU / NICU stay but premature': 'Feeding difficulties'}
excluded = {"No":"Feeding difficulties",}
feedingMapper = OptionColumnMapper(column_name='Neonatal feeding difficulties',
                                                         concept_recognizer=hpo_cr, option_d=feeding_d, excluded_d=excluded)
column_mapper_list.append(feedingMapper)
feedingMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Feeding difficulties (HP:0011968) (observed),22
1,Feeding difficulties (HP:0011968) (excluded),29


In [44]:
other_gi_d = {
 'still feeding difficulties': 'Feeding difficulties',
 'History of vomiting': 'Vomiting',
 'Failure to thrive, history of frequent vomiting': 'Failure to thrive',
 'Hyperphagia': 'Polyphagia',
 'eosinophilic esophagitis': 'Eosinophilic infiltration of the esophagus',
 'diarrhea': 'Diarrhea',
 'Feeding difficulties with solid foods': 'Feeding difficulties'}
otherGiMapper = OptionColumnMapper(column_name='Other_gi',concept_recognizer=hpo_cr, option_d=other_gi_d)
column_mapper_list.append(otherGiMapper)
otherGiMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Allergy (HP:0012393) (observed),1
1,Feeding difficulties (HP:0011968) (observed),2
2,Vomiting (HP:0002013) (observed),1
3,Failure to thrive (HP:0001508) (observed),1
4,Polyphagia (HP:0002591) (observed),1
5,Eosinophilic infiltration of the esophagus (HP:0410151) (observed),1
6,Diarrhea (HP:0002014) (observed),1
7,Increased body weight (HP:0004324) (observed),1


In [45]:
gu_d = {
 'phimosis': 'Phimosis',
 'Agenesis of the right kidney': 'Unilateral renal agenesis',
 'left pyelic duplicity': 'Duplication of renal pelvis',
 'meatal stenosis': 'Male urethral meatus stenosis',}
guMapper = OptionColumnMapper(column_name='Genitourinary abnormalities', concept_recognizer=hpo_cr, option_d=gu_d)
column_mapper_list.append(guMapper)
guMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Phimosis (HP:0001741) (observed),1
1,Unilateral renal agenesis (HP:0000122) (observed),1
2,Duplication of renal pelvis (HP:0005580) (observed),1
3,Male urethral meatus stenosis (HP:0032077) (observed),1


In [46]:
cryptorchidism_d = {
 'Yes': 'Cryptorchidism',
 'Yes, bilateral': 'Bilateral cryptorchidism'}
excluded = {'No': 'Cryptorchidism',}
cryptorchidismMapper = OptionColumnMapper(column_name='Cryptorchidism',concept_recognizer=hpo_cr, option_d=cryptorchidism_d)
column_mapper_list.append(cryptorchidismMapper)
cryptorchidismMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Cryptorchidism (HP:0000028) (observed),2
1,Bilateral cryptorchidism (HP:0008689) (observed),1


In [47]:
other_d = {'Nasal speech': 'Hypernasal speech',
 '2 Cafe-au-lait spots': 'Cafe-au-lait spot',
 '1 Cafe-au-lait spot': 'Cafe-au-lait spot',
 'addidional maxiliar tooth': 'Supernumerary maxillary incisor',
 'congenital hip dislocation': 'Congenital hip dislocation',
 'Recurrent skin infections when younger': 'Recurrent skin infections',
 'Hx of hypercalcemia, carnitine deficiency, and vomiting, urinary and bowel incontinence': 'Hypercalcemia',
 'Livedo reticularis': 'Livedo reticularis',
 'Cafe-au-lait spots; - note: he was considered to have overgrowth at some point in childhood': 'Cafe-au-lait spot',
 'Common infections; fast develops hypothermia (35.5oC)': 'Recurrent infections',
 'Hypoglycemia and presumed partial adrenal insufficiency': 'Adrenal insufficiency',
 'Premature adrenarche, advanced bone age, family history of hereditary hemochromatosis': 'Premature adrenarche',
 'Café-au-lait spot': 'Cafe-au-lait spot',
 'inguinal lentigines': 'Inguinal freckling',
 'cafe au lait spots': 'Cafe-au-lait spot',
 'Telangiectatisia on face and chest': 'Facial telangiectasia',
 'inguinal hernia (left)': 'Inguinal hernia',
 'Hypertension': 'Hypertension',
 'The hypotonia/ hyperlaxity was that severe a muscle panel was performed. CK was normal.': 'Hypotonia',
 'Ketosis': 'Ketosis'}
otherMapper = OptionColumnMapper(column_name='Other',concept_recognizer=hpo_cr, option_d=other_d)
column_mapper_list.append(otherMapper)
otherMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Hypernasal speech (HP:0001611) (observed),1
1,Cafe-au-lait spot (HP:0000957) (observed),5
2,Supernumerary maxillary incisor (HP:0006332) (observed),1
3,Congenital hip dislocation (HP:0001374) (observed),1
4,Recurrent skin infections (HP:0001581) (observed),1
5,Hypercalcemia (HP:0003072) (observed),1
6,Livedo reticularis (HP:0033505) (observed),1
7,Pneumonia (HP:0002090) (observed),2
8,Asthma (HP:0002099) (observed),1
9,Hernia (HP:0100790) (observed),1


# Demographics

In [48]:
sexMapper = SexColumnMapper(male_symbol="M", female_symbol="F", column_name="Sex")
# sexMapper.preview_column(dft['Sex'])
age_d = {}
for item in dft["Age, years"].unique():
    item = str(item)
    if "y" in item or "m" in item:
        age_d[item] = f"P{item.upper()}"
    elif item == "nan":
        age_d[item] = 'n/a'
    elif item == "3.9":
        age_d[item] = "P3Y10M"
    elif item == "6.4":
        age_d[item] = "P6Y5M"
    elif item == "4.5":
        age_d[item] = "P4Y6M"
    else:
        age_d[item] = f"P{item}Y"
ageMapper = AgeColumnMapper.custom_dictionary(column_name="Age, years", string_to_iso_d=age_d)
#ageMapper.preview_column(dft)

# Variants

In [49]:
KDM6B_transcript="NM_001348716.2"
KDM6B_id = "HGNC:29012"
vman = VariantManager(df=dft, individual_column_name="individual_id", gene_id=KDM6B_id, gene_symbol="KDM6B",
                     allele_1_column_name="cDNA change (ENST00000254846.9 or NM_001080424.2)",
                     transcript=KDM6B_transcript)

In [50]:
variant_d = vman.get_variant_d()

In [51]:
varMapper = VariantColumnMapper(variant_d=variant_d,
                               variant_column_name="cDNA change (ENST00000254846.9 or NM_001080424.2)",
                               default_genotype="heterozygous")

In [52]:
encoder = CohortEncoder(df=dft,
                       hpo_cr=hpo_cr,
                       column_mapper_list=column_mapper_list,
                       individual_column_name='individual_id',
                       metadata=metadata,
                       age_at_last_encounter_mapper=ageMapper,
                       sexmapper=sexMapper,
                       variant_mapper=varMapper)
disease = Disease(disease_id="OMIM:618505", disease_label="Neurodevelopmental disorder with coarse facies and mild distal skeletal abnormalities")
encoder.set_disease(disease)

In [53]:
individuals = encoder.get_individuals()
cvalidator = CohortValidator(cohort=individuals, ontology=hpo_ontology, min_hpo=1, allelic_requirement=AllelicRequirement.MONO_ALLELIC)
qc = QcVisualizer(cohort_validator=cvalidator)
display(HTML(qc.to_summary_html()))



Level,Error category,Count
ERROR,CONFLICT,1
WARNING,REDUNDANT,37
INFORMATION,NOT_MEASURED,333


In [54]:
individuals = cvalidator.get_error_free_individual_list()
cvalidator = CohortValidator(cohort=individuals, ontology=hpo_ontology, min_hpo=1, allelic_requirement=AllelicRequirement.MONO_ALLELIC)
qc = QcVisualizer(cohort_validator=cvalidator)
display(HTML(qc.to_summary_html()))

In [55]:
table = PhenopacketTable(individual_list=individuals, metadata=metadata)
display(HTML(table.to_html()))

Individual,Disease,Genotype,Phenotypic features
Individual 1 (FEMALE; P16Y),Neurodevelopmental disorder with coarse facies and mild distal skeletal abnormalities (OMIM:618505),NM_001348716.2:c.1018del (heterozygous),Square face (HP:0000321); Hypernasal speech (HP:0001611); Supernumerary nipple (HP:0002558); Feeding difficulties (HP:0011968); Mild hypermetropia (HP:0031728); Recurrent otitis media (HP:0000403); Aggressive behavior (HP:0000718); Thin vermilion border (HP:0000233); Broad chin (HP:0011822); Anteverted nares (HP:0000463); Epicanthus (HP:0000286); Motor delay (HP:0001270); Obesity (HP:0001513); Clinodactyly (HP:0030084); Delayed speech and language development (HP:0000750); Large for gestational age (HP:0001520); Brachydactyly (HP:0001156); Coarse facial features (HP:0000280); Specific learning disability (HP:0001328); Broad foot (HP:0001769); Autistic behavior (HP:0000729); Depressed nasal bridge (HP:0005280); excluded: Cleft palate (HP:0000175); excluded: Seizure (HP:0001250); excluded: Spasticity (HP:0001257); excluded: Scoliosis (HP:0002650); excluded: Psychosis (HP:0000709); excluded: Hearing impairment (HP:0000365); excluded: Hypotonia (HP:0001252); excluded: Syndactyly (HP:0001159); excluded: Pectus excavatum (HP:0000767); excluded: Strabismus (HP:0000486); excluded: Sleep abnormality (HP:0002360); excluded: Dystonia (HP:0001332); excluded: Constipation (HP:0002019)
Individual 2 (FEMALE; P10Y),Neurodevelopmental disorder with coarse facies and mild distal skeletal abnormalities (OMIM:618505),NM_001348716.2:c.1085_1088del (heterozygous),"Cafe-au-lait spot (HP:0000957); Prominent forehead (HP:0011220); Intention tremor (HP:0002080); Delayed speech and language development (HP:0000750); Hip dysplasia (HP:0001385); Mandibular prognathia (HP:0000303); Intellectual disability, mild (HP:0001256); Global developmental delay (HP:0001263); Feeding difficulties (HP:0011968); Clubbing of fingers (HP:0100759); Motor delay (HP:0001270); Hypermetropia (HP:0000540); Joint hypermobility (HP:0001382); excluded: Cleft palate (HP:0000175); excluded: Seizure (HP:0001250); excluded: Spasticity (HP:0001257); excluded: Scoliosis (HP:0002650); excluded: Aggressive behavior (HP:0000718); excluded: Psychosis (HP:0000709); excluded: Hearing impairment (HP:0000365); excluded: Hypotonia (HP:0001252); excluded: Syndactyly (HP:0001159); excluded: Autistic behavior (HP:0000729); excluded: Pectus excavatum (HP:0000767); excluded: Strabismus (HP:0000486); excluded: Sleep abnormality (HP:0002360); excluded: Dystonia (HP:0001332); excluded: Constipation (HP:0002019)"
Individual 3 (MALE; P9Y),Neurodevelopmental disorder with coarse facies and mild distal skeletal abnormalities (OMIM:618505),NM_001348716.2:c.654_655del (heterozygous),"Obstipation (HP:0034782); Square face (HP:0000321); Feeding difficulties (HP:0011968); Intellectual disability, moderate (HP:0002342); Recurrent otitis media (HP:0000403); Irritability (HP:0000737); Joint hypermobility (HP:0001382); Macrocephaly (HP:0000256); Cafe-au-lait spot (HP:0000957); Epicanthus (HP:0000286); Motor delay (HP:0001270); Obesity (HP:0001513); Clinodactyly (HP:0030084); Delayed speech and language development (HP:0000750); Large for gestational age (HP:0001520); Anxiety (HP:0000739); Broad foot (HP:0001769); Recurrent ear infections (HP:0410018); Protruding ear (HP:0000411); Autistic behavior (HP:0000729); Sleep abnormality (HP:0002360); Depressed nasal bridge (HP:0005280); excluded: Cleft palate (HP:0000175); excluded: Seizure (HP:0001250); excluded: Spasticity (HP:0001257); excluded: Cryptorchidism (HP:0000028); excluded: Scoliosis (HP:0002650); excluded: Psychosis (HP:0000709); excluded: Hearing impairment (HP:0000365); excluded: Hypotonia (HP:0001252); excluded: Syndactyly (HP:0001159); excluded: Pectus excavatum (HP:0000767); excluded: Strabismus (HP:0000486); excluded: Dystonia (HP:0001332); excluded: Abnormality of refraction (HP:0000539)"
Individual 5 (MALE; P25Y),Neurodevelopmental disorder with coarse facies and mild distal skeletal abnormalities (OMIM:618505),NM_001348716.2:c.1439dup (heterozygous),Intellectual disability (HP:0001249); Pes planus (HP:0001763); Delayed speech and language development (HP:0000750); Broad finger (HP:0001500); Allergy (HP:0012393); Recurrent ear infections (HP:0410018); Recurrent otitis media (HP:0000403); Motor delay (HP:0001270); Deeply set eye (HP:0000490); Hypertonia (HP:0001276); Autistic behavior (HP:0000729); Thick lower lip vermilion (HP:0000179); Involuntary movements (HP:0004305); Seizure (HP:0001250); Synophrys (HP:0000664); excluded: Joint hypermobility (HP:0001382); excluded: Cleft palate (HP:0000175); excluded: Spasticity (HP:0001257); excluded: Cryptorchidism (HP:0000028); excluded: Scoliosis (HP:0002650); excluded: Psychosis (HP:0000709); excluded: Hearing impairment (HP:0000365); excluded: Feeding difficulties (HP:0011968); excluded: Hypotonia (HP:0001252); excluded: Syndactyly (HP:0001159); excluded: Pectus excavatum (HP:0000767); excluded: Strabismus (HP:0000486); excluded: Dystonia (HP:0001332); excluded: Abnormality of refraction (HP:0000539); excluded: Constipation (HP:0002019)
Individual 6 (MALE; P13Y2M),Neurodevelopmental disorder with coarse facies and mild distal skeletal abnormalities (OMIM:618505),NM_001348716.2:c.2598del (heterozygous),"Delayed speech and language development (HP:0000750); Maternal diabetes (HP:0009800); Relative macrocephaly (HP:0004482); Intellectual disability, mild (HP:0001256); Motor delay (HP:0001270); Attention deficit hyperactivity disorder (HP:0007018); Phimosis (HP:0001741); Synophrys (HP:0000664); Hypermetropia (HP:0000540); excluded: Joint hypermobility (HP:0001382); excluded: Seizure (HP:0001250); excluded: Spasticity (HP:0001257); excluded: Recurrent ear infections (HP:0410018); excluded: Psychosis (HP:0000709); excluded: Pectus excavatum (HP:0000767); excluded: Sleep abnormality (HP:0002360); excluded: Hearing impairment (HP:0000365); excluded: Dystonia (HP:0001332); excluded: Cleft palate (HP:0000175); excluded: Recurrent otitis media (HP:0000403); excluded: Cryptorchidism (HP:0000028); excluded: Syndactyly (HP:0001159); excluded: Autistic behavior (HP:0000729); excluded: Strabismus (HP:0000486); excluded: Scoliosis (HP:0002650); excluded: Feeding difficulties (HP:0011968); excluded: Hypotonia (HP:0001252); excluded: Constipation (HP:0002019)"
Individual 7 (FEMALE; P9Y6M),Neurodevelopmental disorder with coarse facies and mild distal skeletal abnormalities (OMIM:618505),NM_001348716.2:c.4500C>A (heterozygous),Delayed speech and language development (HP:0000750); Specific learning disability (HP:0001328); Thick vermilion border (HP:0012471); Short nose (HP:0003196); Supernumerary maxillary incisor (HP:0006332); Short philtrum (HP:0000322); Feeding difficulties (HP:0011968); Premature rupture of membranes (HP:0001788); Sleep abnormality (HP:0002360); Seizure (HP:0001250); Preauricular pit (HP:0004467); Tics (HP:0100033); Aggressive behavior (HP:0000718); excluded: Joint hypermobility (HP:0001382); excluded: Motor delay (HP:0001270); excluded: Cleft palate (HP:0000175); excluded: Recurrent otitis media (HP:0000403); excluded: Spasticity (HP:0001257); excluded: Scoliosis (HP:0002650); excluded: Recurrent ear infections (HP:0410018); excluded: Psychosis (HP:0000709); excluded: Hearing impairment (HP:0000365); excluded: Hypotonia (HP:0001252); excluded: Syndactyly (HP:0001159); excluded: Autistic behavior (HP:0000729); excluded: Pectus excavatum (HP:0000767); excluded: Strabismus (HP:0000486); excluded: Dystonia (HP:0001332); excluded: Abnormality of refraction (HP:0000539); excluded: Constipation (HP:0002019)
Individual 8 (MALE; P10Y),Neurodevelopmental disorder with coarse facies and mild distal skeletal abnormalities (OMIM:618505),NM_001348716.2:c.403C>T (heterozygous),"Brachycephaly (HP:0000248); Prominent nasal bridge (HP:0000426); Intellectual disability, mild (HP:0001256); Joint hypermobility (HP:0001382); Motor delay (HP:0001270); Hypotelorism (HP:0000601); Hypotonia (HP:0001252); Clinodactyly (HP:0030084); Delayed speech and language development (HP:0000750); Large for gestational age (HP:0001520); Kyphosis (HP:0002808); Short philtrum (HP:0000322); Cerebellar cortical atrophy (HP:0008278); Attention deficit hyperactivity disorder (HP:0007018); Congenital hip dislocation (HP:0001374); Tall stature (HP:0000098); Breech presentation (HP:0001623); Autistic behavior (HP:0000729); Sleep abnormality (HP:0002360); excluded: Cleft palate (HP:0000175); excluded: Seizure (HP:0001250); excluded: Recurrent otitis media (HP:0000403); excluded: Spasticity (HP:0001257); excluded: Cryptorchidism (HP:0000028); excluded: Recurrent ear infections (HP:0410018); excluded: Psychosis (HP:0000709); excluded: Hearing impairment (HP:0000365); excluded: Feeding difficulties (HP:0011968); excluded: Syndactyly (HP:0001159); excluded: Pectus excavatum (HP:0000767); excluded: Strabismus (HP:0000486); excluded: Dystonia (HP:0001332); excluded: Abnormality of refraction (HP:0000539); excluded: Constipation (HP:0002019)"
Individual 9 (MALE; P6Y6M),Neurodevelopmental disorder with coarse facies and mild distal skeletal abnormalities (OMIM:618505),NM_001348716.2:c.4737+1G>A (heterozygous),Delayed speech and language development (HP:0000750); Ventouse delivery (HP:0011412); Global developmental delay (HP:0001263); Abnormality of refraction (HP:0000539); Motor delay (HP:0001270); Hypotonia (HP:0001252); Preauricular skin tag (HP:0000384); Joint hypermobility (HP:0001382); excluded: Seizure (HP:0001250); excluded: Recurrent otitis media (HP:0000403); excluded: Spasticity (HP:0001257); excluded: Cryptorchidism (HP:0000028); excluded: Scoliosis (HP:0002650); excluded: Recurrent ear infections (HP:0410018); excluded: Aggressive behavior (HP:0000718); excluded: Psychosis (HP:0000709); excluded: Hearing impairment (HP:0000365); excluded: Feeding difficulties (HP:0011968); excluded: Syndactyly (HP:0001159); excluded: Pectus excavatum (HP:0000767); excluded: Strabismus (HP:0000486); excluded: Sleep abnormality (HP:0002360); excluded: Dystonia (HP:0001332); excluded: Constipation (HP:0002019)
Individual 11 (FEMALE; P19Y),Neurodevelopmental disorder with coarse facies and mild distal skeletal abnormalities (OMIM:618505),NM_001348716.2:c.3288_3291del (heterozygous),Intellectual disability (HP:0001249); Delayed speech and language development (HP:0000750); Recurrent skin infections (HP:0001581); Enlarged vestibular aqueduct (HP:0011387); Motor delay (HP:0001270); Psychosis (HP:0000709); Autistic behavior (HP:0000729); Ataxia (HP:0001251); Hearing impairment (HP:0000365); Hypotonia (HP:0001252); excluded: Joint hypermobility (HP:0001382); excluded: Cleft palate (HP:0000175); excluded: Seizure (HP:0001250); excluded: Recurrent otitis media (HP:0000403); excluded: Spasticity (HP:0001257); excluded: Scoliosis (HP:0002650); excluded: Recurrent ear infections (HP:0410018); excluded: Feeding difficulties (HP:0011968); excluded: Syndactyly (HP:0001159); excluded: Pectus excavatum (HP:0000767); excluded: Strabismus (HP:0000486); excluded: Sleep abnormality (HP:0002360); excluded: Abnormality of refraction (HP:0000539); excluded: Constipation (HP:0002019)
Individual 12 (MALE; P17Y),Neurodevelopmental disorder with coarse facies and mild distal skeletal abnormalities (OMIM:618505),NM_001348716.2:c.3288_3291del (heterozygous),Macrocephaly (HP:0000256); Intellectual disability (HP:0001249); Delayed speech and language development (HP:0000750); Dystonia (HP:0001332); Motor delay (HP:0001270); Psychosis (HP:0000709); Autistic behavior (HP:0000729); Hypercalcemia (HP:0003072); Hypotonia (HP:0001252); Vomiting (HP:0002013); excluded: Joint hypermobility (HP:0001382); excluded: Cleft palate (HP:0000175); excluded: Seizure (HP:0001250); excluded: Recurrent otitis media (HP:0000403); excluded: Spasticity (HP:0001257); excluded: Cryptorchidism (HP:0000028); excluded: Scoliosis (HP:0002650); excluded: Recurrent ear infections (HP:0410018); excluded: Hearing impairment (HP:0000365); excluded: Feeding difficulties (HP:0011968); excluded: Syndactyly (HP:0001159); excluded: Pectus excavatum (HP:0000767); excluded: Strabismus (HP:0000486); excluded: Sleep abnormality (HP:0002360); excluded: Abnormality of refraction (HP:0000539); excluded: Constipation (HP:0002019)


In [57]:
Individual.output_individuals_as_phenopackets(individual_list=individuals,
                                             metadata=metadata)

We output 73 GA4GH phenopackets to the directory phenopackets


In [57]:
# pxf validate --hpo hp.json *.json
# no errors