<H1>MAPK8IP3: Platzer et al (2019)</H1>
<p>This notebook uses the <a href="https://github.com/monarch-initiative/pyphetools" target="__blank">pyphetools</a> library
to create GA4GH phenopackets from the data in  <a href="https://pubmed.ncbi.nlm.nih.gov/30612693/" target="__blank">Platzer K., et al. (2019) De Novo Variants in MAPK8IP3 Cause Intellectual Disability with Variable Brain Anomalies</a>. See the <a href="https://monarch-initiative.github.io/pyphetools/index.html" target="__blank">Pyphetools documentation</a> for more information about the code.</p>
<p>The original article describes de novo variants in MAPK8IP3 in 13 unrelated individuals presenting with an overlapping phenotype of mild to severe intellectual disability. </p>
<p>This notebook parses the information in Supplemental Table S1 (an Excel file).</p>

In [1]:
import pandas as pd
pd.set_option('display.max_colwidth', None) # show entire column contents, important!
from IPython.display import display, HTML
from pyphetools.creation import *
from pyphetools.visualization import *
from pyphetools.validation import *
import pyphetools
print(f"Using pyphetools version {pyphetools.__version__}")

Using pyphetools version 0.9.4


<h2>Importing HPO data</h2>

In [2]:
PMID = "PMID:30612693"
title = "De Novo Variants in MAPK8IP3 Cause Intellectual Disability with Variable Brain Anomalies"
cite = Citation(pmid=PMID, title=title)
parser = HpoParser()
hpo_cr = parser.get_hpo_concept_recognizer()
hpo_version = parser.get_version()
hpo_ontology = parser.get_ontology()
metadata = MetaData(created_by="ORCID:0000-0002-0736-9199", citation=cite)
metadata.default_versions_with_hpo(version=hpo_version)
print(f"HPO version {hpo_version}")

HPO version 2023-10-09


<H2>Importing supplemental file S1.</H2>

In [3]:
df = pd.read_excel('input/platzer_2019_supplement.xlsx')
df.head()

Unnamed: 0,Indvidual\nin\nmanuscript,g.(hg19) Chr16:,Transcript\nNM_015133.4\nc.,p.,origin,genetic testing,Sex,age at last assesment,prenatal period,Exam at birth,...,neurological examination,result of external MRI,seizures,Sz onset and Sz types,AEDs used,Sz outcome,EEG,Additional symptoms,family history,further results of genetic testing
0,1,1756405,c.65delG,p.Gly22Alafs*3,de novo,TrioWES,M,14 y 8 m,,41 weeks:\nlength: 53.3 cm\nweight: 3.941 kg\nOFC: NA,...,ataxia,"mild cerebellar atrophy, hypointensity of the globi pallidi and substantia nigra, possible mild degree of abnormal iron or mineral deposition",no,,,,,"speech is ataxic but speaks in sentences/short phrases; attention issues, impulse control and emotional lability, OCD symptoms; recently developed scoliosis",unremarkable,
1,2,1756419,c.79G>T,p.Glu27*,de novo,SingleWES,M,4 y,,length: 49 cm\nweigth: 3215 g\nOFC: 35 cm,...,ataxia,normal,no,,,,,pre-natal pelvi-ureteric junction stenosis (spontaneous resolution at 6 m),,
2,3,1756451,c.111C>G,p.Tyr37*,de novo,TrioWES,M,4 y,,length: 20.5 in\nweight: 8 lb 2 oz\nOFC: NA,...,,Stable areas of T2 hyperintensity involving the central tegmental tracts,no,,,,,Nystagmus,unremarkable,770 kb duplicaion of 20p12.3 on chromosome microarray
3,4,1798706,c.1198G>A,p.Gly400Arg,de novo,TrioWES,M,7 y 6 m,"no prenatal care, no known problems","32 weeks:\nlength: NA,\nweight: 4 lbs,\nOFC: NA\n\nhad a 30 day hospital course",...,,no MRI done,no,,,,,"Left hearing loss; Dysmorphic features: hypertelorism inner canthal distance 4.3cm; low set prominent ears, slight overhangin columella, hypodontia; 5th finger clinodactyly and 5th finger brachydactylky; synophrys; Encopresis",Mother with learning disorder; finished 11th grade; Father with ADHD and learning disorder; finished 9th grade; Full sister with learning disorder; Full sister no known problems; Full brother with learning disorder,
4,5,1810410,c.1331T>C,p.Leu444Pro,de novo,TrioWES,M,10 y,,"40 weeks, length: 52 cm\nweight: 3810 g\nOFC: 36 cm",...,,perisylvian polymicrogyria,yes,10 y:\none event of a generalized seizure,,,"pathological EEG with normal age-related background activity (alpha-type), increased appearance of slowing over temporal and occipital regions","no dysmorphism, small teeth, severe s-configured scoliosis of thoracic and lumbar spine",,


<h2>Collecting column mappers</h2>

In [4]:
column_mapper_d = {}

In [5]:
neuro_exam_custom_map = {'low extremity weakness': 'Lower limb muscle weakness',  
                         'unstable gait': 'Unsteady gait',
                         'dysfunction of the corticospinal pathways':'Upper motor neuron dysfunction',
                         'spastic': 'Spasticity',
                         'orobuccal dyspraxia': 'Oromotor apraxia',
                         'difficulty in coordination':'Poor coordination'
                        }
neuroMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=neuro_exam_custom_map, )
neuroMapper.preview_column(df['neurological examination'])
column_mapper_d['neurological examination'] = neuroMapper

In [6]:
severity_d = {'moderate\n(IQ 48)':'Intellectual disability, moderate',
             'moderate':'Intellectual disability, moderate',
             'moderate\n(IQ 49)': 'Intellectual disability, moderate',
             'severe': 'Intellectual disability, severe',
             'mild': 'Intellectual disability, mild'}
severityOfIdMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=severity_d)
#severityOfIdMapper.preview_column(df['severity of ID'])
column_mapper_d['severity of ID'] = severityOfIdMapper

In [7]:
mri_custom_map = {'hypomyelination': 'CNS hypomyelination',  
                  'thinning of CC': 'Thin corpus callosum',
                  'white matter volume loss':'Reduced cerebral white matter volume',
                  'widened lateral ventricles': 'Lateral ventricle dilatation',
                  'dysgenesis of corpus callosum': 'Dysplastic corpus callosum',
                  'hypoplasia of mesencephalon and brainstem': 'Hypoplasia of the brainstem'
                  }
mriMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=mri_custom_map, )
#mriMapper.preview_column(df['result of external MRI'])
column_mapper_d['result of external MRI'] = mriMapper

In [8]:
additional_custom_map = {'OCD': 'Obsessive-compulsive behavior',  
                         '5th finger clinodactyly': 'Clinodactyly of the 5th finger',
                         'small teeth':'Reduced cerebral white matter volume',
                         'widened lateral ventricles': 'Lateral ventricle dilatation',
                         'dysgenesis of corpus callosum': 'Dysplastic corpus callosum',
                         'dramatic increased weight': 'Obesity'
                        }
excluded = {'pseudostrabismus': "Strabismus"}
additionalFeaturesMapper = OptionColumnMapper(concept_recognizer=hpo_cr, 
                                              option_d=mri_custom_map, 
                                              excluded_d=excluded)
additionalFeaturesMapper.preview_column(df['Additional symptoms'])
column_mapper_d['Additional symptoms'] = additionalFeaturesMapper

<h2>Simple mappers</h2>

In [9]:
items = {
    'regression': ["Developmental regression","HP:0002376"],
    'autism': ['Autism', 'HP:0000717'],
    'hypotonia': ['Hypotonia', 'HP:0001252'],
    'movement disorder': ['Abnormality of movement', 'HP:0100022'],
    'CVI': ['Cerebral visual impairment', 'HP:0100704'], # CVI stands for Cortical visual impairment HP:0100704
    'seizures': ['Seizure','HP:0001250'],
    'DD': ['Global developmental delay', 'HP:0001263']
}
item_column_mapper_d = hpo_cr.initialize_simple_column_maps(column_name_to_hpo_label_map=items, observed='yes',
    excluded='no')
print(f"We created {len(item_column_mapper_d)} simple column mappers")
# Transfer to column_mapper_d
for k, v in item_column_mapper_d.items():
    column_mapper_d[k] = v

We created 7 simple column mappers


<h2>Option mapper</h2>

In [10]:
severity_d = {'moderate\n(IQ 48)':'Intellectual disability, moderate',
             'moderate':'Intellectual disability, moderate',
             'moderate\n(IQ 49)': 'Intellectual disability, moderate',
             'severe': 'Intellectual disability, severe',
             'mild': 'Intellectual disability, mild'}
severityOfIdMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=severity_d)
severityOfIdMapper.preview_column(df['severity of ID'])
column_mapper_d['severity of ID'] = severityOfIdMapper

In [11]:
mri_custom_map = {'hypomyelination': 'CNS hypomyelination',  
                  'thinning of CC': 'Thin corpus callosum',
                  'white matter volume loss':'Reduced cerebral white matter volume',
                  'widened lateral ventricles': 'Lateral ventricle dilatation',
                  'dysgenesis of corpus callosum': 'Dysplastic corpus callosum',
                  'hypoplasia of mesencephalon and brainstem': 'Hypoplasia of the brainstem'
                  }
mriMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=mri_custom_map, )
mriMapper.preview_column(df['result of external MRI'])
column_mapper_d['result of external MRI'] = mriMapper

In [12]:
additional_custom_map = {'OCD': 'Obsessive-compulsive behavior',  
                         '5th finger clinodactyly': 'Clinodactyly of the 5th finger',
                         'small teeth':'Reduced cerebral white matter volume',
                         'widened lateral ventricles': 'Lateral ventricle dilatation',
                         'dysgenesis of corpus callosum': 'Dysplastic corpus callosum',
                         'dramatic increased weight': 'Obesity'
                        }
excluded = {'pseudostrabismus': "Strabismus"}
additionalFeaturesMapper = OptionColumnMapper(concept_recognizer=hpo_cr, 
                                              option_d=mri_custom_map, 
                                              excluded_d=excluded)
additionalFeaturesMapper.preview_column(df['Additional symptoms'])
column_mapper_d['Additional symptoms'] = additionalFeaturesMapper

<H1>Mapping variants</H1>
<p>MAPK8IP3 variants reported by Platzer et al, Iwasama et al., and Yechieli et al. We have transformed the variants, which were originally expressed using the transcript  NM_015133.4 to be expressed using the MANE select transcript NM_001318852.2</p>
<p>pyphetools maps variants using the VariantValidator API.</p>

In [13]:
d_NM_015133_to_NM_001318852 = {
"c.45C>G": "c.45C>G",
"c.65delG":"c.65del",
"c.79G>T":"c.79G>T",
"c.111C>G": "c.111C>G",
"c.1198G>A": "c.1201G>A",
"c.1331T>C": "c.1334T>C",
"c.1574G>A": "c.1577G>A",
"c.1732C>T": "c.1735C>T",
"c.2982C>G": "c.2985C>G",
"c.3436C>T": "c.3439C>T"
}

df['NM_001318852'] = df['Transcript\nNM_015133.4\nc.'].apply(lambda x: d_NM_015133_to_NM_001318852.get(x.replace(" ","")))

In [14]:
df['NM_001318852']

0       c.65del
1       c.79G>T
2      c.111C>G
3     c.1201G>A
4     c.1334T>C
5     c.1334T>C
6     c.1577G>A
7     c.1735C>T
8     c.1735C>T
9     c.2985C>G
10    c.3439C>T
11    c.3439C>T
12    c.3439C>T
Name: NM_001318852, dtype: object

In [15]:
transcript='NM_001318852.2'
vvalidator = VariantValidator(genome_build="hg38", transcript=transcript)

var_d = {}
for v in df['NM_001318852']:
    var = vvalidator.encode_hgvs(v)
    var_d[v] = var

varMapper = VariantColumnMapper(variant_d=var_d,
                                variant_column_name='NM_001318852', 
                                default_genotype='heterozygous')
# varMapper.preview_column(df['NM_001318852'])

https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_001318852.2%3Ac.65del/NM_001318852.2?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_001318852.2%3Ac.79G>T/NM_001318852.2?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_001318852.2%3Ac.111C>G/NM_001318852.2?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_001318852.2%3Ac.1201G>A/NM_001318852.2?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_001318852.2%3Ac.1334T>C/NM_001318852.2?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_001318852.2%3Ac.1334T>C/NM_001318852.2?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_001318852.2%3Ac.1577G>A/NM_001318852.2?content-

In [16]:
ageMapper = AgeColumnMapper.by_year_and_month('age at last assesment')
#ageMapper.preview_column(df['age at last assesment'])
sexMapper = SexColumnMapper(male_symbol='M', female_symbol='F', column_name='Sex')
#sexMapper.preview_column(df['Sex'])

individual_column_name = 'Indvidual\nin\nmanuscript'

encoder = CohortEncoder(df=df, 
                        hpo_cr=hpo_cr, 
                        column_mapper_d=column_mapper_d, 
                        individual_column_name=individual_column_name,
                        agemapper=ageMapper, 
                        sexmapper=sexMapper,
                        variant_mapper=varMapper,
                        metadata=metadata)
disease_id = 'OMIM:618443'
disease_name = 'Neurodevelopmental disorder with or without variable brain abnormalities'
disease = Disease(disease_id=disease_id, disease_label=disease_name)
encoder.set_disease(disease=disease)

<h2>Getting individual data and exporting to GA4GH Phenopacket Schema format</h2>

In [17]:
individuals = encoder.get_individuals()
cvalidator = CohortValidator(cohort=individuals, ontology=hpo_ontology, min_hpo=1, allelic_requirement=AllelicRequirement.MONO_ALLELIC)
qc = QcVisualizer(cohort_validator=cvalidator)
display(HTML(qc.to_summary_html()))

Level,Error category,Count
ERROR,CONFLICT,1
WARNING,REDUNDANT,3
INFORMATION,NOT_MEASURED,14


In [18]:
individuals = cvalidator.get_error_free_individual_list()
table = PhenopacketTable(individual_list=individuals, metadata=metadata)
display(HTML(table.to_html()))

Individual,Disease,Genotype,Phenotypic features
1 (MALE; P14Y8M),Neurodevelopmental disorder with or without variable brain abnormalities (OMIM:618443),NM_001318852.2:c.65del (heterozygous),"Ataxia (HP:0001251); Intellectual disability, moderate (HP:0002342); Cerebellar atrophy (HP:0001272); Emotional lability (HP:0000712); Scoliosis (HP:0002650); Autism (HP:0000717); Hypotonia (HP:0001252); Global developmental delay (HP:0001263); excluded: Developmental regression (HP:0002376); excluded: Abnormality of movement (HP:0100022); excluded: Cerebral visual impairment (HP:0100704); excluded: Seizure (HP:0001250)"
2 (MALE; P4Y),Neurodevelopmental disorder with or without variable brain abnormalities (OMIM:618443),NM_001318852.2:c.79G>T (heterozygous),"Ataxia (HP:0001251); Intellectual disability, severe (HP:0010864); Hypotonia (HP:0001252); Global developmental delay (HP:0001263); excluded: Developmental regression (HP:0002376); excluded: Autism (HP:0000717); excluded: Abnormality of movement (HP:0100022); excluded: Cerebral visual impairment (HP:0100704); excluded: Seizure (HP:0001250)"
3 (MALE; P4Y),Neurodevelopmental disorder with or without variable brain abnormalities (OMIM:618443),NM_001318852.2:c.111C>G (heterozygous),"Intellectual disability, moderate (HP:0002342); Nystagmus (HP:0000639); Hypotonia (HP:0001252); Global developmental delay (HP:0001263); excluded: Developmental regression (HP:0002376); excluded: Autism (HP:0000717); excluded: Abnormality of movement (HP:0100022); excluded: Cerebral visual impairment (HP:0100704); excluded: Seizure (HP:0001250)"
4 (MALE; P7Y6M),Neurodevelopmental disorder with or without variable brain abnormalities (OMIM:618443),NM_001318852.2:c.1201G>A (heterozygous),"Intellectual disability, mild (HP:0001256); Hearing impairment (HP:0000365); Hypertelorism (HP:0000316); Protruding ear (HP:0000411); Hypodontia (HP:0000668); Finger clinodactyly (HP:0040019); Synophrys (HP:0000664); Encopresis (HP:0040183); Autism (HP:0000717); Global developmental delay (HP:0001263); excluded: Developmental regression (HP:0002376); excluded: Hypotonia (HP:0001252); excluded: Abnormality of movement (HP:0100022); excluded: Cerebral visual impairment (HP:0100704); excluded: Seizure (HP:0001250)"
5 (MALE; P10Y),Neurodevelopmental disorder with or without variable brain abnormalities (OMIM:618443),NM_001318852.2:c.1334T>C (heterozygous),"Intellectual disability, moderate (HP:0002342); Perisylvian polymicrogyria (HP:0012650); Microdontia (HP:0000691); Scoliosis (HP:0002650); Hypotonia (HP:0001252); Seizure (HP:0001250); Global developmental delay (HP:0001263); excluded: Developmental regression (HP:0002376); excluded: Autism (HP:0000717); excluded: Abnormality of movement (HP:0100022)"
6 (FEMALE; P9Y),Neurodevelopmental disorder with or without variable brain abnormalities (OMIM:618443),NM_001318852.2:c.1334T>C (heterozygous),"Intellectual disability, mild (HP:0001256); Perisylvian polymicrogyria (HP:0012650); Hypotonia (HP:0001252); Global developmental delay (HP:0001263); excluded: Developmental regression (HP:0002376); excluded: Autism (HP:0000717); excluded: Seizure (HP:0001250)"
7 (FEMALE; P3Y),Neurodevelopmental disorder with or without variable brain abnormalities (OMIM:618443),NM_001318852.2:c.1577G>A (heterozygous),"Intellectual disability, mild (HP:0001256); Global developmental delay (HP:0001263); excluded: Developmental regression (HP:0002376); excluded: Autism (HP:0000717); excluded: Hypotonia (HP:0001252); excluded: Seizure (HP:0001250)"
8 (FEMALE; P5Y),Neurodevelopmental disorder with or without variable brain abnormalities (OMIM:618443),NM_001318852.2:c.1735C>T (heterozygous),"Spastic paraplegia (HP:0001258); Intellectual disability, severe (HP:0010864); Thin corpus callosum (HP:0033725); CNS hypomyelination (HP:0003429); Full cheeks (HP:0000293); Long philtrum (HP:0000343); Hypotonia (HP:0001252); Seizure (HP:0001250); Global developmental delay (HP:0001263); excluded: Developmental regression (HP:0002376); excluded: Autism (HP:0000717)"
9 (FEMALE; P6Y),Neurodevelopmental disorder with or without variable brain abnormalities (OMIM:618443),NM_001318852.2:c.1735C>T (heterozygous),"Global developmental delay (HP:0001263); Intellectual disability, moderate (HP:0002342); Reduced cerebral white matter volume (HP:0034295); Spasticity (HP:0001257); Hypotonia (HP:0001252); Small hand (HP:0200055); Lower limb muscle weakness (HP:0007340); Hypoplasia of the brainstem (HP:0002365); Seizure (HP:0001250); Syringomyelia (HP:0003396); Polymicrogyria (HP:0002126); Thin corpus callosum (HP:0033725); excluded: Developmental regression (HP:0002376); excluded: Autism (HP:0000717)"
10 (MALE; P4Y),Neurodevelopmental disorder with or without variable brain abnormalities (OMIM:618443),NM_001318852.2:c.2985C>G (heterozygous),"Intellectual disability, moderate (HP:0002342); Hypotonia (HP:0001252); Seizure (HP:0001250); Global developmental delay (HP:0001263); excluded: Developmental regression (HP:0002376); excluded: Autism (HP:0000717)"


<h2>Output results in phenopacket format</h2>

In [19]:
Individual.output_individuals_as_phenopackets(individual_list=individuals, 
                                              metadata=metadata, 
                                              outdir="phenopackets")

We output 13 GA4GH phenopackets to the directory phenopackets


In [20]:
# pxf validate --hpo hp.json *.json
# no errors