<H1>MAPK8IP3 genotype-phenotype correlations</H1>
<p>Neurodevelopmental disorder with or without variable brain abnormalities (NEDBA; 
<a href="https://omim.org/entry/618443">OMIM:618443</a>) is caused by heterozygous mutation in the MAPK8IP3 gene (<a href="https://omim.org/entry/605431">OMIM:605431</a>). </p>
<p><a href="https://pubmed.ncbi.nlm.nih.gov/30612693/">Platzer et al. (2019; PMID:30612693)</a> reported 13 unrelated patients. <a href="https://pubmed.ncbi.nlm.nih.gov/30945334/">Iwasawa et al. (2019; PMID:30945334)</a> reported 5 patients. <a href="https://pubmed.ncbi.nlm.nih.gov/34321325/">Yechieli et al. (2022; PMID:34321325)</a> reported a patient with cerebral palsy (spastic triplegia) and intellectual disability who carried a variant in MAPK8IP3 (c.45C>G, p.Y15X). TODO: create phenopacket.</p>
<p>Sundaramurthi et al. reported one additional affected individual (manuscript in preparation, 2023), with the de novo variant NM_001318852.2(MAPK8IP3):c.1735C>T (p.Arg579Cys).</p>
<p>Platzer et al. discussed the possibility of genotype-phenotype correlations for the variants p.Arg578Cys and p.Arg1146Cys.</p>
<p>TODO: do all of these variants use the same transcript?</p>
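<p>One way to start on the transcript question is to group the HGVS expressions by the RefSeq accession they name. The following is a minimal sketch, not part of genophenocorr: the helper names <code>transcript_of</code> and <code>group_by_transcript</code> are hypothetical, the regular expression only recognises versioned NM_/NR_ accessions at the start of the expression, and the two example strings are the only variants written out with coding coordinates in the text above.</p>

```python
import re
from typing import Dict, List, Optional

# Versioned RefSeq transcript accession at the start of an HGVS expression,
# e.g. "NM_001318852.2". Assumption: only NM_/NR_ accessions are of interest.
_TRANSCRIPT_RE = re.compile(r"^(N[MR]_\d+\.\d+)")


def transcript_of(hgvs: str) -> Optional[str]:
    """Return the transcript accession named by an HGVS string, or None."""
    match = _TRANSCRIPT_RE.match(hgvs.strip())
    return match.group(1) if match else None


def group_by_transcript(expressions: List[str]) -> Dict[Optional[str], List[str]]:
    """Group HGVS strings by transcript; expressions without one go under None."""
    groups: Dict[Optional[str], List[str]] = {}
    for expr in expressions:
        groups.setdefault(transcript_of(expr), []).append(expr)
    return groups


variants = [
    "NM_001318852.2(MAPK8IP3):c.1735C>T",  # Sundaramurthi et al., as written above
    "c.45C>G",                             # Yechieli et al., no transcript stated above
]
groups = group_by_transcript(variants)

for accession, members in groups.items():
    label = accession if accession is not None else "no transcript stated"
    print(f"{label}: {members}")
```

<p>Any expression that lands under <code>None</code> needs its transcript confirmed from the original publication before it can be compared with the others.</p>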

In [20]:
import os
import sys
# Make the local genophenocorr sources importable (no leading space in the relative path)
sys.path.insert(0, os.path.abspath('../../src'))

from genophenocorr import *
import glob
import pprint
import json
from google.protobuf.json_format import Parse
from phenopackets import Phenopacket 
import requests
import re
import pickle
import hpotk
from hpotk.ontology.load.obographs import load_ontology
from hpotk.ontology import Ontology
from genophenocorr.patient import PhenopacketPatientCreator
from genophenocorr.phenotype import PhenotypeCreator
from genophenocorr.protein import UniprotProteinMetadataService, ProteinAnnotationCache, ProtCachingFunctionalAnnotator
from genophenocorr.variant import VarCachingFunctionalAnnotator, VariantAnnotationCache, VepFunctionalAnnotator
from genophenocorr.cohort import PhenopacketCohortCreator, CohortAnalysis

ImportError: cannot import name 'PhenopacketCohortCreator' from 'genophenocorr.cohort' (/Users/robinp/GIT/genophenocorr/src/genophenocorr/cohort/__init__.py)

<h3>Setup</h3>
<p>To run genophenocorr, we first set up file paths:</p>
    <ul>
    <li>fpath_hpo: path to the downloaded HPO ontology file (hp.json)</li>
    <li>cache_dir: genophenocorr will store intermediate files here to speed up repeated runs of this notebook</li>
    <li>fpath_phenopackets: input phenopacket files, one per individual in the cohort.</li>
    </ul>
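<p>The role of <code>cache_dir</code> can be pictured as a look-up-then-compute pattern: check the directory for a stored result and only call the slow service on a miss. The sketch below is a generic illustration under that assumption and is not genophenocorr's implementation; <code>DiskCache</code>, <code>CachingAnnotator</code>, and <code>slow_lookup</code> are hypothetical names, and the stand-in lookup returns a dummy dictionary instead of querying a real service.</p>

```python
import os
import pickle
import tempfile


class DiskCache:
    """Hypothetical one-pickle-file-per-key cache (illustration only)."""

    def __init__(self, cache_dir):
        os.makedirs(cache_dir, exist_ok=True)
        self._dir = cache_dir

    def _path(self, key):
        # Replace characters that are awkward in file names, e.g. ':' and '>'.
        safe = "".join(c if c.isalnum() else "_" for c in key)
        return os.path.join(self._dir, safe + ".pickle")

    def get(self, key):
        path = self._path(key)
        if not os.path.isfile(path):
            return None
        with open(path, "rb") as fh:
            return pickle.load(fh)

    def put(self, key, value):
        with open(self._path(key), "wb") as fh:
            pickle.dump(value, fh)


class CachingAnnotator:
    """Return the cached value if present, else call the fallback and store it."""

    def __init__(self, cache, fallback):
        self._cache = cache
        self._fallback = fallback

    def annotate(self, key):
        hit = self._cache.get(key)
        if hit is not None:
            return hit
        value = self._fallback(key)
        self._cache.put(key, value)
        return value


calls = []


def slow_lookup(key):
    # Stand-in for a remote annotation request; records each real call.
    calls.append(key)
    return {"id": key}


with tempfile.TemporaryDirectory() as tmp:
    annotator = CachingAnnotator(DiskCache(tmp), slow_lookup)
    first = annotator.annotate("chr16:g.1767834C>T")
    second = annotator.annotate("chr16:g.1767834C>T")  # served from disk

print(f"fallback calls: {len(calls)}")
```

<p>The file-name sanitising in this sketch can map two different keys to the same file, so a real cache would need a collision-free key scheme.</p>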
        

In [13]:
MAPK8IP3_transcript = "NM_001318852.2"



fpath_hpo = 'hpo_data/hp.json'
cache_dir = 'annotations'
fpath_phenopackets = 'phenopackets'

# Create the cache directory if it does not exist yet
os.makedirs(cache_dir, exist_ok=True)

<h3>HPO ontology, validator, and creator</h3>
<p>We use the <a href="https://github.com/TheJacksonLaboratory/hpo-toolkit">HPO toolkit</a> for validation of the ontology file (the file should always validate unless it was corrupted or altered). The PhenotypeCreator object is used by genophenocorr to validate HPO terms in the phenopackets.</p>

In [17]:
hpo: Ontology = load_ontology('http://purl.obolibrary.org/obo/hp.json')
validators = [
    hpotk.validate.AnnotationPropagationValidator(hpo),
    hpotk.validate.ObsoleteTermIdsValidator(hpo),
    hpotk.validate.PhenotypicAbnormalityValidator(hpo)
]
phenotype_creator = PhenotypeCreator(hpo, hpotk.validate.ValidationRunner(validators))

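<h2>genophenocorr analysis setup</h2>
<p>TODO: documentation. The cell below assembles the protein metadata service and the VEP-based variant annotator, each wrapped in a cache stored in cache_dir, and then builds the cohort from the phenopackets.</p>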

In [19]:
# Protein metadata
pm = UniprotProteinMetadataService()
pac = ProteinAnnotationCache(cache_dir)
pfa = ProtCachingFunctionalAnnotator(pac, pm)

# Functional annotator
vac = VariantAnnotationCache(cache_dir)
vep = VepFunctionalAnnotator(pfa)
vfa = VarCachingFunctionalAnnotator(vac, vep)


# Assemble the patient creator
pc = PhenopacketPatientCreator(phenotype_creator, vfa)
cc = PhenopacketCohortCreator(pc)
cohort = cc.create_cohort(fpath_phenopackets)

NameError: name 'PhenopacketCohortCreator' is not defined

In [4]:
allPatients.get_cohort_description_df()

Unnamed: 0,Patient ID,Disease,Gene,Variant,Protein,HPO Terms
0,12,618443,{MRPS34},{chr16:g.1767834C>T},{NP_076425.1},"{HP:0010864, HP:0100021, HP:0001263, HP:0001257}"
1,2,,{LOC107984011},{chr9:g.70598463C>T},{None},"{HP:0001252, HP:0000750, HP:0012469, HP:003193..."
2,3,,{LOC107984011},{chr9:g.70598463C>T},{None},"{HP:0001252, HP:0002069, HP:0000750, HP:000072..."
3,13,618443,{MRPS34},{chr16:g.1767834C>T},{NP_076425.1},"{HP:0001252, HP:0100704, HP:0007301, HP:000237..."
4,1,618443,{JPT2},{chr16:g.1706404del},{NP_653171.1},"{HP:0001251, HP:0001263, HP:0001252, HP:0000717}"
5,4,,{LOC107984011},{chr9:g.70598463C>T},{None},"{HP:0002133, HP:0001252, HP:0010864, HP:000072..."
6,PMID_111_probandA,OMIM:618443,{MAPK8IP3},{chr16:g.1762846C>T},{NP_001305781.1},"{HP:0033725, HP:0002104, HP:0001662, HP:000126..."
7,8,,{KLF9-DT},{chr9:g.70553229G>T},{None},"{HP:0001252, HP:0011147, HP:0000750, HP:003193..."
8,6,618443,{MAPK8IP3},{chr16:g.1760409T>C},{NP_001305781.1},"{HP:0001263, HP:0001256, HP:0001252}"
9,7,618443,{MAPK8IP3},{chr16:g.1762388G>A},{NP_001305781.1},"{HP:0001263, HP:0001256}"
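
<p>Several rows in the output above list a gene other than MAPK8IP3, which may point to a problem with variant annotation or with the input phenopackets. The following minimal sketch tallies the rows by gene using only the standard library. The rows are copied from the output above (truncated HPO lists left as shown), and <code>EXPECTED_GENE</code> is a name introduced here for illustration.</p>

```python
import csv
import io
from collections import Counter

# Rows copied verbatim from the cohort description output above.
TABLE = '''Unnamed: 0,Patient ID,Disease,Gene,Variant,Protein,HPO Terms
0,12,618443,{MRPS34},{chr16:g.1767834C>T},{NP_076425.1},"{HP:0010864, HP:0100021, HP:0001263, HP:0001257}"
1,2,,{LOC107984011},{chr9:g.70598463C>T},{None},"{HP:0001252, HP:0000750, HP:0012469, HP:003193..."
2,3,,{LOC107984011},{chr9:g.70598463C>T},{None},"{HP:0001252, HP:0002069, HP:0000750, HP:000072..."
3,13,618443,{MRPS34},{chr16:g.1767834C>T},{NP_076425.1},"{HP:0001252, HP:0100704, HP:0007301, HP:000237..."
4,1,618443,{JPT2},{chr16:g.1706404del},{NP_653171.1},"{HP:0001251, HP:0001263, HP:0001252, HP:0000717}"
5,4,,{LOC107984011},{chr9:g.70598463C>T},{None},"{HP:0002133, HP:0001252, HP:0010864, HP:000072..."
6,PMID_111_probandA,OMIM:618443,{MAPK8IP3},{chr16:g.1762846C>T},{NP_001305781.1},"{HP:0033725, HP:0002104, HP:0001662, HP:000126..."
7,8,,{KLF9-DT},{chr9:g.70553229G>T},{None},"{HP:0001252, HP:0011147, HP:0000750, HP:003193..."
8,6,618443,{MAPK8IP3},{chr16:g.1760409T>C},{NP_001305781.1},"{HP:0001263, HP:0001256, HP:0001252}"
9,7,618443,{MAPK8IP3},{chr16:g.1762388G>A},{NP_001305781.1},"{HP:0001263, HP:0001256}"
'''

EXPECTED_GENE = "MAPK8IP3"  # assumption: every patient in this cohort should map here

rows = list(csv.DictReader(io.StringIO(TABLE)))

# The Gene column is rendered as a set literal such as "{MRPS34}"; strip the braces.
gene_counts = Counter(row["Gene"].strip("{}") for row in rows)

# Collect (patient, gene, variant) for every row whose gene is not the expected one.
unexpected = [
    (row["Patient ID"], row["Gene"].strip("{}"), row["Variant"].strip("{}"))
    for row in rows
    if row["Gene"].strip("{}") != EXPECTED_GENE
]

for gene, count in gene_counts.most_common():
    print(f"{gene}: {count}")
for patient_id, gene, variant in unexpected:
    print(f"patient {patient_id}: {variant} annotated to {gene}")
```

<p>Stripping the braces only works while each row lists a single gene; a row with several genes would need to be split on commas first.</p>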


In [12]:
for var in allPatients.all_variants_d.values():
    pprint.pprint(var._variant_json)

{'allele_string': 'C/T',
 'assembly_name': 'GRCh38',
 'colocated_variants': [{'allele_string': 'C/T',
                         'clin_sig': ['likely_pathogenic', 'pathogenic'],
                         'clin_sig_allele': 'T:pathogenic;T:pathogenic/likely_pathogenic',
                         'end': 1767834,
                         'id': 'rs1567214097',
                         'phenotype_or_disease': 1,
                         'pubmed': [30945334, 30612693],
                         'seq_region_name': '16',
                         'start': 1767834,
                         'strand': 1,
                         'var_synonyms': {'ClinVar': ['RCV000779605',
                                                      'VCV000632565',
                                                      'RCV001266867'],
                                          'OMIM': [605431.0005],
                                          'UniProt': ['VAR_082615']}}],
 'end': 1767834,
 'id': 'chr16:g.1767834C>T',
 'input': '