<h1>Noonan syndrome 1: Lee et al. (2007): phenopackets</h1>
<p>Data imported from <a href="https://pubmed.ncbi.nlm.nih.gov/17661820/">Lee ST, Ki CS, Lee HJ. Mutation analysis of the genes involved in the Ras-mitogen-activated protein kinase (MAPK) pathway in Korean patients with Noonan syndrome. Clin Genet. 2007 Aug;72(2):150-5. PMID: 17661820</a>.</p>

In [1]:
import pandas as pd
pd.set_option('display.max_colwidth', None) # show entire column contents, important!
from IPython.display import display, HTML
import pyphetools
from pyphetools.creation import *
from pyphetools.visualization import *
from pyphetools.validation import CohortValidator
print(f"pyphetools version {pyphetools.__version__}")

pyphetools version 0.9.79




In [2]:
PMID = "PMID:17661820"
title = "Mutation analysis of the genes involved in the Ras-mitogen-activated protein kinase (MAPK) pathway in Korean patients with Noonan syndrome"
citation = Citation(pmid=PMID, title=title)
parser = HpoParser()
hpo_cr = parser.get_hpo_concept_recognizer()
hpo_version = parser.get_version()
hpo_ontology = parser.get_ontology()
metadata = MetaData(created_by="ORCID:0000-0002-0736-9199", citation=citation)
metadata.default_versions_with_hpo(version=hpo_version)
print(f"HPO version {hpo_version}")

HPO version 2024-04-26


In [3]:
df = pd.read_excel('input/Lee2007Noonan1.xlsx')

In [4]:
df.head()

Unnamed: 0,Patient,1,2,3,4,5,6,7
0,Sex,M,F,M,F,F,M,M
1,Age,5,29,4,2,30,6,3
2,PTPN11 mutation,T42A,N308D,N308D,N308D,N308D,N308D,M504V
3,transcript.hgvs,c.124A>G,c.922A>G,c.922A>G,c.922A>G,c.922A>G,c.922A>G,c.1510A>G
4,CHD,"ASD, SVC and IVC anomaly",PS,"ASD, PS, hypoplastic MPA","VSD, PS",PS,"ASD, PS","ASD, mild PS"


In [5]:
# The spreadsheet stores one patient per column; transpose so that each row is one individual
dft = df.transpose()
dft.columns = dft.iloc[0]
dft.drop(dft.index[0], inplace=True)
dft.head()
dft['individual_id'] = dft.index
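
The reshaping in the cell above can be illustrated without pandas. The following is a minimal sketch on a three-patient excerpt of the table shown earlier; the names `rows`, `columns`, and `records` are illustrative only and are not part of the notebook.

```python
# Toy excerpt of the spreadsheet layout: one row per attribute, one column per patient.
rows = [
    ["Patient", "1", "2", "3"],
    ["Sex", "M", "F", "M"],
    ["Age", "5", "29", "4"],
]

# zip(*rows) swaps rows and columns, which is what DataFrame.transpose() does.
columns = list(zip(*rows))

# The first transposed row holds the attribute names ("Patient", "Sex", "Age").
header = columns[0]

records = []
for col in columns[1:]:
    # Skip the "Patient" slot and store the patient number as individual_id instead.
    record = dict(zip(header[1:], col[1:]))
    record["individual_id"] = col[0]
    records.append(record)

for record in records:
    print(record)
```

Each resulting record corresponds to one row of `dft`, with the patient number available as `individual_id`.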

In [6]:
column_mapper_list = list()

In [7]:
chd_d = {'ASD': 'Atrial septal defect',
         'SVC': 'Bilateral superior vena cava',  # from paper!
         'PS': 'Pulmonic stenosis',
         'hypoplastic MPA': 'Pulmonary artery hypoplasia',
         'VSD': 'Ventricular septal defect'}
chdMapper = OptionColumnMapper(column_name='CHD',concept_recognizer=hpo_cr, option_d=chd_d)
column_mapper_list.append(chdMapper)
chdMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Atrial septal defect (HP:0001631) (observed),4
1,Bilateral superior vena cava (HP:0033379) (observed),1
2,Pulmonic stenosis (HP:0001642) (observed),6
3,Pulmonary artery hypoplasia (HP:0004971) (observed),1
4,Ventricular septal defect (HP:0001629) (observed),1
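
The counts in this preview can be checked by hand. The sketch below assumes a simple substring match between each abbreviation and the cell text; this is an assumption made for illustration and not necessarily how `OptionColumnMapper` works internally. The cell values are copied from the CHD row of the table above.

```python
from collections import Counter

# Abbreviation-to-label dictionary, as in the notebook cell.
chd_d = {
    "ASD": "Atrial septal defect",
    "SVC": "Bilateral superior vena cava",
    "PS": "Pulmonic stenosis",
    "hypoplastic MPA": "Pulmonary artery hypoplasia",
    "VSD": "Ventricular septal defect",
}

# CHD cells for patients 1 to 7, copied from the input table.
chd_cells = [
    "ASD, SVC and IVC anomaly",
    "PS",
    "ASD, PS, hypoplastic MPA",
    "VSD, PS",
    "PS",
    "ASD, PS",
    "ASD, mild PS",
]

# Assumed matching rule: a label is counted once per cell containing its abbreviation.
counts = Counter()
for cell in chd_cells:
    for abbreviation, label in chd_d.items():
        if abbreviation in cell:
            counts[label] += 1

for label, n in counts.items():
    print(f"{label}: {n}")
```

Under this assumption the tallies agree with the preview, including "mild PS" being counted as pulmonic stenosis.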


In [8]:
# Webbed neck HP:0000465
webbedNeckMapper = SimpleColumnMapper(column_name='Webbed neck',
                                      hpo_id='HP:0000465', hpo_label='Webbed neck',
                                      observed='Yes', excluded='−')
column_mapper_list.append(webbedNeckMapper)
webbedNeckMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Yes"" -> HP: Webbed neck (HP:0000465) (observed)",5
1,"original value: ""-"" -> HP: Webbed neck (HP:0000465) (not measured)",2
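
The preview reports the "-" cells as not measured rather than excluded. One plausible reading is that the cell text, which is printed as a plain hyphen, does not equal the `excluded` symbol passed to the mapper, which is a Unicode minus sign (U+2212). The sketch below illustrates that behaviour with a hypothetical `classify` function based on exact string comparison; it is an assumption inferred from the preview output, not the pyphetools implementation.

```python
def classify(cell, observed="Yes", excluded="\u2212"):
    """Hypothetical three-way classification by exact string match.

    The default for `excluded` is the Unicode minus sign (U+2212),
    mirroring the symbol used in the mapper call.
    """
    value = str(cell).strip()
    if value == observed:
        return "observed"
    if value == excluded:
        return "excluded"
    return "not measured"


print(classify("Yes"))      # matches the observed symbol
print(classify("-"))        # ASCII hyphen-minus (U+002D): no match
print(classify("\u2212"))   # Unicode minus sign: matches the excluded symbol
```

If the intent is to record these findings as explicitly excluded, the `excluded` argument would need to be the same character that appears in the spreadsheet.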


In [9]:
# Short stature HP:0004322
shortStatureMapper = SimpleColumnMapper(column_name='Short stature',
                                        hpo_id='HP:0004322', hpo_label='Short stature',observed='Yes',excluded='−')
column_mapper_list.append(shortStatureMapper)
shortStatureMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Yes"" -> HP: Short stature (HP:0004322) (observed)",6
1,"original value: ""No"" -> HP: Short stature (HP:0004322) (not measured)",1


In [10]:
# Chest deformity -- assume pectus excavatum, reported for one patient only in detail
# Pectus excavatum HP:0000767
pectusMapper = SimpleColumnMapper(column_name='Chest deformity',
                                  hpo_id='HP:0000767', hpo_label='Pectus excavatum', observed='Yes',  excluded='−')
column_mapper_list.append(pectusMapper)
pectusMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Yes"" -> HP: Pectus excavatum (HP:0000767) (observed)",5
1,"original value: ""-"" -> HP: Pectus excavatum (HP:0000767) (not measured)",2


In [11]:
# Feeding difficulties HP:0011968
feedingMapper = SimpleColumnMapper(column_name='Feeding problems',
                                   hpo_id='HP:0011968', hpo_label='Feeding difficulties', observed='Yes', excluded='−')
column_mapper_list.append(feedingMapper)
feedingMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""-"" -> HP: Feeding difficulties (HP:0011968) (not measured)",6
1,"original value: ""Yes"" -> HP: Feeding difficulties (HP:0011968) (observed)",1


In [12]:
# Hearing problem
# Hearing impairment HP:0000365
hearingMapper = SimpleColumnMapper(column_name='Hearing problem',
                                   hpo_id='HP:0000365',hpo_label='Hearing impairment', observed='Yes', excluded='−')
column_mapper_list.append(hearingMapper)
hearingMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""-"" -> HP: Hearing impairment (HP:0000365) (not measured)",3
1,"original value: ""Yes"" -> HP: Hearing impairment (HP:0000365) (observed)",4


In [13]:
# Delayed development
# Global developmental delay HP:0001263
devMapper = SimpleColumnMapper(column_name='Delayed development',
                               hpo_id='HP:0001263', hpo_label='Global developmental delay', observed='Yes', excluded='−')
column_mapper_list.append(devMapper)
devMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Yes"" -> HP: Global developmental delay (HP:0001263) (observed)",5
1,"original value: ""-"" -> HP: Global developmental delay (HP:0001263) (not measured)",2


In [14]:
# Mental retardation
# Intellectual disability, mild HP:0001256
idMapper =  SimpleColumnMapper(column_name='Mental retardation',
                               hpo_id='HP:0001256', hpo_label='Intellectual disability, mild', observed='Mild', excluded='−')
column_mapper_list.append(idMapper)
idMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Mild"" -> HP: Intellectual disability, mild (HP:0001256) (observed)",1
1,"original value: ""-"" -> HP: Intellectual disability, mild (HP:0001256) (not measured)",6


In [15]:
# Cryptorchidism HP:0000028
cryptorchidismMapper =  SimpleColumnMapper(column_name='Cryptorchidism',hpo_id='HP:0000028',
                                    hpo_label='Cryptorchidism', observed='Yes', excluded='−')
column_mapper_list.append(cryptorchidismMapper)
cryptorchidismMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Yes"" -> HP: Cryptorchidism (HP:0000028) (observed)",1
1,"original value: ""-"" -> HP: Cryptorchidism (HP:0000028) (not measured)",6


In [16]:
# Cubitus valgus HP:0002967
cvalMapper =  SimpleColumnMapper(column_name='Cubitus valgus',hpo_id='HP:0002967',
    hpo_label='Cubitus valgus',  observed='Yes',  excluded='−')
column_mapper_list.append(cvalMapper)
cvalMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Yes"" -> HP: Cubitus valgus (HP:0002967) (observed)",3
1,"original value: ""-"" -> HP: Cubitus valgus (HP:0002967) (not measured)",4


In [17]:
# Patient 1 had a small ectopic kidney
other_d = {'Splenomegaly': 'Splenomegaly',
           'Renal': 'Ectopic kidney',  # from paper!
           }
otherMapper = OptionColumnMapper(column_name='Others',concept_recognizer=hpo_cr, option_d=other_d)
column_mapper_list.append(otherMapper)
otherMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Abnormality of the kidney (HP:0000077) (observed),1
1,Splenomegaly (HP:0001744) (observed),1


<h3>Variants</h3>
<p>According to ClinVar, the three variants are NM_002834.5(PTPN11):c.124A>G (p.Thr42Ala), NM_002834.5(PTPN11):c.922A>G (p.Asn308Asp), and NM_002834.5(PTPN11):c.1510A>G (p.Met504Val).</p>
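
As a quick consistency check between the coding-sequence and protein descriptions, the codon number can be derived from the c. position, since codon n spans c.(3n-2) to c.3n. The sketch below is illustrative only; `codon_number` is a hypothetical helper that handles simple substitutions, not a general HGVS parser.

```python
import re

# c. and p. descriptions of the three variants listed above.
variants = {
    "c.124A>G": "p.Thr42Ala",
    "c.922A>G": "p.Asn308Asp",
    "c.1510A>G": "p.Met504Val",
}


def codon_number(hgvs_c):
    """Return the codon containing a simple coding-sequence substitution."""
    match = re.fullmatch(r"c\.(\d+)([ACGT])>([ACGT])", hgvs_c)
    if match is None:
        raise ValueError(f"not a simple substitution: {hgvs_c}")
    position = int(match.group(1))
    # Codon 1 covers c.1 to c.3, codon 2 covers c.4 to c.6, and so on.
    return (position - 1) // 3 + 1


for hgvs_c, hgvs_p in variants.items():
    residue = int(re.search(r"\d+", hgvs_p).group())
    print(hgvs_c, hgvs_p, codon_number(hgvs_c) == residue)
```

All three c. positions fall in the codon named by the corresponding protein change, which also agrees with the short forms T42A, N308D, and M504V in the input table.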

In [18]:
ptpn11_transcript='NM_002834.5'
ptpn11_id = "HGNC:9644"
vman = VariantManager(df=dft,
                      individual_column_name="individual_id",
                      allele_1_column_name="transcript.hgvs",
                      gene_symbol="PTPN11",
                      gene_id=ptpn11_id,
                      transcript=ptpn11_transcript)

var_d = vman.get_variant_d()
varMapper = VariantColumnMapper(variant_d=var_d, 
                                variant_column_name='transcript.hgvs', 
                                default_genotype="heterozygous")

[INFO] encoding variant "c.922A>G"
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_002834.5%3Ac.922A>G/NM_002834.5?content-type=application%2Fjson
[INFO] encoding variant "c.124A>G"
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_002834.5%3Ac.124A>G/NM_002834.5?content-type=application%2Fjson
[INFO] encoding variant "c.1510A>G"
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_002834.5%3Ac.1510A>G/NM_002834.5?content-type=application%2Fjson


In [19]:
ageMapper = AgeColumnMapper.by_year('Age')
#ageMapper.preview_column(dft['Age'])
sexMapper = SexColumnMapper(male_symbol='M', female_symbol='F', column_name='Sex')
#sexMapper.preview_column(dft['Sex'])
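
The summary table further down shows ages as ISO 8601 durations (for example `P5Y`) and sexes as `MALE` or `FEMALE`. The following sketch shows the equivalent conversion for the seven patients; `years_to_iso8601` and `sex_label` are hypothetical helpers written for illustration and are not pyphetools functions.

```python
def years_to_iso8601(age_in_years):
    """Format a whole number of years as an ISO 8601 duration, e.g. 5 -> 'P5Y'."""
    return f"P{int(age_in_years)}Y"


def sex_label(symbol, male_symbol="M", female_symbol="F"):
    """Translate the spreadsheet symbol into the label shown in the summary table."""
    if symbol == male_symbol:
        return "MALE"
    if symbol == female_symbol:
        return "FEMALE"
    return "UNKNOWN"  # assumption: anything else is treated as unknown


# Sex and Age rows from the input table, patients 1 to 7.
sexes = ["M", "F", "M", "F", "F", "M", "M"]
ages = [5, 29, 4, 2, 30, 6, 3]

summary = [f"{sex_label(s)}; {years_to_iso8601(a)}" for s, a in zip(sexes, ages)]
for patient, text in enumerate(summary, start=1):
    print(patient, text)
```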

In [20]:
encoder = CohortEncoder(df=dft, 
                        hpo_cr=hpo_cr, 
                        column_mapper_list=column_mapper_list, 
                        individual_column_name="individual_id", 
                        age_at_last_encounter_mapper=ageMapper, 
                        sexmapper=sexMapper,
                        metadata=metadata,
                        variant_mapper=varMapper)
noonan = Disease(disease_id="OMIM:163950", disease_label="Noonan syndrome 1")
encoder.set_disease(noonan)

In [21]:
individuals = encoder.get_individuals()
cvalidator = CohortValidator(cohort=individuals,
                             ontology=hpo_ontology,
                             min_hpo=1,
                             allelic_requirement=AllelicRequirement.MONO_ALLELIC)
qc = QcVisualizer(cohort_validator=cvalidator)
display(HTML(qc.to_summary_html()))

Level,Error category,Count
INFORMATION,NOT_MEASURED,32
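
The figure of 32 can be reconciled with the column previews above by adding up their "not measured" counts. The arithmetic is sketched below; the dictionary simply restates numbers from the preview tables.

```python
# "Not measured" counts copied from each SimpleColumnMapper preview above.
not_measured = {
    "Webbed neck": 2,
    "Short stature": 1,
    "Chest deformity": 2,
    "Feeding problems": 6,
    "Hearing problem": 3,
    "Delayed development": 2,
    "Mental retardation": 6,
    "Cryptorchidism": 6,
    "Cubitus valgus": 4,
}

n_individuals = 7
total_cells = n_individuals * len(not_measured)   # 7 individuals x 9 columns
total_not_measured = sum(not_measured.values())
total_observed = total_cells - total_not_measured

print(f"not measured: {total_not_measured} of {total_cells} cells")
print(f"observed: {total_observed}")
```

The sum matches the NOT_MEASURED count reported by the QC summary, so every non-observed cell in these nine columns was recorded as not measured rather than excluded.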


In [22]:
individuals = cvalidator.get_error_free_individual_list()
table = PhenopacketTable(individual_list=individuals, metadata=metadata)
display(HTML(table.to_html()))

Individual,Disease,Genotype,Phenotypic features
1 (MALE; P5Y),Noonan syndrome 1 (OMIM:163950),NM_002834.5:c.124A>G (heterozygous),"Atrial septal defect (HP:0001631); Bilateral superior vena cava (HP:0033379); Webbed neck (HP:0000465); Short stature (HP:0004322); Pectus excavatum (HP:0000767); Global developmental delay (HP:0001263); Intellectual disability, mild (HP:0001256); Cryptorchidism (HP:0000028); Cubitus valgus (HP:0002967); Abnormality of the kidney (HP:0000077); Splenomegaly (HP:0001744)"
2 (FEMALE; P29Y),Noonan syndrome 1 (OMIM:163950),NM_002834.5:c.922A>G (heterozygous),Pulmonic stenosis (HP:0001642); Webbed neck (HP:0000465); Short stature (HP:0004322); Pectus excavatum (HP:0000767); Global developmental delay (HP:0001263)
3 (MALE; P4Y),Noonan syndrome 1 (OMIM:163950),NM_002834.5:c.922A>G (heterozygous),Atrial septal defect (HP:0001631); Pulmonic stenosis (HP:0001642); Pulmonary artery hypoplasia (HP:0004971); Short stature (HP:0004322); Pectus excavatum (HP:0000767); Hearing impairment (HP:0000365); Global developmental delay (HP:0001263)
4 (FEMALE; P2Y),Noonan syndrome 1 (OMIM:163950),NM_002834.5:c.922A>G (heterozygous),Ventricular septal defect (HP:0001629); Pulmonic stenosis (HP:0001642); Short stature (HP:0004322); Feeding difficulties (HP:0011968); Hearing impairment (HP:0000365); Global developmental delay (HP:0001263)
5 (FEMALE; P30Y),Noonan syndrome 1 (OMIM:163950),NM_002834.5:c.922A>G (heterozygous),Pulmonic stenosis (HP:0001642); Webbed neck (HP:0000465); Short stature (HP:0004322)
6 (MALE; P6Y),Noonan syndrome 1 (OMIM:163950),NM_002834.5:c.922A>G (heterozygous),Atrial septal defect (HP:0001631); Pulmonic stenosis (HP:0001642); Webbed neck (HP:0000465); Pectus excavatum (HP:0000767); Hearing impairment (HP:0000365); Global developmental delay (HP:0001263); Cubitus valgus (HP:0002967)
7 (MALE; P3Y),Noonan syndrome 1 (OMIM:163950),NM_002834.5:c.1510A>G (heterozygous),Atrial septal defect (HP:0001631); Pulmonic stenosis (HP:0001642); Webbed neck (HP:0000465); Short stature (HP:0004322); Pectus excavatum (HP:0000767); Hearing impairment (HP:0000365); Cubitus valgus (HP:0002967)


In [23]:
Individual.output_individuals_as_phenopackets(individual_list=individuals,
                                              metadata=metadata)

We output 7 GA4GH phenopackets to the directory phenopackets
