# ZMYM3

Data derived from [Hiatt SM, et al. (2023) Deleterious, protein-altering variants in the transcriptional coregulator ZMYM3 in 27 individuals with a neurodevelopmental delay phenotype. Am J Hum Genet](https://pubmed.ncbi.nlm.nih.gov/36586412/).
Note that precise information about the age of onset is not provided. From the context, we assume Infantile onset.

In [1]:
import pandas as pd
pd.set_option('display.max_colwidth', None) # show entire column contents, important!
from IPython.display import display, HTML
from pyphetools.creation import *
from pyphetools.visualization import *
from pyphetools.validation import *
import pyphetools
print(f"Using pyphetools version {pyphetools.__version__}")

Using pyphetools version 0.9.78




In [2]:
PMID = "PMID:36586412"
title = "Deleterious, protein-altering variants in the transcriptional coregulator ZMYM3 in 27 individuals with a neurodevelopmental delay phenotype"
cite = Citation(pmid=PMID, title=title)
metadata = MetaData(created_by="ORCID:0000-0002-5648-2155", citation=cite)
parser = HpoParser(hpo_json_file="../hp.json")
hpo_cr = parser.get_hpo_concept_recognizer()
hpo_version = parser.get_version()
hpo_ontology = parser.get_ontology()
metadata.default_versions_with_hpo(version=hpo_version)
print(f"HPO version {hpo_version}")


HPO version 2024-04-04


In [3]:
df = pd.read_excel("input/Hiatt_ZMYM3_2023.xlsx")

In [4]:
df.head(2)

Unnamed: 0,Individual,Sex,Age (years),Zygosity,Inheritance,Mother’s NDD-related phenotype,Variant (NM_005096.3; NP_005087.1),Speech delay,Motor delay,ID,ASD traits,Behavioral problems,Facial dys-morphism,GU anomalies,Other
0,1,male,3.0,hemizygous,maternal,none,c.205G>A (p.Asp69Asn),yes,yes,,,no,no,urinary tract dilatation of left kidney on ultrasound,congenital heart defects
1,21,male,18.2,hemizygous,unknown,none,c.507A>T (p.Arg169Ser),yes,no,yes,yes,yes,yes,hypospadias,–


In [5]:
df.columns

Index(['Individual', 'Sex', 'Age (years)', 'Zygosity', 'Inheritance',
       'Mother’s NDD-related phenotype', 'Variant (NM_005096.3; NP_005087.1)',
       'Speech delay', 'Motor delay', 'ID', 'ASD traits',
       'Behavioral problems', 'Facial dys-morphism', 'GU anomalies', 'Other'],
      dtype='object')

In [6]:
sexMapper = SexColumnMapper(column_name="Sex", male_symbol="male", female_symbol="female")
#sexMapper.preview_column(df)

In [7]:
ageMapper = AgeColumnMapper.by_year(column_name="Age (years)")
#ageMapper.preview_column(df)
# This is the age at last encounter, not age of onset, which is not specified. Leave out for now
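
The ages in this table are given in decimal years (for example `3.0` and `18.2`). As a rough illustration of what a year-based age mapping involves, the sketch below converts such a value to an ISO 8601 duration. It assumes the fractional part is a decimal fraction of a year (not a month count), and `years_to_iso8601` is a hypothetical helper, not the pyphetools implementation.

```python
def years_to_iso8601(age_years: float) -> str:
    """Convert an age in decimal years to an ISO 8601 duration such as 'P18Y2M'."""
    years = int(age_years)
    # Assumption: the fraction is a decimal part of a year, rounded to whole months
    months = round((age_years - years) * 12)
    if months == 12:  # a fraction that rounds up to a full year carries over
        years, months = years + 1, 0
    return f"P{years}Y{months}M" if months else f"P{years}Y"


print(years_to_iso8601(3.0))
print(years_to_iso8601(18.2))
```

Rounding to whole months keeps the result readable; a finer resolution would need days and a convention for month length.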

In [8]:
# isolate the HGVS cDNA variant
df["NM_005096.3"] = df["Variant (NM_005096.3; NP_005087.1)"].apply(lambda x: x.split()[0])
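
The one-line `split()` above works when every cell looks like `c.205G>A (p.Asp69Asn)`. A slightly more defensive variant, sketched here with a hypothetical helper `extract_cdna`, returns `None` for empty or missing cells and for strings that do not start with `c.`, so that unexpected rows can be inspected before variants are encoded.

```python
def extract_cdna(cell):
    """Return the cDNA part of a cell like 'c.205G>A (p.Asp69Asn)', or None."""
    # Missing values read by pandas arrive as float NaN, not as strings
    if not isinstance(cell, str) or not cell.strip():
        return None
    first_token = cell.split()[0]
    # Only accept tokens that look like an HGVS coding-DNA expression
    return first_token if first_token.startswith("c.") else None


print(extract_cdna("c.205G>A (p.Asp69Asn)"))
print(extract_cdna(float("nan")))
```

With a DataFrame, this could be passed to `apply` in place of the lambda.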

In [9]:
column_mapper_list = list()
speechMapper = SimpleColumnMapper(hpo_id="HP:0000750", hpo_label="Delayed speech and language development",
                                 observed="yes", excluded="no", column_name="Speech delay")
column_mapper_list.append(speechMapper)
speechMapper.preview_column(df)

Unnamed: 0,mapping,count
0,"original value: ""yes"" -> HP: Delayed speech and language development (HP:0000750) (observed)",26
1,"original value: ""nan"" -> HP: Delayed speech and language development (HP:0000750) (not measured)",1
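
The preview above reflects a three-way classification of each cell: `yes` becomes observed, `no` becomes excluded, and anything else (including missing values) is treated as not measured. The following is a minimal standalone sketch of that logic. `classify_cell` and the sample column are hypothetical and only mirror the behavior visible in the preview, not the library's internals.

```python
import math
from collections import Counter


def classify_cell(cell, observed="yes", excluded="no"):
    """Classify one table cell as 'observed', 'excluded', or 'not measured'."""
    if cell is None or (isinstance(cell, float) and math.isnan(cell)):
        return "not measured"
    value = str(cell).strip().lower()
    if value == observed:
        return "observed"
    if value == excluded:
        return "excluded"
    # Any other content is not interpreted in this sketch
    return "not measured"


# Hypothetical column contents, not the actual spreadsheet values
column = ["yes", "no", float("nan"), "yes"]
counts = Counter(classify_cell(c) for c in column)
print(counts)
```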


In [10]:
motorMapper = SimpleColumnMapper(hpo_id="HP:0001270", hpo_label="Motor delay",
                                 observed="yes", excluded="no", column_name="Motor delay")
column_mapper_list.append(motorMapper)
motorMapper.preview_column(df)

Unnamed: 0,mapping,count
0,"original value: ""yes"" -> HP: Motor delay (HP:0001270) (observed)",18
1,"original value: ""no"" -> HP: Motor delay (HP:0001270) (excluded)",8
2,"original value: ""nan"" -> HP: Motor delay (HP:0001270) (not measured)",1


In [11]:
idMapper = SimpleColumnMapper(hpo_id="HP:0001249", hpo_label="Intellectual disability",
                                 observed="yes", excluded="no", column_name="ID")
column_mapper_list.append(idMapper)
idMapper.preview_column(df)

Unnamed: 0,mapping,count
0,"original value: ""nan"" -> HP: Intellectual disability (HP:0001249) (not measured)",7
1,"original value: ""yes"" -> HP: Intellectual disability (HP:0001249) (observed)",17
2,"original value: ""no"" -> HP: Intellectual disability (HP:0001249) (excluded)",3


In [12]:
asdMapper = SimpleColumnMapper(hpo_id="HP:0000729", hpo_label="Autistic behavior",
                                 observed="yes", excluded="no", column_name="ASD traits")
column_mapper_list.append(asdMapper)
asdMapper.preview_column(df)

Unnamed: 0,mapping,count
0,"original value: ""nan"" -> HP: Autistic behavior (HP:0000729) (not measured)",6
1,"original value: ""yes"" -> HP: Autistic behavior (HP:0000729) (observed)",15
2,"original value: ""no"" -> HP: Autistic behavior (HP:0000729) (excluded)",6


In [13]:
behavioralMapper =  SimpleColumnMapper(hpo_id="HP:0000708", hpo_label="Atypical behavior",
                                 observed="yes", excluded="no", column_name="Behavioral problems")
column_mapper_list.append(behavioralMapper)
behavioralMapper.preview_column(df)

Unnamed: 0,mapping,count
0,"original value: ""no"" -> HP: Atypical behavior (HP:0000708) (excluded)",7
1,"original value: ""yes"" -> HP: Atypical behavior (HP:0000708) (observed)",18
2,"original value: ""nan"" -> HP: Atypical behavior (HP:0000708) (not measured)",2


In [14]:
# Omitting "Facial dys-morphism" as detailed descriptions not available
gu_d = {
 'hypospadias': 'Hypospadias',
 'pielonephritis': 'Pyelonephritis',
 'vesicoureteral reflux': 'Vesicoureteral reflux',
 'single renal cyst': 'Renal cyst',
 'cryptorchidism': 'Cryptorchidism',
 'enuresis': 'Enuresis',
 'ambiguous genitalia': 'Ambiguous genitalia',
 'ectopic kidney': 'Ectopic kidney',
 'pyelectasis': 'Dilatation of the renal pelvis'}
excluded = {}
guMapper = OptionColumnMapper(column_name="GU anomalies", concept_recognizer=hpo_cr, option_d=gu_d, excluded_d=excluded)
column_mapper_list.append(guMapper)
guMapper.preview_column(df)

Unnamed: 0,mapping,count
0,Hypospadias (HP:0000047) (observed),3
1,Pyelonephritis (HP:0012330) (observed),1
2,Vesicoureteral reflux (HP:0000076) (observed),2
3,Renal cyst (HP:0000107) (observed),1
4,Cryptorchidism (HP:0000028) (observed),2
5,Enuresis (HP:0000805) (observed),2
6,Ambiguous genitalia (HP:0000062) (observed),1
7,Ectopic kidney (HP:0000086) (observed),1
8,Dilatation of the renal pelvis (HP:0010946) (observed),1
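
The option dictionary maps free-text fragments from the table to HPO labels. A simple way to picture this is case-insensitive substring matching of each key against the cell text, sketched below. `match_options` is a hypothetical helper and the matching rule is an assumption for illustration; the concept recognizer in the real mapper can also pick up terms that are not in the dictionary.

```python
def match_options(cell, option_d):
    """Return the HPO labels whose dictionary key occurs in the cell text."""
    if not isinstance(cell, str):
        return []
    text = cell.lower()
    # Keys are tested in dictionary order; each matching key contributes its label
    return [label for key, label in option_d.items() if key.lower() in text]


# A subset of the dictionary used in the cell above
gu_subset = {
    "hypospadias": "Hypospadias",
    "pielonephritis": "Pyelonephritis",
    "cryptorchidism": "Cryptorchidism",
}
print(match_options("hypospadias, cryptorchidism", gu_subset))
print(match_options("mild short stature",
                    {"mild short stature": "Mild short stature",
                     "short stature": "Short stature"}))
```

With pure substring matching, overlapping keys such as `short stature` and `mild short stature` both fire on the same cell, which is one reason to check redundant terms later in the workflow.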


In [15]:
other_d = {
 'fasting and heat intolerance': 'Heat intolerance',
 'excessive fatigue': 'Fatigue',
 'GERD': 'Gastroesophageal reflux',
 'mild short stature': 'Mild short stature',
 'constipation': 'Constipation',
 'short stature': 'Short stature',
 'microcephaly': 'Microcephaly',
 'myopia': 'Myopia',
 'retinopathy': 'Retinopathy',
 'GI dysmotility': 'Gastrointestinal dysmotility',
 'kyphoscoliosis': 'Kyphoscoliosis',
 'kyphosis': 'Kyphosis',
 'Madelung deformity': 'Madelung deformity',
 'scoliosis': 'Scoliosis',
 'reflux': 'Gastroesophageal reflux',
 'joint laxity': 'Joint hypermobility',
 'volvulus of midgut': 'Volvulus',
 'pancreatic cysts': 'Pancreatic cysts'}
excluded = {}
otherMapper = OptionColumnMapper(column_name="Other", concept_recognizer=hpo_cr, option_d=other_d, excluded_d=excluded)
column_mapper_list.append(otherMapper)
otherMapper.preview_column(df)

Unnamed: 0,mapping,count
0,Abnormal heart morphology (HP:0001627) (observed),1
1,Heat intolerance (HP:0002046) (observed),1
2,Fatigue (HP:0012378) (observed),1
3,Gastroesophageal reflux (HP:0002020) (observed),4
4,Mild short stature (HP:0003502) (observed),1
5,Constipation (HP:0002019) (observed),2
6,Short stature (HP:0004322) (observed),5
7,Microcephaly (HP:0000252) (observed),5
8,Myopia (HP:0000545) (observed),1
9,Retinopathy (HP:0000488) (observed),1


In [16]:
ZMYM3_transcript = "NM_005096.3"

vman = VariantManager(df=df,
                      individual_column_name="Individual", 
                      allele_1_column_name="NM_005096.3",
                      gene_symbol="ZMYM3", 
                      transcript=ZMYM3_transcript)

[INFO] encoding variant "c.205G>A"
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_005096.3%3Ac.205G>A/NM_005096.3?content-type=application%2Fjson
[INFO] encoding variant "c.3371G>A"
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_005096.3%3Ac.3371G>A/NM_005096.3?content-type=application%2Fjson
[INFO] encoding variant "c.507A>T"
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_005096.3%3Ac.507A>T/NM_005096.3?content-type=application%2Fjson
[INFO] encoding variant "c.1192C>T"
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_005096.3%3Ac.1192C>T/NM_005096.3?content-type=application%2Fjson
[INFO] encoding variant "c.2193G>C"
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_005096.3%3Ac.2193G>C/NM_005096.3?content-type=application%2Fjson
[INFO] encoding variant "c.1360T>C"
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_005096.
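
The logged URLs follow a regular pattern: genome build, the percent-encoded `transcript:variant` expression, the transcript again, and a `content-type` query parameter. The sketch below reconstructs that pattern with the standard library only and makes no network request. `variant_validator_url` is a hypothetical helper, and the encoding choices (escaping `:` but not `>`) are inferred from the log lines above.

```python
from urllib.parse import quote

VV_BASE = "https://rest.variantvalidator.org/VariantValidator/variantvalidator"


def variant_validator_url(transcript, cdna, genome="hg38"):
    """Build a request URL in the same shape as the logged ones."""
    # Percent-encode ':' but keep '>' literal, matching the log output
    hgvs = quote(f"{transcript}:{cdna}", safe=">")
    content_type = quote("application/json", safe="")
    return f"{VV_BASE}/{genome}/{hgvs}/{transcript}?content-type={content_type}"


url = variant_validator_url("NM_005096.3", "c.205G>A")
print(url)
```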

In [17]:
XLID112 = Disease(disease_label="Intellectual developmental disorder, X-linked 112", 
                  disease_id="OMIM:301111")
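# NOTE: default_genotype is applied to every individual. The "Zygosity" column of the
# source table reports "hemizygous" for the males shown above, so confirm that this
# default is intended for an X-linked gene.
varMapper = VariantColumnMapper(variant_column_name="NM_005096.3",
                               variant_d=vman.get_variant_d(),
                               default_genotype="heterozygous")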

encoder = CohortEncoder(df=df, 
                        hpo_cr=hpo_cr, 
                        column_mapper_list=column_mapper_list, 
                        individual_column_name="Individual",
                        age_at_last_encounter_mapper=AgeColumnMapper.not_provided(), 
                        sexmapper=sexMapper,
                        variant_mapper=varMapper,
                        metadata=metadata)
encoder.set_disease(XLID112)

In [18]:
individuals = encoder.get_individuals()

In [19]:
cvalidator = CohortValidator(cohort=individuals,
                             ontology=hpo_ontology,
                             min_hpo=1,
                             allelic_requirement=AllelicRequirement.MONO_ALLELIC)
qc = QcVisualizer(cohort_validator=cvalidator)
display(HTML(qc.to_summary_html()))

Level,Error category,Count
ERROR,CONFLICT,2
ERROR,INSUFFICIENT_HPOS,1
WARNING,REDUNDANT,15
INFORMATION,NOT_MEASURED,17

ID,Level,Category,Message,HPO Term
PMID_36586412_16,ERROR,INSUFFICIENT_HPOS,Minimum HPO terms required 1 but only 0 found,
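
The REDUNDANT and CONFLICT categories both depend on the ontology hierarchy. Our reading is that a term is redundant when a more specific descendant is also observed, and conflicting when it is excluded while a descendant is observed. The sketch below illustrates that idea with a tiny hand-written parent map. The two parent-child pairs are assumed from HPO for illustration, the real ontology is a graph with multiple parents, and `find_issues` is a hypothetical helper.

```python
# Simplified child -> parent map (assumption: single parent per term)
PARENT = {
    "Autistic behavior": "Atypical behavior",
    "Mild short stature": "Short stature",
}


def ancestors(term):
    """Collect all ancestors of a term by walking up the parent map."""
    found = set()
    while term in PARENT:
        term = PARENT[term]
        found.add(term)
    return found


def find_issues(observed, excluded):
    """Return (redundant, conflicting) term sets for one individual."""
    observed_ancestors = set()
    for term in observed:
        observed_ancestors |= ancestors(term)
    redundant = set(observed) & observed_ancestors
    conflicting = set(excluded) & observed_ancestors
    return redundant, conflicting


print(find_issues({"Autistic behavior", "Atypical behavior"}, set()))
print(find_issues({"Autistic behavior"}, {"Atypical behavior"}))
```

Under this reading, an individual with "yes" in both the ASD traits and Behavioral problems columns would keep only the more specific term, which is consistent with the rows of the phenopacket table in this notebook.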


In [20]:
individuals = cvalidator.get_error_free_individual_list()
table = PhenopacketTable(individual_list=individuals, metadata=metadata)
display(HTML(table.to_html()))

Individual,Disease,Genotype,Phenotypic features
1 (MALE; n/a),"Intellectual developmental disorder, X-linked 112 (OMIM:301111)",NM_005096.3:c.205G>A (heterozygous),Delayed speech and language development (HP:0000750); Motor delay (HP:0001270); Abnormal heart morphology (HP:0001627); excluded: Atypical behavior (HP:0000708)
21 (MALE; n/a),"Intellectual developmental disorder, X-linked 112 (OMIM:301111)",NM_005096.3:c.507A>T (heterozygous),Delayed speech and language development (HP:0000750); Intellectual disability (HP:0001249); Autistic behavior (HP:0000729); Hypospadias (HP:0000047); excluded: Motor delay (HP:0001270)
2 (MALE; n/a),"Intellectual developmental disorder, X-linked 112 (OMIM:301111)",NM_005096.3:c.721G>A (heterozygous),Delayed speech and language development (HP:0000750); Motor delay (HP:0001270); Heat intolerance (HP:0002046); Fatigue (HP:0012378); excluded: Intellectual disability (HP:0001249); excluded: Autistic behavior (HP:0000729)
3 (MALE; n/a),"Intellectual developmental disorder, X-linked 112 (OMIM:301111)",NM_005096.3:c.905G>A (heterozygous),Delayed speech and language development (HP:0000750); Motor delay (HP:0001270); Autistic behavior (HP:0000729); Pyelonephritis (HP:0012330); Vesicoureteral reflux (HP:0000076); Gastroesophageal reflux (HP:0002020)
4a (MALE; n/a),"Intellectual developmental disorder, X-linked 112 (OMIM:301111)",NM_005096.3:c.1183C>A (heterozygous),Delayed speech and language development (HP:0000750); Motor delay (HP:0001270); Intellectual disability (HP:0001249); Autistic behavior (HP:0000729); Hypospadias (HP:0000047)
4b (MALE; n/a),"Intellectual developmental disorder, X-linked 112 (OMIM:301111)",NM_005096.3:c.1183C>A (heterozygous),Delayed speech and language development (HP:0000750); Intellectual disability (HP:0001249); Autistic behavior (HP:0000729); excluded: Motor delay (HP:0001270)
5 (MALE; n/a),"Intellectual developmental disorder, X-linked 112 (OMIM:301111)",NM_005096.3:c.1192C>T (heterozygous),Delayed speech and language development (HP:0000750); Motor delay (HP:0001270); Intellectual disability (HP:0001249); Autistic behavior (HP:0000729)
22 (MALE; n/a),"Intellectual developmental disorder, X-linked 112 (OMIM:301111)",NM_005096.3:c.1321C>T (heterozygous),Delayed speech and language development (HP:0000750); Intellectual disability (HP:0001249); Autistic behavior (HP:0000729); Mild short stature (HP:0003502); excluded: Motor delay (HP:0001270)
6 (MALE; n/a),"Intellectual developmental disorder, X-linked 112 (OMIM:301111)",NM_005096.3:c.1322G>A (heterozygous),Delayed speech and language development (HP:0000750); Motor delay (HP:0001270); Intellectual disability (HP:0001249); Autistic behavior (HP:0000729); Renal cyst (HP:0000107); Constipation (HP:0002019)
7 (MALE; n/a),"Intellectual developmental disorder, X-linked 112 (OMIM:301111)",NM_005096.3:c.1322G>A (heterozygous),Delayed speech and language development (HP:0000750); Motor delay (HP:0001270); Intellectual disability (HP:0001249); Autistic behavior (HP:0000729); Cryptorchidism (HP:0000028); Enuresis (HP:0000805); Short stature (HP:0004322)


In [21]:
Individual.output_individuals_as_phenopackets(individual_list=individuals, metadata=metadata)

We output 26 GA4GH phenopackets to the directory phenopackets
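
For orientation, each exported file is a JSON document following the GA4GH Phenopacket Schema. The sketch below builds a heavily abbreviated example for individual 1 using only the `id`, `subject`, and `phenotypicFeatures` fields. The real files also contain interpretations, disease, and metadata sections. The identifier format is assumed from the validation output above, and `minimal_phenopacket` is a hypothetical helper, not the pyphetools exporter.

```python
import json


def minimal_phenopacket(pp_id, subject_id, sex, observed, excluded):
    """Build an abbreviated phenopacket-like dict from (HPO id, label) pairs."""
    features = [{"type": {"id": i, "label": l}} for i, l in observed]
    # Excluded findings carry an explicit flag
    features += [{"type": {"id": i, "label": l}, "excluded": True}
                 for i, l in excluded]
    return {
        "id": pp_id,
        "subject": {"id": subject_id, "sex": sex},
        "phenotypicFeatures": features,
    }


packet = minimal_phenopacket(
    pp_id="PMID_36586412_1",  # assumed identifier format
    subject_id="1",
    sex="MALE",
    observed=[("HP:0000750", "Delayed speech and language development"),
              ("HP:0001270", "Motor delay"),
              ("HP:0001627", "Abnormal heart morphology")],
    excluded=[("HP:0000708", "Atypical behavior")],
)
print(json.dumps(packet, indent=2))
```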
