<h1>GLI3: Demurger et al 2015</h1>
<p>Extract the clinical data from <a href="https://pubmed.ncbi.nlm.nih.gov/24736735/"target="__blank">Démurger F, et al. (2015) New insights into genotype-phenotype correlation for GLI3 mutations. Eur J Hum Genet ;23(1):92-102. PMID:24736735</a>.<p>
<p>Table 1 (and Supplemental Table 1) present data for Greig cephalopolysyndactyly syndrome (GCPS; MIM# 175700).</p>
<p>Table 2 (and Supplemental Table 2) present data for Pallister–Hall syndrome (PHS; MIM# 146510).</p>

In [1]:
import phenopackets as PPkt
from google.protobuf.json_format import MessageToDict, MessageToJson
from google.protobuf.json_format import Parse, ParseDict
import pandas as pd
import math
from csv import DictReader
pd.set_option('display.max_colwidth', None) # show entire column contents, important!
from collections import defaultdict
import re
import pyphetools
from pyphetools.creation import *
from pyphetools.output import PhenopacketTable
print(f"pyphetools version {pyphetools.__version__}")

pyphetools version 0.6.3


In [2]:
parser = HpoParser()
hpo_cr = parser.get_hpo_concept_recognizer()
hpo_version = parser.get_version()
pmid = "PMID:24736735"
title = "New insights into genotype-phenotype correlation for GLI3 mutations"
metadata = MetaData(created_by="ORCID:0000-0002-0736-9199", pmid=pmid, pubmed_title=title)
metadata.default_versions_with_hpo(version=hpo_version)
pmid="PMID:29198722"

<H2>Greig cephalopolysyndactyly syndrome (GCPS; MIM# 175700)</H2>
<p>c.1543_1544dup) was found in two affected sibs, was present at low level in DNA extracted from blood of their father (Family G068), suggesting a somatic mosaicism. We therefore remove the row corresponding to the father from further analysis.</p>
<p>Along the same line, a FISH analysis revealed a GLI3 deletion in only 56% of blood cells of a patient (G059) with bilateral preaxial PD of the feet and developmental delay. At least two patients (G005 and G019) had Greig cephalopolysyndactyly contiguous gene syndrome (GCPS-CGS) caused by haploinsufficiency of GLI3 and adjacent genes confirmed by array-CGH with a deletion of 7 and 9 Mb, respectively.</p>
<p>These individuals were also removed from the analysis because of the multifactorial pathophysiology.</p>
<p>We removed the corresponding rows from the following table.</p>

In [3]:
df1 = pd.read_csv("input/demurger_table_1.csv", delimiter="\t")
df1.head()

Unnamed: 0,N,cDNA alteration,Predicted protein alteration,Inheritance,Postaxial PD,Preaxial PD,Broad thumbs or halluces,Syndactyly,Macrocephaly,Widely spaced eyes,MRI Findings,Developmental delay,Additional findings
0,G029,327del,Phe109Leufs*50,F,–,FB,,+,–,–,–,–,"Precocious puberty, scaphocephaly"
1,G070,427G>T,Glu143*,F,HB,FL,,–,+,,,–,
2,G070_Mother,427G>T,Glu143*,F,–,–,,–,–,–,,–,
3,G118,444C>A,Tyr148*,F,–,FB,BT,+,+,,,–,
4,G13684,444C>A,Tyr148*,F,–,FB,,+,+,+,,–,


In [4]:
column_mapper_d = defaultdict(ColumnMapper)

In [5]:
postaxial_d = {'HB': 'Postaxial hand polydactyly',
              'FB': 'Postaxial foot polydactyly',}
excluded_d = {"–":'Postaxial polydactyly'}
postaxialMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=postaxial_d, excluded_d=excluded_d)
postaxialMapper.preview_column(df1["Postaxial PD"])
column_mapper_d["Postaxial PD"] = postaxialMapper

In [6]:
preaxial_d = {'HB': 'Preaxial hand polydactyly',
              'FB': 'Preaxial foot polydactyly',
              'FL': 'Preaxial foot polydactyly',}
excluded_d = {"–":'Preaxial polydactyly'}
preaxialMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=preaxial_d, excluded_d=excluded_d)
preaxialMapper.preview_column(df1["Preaxial PD"])
column_mapper_d["Preaxial PD"] = preaxialMapper

In [7]:
thumb_d = {"BT": "Broad thumb", 
          "BH": "Broad hallux",
          "+": [ "Broad thumb", "Broad hallux"]}
excluded = {"–": "Broad thumb"}
thumbMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=thumb_d,excluded_d=excluded_d)
thumbMapper.preview_column(df1["Broad thumbs or halluces"])
column_mapper_d["Broad thumbs or halluces"] = thumbMapper

In [8]:
syndMapper = SimpleColumnMapper(hpo_id="HP:0001159", hpo_label="Syndactyly", observed="+", excluded="–")
syndMapper.preview_column(df1["Syndactyly"])
column_mapper_d["Syndactyly"] = syndMapper

In [9]:
macMapper = SimpleColumnMapper(hpo_id="HP:0000256", hpo_label="Macrocephaly", observed="+", excluded="–")
macMapper.preview_column(df1["Macrocephaly"])
column_mapper_d["Macrocephaly"] = macMapper

In [10]:
#Widely spaced eyes  Hypertelorism HP:0000316
htMapper = SimpleColumnMapper(hpo_id="HP:0000316", hpo_label="Hypertelorism", observed="+", excluded="–")
htMapper.preview_column(df1["Widely spaced eyes"])
column_mapper_d["Widely spaced eyes"] = htMapper

In [11]:
# MRI Findings
mri_d = {'CCH': 'Hypoplasia of the corpus callosum',
         'CCA': 'Agenesis of corpus callosum',
         'pCCA': 'Partial agenesis of the corpus callosum',
         'VD': 'Ventriculomegaly'}
mriMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=mri_d)
mriMapper.preview_column(df1["MRI Findings"])
column_mapper_d["MRI Findings"] = mriMapper

In [12]:
dd_d = {'+': 'Global developmental delay',
         'Mild': 'Mild global developmental delay',
         'Bilateral inguinal hernia': 'Inguinal hernia',
         'strabismus': 'Strabismus',
       "Cataract": "Cataract",
       "Seizures":"Seizure",
       "horseshoe kidney": "Horseshoe kidney",
       "Trigonocephaly": "Trigonocephaly",
       "macrosomia": "Macrosomia",
       "vermis dysgenesis": "Dysgenesis of the cerebellar vermis"}
excluded_d = {"–": "Global developmental delay"}
ddMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=dd_d, excluded_d=excluded_d)
ddMapper.preview_column(df1["Developmental delay"])
column_mapper_d["Developmental delay"] = ddMapper

In [13]:
df1["Additional findings"].unique()

array(['Precocious puberty, scaphocephaly', '\xa0', 'Delta phalanx',
       'Atrial septal defect', 'Umbilical hernia',
       'Bifid distal phalanx, BW= 4150', 'Cerebral prematurity sequelae',
       'Delta metacarpal, BW=4880', 'BW=4740',
       'Hypoplastic cerebellum, microretrognathism',
       'Bilateral keratoconus, umbilical and bilateral inguinal hernia',
       'Macrosomia', 'Neurofibromatosis type 1',
       'Brachydactyly, delta phalanx', 'Brachydactyly, speech delay',
       'Speech delay, exomphalos', 'Umbilical hernia, anterior anus',
       'Laryngomalacia', 'BW=4440', 'Supernumerary nipples'], dtype=object)

In [14]:
add_d = {'Precocious puberty': 'Precocious puberty',
         'scaphocephaly': 'Scaphocephaly',
         'Delta phalanx': 'Triangular shaped phalanges of the hand',
         'Bifid distal phalanx': 'Partial duplication of the distal phalanges of the hand',
       "Hypoplastic cerebellum": "Cerebellar hypoplasia",
       "microretrognathism":"Microretrognathia",
       "keratoconus": "Keratoconus",
       "umbilical": "Umbilical hernia",
         "Umbilical hernia": "Umbilical hernia",
       "inguinal hernia": "Inguinal hernia",
       "Macrosomia": "Large for gestational age",
        "Brachydactyly":"Brachydactyly",
         "anterior anus": "Anteriorly placed anus",
         "Laryngomalacia":"Laryngomalacia",
         "Supernumerary nipples": "Supernumerary nipple"
        }
addMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=add_d)
addMapper.preview_column(df1["Additional findings"])
column_mapper_d["Additional findings"] = addMapper

<h2>GLI3 Variants</h2>
<p>Variants are provided in table 1 according to NM_000168.6.</p>
<p>Note that the contents of the column "cDNA alteration" do not have the "c." required by HGVS, so we add it to all columns before proceding.</p>

In [15]:
gli3_transcript='NM_000168.6'
genome = 'hg38'
#varMapper = VariantColumnMapper(assembly=genome,
#                                column_name='cDNA alteration', 
#                                transcript=transcript, 
#                                default_genotype='heterozygous')
hgvsMapper = VariantValidator(genome_build=genome, transcript=gli3_transcript)

<h3>Small and Structural variants</h3>
<p>We encode the small variants using HGVS and trhe structural variants using the StructuralVariant class</p>

In [16]:
struct_variants = { "rsa7p14.1(kit P179)x1",
                    "46,XY.ish del(7)(p14.1)(RP11-816F16-)",
                    "46,XX.ish del(7)(p14.1p14.1)(GLI3-)" }
gli3_symbol = "GLI3"
gli3_id = "HGNC:4319"
gli3_variants = df1['cDNA alteration'].unique()
gli3_variant_d = defaultdict(Variant)
for gli3v in gli3_variants:
    if gli3v in struct_variants:
        sv = StructuralVariant.chromosomal_deletion(cell_contents=gli3v, gene_id=gli3_id, gene_symbol=gli3_symbol)
        print(gli3v)
        gli3_variant_d[gli3v] = sv
    else:
        hgvs = f"c.{gli3v}"
        v = hgvsMapper.encode_hgvs(gli3v)
        gli3_variant_d[gli3v] = v

https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_000168.6%3A327del/NM_000168.6?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_000168.6%3A427G>T/NM_000168.6?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_000168.6%3A444C>A/NM_000168.6?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_000168.6%3A518dup/NM_000168.6?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_000168.6%3A679+1G>T/NM_000168.6?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_000168.6%3A833_843del/NM_000168.6?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_000168.6%3A868C>T/NM_000168.6?content-type=application%2Fjson
https://rest.variantvalidato

In [17]:
variantMapper = VariantColumnMapper(variant_d=gli3_variant_d,
                                   variant_column_name="cDNA alteration",
                                    default_genotype='heterozygous'
                                   )

In [18]:
omim_id = "OMIM:175700"
omim_label = "Greig cephalopolysyndactyly syndrome"
encoder = CohortEncoder(df=df1, 
                        hpo_cr=hpo_cr, 
                        column_mapper_d=column_mapper_d, 
                        individual_column_name="N", 
                        metadata=metadata,
                        agemapper=AgeColumnMapper.not_provided(), 
                        sexmapper=SexColumnMapper.not_provided(),
                        variant_mapper=variantMapper,
                        pmid=pmid)
encoder.set_disease(disease_id=omim_id, label=omim_label)

In [19]:
gcps_individuals = encoder.get_individuals()

In [20]:
Individual.output_individuals_as_phenopackets(individual_list=gcps_individuals, 
                                              pmid=pmid,
                                              metadata=metadata.to_ga4gh(),
                                              outdir="phenopackets")

We output 51 GA4GH phenopackets to the directory phenopackets


In [21]:
from IPython.display import HTML, display
phenopackets = [i.to_ga4gh_phenopacket(metadata=metadata.to_ga4gh()) for i in gcps_individuals]
table = PhenopacketTable(phenopacket_list=phenopackets)
display(HTML(table.to_html()))

Individual,Genotype,Phenotypic features
G029 (UNKNOWN; ),NM_000168.6:c.327del (heterozygous),Preaxial foot polydactyly (HP:0001841); Syndactyly (HP:0001159); Precocious puberty (HP:0000826); Scaphocephaly (HP:0030799)
G070 (UNKNOWN; ),NM_000168.6:c.427G>T (heterozygous),Postaxial hand polydactyly (HP:0001162); Preaxial foot polydactyly (HP:0001841); Macrocephaly (HP:0000256); Hypertelorism (HP:0000316)
G070_Mother (UNKNOWN; ),NM_000168.6:c.427G>T (heterozygous),
G118 (UNKNOWN; ),NM_000168.6:c.444C>A (heterozygous),Preaxial foot polydactyly (HP:0001841); Broad thumb (HP:0011304); Syndactyly (HP:0001159); Macrocephaly (HP:0000256); Hypertelorism (HP:0000316)
G13684 (UNKNOWN; ),NM_000168.6:c.444C>A (heterozygous),Preaxial foot polydactyly (HP:0001841); Syndactyly (HP:0001159); Macrocephaly (HP:0000256); Hypertelorism (HP:0000316)
G13684_Brother (UNKNOWN; ),NM_000168.6:c.444C>A (heterozygous),Preaxial foot polydactyly (HP:0001841); Syndactyly (HP:0001159); Hypertelorism (HP:0000316)
G13684_Mother (UNKNOWN; ),NM_000168.6:c.444C>A (heterozygous),Preaxial foot polydactyly (HP:0001841); Syndactyly (HP:0001159); Macrocephaly (HP:0000256); Hypertelorism (HP:0000316)
G099 (UNKNOWN; ),NM_000168.6:c.518dup (heterozygous),Preaxial foot polydactyly (HP:0001841); Syndactyly (HP:0001159); Macrocephaly (HP:0000256)
G048 (UNKNOWN; ),NM_000168.6:c.679+1G>T (heterozygous),Postaxial hand polydactyly (HP:0001162); Preaxial foot polydactyly (HP:0001841); Preaxial hand polydactyly (HP:0001177); Syndactyly (HP:0001159); Macrocephaly (HP:0000256); Hypertelorism (HP:0000316); Ventriculomegaly (HP:0002119); Triangular shaped phalanges of the hand (HP:0009774)
G15198 (UNKNOWN; ),NM_000168.6:c.833_843del (heterozygous),Postaxial foot polydactyly (HP:0001830); Postaxial hand polydactyly (HP:0001162); Preaxial foot polydactyly (HP:0001841); Preaxial hand polydactyly (HP:0001177); Broad thumb (HP:0011304); Syndactyly (HP:0001159); Macrocephaly (HP:0000256); Hypertelorism (HP:0000316)


<h1> Pallister–Hall syndrome (PHS; MIM# 146510)</h1>
<p>The second half of this notebook extracts data about PHS from supplemental table 2.</p>

In [22]:
df2 = pd.read_csv("input/demurger_table_2.csv", delimiter="\t")
df2.head()

Unnamed: 0,N,cDNA,Predicted protein alteration,Inheritance,Growth delay/GH deficiency,Insertional/postaxial PD,Brachytelephalangism/dactyly,Y-shaped metacarpal/metatarsal,Hypothalamic hamartoma,Craniofacial anomalies,Anal atresia,Bifid epiglottis,Cardiac anomalies,Renal anomalies,Genital anomalies,Lung dysplasia,Intellectual deficiency,Nail dysplasia,Other findings
0,P15112,1995del,Gly666Alafs*27,De novo,-,-,+,+,+,+,+,+,-,-,-,-,-,+,"Overlapping toes, preauricular tag"
1,G097,2072del,Gln691Argfs*2,De novo,+,+,,+,+,-,-,,-,-,+,-,-,,"Micropenis, thin CC"
2,G085,2123_2126del,Gly708Valfs*24,Familial,-,+,,+,,-,-,,-,,-,,-,+,
3,Father,Gly708Valfs*24,Gly708Valfs*24,,+,+,,+,,-,-,,-,,+,,-,,Micropenis
4,Aunt,Gly708Valfs*24,Gly708Valfs*24,,,+,+,,,-,-,,-,,-,,-,+,


<p>Note that the HPO parser and the metadata object can be reused.</p>

In [23]:
df2.columns

Index(['N', 'cDNA', 'Predicted protein alteration', 'Inheritance',
       'Growth delay/GH deficiency', 'Insertional/postaxial PD',
       'Brachytelephalangism/dactyly', 'Y-shaped metacarpal/metatarsal',
       'Hypothalamic hamartoma', 'Craniofacial anomalies', 'Anal atresia',
       'Bifid epiglottis', 'Cardiac anomalies', 'Renal anomalies',
       'Genital anomalies', 'Lung dysplasia', 'Intellectual deficiency',
       'Nail dysplasia', 'Other findings'],
      dtype='object')

In [24]:
generator = SimpleColumnMapperGenerator(df=df2, observed="+", excluded="-", hpo_cr=hpo_cr)

In [25]:
# initialize the column_mapper_d with parsed simple columns
column_mapper_d = generator.try_mapping_columns()

In [26]:
generator.get_mapped_columns()

['Hypothalamic hamartoma',
 'Anal atresia',
 'Bifid epiglottis',
 'Cardiac anomalies',
 'Renal anomalies',
 'Genital anomalies',
 'Nail dysplasia']

In [27]:
generator.get_unmapped_columns()

['N',
 'cDNA',
 'Predicted protein alteration',
 'Inheritance',
 'Growth delay/GH deficiency',
 'Insertional/postaxial PD',
 'Brachytelephalangism/dactyly',
 'Y-shaped metacarpal/metatarsal',
 'Craniofacial anomalies',
 'Lung dysplasia',
 'Intellectual deficiency',
 'Other findings']

In [28]:
# Growth delay HP:0001510
label_d = {
    'Growth delay/GH deficiency': ["Growth delay", "HP:0001510"],
    'Brachytelephalangism/dactyly': ["Shortening of all distal phalanges of the fingers", "HP:0006118"], # Brachyteledactylyly
    'Craniofacial anomalies' : ["Abnormality of the face", "HP:0000271"],
    'Lung dysplasia': ["Abnormal lung development", "HP:4000059"],
    'Intellectual deficiency': ["Intellectual disability", "HP:0001249"]
}

for k, v in label_d.items():
    column_name = k
    print(column_name)
    hpo_label = v[0]
    hpo_id = v[1]
    mapper = SimpleColumnMapper(hpo_id=hpo_id, hpo_label=hpo_label, observed="+", excluded="-")
    #print(mapper.preview_column(df2[column_name]))
    column_mapper_d[column_name] = mapper

Growth delay/GH deficiency
Brachytelephalangism/dactyly
Craniofacial anomalies
Lung dysplasia
Intellectual deficiency


In [29]:
# 'Y-shaped metacarpal/metatarsal'
# Y-shaped metacarpals HP:0006042
# Y-shaped metatarsals HP:0010567
y_d = {'+': "Y-shaped metacarpals", "Y-shaped metatarsals"]}
y_excluded = {"-": ["Y-shaped metacarpals", "Y-shaped metatarsals"]}
yMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=y_d, excluded_d=y_excluded)
yMapper.preview_column(df2['Y-shaped metacarpal/metatarsal'])
column_mapper_d['Y-shaped metacarpal/metatarsal'] = yMapper
#df2['Y-shaped metacarpal/metatarsal']

SyntaxError: closing parenthesis ']' does not match opening parenthesis '{' (2491626658.py, line 4)

In [30]:
other_findings_d = {'Overlapping toes': 'Overlapping toe',
 'preauricular tag': 'Preauricular skin tag',
 'Micropenis': 'Micropenis',
 'thin CC': 'Thin corpus callosum',
 'Sacrococcygeal teratoma': 'Sacrococcygeal teratoma',
 'conical teeth': 'Conical tooth',
 'cryptorchidism': 'Cryptorchidism',
 'micropenis': 'Micropenis',
 'syndactyly': 'Syndactyly',
 'unilateral renal agenesis': 'Unilateral renal agenesis',
 'fine motor delay': 'Motor delay',
 'choanal atresia': 'Choanal atresia',
 'fine DD': 'Global developmental delay',
 'Scoliosis': 'Scoliosis',
 'dental malposition': 'Tooth malposition',
 'Oligohydramnios': 'Oligohydramnios',
 'Seizures': 'Seizure',
 'panhypopituitarism': 'Panhypopituitarism',
 'renal hypoplasia': 'Renal hypoplasia',
 'Syndactyly': 'Syndactyly',
 'Agnathia': 'Mandibular aplasia',
 'hypoplastic maxillary': 'Hypoplasia of the maxilla',
 #'absence of oral orifice': 'PLACEHOLDER',
 'bilateral choanal atresia': 'Bilateral choanal atresia',
 'oligosyndactyly': 'Syndactyly',
 'arthrogryposis': 'Arthrogryposis multiplex congenita',
 'mesomelia bilateral radio-ulnar bowing': 'Mesomelia',
 'absence of tibia and fibula': 'Absent tibia',
 'bilateral renal agenesis': 'Bilateral renal agenesis',
 'pituitary gland agenesis': 'Anterior pituitary agenesis',
 'adrenal agenesis': 'Renal agenesis',
 'uterovaginal aplasia': 'Aplasia of the uterus',
 #'AVC': 'PLACEHOLDER',
 'CCA': 'Agenesis of corpus callosum',
 'microcephaly': 'Microcephaly',
 'Posterior cleft palate': 'Cleft palate',
 'micrognathia': 'Micrognathia',
 'micromelia': 'Micromelia',
 'club feet': 'Talipes equinovarus',
 'adrenal gland hypoplasia': 'Adrenal hypoplasia',
 'anteposed anus': 'Anteriorly placed anus',
 'Bilateral choanal atresia': 'Bilateral choanal atresia',
 'retrognathia': 'Retrognathia',
 'posterior cleft palate': 'Cleft palate',
 #'ear dysplasia': 'PLACEHOLDER',
 #'cervical chondroma': 'PLACEHOLDER',
 'adrenal and pituitary gland agenesis': 'Adrenal gland agenesis',
 #'abnormal aortic arch': 'PLACEHOLDER',
 'Premaxillary agenesis': 'Aplasia of the premaxilla',
 'microretrognathism': 'Microretrognathia',
 'arhinencephaly': 'Arrhinencephaly',
 'hygroma colli': 'Cystic hygroma',
 'intestinal malrotation': 'Intestinal malrotation',
 #'IAC': 'PLACEHOLDER',
 'adrenal gland agenesis': 'Adrenal gland agenesis',
 'Hypertelorism': 'Hypertelorism',
 'retrognatism': 'Retrognathia',
 'cleft palate': 'Cleft palate',
 'abnormal metacarpals': 'Abnormal metacarpal morphology',
# 'Limited ankle mobility': 'PLACEHOLDER',
 'hypopituitarism': 'Hypopituitarism',
 'hypospadias': 'Hypospadias',
 'speech delay': 'Delayed speech and language development',
 'gelastic seizures': 'Focal emotional seizure with laughing'}
other_findingsMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=other_findings_d)
other_findingsMapper.preview_column(df2['Other findings'])
column_mapper_d['Other findings'] = other_findingsMapper

<h2>GLI3 variants</h2>

In [35]:
gli3_variants2 = df2['cDNA'].unique()
gli3_variant_d = defaultdict(Variant)
# NM_000168.6:c.2121del leads to NP_000159.3:p.(Gly708ValfsTer25)
# the exact HGVS cannot be determinded from the original publication, but the variant is equivalent for all intents and purposes
hgvs_d = {"Gly708Valfs*24": "2121del" } 
for gli3v in gli3_variants2:
    # correct HGVS if needed
    if gli3v in hgvs_d:
        gli3variant = hgvs_d.get(gli3v)
    else:
        gli3variant = gli3v
    print(gli3v)
    hgvs = f"c.{gli3v}"
    v = hgvsMapper.encode_hgvs(gli3variant)
    gli3_variant_d[gli3v] = v

1995del
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_000168.6%3A1995del/NM_000168.6?content-type=application%2Fjson
2072del
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_000168.6%3A2072del/NM_000168.6?content-type=application%2Fjson
2123_2126del
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_000168.6%3A2123_2126del/NM_000168.6?content-type=application%2Fjson
Gly708Valfs*24
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_000168.6%3A2121del/NM_000168.6?content-type=application%2Fjson
2149_2150insT
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_000168.6%3A2149_2150insT/NM_000168.6?content-type=application%2Fjson
2149C>T
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_000168.6%3A2149C>T/NM_000168.6?content-type=application%2Fjson
2385del
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_00016

In [36]:
variantMapper = VariantColumnMapper(variant_d=gli3_variant_d,
                                   variant_column_name="cDNA",
                                    default_genotype='heterozygous'
                                   )
omim_id = "OMIM:146510"
omim_label = "Pallister-Hall syndrome"
encoder = CohortEncoder(df=df2, 
                        hpo_cr=hpo_cr, 
                        column_mapper_d=column_mapper_d, 
                        individual_column_name="N", 
                        metadata=metadata,
                        agemapper=AgeColumnMapper.not_provided(), 
                        sexmapper=SexColumnMapper.not_provided(),
                        variant_mapper=variantMapper,
                        pmid=pmid)
encoder.set_disease(disease_id=omim_id, label=omim_label)

In [37]:
phs_individuals = encoder.get_individuals()

Individual.output_individuals_as_phenopackets(individual_list=phs_individuals, 
                                              pmid=pmid,
                                              metadata=metadata.to_ga4gh(),
                                              outdir="phenopackets")

We output 21 GA4GH phenopackets to the directory phenopackets


In [38]:
from IPython.display import HTML, display
phenopackets = [i.to_ga4gh_phenopacket(metadata=metadata.to_ga4gh()) for i in phs_individuals]
table = PhenopacketTable(phenopacket_list=phenopackets)
display(HTML(table.to_html()))

Individual,Genotype,Phenotypic features
P15112 (UNKNOWN; ),NM_000168.6:c.1995del (heterozygous),Hypothalamic hamartoma (HP:0002444); Anal atresia (HP:0002023); Bifid epiglottis (HP:0010564); Nail dysplasia (HP:0002164); Shortening of all distal phalanges of the fingers (HP:0006118); Abnormality of the face (HP:0000271); Overlapping toe (HP:0001845); Preauricular skin tag (HP:0000384)
G097 (UNKNOWN; ),NM_000168.6:c.2072del (heterozygous),Hypothalamic hamartoma (HP:0002444); Bifid epiglottis (HP:0010564); Abnormality of the genital system (HP:0000078); Nail dysplasia (HP:0002164); Growth delay (HP:0001510); Shortening of all distal phalanges of the fingers (HP:0006118); Micropenis (HP:0000054); Thin corpus callosum (HP:0033725)
G085 (UNKNOWN; ),NM_000168.6:c.2123_2126del (heterozygous),Hypothalamic hamartoma (HP:0002444); Bifid epiglottis (HP:0010564); Abnormality of the kidney (HP:0000077); Nail dysplasia (HP:0002164); Shortening of all distal phalanges of the fingers (HP:0006118); Abnormal lung development (HP:4000059)
Father (UNKNOWN; ),NM_000168.6:c.2121del (heterozygous),Hypothalamic hamartoma (HP:0002444); Bifid epiglottis (HP:0010564); Abnormality of the kidney (HP:0000077); Abnormality of the genital system (HP:0000078); Nail dysplasia (HP:0002164); Growth delay (HP:0001510); Shortening of all distal phalanges of the fingers (HP:0006118); Abnormal lung development (HP:4000059); Micropenis (HP:0000054)
Aunt (UNKNOWN; ),NM_000168.6:c.2121del (heterozygous),Hypothalamic hamartoma (HP:0002444); Bifid epiglottis (HP:0010564); Abnormality of the kidney (HP:0000077); Nail dysplasia (HP:0002164); Growth delay (HP:0001510); Shortening of all distal phalanges of the fingers (HP:0006118); Abnormal lung development (HP:4000059)
Grand-mother (UNKNOWN; ),NM_000168.6:c.2121del (heterozygous),Hypothalamic hamartoma (HP:0002444); Bifid epiglottis (HP:0010564); Abnormality of the kidney (HP:0000077); Nail dysplasia (HP:0002164); Growth delay (HP:0001510); Shortening of all distal phalanges of the fingers (HP:0006118); Abnormal lung development (HP:4000059)
G121 (UNKNOWN; ),NM_000168.6:c.2149_2150insT (heterozygous),Hypothalamic hamartoma (HP:0002444); Bifid epiglottis (HP:0010564); Shortening of all distal phalanges of the fingers (HP:0006118); Abnormal lung development (HP:4000059)
G001 (UNKNOWN; ),NM_000168.6:c.2149C>T (heterozygous),Hypothalamic hamartoma (HP:0002444); Anal atresia (HP:0002023); Bifid epiglottis (HP:0010564); Abnormality of the kidney (HP:0000077); Abnormality of the genital system (HP:0000078); Nail dysplasia (HP:0002164); Growth delay (HP:0001510); Shortening of all distal phalanges of the fingers (HP:0006118); Abnormal lung development (HP:4000059); Conical tooth (HP:0000698); Cryptorchidism (HP:0000028); Micropenis (HP:0000054); Motor delay (HP:0001270); Sacrococcygeal teratoma (HP:0030736); Syndactyly (HP:0001159); Unilateral renal agenesis (HP:0000122)
G083 (UNKNOWN; ),NM_000168.6:c.2385del (heterozygous),Hypothalamic hamartoma (HP:0002444); Bifid epiglottis (HP:0010564); Shortening of all distal phalanges of the fingers (HP:0006118); Abnormal lung development (HP:4000059)
Father (UNKNOWN; ),NM_000168.6:c.2385del (heterozygous),Hypothalamic hamartoma (HP:0002444); Bifid epiglottis (HP:0010564); Shortening of all distal phalanges of the fingers (HP:0006118); Abnormal lung development (HP:4000059)
