<h1>Creation of phenopackets from tabular data (individuals in columns)</h1>
<p>We will process <a href="https://pubmed.ncbi.nlm.nih.gov/25168959/" target="__blank">Kosho, et al. (2014) Genotype-phenotype correlation of Coffin-Siris syndrome caused by mutations in SMARCB1, SMARCA4, SMARCE1, and ARID1A</a></p>
<p>pyphetools provides a convenient way of extracting HPO terms from typical tables presented in supplemental material. Typical tables can have the individuals in columns or rows.</p>
<p>The data are extracted from the SMARCB1 section of Table 1.</p>

In [1]:
import phenopackets as php
from google.protobuf.json_format import MessageToDict, MessageToJson
from google.protobuf.json_format import Parse, ParseDict
import pandas as pd
pd.set_option('display.max_colwidth', None) # show entire column contents, important!
from collections import defaultdict
import os
import sys
import numpy as np
from pyphetools.creation import *
from pyphetools.creation.simple_column_mapper import try_mapping_columns
# last tested with pyphetools 0.4.5

<h2>Importing HPO data</h2>
<p>pyphetools uses the Human Phenotype Ontology (HPO) to encode phenotypic features. The recommended way of doing this is to ingest the hp.json file using HpoParser, which in turn creates an HpoConceptRecognizer object. </p>
<p>The HpoParser can accept a hpo_json_file argument if you want to use a specific file. If the argument is not passed, it will download the latext hp.json file from the HPO GitHub site and store it in a new subdirectory called hpo_data. It will not download the file if the file is already downloaded.</p>

In [2]:
parser = HpoParser()
hpo_cr = parser.get_hpo_concept_recognizer()
hpo_version = parser.get_version()
metadata = MetaData(created_by="ORCID:0000-0002-5648-2155")
metadata.default_versions_with_hpo(version=hpo_version)

<h2>Importing the supplemental table</h2>
<p>The Table of the paper was copied into an Excel file that is included in the data subfolder</p>
<p>Here, we use the pandas library to import this file (note that the Python package called openpyxl must be installed to read Excel files with pandas, although the library does not need to be imported in this notebook). pyphetools expects a pandas DataFrame as input, and users can choose any input format available for pandas include CSV, TSV, and Excel, or can use any other method to transform their input data into a Pandas DataFrame before using pyphetools.</p>

In [8]:
df = pd.read_excel('input/PMID_25168959.xlsx')

In [9]:
df

Unnamed: 0,"Patient ID""",Nucleotide change,Amino acid change,Feeding difficulties,Nasal bridge,Phitrum,Upper lip vermilion,Thick lower lip vermilion,High palate,Cleft palate,...,Microcephaly,Sparse scalp hair,hypertrichosis,Thick eyebrows,Long eyelashes,Ptosis,Short 5th finger,Short 5th toe,Prominent interphalangeal joints,Prominent distal phalanges
0,L43,c.1089G>T,p.Lys363Asn,Yes,Narrow,Normal,Thin,No,Yes,No,...,Yes,Yes,Yes,Yes,Yes,No,Yes,,Yes,Yes
1,L5,c.1091_1093del,p.Lys364del,Yes,Wide,Long,Thin,No,No,No,...,Yes,No,Yes,Yes,Yes,No,Yes,,No,
2,L18,c.1091_1093del,p.Lys364del,Yes,Normal,"Broad, long",Normal,Yes,,,...,Yes,Yes,Yes,Yes,Yes,Yes,Yes,,Yes,Yes
3,L37,c.1091_1093del,p.Lys364del,Yes,Wide,Broad,Thick,Yes,Yes,No,...,Yes,yes,Yes,Yes,Yes,Yes,No,,No,Yes
4,Y4,c.1091_1093del,p.Lys364del,Yes,Wide,Broad,Thin,Yes,No,Yes,...,Yes,Yes,Yes,Yes,Yes,No,Yes,Yes,Yes,Yes
5,Y21,c.1091_1093del,p.Lys364del,Yes,Wide,Long,Thin,Yes,Yes,No,...,Yes,Yes,Yes,Yes,Yes,No,Yes,Yes,No,Yes
6,Y22,c.1091_1093del,p.Lys364del,Yes,Wide,Long,Thin,Yes,Yes,No,...,Yes,Yes,Yes,Yes,Yes,Yes,Yes,Yes,No,
7,Y29,c.1091_1093del,p.Lys364del,Yes,Normal,"Broad, short",Thin,No,No,No,...,Yes,Yes,No,Yes,Yes,Yes,No,,Yes,Yes
8,K2588,c.1096c>T,p.Arg366Cys,Yes,Flat,Long,Thin,Yes,No,No,...,Yes,Yes,No,Yes,Yes,No,No,,No,No
9,K2426,c.1121G>A,p.Arg374Gln,Yes,Flat,Normal,Thin,Yes,No,No,...,Yes,Yes,Yes,Yes,Yes,Yes,Yes,,,No


<h1>Converting to row-based format</h1>
<p>To use pyphetools, we need to have the individuals represented as rows (one row per individual) and have the items of interest be encoded as column names. The required transformations for doing this may be different for different input data, but often we will want to transpose the table (using the pandas <tt>transpose</tt> function) and set the column names of the new table to the zero-th row. After this, we drop the zero-th row (otherwise, it will be interpreted as an individual by the pyphetools code).</p>
<p>After this step is completed, the remaining steps to create phenopackets are the same as in the 
    <a href="http://localhost:8888/notebooks/notebooks/Create%20phenopackets%20from%20tabular%20data%20with%20individuals%20in%20rows.ipynb" target="__blank">row-based notebook</a>.</p>
    
Furthermore, for this specific case, there is a Count features row that we want dropped, so we filter out any row that does not have Patient in the first column.

In [10]:
dft = df

<h2>Index vs. normal column</h2>
<p>Another thing to look out for is whether the individuals (usually the first column) are regarded as the index of the table or as the first normal column.</p>
<p>If this is the case, it is easiest to create a new column with the contents of the index -- this will work with the pyphetools software. An example follows -- we can now use 'patient_id' as the column name.</p>

In [11]:
dft.index
dft['patient_id'] = dft['Patient ID"']
dft

Unnamed: 0,"Patient ID""",Nucleotide change,Amino acid change,Feeding difficulties,Nasal bridge,Phitrum,Upper lip vermilion,Thick lower lip vermilion,High palate,Cleft palate,...,Sparse scalp hair,hypertrichosis,Thick eyebrows,Long eyelashes,Ptosis,Short 5th finger,Short 5th toe,Prominent interphalangeal joints,Prominent distal phalanges,patient_id
0,L43,c.1089G>T,p.Lys363Asn,Yes,Narrow,Normal,Thin,No,Yes,No,...,Yes,Yes,Yes,Yes,No,Yes,,Yes,Yes,L43
1,L5,c.1091_1093del,p.Lys364del,Yes,Wide,Long,Thin,No,No,No,...,No,Yes,Yes,Yes,No,Yes,,No,,L5
2,L18,c.1091_1093del,p.Lys364del,Yes,Normal,"Broad, long",Normal,Yes,,,...,Yes,Yes,Yes,Yes,Yes,Yes,,Yes,Yes,L18
3,L37,c.1091_1093del,p.Lys364del,Yes,Wide,Broad,Thick,Yes,Yes,No,...,yes,Yes,Yes,Yes,Yes,No,,No,Yes,L37
4,Y4,c.1091_1093del,p.Lys364del,Yes,Wide,Broad,Thin,Yes,No,Yes,...,Yes,Yes,Yes,Yes,No,Yes,Yes,Yes,Yes,Y4
5,Y21,c.1091_1093del,p.Lys364del,Yes,Wide,Long,Thin,Yes,Yes,No,...,Yes,Yes,Yes,Yes,No,Yes,Yes,No,Yes,Y21
6,Y22,c.1091_1093del,p.Lys364del,Yes,Wide,Long,Thin,Yes,Yes,No,...,Yes,Yes,Yes,Yes,Yes,Yes,Yes,No,,Y22
7,Y29,c.1091_1093del,p.Lys364del,Yes,Normal,"Broad, short",Thin,No,No,No,...,Yes,No,Yes,Yes,Yes,No,,Yes,Yes,Y29
8,K2588,c.1096c>T,p.Arg366Cys,Yes,Flat,Long,Thin,Yes,No,No,...,Yes,No,Yes,Yes,No,No,,No,No,K2588
9,K2426,c.1121G>A,p.Arg374Gln,Yes,Flat,Normal,Thin,Yes,No,No,...,Yes,Yes,Yes,Yes,Yes,Yes,,,No,K2426


Some column names might include spaces in front or after, and a couple of columns are subheadings and only contain NaNs, so lets correct that:

In [12]:
dft.columns = dft.columns.str.strip()
dft = dft.dropna(axis=1, how='all')
dft

Unnamed: 0,"Patient ID""",Nucleotide change,Amino acid change,Feeding difficulties,Nasal bridge,Phitrum,Upper lip vermilion,Thick lower lip vermilion,High palate,Cleft palate,...,Sparse scalp hair,hypertrichosis,Thick eyebrows,Long eyelashes,Ptosis,Short 5th finger,Short 5th toe,Prominent interphalangeal joints,Prominent distal phalanges,patient_id
0,L43,c.1089G>T,p.Lys363Asn,Yes,Narrow,Normal,Thin,No,Yes,No,...,Yes,Yes,Yes,Yes,No,Yes,,Yes,Yes,L43
1,L5,c.1091_1093del,p.Lys364del,Yes,Wide,Long,Thin,No,No,No,...,No,Yes,Yes,Yes,No,Yes,,No,,L5
2,L18,c.1091_1093del,p.Lys364del,Yes,Normal,"Broad, long",Normal,Yes,,,...,Yes,Yes,Yes,Yes,Yes,Yes,,Yes,Yes,L18
3,L37,c.1091_1093del,p.Lys364del,Yes,Wide,Broad,Thick,Yes,Yes,No,...,yes,Yes,Yes,Yes,Yes,No,,No,Yes,L37
4,Y4,c.1091_1093del,p.Lys364del,Yes,Wide,Broad,Thin,Yes,No,Yes,...,Yes,Yes,Yes,Yes,No,Yes,Yes,Yes,Yes,Y4
5,Y21,c.1091_1093del,p.Lys364del,Yes,Wide,Long,Thin,Yes,Yes,No,...,Yes,Yes,Yes,Yes,No,Yes,Yes,No,Yes,Y21
6,Y22,c.1091_1093del,p.Lys364del,Yes,Wide,Long,Thin,Yes,Yes,No,...,Yes,Yes,Yes,Yes,Yes,Yes,Yes,No,,Y22
7,Y29,c.1091_1093del,p.Lys364del,Yes,Normal,"Broad, short",Thin,No,No,No,...,Yes,No,Yes,Yes,Yes,No,,Yes,Yes,Y29
8,K2588,c.1096c>T,p.Arg366Cys,Yes,Flat,Long,Thin,Yes,No,No,...,Yes,No,Yes,Yes,No,No,,No,No,K2588
9,K2426,c.1121G>A,p.Arg374Gln,Yes,Flat,Normal,Thin,Yes,No,No,...,Yes,Yes,Yes,Yes,Yes,Yes,,,No,K2426


<h2>Column mappers</h2>
<p>Please see the notebook "Create phenopackets from tabular data with individuals in rows" for explanations. In the following cell we create a dictionary for the ColumnMappers. Note that the code is identical except that we use the df.loc function to get the corresponding row data</p>

In [13]:
hpo_cr = parser.get_hpo_concept_recognizer()
column_mapper_d, col_not_found = try_mapping_columns(df=dft,
                                                    observed='Yes',
                                                    excluded='No',
                                                    hpo_cr=hpo_cr,
                                                    preview=True)

                                 term    status
0   Feeding difficulties (HP:0011968)  observed
1   Feeding difficulties (HP:0011968)  observed
2   Feeding difficulties (HP:0011968)  observed
3   Feeding difficulties (HP:0011968)  observed
4   Feeding difficulties (HP:0011968)  observed
5   Feeding difficulties (HP:0011968)  observed
6   Feeding difficulties (HP:0011968)  observed
7   Feeding difficulties (HP:0011968)  observed
8   Feeding difficulties (HP:0011968)  observed
9   Feeding difficulties (HP:0011968)  observed
10  Feeding difficulties (HP:0011968)  observed
                                      term    status
0   Thick lower lip vermilion (HP:0000179)  excluded
1   Thick lower lip vermilion (HP:0000179)  excluded
2   Thick lower lip vermilion (HP:0000179)  observed
3   Thick lower lip vermilion (HP:0000179)  observed
4   Thick lower lip vermilion (HP:0000179)  observed
5   Thick lower lip vermilion (HP:0000179)  observed
6   Thick lower lip vermilion (HP:0000179)  observed


                                             term        status
0   Prominent interphalangeal joints (HP:0006237)      observed
1   Prominent interphalangeal joints (HP:0006237)      excluded
2   Prominent interphalangeal joints (HP:0006237)      observed
3   Prominent interphalangeal joints (HP:0006237)      excluded
4   Prominent interphalangeal joints (HP:0006237)      observed
5   Prominent interphalangeal joints (HP:0006237)      excluded
6   Prominent interphalangeal joints (HP:0006237)      excluded
7   Prominent interphalangeal joints (HP:0006237)      observed
8   Prominent interphalangeal joints (HP:0006237)      excluded
9   Prominent interphalangeal joints (HP:0006237)  not measured
10  Prominent interphalangeal joints (HP:0006237)  not measured


Lets see which ones are not mapped yet.

In [14]:
print(col_not_found)

['Patient ID"', 'Nucleotide change', 'Amino acid change', 'Nasal bridge', 'Phitrum', 'Upper lip vermilion', 'Patient ID #', 'Cardiovascular', 'Gastrointestinal', 'Genitouriry', 'Ophthalmological abnormalities', 'CNS structural abnormalities', 'Behavioral abnormalities', 'Age', 'Sex', 'Prominent distal phalanges', 'patient_id']


In [15]:
severity_id = {'Se': 'Intellectual disability, severe',
                 'Mo': 'Intellectual disability, moderate',
               'Mi':'Intellectual disability, mild'}
idMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=severity_id)
#print(idMapper.preview_column(dft['Developmental delay/intellectual disability']))
column_mapper_d['Developmental delay/intellectual disability'] = idMapper

In [16]:
nasal_bridge = {'Narrow': 'Narrow nasal bridge',
                 'Wide': 'Wide nasal bridge',
               'Flat':'Depressed nasal bridge'}
nasalMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=nasal_bridge)
#print(nasalMapper.preview_column(dft['Nasal bridge']))
column_mapper_d['Nasal bridge'] = idMapper

In [17]:
philtrum = {'Broad': 'Broad philtrum',
                 'Long': 'Long philtrum',
               'Flat':'Depressed nasal bridge',
            'Short':'Short philtrum'}
philtrumMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=philtrum )
#print(philtrumMapper.preview_column(dft['Phitrum']))
column_mapper_d['Phitrum'] = philtrumMapper

In [18]:
upperlip = {'Thin': 'Thin upper lip vermilion',
                 'Thick': 'Thick upper lip vermilion'}
upperlipMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=upperlip)
# print(upperlipMapper.preview_column(dft['Upper lip vermilion']))
column_mapper_d['Upper lip vermilion'] = upperlipMapper

In [19]:
cardiovascular = {'dex': 'Dextrocardia',
                 'ps': 'Pulmonic stenosis',
                 'vsd': 'Ventricular septal defect',
                 'asd': 'Atrial septal defect'}
cardiovascularMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=cardiovascular)
# print(cardiovascularMapper.preview_column(dft['Cardiovascular']))
column_mapper_d['Cardiovascular'] = cardiovascularMapper

In [20]:
gastrointestinal = {'pys': 'Pyloric stenosis',
                 'ps': 'Pulmonic stenosis',
                 'ger': 'Gastroesophageal reflux'}
gastrointestinalMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=gastrointestinal)
# print(gastrointestinalMapper.preview_column(dft['Gastrointestinal']))
column_mapper_d['Gastrointestinal'] = gastrointestinalMapper

In [21]:
genitouriry = {'hk': 'Horseshoe kidney',
                 'cr': 'cryptorchidism',
                 'VUR': 'Vesicoureteral reflux',
              'HN': 'hydronephrosis',
              'HU': 'Hydroureter',
              'VD': 'Urethral diverticulum'}
genitouriryMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=genitouriry)
# print(genitouriryMapper.preview_column(dft['Genitouriry']))
column_mapper_d['Genitouriry'] = genitouriryMapper

In [22]:
hernia = {'h': 'Hiatus hernia',
                 'u': 'umbilical hernia',
                 'i': 'inguinal hernia',
         'd': 'Congenital diaphragmatic hernia'}
herniaMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=hernia)
# print(herniaMapper.preview_column(dft['Hernia']))
column_mapper_d['Hernia'] = herniaMapper

In [23]:
opthal = {'my': 'Myopia',
                 'sph': 'spherophakia',
                 'am': 'amblyopia'}
opthalMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=opthal)
# print(opthalMapper.preview_column(dft['Ophthalmological abnormalities']))
column_mapper_d['Ophthalmological abnormalities'] = opthalMapper

In [24]:
corpus_callosum = {'acc': 'Abnormal corpus callosum morphology',
                 'ch': 'Aplasia/Hypoplasia of the cerebellum',
                 'dw': 'Dandy-Walker malformation'}
corpus_callosumMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=corpus_callosum)
# print(corpus_callosumMapper.preview_column(dft['CNS structural abnormalities']))
column_mapper_d['CNS structural abnormalities'] = corpus_callosumMapper

In [25]:
behavioral = {'HyAc': 'Hyperactivity',
                 'im': 'Impulsivity',
                 'tan': 'Abnormal temper tantrums'}
behavioralMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=behavioral)
# print(behavioralMapper.preview_column(dft['Behavioral abnormalities']))
column_mapper_d['Behavioral abnormalities'] = behavioralMapper

<h2>Variant Data</h2>
<p>The variant data (HGVS< transcript) is listed in the Variant (hg19, NM_015133.4) column.</p>

In [27]:
genome = 'hg38'
default_genotype = 'heterozygous'
transcript='NM_003073.3'
varMapper = VariantColumnMapper(assembly=genome,
                                column_name='Nucleotide change', 
                                transcript=transcript, 
                                default_genotype=default_genotype)

<h1>Demographic data</h1>
<p>pyphetools can be used to capture information about age, sex, and individual identifiers. This information is stored in a map of "IndividualMapper" objects. Special treatment may be required for the indifiers, which may be used as the column names or row index.</p>

In [28]:
#age is in years and months, so manually correct it
ageMapper = AgeColumnMapper.by_year('Age')
ageMapper.preview_column(dft['Age'])

Unnamed: 0,original column contents,age
0,13,P13Y
1,6,P6Y
2,9,P9Y
3,10,P10Y
4,21,P21Y
5,3,P3Y
6,1,P1Y
7,4,P4Y
8,7,P7Y


In [29]:
#sex is not in columns, since it were all females in this paper
sexMapper = SexColumnMapper(male_symbol='M', female_symbol='F', column_name='Sex')
sexMapper.preview_column(dft['Sex'])

Unnamed: 0,original column contents,sex
0,F,FEMALE
1,F,FEMALE
2,F,FEMALE
3,M,MALE
4,F,FEMALE
5,F,FEMALE
6,M,MALE
7,M,MALE
8,M,MALE
9,M,MALE


In [30]:
pmid = "PMID: 25168959"
encoder = CohortEncoder(df=dft, hpo_cr=hpo_cr, column_mapper_d=column_mapper_d, 
                        individual_column_name="patient_id", agemapper=ageMapper, sexmapper=sexMapper,
                       variant_mapper=varMapper, metadata=metadata,
                       pmid=pmid)
encoder.set_disease(disease_id='614608', label='COFFIN-SIRIS SYNDROME 3')
individuals = encoder.get_individuals()

https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_003073.3%3Ac.1089G>T/NM_003073.3?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_003073.3%3Ac.1091_1093del/NM_003073.3?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_003073.3%3Ac.1091_1093del/NM_003073.3?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_003073.3%3Ac.1091_1093del/NM_003073.3?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_003073.3%3Ac.1091_1093del/NM_003073.3?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_003073.3%3Ac.1091_1093del/NM_003073.3?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_003073.3%3Ac.1091_1093del/NM_003073.3?content-type=ap

In [31]:
encoder.preview_dataframe()

Unnamed: 0_level_0,sex,age,phenotypic features
id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
L43,FEMALE,P13Y,"Feeding difficulties (HP:0011968)\nexcluded: Thick lower lip vermilion (HP:0000179)\nHigh palate (HP:0000218)\nexcluded: Cleft palate (HP:0000175)\nScoliosis (HP:0002650)\nHearing impairment (HP:0000365)\nexcluded: Visual impairment (HP:0000505)\nRecurrent infections (HP:0002719)\nHypotonia (HP:0001252)\nexcluded: Seizure (HP:0001250)\nIntellectual disability, moderate (HP:0002342)\nexcluded: Absent speech (HP:0001344)\nexcluded: Small for gestational age (HP:0001518)\nnot measured: Birth length less than 3rd percentile (HP:0003561)\nnot measured: Primary microcephaly (HP:0011451)\nDecreased body weight (HP:0004325)\nShort stature (HP:0004322)\nMicrocephaly (HP:0000252)\nSparse scalp hair (HP:0002209)\nHypertrichosis (HP:0000998)\nThick eyebrow (HP:0000574)\nLong eyelashes (HP:0000527)\nexcluded: Ptosis (HP:0000508)\nShort 5th finger (HP:0009237)\nnot measured: Short 5th toe (HP:0011917)\nProminent interphalangeal joints (HP:0006237)\nThin upper lip vermilion (HP:0000219)"
L5,FEMALE,P6Y,"Feeding difficulties (HP:0011968)\nexcluded: Thick lower lip vermilion (HP:0000179)\nexcluded: High palate (HP:0000218)\nexcluded: Cleft palate (HP:0000175)\nnot measured: Scoliosis (HP:0002650)\nHearing impairment (HP:0000365)\nVisual impairment (HP:0000505)\nnot measured: Recurrent infections (HP:0002719)\nexcluded: Hypotonia (HP:0001252)\nSeizure (HP:0001250)\nIntellectual disability, severe (HP:0010864)\nAbsent speech (HP:0001344)\nexcluded: Small for gestational age (HP:0001518)\nnot measured: Birth length less than 3rd percentile (HP:0003561)\nnot measured: Primary microcephaly (HP:0011451)\nexcluded: Decreased body weight (HP:0004325)\nShort stature (HP:0004322)\nMicrocephaly (HP:0000252)\nexcluded: Sparse scalp hair (HP:0002209)\nHypertrichosis (HP:0000998)\nThick eyebrow (HP:0000574)\nLong eyelashes (HP:0000527)\nexcluded: Ptosis (HP:0000508)\nShort 5th finger (HP:0009237)\nnot measured: Short 5th toe (HP:0011917)\nexcluded: Prominent interphalangeal joints (HP:0006237)\nLong philtrum (HP:0000343)\nThin upper lip vermilion (HP:0000219)"
L18,FEMALE,P9Y,"Feeding difficulties (HP:0011968)\nThick lower lip vermilion (HP:0000179)\nnot measured: High palate (HP:0000218)\nnot measured: Cleft palate (HP:0000175)\nnot measured: Scoliosis (HP:0002650)\nnot measured: Hearing impairment (HP:0000365)\nVisual impairment (HP:0000505)\nRecurrent infections (HP:0002719)\nHypotonia (HP:0001252)\nSeizure (HP:0001250)\nIntellectual disability, severe (HP:0010864)\nAbsent speech (HP:0001344)\nSmall for gestational age (HP:0001518)\nnot measured: Birth length less than 3rd percentile (HP:0003561)\nnot measured: Primary microcephaly (HP:0011451)\nDecreased body weight (HP:0004325)\nShort stature (HP:0004322)\nMicrocephaly (HP:0000252)\nSparse scalp hair (HP:0002209)\nHypertrichosis (HP:0000998)\nThick eyebrow (HP:0000574)\nLong eyelashes (HP:0000527)\nPtosis (HP:0000508)\nShort 5th finger (HP:0009237)\nnot measured: Short 5th toe (HP:0011917)\nProminent interphalangeal joints (HP:0006237)\nBroad philtrum (HP:0000289)"
L37,MALE,P10Y,"Feeding difficulties (HP:0011968)\nThick lower lip vermilion (HP:0000179)\nHigh palate (HP:0000218)\nexcluded: Cleft palate (HP:0000175)\nScoliosis (HP:0002650)\nnot measured: Hearing impairment (HP:0000365)\nVisual impairment (HP:0000505)\nRecurrent infections (HP:0002719)\nHypotonia (HP:0001252)\nSeizure (HP:0001250)\nIntellectual disability, severe (HP:0010864)\nexcluded: Absent speech (HP:0001344)\nSmall for gestational age (HP:0001518)\nnot measured: Birth length less than 3rd percentile (HP:0003561)\nnot measured: Primary microcephaly (HP:0011451)\nDecreased body weight (HP:0004325)\nShort stature (HP:0004322)\nMicrocephaly (HP:0000252)\nnot measured: Sparse scalp hair (HP:0002209)\nHypertrichosis (HP:0000998)\nThick eyebrow (HP:0000574)\nLong eyelashes (HP:0000527)\nPtosis (HP:0000508)\nexcluded: Short 5th finger (HP:0009237)\nnot measured: Short 5th toe (HP:0011917)\nexcluded: Prominent interphalangeal joints (HP:0006237)\nBroad philtrum (HP:0000289)\nThick upper lip vermilion (HP:0000215)\nHyperactivity (HP:0000752)"
Y4,FEMALE,P21Y,"Feeding difficulties (HP:0011968)\nThick lower lip vermilion (HP:0000179)\nexcluded: High palate (HP:0000218)\nCleft palate (HP:0000175)\nScoliosis (HP:0002650)\nHearing impairment (HP:0000365)\nnot measured: Visual impairment (HP:0000505)\nnot measured: Recurrent infections (HP:0002719)\nHypotonia (HP:0001252)\nSeizure (HP:0001250)\nIntellectual disability, severe (HP:0010864)\nAbsent speech (HP:0001344)\nexcluded: Small for gestational age (HP:0001518)\nexcluded: Birth length less than 3rd percentile (HP:0003561)\nexcluded: Primary microcephaly (HP:0011451)\nDecreased body weight (HP:0004325)\nShort stature (HP:0004322)\nMicrocephaly (HP:0000252)\nSparse scalp hair (HP:0002209)\nHypertrichosis (HP:0000998)\nThick eyebrow (HP:0000574)\nLong eyelashes (HP:0000527)\nexcluded: Ptosis (HP:0000508)\nShort 5th finger (HP:0009237)\nShort 5th toe (HP:0011917)\nProminent interphalangeal joints (HP:0006237)\nBroad philtrum (HP:0000289)\nThin upper lip vermilion (HP:0000219)\nHyperactivity (HP:0000752)"
Y21,FEMALE,P9Y,"Feeding difficulties (HP:0011968)\nThick lower lip vermilion (HP:0000179)\nHigh palate (HP:0000218)\nexcluded: Cleft palate (HP:0000175)\nScoliosis (HP:0002650)\nHearing impairment (HP:0000365)\nexcluded: Visual impairment (HP:0000505)\nRecurrent infections (HP:0002719)\nHypotonia (HP:0001252)\nSeizure (HP:0001250)\nIntellectual disability, severe (HP:0010864)\nAbsent speech (HP:0001344)\nSmall for gestational age (HP:0001518)\nBirth length less than 3rd percentile (HP:0003561)\nexcluded: Primary microcephaly (HP:0011451)\nDecreased body weight (HP:0004325)\nShort stature (HP:0004322)\nMicrocephaly (HP:0000252)\nSparse scalp hair (HP:0002209)\nHypertrichosis (HP:0000998)\nThick eyebrow (HP:0000574)\nLong eyelashes (HP:0000527)\nexcluded: Ptosis (HP:0000508)\nShort 5th finger (HP:0009237)\nShort 5th toe (HP:0011917)\nexcluded: Prominent interphalangeal joints (HP:0006237)\nLong philtrum (HP:0000343)\nThin upper lip vermilion (HP:0000219)"
Y22,MALE,P3Y,"Feeding difficulties (HP:0011968)\nThick lower lip vermilion (HP:0000179)\nHigh palate (HP:0000218)\nexcluded: Cleft palate (HP:0000175)\nexcluded: Scoliosis (HP:0002650)\nexcluded: Hearing impairment (HP:0000365)\nexcluded: Visual impairment (HP:0000505)\nRecurrent infections (HP:0002719)\nHypotonia (HP:0001252)\nexcluded: Seizure (HP:0001250)\nIntellectual disability, severe (HP:0010864)\nAbsent speech (HP:0001344)\nexcluded: Small for gestational age (HP:0001518)\nnot measured: Birth length less than 3rd percentile (HP:0003561)\nnot measured: Primary microcephaly (HP:0011451)\nDecreased body weight (HP:0004325)\nShort stature (HP:0004322)\nMicrocephaly (HP:0000252)\nSparse scalp hair (HP:0002209)\nHypertrichosis (HP:0000998)\nThick eyebrow (HP:0000574)\nLong eyelashes (HP:0000527)\nPtosis (HP:0000508)\nShort 5th finger (HP:0009237)\nShort 5th toe (HP:0011917)\nexcluded: Prominent interphalangeal joints (HP:0006237)\nLong philtrum (HP:0000343)\nThin upper lip vermilion (HP:0000219)"
Y29,MALE,P9Y,"Feeding difficulties (HP:0011968)\nexcluded: Thick lower lip vermilion (HP:0000179)\nexcluded: High palate (HP:0000218)\nexcluded: Cleft palate (HP:0000175)\nScoliosis (HP:0002650)\nexcluded: Hearing impairment (HP:0000365)\nexcluded: Visual impairment (HP:0000505)\nRecurrent infections (HP:0002719)\nHypotonia (HP:0001252)\nSeizure (HP:0001250)\nIntellectual disability, severe (HP:0010864)\nAbsent speech (HP:0001344)\nnot measured: Small for gestational age (HP:0001518)\nnot measured: Birth length less than 3rd percentile (HP:0003561)\nnot measured: Primary microcephaly (HP:0011451)\nDecreased body weight (HP:0004325)\nShort stature (HP:0004322)\nMicrocephaly (HP:0000252)\nSparse scalp hair (HP:0002209)\nexcluded: Hypertrichosis (HP:0000998)\nThick eyebrow (HP:0000574)\nLong eyelashes (HP:0000527)\nPtosis (HP:0000508)\nexcluded: Short 5th finger (HP:0009237)\nnot measured: Short 5th toe (HP:0011917)\nProminent interphalangeal joints (HP:0006237)\nBroad philtrum (HP:0000289)\nThin upper lip vermilion (HP:0000219)\nVesicoureteral reflux (HP:0000076)\nHydroureter (HP:0000072)\nhydronephrosis (HP:0000126)\nUrethral diverticulum (HP:0008722)"
K2588,MALE,P1Y,"Feeding difficulties (HP:0011968)\nThick lower lip vermilion (HP:0000179)\nexcluded: High palate (HP:0000218)\nexcluded: Cleft palate (HP:0000175)\nexcluded: Scoliosis (HP:0002650)\nnot measured: Hearing impairment (HP:0000365)\nnot measured: Visual impairment (HP:0000505)\nexcluded: Recurrent infections (HP:0002719)\nexcluded: Hypotonia (HP:0001252)\nnot measured: Seizure (HP:0001250)\nIntellectual disability, mild (HP:0001256)\nexcluded: Absent speech (HP:0001344)\nexcluded: Small for gestational age (HP:0001518)\nexcluded: Birth length less than 3rd percentile (HP:0003561)\nexcluded: Primary microcephaly (HP:0011451)\nnot measured: Decreased body weight (HP:0004325)\nShort stature (HP:0004322)\nMicrocephaly (HP:0000252)\nSparse scalp hair (HP:0002209)\nexcluded: Hypertrichosis (HP:0000998)\nThick eyebrow (HP:0000574)\nLong eyelashes (HP:0000527)\nexcluded: Ptosis (HP:0000508)\nexcluded: Short 5th finger (HP:0009237)\nnot measured: Short 5th toe (HP:0011917)\nexcluded: Prominent interphalangeal joints (HP:0006237)\nLong philtrum (HP:0000343)\nThin upper lip vermilion (HP:0000219)"
K2426,MALE,P4Y,"Feeding difficulties (HP:0011968)\nThick lower lip vermilion (HP:0000179)\nexcluded: High palate (HP:0000218)\nexcluded: Cleft palate (HP:0000175)\nScoliosis (HP:0002650)\nHearing impairment (HP:0000365)\nVisual impairment (HP:0000505)\nRecurrent infections (HP:0002719)\nexcluded: Hypotonia (HP:0001252)\nSeizure (HP:0001250)\nIntellectual disability, moderate (HP:0002342)\nAbsent speech (HP:0001344)\nexcluded: Small for gestational age (HP:0001518)\nexcluded: Birth length less than 3rd percentile (HP:0003561)\nexcluded: Primary microcephaly (HP:0011451)\nnot measured: Decreased body weight (HP:0004325)\nShort stature (HP:0004322)\nMicrocephaly (HP:0000252)\nSparse scalp hair (HP:0002209)\nHypertrichosis (HP:0000998)\nThick eyebrow (HP:0000574)\nLong eyelashes (HP:0000527)\nPtosis (HP:0000508)\nShort 5th finger (HP:0009237)\nnot measured: Short 5th toe (HP:0011917)\nnot measured: Prominent interphalangeal joints (HP:0006237)\nThin upper lip vermilion (HP:0000219)\nHyperactivity (HP:0000752)"


In [33]:
i1 = individuals[0]
phenopacket1 = i1.to_ga4gh_phenopacket(metadata=metadata.to_ga4gh())
json_string = MessageToJson(phenopacket1)
print(json_string)

{
  "id": "L43",
  "subject": {
    "id": "L43",
    "timeAtLastEncounter": {
      "age": {
        "iso8601duration": "P13Y"
      }
    },
    "sex": "FEMALE"
  },
  "phenotypicFeatures": [
    {
      "type": {
        "id": "HP:0011968",
        "label": "Feeding difficulties"
      }
    },
    {
      "type": {
        "id": "HP:0000179",
        "label": "Thick lower lip vermilion"
      },
      "excluded": true
    },
    {
      "type": {
        "id": "HP:0000218",
        "label": "High palate"
      }
    },
    {
      "type": {
        "id": "HP:0000175",
        "label": "Cleft palate"
      },
      "excluded": true
    },
    {
      "type": {
        "id": "HP:0002650",
        "label": "Scoliosis"
      }
    },
    {
      "type": {
        "id": "HP:0000365",
        "label": "Hearing impairment"
      }
    },
    {
      "type": {
        "id": "HP:0000505",
        "label": "Visual impairment"
      },
      "excluded": true
    },
    {
      "type": {
      

In [34]:
output_directory = "../../phenopackets/SMARCB1/"
encoder.output_phenopackets(outdir=output_directory)

https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_003073.3%3Ac.1089G>T/NM_003073.3?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_003073.3%3Ac.1091_1093del/NM_003073.3?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_003073.3%3Ac.1091_1093del/NM_003073.3?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_003073.3%3Ac.1091_1093del/NM_003073.3?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_003073.3%3Ac.1091_1093del/NM_003073.3?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_003073.3%3Ac.1091_1093del/NM_003073.3?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_003073.3%3Ac.1091_1093del/NM_003073.3?content-type=ap