<h1>COL3A1: Vandervore (2017)</h1>
<p>Data derived from <a href="https://pubmed.ncbi.nlm.nih.gov/28258187/" target="__blank">Vandervore, et al. (2017) Bi-allelic variants in COL3A1 encoding the ligand to GPR56 are associated with cobblestone-like cortical malformation, white matter changes and cerebellar cysts</a></p>

In [1]:
import pandas as pd
pd.set_option('display.max_colwidth', None) # show entire column contents, important!
from IPython.display import HTML, display
from pyphetools.creation import *
from pyphetools.visualization import *
from pyphetools.validation import *
import pyphetools
print(f"pyphetools version {pyphetools.__version__}")

pyphetools version 0.9.31


<h2>Importing HPO data</h2>
<p>pyphetools uses the Human Phenotype Ontology (HPO) to encode phenotypic features. The recommended way of doing this is to ingest the hp.json file using HpoParser, which in turn creates an HpoConceptRecognizer object. </p>
<p>The HpoParser accepts an hpo_json_file argument if you want to use a specific file. If the argument is not passed, it downloads the latest hp.json file from the HPO GitHub site and stores it in a new subdirectory called hpo_data. The file is not downloaded again if it is already present.</p>
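The download-once behaviour described above can be pictured with a small sketch. This is not the pyphetools implementation: `ensure_local_copy` and the `fetch` callable are hypothetical names, and the network call is replaced by an injected function so that the example runs offline.

```python
import os
import tempfile

def ensure_local_copy(path, fetch):
    """Return path, calling fetch() to create the file only if it is missing."""
    if not os.path.isfile(path):
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as handle:
            handle.write(fetch())
    return path

# Hypothetical stand-in for an HTTP download; records how often it is called.
calls = []
def fake_fetch():
    calls.append(1)
    return b'{"graphs": []}'

workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "hpo_data", "hp.json")
ensure_local_copy(target, fake_fetch)  # file missing: fetch runs
ensure_local_copy(target, fake_fetch)  # file present: fetch is skipped
print(len(calls))
```

Passing the downloader in as a function keeps the caching logic testable without touching the network.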

In [2]:
PMID = "PMID:28258187"
title = "Bi-allelic variants in COL3A1 encoding the ligand to GPR56 are associated with cobblestone-like cortical malformation, white matter changes and cerebellar cysts"
cite = Citation(pmid=PMID, title=title)
parser = HpoParser(hpo_json_file="../hp.json")
hpo_cr = parser.get_hpo_concept_recognizer()
hpo_version = parser.get_version()
hpo_ontology = parser.get_ontology()
metadata = MetaData(created_by="ORCID:0000-0002-5648-2155", citation=cite)
metadata.default_versions_with_hpo(version=hpo_version)
print(f"HPO version {hpo_version}")

HPO version 2024-01-16


<h2>Importing the clinical data</h2>

In [3]:
df = pd.read_excel('input/PMID_28258187.xlsx')

In [4]:
df.head()

Unnamed: 0,Clinical features,Patient 1 (11.3 this manuscript),Patient 2 (11.4 this manuscript),"Patient 3 (Plancke et al, 2009)","Patient 4 (Jergensen et al, 2014)"
0,patient_id,Patient 1,Patient 2,Patient 3,Patient 4
1,Sex,female,male,female,female
2,Age at examination(years),7,3.5,10,19
3,Mutation in COL3A1,c.145C>G,c.145C>G,c.479dupT,c.1786C>T
4,Second mutation in COL3A1,c.145C>G,c.145C>G,c.479dupT,c.3851G>A


In [5]:
# Convert to row-based data (one individual per row)
dft = df.transpose()
dft.columns = dft.iloc[0]
dft.drop(dft.index[0], inplace=True)
import re
dft.columns = dft.columns.str.strip()
dft = dft.dropna(axis=1, how='all')
# the patient_id column holds the simplified ids, without suffixes such as "(11.3 this manuscript)"
#dft.set_index("patient_id", inplace = True)
dft.head(2)

Clinical features,patient_id,Sex,Age at examination(years),Mutation in COL3A1,Second mutation in COL3A1,Major features,Minor features,Additional features,Congenital anomalies,Neurological examination,...,Basal ganglia,Corpus callosum,Hippocampus,Cortex,White matter_5,Vermis,Post fossa,Pituitary,Arachnoid cysts,Vessels
Patient 1 (11.3 this manuscript),Patient 1,female,7.0,c.145C>G,c.145C>G,-,-,-,-,"Global developmental delay, walks without support, uses a few words",...,"Thalamus normal putamen/globus pallidus small, accentuated Virchow- Robin spaces","Present, elongated and mildly thickened",Normal,"Dysplastic cerebellar cortex, multiple cortical cysts superior>inferior","No hypoplasia, multifocal lesions in cerebellar white matter",Vermis hypoplasia cysts,Mega cisterna magna,Normal,,Intracranial segment of the A carotis interna is normal
Patient 2 (11.4 this manuscript),Patient 2,male,3.5,c.145C>G,c.145C>G,-,-,-,-,"Global developmental delay. sits independently, no words",...,"Normal volume and signal intensity, accentuated Virchow-Robin spaces",Present elongated and mildly thickened,Normal,"Dysplastic cerebellar cortex, multiple cortical cysts superior>inferior",No hypoplasia multifocal lesions in cerebellar white matter,Vermis hypoplasia cysts,Mega cisterna magna,Normal,Bilateral temporal pole arachnoidal cysts,Intracranial segment of the A carotis interna is normal
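The reshaping performed in the cell above (features in rows become individuals in rows) can be illustrated without pandas. The following is a minimal sketch on toy data copied from the first rows of the table; it uses plain `zip` instead of `DataFrame.transpose`, and the variable names are illustrative only.

```python
# Feature-per-row table, as in the spreadsheet: the first cell names the feature.
rows = [
    ["patient_id", "Patient 1", "Patient 2"],
    ["Sex", "female", "male"],
    ["Age at examination(years)", "7", "3.5"],
]

# zip(*rows) turns rows into columns; the first tuple holds the feature names.
header, *patients = zip(*rows)

# One dictionary per individual, keyed by feature name.
records = [dict(zip(header, values)) for values in patients]
print(records[1]["Sex"])
```

After the transposition each record describes one individual, which is the layout the column mappers expect.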


<h2>Column mappers</h2>
<p>Please see the notebook "Create phenopackets from tabular data with individuals in rows" for explanations. In the following cells we create a list of ColumnMappers. Note that the code is identical except that we use the df.loc function to get the corresponding row data.</p>
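As a rough mental model of what an option dictionary does, the sketch below looks for each dictionary key as a substring of the free-text cell and collects the associated labels. This is a simplification under stated assumptions: `match_options` is a hypothetical helper, and the real OptionColumnMapper additionally resolves labels to HPO ids through the concept recognizer.

```python
def match_options(cell, option_d):
    """Return the labels whose key occurs in the cell text (case-insensitive)."""
    text = cell.lower()
    return [label for key, label in option_d.items() if key.lower() in text]

# Toy option dictionary: free-text fragment -> HPO label.
options = {
    "Global developmental delay": "Global developmental delay",
    "no words": "Absent speech",
    "Delayed motor milestones": "Motor delay",
}

cell = "Global developmental delay. sits independently, no words"
print(match_options(cell, options))
```

Because matching is by substring, one cell can yield several HPO terms, and a key must reproduce the source wording exactly, including any typos in the spreadsheet.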

In [6]:
column_mapper_list = list()

Let's try to get the code autoformatted so that we can easily copy, paste, and change it.

In [7]:
major_features = {'Easy bruising thin translucent skin': 'Bruising susceptibility',
 'arterial tissue fragility': 'Abnormal arterial physiology',
 'Easy bruising': 'Bruising susceptibility',
 'thin translucent skin': 'Dermal translucency',
 'arterial dissections': 'Arterial dissection',}
major_featuresMapper = OptionColumnMapper(column_name="Major features",concept_recognizer=hpo_cr, option_d=major_features)
column_mapper_list.append(major_featuresMapper)
major_featuresMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Easy bruising thin translucent skin, intestinal/arterial tissue fragility. characteristic facial appearance"" -> HP: Bruising susceptibility (HP:0000978) (observed)",1
1,"original value: ""Easy bruising, thin translucent skin, arterial dissections, characteristic facial appearance"" -> HP: Bruising susceptibility (HP:0000978) (observed)",1


In [8]:
cortex = {'Dysplastic cerebellar cortex': 'Abnormal cerebellar cortex morphology'}
cortexMapper = OptionColumnMapper(column_name="Cortex", concept_recognizer=hpo_cr, option_d=cortex)
column_mapper_list.append(cortexMapper)
cortexMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Dysplastic cerebellar cortex, multiple cortical cysts superior>inferior"" -> HP: Abnormal cerebellar cortex morphology (HP:0031422) (observed)",2
1,"original value: ""Cortical cysts"" -> HP: Renal cortical cysts (HP:0000803) (observed)",1
2,"original value: ""Few cortical cysts superior cerebellar lobe"" -> HP: Renal cortical cysts (HP:0000803) (observed)",1


In [9]:
minor_features = {'Early-onset varicose veins': 'Varicose veins',
 'small joint hypermobility': 'Joint hypermobility',
 'tendon rupture': 'Tendon rupture'}
minor_featuresMapper = OptionColumnMapper(column_name="Minor features",concept_recognizer=hpo_cr, option_d=minor_features)
column_mapper_list.append(minor_featuresMapper)
minor_featuresMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Early-onset varicose veins, small joint hypermobility"" -> HP: Varicose veins (HP:0002619) (observed)",1
1,"original value: ""Small joint hypermobility, tendon rupture, first-degree relative with vascular Ehlers- Danlos syndrome"" -> HP: Joint hypermobility (HP:0001382) (observed)",1


In [10]:
additional_features = {'Pulmonary valve stenosis': 'Pulmonic stenosis',
 'pronounced atrophic scars': 'Atrophic scars',
 'multiple gingival recessions': 'Gingival recession',
 'slender fingers': 'Slender finger'}
additional_featuresMapper = OptionColumnMapper(column_name="Additional features", concept_recognizer=hpo_cr, option_d=additional_features)
column_mapper_list.append(additional_featuresMapper)
additional_featuresMapper.preview_column(dft)


In [11]:
congenital_anomalies = {'Talipes equinovarus': 'Talipes equinovarus'}
congenital_anomaliesMapper = OptionColumnMapper(column_name="Congenital anomalies", concept_recognizer=hpo_cr, option_d=congenital_anomalies)
column_mapper_list.append(congenital_anomaliesMapper)
congenital_anomaliesMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Talipes equinovarus"" -> HP: Talipes equinovarus (HP:0001762) (observed)",1


In [12]:
neurological_examination = {'Global developmental delay': 'Global developmental delay',
 'no words': 'Absent speech',
 'Delayed motor milestones': 'Motor delay'}
neurological_examinationMapper = OptionColumnMapper(column_name="Neurological examination", concept_recognizer=hpo_cr, option_d=neurological_examination)
column_mapper_list.append(neurological_examinationMapper)
neurological_examinationMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Global developmental delay, walks without support, uses a few words"" -> HP: Global developmental delay (HP:0001263) (observed)",1
1,"original value: ""Global developmental delay. sits independently, no words"" -> HP: Global developmental delay (HP:0001263) (observed)",1
2,"original value: ""Delayed motor milestones, normal language development"" -> HP: Motor delay (HP:0001270) (observed)",1


In [13]:
head_circumference = {'>97th centile': 'Macrocephaly'}
head_circumferenceMapper = OptionColumnMapper(column_name="Head circumference", concept_recognizer=hpo_cr, option_d=head_circumference)
column_mapper_list.append(head_circumferenceMapper)
head_circumferenceMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: "">97th centile"" -> HP: Macrocephaly (HP:0000256) (observed)",1


In [14]:
epilepsy_onset = {'Spasms/5 years': 'Seizure',
 'Spasms/26 months': 'Seizure',
 'Absence seizures/unknown': 'Typical absence seizure'}
epilepsy_onsetMapper = OptionColumnMapper(column_name="Epilepsy/onset", concept_recognizer=hpo_cr, option_d=epilepsy_onset)
column_mapper_list.append(epilepsy_onsetMapper)
epilepsy_onsetMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Spasms/5 years"" -> HP: Seizure (HP:0001250) (observed)",1
1,"original value: ""Spasms/26 months"" -> HP: Seizure (HP:0001250) (observed)",1
2,"original value: ""Absence seizures/unknown"" -> HP: Typical absence seizure (HP:0011147) (observed)",1


In [15]:
gyral_pattern = {'Diffuse thickened cobblestone cortex with relative sparing of the temporal lobes': 'Dysgyria with thickened cortex',
 'Diffuse thickened cobblestone cortex with relative sparing of the temporal poles': 'Dysgyria with thickened cortex',
 'Frontal cobblestone cortex': 'Dysgyria with thickened cortex',
 'parietal polymicrogyria': 'Polymicrogyria',
 'Bilateral frontal polymicrogyria including cingulate gyri': 'Polymicrogyria'}
gyral_patternMapper = OptionColumnMapper(column_name="Gyral pattern", concept_recognizer=hpo_cr, option_d=gyral_pattern)
column_mapper_list.append(gyral_patternMapper)
gyral_patternMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Diffuse thickened cobblestone cortex with relative sparing of the temporal lobes"" -> HP: Dysgyria with thickened cortex (HP:0032400) (observed)",1
1,"original value: ""Diffuse thickened cobblestone cortex with relative sparing of the temporal poles"" -> HP: Dysgyria with thickened cortex (HP:0032400) (observed)",1
2,"original value: ""Frontal cobblestone cortex, parietal polymicrogyria, relative sparing of the temporal lobes"" -> HP: Dysgyria with thickened cortex (HP:0032400) (observed)",1
3,"original value: ""Bilateral frontal polymicrogyria including cingulate gyri"" -> HP: Polymicrogyria (HP:0002126) (observed)",1


In [16]:
white_matter = {'Globale reduction of white matter': 'Hypointensity of cerebral white matter on MRI',
 'Diffuse hypomyelination': 'Cerebral hypomyelination'}
white_matterMapper = OptionColumnMapper(column_name="White matter", concept_recognizer=hpo_cr, option_d=white_matter)
column_mapper_list.append(white_matterMapper)
white_matterMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Globale reduction of white matter"" -> HP: Hypointensity of cerebral white matter on MRI (HP:0007103) (observed)",1
1,"original value: ""Diffuse hypomyelination"" -> HP: Cerebral hypomyelination (HP:0006808) (observed)",1


In [17]:
white_matter_2 = {'Prominent perivascular spaces': 'Dilation of Virchow-Robin spaces',
 'Prominent perivascular spaces bilateral frontal': 'Dilation of Virchow-Robin spaces'}
white_matter_2Mapper = OptionColumnMapper(column_name="White matter_2", concept_recognizer=hpo_cr, option_d=white_matter_2)
column_mapper_list.append(white_matter_2Mapper)
white_matter_2Mapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Prominent perivascular spaces"" -> HP: Dilation of Virchow-Robin spaces (HP:0012520) (observed)",3
1,"original value: ""Prominent perivascular spaces bilateral frontal"" -> HP: Dilation of Virchow-Robin spaces (HP:0012520) (observed)",1


In [18]:
white_matter_3 = {'Frontal nodular heterotopia (beads)': 'Gray matter heterotopia',
 'perisylvian and occipital band heterotopia': 'Gray matter heterotopia'}
white_matter_3Mapper = OptionColumnMapper(column_name="White matter_3", concept_recognizer=hpo_cr, option_d=white_matter_3)
column_mapper_list.append(white_matter_3Mapper)
white_matter_3Mapper.preview_column(dft)

In [19]:
lateral_ventricles = {'Ventriculomegaly': 'Lateral ventricle dilatation',
 'Mild enlargement': 'Lateral ventricle dilatation'}
lateral_ventriclesMapper = OptionColumnMapper(column_name="Lateral ventricles", concept_recognizer=hpo_cr, option_d=lateral_ventricles)
column_mapper_list.append(lateral_ventriclesMapper)
lateral_ventriclesMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Ventriculomegaly"" -> HP: Lateral ventricle dilatation (HP:0006956) (observed)",2
1,"original value: ""Mild enlargement"" -> HP: Lateral ventricle dilatation (HP:0006956) (observed)",1


In [20]:
third_ventricle = {'Ventriculomegaly': 'Dilated third ventricle',
 'Mild enlargement': 'Dilated third ventricle'}
third_ventricleMapper = OptionColumnMapper(column_name="Third ventricle", concept_recognizer=hpo_cr, option_d=third_ventricle)
column_mapper_list.append(third_ventricleMapper)
third_ventricleMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Ventriculomegaly"" -> HP: Dilated third ventricle (HP:0007082) (observed)",2
1,"original value: ""Mild enlargement"" -> HP: Dilated third ventricle (HP:0007082) (observed)",1


In [21]:
brainstem = {'Hypoplastic': 'Abnormal brainstem morphology',
 'Mildly hypoplastic': 'Abnormal brainstem morphology',
 'Hypoplasia of the pons': 'Hypoplasia of the pons'}
brainstemMapper = OptionColumnMapper(column_name="Brainstem", concept_recognizer=hpo_cr, option_d=brainstem)
column_mapper_list.append(brainstemMapper)
brainstemMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Hypoplastic"" -> HP: Abnormal brainstem morphology (HP:0002363) (observed)",1
1,"original value: ""Mildly hypoplastic"" -> HP: Abnormal brainstem morphology (HP:0002363) (observed)",1
2,"original value: ""Hypoplasia of the pons"" -> HP: Hypoplasia of the pons (HP:0012110) (observed)",1


In [22]:
basal_ganglia = {'Thalamus normal putamen/globus pallidus small': 'Abnormal globus pallidus morphology',
 'accentuated Virchow- Robin spaces': 'Dilation of Virchow-Robin spaces'}
basal_gangliaMapper = OptionColumnMapper(column_name="Basal ganglia", concept_recognizer=hpo_cr, option_d=basal_ganglia)
column_mapper_list.append(basal_gangliaMapper)
basal_gangliaMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Thalamus normal putamen/globus pallidus small, accentuated Virchow- Robin spaces"" -> HP: Dilation of Virchow-Robin spaces (HP:0012520) (observed)",1


In [23]:
corpus_callosum = {'elongated and mildly thickened': 'Abnormal length of corpus callosum',
 'Present elongated and mildly thickened': 'Abnormal length of corpus callosum',
 'elongated': 'Abnormal length of corpus callosum'}
corpus_callosumMapper = OptionColumnMapper(column_name="Corpus callosum", concept_recognizer=hpo_cr, option_d=corpus_callosum)
column_mapper_list.append(corpus_callosumMapper)
corpus_callosumMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Present, elongated and mildly thickened"" -> HP: Abnormal length of corpus callosum (HP:0200011) (observed)",1
1,"original value: ""Present elongated and mildly thickened"" -> HP: Abnormal length of corpus callosum (HP:0200011) (observed)",1
2,"original value: ""Present, elongated"" -> HP: Abnormal length of corpus callosum (HP:0200011) (observed)",1


In [24]:
vermis = {'Vermis hypoplasia cysts': 'Cerebellar vermis hypoplasia',
 'Mild atrophy': 'Cerebellar vermis hypoplasia'}
vermisMapper = OptionColumnMapper(column_name="Vermis", concept_recognizer=hpo_cr, option_d=vermis)
column_mapper_list.append(vermisMapper)
vermisMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Vermis hypoplasia cysts"" -> HP: Cerebellar vermis hypoplasia (HP:0001320) (observed)",2
1,"original value: ""Mild atrophy"" -> HP: Cerebellar vermis hypoplasia (HP:0001320) (observed)",1


In [25]:
post_fossa = {'Mega cisterna magna': 'Enlarged cisterna magna',
 'Mega cistema magna': 'Enlarged cisterna magna'}
post_fossaMapper = OptionColumnMapper(column_name="Post fossa",concept_recognizer=hpo_cr, option_d=post_fossa)
column_mapper_list.append(post_fossaMapper)
post_fossaMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Mega cisterna magna"" -> HP: Enlarged cisterna magna (HP:0002280) (observed)",2
1,"original value: ""Mega cistema magna"" -> HP: Enlarged cisterna magna (HP:0002280) (observed)",1


In [26]:
vessels = { 'Dilatation left A carotis interna': 'Carotid artery dilatation',
 'stenosis right A carotis interna': 'Carotid artery stenosis'}
vesselsMapper = OptionColumnMapper(column_name="Vessels", concept_recognizer=hpo_cr, option_d=vessels)
column_mapper_list.append(vesselsMapper)
vesselsMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Dilatation left A carotis interna, stenosis right A carotis interna"" -> HP: Carotid artery dilatation (HP:0012163) (observed)",1


In [27]:
arachnoid_cysts = {'Bilateral temporal pole arachnoidal cysts': 'Arachnoid cyst'}
arachnoid_cystsMapper = OptionColumnMapper(column_name="Arachnoid cysts", concept_recognizer=hpo_cr, option_d=arachnoid_cysts)
column_mapper_list.append(arachnoid_cystsMapper)
arachnoid_cystsMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""Bilateral temporal pole arachnoidal cysts"" -> HP: Arachnoid cyst (HP:0100702) (observed)",2


In [28]:
COL3A1_transcript='NM_000090.4'
vmanager = VariantManager(df=dft, individual_column_name="patient_id",cohort_name="COL3A1",transcript=COL3A1_transcript,
                          allele_1_column_name="Mutation in COL3A1", allele_2_column_name="Second mutation in COL3A1")

[INFO] encoding variant "c.145C>G"
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_000090.4%3Ac.145C>G/NM_000090.4?content-type=application%2Fjson
[INFO] encoding variant "c.3851G>A"
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_000090.4%3Ac.3851G>A/NM_000090.4?content-type=application%2Fjson
[INFO] encoding variant "c.1786C>T"
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_000090.4%3Ac.1786C>T/NM_000090.4?content-type=application%2Fjson
[INFO] encoding variant "c.479dupT"
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_000090.4%3Ac.479dupT/NM_000090.4?content-type=application%2Fjson
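The URLs in the log follow a regular pattern: genome build, percent-encoded `transcript:variant`, transcript again, and a content-type query. The sketch below rebuilds that pattern with the standard library. `variant_validator_url` is a hypothetical helper written to match the logged strings; it sends no request and is not claimed to be how pyphetools constructs the URL.

```python
from urllib.parse import quote, urlencode

BASE = "https://rest.variantvalidator.org/VariantValidator/variantvalidator"

def variant_validator_url(transcript, hgvs, genome="hg38"):
    """Rebuild the URL pattern seen in the log output (no request is sent)."""
    # ':' is encoded as %3A; '>' is left as-is, as in the logged URLs.
    description = quote(f"{transcript}:{hgvs}", safe=">")
    query = urlencode({"content-type": "application/json"})
    return f"{BASE}/{genome}/{description}/{transcript}?{query}"

url = variant_validator_url("NM_000090.4", "c.145C>G")
print(url)
```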


In [29]:
vmanager.to_summary()

Unnamed: 0,status,count,alleles
0,mapped,37,"c.1862G>A, c.755G>A, c.951+5G>C, c.724C>T, c.565G>C, c.659_664del, c.[1546G>T;1556G>T], c.2869G>A, c.1346G>T, c.556G>A, c.2134_2160del, c.2283+5G>T, c.598C>T, c.1194+1G>A, c.2870G>T, c.583G>A, c.897+2T>G, c.763G>T, c.547G>A, c.2357G>A, c.3256G>C, c.3338G>A, c.2356G>A, c.2815G>A, c.1330G>A, c.3525+1G>A, c.754G>A, c.2518G>A, c.897+2T>A, c.848T>A, c.665G>A, c.1977+5G>C, c.1662+1G>A, c.145C>G, c.3851G>A, c.1786C>T, c.479dupT"
1,unmapped,0,


<h2>Demographic data</h2>

In [30]:
ageMapper = AgeColumnMapper.by_year('Age at examination(years)')
ageMapper.preview_column(dft)

Unnamed: 0,original column contents,age
0,7.0,P7Y
1,3.5,P3Y6M
2,10.0,P10Y
3,19.0,P19Y
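The preview shows decimal years rendered as ISO 8601 durations (for example 3.5 becomes P3Y6M). A minimal sketch of such a conversion follows; `years_to_iso8601` is a hypothetical helper that assumes the fractional part is rounded to whole months, which may differ from the library's own rounding.

```python
def years_to_iso8601(years):
    """Convert an age in decimal years to an ISO 8601 duration string."""
    whole = int(years)
    months = round((years - whole) * 12)
    if months == 12:  # e.g. 2.99 years rounds up to a full year
        whole, months = whole + 1, 0
    return f"P{whole}Y" + (f"{months}M" if months else "")

for age in (7.0, 3.5, 10.0, 19.0):
    print(age, years_to_iso8601(age))
```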


In [31]:
sexMapper = SexColumnMapper(male_symbol='male', female_symbol='female', column_name='Sex')
sexMapper.preview_column(dft)

Unnamed: 0,original column contents,sex
0,female,FEMALE
1,male,MALE
2,female,FEMALE
3,female,FEMALE


In [32]:
disease = Disease(disease_id='OMIM:618343', disease_label='Polymicrogyria with or without vascular-type EDS')
encoder = CohortEncoder(df=dft, 
                        hpo_cr=hpo_cr, 
                        column_mapper_list=column_mapper_list, 
                        individual_column_name="patient_id", 
                        agemapper=ageMapper, 
                        sexmapper=sexMapper,
                        metadata=metadata)
encoder.set_disease(disease)

# Add variant data

In [33]:
individuals = encoder.get_individuals()
vmanager.add_variants_to_individuals(individuals)

In [34]:
dft.columns

Index(['patient_id', 'Sex', 'Age at examination(years)', 'Mutation in COL3A1',
       'Second mutation in COL3A1', 'Major features', 'Minor features',
       'Additional features', 'Congenital anomalies',
       'Neurological examination', 'Head circumference', 'Epilepsy/onset',
       'Age at MRI', 'Gyral pattern', 'Gradient', 'Periventricular region',
       'White matter', 'White matter_2', 'White matter_3', 'White matter_4',
       'Calcification', 'Lateral ventricles', 'Third ventricle',
       'Fourth venticle', 'Brainstem', 'Basal ganglia', 'Corpus callosum',
       'Hippocampus', 'Cortex', 'White matter_5', 'Vermis', 'Post fossa',
       'Pituitary', 'Arachnoid cysts', 'Vessels'],
      dtype='object', name='Clinical features')

In [35]:
cvalidator = CohortValidator(cohort=individuals, ontology=hpo_ontology, min_hpo=1, allelic_requirement=AllelicRequirement.BI_ALLELIC)
validated_individuals = cvalidator.get_validated_individual_list()
qc = QcVisualizer(cohort_validator=cvalidator)
display(HTML(qc.to_html()))

ID,Level,Category,Message,HPO Term
PMID_28258187_Patient_1,WARNING,DUPLICATE,Dilation of Virchow-Robin spaces is listed multiple times,Dilation of Virchow-Robin spaces (HP:0012520)
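The bi-allelic requirement passed to the validator can be understood as an allele count: a homozygous variant contributes two alleles and a heterozygous variant one, and the total must be two. The sketch below is an illustration only; `is_biallelic` and `ALLELE_COUNT` are hypothetical names, other zygosities are ignored, and the genotypes are those listed in the input table.

```python
# Number of variant alleles contributed by each zygosity.
ALLELE_COUNT = {"homozygous": 2, "heterozygous": 1}

def is_biallelic(genotypes):
    """True if the (variant, zygosity) pairs add up to exactly two alleles."""
    return sum(ALLELE_COUNT[zygosity] for _, zygosity in genotypes) == 2

patient_1 = [("c.145C>G", "homozygous")]
patient_4 = [("c.1786C>T", "heterozygous"), ("c.3851G>A", "heterozygous")]
single_het = [("c.1786C>T", "heterozygous")]

print(is_biallelic(patient_1), is_biallelic(patient_4), is_biallelic(single_het))
```

Both a homozygous variant and a pair of heterozygous variants (compound heterozygosity) satisfy the check.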


## Clean annotations
We take the error-free individual list from the validator to obtain phenopackets without the redundant term, then rerun the validation to confirm that no warnings remain.

In [36]:
individuals = cvalidator.get_error_free_individual_list()
cvalidator = CohortValidator(cohort=individuals, ontology=hpo_ontology, min_hpo=1, allelic_requirement=AllelicRequirement.BI_ALLELIC)
validated_individuals = cvalidator.get_validated_individual_list()
qc = QcVisualizer(cohort_validator=cvalidator)
display(HTML(qc.to_html()))
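The duplicate warning arose because two columns of the table both mapped to Dilation of Virchow-Robin spaces for Patient 1. The effect of the clean-up for this exact-duplicate case can be sketched as an order-preserving deduplication; `drop_duplicate_terms` is a hypothetical helper and does not reproduce the validator's other checks.

```python
def drop_duplicate_terms(terms):
    """Keep the first occurrence of each HPO id, preserving order."""
    seen = set()
    unique = []
    for hpo_id, label in terms:
        if hpo_id not in seen:
            seen.add(hpo_id)
            unique.append((hpo_id, label))
    return unique

terms = [
    ("HP:0012520", "Dilation of Virchow-Robin spaces"),  # from "White matter_2"
    ("HP:0006956", "Lateral ventricle dilatation"),
    ("HP:0012520", "Dilation of Virchow-Robin spaces"),  # from "Basal ganglia"
]
cleaned = drop_duplicate_terms(terms)
print(len(terms), "->", len(cleaned))
```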

In [37]:
table = PhenopacketTable(individual_list=individuals, metadata=metadata)
display(HTML(table.to_html()))

Individual,Disease,Genotype,Phenotypic features
Patient 1 (FEMALE; P7Y),Polymicrogyria with or without vascular-type EDS (OMIM:618343),NM_000090.4:c.145C>G (homozygous),Dysgyria with thickened cortex (HP:0032400); Lateral ventricle dilatation (HP:0006956); Seizure (HP:0001250); Cerebellar vermis hypoplasia (HP:0001320); Hypointensity of cerebral white matter on MRI (HP:0007103); Global developmental delay (HP:0001263); Abnormal length of corpus callosum (HP:0200011); Renal cortical cysts (HP:0000803); Enlarged cisterna magna (HP:0002280); Dilated third ventricle (HP:0007082); Abnormal cerebellar cortex morphology (HP:0031422); Dilation of Virchow-Robin spaces (HP:0012520); Abnormal brainstem morphology (HP:0002363)
Patient 2 (MALE; P3Y6M),Polymicrogyria with or without vascular-type EDS (OMIM:618343),NM_000090.4:c.145C>G (homozygous),Abnormal cerebellar cortex morphology (HP:0031422); Renal cortical cysts (HP:0000803); Global developmental delay (HP:0001263); Absent speech (HP:0001344); Macrocephaly (HP:0000256); Seizure (HP:0001250); Dysgyria with thickened cortex (HP:0032400); Cerebral hypomyelination (HP:0006808); Dilation of Virchow-Robin spaces (HP:0012520); Lateral ventricle dilatation (HP:0006956); Dilated third ventricle (HP:0007082); Abnormal brainstem morphology (HP:0002363); Abnormal length of corpus callosum (HP:0200011); Cerebellar vermis hypoplasia (HP:0001320); Enlarged cisterna magna (HP:0002280); Arachnoid cyst (HP:0100702)
Patient 3 (FEMALE; P10Y),Polymicrogyria with or without vascular-type EDS (OMIM:618343),NM_000090.4:c.479dup (homozygous),Bruising susceptibility (HP:0000978); Abnormal arterial physiology (HP:0025323); Renal cortical cysts (HP:0000803); Varicose veins (HP:0002619); Joint hypermobility (HP:0001382); Pulmonic stenosis (HP:0001642); Atrophic scars (HP:0001075); Gingival recession (HP:0030816); Talipes equinovarus (HP:0001762); Motor delay (HP:0001270); Typical absence seizure (HP:0011147); Dysgyria with thickened cortex (HP:0032400); Polymicrogyria (HP:0002126); Dilation of Virchow-Robin spaces (HP:0012520); Lateral ventricle dilatation (HP:0006956); Dilated third ventricle (HP:0007082); Hypoplasia of the pons (HP:0012110); Abnormal length of corpus callosum (HP:0200011); Arachnoid cyst (HP:0100702)
Patient 4 (FEMALE; P19Y),Polymicrogyria with or without vascular-type EDS (OMIM:618343),NM_000090.4:c.1786C>T (heterozygous) NM_000090.4:c.3851G>A (heterozygous),Bruising susceptibility (HP:0000978); Dermal translucency (HP:0010648); Arterial dissection (HP:0005294); Renal cortical cysts (HP:0000803); Joint hypermobility (HP:0001382); Tendon rupture (HP:0100550); Slender finger (HP:0001238); Polymicrogyria (HP:0002126); Dilation of Virchow-Robin spaces (HP:0012520); Cerebellar vermis hypoplasia (HP:0001320); Enlarged cisterna magna (HP:0002280); Carotid artery dilatation (HP:0012163); Carotid artery stenosis (HP:0100546)


In [38]:
output_directory = "phenopackets"
Individual.output_individuals_as_phenopackets(individual_list=individuals,
                                              metadata=metadata,
                                              outdir=output_directory)

We output 4 GA4GH phenopackets to the directory phenopackets
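For orientation, the sketch below assembles a heavily abbreviated, phenopacket-like JSON document for Patient 1 using field names from the GA4GH Phenopacket Schema (version 2). It is an illustration under stated assumptions: only two phenotypic features are shown, the `interpretations` and `metaData` sections are omitted, and the files written by pyphetools may differ in detail.

```python
import json

# Abbreviated, hand-written example; not a file produced by the notebook.
phenopacket = {
    "id": "PMID_28258187_Patient_1",
    "subject": {
        "id": "Patient 1",
        "sex": "FEMALE",
        "timeAtLastEncounter": {"age": {"iso8601duration": "P7Y"}},
    },
    "phenotypicFeatures": [
        {"type": {"id": "HP:0001250", "label": "Seizure"}},
        {"type": {"id": "HP:0001263", "label": "Global developmental delay"}},
    ],
    "diseases": [
        {"term": {"id": "OMIM:618343",
                  "label": "Polymicrogyria with or without vascular-type EDS"}}
    ],
}

text = json.dumps(phenopacket, indent=2)
print(text)
```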
