# TBX1
Data in this notebook was taken from [Yagi H, et al. (2003) Role of TBX1 in human del22q11.2 syndrome. Lancet. 2003 Oct 25;362(9393):1366-73. PMID:14585638](https://pubmed.ncbi.nlm.nih.gov/14585638/)


The authors report:
Conotruncal anomaly face is characterised by nine facial features, including ocular hypertelorism, lateral displacement of the inner canthi, short palpebral fissures, swollen eyelids, dysmorphism of the nose, low-set ears and minor ear-lobe anomalies, and is almost always associated with velopharyngeal insufficiency and a nasal voice. On the basis of the phenotypic study, dysmorphism of the nose—a new description of a nose that seems to be divided into two parts (upper part and lower part) at the join of the wing and at the sides—was added. Dysmorphism of the nose was observed consistently in patients with the 22q11.2 deletion.
Each positive characteristic feature of the nine facial features of conotruncal anomaly face and velopharyngeal insufficiency was counted as 1 point. An atypical finding was counted as 0·5 point.
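
The scoring rule quoted above is simple arithmetic, and a short sketch may make it concrete. The function name `caf_score` and the example findings are hypothetical illustrations, not values taken from the paper. The only assumption is that each feature is recorded as 1 (present), 0.5 (atypical) or 0 (absent), as in the spreadsheet loaded below.

```python
from typing import Mapping


def caf_score(findings: Mapping[str, float]) -> float:
    """Sum per-feature points: 1 = present, 0.5 = atypical, 0 = absent."""
    allowed = {0.0, 0.5, 1.0}
    for feature, points in findings.items():
        if points not in allowed:
            raise ValueError(f"Unexpected score {points!r} for {feature!r}")
    return float(sum(findings.values()))


# Hypothetical individual (illustrative values only, not from the publication)
example = {
    "Ocular hypertelorism": 1,
    "Short palpebral fissures": 0.5,
    "Swollen eyelids": 1,
    "Low-set ears": 0,
}
print(caf_score(example))  # 2.5
```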

We mapped the variants to HGVS notation as follows:

- Patient F1: 443T→A (F148Y): NM_001379200.1(TBX1):c.470T>A (p.Phe157Tyr) (see ClinVar VCV000007563.3)
- Patient F2: 928G→A (G310S): NM_001379200.1(TBX1):c.955G>A (p.Gly319Ser) (see ClinVar VCV000007564.14)
- Patient F3-1: 1223delC (deletion of cytosine at position 1223): NM_001379200.1(TBX1):c.1250del (p.Ser417fs) (see ClinVar RCV001815166.9)
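
As a sanity check on the list above, the published positions and the NM_001379200.1 positions appear to differ by a constant offset. The sketch below only restates the numbers from the three mappings. The offset is inferred from these values, and the source does not state why the numbering differs. The helper `codon_of` is a hypothetical name.

```python
# (published cDNA position, NM_001379200.1 position, published residue, current residue)
# The published residue number for the frameshift is not given in the list above.
mappings = {
    "F1": (443, 470, 148, 157),
    "F2": (928, 955, 310, 319),
    "F3-1": (1223, 1250, None, None),
}

nt_offsets = {cur - old for old, cur, _, _ in mappings.values()}
aa_offsets = {cur - old for _, _, old, cur in mappings.values() if old is not None}


def codon_of(c_pos: int) -> int:
    """1-based codon number that contains coding position c_pos."""
    return (c_pos - 1) // 3 + 1


print(nt_offsets)      # {27}
print(aa_offsets)      # {9}
print(codon_of(1250))  # 417, consistent with p.Ser417fs
```

A shift of 27 nucleotides corresponds to 9 codons, so the nucleotide and protein offsets agree with each other.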

In [1]:
import pandas as pd
pd.set_option('display.max_colwidth', None) # show entire column contents, important!
from IPython.display import display, HTML
from pyphetools.creation import *
from pyphetools.visualization import *
from pyphetools.validation import *
import pyphetools
print(f"Using pyphetools version {pyphetools.__version__}")

Using pyphetools version 0.9.16


In [2]:
PMID = "PMID:14585638"
title = "Role of TBX1 in human del22q11.2 syndrome"
cite = Citation(pmid=PMID, title=title)
parser = HpoParser("../hp.json")
hpo_cr = parser.get_hpo_concept_recognizer()
hpo_version = parser.get_version()
hpo_ontology = parser.get_ontology()
metadata = MetaData(created_by="ORCID:0000-0002-5648-2155", citation=cite)
metadata.default_versions_with_hpo(version=hpo_version)
print(f"HPO version {hpo_version}")

HPO version 2023-10-09


In [3]:
df = pd.read_excel("input/Yagi_TBX1.xlsx")

In [4]:
df.head()

Unnamed: 0,individual,Dx,Sex,Age,NM_001379200.1,Ocular hypertelorism,Lateral displacement of the inner canthi,Short palpebral fissures,Swollen eyelids,Dysmorphism of the nose,...,Low-set ears,Minor ear anomalies,Velopharyngeal insufficiency,Unnamed: 14,Cardiac defects,Hypoplastic thymus,Parathyroid dysfunction,Deafness,pmid,title
0,F1,CAFS,F,7,c.470T>A,1.0,1,1.0,1.0,1.0,...,1,1,1,,TOF;PA;ASDII;MAPCA,0,0,na,PMID:14585638,Role of TBX1 in human del22q11.2 syndrome
1,F2,DGS,M,13,c.955G>A,1.0,1,0.5,1.0,1.0,...,1,1,1,,IAA(B);VSD(II);PH,1,1,1,PMID:14585638,Role of TBX1 in human del22q11.2 syndrome
2,F3-1,CAFS,F,15,c.1250del,1.0,1,0.0,0.5,1.0,...,0,0,0,,TOF;RAA,1,0,na,PMID:14585638,Role of TBX1 in human del22q11.2 syndrome
3,F3-2,CAFS,F,46,c.1250del,0.5,1,0.0,0.5,0.5,...,0,0,1,,-,na,0,na,PMID:14585638,Role of TBX1 in human del22q11.2 syndrome
4,F3-3,CAFS,M,14,c.1250del,1.0,1,0.0,1.0,1.0,...,1,1,1,,-,0,1,na,PMID:14585638,Role of TBX1 in human del22q11.2 syndrome


In [5]:
generator = SimpleColumnMapperGenerator(df=df, hpo_cr=hpo_cr, observed="1", excluded="0")
mapper_d = generator.try_mapping_columns()
display(HTML(generator.to_html()))

Result,Columns
Mapped,Ocular hypertelorism; Short palpebral fissures; Small mouth; Low-set ears; Velopharyngeal insufficiency; Parathyroid dysfunction; Deafness
Unmapped,individual; Dx; Sex; Age; NM_001379200.1; Lateral displacement of the inner canthi; Swollen eyelids; Dysmorphism of the nose; Minor ear anomalies; Unnamed: 14; Cardiac defects; Hypoplastic thymus; pmid; title


In [6]:
# Abbreviations used in the "Cardiac defects" column, mapped to HPO labels
cardiac_d = {"TOF": "Tetralogy of Fallot",
            "ASDII": "Secundum atrial septal defect",
            "PA": "Pulmonary artery atresia",
            "PH": "Pulmonary arterial hypertension",
            # key must match the cell text; df.head() shows "IAA(B);VSD(II);PH" for F2
            "VSD(II)": "Perimembranous ventricular septal defect",
            "MAPCA": "Aortopulmonary collateral arteries",
            "IAA(B)": "Interrupted aortic arch type B",
            "RAA": "Right aortic arch"}
cardiacMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=cardiac_d)
cardiacMapper.preview_column(df["Cardiac defects"])
mapper_d["Cardiac defects"] = cardiacMapper
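
An option dictionary only helps if every abbreviation in the column has a matching key. The sketch below is one way to check for that before building the mapper. It is not part of pyphetools: `unmatched_abbreviations` is a hypothetical helper, and it assumes cells are semicolon-delimited with `-` or `na` meaning that nothing was recorded, as in the `df.head()` output above. The small dictionary is deliberately incomplete for illustration.

```python
def unmatched_abbreviations(cells, option_d, sep=";", placeholders=("-", "na")):
    """Return the sorted abbreviations found in `cells` that have no key in `option_d`."""
    unmatched = set()
    for cell in cells:
        for token in str(cell).split(sep):
            token = token.strip()
            if token and token not in placeholders and token not in option_d:
                unmatched.add(token)
    return sorted(unmatched)


# Cell values copied from the df.head() output above
cells = ["TOF;PA;ASDII;MAPCA", "TOF;RAA", "-"]
# Deliberately incomplete dictionary, for illustration only
demo_d = {"TOF": "Tetralogy of Fallot", "RAA": "Right aortic arch"}
print(unmatched_abbreviations(cells, demo_d))  # ['ASDII', 'MAPCA', 'PA']
```

In the notebook, the same call could be made with `df["Cardiac defects"]` and `cardiac_d`, where an empty list would indicate full coverage.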

In [7]:
# Hypoplasia of the thymus HP:0000778
hypoThymusMapper = SimpleColumnMapper(hpo_id="HP:0000778", hpo_label="Hypoplasia of the thymus", observed="1", excluded="0")
hypoThymusMapper.preview_column(df["Hypoplastic thymus"])
mapper_d["Hypoplastic thymus"] = hypoThymusMapper

In [8]:
# ocular hypertelorism (with increased interpupillary distance due to increased separation of the inner canthi)
# "Lateral displacement of the inner canthi" -- Code as Hypertelorism HP:0000316
hypertelorismMapper = SimpleColumnMapper(hpo_id="HP:0000316", hpo_label="Hypertelorism", observed="1", excluded="0")
hypertelorismMapper.preview_column(df["Lateral displacement of the inner canthi"])
mapper_d["Lateral displacement of the inner canthi"] = hypertelorismMapper

In [9]:
# Swollen eyelids - not coding, unclear what the eyelid feature is in Figure 1.

In [10]:
#  Arrowheads show the area of the nose that seems to be divided into
# two parts (upper and lower) at the joint of the wing and at the sides.
# coding as Wide nasal base HP:0012810
nasalDysMapper = SimpleColumnMapper(hpo_id="HP:0012810", hpo_label="Wide nasal base", observed="1", excluded="0")
nasalDysMapper.preview_column(df["Dysmorphism of the nose"])
mapper_d["Dysmorphism of the nose"] = nasalDysMapper

In [None]:
var_d = {}
tbx1_transcript = "NM_001379200.1"
varValidator = VariantValidator(genome_build="hg38", transcript=tbx1_transcript)
for v in df["NM_001379200.1"].unique():
    var = varValidator.encode_hgvs(v)
    var_d[v] = var
print(f"Extracted {len(var_d)} variants with Variant Validator")

https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_001379200.1%3Ac.470T>A/NM_001379200.1?content-type=application%2Fjson
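
The request URL printed above follows a visible pattern: genome build, then `transcript:variant` with the colon percent-encoded, then the transcript again. The sketch below rebuilds that string without making a network request. The pattern is inferred from this single printed URL only, and `variant_validator_url` is a hypothetical helper, not a pyphetools or VariantValidator function.

```python
def variant_validator_url(hgvs_c: str, transcript: str, genome_build: str = "hg38") -> str:
    """Rebuild the request URL in the form printed by the cell above (no request is sent)."""
    base = "https://rest.variantvalidator.org/VariantValidator/variantvalidator"
    return (
        f"{base}/{genome_build}/{transcript}%3A{hgvs_c}/{transcript}"
        "?content-type=application%2Fjson"
    )


print(variant_validator_url("c.470T>A", "NM_001379200.1"))
```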


In [None]:
varMapper = VariantColumnMapper(variant_d=var_d,
                               variant_column_name="NM_001379200.1",
                               default_genotype="heterozygous")

In [None]:
ageMapper = AgeColumnMapper.by_year(column_name="Age")
#ageMapper.preview_column(df["Age"])
sexMapper = SexColumnMapper(male_symbol="M", female_symbol="F", column_name="Sex")
#sexMapper.preview_column(df["Sex"])
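
Phenopackets record age as an ISO 8601 duration. Assuming the `Age` column holds whole years, as in the `df.head()` output, the conversion can be sketched as follows. This is an illustration of the target format, not the pyphetools implementation, and `years_to_iso8601` is a hypothetical helper.

```python
def years_to_iso8601(years: int) -> str:
    """Format a whole number of years as an ISO 8601 duration, e.g. 7 -> 'P7Y'."""
    if years < 0:
        raise ValueError("Age in years must be non-negative")
    return f"P{years}Y"


# Ages taken from the df.head() output above
ages = [7, 13, 15, 46, 14]
print([years_to_iso8601(a) for a in ages])
# ['P7Y', 'P13Y', 'P15Y', 'P46Y', 'P14Y']
```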

In [None]:
dgs = Disease(disease_id="OMIM:188400", disease_label="DiGeorge syndrome")
cafs = Disease(disease_id="OMIM:217095", disease_label="Conotruncal anomaly face syndrome")
disease_d = {  "DGS": dgs, "CAFS": cafs }
diseaseMapper = DiseaseIdColumnMapper(column_name="Dx", disease_id_map=disease_d)

In [None]:
encoder = MixedCohortEncoder(df=df,
                             individual_column_name="individual",
                             hpo_cr=hpo_cr,
                             column_mapper_d=mapper_d,
                             disease_id_mapper=diseaseMapper,
                             metadata=metadata,
                             pmid_column="pmid",
                             title_column="title",
                             variant_mapper=varMapper,
                             agemapper=ageMapper,
                             sexmapper=sexMapper
                            )

In [None]:
individuals = encoder.get_individuals()

In [None]:
cvalidator = CohortValidator(cohort=individuals, ontology=hpo_ontology, min_hpo=1, allelic_requirement=AllelicRequirement.MONO_ALLELIC)
qc = QcVisualizer(cohort_validator=cvalidator)
display(HTML(qc.to_html()))

In [None]:
# Keep only the individuals that passed validation and display those
cohort = cvalidator.get_error_free_individual_list()
table = PhenopacketTable(individual_list=cohort, metadata=metadata)
display(HTML(table.to_html()))