# Loeys-Dietz syndrome 1 and 3
Loeys-Dietz syndrome (LDS) is an autosomal dominant aortic aneurysm syndrome characterized by multisystemic involvement. The most typical clinical triad includes hypertelorism, bifid uvula or cleft palate, and aortic aneurysm with tortuosity. Affected individuals may experience aortic dissection at smaller aortic diameters, as well as arterial aneurysms throughout the arterial tree. The genetic cause is heterogeneous and includes mutations in genes encoding components of the transforming growth factor beta (TGFβ) signalling pathway: TGFBR1, TGFBR2, SMAD2, SMAD3, TGFB2 and TGFB3 (see [Velchev JD, et al. (2021). Loeys-Dietz Syndrome. Adv Exp Med Biol](https://pubmed.ncbi.nlm.nih.gov/34807423/)).

This notebook explores whether there are significant differences in phenotypic features between LDS1 (TGFBR1) and LDS3 (SMAD3).

In [15]:
import gpsea
import hpotk

store = hpotk.configure_ontology_store()
hpo = store.load_minimal_hpo()
print(f'Loaded HPO v{hpo.version}')
print(f"Using genophenocorr version {gpsea.__version__}")

Loaded HPO v2024-12-12
Using genophenocorr version 0.9.1


# LDS1

In [16]:
from ppktstore.registry import configure_phenopacket_registry

tgfbr1_symbol = 'TGFBR1'
tgfbr1_mane_tx_id = 'NM_004612.4'
tgfbr1_mane_protein_id = 'NP_004603.1' # TGF-beta receptor type-1 isoform 1 precursor
lds1_disease_id = "OMIM:609192"


phenopacket_registry = configure_phenopacket_registry()
with phenopacket_registry.open_phenopacket_store("0.1.23") as ps:
    lds1_phenopackets = tuple(ps.iter_cohort_phenopackets(tgfbr1_symbol))
tgfbr1_len = len(lds1_phenopackets)
print(f"{len(lds1_phenopackets)} LDS1 phenopackets")

41 LDS1 phenopackets


# LDS3
[Loeys-Dietz syndrome-3 (LDS3)](https://omim.org/entry/613795) is caused by heterozygous mutation in the SMAD3 gene.

In [17]:
smad3_symbol = 'SMAD3'
smad3_mane_tx_id = 'NM_005902.4'
smad3_mane_protein_id = 'NP_005893.1' # mothers against decapentaplegic homolog 3
lds3_disease_id = "OMIM:613795"

# Reuses `phenopacket_registry` configured in the LDS1 cell above.
with phenopacket_registry.open_phenopacket_store("0.1.23") as ps:
    lds3_phenopackets = tuple(ps.iter_cohort_phenopackets(smad3_symbol))

print(f"{len(lds3_phenopackets)} LDS3 phenopackets")

49 LDS3 phenopackets


In [18]:
from gpsea.preprocessing import configure_caching_cohort_creator, load_phenopackets

lds_phenopackets = list()
lds_phenopackets.extend(lds1_phenopackets)
lds_phenopackets.extend(lds3_phenopackets)


cohort_creator = configure_caching_cohort_creator(hpo)
cohort, validation = load_phenopackets(
    phenopackets=lds_phenopackets, 
    cohort_creator=cohort_creator,
)

validation.summarize()

Individuals Processed: 100%|██████████| 90/90 [00:00<00:00, 541.94individuals/s]
Validated under permissive policy


In [19]:
from gpsea.analysis.pcats import configure_hpo_term_analysis
from gpsea.analysis.clf import prepare_classifiers_for_terms_of_interest

analysis = configure_hpo_term_analysis(hpo)

pheno_clfs = prepare_classifiers_for_terms_of_interest(
    cohort=cohort,
    hpo=hpo,
)

In [20]:
from gpsea.analysis.clf import diagnosis_classifier
from gpsea.view import MtcStatsViewer

lds_1_3_disease_clf = diagnosis_classifier(
    diagnoses=(lds1_disease_id, lds3_disease_id),
    labels=('LDS1', 'LDS3'),
)

lds1_3_result = analysis.compare_genotype_vs_phenotypes(
    cohort=cohort,
    gt_clf=lds_1_3_disease_clf,
    pheno_clfs=pheno_clfs,
)

viewer = MtcStatsViewer()
viewer.process(lds1_3_result)

Code,Reason,Count
HMF01,Skipping term with maximum frequency that was less than threshold 0.4,42
HMF03,Skipping term because of a child term with the same individual counts,2
HMF08,Skipping general term,63
HMF09,Skipping term with maximum annotation frequency that was less than threshold 0.4,150


In [21]:
from gpsea.view import summarize_hpo_analysis
summarize_hpo_analysis(hpo=hpo, result=lds1_3_result)

HPO term,OMIM:609192 Count,OMIM:609192 Percent,OMIM:613795 Count,OMIM:613795 Percent,Corrected p values,p values
Osteoarthritis [HP:0002758],0/11,0%,26/38,68%,0.000975,4.6e-05
Scoliosis [HP:0002650],18/21,86%,20/43,47%,0.023527,0.003015
Aortic aneurysm [HP:0004942],11/11,100%,26/48,54%,0.023527,0.004327
Hypertelorism [HP:0000316],15/19,79%,13/35,37%,0.023527,0.004481
Joint hypermobility [HP:0001382],12/19,63%,12/36,33%,0.197692,0.04707
High palate [HP:0000218],12/16,75%,12/20,60%,1.0,0.481498
Inguinal hernia [HP:0000023],6/14,43%,12/39,31%,1.0,0.514672
Arterial tortuosity [HP:0005116],9/17,53%,11/26,42%,1.0,0.545009
Disproportionate tall stature [HP:0001519],9/17,53%,8/19,42%,1.0,0.738795
Abnormal oral cavity morphology [HP:0000163],17/17,100%,24/24,100%,1.0,1.0
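
The nominal p values in this table can be sanity-checked by hand. The sketch below assumes the analysis uses a two-sided Fisher exact test on the 2x2 table of diagnosis versus presence of the term (the notebook does not state the test; the reported value is consistent with it). The function `fisher_exact_two_sided` is a hypothetical helper written for illustration, not part of GPSEA.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1 = a + b          # size of the first group
    col1 = a + c          # individuals with the feature
    n = a + b + c + d     # all individuals with known status

    def weight(x: int) -> int:
        # Unnormalized hypergeometric probability (exact integer).
        return comb(col1, x) * comb(n - col1, row1 - x)

    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    observed = weight(a)
    extreme = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= observed)
    return extreme / comb(n, row1)


# Osteoarthritis [HP:0002758]: 0/11 in LDS1, 26/38 in LDS3 (counts from the table).
p_osteoarthritis = fisher_exact_two_sided(0, 11, 26, 12)
print(f"{p_osteoarthritis:.3g}")  # prints 4.64e-05
```

Working with exact integers avoids floating-point ties when deciding which tables count as "at least as extreme".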


# Summary

In [22]:
from gpseacs.report import GpseaAnalysisReport, GPAnalysisResultSummary

f_results = (
    GPAnalysisResultSummary.from_multi(result=lds1_3_result),
)

caption = """."""
report = GpseaAnalysisReport(name="LDS 1 and 3", 
                             cohort=cohort, 
                             fet_results=f_results,
                             gene_symbol="n/a",
                             mane_tx_id="n/a",
                             mane_protein_id="n/a",
                             caption=caption)

In [23]:
from gpseacs.report import GpseaNotebookSummarizer
summarizer = GpseaNotebookSummarizer(hpo=hpo, gpsea_version=gpsea.__version__)
summarizer.summarize_report(report=report)

HPO Term,OMIM:609192,OMIM:613795,p-val,adj. p-val
Scoliosis [HP:0002650],18/21 (86%),20/43 (47%),0.003,0.024
Hypertelorism [HP:0000316],15/19 (79%),13/35 (37%),0.004,0.024
Aortic aneurysm [HP:0004942],11/11 (100%),26/48 (54%),0.004,0.024
Osteoarthritis [HP:0002758],0/11 (0%),26/38 (68%),4.64e-05,0.000975
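
Three terms share the same adjusted p value (0.024) despite different nominal p values. The sketch below shows how this can arise under a Benjamini-Hochberg adjustment. Two things here are assumptions inferred from the numbers, not stated in the notebook: that the correction is Benjamini-Hochberg, and that 21 tests were performed (the ratio of corrected to nominal p value for Osteoarthritis). It also assumes that any tested terms not shown in the table have p values too large to affect the result. `benjamini_hochberg` is a hypothetical helper, not a GPSEA function.

```python
def benjamini_hochberg(p_values, n_tests=None):
    """Benjamini-Hochberg adjusted p values.

    `n_tests` lets the caller supply the total number of tests when only
    the smallest p values are available (an assumption in this sketch).
    """
    m = n_tests if n_tests is not None else len(p_values)
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    adjusted = [0.0] * len(p_values)
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(len(order), 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Nominal p values as displayed in the tables above (rounded).
nominal = [4.64e-05, 0.003015, 0.004327, 0.004481, 0.04707,
           0.481498, 0.514672, 0.545009, 0.738795, 1.0]
adjusted = benjamini_hochberg(nominal, n_tests=21)  # 21 is an inferred assumption
for p, q in zip(nominal[:5], adjusted[:5]):
    print(f"{p:.6g} -> {q:.4g}")
```

Under these assumptions, Scoliosis and Aortic aneurysm inherit the adjusted value of Hypertelorism (rank 4), because the step-up procedure takes the minimum of `p * m / rank` over all equal or larger ranks.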


In [24]:
summarizer.process_latex(report=report)

Output to ../../supplement/tex/LDS_1_and_3_summary_draft.tex
