# Loeys-Dietz syndrome 3 and 6
Loeys-Dietz syndrome (LDS) is an autosomal dominant aortic aneurysm syndrome characterized by multisystemic involvement. The most typical clinical triad comprises hypertelorism, bifid uvula or cleft palate, and aortic aneurysm with tortuosity. Affected individuals may experience aortic dissection at smaller aortic diameters, as well as arterial aneurysms throughout the arterial tree. The genetic cause is heterogeneous and includes mutations in genes encoding components of the transforming growth factor beta (TGFβ) signalling pathway: TGFBR1, TGFBR2, SMAD2, SMAD3, TGFB2 and TGFB3 (see [Velchev JD, et al. (2021). Loeys-Dietz Syndrome. Adv Exp Med Biol](https://pubmed.ncbi.nlm.nih.gov/34807423/)).

This notebook explores whether phenotypic features differ significantly between LDS3 (SMAD3) and LDS6 (SMAD2).

In [14]:
import gpsea
import hpotk

store = hpotk.configure_ontology_store()
hpo = store.load_minimal_hpo()
print(f'Loaded HPO v{hpo.version}')
print(f"Using genophenocorr version {gpsea.__version__}")

Loaded HPO v2024-12-12
Using genophenocorr version 0.9.1


# LDS3
[Loeys-Dietz syndrome-3 (LDS3)](https://omim.org/entry/613795) is caused by heterozygous mutation in the SMAD3 gene.

In [15]:
from ppktstore.registry import configure_phenopacket_registry

smad3_symbol = 'SMAD3'
smad3_mane_tx_id = 'NM_005902.4'
smad3_mane_protein_id = 'NP_005893.1' # mothers against decapentaplegic homolog 3
lds3_disease_id = "OMIM:613795"

phenopacket_registry = configure_phenopacket_registry()
with phenopacket_registry.open_phenopacket_store("0.1.23") as ps:
    lds3_phenopackets = tuple(ps.iter_cohort_phenopackets(smad3_symbol))

print(f"{len(lds3_phenopackets)} LDS3 phenopackets")

49 LDS3 phenopackets


# LDS6
[Loeys-Dietz syndrome-6 (LDS6)](https://omim.org/entry/619656) is caused by heterozygous mutation in the SMAD2 gene.

In [16]:
smad2_symbol = 'SMAD2'
smad2_mane_tx_id = 'NM_005901.6'
smad2_mane_protein_id = 'NP_005892.1' # mothers against decapentaplegic homolog 2 isoform 1

lds6_disease_id = "OMIM:619656"

phenopacket_registry = configure_phenopacket_registry()
with phenopacket_registry.open_phenopacket_store("0.1.20") as ps:
    lds6_phenopackets = tuple(ps.iter_cohort_phenopackets(smad2_symbol))

print(f"{len(lds6_phenopackets)} LDS6 phenopackets")

16 LDS6 phenopackets


In [17]:
from gpsea.preprocessing import configure_caching_cohort_creator, load_phenopackets

lds3_and_lds6_phenopackets = list()
lds3_and_lds6_phenopackets.extend(lds3_phenopackets)
lds3_and_lds6_phenopackets.extend(lds6_phenopackets)
print(f"Got {len(lds3_and_lds6_phenopackets)} LDS3 and LDS6 phenopackets")

cohort_creator = configure_caching_cohort_creator(hpo)
cohort, validation = load_phenopackets(
    phenopackets=lds3_and_lds6_phenopackets, 
    cohort_creator=cohort_creator,
)

validation.summarize()

Got 65 LDS3 and LDS6 phenopackets
Individuals Processed: 100%|██████████| 65/65 [00:00<00:00, 668.12individuals/s]
Validated under permissive policy


In [18]:
from gpsea.analysis.pcats import configure_hpo_term_analysis
from gpsea.analysis.clf import prepare_classifiers_for_terms_of_interest

analysis = configure_hpo_term_analysis(hpo)

pheno_clfs = prepare_classifiers_for_terms_of_interest(
    cohort=cohort,
    hpo=hpo,
)

In [19]:
from gpsea.analysis.clf import diagnosis_classifier
from gpsea.view import MtcStatsViewer

lds_3_6_disease_clf = diagnosis_classifier(
    diagnoses=(lds3_disease_id, lds6_disease_id),
    labels=('LDS3', 'LDS6'),
)

lds_3_6_result = analysis.compare_genotype_vs_phenotypes(
    cohort=cohort,
    gt_clf=lds_3_6_disease_clf,
    pheno_clfs=pheno_clfs,
)

viewer = MtcStatsViewer()
viewer.process(lds_3_6_result)

| Code | Reason | Count |
|---|---|---|
| HMF01 | Skipping term with maximum frequency that was less than threshold 0.4 | 21 |
| HMF03 | Skipping term because of a child term with the same individual counts | 4 |
| HMF08 | Skipping general term | 49 |
| HMF09 | Skipping term with maximum annotation frequency that was less than threshold 0.4 | 71 |


In [20]:
from gpsea.view import summarize_hpo_analysis
summarize_hpo_analysis(hpo=hpo, result=lds_3_6_result)

| HPO term | OMIM:613795 Count | OMIM:613795 Percent | OMIM:619656 Count | OMIM:619656 Percent | Corrected p values | p values |
|---|---|---|---|---|---|---|
| Thoracic aortic aneurysm [HP:0012727] | 0/22 | 0% | 10/15 | 67% | 0.000181 | 9e-06 |
| Aortic aneurysm [HP:0004942] | 26/48 | 54% | 10/10 | 100% | 0.095422 | 0.009088 |
| Soft skin [HP:0000977] | 23/37 | 62% | 0/4 | 0% | 0.211514 | 0.030216 |
| Varicose veins [HP:0002619] | 14/22 | 64% | 4/12 | 33% | 0.737449 | 0.151415 |
| High palate [HP:0000218] | 12/20 | 60% | 5/15 | 33% | 0.737449 | 0.175583 |
| Arterial tortuosity [HP:0005116] | 11/26 | 42% | 1/1 | 100% | 1.0 | 0.444444 |
| Inguinal hernia [HP:0000023] | 12/39 | 31% | 5/11 | 45% | 1.0 | 0.475054 |
| Umbilical hernia [HP:0001537] | 12/39 | 31% | 2/4 | 50% | 1.0 | 0.585479 |
| Osteoarthritis [HP:0002758] | 26/38 | 68% | 2/2 | 100% | 1.0 | 1.0 |
| Arthritis [HP:0001369] | 26/26 | 100% | 2/2 | 100% | 1.0 | 1.0 |
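
The top row of this table can be checked by hand. The sketch below assumes the per-term p values come from a two-sided Fisher exact test on a 2x2 table of diagnosis versus presence of the feature; the `fet_results` argument used later in the notebook suggests this, but the notebook does not state it. The function name `fisher_exact_two_sided` is hypothetical, is not part of GPSEA, and uses only the standard library.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Rows are the two diagnosis groups; columns are "feature present" and
    "feature absent". Hypothetical helper for illustration only.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x "present" individuals in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is at least as unlikely as the observed one
    # (small relative tolerance guards against floating point ties).
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )


# Thoracic aortic aneurysm: 0 of 22 with LDS3, 10 of 15 with LDS6.
p = fisher_exact_two_sided(0, 22, 10, 5)
print(f"{p:.2e}")  # 8.62e-06
```

Under that assumption the result agrees with the unrounded p value of 8.62e-06 that the notebook summary reports for this term.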


In [21]:
from gpseacs.report import GpseaAnalysisReport, GPAnalysisResultSummary

f_results = (
    GPAnalysisResultSummary.from_multi(result=lds_3_6_result),
)


caption = """."""
report = GpseaAnalysisReport(name="LDS 3 and 6", 
                             cohort=cohort, 
                             fet_results=f_results,
                             gene_symbol="n/a",
                             mane_tx_id="n/a",
                             mane_protein_id="n/a",
                             caption=caption)

In [22]:
from gpseacs.report import GpseaNotebookSummarizer
summarizer = GpseaNotebookSummarizer(hpo=hpo, gpsea_version=gpsea.__version__)
summarizer.summarize_report(report=report)

| HPO Term | OMIM:613795 | OMIM:619656 | p-val | adj. p-val |
|---|---|---|---|---|
| Thoracic aortic aneurysm [HP:0012727] | 0/22 (0%) | 10/15 (67%) | 8.62e-06 | 0.000181 |
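
The adjusted p values printed in this notebook are numerically consistent with a Benjamini-Hochberg correction over 21 tested terms (for example, 8.62e-06 × 21 ≈ 0.000181). This is an inference from the printed numbers, not something the notebook states. The sketch below reproduces the arithmetic under that assumption; `bh_adjust` is a hypothetical helper, not a GPSEA function, and the count of 21 tests is likewise inferred.

```python
def bh_adjust(sorted_pvals: list[float], n_tests: int) -> list[float]:
    """Benjamini-Hochberg adjustment for p values sorted in ascending order.

    n_tests is the total number of hypotheses tested, which may exceed
    len(sorted_pvals) when only the smallest p values are listed.
    Hypothetical helper for illustration only.
    """
    # Scale each p value by n_tests / rank and cap at 1.
    adjusted = [
        min(1.0, p * n_tests / rank)
        for rank, p in enumerate(sorted_pvals, start=1)
    ]
    # Enforce monotonicity, working down from the largest rank.
    for i in range(len(adjusted) - 2, -1, -1):
        adjusted[i] = min(adjusted[i], adjusted[i + 1])
    return adjusted


# Nominal p values as printed in the notebook (first one unrounded).
pvals = [8.62e-06, 0.009088, 0.030216, 0.151415, 0.175583,
         0.444444, 0.475054, 0.585479, 1.0, 1.0]
adjusted = bh_adjust(pvals, n_tests=21)
print([round(q, 6) for q in adjusted[:3]])
```

Because only the ten smallest p values are listed, the check is meaningful for the lower ranks only; any unlisted terms are assumed to have larger p values and so would not change these results.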


In [23]:
summarizer.process_latex(report=report)

Output to ../../supplement/tex/LDS_3_and_6_summary_draft.tex
