# RNU4-2
[ReNU syndrome (RENU)](https://omim.org/entry/620851) is caused by heterozygous mutation in the RNU4-2 gene.

RNU4-2 is one of two U4 snRNA genes.

[Chen Y, et al. (2024)](https://pubmed.ncbi.nlm.nih.gov/38991538/) report:
> The n.64_65insT variant is one of six single base insertions that we observe in the 18 bp critical region in individuals with NDD, in a total of 100 individuals across cohorts. By contrast, single base insertions are very rare in population cohorts. Although we do also observe some SNVs in this region in individuals with NDD, our initial data suggest these SNVs may result in a milder phenotype. However, given this observation is based on only four fully phenotyped individuals, it needs to be confirmed in larger cohorts. 

In [1]:
import hpotk
import gpsea

store = hpotk.configure_ontology_store()
hpo = store.load_minimal_hpo()
print(f'Loaded HPO v{hpo.version}')
print(f"Using gpsea version {gpsea.__version__}")

Loaded HPO v2025-01-16
Using gpsea version 0.9.6.dev0


# RNU4-2
We use the [Matched Annotation from NCBI and EMBL-EBI (MANE)](https://www.ncbi.nlm.nih.gov/refseq/MANE/) transcript for RNU4-2. RNU4-2 is non-coding, so there is no protein identifier.

In [2]:
gene_symbol = 'RNU4-2'
mane_tx_id = 'NR_003137.3'

In [3]:
from ppktstore.registry import configure_phenopacket_registry
from gpsea.preprocessing import load_phenopackets
from gpsea.preprocessing import configure_caching_cohort_creator

phenopacket_store_release = '0.1.24' 
registry = configure_phenopacket_registry()

with registry.open_phenopacket_store(release=phenopacket_store_release) as ps:
    phenopackets = tuple(ps.iter_cohort_phenopackets(gene_symbol))

cohort_creator = configure_caching_cohort_creator(hpo)

cohort, qc_results = load_phenopackets(phenopackets, cohort_creator)  
qc_results.summarize()

Individuals Processed: 100%|██████████| 61/61 [00:04<00:00, 12.92 individuals/s]
Validated under permissive policy


In [4]:
from gpsea.view import CohortViewer
viewer = CohortViewer(hpo)
viewer.process(cohort=cohort, transcript_id=mane_tx_id)

n,HPO Term
49,Absent speech
48,Intellectual disability
46,Delayed ability to walk
45,Seizure
42,Hypotonia
38,Short stature
37,Failure to thrive
35,Severe global developmental delay
35,Constipation
32,Strabismus

n,Disease
61,ReNU syndrome

n,Variant key,HGVS,Variant Class
53,12_120291839_120291839_T_TA,12_120291839_120291839_T_TA (None),
2,12_120291828_120291828_G_A,12_120291828_120291828_G_A (None),
2,12_120291835_120291835_G_A,12_120291835_120291835_G_A (None),
1,12_120291835_120291835_G_GT,12_120291835_120291835_G_GT (None),
1,12_120291839_120291839_T_TC,12_120291839_120291839_T_TC (None),
1,12_120291839_120291839_T_C,12_120291839_120291839_T_C (None),
1,12_120291826_120291826_T_TA,12_120291826_120291826_T_TA (None),

Variant effect,Count


# Genotype-phenotype correlation analysis

In [5]:
from gpsea.analysis.pcats import configure_hpo_term_analysis
analysis = configure_hpo_term_analysis(hpo)

from gpsea.analysis.clf import prepare_classifiers_for_terms_of_interest
pheno_predicates = prepare_classifiers_for_terms_of_interest(
    cohort=cohort,
    hpo=hpo,
)

In [6]:
from gpsea.analysis.predicate import variant_key
from gpsea.analysis.clf import monoallelic_classifier


# n.64_65insT, the most frequent variant in this cohort
is_n64_65insT = variant_key("12_120291839_120291839_T_TA")
# The three other single-base insertions observed in this cohort
is_ins_G_GT = variant_key("12_120291835_120291835_G_GT")
is_ins_T_TC = variant_key("12_120291839_120291839_T_TC")
is_ins_T_TA_26 = variant_key("12_120291826_120291826_T_TA")

# Any single-base insertion (the remaining variants in the cohort are SNVs)
is_insertion = is_n64_65insT | is_ins_G_GT | is_ins_T_TC | is_ins_T_TA_26


is_n64_65insT_predicate = monoallelic_classifier(
    a_predicate=is_n64_65insT,
    b_predicate=~is_n64_65insT,
    a_label= "n.64_65insT", 
    b_label="other"
)

is_insertion_predicate = monoallelic_classifier(
    a_predicate=is_insertion,
    b_predicate=~is_insertion,
    a_label="insertion", 
    b_label="other"
)
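
The insertion group above is assembled by hand from four variant keys. As a sanity check, the same grouping can be derived from the keys themselves. The sketch below assumes that a key has the layout `chromosome_start_end_ref_alt`, as the cohort variant table suggests. `classify_variant_key` is a hypothetical helper and not part of GPSEA, and the counts are copied from the variant table.

```python
from collections import Counter


def classify_variant_key(key: str) -> str:
    """Classify a variant key of the assumed form chrom_start_end_ref_alt."""
    _chrom, _start, _end, ref, alt = key.split("_")
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(alt) > len(ref) and alt.startswith(ref):
        return "insertion"
    if len(ref) > len(alt) and ref.startswith(alt):
        return "deletion"
    return "other"


# Counts per variant key, copied from the cohort variant table
variant_counts = {
    "12_120291839_120291839_T_TA": 53,
    "12_120291828_120291828_G_A": 2,
    "12_120291835_120291835_G_A": 2,
    "12_120291835_120291835_G_GT": 1,
    "12_120291839_120291839_T_TC": 1,
    "12_120291839_120291839_T_C": 1,
    "12_120291826_120291826_T_TA": 1,
}

# Sum the counts per variant class
class_totals = Counter()
for key, n in variant_counts.items():
    class_totals[classify_variant_key(key)] += n

print(dict(class_totals))
```

Assuming each individual carries exactly one of these variants, this yields 56 individuals with an insertion and 5 with an SNV, which together account for the 61 individuals in the cohort.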

In [7]:
from gpsea.view import MtcStatsViewer

n64_65insT_result = analysis.compare_genotype_vs_phenotypes(
    cohort=cohort,
    gt_clf=is_n64_65insT_predicate,
    pheno_clfs=pheno_predicates,
)

viewer = MtcStatsViewer()
viewer.process(n64_65insT_result)

Reason,Count
Skip terms if all counts are identical to counts for a child term,11
"Skipping ""general"" level terms",92
Skipping terms that are rare on the cohort level (in less than 40% of the cohort members),255


In [8]:
from gpsea.view import summarize_hpo_analysis

summarize_hpo_analysis(hpo=hpo, result=n64_65insT_result)

Allele group,n.64_65insT,other,Corrected p values,p values
Severe global developmental delay [HP:0011344],33/38 (87%),2/7 (29%),0.305249,0.003469
Moderate global developmental delay [HP:0011343],5/38 (13%),5/7 (71%),0.305249,0.003469
Absent speech [HP:0001344],45/51 (88%),4/8 (50%),1.000000,0.022273
Hypotonia [HP:0001252],46/49 (94%),5/8 (62%),1.000000,0.030777
Tube feeding [HP:0033454],15/47 (32%),0/8 (0%),1.000000,0.090954
...,...,...,...,...
Enlarged cisterna magna [HP:0002280],8/40 (20%),1/6 (17%),1.000000,1.000000
Hypoplasia of the corpus callosum [HP:0002079],8/48 (17%),1/7 (14%),1.000000,1.000000
Blue sclerae [HP:0000592],8/53 (15%),1/8 (12%),1.000000,1.000000
Progressive microcephaly [HP:0000253],9/38 (24%),1/4 (25%),1.000000,1.000000
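
The nominal p values in this table are consistent with a two-sided Fisher's exact test on the 2×2 table of counts. The following is a minimal standard-library sketch, not GPSEA's own implementation. `fisher_exact_two_sided` is a hypothetical helper, and the counts are taken from the Severe global developmental delay row (33/38 versus 2/7).

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at most as likely as the observed one
    # (small relative tolerance guards against floating-point ties)
    return sum(
        p for p in map(prob, range(lo, hi + 1)) if p <= p_observed * (1 + 1e-7)
    )


# Rows: n.64_65insT, other; columns: term present, term absent
p_severe_gdd = fisher_exact_two_sided(33, 5, 2, 5)
print(f"{p_severe_gdd:.6f}")  # 0.003469
```

The Moderate global developmental delay row (5/38 versus 5/7) is the mirror image of the same table, which is why both rows share the same nominal p value.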


In [9]:
insertion_result = analysis.compare_genotype_vs_phenotypes(
    cohort=cohort,
    gt_clf=is_insertion_predicate,
    pheno_clfs=pheno_predicates,
)
summarize_hpo_analysis(hpo=hpo, result=insertion_result)

Allele group,insertion,other,Corrected p values,p values
Moderate global developmental delay [HP:0011343],6/41 (15%),4/4 (100%),0.124031,0.001409
Severe global developmental delay [HP:0011344],35/41 (85%),0/4 (0%),0.124031,0.001409
Absent speech [HP:0001344],47/54 (87%),2/5 (40%),1.000000,0.030294
High forehead [HP:0000348],3/56 (5%),2/5 (40%),1.000000,0.049231
Cavum septum pellucidum [HP:0002389],0/42 (0%),1/3 (33%),1.000000,0.066667
...,...,...,...,...
Hearing impairment [HP:0000365],8/41 (20%),1/4 (25%),1.000000,1.000000
Progressive microcephaly [HP:0000253],9/39 (23%),1/3 (33%),1.000000,1.000000
Enlarged cisterna magna [HP:0002280],9/43 (21%),0/3 (0%),1.000000,1.000000
Cerebral palsy [HP:0100021],9/46 (20%),0/4 (0%),1.000000,1.000000


In [10]:
from gpsea.analysis.clf import sex_classifier
mf_result = analysis.compare_genotype_vs_phenotypes(
    cohort=cohort,
    gt_clf=sex_classifier(),
    pheno_clfs=pheno_predicates,
)

summarize_hpo_analysis(hpo=hpo, result=mf_result)

Sex,FEMALE,MALE,Corrected p values,p values
Reduced circulating growth hormone concentration [HP:0034323],0/11 (0%),11/22 (50%),0.884362,0.005025
Sleep abnormality [HP:0002360],0/27 (0%),5/33 (15%),1.000000,0.058238
Long face [HP:0000276],0/22 (0%),5/28 (18%),1.000000,0.058815
Enlarged cisterna magna [HP:0002280],6/20 (30%),3/26 (12%),1.000000,0.148662
Self-biting [HP:0012169],7/21 (33%),4/27 (15%),1.000000,0.173639
...,...,...,...,...
Cerebral palsy [HP:0100021],4/22 (18%),5/28 (18%),1.000000,1.000000
Blue sclerae [HP:0000592],4/28 (14%),5/33 (15%),1.000000,1.000000
Progressive microcephaly [HP:0000253],5/20 (25%),5/22 (23%),1.000000,1.000000
Blepharophimosis [HP:0000581],8/26 (31%),7/26 (27%),1.000000,1.000000


# Summary

In [11]:
from gpseacs.report import GpseaAnalysisReport, GPAnalysisResultSummary


fet_results = (
    GPAnalysisResultSummary.from_multi(
        result=n64_65insT_result,
    ),
    GPAnalysisResultSummary.from_multi(
        result=insertion_result,
    ),
    GPAnalysisResultSummary.from_multi(
        result=mf_result
    )
)

caption = "No previous statistical analysis of genotype-phenotype correlations for RNU4-2 variants identified in the medical literature."
report = GpseaAnalysisReport(name=gene_symbol, 
                             cohort=cohort, 
                             fet_results=fet_results,
                             gene_symbol=gene_symbol,
                             mane_tx_id=mane_tx_id,
                             mane_protein_id="",
                             caption=caption)

In [12]:
from gpseacs.report import GpseaNotebookSummarizer
summarizer = GpseaNotebookSummarizer(hpo=hpo, gpsea_version=gpsea.__version__)
summarizer.summarize_report(report=report)

Genotype (A),Genotype (B),Tests performed,Significant tests
n.64_65insT,other,176,0

Genotype (A),Genotype (B),Tests performed,Significant tests
insertion,other,176,0

Genotype (A),Genotype (B),Tests performed,Significant tests
FEMALE,MALE,176,0
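
The corrected p values reported in the result tables are numerically consistent with a Benjamini-Hochberg adjustment over the 176 tests listed here. This is an inference from the numbers, not something the notebook states. The sketch below uses a hypothetical `benjamini_hochberg` helper and fills the untabulated p values with the placeholder 1.0, which does not affect the adjusted values of the smallest p values in these examples.

```python
def benjamini_hochberg(pvals: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p value, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


n_tests = 176

# n.64_65insT vs. other: two tied nominal p values of 0.003469;
# the remaining 174 values are placeholders (assumption)
adj_variant = benjamini_hochberg([0.003469, 0.003469] + [1.0] * (n_tests - 2))
print(f"{adj_variant[0]:.3f}")  # about 0.305; the table reports 0.305249

# FEMALE vs. MALE: a single smallest nominal p value of 0.005025
adj_sex = benjamini_hochberg([0.005025] + [1.0] * (n_tests - 1))
print(f"{adj_sex[0]:.3f}")  # about 0.884; the table reports 0.884362
```

The small differences from the reported values come from using nominal p values rounded to six decimals. Under this reading, no adjusted p value falls below 0.05, in line with the zero significant tests in each comparison.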


In [13]:
summarizer.process_latex(report=report)

Output to ../../supplement/tex/RNU4-2_summary_draft.tex
