# RNU4-2
[ReNU syndrome (RENU)](https://omim.org/entry/620851) is caused by heterozygous mutation in the RNU4-2 gene.

RNU4-2 is one of two U4 snRNA genes.

[Chen Y, et al. (2024)](https://pubmed.ncbi.nlm.nih.gov/38991538/) report:
> The n.64_65insT variant is one of six single base insertions that we observe in the 18 bp critical region in individuals with NDD, in a total of 100 individuals across cohorts. By contrast, single base insertions are very rare in population cohorts. Although we do also observe some SNVs in this region in individuals with NDD, our initial data suggest these SNVs may result in a milder phenotype. However, given this observation is based on only four fully phenotyped individuals, it needs to be confirmed in larger cohorts.

In [1]:
import hpotk
import gpsea

store = hpotk.configure_ontology_store()
hpo = store.load_minimal_hpo(release='v2024-08-13')
print(f'Loaded HPO v{hpo.version}')
print(f"Using gpsea version {gpsea.__version__}")

Loaded HPO v2024-08-13
Using gpsea version 0.5.1.dev0


# RNU4-2
We use the [Matched Annotation from NCBI and EMBL-EBI (MANE)](https://www.ncbi.nlm.nih.gov/refseq/MANE/) transcript for RNU4-2. Because RNU4-2 is a non-coding gene, there is no protein identifier.

In [2]:
gene_symbol = 'RNU4-2'
mane_tx_id = 'NR_003137.3'

In [3]:
from ppktstore.registry import configure_phenopacket_registry

phenopacket_store_release = '0.1.21'  # Update, if necessary
registry = configure_phenopacket_registry()

#with registry.open_phenopacket_store(release=phenopacket_store_release) as ps:
#    phenopackets = tuple(ps.iter_cohort_phenopackets(gene_symbol))

#print(f'Loaded {len(phenopackets)} phenopackets')

from gpsea.preprocessing import configure_caching_cohort_creator, load_phenopacket_folder

cohort_name = "RNU4-2"

# Load the phenopackets from a local phenopacket-store checkout.
cohort_creator = configure_caching_cohort_creator(hpo)
pp_dir = '/Users/robin/GIT/phenopacket-store/notebooks/RNU4-2/phenopackets/'
cohort, qc_results = load_phenopacket_folder(pp_dir, cohort_creator)
qc_results.summarize()


Individuals Processed: 61individuals [00:03, 18.51individuals/s]
Validated under permissive policy


In [4]:
from gpsea.view import CohortViewable
viewer = CohortViewable(hpo)
viewer.process(cohort=cohort, transcript_id=mane_tx_id)

HPO Term,ID,Seen in n individuals
Absent speech,HP:0001344,49
Intellectual disability,HP:0001249,48
Delayed ability to walk,HP:0031936,46
Seizure,HP:0001250,45
Hypotonia,HP:0001252,42
Short stature,HP:0004322,38
Failure to thrive,HP:0001508,37
Severe global developmental delay,HP:0011344,35
Constipation,HP:0002019,35
Strabismus,HP:0000486,32

Count,Variant key,Variant Name,Protein Variant,Variant Class
53,12_120291839_120291839_T_TA,12_120291839_120291839_T_TA,,
2,12_120291828_120291828_G_A,12_120291828_120291828_G_A,,
2,12_120291835_120291835_G_A,12_120291835_120291835_G_A,,
1,12_120291835_120291835_G_GT,12_120291835_120291835_G_GT,,
1,12_120291839_120291839_T_TC,12_120291839_120291839_T_TC,,
1,12_120291839_120291839_T_C,12_120291839_120291839_T_C,,
1,12_120291826_120291826_T_TA,12_120291826_120291826_T_TA,,

Disease Name,Disease ID,Annotation Count
ReNU syndrome,OMIM:620851,61

Variant effect,Annotation Count


In [5]:
from gpsea.analysis.predicate.phenotype import prepare_predicates_for_terms_of_interest
from gpsea.analysis.pcats.stats import FisherExactTest
from gpsea.analysis.mtc_filter import HpoMtcFilter
from gpsea.analysis.pcats import HpoTermAnalysis

pheno_predicates = prepare_predicates_for_terms_of_interest(
    cohort=cohort,
    hpo=hpo,
)

mtc_filter = HpoMtcFilter.default_filter(hpo=hpo, term_frequency_threshold=0.4, annotation_frequency_threshold=0.4)
mtc_correction = 'fdr_bh'
statistic = FisherExactTest()

analysis = HpoTermAnalysis(
    count_statistic=statistic,
    mtc_filter=mtc_filter,
    mtc_correction=mtc_correction,
    mtc_alpha=0.05,
)

In [6]:
from gpsea.analysis.predicate.genotype import VariantPredicates, monoallelic_predicate
from gpsea.model import VariantEffect


# Single-base insertions observed in the cohort, selected by variant key.
is_n64_65insT = VariantPredicates.variant_key("12_120291839_120291839_T_TA")  # n.64_65insT
is_b = VariantPredicates.variant_key("12_120291835_120291835_G_GT")
is_c = VariantPredicates.variant_key("12_120291839_120291839_T_TC")
is_d = VariantPredicates.variant_key("12_120291826_120291826_T_TA")

# True for any of the four single-base insertions above.
is_insertion = is_n64_65insT | is_b | is_c | is_d


is_n64_65insT_predicate = monoallelic_predicate(
    a_predicate=is_n64_65insT,
    b_predicate=~is_n64_65insT,
    names=("n.64_65insT", "other")
)

is_insertion_predicate = monoallelic_predicate(
    a_predicate=is_insertion,
    b_predicate=~is_insertion,
    names=("insertion", "other")
)

is_n64_65insT_predicate.display_question()

'Allele group: n.64_65insT, other'
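
The insertion group above is assembled by listing variant keys by hand. As a cross-check, the sketch below derives the same grouping from the keys themselves. It assumes the key layout is `chromosome_start_end_ref_alt`, which is inferred from the keys in the cohort summary rather than taken from the GPSEA documentation, and `classify_variant_key` is a hypothetical helper, not part of GPSEA.

```python
from collections import Counter

# Variant keys and carrier counts copied from the cohort summary table.
variant_counts = {
    "12_120291839_120291839_T_TA": 53,
    "12_120291828_120291828_G_A": 2,
    "12_120291835_120291835_G_A": 2,
    "12_120291835_120291835_G_GT": 1,
    "12_120291839_120291839_T_TC": 1,
    "12_120291839_120291839_T_C": 1,
    "12_120291826_120291826_T_TA": 1,
}


def classify_variant_key(key: str) -> str:
    """Classify a variant by REF/ALT length (assumed layout: chrom_start_end_ref_alt)."""
    _chrom, _start, _end, ref, alt = key.split("_")
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(alt) == len(ref) + 1 and alt.startswith(ref):
        return "single-base insertion"
    return "other"


# Sum the carrier counts per variant class.
tally = Counter()
for key, n_individuals in variant_counts.items():
    tally[classify_variant_key(key)] += n_individuals

print(dict(tally))  # {'single-base insertion': 56, 'SNV': 5}
```

The four keys classified as single-base insertions are the same four combined into `is_insertion`, and the totals (56 insertion carriers, 5 SNV carriers) add up to the 61 individuals in the cohort.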

In [7]:
result = analysis.compare_genotype_vs_phenotypes(
    cohort=cohort,
    gt_predicate=is_n64_65insT_predicate,
    pheno_predicates=pheno_predicates,
)
from gpsea.view import MtcStatsViewer

viewer = MtcStatsViewer()
viewer.process(result)

Code,Reason,Count
HMF01,Skipping term with maximum frequency that was less than threshold 0.4,145
HMF03,Skipping term because of a child term with the same individual counts,11
HMF08,Skipping general term,84
HMF09,Skipping term with maximum annotation frequency that was less than threshold 0.4,176


In [8]:
from gpsea.view import summarize_hpo_analysis

summarize_hpo_analysis(hpo=hpo, result=result)

HPO term,n.64_65insT (count),n.64_65insT (percent),other (count),other (percent),Corrected p value,p value
Severe global developmental delay [HP:0011344],33/38,87%,2/7,29%,0.090187,0.003469
Moderate global developmental delay [HP:0011343],5/38,13%,5/7,71%,0.090187,0.003469
Absent speech [HP:0001344],45/51,88%,4/8,50%,0.386065,0.022273
Hypotonia [HP:0001252],46/49,94%,5/8,62%,0.400098,0.030777
Hearing impairment [HP:0000365],6/38,16%,3/7,43%,1.0,0.130678
Primary microcephaly [HP:0011451],19/34,56%,1/5,20%,1.0,0.181764
Seizure [HP:0001250],37/52,71%,8/8,100%,1.0,0.18244
Constipation [HP:0002019],32/48,67%,3/7,43%,1.0,0.241929
Delayed ability to walk [HP:0031936],40/41,98%,6/7,86%,1.0,0.27305
Strabismus [HP:0000486],29/49,59%,3/7,43%,1.0,0.446515
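
The nominal p values in this kind of table can be checked by hand from the reported counts. The following is a minimal pure-Python sketch of a two-sided Fisher exact test, which sums the probabilities of all tables that are no more likely than the observed one. It is an independent check rather than GPSEA's own implementation, and `fisher_exact_two_sided` is a hypothetical helper name.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def weight(x: int) -> int:
        # Unnormalized hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = weight(a)
    # Integer weights make the "no more likely than observed" comparison exact.
    extreme = sum(w for w in map(weight, range(lowest, highest + 1)) if w <= observed)
    return extreme / comb(total, col1)


# Severe global developmental delay: 33/38 (n.64_65insT) vs 2/7 (other).
p_insT = fisher_exact_two_sided(33, 38 - 33, 2, 7 - 2)
print(f"{p_insT:.6f}")  # 0.003469

# Moderate global developmental delay: 6/41 (insertion) vs 4/4 (other).
p_insertion = fisher_exact_two_sided(6, 41 - 6, 4, 4 - 4)
print(f"{p_insertion:.6f}")  # 0.001409
```

Both values agree with the nominal p values reported for these terms in the corresponding comparisons.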


In [9]:
result = analysis.compare_genotype_vs_phenotypes(
    cohort=cohort,
    gt_predicate=is_insertion_predicate,
    pheno_predicates=pheno_predicates,
)
summarize_hpo_analysis(hpo=hpo, result=result)

HPO term,insertion (count),insertion (percent),other (count),other (percent),Corrected p value,p value
Moderate global developmental delay [HP:0011343],6/41,15%,4/4,100%,0.036646,0.001409
Severe global developmental delay [HP:0011344],35/41,85%,0/4,0%,0.036646,0.001409
Absent speech [HP:0001344],47/54,87%,2/5,40%,0.525091,0.030294
High forehead [HP:0000348],3/56,5%,2/5,40%,0.64,0.049231
Diabetes insipidus [HP:0000873],9/48,19%,3/5,60%,0.6469,0.070212
Hypotonia [HP:0001252],48/52,92%,3/5,60%,0.6469,0.080878
Failure to thrive [HP:0001508],36/50,72%,1/4,25%,0.6469,0.087083
Constipation [HP:0002019],34/51,67%,1/4,25%,0.852773,0.131196
Delayed ability to walk [HP:0031936],43/44,98%,3/4,75%,0.93223,0.161348
Short stature [HP:0004322],36/46,78%,2/4,50%,1.0,0.239917
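
The corrected p values come from the `fdr_bh` setting chosen when the analysis was configured. Assuming that name denotes the standard Benjamini-Hochberg step-up procedure, the sketch below shows how the adjustment works. The input p values are illustrative only and are not taken from this cohort, because the tables list just the top ten of all tested terms; `benjamini_hochberg` is a hypothetical helper name.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original input order."""
    n = len(p_values)
    # Indices of the p values, sorted from smallest to largest p value.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity with a running minimum.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Illustrative p values (not from the RNU4-2 cohort).
illustrative = [0.01, 0.04, 0.03, 0.005]
print([round(p, 4) for p in benjamini_hochberg(illustrative)])  # [0.02, 0.04, 0.04, 0.02]
```

Because of the running minimum, tied nominal p values always receive the same corrected value, which is the pattern seen for the moderate and severe global developmental delay rows.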


In [10]:
from gpsea.analysis.predicate.genotype import sex_predicate
from gpsea.view import summarize_hpo_analysis

result = analysis.compare_genotype_vs_phenotypes(
    cohort=cohort,
    gt_predicate=sex_predicate(),
    pheno_predicates=pheno_predicates,
)
summary_df = summarize_hpo_analysis(hpo, result)
summary_df

HPO term,FEMALE (count),FEMALE (percent),MALE (count),MALE (percent),Corrected p value,p value
Reduced circulating growth hormone concentration [HP:0034323],0/11,0%,11/22,50%,0.256264,0.005025
Feeding difficulties [HP:0011968],16/18,89%,22/22,100%,1.0,0.196154
Gastroesophageal reflux [HP:0002020],9/24,38%,17/31,55%,1.0,0.277773
Delayed gross motor development [HP:0002194],19/20,95%,27/27,100%,1.0,0.425532
Short stature [HP:0004322],18/22,82%,20/28,71%,1.0,0.511611
Intrauterine growth retardation [HP:0001511],8/17,47%,9/27,33%,1.0,0.525863
Hypotonia [HP:0001252],24/26,92%,27/31,87%,1.0,0.67794
Facial hypotonia [HP:0000297],4/10,40%,5/16,31%,1.0,0.692449
Severe global developmental delay [HP:0011344],15/20,75%,20/25,80%,1.0,0.731035
Absent speech [HP:0001344],23/27,85%,26/32,81%,1.0,0.741246
