# Retinal Degeneration Associated With RPGRIP1


Data from [Beryozkin A, et al. Retinal Degeneration Associated With RPGRIP1: A Review of Natural History, Mutation Spectrum, and Genotype-Phenotype Correlation in 228 Patients](https://pubmed.ncbi.nlm.nih.gov/34722527)

In [1]:
import genophenocorr

print(f"Using genophenocorr version {genophenocorr.__version__}")

Using genophenocorr version 0.1.1dev


## Setup

### Load HPO

We use the HPO release `v2023-10-09` for this analysis.

In [2]:
import hpotk

store = hpotk.configure_ontology_store()
hpo = store.load_minimal_hpo(release='v2023-10-09')
print(f'Loaded HPO v{hpo.version}')

Loaded HPO v2023-10-09


### Load Phenopackets

We load the phenopacket JSON files from the `phenopackets` folder located next to the notebook.

In [3]:
from genophenocorr.preprocessing import configure_caching_cohort_creator, load_phenopacket_folder

fpath_phenopackets = 'phenopackets'
cohort_creator = configure_caching_cohort_creator(hpo)
cohort = load_phenopacket_folder(fpath_phenopackets, cohort_creator)

Patients Created:   1%|          | 2/229 [00:02<04:00,  1.06s/it]Expected a result but got an Error for variant: 14_21320155_21320155_T_A
{"error":"Could not connect to database homo_sapiens_core_111_38 as user ensro using [DBI:mysql:database=homo_sapiens_core_111_38;host=fb1-mysql-ens-rest-web.ebi.ac.uk;port=4571] as a locator:DBI connect('database=homo_sapiens_core_111_38;host=fb1-mysql-ens-rest-web.ebi.ac.uk;port=4571','ensro',...) failed: Can't connect to MySQL server on 'fb1-mysql-ens-rest-web.ebi.ac.uk' (111) at /nfs/public/ro/ensweb/live/rest/www_111/ensembl/modules/Bio/EnsEMBL/DBSQL/DBConnection.pm line 260."}
Patients Created:   9%|▉         | 21/229 [00:14<02:12,  1.57it/s]Expected a result but got an Error for variant: 14_21348303_21348303_G_A
{"error":"Could not connect to database homo_sapiens_core_111_38 as user ensro using [DBI:mysql:database=homo_sapiens_core_111_38;host=fb1-mysql-ens-rest-web.ebi.ac.uk;port=4571] as a locator:DBI connect('database=homo_sapiens_core_111

### Pick transcript

We choose the [MANE Select](https://www.ncbi.nlm.nih.gov/nuccore/NM_020366.4) transcript for *RPGRIP1*.

In [4]:
tx_id = 'NM_020366.4'

## Explore cohort

Explore the cohort to guide the choice of genotype-phenotype analyses.


In [5]:
cohort.list_all_variants()

[('14_21312457_21312458_GA_G', 25),
 ('14_21325943_21325943_G_T', 12),
 ('14_21302530_21302531_AG_A', 8),
 ('14_21345145_21345145_C_T', 8),
 ('14_21325252_21325252_G_A', 7),
 ('14_21345139_21345146_CAAGGCCG_C', 7),
 ('14_21327671_21327671_A_AT', 6),
 ('14_21325265_21325265_A_G', 5),
 ('14_21303542_21303542_C_T', 5),
 ('14_21317724_21317724_C_T', 5),
 ('14_21326131_21326131_C_T', 4),
 ('14_21327800_21327801_CT_C', 4),
 ('14_21348210_21348214_AAAAG_A', 4),
 ('14_21326544_21327883_ATTTTTAGTAGAGATGGGATTTCTCCATGTTGGTCAGGCTGGTCTTCAACTCCCGACCTCAGGTGAACCTCCCACCTGAGCCTCCCAAAGTGCTGGGATTACAGACGTGAGCCACCGCGCCTGGCTGAACAAACTTTTTCAAGCTCTGTAATGCTGTCTAGTATCTGTCTTTACTAAAGGCCTGTTGTTTCTTAGTGCATGACTACATAGATATCTGATTATAAACTGAGACCTTAACACTCCCCCATCATTCTCTCACTTCTTTTAAACACTGGACACAAGTTAGAGAGATTTCCACACCAGATCATGACAAACACAAATTTCTTGGATTTTTTTTTTCCTCCCAATGTGGAGCTGAGCTCCATACTGTCTTTCCTAACTTTTATACCTAGGATTGTGGGGGTGTACCAAGAGGGGTCAACTCTTTGACTACAGTCCTGGGAGGGTGAGGTGGGGGTATCCATGTTTTCCTTAGGAAGTGGGGATAGCTGCAGTCAGAAACAACCATATTTAACAA
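The complete keys listed above appear to follow a `chromosome_start_end_ref_alt` layout with 1-based inclusive coordinates: in each of them, `end - start + 1` equals the length of the reference allele. This layout is inferred from the output, not taken from the library documentation. A minimal sketch for splitting such keys, using the hypothetical helpers `parse_variant_key` and `classify`:

```python
from typing import NamedTuple


class ParsedKey(NamedTuple):
    chrom: str
    start: int
    end: int
    ref: str
    alt: str


def parse_variant_key(key: str) -> ParsedKey:
    """Split a five-field key such as '14_21312457_21312458_GA_G'.

    Hypothetical helper: assumes the chrom_start_end_ref_alt layout
    inferred from the output and rejects any other layout.
    """
    fields = key.split('_')
    if len(fields) != 5:
        raise ValueError(f'Unexpected key layout: {key!r}')
    chrom, start, end, ref, alt = fields
    return ParsedKey(chrom, int(start), int(end), ref, alt)


def classify(parsed: ParsedKey) -> str:
    """Coarse variant class based only on the allele lengths."""
    if len(parsed.ref) == len(parsed.alt) == 1:
        return 'SNV'
    if len(parsed.ref) > len(parsed.alt):
        return 'deletion'
    if len(parsed.ref) < len(parsed.alt):
        return 'insertion'
    return 'MNV'  # equal allele lengths greater than one


for key in ('14_21312457_21312458_GA_G',
            '14_21325943_21325943_G_T',
            '14_21327671_21327671_A_AT'):
    print(key, classify(parse_variant_key(key)))
```

The classification is deliberately coarse and only looks at allele lengths; it is meant for a quick overview of the variant list, not as a replacement for the functional annotation.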

In [6]:
cohort.list_data_by_tx()

{'NM_001377950.1': Counter({'INTRON_VARIANT': 83,
          'FRAMESHIFT_VARIANT': 35,
          'STOP_GAINED': 34,
          'SPLICE_ACCEPTOR_VARIANT': 10,
          'SPLICE_REGION_VARIANT': 8,
          'SPLICE_DONOR_VARIANT': 7,
          'MISSENSE_VARIANT': 5,
          'CODING_SEQUENCE_VARIANT': 5,
          'SPLICE_DONOR_5TH_BASE_VARIANT': 4,
          'INFRAME_DELETION': 1,
          'SPLICE_POLYPYRIMIDINE_TRACT_VARIANT': 1}),
 'NM_001377523.1': Counter({'INTRON_VARIANT': 82,
          'FRAMESHIFT_VARIANT': 35,
          'STOP_GAINED': 34,
          'SPLICE_ACCEPTOR_VARIANT': 10,
          'SPLICE_REGION_VARIANT': 8,
          'SPLICE_DONOR_VARIANT': 7,
          'MISSENSE_VARIANT': 5,
          'CODING_SEQUENCE_VARIANT': 5,
          'SPLICE_DONOR_5TH_BASE_VARIANT': 4,
          'INFRAME_DELETION': 1,
          'SPLICE_POLYPYRIMIDINE_TRACT_VARIANT': 1}),
 'NM_020366.4': Counter({'FRAMESHIFT_VARIANT': 91,
          'STOP_GAINED': 78,
          'MISSENSE_VARIANT': 47,
          'S
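To compare how the same variants are annotated on different transcripts, the counters can be turned into proportions. The sketch below copies the counts shown for `NM_001377950.1` into a plain `collections.Counter`; the entry for `NM_020366.4` is truncated in the output, so it is not reproduced. `effect_shares` is a hypothetical helper, not part of genophenocorr.

```python
from collections import Counter

# Counts copied from the output for NM_001377950.1.
nm_001377950 = Counter({
    'INTRON_VARIANT': 83,
    'FRAMESHIFT_VARIANT': 35,
    'STOP_GAINED': 34,
    'SPLICE_ACCEPTOR_VARIANT': 10,
    'SPLICE_REGION_VARIANT': 8,
    'SPLICE_DONOR_VARIANT': 7,
    'MISSENSE_VARIANT': 5,
    'CODING_SEQUENCE_VARIANT': 5,
    'SPLICE_DONOR_5TH_BASE_VARIANT': 4,
    'INFRAME_DELETION': 1,
    'SPLICE_POLYPYRIMIDINE_TRACT_VARIANT': 1,
})


def effect_shares(counts: Counter) -> dict:
    """Return the share of each variant effect, most frequent first."""
    total = sum(counts.values())
    return {effect: n / total for effect, n in counts.most_common()}


for effect, share in effect_shares(nm_001377950).items():
    print(f'{effect:40s}{share:6.1%}')
```

On this transcript, intron annotations form the largest single category, whereas frameshift variants lead the visible part of the `NM_020366.4` counter.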

## Configure the analysis

In [7]:
from genophenocorr.analysis import configure_cohort_analysis

analysis = configure_cohort_analysis(cohort, hpo)

## Run the analyses

Test for genotype-phenotype correlations between subjects with a missense variant and all other subjects.

In [8]:
from genophenocorr.model import VariantEffect
from genophenocorr.analysis.predicate import PatientCategories

by_missense = analysis.compare_by_variant_effect(VariantEffect.MISSENSE_VARIANT, tx_id=tx_id)
by_missense.summarize(hpo, PatientCategories.YES)

Comparison: `MISSENSE_VARIANT` on `NM_020366.4`

| HPO term | Yes: Count | Yes: Percent | No: Count | No: Percent | p value | Corrected p value |
|---|---|---|---|---|---|---|
| Eye poking [HP:0001483] | 3/5 | 60% | 24/33 | 73% | 0.615394 | 1.0 |
| Phenotypic abnormality [HP:0000118] | 23/23 | 100% | 96/96 | 100% | 1.0 | 1.0 |
| Myopia [HP:0000545] | 0/0 | 0% | 4/4 | 100% | 1.0 | 1.0 |
| Reduced visual acuity [HP:0007663] | 23/23 | 100% | 89/89 | 100% | 1.0 | 1.0 |
| Visual field defect [HP:0001123] | 1/1 | 100% | 9/9 | 100% | 1.0 | 1.0 |
| Moderate hypermetropia [HP:0031729] | 3/3 | 100% | 9/9 | 100% | 1.0 | 1.0 |
| Self-injurious behavior [HP:0100716] | 3/3 | 100% | 24/24 | 100% | 1.0 | 1.0 |
| Abnormal eye physiology [HP:0012373] | 23/23 | 100% | 94/94 | 100% | 1.0 | 1.0 |
| Restricted or repetitive behaviors or interests [HP:0031432] | 3/3 | 100% | 24/24 | 100% | 1.0 | 1.0 |
| Mild hypermetropia [HP:0031728] | 1/1 | 100% | 1/1 | 100% | 1.0 | 1.0 |
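The p values reported here are consistent with a two-sided Fisher's exact test on the 2x2 table of genotype group vs. phenotype presence. This is an inference from the numbers, since the notebook does not name the test. A minimal pure-Python sketch that recomputes the *Eye poking* row of the missense comparison; the helper `fisher_exact_two_sided` is hypothetical and not part of genophenocorr:

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1 = a + b           # subjects in the first genotype group
    col1 = a + c           # subjects with the phenotype present
    n = a + b + c + d      # all subjects with a recorded status
    denom = comb(n, row1)

    def weight(k: int) -> int:
        # Number of tables with k phenotype-positive subjects in row 1.
        return comb(col1, k) * comb(n - col1, row1 - k)

    k_min = max(0, row1 - (n - col1))
    k_max = min(row1, col1)
    observed = weight(a)
    # Integer comparison avoids floating-point ties between tables.
    extreme = sum(w for w in map(weight, range(k_min, k_max + 1))
                  if w <= observed)
    return extreme / denom


# Eye poking: 3 of 5 subjects with a missense variant vs. 24 of 33 without.
p = fisher_exact_two_sided(3, 2, 24, 9)
print(f'{p:.6f}')  # 0.615394
```

The same function reproduces the nominal p values of the *Eye poking* rows in the frameshift comparison (22/26 vs. 5/12) and the per-variant comparison (16/16 vs. 11/22).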


Test for genotype-phenotype correlations between subjects with a frameshift variant and all other subjects.

In [9]:
by_frameshift = analysis.compare_by_variant_effect(VariantEffect.FRAMESHIFT_VARIANT, tx_id=tx_id)
by_frameshift.summarize(hpo, PatientCategories.YES)

Comparison: `FRAMESHIFT_VARIANT` on `NM_020366.4`

| HPO term | Yes: Count | Yes: Percent | No: Count | No: Percent | p value | Corrected p value |
|---|---|---|---|---|---|---|
| Eye poking [HP:0001483] | 22/26 | 85% | 5/12 | 42% | 0.017391 | 0.643461 |
| Phenotypic abnormality [HP:0000118] | 51/51 | 100% | 68/68 | 100% | 1.0 | 1.0 |
| Myopia [HP:0000545] | 4/4 | 100% | 0/0 | 0% | 1.0 | 1.0 |
| Reduced visual acuity [HP:0007663] | 47/47 | 100% | 65/65 | 100% | 1.0 | 1.0 |
| Visual field defect [HP:0001123] | 5/5 | 100% | 5/5 | 100% | 1.0 | 1.0 |
| Moderate hypermetropia [HP:0031729] | 7/7 | 100% | 5/5 | 100% | 1.0 | 1.0 |
| Self-injurious behavior [HP:0100716] | 22/22 | 100% | 5/5 | 100% | 1.0 | 1.0 |
| Abnormal eye physiology [HP:0012373] | 51/51 | 100% | 66/66 | 100% | 1.0 | 1.0 |
| Restricted or repetitive behaviors or interests [HP:0031432] | 22/22 | 100% | 5/5 | 100% | 1.0 | 1.0 |
| Mild hypermetropia [HP:0031728] | 1/1 | 100% | 1/1 | 100% | 1.0 | 1.0 |
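In the frameshift comparison, the corrected p value of the *Eye poking* row is about 37 times the nominal one, and every other row is capped at 1.0. That pattern is consistent with a Bonferroni correction over 37 tested HPO terms, although the notebook states neither the correction method nor the number of tests. Both are assumptions in the sketch below, and `bonferroni` is a hypothetical helper:

```python
def bonferroni(p_values, n_tests=None):
    """Multiply each p value by the number of tests and cap at 1.0."""
    if n_tests is None:
        n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Nominal p values as displayed in the frameshift comparison (rounded).
# n_tests=37 is inferred from the output, not documented anywhere.
nominal = [0.017391, 1.0]
for p, p_adj in zip(nominal, bonferroni(nominal, n_tests=37)):
    print(f'{p:.6f} -> {p_adj:.6f}')
```

Because the input is the nominal value rounded to six decimals, the recomputed corrected value can differ from the displayed 0.643461 in the last digits.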


Alternatively, compare subjects carrying at least one allele of a given variant against all other subjects:

In [10]:
variant_key = '14_21312457_21312458_GA_G'

by_var = analysis.compare_by_variant_key(variant_key)
by_var.summarize(hpo, PatientCategories.YES)

Comparison: >=1 allele of the variant `14_21312457_21312458_GA_G`

| HPO term | Yes: Count | Yes: Percent | No: Count | No: Percent | p value | Corrected p value |
|---|---|---|---|---|---|---|
| Eye poking [HP:0001483] | 16/16 | 100% | 11/22 | 50% | 0.000736 | 0.027242 |
| Phenotypic abnormality [HP:0000118] | 17/17 | 100% | 102/102 | 100% | 1.0 | 1.0 |
| Myopia [HP:0000545] | 0/0 | 0% | 4/4 | 100% | 1.0 | 1.0 |
| Reduced visual acuity [HP:0007663] | 17/17 | 100% | 95/95 | 100% | 1.0 | 1.0 |
| Visual field defect [HP:0001123] | 0/0 | 0% | 10/10 | 100% | 1.0 | 1.0 |
| Moderate hypermetropia [HP:0031729] | 0/0 | 0% | 12/12 | 100% | 1.0 | 1.0 |
| Self-injurious behavior [HP:0100716] | 16/16 | 100% | 11/11 | 100% | 1.0 | 1.0 |
| Abnormal eye physiology [HP:0012373] | 17/17 | 100% | 100/100 | 100% | 1.0 | 1.0 |
| Restricted or repetitive behaviors or interests [HP:0031432] | 16/16 | 100% | 11/11 | 100% | 1.0 | 1.0 |
| Mild hypermetropia [HP:0031728] | 0/0 | 0% | 2/2 | 100% | 1.0 | 1.0 |


TODO - finalize!