# SUOX


Sulfite Oxidase Deficiency

Data from [Li JT, et al. Mutation analysis of SUOX in isolated sulfite oxidase deficiency with ectopia lentis as the presenting feature: insights into genotype-phenotype correlation](https://pubmed.ncbi.nlm.nih.gov/36303223/)

In [1]:
import genophenocorr

print(f"Using genophenocorr version {genophenocorr.__version__}")

Using genophenocorr version 0.1.1dev


## Setup

### Load HPO

We use the HPO `v2023-10-09` release for this analysis.

In [2]:
import hpotk

store = hpotk.configure_ontology_store()
hpo = store.load_minimal_hpo(release='v2023-10-09')
print(f'Loaded HPO v{hpo.version}')

Loaded HPO v2023-10-09


### Load Phenopackets

We load the phenopacket JSON files from the `phenopackets` folder located next to the notebook.

In [3]:
from genophenocorr.preprocessing import configure_caching_cohort_creator, load_phenopacket_folder

fpath_phenopackets = 'phenopackets'
cohort_creator = configure_caching_cohort_creator(hpo)
cohort = load_phenopacket_folder(fpath_phenopackets, cohort_creator)

Patients Created: 100%|██████████| 35/35 [00:00<00:00, 643.59it/s]
Validated under none policy
35 phenopacket(s) found at `phenopackets`
  patient #0
    phenotype-features
     ·No diseases found.
  patient #1
    phenotype-features
     ·No diseases found.
  patient #2
    phenotype-features
     ·No diseases found.
  patient #3
    phenotype-features
     ·No diseases found.
  patient #4
    phenotype-features
     ·No diseases found.
  patient #5
    phenotype-features
     ·No diseases found.
  patient #6
    phenotype-features
     ·No diseases found.
  patient #7
    phenotype-features
     ·No diseases found.
  patient #8
    phenotype-features
     ·No diseases found.
  patient #9
    phenotype-features
     ·No diseases found.
  patient #10
    phenotype-features
     ·No diseases found.
  patient #11
    phenotype-features
     ·No diseases found.
  patient #12
    phenotype-features
     ·No diseases found.
  patient #13
    phenotype-features
     ·No diseases found.
  ... (output truncated)
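
As a quick sanity check that does not rely on genophenocorr, the phenopacket JSON files can also be read directly. The sketch below assumes the files follow the GA4GH Phenopacket Schema JSON layout (a `phenotypicFeatures` list whose items carry `type.id` and an optional `excluded` flag). The helper `count_observed_terms` is hypothetical and exists only for this illustration.

```python
import json
from collections import Counter
from pathlib import Path


def count_observed_terms(folder: str) -> Counter:
    """Count observed (non-excluded) HPO term ids across phenopacket JSON files.

    Assumes one phenopacket per `*.json` file in `folder`.
    """
    counts = Counter()
    for path in sorted(Path(folder).glob('*.json')):
        with path.open() as fh:
            packet = json.load(fh)
        for feature in packet.get('phenotypicFeatures', []):
            # Features marked as excluded were explicitly ruled out; skip them.
            if not feature.get('excluded', False):
                counts[feature['type']['id']] += 1
    return counts
```

Calling `count_observed_terms('phenopackets').most_common(5)` would list the most frequently observed term ids. The tallies need not match the cohort summaries produced later, since the library may process the annotations differently.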

### Pick transcript

We choose the [MANE Select](https://www.ncbi.nlm.nih.gov/nuccore/NM_001032386.2) transcript for *SUOX*.

In [4]:
tx_id = 'NM_001032386.2'

## Explore cohort

Explore the cohort to guide the choice of genotype-phenotype analyses.

In [5]:
from IPython.display import HTML, display
from genophenocorr.view import CohortViewer

viewer = CohortViewer(hpo)

In [6]:
cohort.list_all_variants(10)

[('12_56004589_56004589_C_G', 7),
 ('12_56004039_56004039_G_A', 3),
 ('12_56004485_56004485_C_T', 3),
 ('12_56004765_56004765_G_A', 3),
 ('12_56004771_56004771_A_T', 2),
 ('12_56004933_56004933_A_ACAATGTGCAGCCAGACACCGTGGCCC', 2),
 ('12_56004905_56004909_ATTGT_A', 2),
 ('12_56004273_56004273_G_A', 2),
 ('12_56004192_56004192_G_A', 1),
 ('12_56004486_56004486_G_A', 1)]
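
The variant keys above appear to follow a `chromosome_start_end_ref_alt` pattern. This is inferred from the output, not taken from the genophenocorr documentation. Under that assumption, a minimal sketch with the hypothetical helpers `parse_variant_key` and `variant_class` splits a key into its fields and labels the variant by comparing allele lengths:

```python
from typing import NamedTuple


class VariantKey(NamedTuple):
    chrom: str
    start: int
    end: int
    ref: str
    alt: str


def parse_variant_key(key: str) -> VariantKey:
    """Split a key such as '12_56004589_56004589_C_G' into its fields.

    Assumes the layout chrom_start_end_ref_alt seen in the output above.
    """
    chrom, start, end, ref, alt = key.split('_')
    return VariantKey(chrom, int(start), int(end), ref, alt)


def variant_class(v: VariantKey) -> str:
    """Label a variant by comparing the lengths of its alleles."""
    if len(v.ref) == len(v.alt) == 1:
        return 'SNV'
    if len(v.ref) < len(v.alt):
        return 'insertion'
    if len(v.ref) > len(v.alt):
        return 'deletion'
    return 'MNV'  # equal-length substitution of more than one base


for key in ('12_56004589_56004589_C_G', '12_56004905_56004909_ATTGT_A'):
    print(key, variant_class(parse_variant_key(key)))
```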

In [7]:
cohort.list_all_phenotypes()

[('HP:0001250', 28),
 ('HP:0001252', 15),
 ('HP:0032350', 13),
 ('HP:0002071', 11),
 ('HP:0001276', 11),
 ('HP:0000252', 10),
 ('HP:0012758', 8),
 ('HP:0001083', 7),
 ('HP:0003537', 7),
 ('HP:0500152', 7),
 ('HP:0034332', 6),
 ('HP:0003166', 5),
 ('HP:0011935', 2),
 ('HP:0034745', 2),
 ('HP:0010934', 2),
 ('HP:0011814', 1),
 ('HP:0500181', 1)]

In [8]:
cohort.list_data_by_tx()

{'NM_001032386.2': Counter({'MISSENSE_VARIANT': 29,
          'STOP_GAINED': 10,
          'FRAMESHIFT_VARIANT': 9}),
 'NM_001032387.2': Counter({'MISSENSE_VARIANT': 29,
          'STOP_GAINED': 10,
          'FRAMESHIFT_VARIANT': 9}),
 'NM_000456.3': Counter({'MISSENSE_VARIANT': 29,
          'STOP_GAINED': 10,
          'FRAMESHIFT_VARIANT': 9})}
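
All three transcripts report identical counts, so the variant-effect breakdown does not depend on the transcript chosen here. As a rough illustration, the counts for `NM_001032386.2` (copied from the output above) can be turned into proportions with the standard library. The variable names are used only for this sketch.

```python
from collections import Counter

# Counts copied from the list_data_by_tx() output for NM_001032386.2.
effects = Counter({
    'MISSENSE_VARIANT': 29,
    'STOP_GAINED': 10,
    'FRAMESHIFT_VARIANT': 9,
})

total = sum(effects.values())
for effect, count in effects.most_common():
    print(f'{effect}: {count}/{total} ({count / total:.0%})')
```

Missense annotations make up the majority (29 of 48), which motivates comparing missense against the remaining variant effects.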

In [9]:
len(cohort.list_all_patients())

35

## Configure the analysis

In [10]:
from genophenocorr.analysis import configure_cohort_analysis

analysis = configure_cohort_analysis(cohort, hpo)

## Run the analyses

Test for genotype-phenotype correlations between subjects with a missense variant and all other subjects.

In [11]:
from genophenocorr.model import VariantEffect
from genophenocorr.analysis.predicate import PatientCategories

missense = analysis.compare_by_variant_effect(VariantEffect.MISSENSE_VARIANT, tx_id)
missense.summarize(hpo, PatientCategories.YES)

| MISSENSE_VARIANT on NM_001032386.2 | Yes (count) | Yes (percent) | No (count) | No (percent) | p value | Corrected p value |
|---|---|---|---|---|---|---|
| Seizure [HP:0001250] | 17/24 | 71% | 11/11 | 100% | 0.072129 | 1.0 |
| Hypouricemia [HP:0003537] | 3/10 | 30% | 4/5 | 80% | 0.118881 | 1.0 |
| Cognitive regression [HP:0034332] | 6/17 | 35% | 0/8 | 0% | 0.129170 | 1.0 |
| Increased urinary taurine [HP:0003166] | 5/5 | 100% | 0/1 | 0% | 0.166667 | 1.0 |
| Hypotonia [HP:0001252] | 12/16 | 75% | 3/7 | 43% | 0.181896 | 1.0 |
| ... | ... | ... | ... | ... | ... | ... |
| Xanthinuria [HP:0010934] | 2/9 | 22% | 0/2 | 0% | 1.000000 | 1.0 |
| Abnormality of head or neck [HP:0000152] | 6/6 | 100% | 4/4 | 100% | 1.000000 | 1.0 |
| Abnormal muscle physiology [HP:0011804] | 14/14 | 100% | 5/5 | 100% | 1.000000 | 1.0 |
| Abnormality of the urinary system [HP:0000079] | 12/12 | 100% | 2/2 | 100% | 1.000000 | 1.0 |
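
The p values in the table above are consistent with a two-sided Fisher's exact test on each 2x2 table of genotype group by phenotype status. The notebook does not name the test, so treat this as an inference from the numbers. A standard-library sketch (the helper `fisher_exact_two_sided` is hypothetical and not part of genophenocorr) reproduces the Seizure row:

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c

    def weight(k: int) -> int:
        # Unnormalized hypergeometric weight of a table with top-left cell k.
        return comb(row1, k) * comb(row2, col1 - k)

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    observed = weight(a)
    # Integer weights avoid floating-point ties when comparing probabilities.
    extreme = sum(w for w in map(weight, range(k_min, k_max + 1)) if w <= observed)
    return extreme / comb(row1 + row2, col1)


# Seizure: 17 of 24 missense subjects vs. 11 of 11 other subjects.
print(round(fisher_exact_two_sided(17, 7, 11, 0), 6))  # 0.072129
```

The same function gives 0.118881 for the Hypouricemia row (3/10 vs. 4/5), matching the reported value.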


Test for genotype-phenotype correlations between subjects carrying at least one allele of `12_56004589_56004589_C_G`, the most frequently listed variant in the cohort, and all other subjects.

In [12]:
by_variant = analysis.compare_by_variant_key('12_56004589_56004589_C_G')
by_variant.summarize(hpo, PatientCategories.YES)

| >=1 allele of the variant 12_56004589_56004589_C_G | Yes (count) | Yes (percent) | No (count) | No (percent) | p value | Corrected p value |
|---|---|---|---|---|---|---|
| Hypotonia [HP:0001252] | 1/5 | 20% | 14/18 | 78% | 0.032869 | 1.0 |
| Neurodevelopmental delay [HP:0012758] | 0/6 | 0% | 8/19 | 42% | 0.129170 | 1.0 |
| Abnormality of extrapyramidal motor function [HP:0002071] | 1/6 | 17% | 10/19 | 53% | 0.180435 | 1.0 |
| Cognitive regression [HP:0034332] | 0/6 | 0% | 6/19 | 32% | 0.277764 | 1.0 |
| Ectopia lentis [HP:0001083] | 2/3 | 67% | 5/15 | 33% | 0.528186 | 1.0 |
| ... | ... | ... | ... | ... | ... | ... |
| Xanthinuria [HP:0010934] | 0/1 | 0% | 2/10 | 20% | 1.000000 | 1.0 |
| Abnormality of head or neck [HP:0000152] | 2/2 | 100% | 8/8 | 100% | 1.000000 | 1.0 |
| Abnormal muscle physiology [HP:0011804] | 2/2 | 100% | 17/17 | 100% | 1.000000 | 1.0 |
| Abnormality of the urinary system [HP:0000079] | 2/2 | 100% | 12/12 | 100% | 1.000000 | 1.0 |
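
Hypotonia is the only term with a nominal p value below 0.05 in this comparison, yet its corrected p value is 1.0. The notebook does not state which multiple-testing correction was applied. Assuming a Bonferroni-style correction (an assumption, not confirmed by the source), the corrected value is the nominal p value multiplied by the number of tests and capped at 1. The sketch below uses a hypothetical helper `bonferroni` and hypothetical test counts to show how quickly a p value of 0.032869 saturates:

```python
def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted p value: p * n_tests, capped at 1.0."""
    return min(1.0, p_value * n_tests)


p_hypotonia = 0.032869  # nominal p value from the table above

# Hypothetical numbers of tested HPO terms; the real count is not shown.
for n_tests in (10, 30, 31):
    print(n_tests, bonferroni(p_hypotonia, n_tests))
```

Under this assumption, the corrected value reaches 1.0 as soon as 31 or more terms are tested, so a nominally significant result in a small cohort can easily fail to survive correction.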


TODO - finalize!