# NRXN1 Structural Variant Analysis - Getting Started

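This notebook introduces the NRXN1 structural variant (SV) analysis pipeline. It walks through the gene structure, basic variant handling and filtering, example CNV calls, and the plotting helpers.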

## Overview

NRXN1 (Neurexin 1) is one of the largest genes in the human genome (~1.1 Mb) and encodes a critical synaptic adhesion molecule. Copy number variants (CNVs) in NRXN1 are strongly associated with:
- Autism Spectrum Disorder
- Schizophrenia
- Intellectual Disability
- Other neurodevelopmental conditions

In [None]:
import sys
# Add the parent directory to the import path so the `src` package resolves
sys.path.insert(0, '..')

import pandas as pd
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots

from src.genomics.regions import NRXN1Region, GenomicRegion
from src.genomics.variants import Variant, VariantLoader, VariantFilter
from src.cnv.detector import CNVDetector, CNVCall, CNVType
from src.visualization.plots import GeneViewer, CNVPlotter

## 1. Exploring the NRXN1 Gene Structure

In [None]:
nrxn1 = NRXN1Region()

gene_size = nrxn1.END - nrxn1.START  # gene span in base pairs

print(f"NRXN1 Location: chr{nrxn1.CHROMOSOME}:{nrxn1.START:,}-{nrxn1.END:,}")
print(f"Gene Size: {gene_size:,} bp ({gene_size / 1e6:.2f} Mb)")
print(f"Number of Alpha Exons: {len(nrxn1.exons_alpha)}")
print(f"Number of Beta Exons: {len(nrxn1.exons_beta)}")
print(f"Functional Domains: {len(nrxn1.domains)}")

In [None]:
exon_df = nrxn1.to_dataframe()
exon_df.head(10)

## 2. Visualizing Gene Structure

In [None]:
viewer = GeneViewer()
fig_dict = viewer.create_gene_track(show_domains=True)
fig = go.Figure(fig_dict)
fig.show()

## 3. Working with Variants

In [None]:
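# Three toy variants on chromosome 2: one substitution, one deletion, one insertion
example_variants = [
    Variant("2", 50200000, "A", "G", variant_id="var1", quality=35.0),     # A>G, single-base substitution
    Variant("2", 50300000, "ATCG", "A", variant_id="var2", quality=40.0),  # REF longer than ALT: 3 bp deleted
    Variant("2", 50400000, "C", "CAGT", variant_id="var3", quality=30.0),  # ALT longer than REF: 3 bp inserted
]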

for var in example_variants:
    print(f"{var.variant_id}: {var.variant_type}, Length: {var.length}, Quality: {var.quality}")
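
The type and length printed above depend on how `Variant` interprets the REF and ALT alleles. As a rough, self-contained sketch of one common convention (the function name `classify_variant` and the labels `"SNV"`, `"MNV"`, `"deletion"`, and `"insertion"` are assumptions for illustration, not necessarily what `src.genomics.variants` uses), a simple biallelic variant can be classified by comparing allele lengths:

```python
from typing import Tuple


def classify_variant(ref: str, alt: str) -> Tuple[str, int]:
    """Return (type, length) for a simple biallelic REF/ALT pair.

    Hypothetical helper: the labels and the length definition are
    illustrative and may differ from the pipeline's Variant class.
    """
    if len(ref) == len(alt):
        # Same allele length: a substitution of one or more bases
        kind = "SNV" if len(ref) == 1 else "MNV"
        return kind, len(ref)
    if len(ref) > len(alt):
        # REF longer than ALT: bases were removed
        return "deletion", len(ref) - len(alt)
    # ALT longer than REF: bases were added
    return "insertion", len(alt) - len(ref)


# The same REF/ALT pairs as the example variants above
examples = {"var1": ("A", "G"), "var2": ("ATCG", "A"), "var3": ("C", "CAGT")}
results = {vid: classify_variant(ref, alt) for vid, (ref, alt) in examples.items()}

for vid, (kind, length) in results.items():
    print(f"{vid}: {kind}, length {length}")
```

Under this convention, `var2` is a 3 bp deletion and `var3` a 3 bp insertion. The pipeline's own `variant_type` and `length` values may be named or counted differently.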

In [None]:
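vf = VariantFilter()
# With a quality threshold of 32.0, var3 (quality 30.0) falls below the cutoff.
# pass_only=False: do not restrict to PASS-flagged variants (as the parameter name suggests).
filtered = vf.apply_filters(
    example_variants,
    min_quality=32.0,
    pass_only=False
)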
print(f"Filtered variants: {len(filtered)} / {len(example_variants)}")
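
To make the filtering step concrete, here is a minimal stand-alone sketch of a threshold filter. It assumes that `min_quality` keeps variants at or above the cutoff and that `pass_only` restricts results to variants whose filter status is `PASS`; the `SimpleVariant` class and its `filter_status` field are hypothetical stand-ins, and the real `VariantFilter` may behave differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SimpleVariant:
    """Hypothetical stand-in for the pipeline's Variant class."""
    variant_id: str
    quality: float
    filter_status: str = "PASS"  # assumed field, not confirmed by the pipeline


def apply_filters(
    variants: List[SimpleVariant],
    min_quality: Optional[float] = None,
    pass_only: bool = False,
) -> List[SimpleVariant]:
    """Keep variants that satisfy every enabled criterion."""
    kept = []
    for v in variants:
        # Drop variants below the quality cutoff (assumed inclusive threshold)
        if min_quality is not None and v.quality < min_quality:
            continue
        # Optionally drop variants that are not flagged PASS
        if pass_only and v.filter_status != "PASS":
            continue
        kept.append(v)
    return kept


variants = [
    SimpleVariant("var1", 35.0),
    SimpleVariant("var2", 40.0),
    SimpleVariant("var3", 30.0),
]
kept = apply_filters(variants, min_quality=32.0, pass_only=False)
print(f"Filtered variants: {len(kept)} / {len(variants)}")
```

With the three example qualities (35.0, 40.0, 30.0), only `var3` falls below the 32.0 cutoff in this sketch.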

## 4. CNV Detection Example

In [None]:
example_cnvs = [
    CNVCall("2", 50200000, 50350000, CNVType.DELETION, "manta", quality=55.0),
    CNVCall("2", 50500000, 50650000, CNVType.DUPLICATION, "delly", quality=45.0),
    CNVCall("2", 50800000, 51000000, CNVType.DELETION, "cnvnator", quality=60.0),
]

for cnv in example_cnvs:
    affected = cnv.get_affected_exons(nrxn1)
    print(f"{cnv.cnv_type.value}: {cnv.length:,} bp, Caller: {cnv.caller}, Affected exons: {len(affected)}")
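
Counting affected exons comes down to an interval-overlap test between a CNV and each exon. The sketch below shows that test in isolation. It assumes closed intervals and counts partial overlaps as affected; the function name `overlapping_exons` is hypothetical, and the exon coordinates are made-up placeholders rather than the real NRXN1 annotation. `get_affected_exons` may use a different definition.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), both ends inclusive


def overlapping_exons(cnv: Interval, exons: List[Interval]) -> List[Interval]:
    """Return the exons that share at least one base with the CNV."""
    cnv_start, cnv_end = cnv
    # Two closed intervals overlap when each starts before the other ends
    return [(s, e) for (s, e) in exons if s <= cnv_end and e >= cnv_start]


# Placeholder exon coordinates for illustration only (NOT real NRXN1 exons)
toy_exons = [
    (50_210_000, 50_210_200),  # fully inside the deletion
    (50_340_000, 50_340_150),  # fully inside the deletion
    (50_349_900, 50_350_100),  # straddles the deletion's right breakpoint
    (50_600_000, 50_600_300),  # outside the deletion
]

deletion = (50_200_000, 50_350_000)  # same span as the first example CNV
hits = overlapping_exons(deletion, toy_exons)
print(f"Affected exons: {len(hits)}")
```

Treating a partially overlapped exon as affected is a design choice: a breakpoint inside an exon disrupts it just as a full deletion would, so the inclusive test is the more conservative option.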

In [None]:
plotter = CNVPlotter()
landscape = plotter.plot_cnv_landscape(example_cnvs, "Example CNV Landscape")
fig = go.Figure(landscape)
fig.show()

## 5. Next Steps

- See `02_cnv_analysis.ipynb` for detailed CNV detection
- See `03_annotation.ipynb` for variant annotation
- See `04_ml_prediction.ipynb` for pathogenicity prediction