# Variant Interpretation Case Study: Frameshift variant in REEP1

This notebook walks through the clinical classification of a hypothetical frameshift variant using ACMG guidelines. The purpose of this exercise is to illustrate my understanding of variant curation principles as applied in a diagnostic genomics context.

**Patient Information:**

A 68-year-old male presents with a phenotype consistent with “pure” hereditary spastic paraplegia. Multiple individuals in his family appear similarly affected. The family pedigree suggests an autosomal dominant mode of inheritance.

A targeted gene panel was ordered for hereditary spastic paraplegia, including the following genes: *SPAST*, *ATL1*, *REEP1*, and *NIPA1*.

The VCF file is parsed below to extract basic variant-level information:

In [9]:
import pandas as pd
from io import StringIO

vcf_path = "mock_variant.vcf"

# Drop the "##" meta-information lines; keep the "#CHROM" header row and the records
with open(vcf_path) as f:
    lines = [line for line in f if not line.startswith("##")]

# Join the header and data lines
vcf_content = "".join(lines)

# Load into a DataFrame
vcf_df = pd.read_csv(StringIO(vcf_content), sep="\t")

vcf_df.head()

|   | #CHROM | POS | ID | REF | ALT | QUAL | FILTER | INFO |
|---|---|---|---|---|---|---|---|---|
| 0 | 2 | 86523147 | . | C | - | . | PASS | ACMG_CLASS=Likely_pathogenic |
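
The INFO column stores annotations as semicolon-separated `key=value` pairs. The following is a minimal sketch of splitting such a string into a dictionary, assuming the usual VCF convention that entries without `=` are flags. The helper name `parse_info` is hypothetical and not part of the original notebook.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO string into a dict; flag entries (no '=') map to True."""
    fields = {}
    if info in ("", "."):  # "." denotes an empty INFO field
        return fields
    for entry in info.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        fields[key] = value if sep else True
    return fields


info = parse_info("ACMG_CLASS=Likely_pathogenic")
print(info["ACMG_CLASS"])
```

Applied to the record above, this would expose the `ACMG_CLASS` annotation as an ordinary dictionary entry.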


After external annotation with a tool such as Ensembl VEP or ANNOVAR, the variant was identified as:

A heterozygous *REEP1* (NM_022912.2): c.471del, p.(Thr158fs) variant.
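
As a quick sanity check on the coding position, the sketch below maps a 1-based coding-sequence position to its codon number and its position within that codon. The function name `codon_of` is hypothetical. Note that HGVS protein nomenclature names the first amino acid that actually changes, which can lie downstream of the codon containing the deleted base if that codon happens to remain synonymous.

```python
from typing import Tuple


def codon_of(cds_pos: int) -> Tuple[int, int]:
    """Return (codon number, position within codon) for a 1-based c. position."""
    if cds_pos < 1:
        raise ValueError("coding positions are 1-based")
    codon = (cds_pos - 1) // 3 + 1
    offset = (cds_pos - 1) % 3 + 1
    return codon, offset


print(codon_of(471))  # (157, 3)
```

Here c.471 falls on the third base of codon 157, which is compatible with a first affected residue at position 158 as reported.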

# Analysis:
**Variant type:**
- This is a frameshift variant that is likely to lead to nonsense-mediated decay (NMD)
- It was judged too far upstream to warrant the PVS1_VS (Very Strong) criterion
- The variant removes about 10% of the protein; after MDT discussion, PVS1_Strong was chosen <br>
**Classification: PVS1_Strong**

**Inheritance data:**
- Family history suggests autosomal dominant inheritance
- Multiple family members appear to be affected, but segregation data are insufficient
- There are no reports of this variant in the literature <br>
**No classification**

**Population databases:**
- The variant is absent from gnomAD and other major population databases <br>
**Classification: PM2_Moderate**

**Computational and predictive data:**
- The in silico tools discussed in the course were developed to evaluate missense and splice-site variants, not frameshifts <br>
**No classification**

**Functional data:**
- No functional studies are available for this variant at this time <br>
**No classification**

**Phenotype data:** 
- The patient has only been tested using a limited gene panel
- Many genes are associated with the phenotype presented by the patient <br>
**No classification**

# Final classification: Likely Pathogenic
Analysis results: PVS1_Strong and PM2_Moderate <br>
According to the ACMG guidelines, the combination of one Strong and one Moderate criterion supports a **Likely Pathogenic** classification.
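
The combining step can be sketched in code. The example below encodes only the pathogenic-side combining rules from the 2015 ACMG/AMP guidelines and ignores benign evidence entirely, so it is an illustration rather than a complete classifier. It assumes each criterion is written as `CODE_Strength` (for example `PVS1_Strong`), with `VeryStrong` as a hypothetical label for full-strength PVS1; the function names are likewise hypothetical.

```python
from collections import Counter


def count_strengths(criteria):
    """Tally evidence strengths from labels such as 'PVS1_Strong' or 'PM2_Moderate'."""
    return Counter(label.split("_", 1)[1].lower() for label in criteria)


def classify(criteria):
    """Apply the 2015 ACMG/AMP pathogenic-side combining rules (benign evidence ignored)."""
    n = count_strengths(criteria)
    vs, s, m, p = n["verystrong"], n["strong"], n["moderate"], n["supporting"]

    pathogenic = (
        (vs >= 1 and (s >= 1 or m >= 2 or (m >= 1 and p >= 1) or p >= 2))
        or s >= 2
        or (s >= 1 and (m >= 3 or (m >= 2 and p >= 2) or (m >= 1 and p >= 4)))
    )
    if pathogenic:
        return "Pathogenic"

    likely_pathogenic = (
        (vs >= 1 and m >= 1)
        or (s >= 1 and m >= 1)
        or (s >= 1 and p >= 2)
        or m >= 3
        or (m >= 2 and p >= 2)
        or (m >= 1 and p >= 4)
    )
    if likely_pathogenic:
        return "Likely Pathogenic"

    return "Uncertain Significance"


print(classify(["PVS1_Strong", "PM2_Moderate"]))  # Likely Pathogenic
```

Counting by applied strength rather than by criterion code means a downgraded PVS1 is treated as one Strong item, which matches how the evidence is tallied in this case study.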

**Further considerations:** <br>
Given the family history, if additional relatives are available for testing, segregation analysis may strengthen the interpretation. A detailed pedigree review during MDT discussion could help determine whether such analysis would be informative.