# Test Venn Diagram Function
This notebook tests `venn_ip()` for creating Venn diagrams of protein overlaps.

**Input:** `data_after_stat.pkl` (statistical results)  
**Output:** Venn diagram and overlap CSV files

# Cell 1: Imports
import sys
sys.path.append('..')

from ipms.analysis import load_data, venn_ip
import pandas as pd

print("✓ All imports successful!")




# Cell 2: Load Statistical Results
data = load_data('../results/data_after_stat.pkl')

print(f"\n✓ Loaded {data['metadata']['n_proteins']} proteins")
print(f"Comparisons: {list(data['significant_proteins'].keys())}")
print(f"Thresholds: {data['stats_params']['threshold_label']}")



# Cell 3: Create Venn Diagram
venn_results = venn_ip(data, show_names=True, top_n=30)

print("\n✓ Venn diagram created!")
print("Check results/figures/viz/ for the plot")




# # Cell 4: Show Protein Names in Each Category
# venn_results = venn_ip(data, show_names=True, top_n=30)

# print("\n✓ Check output above for protein lists!")





# Cell 5: Inspect Overlap Results Programmatically
if venn_results:
    print("="*60)
    print("OVERLAP SUMMARY")
    print("="*60)
    
    for category, proteins in venn_results['overlaps'].items():
        print(f"\n{category}: {len(proteins)} proteins")
    
    # Access specific categories
    print("\n" + "="*60)
    print("EXAMPLE: Access shared proteins")
    print("="*60)
    
    if 'shared' in venn_results['overlaps']:
        shared_proteins = venn_results['overlaps']['shared']
        print(f"\nProteins in both conditions: {len(shared_proteins)}")
        print(f"Accessions: {list(shared_proteins)[:5]}...")

## Understanding the Venn Diagram

### **For 2 Comparisons (e.g., WT vs EV, d2d3 vs EV):**

- **Left circle only** (red): Proteins enriched ONLY in WT
- **Right circle only** (blue): Proteins enriched ONLY in d2d3
- **Overlap** (purple): Proteins enriched in BOTH WT and d2d3
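
The three regions above correspond to plain set operations on the significant-protein sets. This is a minimal sketch, assuming each comparison reduces to a set of accessions. The accessions are made-up placeholders, and `venn_ip()` may compute the regions differently internally.

```python
# Hypothetical accession sets for illustration only (not real results)
wt_hits = {"P00001", "P00002", "P00003"}    # significant in WT vs EV
d2d3_hits = {"P00002", "P00003", "P00004"}  # significant in d2d3 vs EV

wt_only = wt_hits - d2d3_hits    # left circle only
d2d3_only = d2d3_hits - wt_hits  # right circle only
shared = wt_hits & d2d3_hits     # overlap

print("WT only:  ", sorted(wt_only))
print("d2d3 only:", sorted(d2d3_only))
print("Shared:   ", sorted(shared))
```

Every significant protein falls into exactly one of the three regions, so the three counts add up to the size of the union of the two sets.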

### **What This Tells You:**

**Shared proteins (overlap):**
- Candidate core AS3MT interactors
- Bind both the WT and d2d3 isoforms
- Likely the most relevant to core AS3MT function

**WT-only proteins:**
- Specific to WT AS3MT
- Lost when exons 2-3 are deleted
- May require intact structure

**d2d3-only proteins:**
- Specific to AS3MT-d2d3 isoform
- Interactions potentially gained by the isoform
- May compensate for lost function

---

## Output Files

### **Venn Diagram:**
- `results/figures/viz/venn_diagram_pval05_l2fc1.pdf`

### **CSV Files (one per category):**
- `venn_WT_vs_EV_only_pval05_l2fc1.csv` - WT-specific proteins
- `venn_d2d3_vs_EV_only_pval05_l2fc1.csv` - d2d3-specific proteins  
- `venn_shared_pval05_l2fc1.csv` - Proteins in both
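
The file names above share one pattern: a `venn_` prefix, the category, and the threshold label. As a rough sketch, the names can be rebuilt from the label instead of being typed by hand. It assumes that `data['stats_params']['threshold_label']` holds the string `pval05_l2fc1`, which this notebook does not confirm, so the label is hard-coded here.

```python
# Assumed value of data['stats_params']['threshold_label']; hard-coded for illustration
label = "pval05_l2fc1"

# Category names taken from the CSV files listed above
categories = ["WT_vs_EV_only", "d2d3_vs_EV_only", "shared"]

csv_names = [f"venn_{category}_{label}.csv" for category in categories]
for name in csv_names:
    print(name)
```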

Each CSV contains:
- Protein accession
- Gene symbol
- Log2FC for each comparison
- P-values
- Average log2FC (sorted by this)
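
A quick way to inspect one of these files is to load it with pandas and re-sort it. The sketch below reads from an in-memory buffer so that it runs on its own. The column names (`accession`, `gene`, `avg_log2fc`, and so on), the values, and the descending sort order are all assumptions, so check the real CSV header before reusing it.

```python
import io

import pandas as pd

# Hypothetical CSV content; real column names and values may differ
csv_text = """accession,gene,log2fc_WT_vs_EV,log2fc_d2d3_vs_EV,avg_log2fc
P00003,GENE3,1.5,1.1,1.3
P00002,GENE2,2.0,3.0,2.5
"""

# In practice, pass the path to venn_shared_pval05_l2fc1.csv instead of the buffer
shared_df = pd.read_csv(io.StringIO(csv_text))

# Sort by average log2FC (descending, by assumption) and keep the top rows
top_hits = shared_df.sort_values("avg_log2fc", ascending=False).head(10)
print(top_hits[["accession", "gene", "avg_log2fc"]])
```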

---

## Next Steps

1. **Examine shared proteins** - Core AS3MT interactors
2. **Check WT-only** - What's lost in d2d3?
3. **Check d2d3-only** - New interactions?
4. **Pathway analysis** on each group
5. **Validate top hits** with Western blot

**Your Venn analysis is complete!** 🎉