# Test Visualization Function
This notebook tests `viz_ip()` for creating volcano plots and heatmaps.

**Input:** Loads `data_after_stat.pkl` (statistical results)  
**Output:** Volcano plots, heatmaps saved to results/figures/viz/

# Imports / Load / Graphs

In [None]:
# Imports
import sys
sys.path.append('..')

import pandas as pd
import importlib
import ipms.analysis
importlib.reload(ipms.analysis)
from ipms.analysis import load_data, viz_ip

print("✓ All imports successful!")

# Statistical Results
data = load_data('../results/data_after_stat.pkl')

print(f"\n✓ Loaded {data['metadata']['n_proteins']} proteins")
print(f"Comparisons: {list(data['significant_proteins'].keys())}")
print(f"Thresholds: {data['stats_params']['threshold_label']}")


# Create All Visualizations (Default)
viz_ip(data)

print("\n✓ Visualizations complete!")
print("Check results/figures/viz/ for plots")

In [None]:
# Cell 4: Custom Visualizations (Optional)
# Uncomment to customize

# Just volcano plots, no heatmap
# viz_ip(data, create_volcano=True, create_heatmap=False)

# Just heatmap, no volcano
# viz_ip(data, create_volcano=False, create_heatmap=True)

# Show top 100 proteins in heatmap
# viz_ip(data, top_n=100)

print("(Uncomment above to try custom options)")

# Double check output files

In [None]:
# Cell 5: Check Output Files
import os

viz_dir = '../results/figures/viz'
threshold_label = data['stats_params']['threshold_label']

print("Visualization files:")
print("="*60)

# Expected outputs: one volcano plot per comparison, plus the heatmap
expected_files = [
    f'volcano_{comparison}_{threshold_label}.pdf'
    for comparison in data['significant_proteins'].keys()
]
expected_files.append(f'heatmap_top_proteins_{threshold_label}.pdf')

# Report each file and keep track of any that are missing
missing = []
for filename in expected_files:
    filepath = os.path.join(viz_dir, filename)
    if os.path.exists(filepath):
        size_kb = os.path.getsize(filepath) / 1024
        print(f"✓ {filename} ({size_kb:.1f} KB)")
    else:
        missing.append(filename)
        print(f"✗ {filename} NOT FOUND")

# Only report success when every expected file was found
if missing:
    print(f"\n✗ {len(missing)} expected plot(s) missing")
else:
    print("\n✓ All plots created successfully!")



# Cell 6: Summary
print("="*60)
print("VISUALIZATION SUMMARY")
print("="*60)

print(f"\nProteins analyzed: {data['metadata']['n_proteins']}")
print(f"Comparisons visualized: {len(data['significant_proteins'])}")

print(f"\nSignificant proteins per comparison:")
for comparison, stats in data['significant_proteins'].items():
    print(f"  {comparison}: {stats['total']}")

print("\nPlots created:")
for comparison in data['significant_proteins'].keys():
    print(f"  • Volcano plot: {comparison}")
print("  • Heatmap: Top enriched proteins")

print("\n" + "="*60)
print("ANALYSIS PIPELINE COMPLETE!")
print("="*60)
print("\nYour results are ready:")
print("  • CSV tables: results/tables/")
print("  • Plots: results/figures/viz/")
print("  • QC plots: results/figures/qc/")

## Understanding the Plots

### Volcano Plots
**What they show:**
- X-axis: Log2 fold change (treatment vs control)
- Y-axis: -Log10 adjusted p-value (significance)
- Colors:
  - **Red**: Enriched (significant, positive FC)
  - **Blue**: Depleted (significant, negative FC)
  - **Gray**: Not significant

**How to read:**
- Top right = Highly enriched, very significant
- Top left = Highly depleted, very significant
- Top proteins are labeled with gene symbols
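
The color coding above can be sketched in a few lines of Python. This is a minimal illustration, not the actual logic inside `viz_ip()`: the cutoffs (adjusted p < 0.05, |log2 FC| ≥ 1) are assumptions inferred from the `pval05_l2fc1` label, and the function names and example values are hypothetical.

```python
import math

def classify_protein(log2_fc, adj_p, p_cutoff=0.05, l2fc_cutoff=1.0):
    """Return 'enriched', 'depleted', or 'not significant' for one protein.

    Cutoffs are assumed defaults; the real pipeline may use different
    values or treat the boundaries differently.
    """
    if adj_p < p_cutoff and log2_fc >= l2fc_cutoff:
        return "enriched"       # drawn in red
    if adj_p < p_cutoff and log2_fc <= -l2fc_cutoff:
        return "depleted"       # drawn in blue
    return "not significant"    # drawn in gray

def volcano_coordinates(log2_fc, adj_p):
    """Map a protein to (x, y): x = log2 fold change, y = -log10(adjusted p)."""
    return log2_fc, -math.log10(adj_p)

# Hypothetical example values, not taken from the real dataset
examples = {"GENE_A": (2.5, 0.001), "GENE_B": (-1.8, 0.01), "GENE_C": (0.3, 0.4)}
for gene, (fc, p) in examples.items():
    x, y = volcano_coordinates(fc, p)
    print(f"{gene}: x={x:.2f}, y={y:.2f}, class={classify_protein(fc, p)}")
```

A protein with a large positive fold change and a small adjusted p-value lands in the top right and is classified as enriched. A protein near the origin stays gray.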

### Heatmap
**What it shows:**
- Rows: Top significant proteins (by fold change)
- Columns: Your samples (grouped by condition)
- Colors: Normalized intensity (log2)
  - Red = High intensity
  - Blue = Low intensity

**What to look for:**
- Enriched proteins should be high (red) in treatment samples
- The same proteins should be low (blue) in control samples
- Replicates should cluster together
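
To make the row selection concrete, here is a rough sketch of how the top proteins for a heatmap might be picked. It assumes rows are ranked by descending log2 fold change among significant proteins, so the most enriched come first. The actual ranking used by `viz_ip()` may differ, and the helper name and example values are hypothetical.

```python
def top_proteins_by_fold_change(log2_fc_by_protein, significant, top_n=50):
    """Return up to top_n significant proteins, most enriched first.

    log2_fc_by_protein: dict mapping protein name -> log2 fold change
    significant: set of protein names that passed the significance cutoffs
    """
    candidates = [p for p in log2_fc_by_protein if p in significant]
    ranked = sorted(candidates, key=lambda p: log2_fc_by_protein[p], reverse=True)
    return ranked[:top_n]

# Hypothetical example values, not taken from the real dataset
fold_changes = {"GENE_A": 3.0, "GENE_B": 1.5, "GENE_C": 4.2, "GENE_D": 2.0}
significant = {"GENE_A", "GENE_B", "GENE_C"}
print(top_proteins_by_fold_change(fold_changes, significant, top_n=2))
```

Here `GENE_D` is excluded because it is not in the significant set, even though its fold change is larger than that of `GENE_B`.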

---

## Next Steps

### For Publication:
1. Open the PDFs in results/figures/viz/
2. Import into Illustrator/Inkscape for final formatting
3. Adjust colors, labels, font sizes as needed

### For Further Analysis:
1. Use the CSV files in results/tables/
2. Perform pathway enrichment (GO, KEGG)
3. Look up interactions in STRING database
4. Validate top hits with Western blot
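
Most enrichment tools and STRING accept a plain list of gene symbols, so a common first step is to pull that list out of a significant-hits table. The sketch below assumes the table has a `gene_symbol` column. That column name is a guess, so check the header of your CSV and adjust. The demo uses an in-memory table with made-up values so it runs on its own.

```python
import csv
import io

def gene_list_from_table(handle, column="gene_symbol"):
    """Return unique, non-empty values of `column` in first-seen order."""
    seen = []
    for row in csv.DictReader(handle):
        gene = (row.get(column) or "").strip()
        if gene and gene not in seen:
            seen.append(gene)
    return seen

# In-memory stand-in for a significant-hits table; the column names and
# values are hypothetical. For real data, pass an open file instead, e.g.
# open('../results/tables/WT_vs_EV_significant_pval05_l2fc1.csv', newline='')
demo_csv = io.StringIO(
    "gene_symbol,log2_fc,adj_p\n"
    "GENE_A,2.5,0.001\n"
    "GENE_B,1.4,0.020\n"
    "GENE_A,2.1,0.004\n"
)
genes = gene_list_from_table(demo_csv)
print("\n".join(genes))  # one gene per line, ready to paste into an enrichment tool
```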

### Your Data:
- **stats_results_pval05_l2fc1.csv** - All proteins with statistics
- **WT_vs_EV_significant_pval05_l2fc1.csv** - WT-specific interactors
- **d2d3_vs_EV_significant_pval05_l2fc1.csv** - d2d3-specific interactors

**🎉 Your IP-MS analysis is complete!**