# Test Boxplot Function
This notebook tests `boxplot_ip()` for visualizing log2 intensities of enriched proteins.

**Input:** Loads `data_after_stat.pkl` (statistical results)  
**Output:** Boxplot PDFs showing protein intensities across conditions

# Cell 1: Imports
import sys
sys.path.append('..')

from ipms.analysis import load_data, boxplot_ip
import pandas as pd

print("✓ All imports successful!")

# Cell 2: Load Statistical Results
data = load_data('../results/data_after_stat.pkl')

print(f"\n✓ Loaded {data['metadata']['n_proteins']} proteins")
print(f"Comparisons: {list(data['significant_proteins'].keys())}")
print(f"Thresholds: {data['stats_params']['threshold_label']}")

# Show how many enriched proteins
for comp, proteins in data['significant_proteins'].items():
    print(f"\n{comp}: {len(proteins)} enriched proteins")


# Cell 3: Create Boxplots for ALL Enriched Proteins (one panel per protein)
boxplot_ip(data, top_n=None, group_by='protein')

print("\n✓ Boxplots created for all enriched proteins.")


## Understanding the Boxplots

### **What They Show:**

**Boxplot grouped by condition (`group_by='condition'`):**
- Each row = one protein
- Box colors = different conditions (EV, WT, d2d3)
- Box = interquartile range (IQR, 25th to 75th percentile), with the median drawn as a line inside it
- Dots = individual replicate values

**Individual protein boxplots (`group_by='protein'`):**
- One panel per protein
- X-axis = conditions
- Y-axis = log2 intensity
- Easier to see individual protein patterns
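
To make the box elements concrete, here is a minimal sketch that computes the median and quartiles for one protein in one condition. The replicate values are made up, and the standard-library `statistics` module stands in for whatever `boxplot_ip()` uses internally, so the quartiles drawn in the PDF may differ slightly.

```python
from statistics import median, quantiles

# Hypothetical log2 intensities for one protein in one condition (4 replicates)
replicates = [24.6, 25.0, 25.3, 24.9]

# Quartiles with linear interpolation between data points.
# Plotting libraries may use a slightly different interpolation rule.
q1, q2, q3 = quantiles(replicates, n=4, method="inclusive")

box_summary = {
    "q1": q1,                       # bottom edge of the box
    "median": median(replicates),   # line inside the box
    "q3": q3,                       # top edge of the box
    "iqr": q3 - q1,                 # box height
}
print(box_summary)
```

A small IQR relative to the difference between conditions is what the "tight boxes" criterion below refers to.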

### **What to Look For:**

✅ **Good enrichment:**
- Treatment boxes (WT, d2d3) sit well above the EV control
- Little overlap between treatment and control
- Tight boxes = consistent across replicates

⚠️ **Weak enrichment:**
- Lots of overlap with EV
- Wide boxes = high variability
- May be background binding
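
These visual checks can be approximated numerically. The sketch below uses made-up replicate values, and `summarize_separation` is a hypothetical helper, not part of `ipms.analysis`. It is only a quick sanity check and does not replace the statistical test that produced `data_after_stat.pkl`.

```python
from statistics import median

# Hypothetical log2 intensities (3 replicates per condition)
intensities = {
    "EV": [19.8, 20.0, 20.3],
    "WT": [24.7, 25.0, 25.2],
}

def summarize_separation(control, treatment):
    """Return the median difference and whether the replicate ranges overlap."""
    median_diff = median(treatment) - median(control)
    # Two ranges overlap unless one group lies entirely above the other
    overlap = min(treatment) <= max(control) and min(control) <= max(treatment)
    return median_diff, overlap

diff, overlap = summarize_separation(intensities["EV"], intensities["WT"])
print(f"median difference: {diff:.2f} log2 units, ranges overlap: {overlap}")
```

A large median difference with no range overlap corresponds to the "good enrichment" pattern; a small difference with overlapping ranges corresponds to the "weak enrichment" pattern.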

### **Example Interpretation:**

```
Protein: USP7
  EV:    median = 20 (low)
  WT:    median = 25 (high)  ← 5 log2 units higher!
  d2d3:  median = 24 (high)
  
Conclusion: USP7 is strongly enriched in both WT and d2d3
```
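
Because the values are on a log2 scale, a difference of n units corresponds to a 2^n fold change in raw intensity. The short sketch below converts the example medians above into fold changes; the variable names are illustrative only.

```python
# Medians taken from the USP7 example (log2 intensity)
medians = {"EV": 20, "WT": 25, "d2d3": 24}

fold_changes = {}
for condition in ("WT", "d2d3"):
    log2_diff = medians[condition] - medians["EV"]
    # A difference of n log2 units is a 2**n fold change on the linear scale
    fold_changes[condition] = 2 ** log2_diff
    print(f"{condition} vs EV: {log2_diff} log2 units = {fold_changes[condition]}-fold")
```

So the 5-unit difference for WT is a 32-fold increase over EV, and the 4-unit difference for d2d3 is a 16-fold increase.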

---

## Output Files

### **Boxplots:**
- `boxplot_enriched_proteins_pval05_l2fc1.pdf` - All proteins, grouped by condition
- `boxplot_individual_proteins_pval05_l2fc1.pdf` - Individual protein plots
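
The `pval05_l2fc1` suffix looks like an encoding of the significance thresholds (p-value 0.05, log2 fold change 1), although this notebook does not confirm how it is built. Under that assumption, the sketch below reconstructs the expected file names so they can be checked after a run. `threshold_suffix` is a hypothetical helper, not a function from `ipms.analysis`.

```python
def threshold_suffix(p_value, log2_fc):
    """Build a suffix such as 'pval05_l2fc1' (assumed naming scheme)."""
    # Keep only the digits after the decimal point: 0.05 -> "05"
    p_digits = f"{p_value:g}".split(".")[1]
    return f"pval{p_digits}_l2fc{log2_fc:g}"

suffix = threshold_suffix(0.05, 1)
expected_files = [
    f"boxplot_enriched_proteins_{suffix}.pdf",
    f"boxplot_individual_proteins_{suffix}.pdf",
]
print(expected_files)
```

If the thresholds change, the file names will probably change with them, so compare against the `threshold_label` printed when the data is loaded.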

---

## Tips

**Change number of proteins shown:**
```python
boxplot_ip(data, top_n=50)  # Show top 50
boxplot_ip(data, top_n=10)  # Show top 10
```

**Show all proteins:**
```python
boxplot_ip(data, top_n=None)  # All significant proteins
```

**Choose plot style:**
```python
boxplot_ip(data, group_by='condition')  # One plot, all proteins
boxplot_ip(data, group_by='protein')    # Grid of individual plots
```

---

**Your boxplot analysis is complete!** 📊