# R1: Biological Plausibility - CHIP Analysis

## Reviewer Question

**Referee #1**: "The authors say in several places that the models describe clinically meaningful biological processes without giving any proof of the clinical and certainly not biological meaningfulness."

## Why This Matters

Demonstrating biological plausibility is critical for validating that signatures capture real biological pathways.

## Our Approach

We demonstrate biological plausibility through **genetic mutation carrier analysis**:

1. **FH Carrier Analysis**: Familial Hypercholesterolemia carriers show Signature 5 enrichment (see R1_Q3_Clinical_Meaning.ipynb)
2. **CHIP Analysis**: Clonal hematopoiesis mutations show signature-specific enrichment patterns

**CHIP (Clonal Hematopoiesis of Indeterminate Potential)** causes chronic inflammation and is associated with multiple disease outcomes, including hematologic, cardiovascular, and infectious outcomes.

## Key Findings

- ✅ **CHIP carriers show Signature 16 enrichment** before multiple outcomes
- ✅ **Validates inflammatory pathway** → multiple disease outcomes


## 2. Signature 16 (Critical Care/Inflammation) Enrichment

Signature 16 captures critical care and inflammatory processes. CHIP mutations cause chronic inflammation, so we expect strong enrichment in this signature.
...

In [None]:
# Focus on Signature 16 associations
# (chip_summary must already be loaded; the loading cell appears at the end of this notebook)
sig16_results = chip_summary[chip_summary['signature'] == 16].copy()
sig16_results = sig16_results.sort_values('p_value')

print("="*80)
print("SIGNATURE 16 (CRITICAL CARE/INFLAMMATION) ASSOCIATIONS")
print("="*80)

# Format for display
display_cols = ['mutation', 'outcome', 'n_carriers', 'carrier_prop_rising', 
                'noncarrier_prop_rising', 'OR', 'p_value']
sig16_display = sig16_results[display_cols].copy()
sig16_display['carrier_prop_rising'] = sig16_display['carrier_prop_rising'].apply(lambda x: f"{x:.1%}")
sig16_display['noncarrier_prop_rising'] = sig16_display['noncarrier_prop_rising'].apply(lambda x: f"{x:.1%}")
sig16_display['OR'] = sig16_display['OR'].apply(lambda x: f"{x:.2f}")
sig16_display['p_value'] = sig16_display['p_value'].apply(lambda x: f"{x:.2e}")

sig16_display.columns = ['Mutation', 'Outcome', 'N Carriers', 'Carrier % Rising', 
                          'Non-carrier % Rising', 'OR', 'P-value']

display(sig16_display.head(15))
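
The `OR` and `p_value` columns above come from the precomputed summary file, and this notebook does not show how they were derived. As a rough illustration only, the sketch below computes an odds ratio and a two-sided Fisher exact p-value from a 2×2 table (carriers vs. non-carriers, rising vs. not rising). The choice of Fisher's exact test, the function names, and the counts are all assumptions made for illustration; they are not taken from the analysis.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]].

    a: carriers with rising signature      b: carriers without
    c: non-carriers with rising signature  d: non-carriers without
    """
    if b * c == 0:
        # Undefined when both products are zero, infinite otherwise.
        return float("inf") if a * d > 0 else float("nan")
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value with all margins held fixed.

    Sums the hypergeometric probabilities of every table that is
    no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of seeing x "rising" individuals among carriers.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical counts, purely for illustration (not real results).
a, b, c, d = 3, 1, 1, 3
print(f"OR = {odds_ratio(a, b, c, d):.2f}")
print(f"p  = {fisher_exact_two_sided(a, b, c, d):.4f}")
```

With real cohort sizes, a library routine such as `scipy.stats.fisher_exact` would be the more practical choice; the pure-Python version is shown only to make the calculation explicit.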


## 3. Summary and Response

### Key Findings

1. **CHIP mutations show signature-specific enrichment**: CHIP carriers show strong enrichment in Signature 16 (inflammation/critical care) before multiple disease outcomes.

2. **Different genetic mechanisms map to different signatures**: FH carriers enrich Signature 5 (cardiovascular), while CHIP carriers enrich Signature 16 (inflammation), demonstrating biological specificity.

### Response to Reviewer

We demonstrate biological meaningfulness through genetic mutation carrier analysis. CHIP carriers (acquired inflammatory mutations) show enrichment in **Signature 16** (inflammation/critical care) before multiple outcomes (hematologic, cardiovascular, infectious), validating the inflammation→disease pathway. The distinct signature enrichment patterns for different genetic mechanisms demonstrate that our signatures capture biologically distinct pathways.


In [None]:
import pandas as pd
from pathlib import Path
from IPython.display import display, Image

# Load CHIP summary results
chip_summary_path = Path("../../results/chip_multiple_signatures/chip_multiple_signatures_summary.csv")
chip_summary = pd.read_csv(chip_summary_path)

print("="*80)
print("CHIP ANALYSIS SUMMARY")
print("="*80)
print(f"\nTotal associations: {len(chip_summary)}")
# Count rows below the nominal threshold; single quotes avoid nesting double quotes in the f-string
print(f"Significant (p<0.05): {(chip_summary['p_value'] < 0.05).sum()}")

# Focus on Signature 16
sig16 = chip_summary[chip_summary["signature"] == 16].sort_values("p_value")
display(sig16.head(15))
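
The count above uses a nominal p<0.05 threshold across all mutation, outcome, and signature combinations. The notebook does not say whether a multiple-testing adjustment is applied elsewhere, so the following is only a sketch of how a Benjamini-Hochberg adjustment could be computed. The function name and the p-values are hypothetical and stand in for the `p_value` column.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values, purely for illustration (not real results).
p_values = [0.01, 0.04, 0.03, 0.20]
q_values = benjamini_hochberg(p_values)
n_nominal = sum(p < 0.05 for p in p_values)
n_adjusted = sum(q < 0.05 for q in q_values)
print(f"Nominal p<0.05: {n_nominal}, BH-adjusted q<0.05: {n_adjusted}")
```

Applied to the real table, this would amount to something like `benjamini_hochberg(chip_summary["p_value"].tolist())`, assuming the column contains no missing values.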