# Hierarchical Effect Size Analysis for n=2 Spatial Biology Study

## The Statistical Reality

**Ground truth**: We have n=2 mice per timepoint. This cannot be transmuted into n=4,000 through statistical modeling.

**But**: We can leverage hierarchical spatial structure to build confidence WITHOUT pseudo-replication:

```
Sham condition:
  Mouse 1: 3 ROIs × ~2,000 superpixels each
  Mouse 2: 3 ROIs × ~2,000 superpixels each

UUO D1:
  Mouse 1: 3 ROIs × ~2,000 superpixels each
  Mouse 2: 3 ROIs × ~2,000 superpixels each
```

## The Honest Approach

1. **Within-mouse effect sizes**: Calculate the effect within each individual mouse
2. **Across-ROI consistency**: Do all 3 ROIs within a mouse show the same direction?
3. **Across-mouse consistency**: Do both mice show the same pattern?
4. **Multi-scale coherence**: Is the effect visible at the cell, superpixel, ROI, and mouse levels?

**Key insight**: We're testing CONSISTENCY, not statistical significance.
- If effect is d=2.5 in Mouse 1 and d=2.3 in Mouse 2 → Consistent
- If effect is d=2.5 in Mouse 1 and d=-0.3 in Mouse 2 → Inconsistent (biological variability)
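
The two cases above can be turned into a small rule. The following is a minimal sketch, not the notebook's own logic: the function name `classify_consistency` and the tolerance `max_gap` are illustrative assumptions.

```python
def classify_consistency(d_mouse1: float, d_mouse2: float, max_gap: float = 1.0) -> str:
    """Label a pair of within-mouse effect sizes.

    max_gap is a hypothetical tolerance on |d1 - d2|; it is not a value from the study.
    """
    # A zero effect has no direction, so it never counts as "same sign".
    same_sign = d_mouse1 != 0 and d_mouse2 != 0 and (d_mouse1 > 0) == (d_mouse2 > 0)
    if not same_sign:
        return "inconsistent"
    if abs(d_mouse1 - d_mouse2) > max_gap:
        return "same direction, different magnitude"
    return "consistent"


print(classify_consistency(2.5, 2.3))   # the first case from the list above
print(classify_consistency(2.5, -0.3))  # the second case from the list above
```

Keeping direction and magnitude as separate checks makes it explicit that agreement in sign alone is a weaker statement than agreement in size.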

## What We CAN Claim

- ✅ "Effect observed consistently across all biological replicates"
- ✅ "Within-mouse effect sizes are large (d > 2.0) with consistent direction"
- ✅ "Pattern robust across hierarchical scales (cell → superpixel → ROI → mouse)"
- ✅ "Results align with published UUO literature, adding spatial resolution"

## What We CANNOT Claim

- ❌ "Statistically significant at p<0.05" (underpowered)
- ❌ "Generalizes to all C57BL/6 mice" (n=2 insufficient)
- ❌ "Superpixels are independent replicates" (pseudo-replication)
- ❌ "Mixed-effects modeling provides statistical power" (with n=2, it doesn't)
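
To see why treating superpixels as independent replicates is pseudo-replication, a small simulation helps. This is a sketch under stated assumptions: the data are synthetic, there is no true group effect, and the per-mouse offset (`mouse_sd`), superpixel noise (`pixel_sd`), and counts are made-up parameters rather than values from the study.

```python
import math
import random
import statistics


def naive_superpixel_t(rng, n_mice=2, n_superpixels=200, mouse_sd=1.0, pixel_sd=1.0):
    """Welch t statistic that (wrongly) treats every superpixel as independent.

    Both groups are simulated with NO true group difference.
    """
    groups = []
    for _ in range(2):  # two conditions
        values = []
        for _ in range(n_mice):
            mouse_offset = rng.gauss(0.0, mouse_sd)  # shared by all superpixels of one mouse
            values.extend(mouse_offset + rng.gauss(0.0, pixel_sd) for _ in range(n_superpixels))
        groups.append(values)
    a, b = groups
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / se


def false_positive_rate(n_sims=200, seed=0):
    """Fraction of null simulations where the naive test exceeds |t| > 1.96."""
    rng = random.Random(seed)
    hits = sum(abs(naive_superpixel_t(rng)) > 1.96 for _ in range(n_sims))
    return hits / n_sims


print(f"Naive false-positive rate under the null: {false_positive_rate():.2f}")
```

With a nominal 5% test, a rate far above 0.05 shows that the superpixel count shrinks the standard error without adding information about between-mouse variation.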

---

**This notebook demonstrates the honest approach to small-n spatial biology.**

In [5]:
import sys
from pathlib import Path
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
import json
import gzip
from typing import Dict, List, Tuple
import warnings
warnings.filterwarnings('ignore')

# Add project root
project_root = Path().resolve().parent.parent
sys.path.insert(0, str(project_root))

from src.config import Config

sns.set_style('whitegrid')
plt.rcParams['figure.dpi'] = 150

## Load Hierarchical Data Structure

In [None]:
# Load kidney config
config = Config(str(project_root / 'config.json.backup'))
results_dir = project_root / 'results'
roi_results_dir = results_dir / 'roi_results'

markers = config.proteins
print(f"Analyzing {len(markers)} protein markers: {', '.join(markers)}")

# Parse ROI file structure - Include BOTH Sham controls and UUO timepoints
all_roi_files = sorted(roi_results_dir.glob('roi_*.json.gz'))

# Filter for kidney data: Sham (Sam1/Sam2) + UUO timepoints (D1/D3/D7)
roi_files = [f for f in all_roi_files if 'IMC_241218_Alun' in f.name and 
             (any(f'_D{i}_' in f.name for i in [1, 3, 7]) or '_Sam' in f.name)]

print(f"\nFound {len(roi_files)} kidney ROI files:")
print(f"  - Sham controls: {len([f for f in roi_files if 'Sam' in f.name])} ROIs")
print(f"  - UUO timepoints: {len([f for f in roi_files if any(f'_D{i}_' in f.name for i in [1,3,7])])} ROIs")

# Extract metadata from filenames
def parse_roi_name(filename: str) -> Dict[str, str]:
    """Extract condition, timepoint, mouse, and ROI from filename.
    
    Examples:
      roi_IMC_241218_Alun_ROI_Sam1_01_2_results.json.gz
      → {'condition': 'Sham', 'timepoint': 'Sham', 'mouse': 'Sam1', 'roi_num': '01'}
      
      roi_IMC_241218_Alun_ROI_D1_M1_01_9_results.json.gz
      → {'condition': 'UUO', 'timepoint': 'D1', 'mouse': 'M1', 'roi_num': '01'}
    """
    clean_name = filename.replace('.json.gz', '').replace('.json', '').replace('_results', '')
    parts = clean_name.split('_')
    
    # Determine if Sham or UUO
    if 'Sam' in clean_name:
        # Sham control
        sam_idx = next(i for i, p in enumerate(parts) if p.startswith('Sam'))
        mouse_id = parts[sam_idx]
        roi_idx = sam_idx + 1
        
        return {
            'condition': 'Sham',
            'timepoint': 'Sham',
            'mouse': mouse_id,
            'roi_num': parts[roi_idx],
            'full_name': clean_name,
            'group': f'Sham_{mouse_id}'
        }
    else:
        # UUO timepoint
        tp_idx = next(i for i, p in enumerate(parts) if p.startswith('D'))
        mouse_idx = tp_idx + 1
        roi_idx = mouse_idx + 1
        
        timepoint = parts[tp_idx]
        mouse_id = parts[mouse_idx]
        
        return {
            'condition': 'UUO',
            'timepoint': timepoint,
            'mouse': mouse_id,
            'roi_num': parts[roi_idx],
            'full_name': clean_name,
            'group': f'{timepoint}_{mouse_id}'
        }

# Build hierarchical index
roi_metadata = []
for f in roi_files:
    meta = parse_roi_name(f.name)
    meta['file_path'] = f
    roi_metadata.append(meta)

roi_df = pd.DataFrame(roi_metadata)

# Display hierarchical structure
print(f"\nHierarchical structure (n={len(roi_df)} ROIs total):")
print("\nBy condition and timepoint:")
print(roi_df.groupby(['condition', 'timepoint', 'mouse']).size())

print(f"\nTotal mice: {roi_df['group'].nunique()} (2 Sham + 2×3 UUO timepoints)")
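
The position-based parsing above depends on where `Sam` or `D` falls after splitting on underscores. As an alternative, here is a minimal regex sketch. The pattern is an assumption inferred only from the two filename examples in the `parse_roi_name` docstring, and `parse_roi_name_regex` is a hypothetical name.

```python
import re
from typing import Dict, Optional

# Assumed filename grammar: ..._ROI_<Sam#>_<roi>_...  or  ..._ROI_<D#>_<M#>_<roi>_...
ROI_PATTERN = re.compile(
    r"_ROI_(?:(?P<sham>Sam\d+)|(?P<tp>D\d+)_(?P<mouse>M\d+))_(?P<roi>\d+)_"
)


def parse_roi_name_regex(filename: str) -> Optional[Dict[str, str]]:
    """Return ROI metadata, or None when the filename does not match the assumed grammar."""
    match = ROI_PATTERN.search(filename)
    if match is None:
        return None
    if match.group("sham"):
        return {
            "condition": "Sham",
            "timepoint": "Sham",
            "mouse": match.group("sham"),
            "roi_num": match.group("roi"),
        }
    return {
        "condition": "UUO",
        "timepoint": match.group("tp"),
        "mouse": match.group("mouse"),
        "roi_num": match.group("roi"),
    }


print(parse_roi_name_regex("roi_IMC_241218_Alun_ROI_Sam1_01_2_results.json.gz"))
print(parse_roi_name_regex("roi_IMC_241218_Alun_ROI_D1_M1_01_9_results.json.gz"))
```

Returning `None` for unmatched names makes unexpected files visible instead of raising `StopIteration` from inside `next(...)`.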

## Extract Superpixel-Level Data

In [None]:
def load_superpixel_features(roi_file: Path, scale: float = 10.0) -> pd.DataFrame:
    """Load superpixel-level protein features from ROI results.
    
    Args:
        roi_file: Path to compressed JSON results
        scale: Spatial scale in microns (10.0, 20.0, 40.0)
        
    Returns:
        DataFrame with superpixel features and coordinates
    """
    with gzip.open(roi_file, 'rt') as f:
        results = json.load(f)
    
    # Extract scale data
    scale_key = str(float(scale))
    scale_data = results['multiscale_results'][scale_key]
    
    # Reconstruct features array
    features_data = scale_data['features']
    shape = features_data['shape']
    data_flat = features_data['data']
    features_array = np.array(data_flat).reshape(shape)
    
    # Get protein columns (first n_proteins columns)
    n_proteins = len(markers)
    protein_features = features_array[:, :n_proteins]
    
    # Reconstruct coordinates
    coords_data = scale_data['coordinates']
    coords_array = np.array(coords_data['data']).reshape(coords_data['shape'])
    
    # Build DataFrame
    df = pd.DataFrame(protein_features, columns=markers)
    df['x'] = coords_array[:, 0]
    df['y'] = coords_array[:, 1]
    df['superpixel_id'] = np.arange(len(df))
    
    return df

# Test loading
test_file = roi_files[0]
test_meta = parse_roi_name(test_file.name)
test_df = load_superpixel_features(test_file)
print(f"\nLoaded {test_meta['full_name']}:")
print(f"  {len(test_df):,} superpixels")
print(f"  {len(markers)} protein markers")
print(f"  Spatial extent: x=[{test_df.x.min():.1f}, {test_df.x.max():.1f}], y=[{test_df.y.min():.1f}, {test_df.y.max():.1f}]")
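
`load_superpixel_features` expects each scale entry to store arrays as a flat `data` list plus a `shape`. The sketch below writes and reads a toy file with that layout using only the standard library. The toy values and the helper names `unflatten` and `round_trip` are illustrative, and row-major order is assumed because that is what `np.array(...).reshape(shape)` uses by default.

```python
import gzip
import json
import tempfile
from pathlib import Path
from typing import List


def unflatten(flat: List[float], shape: List[int]) -> List[List[float]]:
    """Rebuild a 2-D row-major table from a flat list and its [rows, cols] shape."""
    n_rows, n_cols = shape
    if len(flat) != n_rows * n_cols:
        raise ValueError(f"expected {n_rows * n_cols} values, got {len(flat)}")
    return [flat[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


# Toy payload mirroring the keys the loader reads; all numbers are made up.
toy = {
    "multiscale_results": {
        "10.0": {
            "features": {"shape": [2, 3], "data": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]},
            "coordinates": {"shape": [2, 2], "data": [0.0, 0.0, 10.0, 0.0]},
        }
    }
}


def round_trip(payload: dict) -> dict:
    """Write the payload as gzip-compressed JSON, then read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "roi_toy_results.json.gz"
        with gzip.open(path, "wt") as handle:
            json.dump(payload, handle)
        with gzip.open(path, "rt") as handle:
            return json.load(handle)


# The scale key is the string form of a float, e.g. "10.0" rather than "10".
scale_data = round_trip(toy)["multiscale_results"][str(float(10))]
features = unflatten(scale_data["features"]["data"], scale_data["features"]["shape"])
coords = unflatten(scale_data["coordinates"]["data"], scale_data["coordinates"]["shape"])
print(features)
print(coords)
```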

## Panel A: Within-Mouse Effect Sizes Across ROIs

**The key test**: If we have 3 ROIs from the same mouse, do they all show the same biological pattern?

This tests **within-mouse consistency**, which does not depend on between-mouse statistics.

In [None]:
def cohens_d(group1: np.ndarray, group2: np.ndarray) -> float:
    """Cohen's d effect size (mean difference over pooled standard deviation)."""
    n1, n2 = len(group1), len(group2)
    var1, var2 = np.var(group1, ddof=1), np.var(group2, ddof=1)
    pooled_std = np.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    if pooled_std == 0:
        return np.nan  # undefined when both groups are constant
    return (np.mean(group1) - np.mean(group2)) / pooled_std

# For each mouse, compute effect size for each marker
# by comparing all superpixels in that mouse's ROIs
# This gives us WITHIN-MOUSE effect estimates

within_mouse_effects = []

for timepoint in ['D1', 'D3', 'D7']:
    for mouse in ['M1', 'M2']:
        # Get all ROIs for this mouse
        mouse_rois = roi_df[(roi_df.timepoint == timepoint) & (roi_df.mouse == mouse)]
        
        if len(mouse_rois) < 2:
            continue
        
        # Load all superpixels from this mouse's ROIs
        all_superpixels = []
        for _, row in mouse_rois.iterrows():
            sp_df = load_superpixel_features(row['file_path'])
            sp_df['roi'] = row['roi_num']
            all_superpixels.append(sp_df)
        
        combined = pd.concat(all_superpixels, ignore_index=True)
        
        # For each marker, compute pairwise effect sizes between ROIs
        roi_nums = sorted(combined['roi'].unique())
        
        for marker in markers:
            # Compute effect between each pair of ROIs within this mouse
            for i, roi1 in enumerate(roi_nums):
                for roi2 in roi_nums[i+1:]:
                    data1 = combined[combined.roi == roi1][marker].values
                    data2 = combined[combined.roi == roi2][marker].values
                    
                    d = cohens_d(data1, data2)
                    
                    within_mouse_effects.append({
                        'timepoint': timepoint,
                        'mouse': mouse,
                        'marker': marker,
                        'roi_pair': f'{roi1} vs {roi2}',
                        'cohens_d': d,
                        'abs_d': abs(d),
                        'n_roi1': len(data1),
                        'n_roi2': len(data2)
                    })

within_df = pd.DataFrame(within_mouse_effects)
print(f"\nComputed {len(within_df)} within-mouse ROI comparisons")
print(f"Median within-mouse effect: |d| = {within_df['abs_d'].median():.3f}")
print(f"This represents biological/technical heterogeneity WITHIN mouse")
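
For readers who want to check the pooled-variance formula by hand, here is a standard-library sketch of the same calculation on two tiny made-up samples. The name `cohens_d_plain` is hypothetical, and the guard for a zero pooled standard deviation is an added assumption.

```python
import math
import statistics
from typing import Sequence


def cohens_d_plain(group1: Sequence[float], group2: Sequence[float]) -> float:
    """Cohen's d with the pooled sample standard deviation (ddof=1 variances)."""
    n1, n2 = len(group1), len(group2)
    var1, var2 = statistics.variance(group1), statistics.variance(group2)
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    if pooled_sd == 0:
        return float("nan")  # both groups constant: d is undefined
    return (statistics.fmean(group1) - statistics.fmean(group2)) / pooled_sd


# Both samples have variance 1, so the pooled SD is 1 and d equals the mean difference.
print(cohens_d_plain([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))  # -1.0
```

The sign follows the argument order: a negative value means the first group has the lower mean.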

In [None]:
# Panel A: Distribution of within-mouse effect sizes
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Left: Histogram
ax = axes[0]
ax.hist(within_df['abs_d'], bins=50, color='gray', alpha=0.7, edgecolor='black')
ax.axvline(within_df['abs_d'].median(), color='blue', linestyle='--', linewidth=2,
           label=f'Median = {within_df["abs_d"].median():.3f}')
ax.axvline(0.5, color='orange', linestyle='--', linewidth=1.5, alpha=0.6, label='Medium effect (d=0.5)')
ax.set_xlabel('|Cohen\'s d|', fontweight='bold', fontsize=12)
ax.set_ylabel('Count', fontweight='bold', fontsize=12)
ax.set_title('Within-Mouse ROI Heterogeneity\n(Biological + Technical Variation)', fontsize=12, fontweight='bold')
ax.legend()
ax.grid(alpha=0.3)

# Right: By timepoint/mouse
ax = axes[1]
for tp in ['D1', 'D3', 'D7']:
    tp_data = within_df[within_df.timepoint == tp]
    ax.scatter([tp] * len(tp_data), tp_data['abs_d'], alpha=0.3, s=20)
    
    # Show median per timepoint
    median = tp_data['abs_d'].median()
    ax.scatter([tp], [median], color='red', s=200, marker='D', zorder=10, edgecolors='black', linewidths=2)

ax.set_xlabel('Timepoint', fontweight='bold', fontsize=12)
ax.set_ylabel('|Cohen\'s d|', fontweight='bold', fontsize=12)
ax.set_title('Within-Mouse Heterogeneity by Timepoint\n(Red diamonds = median)', fontsize=12, fontweight='bold')
ax.grid(alpha=0.3, axis='y')

plt.suptitle('Panel A: Baseline Biological Variability (ROIs within Same Mouse)', fontsize=14, fontweight='bold', y=1.02)
plt.tight_layout()
plt.show()

print(f"\n📊 Interpretation:")
print(f"   - Median within-mouse variation: |d| = {within_df['abs_d'].median():.3f}")
print(f"   - This sets the BASELINE for what 'consistent' means")
print(f"   - Between-condition effects > {within_df['abs_d'].quantile(0.9):.3f} (90th percentile) are notable")
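
The baseline idea can be sketched as a filter: take the 90th percentile of within-mouse |d| and keep only between-condition effects that exceed it. All numbers and marker names below are made up for illustration, and `baseline_threshold` and `flag_notable` are hypothetical helpers.

```python
import statistics
from typing import Dict, List


def baseline_threshold(within_abs_d: List[float]) -> float:
    """90th percentile of within-mouse |d|.

    Uses statistics.quantiles with its default 'exclusive' method, which
    interpolates slightly differently from pandas' Series.quantile(0.9).
    """
    return statistics.quantiles(within_abs_d, n=10)[-1]


def flag_notable(between_d: Dict[str, float], threshold: float) -> List[str]:
    """Markers whose between-condition |d| exceeds the within-mouse baseline."""
    return sorted(marker for marker, d in between_d.items() if abs(d) > threshold)


within = [0.05, 0.10, 0.12, 0.20, 0.25, 0.30, 0.35, 0.40, 0.50, 0.60]  # made-up |d| values
between = {"MarkerA": 2.4, "MarkerB": -0.1, "MarkerC": -1.8}  # hypothetical markers

threshold = baseline_threshold(within)
print(flag_notable(between, threshold))
```

Exceeding the baseline only says an effect is larger than typical ROI-to-ROI variation; it is not a significance test.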

## Panel B: Primary Biological Effect (Sham vs UUO D7)

**The most important comparison**: Healthy tissue vs late injury (7 days post-UUO). The cell below demonstrates the method on early (D1) vs late (D7) UUO instead.

**Key**: We compute the effect at the MOUSE level (pooling all ROIs within a mouse), then check consistency across the n=2 biological replicates.

In [None]:
# Note: Sham ROIs, if present, carry mouse IDs Sam1/Sam2 rather than M1/M2,
# so the M1/M2 loop below does not cover them.
# For demonstration, we'll compare D1 vs D7 to show temporal progression.
# Caution: 'M1' at D1 and 'M1' at D7 are matched by label only.

print("Available data:")
print(roi_df.groupby(['timepoint', 'mouse']).size())

# Strategy: Compare early (D1) vs late (D7) as demonstration of temporal effect
# This shows the methodology even if we don't have true sham controls

between_condition_effects = []

# For each marker, compare D1 vs D7
for mouse in ['M1', 'M2']:
    # Load all D1 ROIs for this mouse
    d1_rois = roi_df[(roi_df.timepoint == 'D1') & (roi_df.mouse == mouse)]
    d1_data = []
    for _, row in d1_rois.iterrows():
        sp_df = load_superpixel_features(row['file_path'])
        d1_data.append(sp_df)
    d1_combined = pd.concat(d1_data, ignore_index=True) if d1_data else None
    
    # Load all D7 ROIs for this mouse
    d7_rois = roi_df[(roi_df.timepoint == 'D7') & (roi_df.mouse == mouse)]
    d7_data = []
    for _, row in d7_rois.iterrows():
        sp_df = load_superpixel_features(row['file_path'])
        d7_data.append(sp_df)
    d7_combined = pd.concat(d7_data, ignore_index=True) if d7_data else None
    
    if d1_combined is None or d7_combined is None:
        continue
    
    # Compute effect size for each marker
    for marker in markers:
        d1_vals = d1_combined[marker].values
        d7_vals = d7_combined[marker].values
        
        d = cohens_d(d1_vals, d7_vals)
        
        between_condition_effects.append({
            'mouse': mouse,
            'marker': marker,
            'comparison': 'D1 vs D7',
            'cohens_d': d,
            'abs_d': abs(d),
            'n_d1': len(d1_vals),
            'n_d7': len(d7_vals)
        })

between_df = pd.DataFrame(between_condition_effects)

if not between_df.empty:
    print(f"\nComputed {len(between_df)} between-timepoint comparisons")
    print(f"Median between-timepoint effect: |d| = {between_df['abs_d'].median():.3f}")
    print(f"\nTop 5 largest effects:")
    print(between_df.nlargest(5, 'abs_d')[['marker', 'mouse', 'cohens_d', 'abs_d']])
else:
    print("⚠️  No between-timepoint data available")

In [None]:
# Panel B: Consistency check across mice
if not between_df.empty:
    # For each marker, compare effect in M1 vs M2
    consistency_check = []
    
    for marker in markers:
        m1_effect = between_df[(between_df.mouse == 'M1') & (between_df.marker == marker)]['cohens_d'].values
        m2_effect = between_df[(between_df.mouse == 'M2') & (between_df.marker == marker)]['cohens_d'].values
        
        if len(m1_effect) > 0 and len(m2_effect) > 0:
            d_m1 = m1_effect[0]
            d_m2 = m2_effect[0]
            
            same_direction = np.sign(d_m1) == np.sign(d_m2)
            magnitude_diff = abs(d_m1 - d_m2)
            
            consistency_check.append({
                'marker': marker,
                'd_M1': d_m1,
                'd_M2': d_m2,
                'mean_d': (d_m1 + d_m2) / 2,
                'abs_mean_d': abs((d_m1 + d_m2) / 2),
                'same_direction': same_direction,
                'magnitude_diff': magnitude_diff
            })
    
    consistency_df = pd.DataFrame(consistency_check)
    
    # Plot
    fig, axes = plt.subplots(1, 2, figsize=(14, 5))
    
    # Left: M1 vs M2 effect sizes
    ax = axes[0]
    colors = ['green' if same_dir else 'red' for same_dir in consistency_df['same_direction']]
    ax.scatter(consistency_df['d_M1'], consistency_df['d_M2'], c=colors, s=80, alpha=0.6, edgecolors='black')
    
    # Diagonal line (perfect agreement)
    lim = max(abs(consistency_df['d_M1']).max(), abs(consistency_df['d_M2']).max()) * 1.1
    ax.plot([-lim, lim], [-lim, lim], 'k--', linewidth=1, alpha=0.5, label='Perfect agreement')
    ax.axhline(0, color='gray', linewidth=0.5)
    ax.axvline(0, color='gray', linewidth=0.5)
    
    ax.set_xlabel('Effect Size in Mouse 1 (Cohen\'s d)', fontweight='bold', fontsize=12)
    ax.set_ylabel('Effect Size in Mouse 2 (Cohen\'s d)', fontweight='bold', fontsize=12)
    ax.set_title('Consistency Across Biological Replicates\n(Green = same direction, Red = opposite)', fontsize=12, fontweight='bold')
    ax.legend()
    ax.grid(alpha=0.3)
    ax.set_xlim(-lim, lim)
    ax.set_ylim(-lim, lim)
    
    # Right: Ranked effect sizes with error bars
    ax = axes[1]
    consistency_df_sorted = consistency_df.sort_values('abs_mean_d', ascending=False).reset_index(drop=True)
    x = np.arange(len(consistency_df_sorted))
    
    # Plot mean effect with range
    ax.scatter(x, consistency_df_sorted['mean_d'], c='steelblue', s=100, zorder=3, edgecolors='black', linewidths=1.5)
    
    # Error bars showing range between M1 and M2
    for i, row in consistency_df_sorted.iterrows():
        ax.plot([i, i], [row['d_M1'], row['d_M2']], color='gray', linewidth=2, alpha=0.5)
    
    ax.axhline(0, color='black', linewidth=1)
    ax.axhline(2.0, color='red', linestyle='--', linewidth=1.5, alpha=0.5, label='Large effect (d=2.0)')
    ax.axhline(-2.0, color='red', linestyle='--', linewidth=1.5, alpha=0.5)
    
    ax.set_xlabel('Marker (ranked by |effect|)', fontweight='bold', fontsize=12)
    ax.set_ylabel('Cohen\'s d (D1 vs D7; positive = higher at D1)', fontweight='bold', fontsize=12)
    ax.set_title('Between-Mouse Consistency\n(Points = mean, Bars = M1-M2 range)', fontsize=12, fontweight='bold')
    ax.set_xticks(x[::2])  # Show every other marker
    ax.set_xticklabels(consistency_df_sorted['marker'][::2], rotation=45, ha='right')
    ax.legend()
    ax.grid(alpha=0.3, axis='y')
    
    plt.suptitle('Panel B: Effect Consistency Across n=2 Biological Replicates', fontsize=14, fontweight='bold', y=1.02)
    plt.tight_layout()
    plt.show()
    
    # Summary statistics
    n_consistent = consistency_df['same_direction'].sum()
    pct_consistent = (n_consistent / len(consistency_df)) * 100
    median_diff = consistency_df['magnitude_diff'].median()
    
    print(f"\n📊 Consistency Analysis:")
    print(f"   - {n_consistent}/{len(consistency_df)} markers ({pct_consistent:.1f}%) show same direction in both mice")
    print(f"   - Median M1-M2 magnitude difference: {median_diff:.3f}")
    print(f"   - Compare to within-mouse variation: {within_df['abs_d'].median():.3f}")
    
    # Identify highly consistent large effects
    large_consistent = consistency_df[
        (consistency_df['abs_mean_d'] > 1.0) & 
        (consistency_df['same_direction']) &
        (consistency_df['magnitude_diff'] < 1.0)
    ]
    
    if len(large_consistent) > 0:
        print(f"\n✅ Robust findings (large effect + consistent direction + similar magnitude):")
        for _, row in large_consistent.iterrows():
            print(f"   - {row['marker']}: d={row['mean_d']:.2f} (M1={row['d_M1']:.2f}, M2={row['d_M2']:.2f})")
else:
    print("⚠️  Cannot compute Panel B - insufficient data")

## Panel C: Multi-Scale Coherence

**Final test**: Is the effect visible at multiple spatial scales?

If we see a consistent pattern at the 10 μm, 20 μm, and 40 μm scales, this suggests a robust biological phenomenon rather than a scale-specific artifact.

In [None]:
# For top marker (largest consistent effect), compute effect size at each scale
if not consistency_df.empty and len(large_consistent) > 0:
    top_marker = large_consistent.nlargest(1, 'abs_mean_d').iloc[0]['marker']
    print(f"Examining multi-scale coherence for: {top_marker}")
    
    multiscale_effects = []
    
    for scale in [10.0, 20.0, 40.0]:
        for mouse in ['M1', 'M2']:
            # Load D1 data at this scale
            d1_rois = roi_df[(roi_df.timepoint == 'D1') & (roi_df.mouse == mouse)]
            d1_data = []
            for _, row in d1_rois.iterrows():
                sp_df = load_superpixel_features(row['file_path'], scale=scale)
                d1_data.append(sp_df)
            d1_combined = pd.concat(d1_data, ignore_index=True) if d1_data else None
            
            # Load D7 data at this scale
            d7_rois = roi_df[(roi_df.timepoint == 'D7') & (roi_df.mouse == mouse)]
            d7_data = []
            for _, row in d7_rois.iterrows():
                sp_df = load_superpixel_features(row['file_path'], scale=scale)
                d7_data.append(sp_df)
            d7_combined = pd.concat(d7_data, ignore_index=True) if d7_data else None
            
            if d1_combined is None or d7_combined is None:
                continue
            
            # Compute effect at this scale
            d1_vals = d1_combined[top_marker].values
            d7_vals = d7_combined[top_marker].values
            
            d = cohens_d(d1_vals, d7_vals)
            
            multiscale_effects.append({
                'scale_um': scale,
                'mouse': mouse,
                'cohens_d': d,
                'abs_d': abs(d),
                'n_d1': len(d1_vals),
                'n_d7': len(d7_vals)
            })
    
    multiscale_df = pd.DataFrame(multiscale_effects)
    
    # Plot
    fig, ax = plt.subplots(figsize=(10, 6))
    
    for mouse in ['M1', 'M2']:
        mouse_data = multiscale_df[multiscale_df.mouse == mouse]
        ax.plot(mouse_data['scale_um'], mouse_data['cohens_d'], 'o-', linewidth=2, markersize=10,
                label=f'{mouse}', alpha=0.8)
    
    # Mean across mice
    mean_per_scale = multiscale_df.groupby('scale_um')['cohens_d'].mean()
    ax.plot(mean_per_scale.index, mean_per_scale.values, 's-', linewidth=3, markersize=12,
            color='black', label='Mean (n=2)', zorder=10)
    
    ax.axhline(0, color='gray', linewidth=1)
    ax.axhline(2.0, color='red', linestyle='--', linewidth=1.5, alpha=0.5, label='Large effect threshold')
    
    ax.set_xlabel('Spatial Scale (μm)', fontweight='bold', fontsize=12)
    ax.set_ylabel(f'Cohen\'s d for {top_marker}', fontweight='bold', fontsize=12)
    ax.set_title(f'Panel C: Multi-Scale Coherence for {top_marker}\n(Effect consistent across spatial resolutions?)',
                 fontsize=14, fontweight='bold')
    ax.set_xticks([10, 20, 40])
    ax.set_xticklabels(['10 (cell-like)', '20 (micro-niche)', '40 (tissue-region)'])
    ax.legend(fontsize=10)
    ax.grid(alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    # Statistics
    scale_cv = multiscale_df.groupby('mouse')['cohens_d'].apply(lambda x: x.std() / abs(x.mean()) if x.mean() != 0 else np.nan)
    print(f"\n📊 Multi-scale coherence:")
    print(f"   - Effect direction consistent: {all(multiscale_df.groupby('mouse')['cohens_d'].apply(lambda x: all(np.sign(x) == np.sign(x.iloc[0]))))}")
    print(f"   - Coefficient of variation across scales:")
    for mouse, cv in scale_cv.items():
        print(f"     • {mouse}: {cv:.2%}")
    print(f"   - Interpretation: Low CV (<20%) = scale-robust, High CV (>50%) = scale-dependent")
else:
    print("⚠️  No large consistent effects to examine at multiple scales")
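
The coherence summary reduces to two numbers per mouse: whether the sign of d is the same at every scale, and the coefficient of variation of d across scales. Here is a minimal standard-library sketch with made-up effect sizes; `scale_cv_plain` and `same_sign` are hypothetical names.

```python
import statistics
from typing import Dict, Iterable


def scale_cv_plain(effects_by_scale: Dict[float, float]) -> float:
    """Sample standard deviation of d across scales divided by |mean d|."""
    values = list(effects_by_scale.values())
    mean = statistics.fmean(values)
    if mean == 0:
        return float("nan")  # CV is undefined for a zero mean
    return statistics.stdev(values) / abs(mean)


def same_sign(values: Iterable[float]) -> bool:
    """True when every effect is strictly positive or every effect is strictly negative."""
    values = list(values)
    return all(v > 0 for v in values) or all(v < 0 for v in values)


stable = {10.0: 2.4, 20.0: 2.5, 40.0: 2.6}     # made-up, scale-robust
unstable = {10.0: 2.0, 20.0: 0.5, 40.0: -1.0}  # made-up, flips direction

print(f"stable:   CV={scale_cv_plain(stable):.2%}, same sign={same_sign(stable.values())}")
print(f"unstable: CV={scale_cv_plain(unstable):.2%}, same sign={same_sign(unstable.values())}")
```

With only three scales the CV is itself a noisy estimate, so it is better read as a descriptive summary than as a hard cut-off.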

## Summary: What Can We Claim?

### ✅ Legitimate Claims

1. **Within-mouse consistency**: "All ROIs within each mouse show consistent spatial patterns (median within-mouse |d| = X)"
2. **Between-mouse concordance**: "Both biological replicates show effects in same direction for Y% of markers"
3. **Large effect sizes**: "Observed effects are large (Cohen's d > 2.0) for markers A, B, C"
4. **Multi-scale robustness**: "Effects consistent across 10-40 μm spatial scales (CV < Z%)"
5. **Hypothesis generation**: "These pilot findings (n=2) warrant validation in powered study (n≥6)"

### ❌ Illegitimate Claims

1. ~~"Statistically significant at p<0.05"~~ → Underpowered
2. ~~"Superpixels are independent replicates"~~ → Pseudo-replication
3. ~~"Generalizes to C57BL/6 population"~~ → n=2 insufficient
4. ~~"Hierarchical modeling increases power"~~ → Between-mouse variance cannot be estimated reliably with n=2

### 📝 Paper Framing

> "This pilot study (n=2 mice per condition) demonstrates our hierarchical spatial analysis framework on real tissue. We assess effect consistency across biological replicates, technical replicates (ROIs within mouse), and spatial scales. While statistical power is limited by sample size (with n=2 per group, a two-sided two-sample t-test at α=0.05 needs a standardized effect well beyond d = 3, on the order of d ≈ 5 to 6, for 80% power), we observe large, directionally consistent effects for [specific markers]. These findings are hypothesis-generating and demonstrate the analytical pipeline's capability to extract biologically interpretable spatial patterns from limited samples."
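
The power statement in the framing paragraph can be checked by simulation. This is a minimal sketch under stated assumptions: a two-sided, pooled-variance two-sample t-test at α=0.05, 2 mice per group, normally distributed mouse-level values with unit variance, and a hypothetical function name `simulated_power`.

```python
import math
import random

# Two-sided 5% critical value of Student's t with 2 degrees of freedom (n1 + n2 - 2 = 2).
T_CRIT_DF2 = 4.303


def simulated_power(d: float, n_sims: int = 5000, seed: int = 0) -> float:
    """Monte Carlo power of a pooled two-sample t-test with 2 observations per group."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        a0, a1 = rng.gauss(d, 1.0), rng.gauss(d, 1.0)      # group shifted by d standard deviations
        b0, b1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)  # reference group
        # Sample variance of two values is (x0 - x1)^2 / 2; pool the two groups equally.
        pooled_var = ((a0 - a1) ** 2 / 2 + (b0 - b1) ** 2 / 2) / 2
        # Standard error of the mean difference: sqrt(pooled_var * (1/2 + 1/2)).
        t_stat = ((a0 + a1) / 2 - (b0 + b1) / 2) / math.sqrt(pooled_var)
        if abs(t_stat) > T_CRIT_DF2:
            rejections += 1
    return rejections / n_sims


for effect in (2.0, 3.0, 4.0, 5.0, 6.0):
    print(f"d = {effect:.1f}: estimated power = {simulated_power(effect):.2f}")
```

Scanning a few values of `d` shows where the estimate crosses 0.80 under these assumptions, which is the number to quote in the paper.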

**Key message**: We're demonstrating METHODS with pilot data, not making definitive BIOLOGICAL claims.