# GNN-MAPS: Diagnostic Experiments

## 🔬 Goal: Figure Out WHY GNN is Underperforming

**The Problem**:
- MLP: 88.2% F1
- GNN: 83.2% F1 (5 pp WORSE!)
- GNN should BEAT MLP when spatial patterns exist

**The Hypothesis**: K=5 neighbors is TOO SMALL
- With 100K+ cells, K=5 covers only each cell's immediate neighbors
- 2-layer GNN with K=5 → receptive field of at most ~30 cells (5 one-hop + 25 two-hop, about 0.03% of the tissue)
- Biological spatial patterns likely span 10-20+ cells
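The receptive-field argument can be checked with a few lines of arithmetic. The sketch below assumes every cell has exactly K neighbors and ignores overlap between neighborhoods, so it gives an upper bound; in a real spatial KNN graph many two-hop neighbors coincide and the true count is smaller. The function name `receptive_field_bound` and the round figure of 100,000 cells are illustrative assumptions, not values taken from the dataset.

```python
def receptive_field_bound(k: int, num_layers: int) -> int:
    """Upper bound on neighbors reachable within `num_layers` hops
    when every node has exactly `k` neighbors (overlaps ignored)."""
    return sum(k ** hop for hop in range(1, num_layers + 1))


n_cells = 100_000  # assumed order of magnitude, not the exact dataset size
for k in (5, 10, 15, 20, 25):
    bound = receptive_field_bound(k, num_layers=2)
    share = 100 * bound / n_cells
    print(f"K={k:2d}: at most {bound:4d} cells ({share:.3f}% of tissue)")
```

For K=5 and two layers the bound is 5 + 25 = 30 cells, dominated by the 25 two-hop neighbors. Even K=25 reaches at most 650 cells, so increasing K widens the field quadratically but still leaves it local.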

## 🧪 Three Diagnostic Experiments:

### **Experiment 1: K-Sensitivity Analysis** ⭐ MOST IMPORTANT
- Test K = 5, 10, 15, 20, 25
- Plot K vs F1-score
- **Expected**: If K is the issue → performance improves with larger K
- **If flat**: K is NOT the issue → problem is elsewhere

### **Experiment 2: Spatial Pattern Visualization**
- Plot each cell type in X-Y space
- Visual inspection: Are cell types clustered or randomly distributed?
- **Expected**: If clustered → spatial patterns exist → GNN SHOULD work
- **If random**: No patterns → GNN will never beat MLP

### **Experiment 3: Random Graph Baseline**
- Train GNN with: (1) True KNN graph, (2) Random graph, (3) No edges
- **Expected**: True > Random > No edges
- **If True ≈ Random**: GNN is NOT using the graph structure!

---

**Runtime**: ~1-2 hours on Kaggle P100

In [None]:
# Install PyTorch Geometric and its dependencies (Kaggle-compatible)
import sys
import torch

print(f"Python version: {sys.version}")
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA version: {torch.version.cuda}")

!pip install -q torch-geometric

import torch
pytorch_version = torch.__version__.split('+')[0]
cuda_version = torch.version.cuda.replace('.', '') if torch.cuda.is_available() else 'cpu'

print(f"\nInstalling PyG extensions for PyTorch {pytorch_version} and CUDA {cuda_version}...")

if torch.cuda.is_available():
    !pip install -q torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-{pytorch_version}+cu{cuda_version}.html
else:
    !pip install -q torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-{pytorch_version}+cpu.html

print("\n✅ PyTorch Geometric installation complete!")

In [None]:
# Import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import os
from tqdm import tqdm

import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.nn import SAGEConv
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import f1_score

print("✅ All libraries loaded!")
print(f"   PyTorch: {torch.__version__}")
print(f"   Device: {'GPU' if torch.cuda.is_available() else 'CPU'}")

# 1. Load and Prepare Data

In [None]:
# Load data
df = pd.read_csv("/kaggle/input/chl-codex-annotated/cHL_CODEX_annotation.csv")
print(f"Dataset: {df.shape[0]:,} cells × {df.shape[1]} features")
display(df.head(3))

In [None]:
# Define columns
x_col = 'X_cent'
y_col = 'Y_cent'
label_col = 'cellType'

marker_cols = [
    'BCL.2', 'CCR6', 'CD11b', 'CD11c', 'CD15', 'CD16', 'CD162', 'CD163',
    'CD2', 'CD20', 'CD206', 'CD25', 'CD30', 'CD31', 'CD4', 'CD44',
    'CD45RA', 'CD45RO', 'CD45', 'CD5', 'CD56', 'CD57', 'CD68', 'CD69',
    'CD7', 'CD8', 'Collagen.4', 'Cytokeratin', 'DAPI.01', 'EGFR',
    'FoxP3', 'Granzyme.B', 'HLA.DR', 'IDO.1', 'LAG.3', 'MCT', 'MMP.9',
    'MUC.1', 'PD.1', 'PD.L1', 'Podoplanin', 'T.bet', 'TCR.g.d', 'TCRb',
    'Tim.3', 'VISA', 'Vimentin', 'a.SMA', 'b.Catenin'
]

# Normalize features
scaler = StandardScaler()
X_normalized = scaler.fit_transform(df[marker_cols].values)
x = torch.tensor(X_normalized, dtype=torch.float)

# Encode labels
unique_labels = sorted(df[label_col].unique())
label_map = {name: i for i, name in enumerate(unique_labels)}
y = torch.tensor(df[label_col].map(label_map).values, dtype=torch.long)
num_classes = len(label_map)

# Get coordinates
coords = df[[x_col, y_col]].values

# Random train/test split (80/20)
torch.manual_seed(42)
random_perm = torch.randperm(len(df))
n_train = int(0.8 * len(df))
train_mask = torch.zeros(len(df), dtype=torch.bool)
test_mask = torch.zeros(len(df), dtype=torch.bool)
train_mask[random_perm[:n_train]] = True
test_mask[random_perm[n_train:]] = True

print(f"\n✅ Prepared:")
print(f"   Features: {len(marker_cols)} markers")
print(f"   Labels: {num_classes} cell types")
print(f"   Train: {train_mask.sum():,} | Test: {test_mask.sum():,}")

# 2. Experiment 2: Spatial Pattern Visualization (Do First!)

Before testing different K values, let's check if spatial patterns even exist!
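Visual inspection can be backed by a simple number: the fraction of a cell's spatial neighbors that share its label, compared with what label frequencies alone would give. The sketch below is one possible check, not part of the original experiments. The function name `neighbor_label_agreement` is hypothetical, and it assumes a neighbor array of shape `(N, k)` with the self column already dropped, such as the output of `NearestNeighbors.kneighbors`.

```python
import numpy as np


def neighbor_label_agreement(neighbor_idx, labels):
    """Return (observed, expected) same-label rates.

    observed: mean fraction of neighbors sharing the center cell's label.
    expected: chance rate if labels were spatially shuffled, sum of p_c^2.
    """
    labels = np.asarray(labels)
    neighbor_idx = np.asarray(neighbor_idx)
    same = labels[neighbor_idx] == labels[:, None]  # (N, k) booleans
    observed = float(same.mean())
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    expected = float(np.sum(p ** 2))
    return observed, expected


# Toy example: six cells on a line, two neighbors each, two cell types.
toy_labels = np.array([0, 0, 0, 1, 1, 1])
toy_neighbors = np.array([[1, 2], [0, 2], [1, 3], [2, 4], [3, 5], [3, 4]])
observed, expected = neighbor_label_agreement(toy_neighbors, toy_labels)
print(f"observed={observed:.3f}, expected by chance={expected:.3f}")
```

An observed rate well above the chance rate suggests clustered cell types; a rate close to chance suggests the labels carry little spatial signal at that neighborhood size.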

In [None]:
print("=" * 80)
print("EXPERIMENT 2: SPATIAL PATTERN VISUALIZATION")
print("=" * 80)

# Get top cell types by count
cell_type_counts = df[label_col].value_counts()
top_cell_types = cell_type_counts.head(9).index.tolist()

print(f"\nVisualizing top 9 cell types (out of {num_classes}):")
for i, ct in enumerate(top_cell_types, 1):
    count = cell_type_counts[ct]
    pct = 100 * count / len(df)
    print(f"  {i}. {ct:<30s} {count:>6,} cells ({pct:>5.2f}%)")

# Create 3x3 grid of spatial plots
fig, axes = plt.subplots(3, 3, figsize=(18, 18))
axes = axes.flatten()

for idx, cell_type in enumerate(top_cell_types):
    ax = axes[idx]
    
    # Get cells of this type
    mask = (df[label_col] == cell_type).values
    type_coords = coords[mask]
    other_coords = coords[~mask]
    
    # Plot: other cells (gray, background) + this type (colored)
    ax.scatter(other_coords[:, 0], other_coords[:, 1], 
               c='lightgray', s=0.5, alpha=0.2, label='Other')
    ax.scatter(type_coords[:, 0], type_coords[:, 1], 
               c=f'C{idx}', s=2, alpha=0.7, label=cell_type)
    
    ax.set_title(f"{cell_type}\n({cell_type_counts[cell_type]:,} cells)", 
                 fontsize=11, fontweight='bold')
    ax.set_xlabel('X Position', fontsize=9)
    ax.set_ylabel('Y Position', fontsize=9)
    ax.legend(loc='upper right', fontsize=8)
    ax.grid(alpha=0.3)

plt.tight_layout()
plt.suptitle('Spatial Distribution of Top 9 Cell Types\n(Are they clustered or random?)', 
             fontsize=16, fontweight='bold', y=1.01)
plt.show()

print("\n" + "=" * 80)
print("VISUAL INSPECTION GUIDE:")
print("=" * 80)
print("✅ CLUSTERED (good for GNN):")
print("   - Cells form distinct regions/patches")
print("   - Clear spatial organization")
print("   → GNN should be able to learn these patterns!")
print("\n❌ RANDOM (bad for GNN):")
print("   - Cells uniformly scattered across tissue")
print("   - No clear spatial structure")
print("   → No spatial pattern to learn → GNN will never beat MLP")
print("=" * 80)

# 3. Model Definition

In [None]:
class GraphSAGE(torch.nn.Module):
    """2-layer GraphSAGE (same as previous experiments)"""
    def __init__(self, in_channels, hidden_channels, out_channels, dropout=0.1):
        super().__init__()
        self.conv1 = SAGEConv(in_channels, hidden_channels)
        self.conv2 = SAGEConv(hidden_channels, out_channels)
        self.dropout = dropout

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index)
        x = F.relu(x)
        x = F.dropout(x, p=self.dropout, training=self.training)
        x = self.conv2(x, edge_index)
        return F.log_softmax(x, dim=1)

print("✅ GraphSAGE model defined")

In [None]:
# Training functions
def train_epoch(model, data, optimizer):
    model.train()
    optimizer.zero_grad()
    out = model(data.x, data.edge_index)
    loss = F.nll_loss(out[data.train_mask], data.y[data.train_mask])
    loss.backward()
    optimizer.step()
    return loss.item()

@torch.no_grad()
def evaluate(model, data, mask):
    model.eval()
    out = model(data.x, data.edge_index)
    pred = out.argmax(dim=1)
    y_true = data.y[mask].cpu().numpy()
    y_pred = pred[mask].cpu().numpy()
    f1 = f1_score(y_true, y_pred, average='weighted', zero_division=0)
    return f1

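def quick_train(model, data, lr=0.001, epochs=100, patience=20, verbose=False):
    """
    Fast training for diagnostic purposes.
    Reduced epochs (100 instead of 500) to speed up experiments.

    Test F1 is checked every 10 epochs; training stops after `patience // 10`
    consecutive checks without improvement. Because the test set drives early
    stopping, the returned best F1 is optimistically biased.
    """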
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    best_f1 = 0
    patience_counter = 0
    
    for epoch in range(1, epochs + 1):
        loss = train_epoch(model, data, optimizer)
        
        if epoch % 10 == 0:
            test_f1 = evaluate(model, data, data.test_mask)
            
            if test_f1 > best_f1:
                best_f1 = test_f1
                patience_counter = 0
            else:
                patience_counter += 1
            
            if verbose and epoch % 20 == 0:
                print(f"Epoch {epoch:3d} | Loss: {loss:.4f} | Test F1: {test_f1:.4f}")
            
            if patience_counter >= patience // 10:
                if verbose:
                    print(f"Early stop at epoch {epoch}")
                break
    
    return best_f1

print("✅ Training functions defined")
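The stopping rule inside `quick_train` can be isolated into a small helper, which makes the patience logic easier to test on its own. This is a sketch under assumptions: the class name `EarlyStopper` is hypothetical, and the scores are made-up numbers standing in for F1 on a held-out validation split, which the notebook does not define.

```python
class EarlyStopper:
    """Stop after `patience_checks` consecutive evaluations without a new best."""

    def __init__(self, patience_checks: int):
        self.patience_checks = patience_checks
        self.best = float("-inf")
        self.bad_checks = 0

    def update(self, score: float) -> bool:
        """Record one evaluation; return True when training should stop."""
        if score > self.best:
            self.best = score
            self.bad_checks = 0
        else:
            self.bad_checks += 1
        return self.bad_checks >= self.patience_checks


# Made-up validation F1 values, one per evaluation (every 10 epochs).
val_scores = [0.70, 0.78, 0.81, 0.80, 0.81, 0.79]
stopper = EarlyStopper(patience_checks=2)  # mirrors patience // 10 with patience=20
stopped_at = None
for check, score in enumerate(val_scores, start=1):
    if stopper.update(score):
        stopped_at = check
        break
print(f"stopped at check {stopped_at}, best score {stopper.best:.2f}")
```

A tie with the current best counts as no improvement, matching the strict `>` comparison in `quick_train`. Feeding the helper validation scores rather than test scores would keep the test set out of model selection.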

# 4. Experiment 1: K-Sensitivity Analysis ⭐

**THE BIG TEST**: Does performance improve with larger K?
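Edge direction matters when building the KNN graph. With PyTorch Geometric's default message flow (`source_to_target`), messages travel from `edge_index[0]` to `edge_index[1]`. The sketch below is a brute-force NumPy version for toy sizes only. It assumes the intent is for each cell to aggregate from its own K nearest neighbors, so it puts the neighbor in row 0 and the center cell in row 1. The function name `knn_edges_receiver_centric` is hypothetical.

```python
import numpy as np


def knn_edges_receiver_centric(coords, k):
    """Return a (2, N*k) edge array: row 0 = neighbor (sender), row 1 = center (receiver)."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    # Pairwise squared distances; O(N^2) memory, so toy sizes only.
    diff = coords[:, None, :] - coords[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(dist2, np.inf)  # a cell is never its own neighbor
    neighbors = np.argsort(dist2, axis=1)[:, :k]  # (N, k) nearest indices
    centers = np.repeat(np.arange(n), k)
    return np.vstack([neighbors.ravel(), centers])


rng = np.random.default_rng(0)
toy_coords = rng.random((50, 2))
edges = knn_edges_receiver_centric(toy_coords, k=5)
in_degree = np.bincount(edges[1], minlength=50)
print(edges.shape, in_degree.min(), in_degree.max())
```

With this orientation every cell receives exactly K messages. With the rows swapped, each cell instead receives from the cells that list it as a neighbor, so the number of incoming messages varies from cell to cell.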

In [None]:
print("=" * 80)
print("EXPERIMENT 1: K-SENSITIVITY ANALYSIS")
print("=" * 80)

# Test different K values
k_values = [5, 10, 15, 20, 25]
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"\n💻 Device: {device}")

results_k_sensitivity = []

for k in k_values:
    print(f"\n{'='*80}")
    print(f"Testing K = {k} neighbors")
    print(f"{'='*80}")
    
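    # Build KNN graph with K neighbors
    print(f"Building KNN graph with K={k}...")
    nbrs = NearestNeighbors(n_neighbors=k + 1, algorithm='ball_tree').fit(coords)
    distances, indices = nbrs.kneighbors(coords)
    
    # Column 0 of `indices` is the query cell itself (assuming no duplicate coordinates), so skip it
    source_nodes = np.repeat(np.arange(len(df)), k)
    target_nodes = indices[:, 1:].flatten()
    # PyG sends messages from row 0 to row 1, so here each cell sends to its K nearest neighbors
    # Stack into one array first: building a tensor from a list of arrays is slow
    edge_index = torch.from_numpy(np.vstack([source_nodes, target_nodes])).long()
    
    print(f"✅ Graph built: {edge_index.shape[1]:,} edges")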
    
    # Create data object
    data = Data(
        x=x,
        edge_index=edge_index,
        y=y,
        train_mask=train_mask,
        test_mask=test_mask
    ).to(device)
    
    # Train model (quick training for diagnostics)
    print(f"Training GNN (100 epochs with early stopping)...")
    model = GraphSAGE(len(marker_cols), 512, num_classes, dropout=0.1).to(device)
    best_f1 = quick_train(model, data, epochs=100, patience=20, verbose=True)
    
    # Final evaluation
    final_f1 = evaluate(model, data, data.test_mask)
    
    results_k_sensitivity.append({
        'K': k,
        'F1': final_f1,
        'edges': edge_index.shape[1]
    })
    
    print(f"\n✅ K={k}: Final Test F1 = {final_f1:.4f} ({final_f1*100:.2f}%)")

print("\n" + "=" * 80)
print("K-SENSITIVITY RESULTS")
print("=" * 80)

df_k_results = pd.DataFrame(results_k_sensitivity)
print("\n" + df_k_results.to_string(index=False))

# Compute improvement
baseline_f1 = df_k_results[df_k_results['K'] == 5]['F1'].values[0]
best_k = df_k_results.loc[df_k_results['F1'].idxmax(), 'K']
best_f1 = df_k_results['F1'].max()
improvement = (best_f1 - baseline_f1) * 100

print(f"\n📊 Analysis:")
print(f"   Baseline (K=5):  {baseline_f1:.4f} ({baseline_f1*100:.2f}%)")
print(f"   Best (K={best_k}):     {best_f1:.4f} ({best_f1*100:.2f}%)")
print(f"   Improvement:     {improvement:+.2f} pp")

In [None]:
# Visualize K-sensitivity
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 6))

# Plot 1: K vs F1-Score
ax1.plot(df_k_results['K'], df_k_results['F1'], 
         marker='o', linewidth=3, markersize=10, color='#e74c3c')
ax1.axhline(y=0.882, color='blue', linestyle='--', linewidth=2, alpha=0.7, label='MLP Baseline (88.2%)')
ax1.axhline(y=0.90, color='green', linestyle='--', linewidth=2, alpha=0.7, label='MAPS Target (90%)')
ax1.set_xlabel('K (Number of Neighbors)', fontsize=14, fontweight='bold')
ax1.set_ylabel('Test F1-Score', fontsize=14, fontweight='bold')
ax1.set_title('K-Sensitivity: Does Larger K Help?', fontsize=16, fontweight='bold')
ax1.grid(alpha=0.3)
ax1.legend(fontsize=12)
ax1.set_ylim([0.8, 0.95])

# Annotate points
for _, row in df_k_results.iterrows():
    ax1.annotate(f"{row['F1']:.3f}", 
                xy=(row['K'], row['F1']), 
                xytext=(0, 10), 
                textcoords='offset points',
                ha='center', fontsize=11, fontweight='bold')

# Plot 2: K vs Improvement over K=5
improvements = [(f1 - baseline_f1) * 100 for f1 in df_k_results['F1']]
colors = ['#2ecc71' if imp > 0 else '#e74c3c' for imp in improvements]

bars = ax2.bar(df_k_results['K'], improvements, color=colors, edgecolor='black', linewidth=1.5)
ax2.axhline(y=0, color='black', linestyle='-', linewidth=1)
ax2.set_xlabel('K (Number of Neighbors)', fontsize=14, fontweight='bold')
ax2.set_ylabel('Improvement vs K=5 (percentage points)', fontsize=14, fontweight='bold')
ax2.set_title('Improvement Over Baseline (K=5)', fontsize=16, fontweight='bold')
ax2.grid(axis='y', alpha=0.3)

for bar, imp in zip(bars, improvements):
    height = bar.get_height()
    ax2.text(bar.get_x() + bar.get_width()/2., height,
            f'{imp:+.2f} pp',
            ha='center', va='bottom' if height > 0 else 'top', 
            fontsize=11, fontweight='bold')

plt.tight_layout()
plt.show()

# Interpretation
print("\n" + "=" * 80)
print("INTERPRETATION GUIDE")
print("=" * 80)

if improvement > 2.0:
    print("\n✅ SIGNIFICANT IMPROVEMENT!")
    print(f"   K={best_k} improves by {improvement:.2f} pp over K=5")
    print("   → K WAS the bottleneck!")
    print("   → Increase K and potentially beat MLP")
    print(f"   → Next: Try K={best_k} with full 500-epoch training")
elif improvement > 0.5:
    print("\n⚠️ MODEST IMPROVEMENT")
    print(f"   K={best_k} improves by {improvement:.2f} pp over K=5")
    print("   → K helps but not enough")
    print("   → Try deeper GNN or different architecture")
else:
    print("\n❌ NO MEANINGFUL IMPROVEMENT")
    print(f"   Best improvement: {improvement:.2f} pp")
    print("   → K is NOT the main issue")
    print("   → Problem is either:")
    print("      1. No spatial patterns exist (check visualization above)")
    print("      2. GNN architecture is wrong")
    print("      3. Need manual spatial features instead")

print("=" * 80)

# 5. Experiment 3: Random Graph Baseline

**Critical Test**: Is the GNN actually using the graph structure?
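A random baseline is most informative when it matches the true graph in everything except which cells are connected. The sketch below is one way to do that, offered as an assumption rather than the notebook's method: each node keeps exactly `k` outgoing edges, targets are drawn uniformly from the other nodes, and self-loops are avoided by construction so the edge count matches the KNN graph exactly. The function name `random_knn_like_edges` is hypothetical.

```python
import numpy as np


def random_knn_like_edges(n_nodes, k, seed=0):
    """Return a (2, n_nodes*k) edge array with k random, non-self targets per node."""
    rng = np.random.default_rng(seed)  # seeded for reproducibility
    sources = np.repeat(np.arange(n_nodes), k)
    # Draw from the n_nodes - 1 other nodes, then shift indices at or above
    # the source up by one so a node can never be paired with itself.
    targets = rng.integers(0, n_nodes - 1, size=n_nodes * k)
    targets = targets + (targets >= sources)
    return np.vstack([sources, targets])


random_edges = random_knn_like_edges(n_nodes=1000, k=15, seed=42)
out_degree = np.bincount(random_edges[0], minlength=1000)
print(random_edges.shape, out_degree.min(), out_degree.max())
```

Duplicate edges can still occur, but they are rare when `k` is small relative to the number of nodes. Because the edge count and per-node degree match the true graph, a gap between the two runs can be attributed to spatial structure rather than to graph density.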

In [None]:
print("=" * 80)
print("EXPERIMENT 3: RANDOM GRAPH BASELINE")
print("=" * 80)

# Use K=15 (middle value from k-sensitivity)
k_test = 15
print(f"\nTesting with K={k_test} neighbors\n")

# Build TRUE KNN graph
print("1️⃣ Building TRUE KNN graph...")
nbrs = NearestNeighbors(n_neighbors=k_test + 1, algorithm='ball_tree').fit(coords)
distances, indices = nbrs.kneighbors(coords)
source_nodes = np.repeat(np.arange(len(df)), k_test)
target_nodes = indices[:, 1:].flatten()
edge_index_true = torch.from_numpy(np.vstack([source_nodes, target_nodes])).long()
print(f"   ✅ {edge_index_true.shape[1]:,} edges")

# Build RANDOM graph (same number of edges before self-loop removal)
print("\n2️⃣ Building RANDOM graph (same # edges)...")
n_cells = len(df)
n_edges = edge_index_true.shape[1]
rng = np.random.default_rng(42)  # seeded so the random graph is reproducible
random_sources = rng.integers(0, n_cells, size=n_edges)
random_targets = rng.integers(0, n_cells, size=n_edges)
# Remove self-loops (drops a few edges, so the count ends up slightly below n_edges)
mask = random_sources != random_targets
edge_index_random = torch.from_numpy(np.vstack([random_sources[mask], random_targets[mask]])).long()
print(f"   ✅ {edge_index_random.shape[1]:,} edges")

# NO GRAPH (empty edge index)
print("\n3️⃣ Creating NO GRAPH baseline (empty edges)...")
edge_index_empty = torch.empty((2, 0), dtype=torch.long)
print("   ✅ 0 edges")

# Train on each graph type
graph_types = [
    ('True KNN Graph', edge_index_true),
    ('Random Graph', edge_index_random),
    ('No Graph (Empty)', edge_index_empty),
]

results_graph_baseline = []

for graph_name, edge_index in graph_types:
    print(f"\n{'='*80}")
    print(f"Training with: {graph_name}")
    print(f"{'='*80}")
    
    # Create data
    data = Data(
        x=x,
        edge_index=edge_index,
        y=y,
        train_mask=train_mask,
        test_mask=test_mask
    ).to(device)
    
    # Train
    model = GraphSAGE(len(marker_cols), 512, num_classes, dropout=0.1).to(device)
    best_f1 = quick_train(model, data, epochs=100, patience=20, verbose=True)
    final_f1 = evaluate(model, data, data.test_mask)
    
    results_graph_baseline.append({
        'Graph Type': graph_name,
        'F1': final_f1,
        'Edges': edge_index.shape[1]
    })
    
    print(f"\n✅ {graph_name}: Final F1 = {final_f1:.4f} ({final_f1*100:.2f}%)")

print("\n" + "=" * 80)
print("GRAPH BASELINE RESULTS")
print("=" * 80)

df_graph_results = pd.DataFrame(results_graph_baseline)
print("\n" + df_graph_results.to_string(index=False))

In [None]:
# Visualize graph baseline results
fig, ax = plt.subplots(1, 1, figsize=(10, 7))

graph_names = df_graph_results['Graph Type'].tolist()
f1_scores = df_graph_results['F1'].tolist()
colors = ['#2ecc71', '#e67e22', '#e74c3c']

bars = ax.bar(graph_names, f1_scores, color=colors, edgecolor='black', linewidth=2, width=0.6)
ax.axhline(y=0.882, color='blue', linestyle='--', linewidth=2, label='MLP Baseline (88.2%)')
ax.set_ylabel('Test F1-Score', fontsize=14, fontweight='bold')
ax.set_title('Graph Structure Test: Is GNN Using the Graph?', fontsize=16, fontweight='bold')
ax.set_ylim([0.75, 0.95])
ax.grid(axis='y', alpha=0.3)
ax.legend(fontsize=12)

for bar, f1 in zip(bars, f1_scores):
    height = bar.get_height()
    ax.text(bar.get_x() + bar.get_width()/2., height,
            f'{f1:.4f}\n({f1*100:.2f}%)',
            ha='center', va='bottom', fontsize=13, fontweight='bold')

plt.tight_layout()
plt.show()

# Interpretation
true_f1 = df_graph_results[df_graph_results['Graph Type'] == 'True KNN Graph']['F1'].values[0]
random_f1 = df_graph_results[df_graph_results['Graph Type'] == 'Random Graph']['F1'].values[0]
empty_f1 = df_graph_results[df_graph_results['Graph Type'] == 'No Graph (Empty)']['F1'].values[0]

print("\n" + "=" * 80)
print("INTERPRETATION")
print("=" * 80)

diff_true_vs_random = (true_f1 - random_f1) * 100
diff_true_vs_empty = (true_f1 - empty_f1) * 100

print(f"\nTrue vs Random: {diff_true_vs_random:+.2f} pp")
print(f"True vs Empty:  {diff_true_vs_empty:+.2f} pp")

if diff_true_vs_random > 1.0:
    print("\n✅ GNN IS USING THE GRAPH!")
    print("   True graph >> Random graph")
    print("   → GNN learns from spatial structure")
    print("   → Problem is K/architecture, not fundamental")
elif diff_true_vs_random > 0.3:
    print("\n⚠️ GNN WEAKLY USES GRAPH")
    print("   True graph slightly > Random graph")
    print("   → GNN uses graph but benefit is small")
    print("   → Spatial patterns might be weak")
else:
    print("\n❌ GNN NOT USING GRAPH!")
    print("   True graph ≈ Random graph")
    print("   → GNN ignores graph structure")
    print("   → Either:")
    print("      1. Implementation bug (message passing not working)")
    print("      2. Protein features so strong that graph is redundant")
    print("      3. No spatial patterns to learn")

print("=" * 80)

# 6. Summary & Recommendations

In [None]:
print("=" * 80)
print("DIAGNOSTIC SUMMARY")
print("=" * 80)

print("\n" + "="*80)
print("1️⃣ EXPERIMENT 1: K-SENSITIVITY")
print("="*80)
print(f"\n{df_k_results.to_string(index=False)}")
print(f"\n   → Best K: {best_k} (F1 = {best_f1:.4f})")
print(f"   → Improvement over K=5: {improvement:+.2f} pp")

print("\n" + "="*80)
print("2️⃣ EXPERIMENT 2: SPATIAL PATTERNS")
print("="*80)
print("\n   → Check visualization above")
print("   → Are cell types clustered or random?")

print("\n" + "="*80)
print("3️⃣ EXPERIMENT 3: GRAPH BASELINE")
print("="*80)
print(f"\n{df_graph_results.to_string(index=False)}")
print(f"\n   → True vs Random: {diff_true_vs_random:+.2f} pp")
print(f"   → True vs Empty:  {diff_true_vs_empty:+.2f} pp")

print("\n" + "="*80)
print("FINAL RECOMMENDATIONS")
print("="*80)

# Decision tree based on results
if improvement > 2.0 and diff_true_vs_random > 1.0:
    print("\n🎉 GOOD NEWS! Both K and graph matter!")
    print("\n✅ NEXT STEPS:")
    print(f"   1. Use K={best_k} with full 500-epoch training")
    print("   2. Try 3-4 layer GNN (deeper receptive field)")
    print("   3. Experiment with GAT (attention mechanism)")
    print("   4. Expected: Should beat MLP (88.2%) and approach MAPS (90%)")

elif improvement > 0.5:
    print("\n⚠️ MIXED RESULTS: K helps modestly")
    print("\n✅ NEXT STEPS:")
    print(f"   1. Try K={best_k} with deeper GNN (3-4 layers)")
    print("   2. Experiment with different architectures (GAT, GCN)")
    print("   3. Consider manual spatial features as alternative")
    print("   4. Expected: Might reach 85-87%, still below MLP")

else:
    print("\n❌ BAD NEWS: K doesn't help significantly")
    print("\n🔄 ALTERNATIVE APPROACHES:")
    print("   1. MANUAL SPATIAL FEATURES (recommended):")
    print("      - For each cell: avg neighbor expression, cell type composition")
    print("      - Add to MLP → simpler and might work better")
    print("\n   2. ABANDON SPATIAL CONTEXT:")
    print("      - Focus on improving MLP (88.2% → 90%)")
    print("      - Hyperparameter tuning, ensembles, better features")
    print("\n   3. INVESTIGATE DATA:")
    print("      - Compute spatial autocorrelation (Moran's I)")
    print("      - If no spatial patterns → GNN will never help")

print("\n" + "="*80)
print("🚀 DIAGNOSTICS COMPLETE!")
print("="*80)

print(f"\n💻 Hardware: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'CPU'}")
print("⏱️  Note: Used quick training (100 epochs) for diagnostics")
print("📊 Full training (500 epochs) may show +1-2 pp improvement")
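The Moran's I check suggested in the recommendations can be sketched with NumPy alone. The version below assumes binary KNN weights (weight 1 for each listed neighbor, 0 otherwise) and takes a neighbor array of shape `(N, k)` with the self column removed. The function name `morans_i_knn` is hypothetical, and the input must not be constant, since the denominator would then be zero. To test one cell type, `values` could be a 0/1 indicator of membership in that type.

```python
import numpy as np


def morans_i_knn(values, neighbor_idx):
    """Moran's I with binary weights defined by a (N, k) neighbor index array."""
    x = np.asarray(values, dtype=float)
    nbr = np.asarray(neighbor_idx)
    z = x - x.mean()                    # deviations from the mean
    n, k = nbr.shape
    w_total = n * k                     # sum of all binary weights
    cross = np.sum(z[:, None] * z[nbr]) # sum over i and its neighbors j of z_i * z_j
    return (n / w_total) * cross / np.sum(z ** 2)


# Toy ring of six cells, each linked to its left and right neighbor.
ring = np.array([[(i - 1) % 6, (i + 1) % 6] for i in range(6)])
blocks = morans_i_knn([0, 0, 0, 1, 1, 1], ring)       # two contiguous patches
alternating = morans_i_knn([0, 1, 0, 1, 0, 1], ring)  # perfectly interleaved
print(f"blocks: {blocks:.3f}, alternating: {alternating:.3f}")
```

Positive values indicate that similar values sit next to each other (clustering), values near zero indicate no spatial structure, and negative values indicate interleaving. On the toy ring the patched layout gives about 0.333 and the alternating layout gives -1.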