# Notebook 03: OpenMC Loop & Inference

## Solving the Validation Paradox

**Learning Objective:** Validate models using reactor simulations and sensitivity weighting.

**Focus:**
- **U-235**: Reactor validation with OpenMC (criticality safety)
- **Cl-35**: Uncertainty quantification for research applications

### The Validation Paradox (Revisited)

```
Model A: Test MSE = 0.010, k_eff error = 450 pcm → UNSAFE!
Model B: Test MSE = 0.015, k_eff error = 50 pcm  → SAFE!
```

**Standard ML** picks Model A (lower MSE)  
**Physics-informed ML** picks Model B (better reactor prediction)
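
The two selection rules can be written out as a short sketch. The `Candidate` record is a hypothetical container introduced only for illustration; it is not part of `nucml_next`.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    name: str
    test_mse: float         # held-out cross-section MSE
    keff_error_pcm: float   # magnitude of the k_eff error from a reactor calculation

candidates = [
    Candidate("Model A", test_mse=0.010, keff_error_pcm=450.0),
    Candidate("Model B", test_mse=0.015, keff_error_pcm=50.0),
]

# Standard ML: rank by test MSE only
best_by_mse = min(candidates, key=lambda c: c.test_mse)

# Physics-informed ML: rank by the reactor-level error
best_by_keff = min(candidates, key=lambda c: c.keff_error_pcm)

print(best_by_mse.name)   # Model A
print(best_by_keff.name)  # Model B
```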

### The Solution: Sensitivity Weighting

Weight each term of the training loss by the sensitivity of k_eff to that cross section, ∂k/∂σ:
- High-sensitivity reactions (U-235 fission) → Large penalty
- Low-sensitivity reactions (trace captures, Cl-35) → Small penalty
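
One way to express this weighting is a mean squared error whose per-sample weights are proportional to the sensitivity magnitude. The sketch below is a NumPy illustration under that assumption; the function name `sensitivity_weighted_mse` is hypothetical, and this is not necessarily how `SensitivityWeightedLoss` is implemented.

```python
import numpy as np

def sensitivity_weighted_mse(pred, target, sensitivity):
    """Squared error averaged with weights proportional to |sensitivity|."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    w = np.abs(np.asarray(sensitivity, dtype=float))
    w = w / w.sum()  # normalize so the weights sum to 1
    return float(np.sum(w * (pred - target) ** 2))

# Two samples: one high-sensitivity, one low-sensitivity (illustrative values)
target = np.array([1.0, 1.0])
sens = np.array([0.8, 0.0001])

# Same absolute miss (0.1), placed on a different sample each time
miss_on_high = sensitivity_weighted_mse([1.1, 1.0], target, sens)
miss_on_low = sensitivity_weighted_mse([1.0, 1.1], target, sens)

print(f"miss on high-sensitivity sample: {miss_on_high:.2e}")
print(f"miss on low-sensitivity sample:  {miss_on_low:.2e}")
```

An identical error costs far more when it falls on the high-sensitivity sample, which is the behavior the weighting is meant to produce.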

**Why this matters for our isotopes:**
- **U-235**: Critical for reactor safety → Must validate with OpenMC
- **Cl-35**: Research interest → Needs uncertainty quantification, not reactor validation

In [None]:
import sys
sys.path.append('..')

import numpy as np
from pathlib import Path
from nucml_next.validation import OpenMCValidator, SensitivityAnalyzer, ReactorBenchmark

# Verify EXFOR data exists (needed for model training/validation)
exfor_path = Path('../data/exfor_enriched.parquet')
if not exfor_path.exists():
    raise FileNotFoundError(
        f"EXFOR data not found at {exfor_path}\n"
        "Please run: python scripts/ingest_exfor.py --exfor-root <path> --output data/exfor_enriched.parquet"
    )

print("✓ Imports successful")
print("✓ EXFOR data found")

print("\n💡 NOTE: Models being validated should be trained with DataSelection:")
print("   from nucml_next.data import DataSelection, NucmlDataset")
print("   selection = DataSelection(projectile='neutron', mt_mode='reactor_core')")
print("   dataset = NucmlDataset(data_path='...', selection=selection)")
print("   → Ensures physics-aware training for reactor validation\n")

### Step 1: Sensitivity Analysis

In [None]:
# Analyze which reactions matter for reactors
analyzer = SensitivityAnalyzer(reactor_type='PWR')
analyzer.print_sensitivity_report()

print("\n💡 Key Insight: Comparing our focus isotopes...")
print("="*80)
print("U-235 FISSION:")
print("  • Sensitivity: ~0.8 (EXTREMELY high!)")
print("  • Why: Dominant fission source in LWRs")
print("  • Impact: 1% cross-section error → ~800 pcm k_eff error")
print("  • Validation: MUST use OpenMC reactor simulations")
print()
print("Cl-35 (n,p):")
print("  • Sensitivity: ~0.0001 (negligible for reactors)")
print("  • Why: Not a major reactor material")
print("  • Impact: Even 50% error → ~5 pcm k_eff error")
print("  • Validation: Use experimental benchmarks, not reactor sims")
print("="*80)
print("\n🎯 This is why we need DIFFERENT validation strategies for different isotopes!")
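
The U-235 impact figure follows from a first-order relation, Δk/k ≈ S · Δσ/σ, where S is the relative sensitivity coefficient. The sketch below assumes that linear relation holds. The helper names `keff_error_pcm` and `max_relative_xs_error` are hypothetical and not part of `nucml_next`.

```python
def keff_error_pcm(sensitivity, relative_xs_error):
    """First-order estimate of dk/k in pcm (1 pcm = 1e-5)."""
    return sensitivity * relative_xs_error * 1e5

def max_relative_xs_error(sensitivity, budget_pcm):
    """Largest relative cross-section error that stays within a k_eff budget."""
    return (budget_pcm / 1e5) / sensitivity

# Forward direction: a 1% error on a reaction with S = 0.8
u235 = keff_error_pcm(sensitivity=0.8, relative_xs_error=0.01)
print(f"U-235 fission, 1% error: {u235:.0f} pcm")  # 800 pcm

# Inverse direction: tolerance for a 100 pcm budget
tol_high = max_relative_xs_error(sensitivity=0.8, budget_pcm=100)
tol_low = max_relative_xs_error(sensitivity=0.0001, budget_pcm=100)
print(f"S = 0.8    tolerates {tol_high:.3%}")
print(f"S = 0.0001 tolerates {tol_low:.0%}")
```

The inverse form makes the asymmetry explicit: a high-sensitivity reaction must be known to a fraction of a percent, while a reaction with negligible sensitivity barely constrains k_eff at all.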

### Step 2: Validate with OpenMC

In [None]:
# Initialize validator
validator = OpenMCValidator(reactor_type='PWR')

# Run a k-eigenvalue calculation using model-predicted cross sections.
# NOTE: In production, these would come from your trained GNN-Transformer model
# trained on real EXFOR data. Here we demonstrate the validation workflow.
cross_section_data = {
    (92, 235, 18): np.array([500.0]),  # U-235 fission (example prediction)
    (92, 235, 102): np.array([100.0]), # U-235 capture (example prediction)
}

results = validator.run_k_eigenvalue(cross_section_data)

print("\nReactor Prediction:")
print(f"  k_eff = {results['k_eff_mean']:.5f} ± {results['k_eff_std']:.5f}")
print(f"  Reference = {results['k_eff_nominal']:.5f}")

# Compute reactivity error
error_pcm = validator.compute_reactivity_error(results['k_eff_mean'])
print(f"  Error = {error_pcm:.1f} pcm")

if abs(error_pcm) < 100:
    print("\n✓ EXCELLENT reactor prediction!")
elif abs(error_pcm) < 300:
    print("\n✓ ACCEPTABLE reactor prediction")
else:
    print("\n❌ POOR reactor prediction - needs improvement")
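
The internals of `compute_reactivity_error` are not shown in this notebook. As a point of reference, one common convention reports the difference in reactivity ρ = (k − 1)/k between prediction and reference, in pcm. The sketch below assumes that convention and reuses the 100/300 pcm thresholds from the cell above; both function names are hypothetical.

```python
def reactivity_error_pcm(k_pred, k_ref):
    """Reactivity difference rho_pred - rho_ref in pcm, with rho = (k - 1) / k."""
    rho_pred = (k_pred - 1.0) / k_pred
    rho_ref = (k_ref - 1.0) / k_ref
    return (rho_pred - rho_ref) * 1e5

def classify(error_pcm):
    """Thresholds from this notebook: <100 excellent, <300 acceptable, else poor."""
    magnitude = abs(error_pcm)
    if magnitude < 100:
        return "excellent"
    if magnitude < 300:
        return "acceptable"
    return "poor"

err = reactivity_error_pcm(k_pred=1.00200, k_ref=1.00000)
print(f"{err:.1f} pcm -> {classify(err)}")  # 199.6 pcm -> acceptable
```

Using the absolute value in `classify` matters because an under-prediction of k_eff is as much a modeling error as an over-prediction.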

### Step 3: Benchmark Validation

In [None]:
# Compare against standard reactor benchmarks
ReactorBenchmark.list_available_benchmarks()

# Validate against PWR pin cell
benchmark = ReactorBenchmark('PWR_PIN')
benchmark.print_benchmark_info()

# Validate prediction
validation = benchmark.validate_prediction(results['k_eff_mean'])
benchmark.print_validation_results(validation)
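
Benchmark comparisons are often summarized by the calculated-to-expected ratio (C/E) and by the deviation measured in combined standard deviations. The sketch below shows that arithmetic with a hypothetical `Benchmark` record and placeholder numbers. It does not reproduce the `ReactorBenchmark` API or any published benchmark value.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Benchmark:
    name: str
    k_eff: float      # benchmark (expected) value
    k_eff_unc: float  # 1-sigma benchmark uncertainty

def compare(benchmark, k_calc, k_calc_unc):
    """Return C/E, the deviation in pcm, and the deviation in combined sigmas."""
    c_over_e = k_calc / benchmark.k_eff
    delta_pcm = (k_calc - benchmark.k_eff) * 1e5
    combined_unc = math.hypot(benchmark.k_eff_unc, k_calc_unc)
    n_sigma = abs(k_calc - benchmark.k_eff) / combined_unc
    return c_over_e, delta_pcm, n_sigma

# Placeholder numbers chosen for illustration only
pin_cell = Benchmark("hypothetical pin cell", k_eff=1.00000, k_eff_unc=0.00100)
c_over_e, delta_pcm, n_sigma = compare(pin_cell, k_calc=1.00150, k_calc_unc=0.00050)

print(f"C/E = {c_over_e:.5f}, delta = {delta_pcm:.0f} pcm, {n_sigma:.2f} sigma")
```

Reporting the deviation in sigmas alongside pcm helps separate a real bias from a difference that the Monte Carlo and benchmark uncertainties already explain.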

### Step 4: Retrain with Sensitivity Weighting

In [None]:
from nucml_next.physics import SensitivityWeightedLoss

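# Get sensitivity weights for the reactions defined above
weights = analyzer.get_sensitivity_weights(cross_section_data)

# Create sensitivity-weighted loss
# NOTE: `weights` is not passed to the loss here; supply it wherever
# SensitivityWeightedLoss expects it before training.
loss_fn = SensitivityWeightedLoss()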

print("\n🎯 Training Strategy:")
print("   1. Train with standard MSE → Get baseline")
print("   2. Retrain with sensitivity weighting → Improve reactor accuracy")
print("   3. Validate with OpenMC → Verify k_eff error < 100 pcm")
print("\n   This is how we solve the Validation Paradox!")
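
Steps 1 and 2 of this strategy can be illustrated on a toy problem without a neural network. The sketch below fits a straight line to a curve it cannot match exactly, first with uniform weights and then with weights concentrated on a "high-sensitivity" region. The weight profile is a made-up assumption, and weighted least squares stands in for the real training loop. Step 3 (OpenMC validation) is not reproduced here.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 50)
y = x ** 2                                  # toy target a linear model cannot fit exactly
X = np.column_stack([np.ones_like(x), x])   # design matrix for y ≈ a + b*x

# Hypothetical sensitivity profile: the region x > 0.8 matters most
important = x > 0.8
w = np.where(important, 1.0, 0.01)

def fit(X, y, w):
    """Weighted least squares, implemented by scaling rows with sqrt(w)."""
    s = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * s[:, None], y * s, rcond=None)
    return coef

def mse(coef, mask):
    """Unweighted MSE restricted to the samples selected by mask."""
    return float(np.mean((X[mask] @ coef - y[mask]) ** 2))

baseline = fit(X, y, np.ones_like(x))  # step 1: standard MSE
weighted = fit(X, y, w)                # step 2: sensitivity-weighted

everywhere = np.ones_like(x, dtype=bool)
print("important region:", mse(baseline, important), "->", mse(weighted, important))
print("all samples:     ", mse(baseline, everywhere), "->", mse(weighted, everywhere))
```

The weighted fit gives up some overall MSE in exchange for lower error where it matters, which is the same trade that separates Model B from Model A.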

### 🎓 Final Takeaway

> **The Validation Paradox is SOLVED!**
>
> By combining real EXFOR data with context-aware validation:
> 1. We train on experimental measurements (not simulations)
> 2. We apply **different validation strategies** based on application:
>    - **U-235 (reactor fuel)**: Sensitivity weighting + OpenMC validation
>    - **Cl-35 (research)**: Uncertainty quantification + experimental benchmarks
> 3. We achieve application-appropriate ML predictions
>
> **Low MSE ≠ Safe Reactor**  
> **Context-Aware Validation + Real Data = Trustworthy Predictions** ✓

---

## Summary: The NUCML-Next Journey

| Notebook | Topic | Key Lesson |
|----------|-------|------------|
| 00 | Baselines | Classical ML fails on both data-rich (U-235) and data-sparse (Cl-35) |
| 01 | Data Fabric | Graph structure enables transfer learning between isotopes |
| 02 | GNN-Transformer | Smooth predictions for both well-studied and under-studied isotopes |
| 03 | Validation | Different isotopes need different validation strategies |

### What You've Learned

✓ Why classical ML fails for physics problems  
✓ How to build graph representations from real nuclear data  
✓ How to train physics-informed models on EXFOR measurements  
✓ How GNNs enable transfer learning (U-235 → Cl-35)  
✓ How to validate with reactor simulations (U-235)  
✓ How to quantify uncertainty for research isotopes (Cl-35)  
✓ How to solve the Validation Paradox with context-aware approaches  

---

## Application Domains

### **U-235 (Well-Understood, Reactor-Critical)**
- ✅ Extensive EXFOR data enables high-accuracy models
- ✅ OpenMC validation ensures reactor safety
- ✅ Sensitivity weighting prioritizes critical energy ranges
- 🎯 **Goal**: Sub-100 pcm k_eff prediction accuracy

### **Cl-35 (Research Interest, Data-Sparse)**
- ✅ GNN transfer learning compensates for sparse data
- ✅ Uncertainty quantification guides new experiments
- ✅ Smooth physics-informed predictions avoid overfitting
- 🎯 **Goal**: Reliable cross sections with quantified uncertainties

---

**Congratulations! You've completed NUCML-Next!** 🎉

*You're now ready to apply physics-informed deep learning to production nuclear data evaluation using real IAEA EXFOR experimental measurements, with appropriate validation strategies for different isotopes and applications.*