# Notebook 00: Baselines and Limitations

## Understanding Why Classical ML Fails for Nuclear Data

**Learning Objective:** Understand *why* classical machine learning fails for nuclear data evaluation using real experimental data.

### The Problem

Nuclear cross sections σ(E) are smooth, continuous functions of energy. They exhibit:
- **Resonance peaks**: Sharp but smooth features
- **Threshold behavior**: σ(E) = 0 for E < E_threshold, then rises smoothly
- **Physical constraints**: Conservation laws, unitarity, causality

### Why This Matters

A reactor calculation uses millions of cross-section evaluations. If predictions are:
- **Jagged** → Unphysical neutron transport
- **Discontinuous** → Numerical instabilities
- **Wrong at key energies** → Incorrect k_eff (criticality)

This is the **Validation Paradox**: Low MSE ≠ Safe Reactor!
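
A minimal sketch of the paradox, using a toy cross section rather than EXFOR data: a flat background plus one narrow Breit-Wigner resonance. Every number here (peak height, width, grid spacing) is an illustrative assumption, and the aggregate metric is MSE on log10 values, which may differ from what the baselines in this notebook report. A hypothetical model that predicts only the background scores a small average error while being almost entirely wrong at the resonance.

```python
import math

def toy_cross_section(e, e_r=10.0, gamma=0.05, sigma_peak=1000.0, background=10.0):
    """Flat background plus one narrow resonance (all values illustrative)."""
    half = gamma / 2.0
    return background + sigma_peak * half**2 / ((e - e_r) ** 2 + half**2)

# Uniform grid from 1 to 100 eV in 0.01 eV steps
energies = [1.0 + 0.01 * i for i in range(9901)]
truth = [toy_cross_section(e) for e in energies]

# Hypothetical model: gets the background right, misses the resonance
prediction = [10.0] * len(energies)

# Aggregate metric: MSE in log10 space
log_mse = sum(
    (math.log10(p) - math.log10(t)) ** 2 for p, t in zip(prediction, truth)
) / len(energies)

# Worst-case relative error, which occurs at the resonance peak
peak_rel_error = max(abs(p - t) / t for p, t in zip(prediction, truth))

print(f"log10 MSE:           {log_mse:.4f}")
print(f"Peak relative error: {peak_rel_error:.1%}")
```

The average is dominated by the thousands of grid points where nothing happens, so the one energy that matters most barely moves it.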

---

## Part 1: The Naive Approach

Let's examine why tree-based models struggle with real nuclear cross-section data from IAEA EXFOR.

In [None]:
import sys
sys.path.append('..')

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path

from nucml_next.data import NucmlDataset
from nucml_next.baselines import XGBoostEvaluator, DecisionTreeEvaluator

# Set plotting style
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")
%matplotlib inline

# Verify EXFOR data exists
exfor_path = Path('../data/exfor_processed.parquet')
if not exfor_path.exists():
    raise FileNotFoundError(
        f"EXFOR data not found at {exfor_path}\n"
        "Please run: python scripts/ingest_exfor.py --exfor-root <path> --output data/exfor_processed.parquet"
    )

print("✓ Imports successful")
print("✓ EXFOR data found")
print("Welcome to NUCML-Next: Understanding ML Limitations with Real Nuclear Data")

### Step 1.1: Load Real EXFOR Data (Tabular View)

We'll use the **tabular projection** of real IAEA EXFOR nuclear cross-section data - this is what classical ML expects.

In [None]:
# Load real EXFOR data in tabular mode
dataset = NucmlDataset(
    data_path='../data/exfor_processed.parquet',
    mode='tabular'
)

# Project to tabular format with NAIVE features
# This shows how classical ML sees the data: [Z, A, E, MT_onehot]
df_naive = dataset.to_tabular(mode='naive')

print(f"Dataset shape: {df_naive.shape}")
print(f"\nFeatures (Naive Mode):")
print(df_naive.columns.tolist())
print(f"\nFirst few rows:")
df_naive.head()

**Notice:** The naive approach treats reactions as independent categories (MT_2, MT_18, etc.).

**Problem:** This ignores physics! (n,2n) and (n,3n) are related - they differ by one neutron.

But tree-based models don't know this. To them, MT=16 and MT=17 are just labels.
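
To see what "just labels" means, the sketch below one-hot encodes three MT codes and compares distances between them. The three-reaction vocabulary and the "neutrons emitted" feature are illustrative assumptions, not the columns produced by `to_tabular`. Under one-hot encoding every pair of reactions is equally far apart, whereas a physics-derived feature places (n,2n) next to (n,3n).

```python
import math

# ENDF MT numbers: 16 = (n,2n), 17 = (n,3n), 102 = (n,gamma)
MT_CODES = [16, 17, 102]

# Illustrative physics feature: number of neutrons in the exit channel
NEUTRONS_OUT = {16: 2, 17: 3, 102: 0}

def one_hot(mt):
    """One-hot vector over the small MT vocabulary above."""
    return [1.0 if mt == code else 0.0 for code in MT_CODES]

def distance(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# One-hot view: every pair of distinct reactions is the same distance apart
d_onehot_2n_3n = distance(one_hot(16), one_hot(17))
d_onehot_2n_capture = distance(one_hot(16), one_hot(102))

# Physics view: (n,2n) is closer to (n,3n) than to capture
d_phys_2n_3n = abs(NEUTRONS_OUT[16] - NEUTRONS_OUT[17])
d_phys_2n_capture = abs(NEUTRONS_OUT[16] - NEUTRONS_OUT[102])

print(d_onehot_2n_3n, d_onehot_2n_capture)
print(d_phys_2n_3n, d_phys_2n_capture)
```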

### Step 1.2: Train Decision Tree (The "Villain")

We'll intentionally configure the tree to show the **staircase effect**.

In [None]:
# Initialize Decision Tree with limited depth (exaggerates stairs)
dt_model = DecisionTreeEvaluator(
    max_depth=6,          # Shallow tree = coarse stairs
    min_samples_leaf=20,  # Large leaves = big steps
)

# Train on naive features
dt_metrics = dt_model.train(df_naive)

print("\n" + "="*60)
print("Decision Tree Performance:")
print("="*60)
for key, value in dt_metrics.items():
    print(f"  {key:20s}: {value}")

### Step 1.3: The Failure Mode - Visualize the Staircase Effect

Let's predict cross sections in a resonance region and see what happens...

In [None]:
# Predict for U-235 capture reaction in resonance region
Z, A = 92, 235
mt_code = 102  # (n,γ) capture
energy_range = (1.0, 100.0)  # 1-100 eV (resonance region)

# Get ground truth
mask = (dataset.df['Z'] == Z) & (dataset.df['A'] == A) & (dataset.df['MT'] == mt_code)
df_truth = dataset.df[mask].copy()
df_truth = df_truth[(df_truth['Energy'] >= energy_range[0]) &
                    (df_truth['Energy'] <= energy_range[1])]
df_truth = df_truth.sort_values('Energy')  # sort so the line plot does not zigzag

# Get Decision Tree predictions (dense sampling to see steps)
energies_dt, predictions_dt = dt_model.predict_resonance_region(
    Z, A, mt_code, energy_range, num_points=1000, mode='naive'
)

# Plot the catastrophe
fig, ax = plt.subplots(figsize=(12, 6))

# Ground truth (smooth curve)
ax.plot(df_truth['Energy'], df_truth['CrossSection'], 
        'b-', linewidth=2, label='Ground Truth (Physics)', alpha=0.7)

# Decision Tree predictions (jagged stairs)
ax.plot(energies_dt, predictions_dt, 
        'r-', linewidth=1.5, label='Decision Tree', alpha=0.8)

ax.set_xlabel('Energy (eV)', fontsize=12, fontweight='bold')
ax.set_ylabel('Cross Section (barns)', fontsize=12, fontweight='bold')
ax.set_title('The Staircase Effect: Why Decision Trees Fail\nU-235 (n,γ) Resonance Region',
             fontsize=14, fontweight='bold')
ax.legend(fontsize=11)
ax.set_yscale('log')
ax.grid(True, alpha=0.3)

# Annotate the problem
ax.annotate('Unphysical discontinuities!\n(Real cross sections are smooth)',
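            # Anchor the arrow at an actual grid point so x and y correspond
            xy=(energies_dt[300], predictions_dt[300]),
            xytext=(energies_dt[300] * 1.5, predictions_dt[300] * 5),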
            arrowprops=dict(arrowstyle='->', color='red', lw=2),
            fontsize=10, color='red', fontweight='bold',
            bbox=dict(boxstyle='round', facecolor='yellow', alpha=0.7))

plt.tight_layout()
plt.show()

print("\n⚠️  OBSERVATION: Decision Tree creates JAGGED predictions!")
print("    Real nuclear cross sections are SMOOTH.")
print("    These stairs would cause numerical instabilities in reactor codes.")

### 🔴 Critical Insight #1: Piecewise Constant ≠ Physics

Decision trees partition feature space into rectangles:
```
if Energy < 10.5:
    if Energy < 5.2:
        return 150.0  # Constant!
    else:
        return 89.0   # Jump!
else:
    return 45.0
```

Real physics:
```
σ(E) = σ_0 * (Γ²/4) / ((E - E_r)² + Γ²/4)  # Smooth Breit-Wigner; σ_0 is the peak value at E = E_r
```
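
The contrast can be checked numerically. The sketch below is a toy, not the notebook's `DecisionTreeEvaluator`: it evaluates a single Breit-Wigner peak on a dense grid, then builds a piecewise-constant approximation from four hand-picked split points, with each leaf predicting the mean of the values that fall in it. The resonance parameters and split points are assumptions chosen for illustration.

```python
import bisect

def breit_wigner(e, e_r=10.0, gamma=1.0, sigma_peak=150.0):
    """Single-level Breit-Wigner shape with peak value sigma_peak at e_r."""
    half = gamma / 2.0
    return sigma_peak * half**2 / ((e - e_r) ** 2 + half**2)

# Dense grid from 5 to 15 eV
energies = [5.0 + 0.001 * i for i in range(10001)]
smooth = [breit_wigner(e) for e in energies]

# Hand-picked split points standing in for a tree's learned thresholds
edges = [7.0, 9.0, 11.0, 13.0]

# Assign each energy to a leaf; each leaf predicts the mean of its targets
bins = [bisect.bisect_right(edges, e) for e in energies]
leaf_sum = [0.0] * (len(edges) + 1)
leaf_count = [0] * (len(edges) + 1)
for b, s in zip(bins, smooth):
    leaf_sum[b] += s
    leaf_count[b] += 1
leaf_mean = [s / n for s, n in zip(leaf_sum, leaf_count)]
stairs = [leaf_mean[b] for b in bins]

def max_jump(values):
    """Largest change between adjacent grid points."""
    return max(abs(b - a) for a, b in zip(values, values[1:]))

# Number of places where the staircase changes value
n_steps = sum(1 for a, b in zip(stairs, stairs[1:]) if a != b)

print(f"Steps in staircase:      {n_steps}")
print(f"Max jump (Breit-Wigner): {max_jump(smooth):.3f} b")
print(f"Max jump (staircase):    {max_jump(stairs):.3f} b")
```

Refining the grid shrinks the largest jump of the smooth curve toward zero, while the staircase jump stays the same size no matter how finely it is sampled.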

---

## Part 2: Can XGBoost Save Us?

Let's try a more sophisticated ensemble method.

In [None]:
# Initialize XGBoost
xgb_naive = XGBoostEvaluator(
    n_estimators=100,
    max_depth=6,
    learning_rate=0.1,
)

# Train on naive features
xgb_metrics_naive = xgb_naive.train(df_naive)

print("\n" + "="*60)
print("XGBoost Performance (Naive Features):")
print("="*60)
for key, value in xgb_metrics_naive.items():
    if value is not None:
        print(f"  {key:20s}: {value}")

In [None]:
# Get XGBoost predictions
energies_xgb, predictions_xgb = xgb_naive.predict_resonance_region(
    Z, A, mt_code, energy_range, num_points=1000, mode='naive'
)

# Comparative plot
fig, ax = plt.subplots(figsize=(12, 6))

# Ground truth
ax.plot(df_truth['Energy'], df_truth['CrossSection'], 
        'b-', linewidth=3, label='Ground Truth', alpha=0.7, zorder=1)

# Decision Tree (stairs)
ax.plot(energies_dt, predictions_dt, 
        'r--', linewidth=1.5, label='Decision Tree (Staircase)', alpha=0.6, zorder=2)

# XGBoost (smoother but not smooth)
ax.plot(energies_xgb, predictions_xgb, 
        'g-', linewidth=2, label='XGBoost (Better, but...)', alpha=0.8, zorder=3)

ax.set_xlabel('Energy (eV)', fontsize=12, fontweight='bold')
ax.set_ylabel('Cross Section (barns)', fontsize=12, fontweight='bold')
ax.set_title('XGBoost vs Decision Tree: Improvement but Still Not Physics-Compliant',
             fontsize=14, fontweight='bold')
ax.legend(fontsize=11)
ax.set_yscale('log')
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("\n✓ XGBoost is SMOOTHER (sums many small tree corrections)")
print("✗ But still has micro-steps and can't guarantee smoothness")
print("✗ No awareness of resonance physics")

### 🟡 Critical Insight #2: Ensembles Help, But...

XGBoost sums the outputs of many small trees, each fit to the residual error of the ones before it, so its steps are much finer than a single tree's.

**BUT:**
- Still piecewise constant at fine scale
- No guarantee of smoothness
- Can't learn resonance physics (Breit-Wigner shape)
- Poor extrapolation beyond training data
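
Two of these points follow directly from how the prediction is assembled. The sketch below uses a hand-built ensemble of depth-1 trees (stumps) with made-up thresholds and leaf values; it is not XGBoost itself, only a stand-in for the additive structure. A sum of step functions is still a step function, and beyond the last threshold the output never changes.

```python
def stump(e, threshold, left, right):
    """Depth-1 tree on energy: one split, two constant leaves."""
    return left if e < threshold else right

# Hypothetical ensemble: nine stumps with thresholds at 10, 20, ..., 90 eV
stumps = [(float(t), 2.0, -2.0) for t in range(10, 100, 10)]

def ensemble(e, base=50.0):
    """Additive prediction: base score plus the sum of all stump outputs."""
    return base + sum(stump(e, t, left, right) for t, left, right in stumps)

# Dense grid over the "training" range, 1 to 100 eV
grid = [1.0 + 99.0 * i / 999 for i in range(1000)]
values = [ensemble(e) for e in grid]

# 1000 grid points, but only a handful of distinct predicted values
n_levels = len(set(values))
print(f"Distinct predictions on 1000 points: {n_levels}")

# Extrapolation: the prediction is frozen past the last split
print(ensemble(100.0), ensemble(1.0e6))
```

Adding more trees increases the number of levels but never removes the steps, and no number of trees makes the model respond to energies outside the range of its splits.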

---

## Part 3: The Upgrade - Physics-Aware Features

What if we give XGBoost *better features*?

Instead of naive [Z, A, E, MT_onehot], use physics-derived features from the graph:
- **Q-value**: Reaction energy
- **Threshold**: E_threshold
- **ΔZ, ΔA**: Nuclear topology

This is the bridge to deep learning!
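
As a rough sketch of how such features could be derived for one row, the code below maps a few MT numbers to the change in proton and mass number of the residual nucleus and computes a non-relativistic threshold from the Q-value, approximating nuclear masses by mass numbers. The function names, the returned keys, and the Q-value in the example are illustrative assumptions; the actual columns come from `dataset.to_tabular(mode='physics')` and may be defined differently.

```python
# (delta_Z, delta_A) of the residual nucleus relative to the target,
# for a few neutron-induced reactions identified by ENDF MT number
RESIDUAL_SHIFT = {
    16: (0, -1),    # (n,2n)
    17: (0, -2),    # (n,3n)
    102: (0, 1),    # (n,gamma)
    103: (-1, 0),   # (n,p)
    107: (-2, -3),  # (n,alpha)
}

def threshold_energy(q_value, a_target):
    """Lab-frame threshold for a neutron-induced reaction.

    Non-relativistic, with masses approximated by mass numbers:
    E_th = -Q * (A + 1) / A for Q < 0, and 0 for exothermic reactions.
    """
    if q_value >= 0:
        return 0.0
    return -q_value * (a_target + 1) / a_target

def physics_features(z, a, mt, q_value):
    """Build a small physics-aware feature dict (hypothetical layout)."""
    delta_z, delta_a = RESIDUAL_SHIFT[mt]
    return {
        "Z": z,
        "A": a,
        "Q_value": q_value,
        "threshold": threshold_energy(q_value, a),
        "delta_Z": delta_z,
        "delta_A": delta_a,
    }

# Illustrative Q-value in MeV, not an evaluated number
features = physics_features(92, 238, mt=16, q_value=-6.0)
print(features)
```

Unlike one-hot MT columns, these features put (n,2n) and (n,3n) on a shared numeric axis, so a split on `delta_A` or `threshold` can group related reactions.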

In [None]:
# Get physics-aware tabular projection
df_physics = dataset.to_tabular(mode='physics')

print("Physics-Aware Features:")
print(df_physics.columns.tolist())
print(f"\nFirst few rows:")
df_physics.head()

In [None]:
# Train XGBoost with physics features
xgb_physics = XGBoostEvaluator(
    n_estimators=100,
    max_depth=6,
    learning_rate=0.1,
)

xgb_metrics_physics = xgb_physics.train(df_physics)

print("\n" + "="*60)
print("XGBoost Performance (Physics Features):")
print("="*60)
for key, value in xgb_metrics_physics.items():
    if value is not None:
        print(f"  {key:20s}: {value}")

print("\nComparison with Naive Features:")
print(f"  Test MSE (Naive):   {xgb_metrics_naive['test_mse']:.4e}")
print(f"  Test MSE (Physics): {xgb_metrics_physics['test_mse']:.4e}")
improvement = (xgb_metrics_naive['test_mse'] - xgb_metrics_physics['test_mse']) / xgb_metrics_naive['test_mse'] * 100
print(f"  Improvement: {improvement:.1f}%")

In [None]:
# Get physics-mode predictions
energies_xgb_phys, predictions_xgb_phys = xgb_physics.predict_resonance_region(
    Z, A, mt_code, energy_range, num_points=1000, mode='physics'
)

# Final comparison
fig, ax = plt.subplots(figsize=(14, 7))

# Ground truth
ax.plot(df_truth['Energy'], df_truth['CrossSection'], 
        'b-', linewidth=3, label='Ground Truth (Physics)', alpha=0.8, zorder=1)

# XGBoost naive
ax.plot(energies_xgb, predictions_xgb, 
        'orange', linewidth=2, linestyle='--', label='XGBoost (Naive Features)', alpha=0.6, zorder=2)

# XGBoost physics
ax.plot(energies_xgb_phys, predictions_xgb_phys, 
        'g-', linewidth=2.5, label='XGBoost (Physics Features)', alpha=0.8, zorder=3)

ax.set_xlabel('Energy (eV)', fontsize=13, fontweight='bold')
ax.set_ylabel('Cross Section (barns)', fontsize=13, fontweight='bold')
ax.set_title('Physics Features Help... But We Can Do Better!\nU-235 (n,γ) Resonance Region',
             fontsize=15, fontweight='bold')
ax.legend(fontsize=12, loc='best')
ax.set_yscale('log')
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("\n✓ Physics features improve accuracy")
print("✓ Model learns about thresholds and reaction energetics")
print("✗ STILL can't guarantee smooth resonance curves")
print("✗ STILL poor extrapolation")
print("✗ No explicit physics constraints (unitarity, conservation laws)")

### 🟢 Critical Insight #3: Features Matter, But Architecture Matters More

Physics-aware features help XGBoost understand reactions better.

**BUT** the fundamental problem remains:
- Tree-based models are **piecewise constant**
- No inductive bias for **smoothness**
- No way to encode **physical constraints**
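
Because a tree cannot build these properties in, the most one can do is check for violations after the fact. The helper below is a hypothetical sketch of such a check, not part of `nucml_next`: it flags negative values, nonzero values below a threshold energy, and jumps between adjacent grid points that exceed a fraction of the largest predicted value. The tolerance of 5% is an arbitrary assumption.

```python
def check_physics(energies, sigma, e_threshold=0.0, max_rel_jump=0.05):
    """Return a list of violated constraints (empty if none are found)."""
    violations = []

    # Cross sections cannot be negative
    if any(s < 0 for s in sigma):
        violations.append("negative cross section")

    # Below the reaction threshold the cross section must vanish
    if any(s != 0 for e, s in zip(energies, sigma) if e < e_threshold):
        violations.append("nonzero below threshold")

    # Adjacent grid points should not differ by a large fraction of the peak
    scale = max(abs(s) for s in sigma)
    if scale > 0:
        largest = max(abs(b - a) for a, b in zip(sigma, sigma[1:]))
        if largest / scale > max_rel_jump:
            violations.append("jump between adjacent grid points")

    return violations

energies = [1.0 + 0.01 * i for i in range(9901)]

# Smooth toy curve with 1/sqrt(E) behavior
smooth = [100.0 / e**0.5 for e in energies]

# Staircase toy curve with three constant levels
stairs = [100.0 if e < 10 else 50.0 if e < 50 else 20.0 for e in energies]

print(check_physics(energies, smooth))
print(check_physics(energies, stairs))
```

A check like this can reject a bad prediction, but it cannot repair one, which is the gap an architecture with smoothness built in is meant to close.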

---

## Part 4: Feature Importance Analysis

Let's see what XGBoost "thinks" is important.

In [None]:
# Get feature importance
importance_physics = xgb_physics.get_feature_importance()

# Plot
fig, ax = plt.subplots(figsize=(10, 6))
ax.barh(importance_physics['Feature'], importance_physics['Importance'])
ax.set_xlabel('Importance (Gain)', fontsize=12, fontweight='bold')
ax.set_title('XGBoost Feature Importance (Physics Mode)', fontsize=14, fontweight='bold')
ax.invert_yaxis()
plt.tight_layout()
plt.show()

print("\nTop 5 Most Important Features:")
print(importance_physics.head())

### 🎓 Key Takeaway

> **Low MSE on test data does NOT guarantee safe reactor predictions!**
>
> We need models that:
> 1. Respect physics (smoothness, thresholds, unitarity)
> 2. Extrapolate correctly (beyond training data)
> 3. Prioritize safety-critical reactions (sensitivity weighting)
>
> This is why we need **Physics-Informed Deep Learning**.
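
One simple way to express point 3 is a weighted error metric. The sketch below is an assumption about what "sensitivity weighting" could look like, with made-up weights; in practice the weights would have to come from a sensitivity analysis of the reactor quantity of interest, which this notebook does not perform.

```python
def weighted_mse(y_true, y_pred, weights):
    """Mean squared error with per-point weights, normalized by total weight."""
    total = sum(weights)
    return sum(
        w * (t - p) ** 2 for t, p, w in zip(y_true, y_pred, weights)
    ) / total

# Toy data: the model is exact everywhere except at one energy point
y_true = [10.0, 10.0, 500.0, 10.0]
y_pred = [10.0, 10.0, 100.0, 10.0]

# Uniform weights reproduce the ordinary MSE
uniform = weighted_mse(y_true, y_pred, [1.0, 1.0, 1.0, 1.0])

# Hypothetical sensitivities: the third point matters ten times more
sensitive = weighted_mse(y_true, y_pred, [1.0, 1.0, 10.0, 1.0])

print(f"Uniform MSE:  {uniform:.1f}")
print(f"Weighted MSE: {sensitive:.1f}")
```

With uniform weights the single bad point is diluted by the three good ones; weighting by sensitivity makes the metric reflect where an error would actually hurt.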

---

## Next Steps

In **Notebook 01**, we'll:
- Build the **Chart of Nuclides as a Graph**
- Visualize nuclear topology with real EXFOR data
- Understand how GNNs can capture isotope relationships

In **Notebook 02**, we'll:
- Implement **GNN + Transformer**
- Train on graph-structured real data
- See **smooth, physics-compliant predictions**!

In **Notebook 03**, we'll:
- Integrate with **OpenMC** for reactor validation
- Solve the **Validation Paradox**
- Achieve reactor-grade accuracy with real nuclear data

Continue to `01_Data_Fabric_and_Graph.ipynb` →