# 14 · Radiation & Noise Modeling (SpectraMind V50)

An educational, **mission‑grade** notebook that illustrates how radiation environments and detector/system noise influence transit spectra and downstream diagnostics. The notebook is **pipeline‑safe**: it only *reads* existing artifacts (calibrated spectra or predictions) and writes its educational diagnostics under `outputs/notebooks/14_radiation_noise_modeling/`.

### Objectives
1. Summarize noise sources relevant to spaceborne spectroscopy (shot noise, read noise, dark current, cosmic rays / radiation hits, background).
2. Load one or more spectra from `outputs/` and inject controllable noise models (Poisson, Gaussian, 1/f) and **cosmic‑ray transients**.
3. Visualize impact on FFT/autocorr structure, per‑bin variance, and symbolic bands.
4. Export a compact diagnostics bundle (JSON + CSV + PNGs) for teaching and QA.

> Contract: **Thin orchestration** over CLI/outputs; no ad‑hoc calibration or model training here. For production calibration and prediction, use the dedicated notebooks and CLI.

In [None]:
import json, platform
from pathlib import Path
from datetime import datetime, timezone
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
try:
    import seaborn as sns
    sns.set_context('notebook'); sns.set_style('whitegrid')
except Exception:
    pass  # seaborn is optional; fall back to matplotlib defaults

ROOT = Path.cwd().resolve()
NB_OUT = ROOT / 'outputs' / 'notebooks' / '14_radiation_noise_modeling'
NB_OUT.mkdir(parents=True, exist_ok=True)

ENV = {
    'python': platform.python_version(),
    'platform': platform.platform(),
    # datetime.utcnow() is deprecated since Python 3.12; use a timezone-aware UTC timestamp
    'time': datetime.now(timezone.utc).isoformat().replace('+00:00', 'Z'),
}
(NB_OUT/'env_snapshot.json').write_text(json.dumps(ENV, indent=2))
print('ROOT:', ROOT) ; print('NB_OUT:', NB_OUT)

## 0) Background: radiation & detector/system noise (recap)
**Noise categories (simplified):**
- **Shot noise (photon counting)**: Poisson, with variance equal to the mean count, $\sigma^2 = N$ for $N$ detected photons. It is fundamental and dominates at high flux.
- **Read noise**: electronics/ADC noise per read; Gaussian with fixed variance per exposure/read.
- **Dark current**: thermally generated electrons; behaves like additional Poisson process with rate depending on temperature.
- **Background (zodiacal/thermal)**: adds counts and variance; often Poisson‑like per pixel/extraction window.
- **1/f (pink) noise**: low‑frequency drift; can imprint long‑scale structure in spectra/time series.
- **Cosmic rays / radiation hits**: transient, often impulsive events (spikes/glitches) that must be detected/flagged.

We’ll illustrate how each affects a clean spectrum (or an existing prediction) by perturbing it with controlled levels and visualizing FFT/autocorr & per‑bin variance.
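
When the sources above are independent, their variances add, so the total noise is the quadrature sum of the individual terms. The sketch below illustrates that budget in electrons. The function name `noise_budget`, its parameters, and the example numbers are assumptions made for illustration, not instrument values.

```python
import numpy as np

def noise_budget(signal_e, dark_e=0.0, background_e=0.0, read_noise_e=0.0, n_reads=1):
    """Total 1-sigma noise (electrons) for independent sources added in quadrature.

    Poisson terms (signal, dark, background) contribute a variance equal to
    their expected counts; read noise contributes read_noise_e**2 per read.
    """
    variance = signal_e + dark_e + background_e + n_reads * read_noise_e**2
    return np.sqrt(variance)

# Hypothetical numbers chosen only to contrast a faint and a bright regime.
for signal in (1e2, 1e6):
    sigma = noise_budget(signal, dark_e=50.0, background_e=20.0, read_noise_e=15.0)
    print(f"signal={signal:.0e} e-  sigma={sigma:.1f} e-  SNR={signal / sigma:.1f}")
```

In the faint case the read-noise term dominates the variance, while in the bright case the total is close to the pure shot-noise limit $\sqrt{N}$.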

## 1) Load a base spectrum from outputs/
We search `outputs/` for a predictions file (CSV or NPY), take the most recently modified match, and reduce it to a single `(wavelength_index, mu)` spectrum for demonstration.

If none are found, we synthesize a plausible spectrum with two absorption features for educational use.
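
The CSV reduction handles two layouts: a long table with `planet_id`, `wavelength_index`, and `mu` columns, and a wide table with one `mu_<i>` column per bin. The sketch below shows both reductions on toy frames. All values and planet IDs are made up, and the frames only assume the column names used by the loader in the next cell.

```python
import numpy as np
import pandas as pd

# Long layout: one row per (planet, wavelength bin); rows may be unsorted.
long_df = pd.DataFrame({
    'planet_id': ['p0', 'p0', 'p0', 'p1', 'p1', 'p1'],
    'wavelength_index': [2, 0, 1, 0, 1, 2],
    'mu': [0.012, 0.010, 0.011, 0.020, 0.021, 0.022],
})
first = long_df['planet_id'].iloc[0]          # keep only the first planet
spec_long = (long_df[long_df['planet_id'] == first]
             .sort_values('wavelength_index')[['wavelength_index', 'mu']]
             .reset_index(drop=True))

# Wide layout: one row per planet, one mu_<i> column per wavelength bin.
wide_df = pd.DataFrame({
    'planet_id': ['p0', 'p1'],
    'mu_0': [0.010, 0.020], 'mu_1': [0.011, 0.021], 'mu_2': [0.012, 0.022],
})
mu_cols = [c for c in wide_df.columns if c.startswith('mu_')]
mu_wide = wide_df[mu_cols].iloc[0].to_numpy(float)   # first row = first planet
spec_wide = pd.DataFrame({'wavelength_index': np.arange(len(mu_wide)), 'mu': mu_wide})

print(spec_long)
print(spec_wide)
```

Both paths end in the same two-column frame, which is why the rest of the notebook can ignore where the spectrum came from.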

In [None]:
def find_candidates():
    roots = [ROOT/'outputs']
    pats = ['**/predictions.csv','**/mu.csv','**/spectra.npy','**/mu.npy']
    cands = []
    for r in roots:
        if not r.exists():
            continue
        for pat in pats:
            cands += list(r.glob(pat))
    return sorted(set(cands), key=lambda p: p.stat().st_mtime) if cands else []

def load_one_spectrum(path: Path) -> pd.DataFrame:
    if path.suffix=='.npy':
        arr = np.load(path)
        if arr.ndim==1: arr = arr[None,:]
        mu = arr[0]
        return pd.DataFrame({'wavelength_index': np.arange(len(mu)), 'mu': mu})
    df = pd.read_csv(path)
    cols = {str(c).lower(): c for c in df.columns}
    if {'planet_id','wavelength_index','mu'}.issubset(cols):
        one = df[df[cols['planet_id']]==df[cols['planet_id']].iloc[0]].copy()
        one = one.rename(columns={cols['wavelength_index']:'wavelength_index', cols['mu']:'mu'})
        return one[['wavelength_index','mu']].sort_values('wavelength_index').reset_index(drop=True)
    mu_cols = [c for c in df.columns if str(c).startswith('mu_')]
    if mu_cols:
        mu = df[mu_cols].iloc[0].to_numpy(float)
        return pd.DataFrame({'wavelength_index': np.arange(len(mu)), 'mu': mu})
    raise ValueError(f'Unsupported schema for {path}')

CANDS = find_candidates()
if not CANDS:
    # synthesize a smooth demo spectrum
    x = np.linspace(0, 1, 283)
    mu = 0.01 + 0.002*np.exp(-0.5*((x-0.35)/0.08)**2) + 0.0015*np.exp(-0.5*((x-0.75)/0.05)**2)
    base = pd.DataFrame({'wavelength_index': np.arange(283), 'mu': mu})
    base_source = 'synthetic'
    print('No outputs found; using synthetic spectrum.')
else:
    base = load_one_spectrum(CANDS[-1])
    base_source = CANDS[-1].relative_to(ROOT).as_posix()
    print('Loaded from:', base_source)

base.head()

In [None]:
plt.figure(figsize=(10,3))
plt.plot(base['wavelength_index'], base['mu'], lw=1.5)
plt.title('Base spectrum (μ)')
plt.xlabel('wavelength index'); plt.ylabel('μ (arb)')
plt.tight_layout(); plt.savefig(NB_OUT/'base_spectrum.png', dpi=150); plt.close()
print('Saved base_spectrum.png')

## 2) Noise models
We implement simple, composable perturbations:
- **Poisson shot noise**: `y ~ Poisson(λ=S·μ) / S`, where the scale `S` maps μ into the counts domain and back.
- **Gaussian read noise**: `N(0, σ_read)`.
- **1/f noise**: white Gaussian noise reshaped in the frequency domain and rescaled to a target RMS `amp`.
- **Cosmic ray hits**: sparse spikes at random indices; optionally spread with small kernels.

All random draws are controlled by a fixed seed for reproducibility. Adjust parameters as needed for teaching.
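
For the shot-noise model, dividing a Poisson draw by `S` gives a variance of `μ/S`, so the per-bin standard deviation is `sqrt(μ/S)` and shrinks as the assumed count level grows. The sketch below checks this empirically. It uses its own generator and a single depth `mu0`; both are illustrative assumptions, separate from the notebook's `rng`.

```python
import numpy as np

rng = np.random.default_rng(0)   # separate generator; seed chosen arbitrarily
mu0 = 0.01                       # a depth comparable to the demo spectrum

results = {}
for S in (3e4, 3e5, 3e6):
    draws = rng.poisson(S * mu0, size=200_000) / S
    predicted = np.sqrt(mu0 / S)          # Var[Poisson(S*mu)/S] = mu/S
    results[S] = (draws.std(), predicted)
    print(f"S={S:.0e}  empirical std={draws.std():.3e}  predicted={predicted:.3e}")
```

A tenfold increase in `S` lowers the noise by a factor of about `sqrt(10)`, which is a useful rule of thumb when picking `scale_counts` for a lesson.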

In [None]:
rng = np.random.default_rng(1234)

def add_shot_noise(mu, scale_counts=5e5):
    lam = np.clip(scale_counts*np.maximum(mu, 0), 0, None)
    y_counts = rng.poisson(lam)
    return y_counts/scale_counts

def add_read_noise(mu, sigma_read=2e-4):
    return mu + rng.normal(0.0, sigma_read, size=mu.shape)

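def add_1f_noise(mu, alpha=1.0, amp=2e-4):
    # Shape white noise so that power falls as 1/f**alpha, then rescale to RMS `amp`.
    n = len(mu)
    white = rng.normal(0,1,n)
    f = np.fft.rfftfreq(n)
    spec = np.fft.rfft(white)
    shaping = np.zeros_like(f)
    # Amplitude ~ f**(-alpha/2) gives power ~ 1/f**alpha. The f=0 bin stays zero;
    # boosting it (as a clipped 1/f would) adds a large constant offset to the spectrum.
    shaping[1:] = f[1:]**(-alpha/2.0)
    shaped = spec*shaping
    x = np.fft.irfft(shaped, n)
    x = amp*x/np.std(x)
    return mu + x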

def add_cosmic_rays(mu, n_hits=3, spike_amp=0.01, kernel=(1.0, 0.5)):  # tuple avoids a mutable default
    y = mu.copy()
    W = len(mu)
    if n_hits <= 0:
        return y, np.array([], dtype=int)
    hits = rng.choice(W, size=min(n_hits,W), replace=False)
    for h in hits:
        for k,a in enumerate(kernel):
            idx = h+k
            if idx < W:
                y[idx] += a*spike_amp
    return y, hits

mu = base['mu'].to_numpy(float)
noisy_shot   = add_shot_noise(mu, scale_counts=3e5)
noisy_read   = add_read_noise(mu, sigma_read=2e-4)
noisy_1f     = add_1f_noise(mu, alpha=1.0, amp=2e-4)
noisy_cr, H  = add_cosmic_rays(mu, n_hits=4, spike_amp=0.01)
H

In [None]:
x = base['wavelength_index']
fig, ax = plt.subplots(2,2, figsize=(12,6), sharex=True)
ax = ax.ravel()
ax[0].plot(x, mu, lw=1.2, label='base')
ax[0].plot(x, noisy_shot, lw=0.8, label='shot')
ax[0].set_title('Shot noise') ; ax[0].legend()

ax[1].plot(x, mu, lw=1.2, label='base')
ax[1].plot(x, noisy_read, lw=0.8, label='read')
ax[1].set_title('Read noise') ; ax[1].legend()

ax[2].plot(x, mu, lw=1.2, label='base')
ax[2].plot(x, noisy_1f, lw=0.8, label='1/f')
ax[2].set_title('1/f noise') ; ax[2].legend()

ax[3].plot(x, mu, lw=1.2, label='base')
ax[3].plot(x, noisy_cr, lw=0.8, label='cosmic rays')
if H.size:
    ax[3].scatter(x.iloc[H], noisy_cr[H], s=20, zorder=3)
ax[3].set_title('Radiation hits (spikes)') ; ax[3].legend()

for a in ax: a.set_ylabel('μ (arb)')
ax[2].set_xlabel('wavelength index'); ax[3].set_xlabel('wavelength index')
fig.tight_layout(); fig.savefig(NB_OUT/'noise_panels.png', dpi=150); plt.close(fig)
print('Saved noise_panels.png')

## 3) FFT & autocorrelation impact
We reuse the simple FFT/AC routines used elsewhere to illustrate how each noise class alters spectral frequency content and lag structure.
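
One compact way to summarize the frequency signature of a noise class is the log-log slope of its power spectrum: roughly 0 for white noise and roughly `-α` for noise whose power falls as `1/f^α`. The sketch below estimates that slope from realization-averaged periodograms. The helpers `pink_noise` and `psd_slope` are standalone illustrations written for this example, not routines from the pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

def pink_noise(n, alpha=1.0):
    """Noise with power ~ 1/f**alpha (amplitudes scaled by f**(-alpha/2))."""
    spec = np.fft.rfft(rng.normal(size=n))
    f = np.fft.rfftfreq(n)
    shaping = np.zeros_like(f)
    shaping[1:] = f[1:] ** (-alpha / 2.0)   # leave f=0 at zero: no DC offset
    return np.fft.irfft(spec * shaping, n)

def psd_slope(make_series, n=1024, n_avg=200):
    """Fit log10(power) against log10(freq) on an averaged periodogram."""
    f = np.fft.rfftfreq(n)[1:]              # drop the f=0 bin before taking logs
    power = np.mean(
        [np.abs(np.fft.rfft(make_series(n)))[1:] ** 2 for _ in range(n_avg)],
        axis=0,
    )
    slope, _ = np.polyfit(np.log10(f), np.log10(power), 1)
    return slope

slope_white = psd_slope(lambda n: rng.normal(size=n))
slope_pink = psd_slope(lambda n: pink_noise(n, alpha=1.0))
print(f"white slope ~ {slope_white:.2f}, pink slope ~ {slope_pink:.2f}")
```

Averaging over many realizations matters here, because a single periodogram scatters by about 100% per frequency bin and makes the fitted slope noisy.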

In [None]:
def fft_power_onesided(y):
    y = np.asarray(y, float)
    y = y - np.nanmean(y)
    fy = np.fft.rfft(y)
    power = np.abs(fy)**2
    freqs = np.fft.rfftfreq(len(y))
    return freqs, power

def autocorr_norm(y):
    y = np.asarray(y, float)
    y = y - np.nanmean(y)
    r = np.correlate(y, y, mode='full')
    r = r[r.size//2:]
    if r[0] != 0:
        r = r / r[0]
    lags = np.arange(r.size)
    return lags, r

series = {
    'base': mu,
    'shot': noisy_shot,
    'read': noisy_read,
    '1f':   noisy_1f,
    'cr':   noisy_cr
}

fig, ax = plt.subplots(2,2, figsize=(12,6))
ax = ax.ravel()
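# Panel A holds base/shot/read; panel B holds 1/f and cosmic rays. A fixed map keeps
# all five series on the intended axes (FFT on the top row, autocorr on the bottom).
PANEL = {'base': 0, 'shot': 0, 'read': 0, '1f': 1, 'cr': 1}
for name, y in series.items():
    f,p = fft_power_onesided(y)
    ax[PANEL[name]].semilogy(f[1:], p[1:], lw=1, label=name)
ax[0].set_title('FFT power (A)') ; ax[1].set_title('FFT power (B)')
ax[0].legend(); ax[1].legend()
for a in ax[:2]: a.set_xlabel('freq'); a.set_ylabel('power')

for name, y in series.items():
    l,r = autocorr_norm(y)
    ax[2 + PANEL[name]].plot(l, r, lw=1, label=name)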
ax[2].set_title('Autocorr (A)') ; ax[3].set_title('Autocorr (B)')
ax[2].legend(); ax[3].legend()
for a in ax[2:]: a.set_xlabel('lag'); a.set_ylabel('norm acorr')
fig.tight_layout(); fig.savefig(NB_OUT/'fft_autocorr_noise_compare.png', dpi=150); plt.close(fig)
print('Saved fft_autocorr_noise_compare.png')

## 4) Per‑bin variance & symbolic band overlays
We compute per‑bin variance across a small ensemble of perturbed spectra. Symbolic bands (e.g., water) can optionally be overlaid to show whether noise masks lines of interest; the cell below plots the variance only.
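
To judge whether noise masks a band, one option is to compare the band's mean excess over the continuum with the standard error of that mean. The sketch below does this on a synthetic spectrum shaped like the notebook's fallback. The band index ranges, the continuum window, and the flat per-bin variance are hypothetical choices for illustration and do not correspond to real molecular bands.

```python
import numpy as np

# Synthetic spectrum with two Gaussian features, as in the notebook's fallback.
x = np.linspace(0, 1, 283)
mu = (0.01
      + 0.002 * np.exp(-0.5 * ((x - 0.35) / 0.08) ** 2)
      + 0.0015 * np.exp(-0.5 * ((x - 0.75) / 0.05) ** 2))
var_bins = np.full(mu.shape, (2e-4) ** 2)   # assumed flat per-bin variance

# Hypothetical bands as [start, stop) index ranges.
bands = {'feature_a': (85, 115), 'feature_b': (200, 225)}
continuum = np.median(mu[:20])              # assumed feature-free region

def band_snr(mu, var_bins, start, stop, continuum):
    """Band-mean excess over the continuum, in units of its standard error.

    Assumes the noise in different bins is independent.
    """
    depth = mu[start:stop].mean() - continuum
    sigma_mean = np.sqrt(var_bins[start:stop].sum()) / (stop - start)
    return depth / sigma_mean

snr = {name: band_snr(mu, var_bins, a, b, continuum) for name, (a, b) in bands.items()}
for name, value in snr.items():
    print(f"{name}: band SNR = {value:.1f}")
```

The same index ranges could be shaded on the variance plot with `plt.axvspan(start, stop, alpha=0.2)`. Correlated noise such as 1/f drift violates the independence assumption, so the value is optimistic in that case.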

In [None]:
def ensemble_variance(mu, K=64, cfg=None):
    cfg = cfg or {}
    ens = []
    for k in range(K):
        y = mu.copy()
        if cfg.get('shot'):   y = add_shot_noise(y, scale_counts=cfg.get('shot_scale',3e5))
        if cfg.get('read'):   y = add_read_noise(y, sigma_read=cfg.get('read_sigma',2e-4))
        if cfg.get('one_over_f'): y = add_1f_noise(y, alpha=cfg.get('alpha',1.0), amp=cfg.get('amp',2e-4))
        if cfg.get('cosmic_hits'):
            y,_ = add_cosmic_rays(y, n_hits=cfg.get('cr_n',3), spike_amp=cfg.get('cr_amp',0.01))
        ens.append(y)
    ens = np.stack(ens, axis=0)
    return ens.var(axis=0)

cfg = {'shot':True, 'read':True, 'one_over_f':True, 'cosmic_hits':True}
var_bins = ensemble_variance(mu, K=64, cfg=cfg)
plt.figure(figsize=(10,3))
plt.plot(base['wavelength_index'], var_bins, lw=1)
plt.title('Per‑bin variance under composite noise model')
plt.xlabel('wavelength index'); plt.ylabel('variance')
plt.tight_layout(); plt.savefig(NB_OUT/'perbin_variance.png', dpi=150); plt.close()
print('Saved perbin_variance.png')

## 5) Bundle export
We export a compact JSON + CSV for dashboards and lessons learned.

In [None]:
bundle = {
  'source': base_source,
  'noise_panels': 'noise_panels.png',
  'fft_autocorr_compare': 'fft_autocorr_noise_compare.png',
  'perbin_variance': 'perbin_variance.png',
  'notes': 'Educational demo; parameters are illustrative and not instrument‑calibrated.'
}
(NB_OUT/'radiation_noise_bundle.json').write_text(json.dumps(bundle, indent=2))
pd.DataFrame({'wavelength_index': base['wavelength_index'], 'mu_base': mu, 'var_noise': var_bins}).to_csv(NB_OUT/'radiation_noise_detail.csv', index=False)
print('Wrote bundle & detail CSV to', NB_OUT)

---
### Notes & Further Reading
- **Shot/read/dark/background** processes and their statistics are standard detector topics; see your instrument handbook for exact models and units.
- **Cosmic ray** rates depend on orbit and shielding; spikes must be detected/flagged to avoid biasing spectra and FFT/autocorr diagnostics.
- For the production pipeline, rely on the **calibration kill chain** and CLI diagnostics rather than these educational injectors.