# 🧠 SpectraMind V50 — 03_train_v50_demo.ipynb

Mission‑grade **mini training run** that exercises the official CLI with Hydra overrides.

**Standards**
- Notebooks are *thin orchestration*: **CLI → Hydra → DVC artifacts**. No ad‑hoc training logic.
- Inputs/outputs are versioned and written under `data/` and `outputs/` (DVC‑tracked where applicable).
- Figures/JSON from this demo are saved under `outputs/train_demo/`.
- Logs/diaries live under `logs/` (Rich console, JSONL, and Markdown run diary).

**What this notebook does**
1) Environment & CLI sanity checks.
2) (Optional) Print the merged Hydra config for visibility.
3) Run a **fast** demo training via CLI (e.g., `training.fast_dev_run=true`).
4) Surface key artifacts (e.g., `loss_curve.png`) and tail the run diary.
5) Persist a small `training_report.json` for downstream diagnostics.

> For full training or sweeps, use the CLI directly with your Hydra configs (e.g., `spectramind train training.epochs=50 model=...`).
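
As a rough sketch of how a notebook can stay thin while still parameterizing runs, the helper below turns a plain dict into Hydra-style `key=value` overrides and an argument list for `spectramind train`. The names `to_override` and `build_train_cmd` are hypothetical and not part of the SpectraMind CLI; the override keys are the ones already used in this notebook.

```python
import shlex

def to_override(key, value):
    """Format one Hydra-style override; booleans become lowercase true/false."""
    if isinstance(value, bool):
        value = str(value).lower()
    return f"{key}={value}"

def build_train_cmd(overrides, base=("spectramind", "train")):
    """Return an argv list such as ['spectramind', 'train', 'training.epochs=5']."""
    return list(base) + [to_override(k, v) for k, v in overrides.items()]

cmd = build_train_cmd({"training.epochs": 5, "optimizer": "adamw"})
print(shlex.join(cmd))  # spectramind train training.epochs=5 optimizer=adamw
```

Passing the resulting list to `subprocess.run` without `shell=True` avoids quoting problems when an override value contains spaces or shell metacharacters.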

## 0) Setup & paths
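Assumes this notebook runs from the repository's `notebooks/` directory (or from the repository root). The root is detected by looking for a `data/` folder: if the current directory has none, its parent is used. The demo output folder `outputs/train_demo/` is created if it does not exist.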
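
If the notebook may be launched from a deeper subfolder, a one-level parent check is not enough. The sketch below walks upward until it finds a marker entry. The helper name `find_repo_root` and the `.git` marker are assumptions for illustration; only the `data/` marker comes from this notebook.

```python
from pathlib import Path

def find_repo_root(start, markers=(".git", "data")):
    """Walk upward from `start` and return the first directory containing a marker."""
    start = Path(start).resolve()
    for candidate in (start, *start.parents):
        if any((candidate / m).exists() for m in markers):
            return candidate
    return start  # nothing found: fall back to the starting directory

print(find_repo_root(Path.cwd()))
```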

In [None]:
import os, sys, json, subprocess, textwrap
from pathlib import Path
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_context('notebook'); sns.set_style('whitegrid')

NB_DIR = Path.cwd()
ROOT = NB_DIR if (NB_DIR / 'data').exists() else NB_DIR.parents[0]
DATA = ROOT / 'data'
OUT = ROOT / 'outputs'
LOGS = ROOT / 'logs'
DEMO = OUT / 'train_demo'
OUT.mkdir(parents=True, exist_ok=True)
LOGS.mkdir(parents=True, exist_ok=True)
DEMO.mkdir(parents=True, exist_ok=True)

print('ROOT:', ROOT)
print('DATA:', DATA)
print('OUT :', OUT)
print('LOGS:', LOGS)
print('DEMO:', DEMO)

## 1) Environment & CLI snapshot (best‑effort)
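Print tool versions and commit state for auditability, and save a minimal `env_snapshot.json` (Python version and working directory). The cell is best-effort: a missing tool prints its error and execution continues.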
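
To persist tool versions and the commit id in the snapshot instead of only printing them, something along these lines could work. This is a minimal sketch: `tool_version` is a hypothetical helper, and the snapshot keys `dvc` and `git_commit` are illustrative names, not an established schema.

```python
import shutil
import subprocess
import sys

def tool_version(argv, timeout=10):
    """Return the first line a tool prints, or None if it is missing or fails."""
    if shutil.which(argv[0]) is None:
        return None  # tool is not on PATH
    try:
        p = subprocess.run(argv, text=True, capture_output=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return None
    if p.returncode != 0:
        return None
    lines = (p.stdout or p.stderr).strip().splitlines()
    return lines[0] if lines else None

snapshot = {
    "python": sys.version.split()[0],
    "dvc": tool_version(["dvc", "--version"]),
    "git_commit": tool_version(["git", "rev-parse", "--short", "HEAD"]),
}
print(snapshot)
```

Entries for unavailable tools come back as `None`, so the snapshot can still be written with `json.dump` on machines that lack DVC or Git.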

In [None]:
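def _run(cmd: str, cwd: "Path | None" = None):
    """Run a shell command from ROOT (or cwd), echo its output, and return the exit code."""
    print(f"\n$ {cmd}")
    try:
        p = subprocess.run(cmd, shell=True, text=True, capture_output=True, cwd=str(cwd or ROOT))
        out = (p.stdout or '')[-2000:]  # keep only the last 2000 characters
        err = (p.stderr or '')[-2000:]
        if out.strip():
            print(out.strip())
        if p.returncode != 0 and err.strip():
            print('[stderr]', err.strip())
        return p.returncode
    except Exception as e:
        # Best-effort: report the problem and signal failure with -1
        print('[skip]', e)
        return -1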

_run('python --version')
_run('spectramind --version')              # unified CLI (if available)
_run('dvc --version')                       # DVC presence
_run('git rev-parse --short HEAD')          # commit id
_run('git status -s')                       # working tree state

env_snapshot = {
    'python': sys.version.split()[0],
    'cwd': str(Path.cwd()),
}
with open(DEMO / 'env_snapshot.json', 'w') as f:
    json.dump(env_snapshot, f, indent=2)
print('Saved', DEMO / 'env_snapshot.json')

## 2) (Optional) Print merged Hydra config for this demo
If the CLI supports a `--print-config` flag (or similar), we can render the composed configuration before running.

> If your CLI exposes a different flag for printing configs, update the command below accordingly, or skip if unsupported.
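
One way to decide whether to run the optional cell is to look for the flag in the command's help text first. The sketch below assumes the CLI prints its options in response to `--help`, which is not confirmed for `spectramind`; `cli_supports_flag` is a hypothetical helper, demonstrated here on the Python interpreter.

```python
import subprocess
import sys

def cli_supports_flag(argv, flag, timeout=15):
    """Best-effort check: does `<argv> --help` mention `flag`?"""
    try:
        p = subprocess.run([*argv, "--help"], text=True, capture_output=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return False  # CLI missing or unresponsive
    return flag in (p.stdout or "") + (p.stderr or "")

# For this notebook the call would be (assumption, untested against the real CLI):
#   cli_supports_flag(["spectramind", "train"], "--print-config")
print(cli_supports_flag([sys.executable], "--version"))
print(cli_supports_flag([sys.executable], "--no-such-flag-xyz"))
```

A substring match on help text is only a heuristic; a flag can be documented under a different spelling or hidden from the help output.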

In [None]:
# Optional: if the CLI rejects the flag, _run prints the error and the notebook continues.
_run('spectramind train training.fast_dev_run=true --print-config')

## 3) Run a **fast** training demo via CLI
We prefer `training.fast_dev_run=true` for a smoke test. If that command exits non-zero for any reason (for example, because the override is unsupported), the cell retries once with `training.epochs=1`.

Artifacts (e.g., `loss_curve.png`, checkpoints, JSON logs) should be written under `outputs/` per your configs. We won’t assume exact subpaths; we’ll discover them after the run.
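
The try-then-fall-back pattern described above can be generalized to an ordered list of commands. The following is a minimal sketch with a hypothetical helper, `run_with_fallback`; the two demo commands are stand-ins that use the Python interpreter, not real training invocations.

```python
import subprocess
import sys

def run_with_fallback(commands):
    """Try each argv list in order.

    Returns (index, returncode) of the first success, or of the last
    attempt if every command fails; (-1, -1) for an empty list.
    """
    last = (-1, -1)
    for i, argv in enumerate(commands):
        try:
            rc = subprocess.run(argv).returncode
        except OSError:
            rc = 127  # executable not found; the code a shell would report
        last = (i, rc)
        if rc == 0:
            break
    return last

demo = [
    [sys.executable, "-c", "raise SystemExit(2)"],     # stands in for a rejected override
    [sys.executable, "-c", "print('fallback ran')"],   # stands in for training.epochs=1
]
print(run_with_fallback(demo))
```

Returning the index makes it easy to record in the report which variant actually produced the artifacts.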

In [None]:
ret = _run('spectramind train training.fast_dev_run=true')
if ret != 0:
    print('\nRetrying with epochs=1 fallback...')
    ret = _run('spectramind train training.epochs=1')
print('Training command exit code:', ret)

## 4) Surface key artifacts (loss curve, checkpoints, logs)
We search `outputs/` recursively for typical training artifacts and list what we find, without assuming a fixed layout. The first loss-curve image found (e.g. `loss_curve.png`) is copied into `outputs/train_demo/` for convenience.
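
Because `outputs/` can hold artifacts from earlier runs, it may help to keep only files modified since the demo run started. The sketch below is one way to do that with file modification times; `modified_since` is a hypothetical helper and the glob patterns are examples.

```python
import time
from pathlib import Path

def modified_since(root, patterns, since):
    """Return files under `root` matching any glob pattern with mtime >= `since`, newest first."""
    root = Path(root)
    if not root.is_dir():
        return []
    hits = {p for pat in patterns for p in root.rglob(pat) if p.is_file()}
    fresh = [p for p in hits if p.stat().st_mtime >= since]
    return sorted(fresh, key=lambda p: p.stat().st_mtime, reverse=True)

# Record the start time just before launching training; subtract a second of
# slack because filesystem timestamps can be coarser than time.time().
run_started = time.time() - 1.0
# ... training would run here ...
print(modified_since("outputs", ["*.ckpt", "*loss*.png"], run_started))
```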

In [None]:
from pathlib import Path
import shutil

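def find_all(patterns):
    """Recursively collect files under OUT matching any pattern, newest first."""
    found = set()
    for pat in patterns:
        # rglob also matches files directly under OUT, so one recursive pass suffices
        found.update(p for p in OUT.rglob(pat) if p.is_file())
    # Skip this notebook's own files in DEMO so reruns don't rediscover earlier copies
    found = [p for p in found if DEMO not in p.parents]
    return sorted(found, key=lambda p: p.stat().st_mtime, reverse=True)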

loss_imgs = find_all(['loss_curve.png', '*loss*.png'])
ckpts = find_all(['*.pt', '*.pth', '*.ckpt'])
json_logs = find_all(['*.json', '*.jsonl'])

print('Found loss images:')
for p in loss_imgs[:5]:
    print(' -', p.relative_to(ROOT))
print('\nFound checkpoints:')
for p in ckpts[:5]:
    print(' -', p.relative_to(ROOT))
print('\nFound logs (json/jsonl):')
for p in json_logs[:5]:
    print(' -', p.relative_to(ROOT))

# Optionally copy the first loss curve into the demo folder for easy viewing
if loss_imgs:
    try:
        target = DEMO / 'loss_curve.png'
        shutil.copy2(loss_imgs[0], target)
        print('Copied loss curve to', target.relative_to(ROOT))
    except Exception as e:
        print('[skip copy]', e)

### Display the first loss curve (if found)
This is just a convenience visualization for the demo run; the canonical artifact remains where the training script wrote it under `outputs/`.

In [None]:
import matplotlib.image as mpimg
from IPython.display import display

img_path = DEMO / 'loss_curve.png'
if not img_path.exists() and loss_imgs:
    img_path = loss_imgs[0]

if img_path.exists():
    img = mpimg.imread(str(img_path))
    plt.figure(figsize=(8,4))
    plt.imshow(img)
    plt.axis('off')
    plt.title('Loss curve (demo)')
    plt.tight_layout(); plt.show()
else:
    print('No loss curve image found.')

## 5) Tail the CLI run diary (Markdown) and DVC status
The diary tail and `dvc status` help confirm reproducibility and locate the Hydra config snapshots linked to the run.
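
The Standards list also mentions JSONL logs under `logs/`. A similar tail can be taken from a JSON Lines file, as in the sketch below. The helper name `tail_jsonl` is hypothetical, and nothing is assumed about the record schema beyond one JSON value per line.

```python
import json
from collections import deque
from pathlib import Path

def tail_jsonl(path, n=5):
    """Return the last `n` parseable JSON records from a JSON Lines file."""
    records = deque(maxlen=n)  # keeps only the most recent n records
    with Path(path).open(encoding="utf-8", errors="ignore") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError:
                continue  # skip partially written or malformed lines
    return list(records)
```

Called with the path of a JSONL log, this returns a short list of dicts that can be printed or loaded into a `pandas.DataFrame` for inspection.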

In [None]:
log_md = LOGS / 'v50_debug_log.md'
print('CLI journal exists?', log_md.exists(), log_md)
if log_md.exists():
    try:
        tail = '\n'.join(log_md.read_text(errors='ignore').splitlines()[-60:])
        print('\n--- tail logs/v50_debug_log.md ---\n' + tail)
    except Exception as e:
        print('[skip log tail]', e)

_run('dvc status')

## 6) Persist a `training_report.json`
We record what we found (up to 10 paths per category, relative to the repository root) so downstream notebooks or dashboards can reference these artifacts directly.
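
A downstream notebook might read the report back and check that the referenced files still exist before using them. The following is a minimal sketch; `load_report` is a hypothetical helper that assumes the report maps category names to lists of root-relative paths, as written by this notebook.

```python
import json
from pathlib import Path

def load_report(report_path, root):
    """Load training_report.json and split its entries into existing and missing paths."""
    report = json.loads(Path(report_path).read_text(encoding="utf-8"))
    existing, missing = {}, []
    for key, rel_paths in report.items():
        existing[key] = []
        for rel in rel_paths:
            p = Path(root) / rel
            # Route each path to its category list or to the shared missing list
            (existing[key] if p.exists() else missing).append(p)
    return existing, missing
```

For example, `existing, missing = load_report(DEMO / 'training_report.json', ROOT)` would give per-category lists of usable paths plus a flat list of stale entries.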

In [None]:
report = {
    'loss_images': [str(p.relative_to(ROOT)) for p in loss_imgs[:10]],
    'checkpoints': [str(p.relative_to(ROOT)) for p in ckpts[:10]],
    'json_logs': [str(p.relative_to(ROOT)) for p in json_logs[:10]],
}
with open(DEMO / 'training_report.json', 'w') as f:
    json.dump(report, f, indent=2)
print('Saved', DEMO / 'training_report.json')

## 7) Next steps / reference CLI commands
Use the CLI for real training; scale epochs, switch models/optimizers, and attach diagnostics.

```bash
# Minimal smoke test (preferred):
# spectramind train training.fast_dev_run=true

# Short run:
# spectramind train training.epochs=5 optimizer=adamw model=v50_default

# Resume / checkpoint (example — adapt to your layout):
# spectramind train training.resume_from=outputs/checkpoints/last.ckpt

# After training, run diagnostics and HTML dashboard:
# spectramind diagnose dashboard --no-umap=false --no-tsne=false \
#   --out outputs/diagnostics/report.html
```

**See also**: `04_predict_v50_demo.ipynb` for inference and `05_diagnostics_suite.ipynb` for unified diagnostics.