# Market NN Plus Ultra — MVP Retraining Walkthrough

This notebook operationalises the optimisation plan by executing the MVP training→evaluation→monitoring loop end-to-end. It mirrors the quickstart steps while capturing artefacts, diagnostics, and guardrail checks you can review after each run. All instructions are written for a Windows host running PowerShell. Launch Jupyter after activating the project's `.venv` so imports resolve without raising `ModuleNotFoundError`.


## Execution Checklist

We keep the quickstart sequence intact and mark completion inline. Update notes after each cell to maintain traceability.

- [ ] Prepare a reproducible SQLite fixture
- [ ] Run the retraining plan with evaluation enabled
- [ ] Capture a reference return series for monitoring
- [ ] Generate a monitoring snapshot
- [ ] Review profitability + guardrail outputs

## 0. Environment bootstrap

Install the project in editable mode directly from the notebook so the `market_nn_plus_ultra` package is registered in your active Windows virtual environment. If you are launching Jupyter from PowerShell, run `Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope Process -Force` once per session before running `.venv\Scripts\Activate.ps1` to activate the environment.


In [None]:
import os
import subprocess
import sys
from pathlib import Path

# Resolve the project root whether Jupyter was launched from the repo root or from notebooks/
notebook_dir = Path.cwd().resolve()
if notebook_dir.name == 'notebooks' and (notebook_dir.parent / 'pyproject.toml').exists():
    project_root = notebook_dir.parent
else:
    project_root = notebook_dir
os.chdir(project_root)
print(f'Working directory: {project_root}')

# Use the kernel's own interpreter so pip installs into the environment the notebook is running in
subprocess.run([sys.executable, '-m', 'pip', 'install', '--quiet', '--upgrade', 'pip'], check=True)
subprocess.run([sys.executable, '-m', 'pip', 'install', '--quiet', '-e', '.'], check=True)
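
Before moving on, it can help to confirm that the kernel really is running inside the virtual environment. The following is a minimal sketch using only the standard library; the helper name `in_virtualenv` is illustrative and not part of the project.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix points at the interpreter it was created from.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print(f"Interpreter: {sys.executable}")
print(f"Inside a virtual environment: {in_virtualenv()}")
```

If the second line prints `False`, the kernel was probably started before `.venv` was activated, and the editable install may have landed in a different environment.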


## 1. Prepare the SQLite fixture

We reuse `scripts/make_fixture.py` with reduced dimensions so CPU-bound smoke tests finish quickly. The cell below deletes any existing database before regenerating it, so reruns start from fresh data; alternatively, pass `--overwrite` to the script to refresh the data.

In [None]:
fixture_path = project_root / 'data' / 'plus_ultra_fixture.db'
fixture_path.parent.mkdir(parents=True, exist_ok=True)
if fixture_path.exists():
    fixture_path.unlink()

cmd = [
    'python', 'scripts/make_fixture.py', str(fixture_path),
    '--symbols', 'SPY', 'QQQ', 'IWM',
    '--rows', '4096',
    '--freq', '30min',
    '--alt-features', '2',
]
print(' '.join(cmd))
fixture_result = subprocess.run(cmd, check=True, capture_output=True, text=True)
print(fixture_result.stdout)
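
To confirm the fixture was written as expected, you could list its tables and row counts. This is a hedged sketch that relies only on the standard `sqlite3` module and makes no assumption about the fixture's schema; the helper name `summarise_sqlite` is hypothetical.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def summarise_sqlite(db_path):
    """Return {table_name: row_count} for every user table in a SQLite file."""
    db_path = Path(db_path)
    if not db_path.exists():
        # sqlite3.connect would otherwise silently create an empty database.
        raise FileNotFoundError(db_path)
    # closing() releases the file handle, which Windows requires before the
    # file can be deleted again on the next rerun.
    with closing(sqlite3.connect(str(db_path))) as conn:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
```

In this notebook the call would be `summarise_sqlite(fixture_path)`; an empty dictionary would suggest the fixture script wrote no tables.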


✅ **Checklist update:** Prepared the SQLite fixture.

## 2. Run the retraining plan with evaluation

The automation CLI chains schema validation, masked pretraining, supervised training, and inference. We point it at the CPU-friendly MVP configs added alongside this notebook. Guardrail thresholds are loose to keep the run focused on smoke-testing the pipeline.

In [None]:
automation_dir = project_root / 'automation_runs' / 'mvp_notebook'
automation_dir.mkdir(parents=True, exist_ok=True)

automation_cmd = [
    'python', 'scripts/automation/retrain.py',
    '--dataset', str(fixture_path),
    '--train-config', 'configs/mvp_quickstart.yaml',
    '--pretrain-config', 'configs/mvp_pretrain.yaml',
    '--run-evaluation',
    '--eval-output', str(automation_dir / 'evaluation'),
    '--eval-min-sharpe', '-0.5',
    '--eval-max-drawdown', '0.8',
    '--eval-max-gross-exposure', '2.0',
    '--eval-max-turnover', '5.0',
    '--eval-min-tail-return', '-0.5',
    '--eval-max-tail-frequency', '0.5',
]
print(' '.join(str(arg) for arg in automation_cmd))
automation_result = subprocess.run(automation_cmd, check=True, capture_output=True, text=True)
print(automation_result.stdout)
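
The `--eval-min-*` and `--eval-max-*` flags read as floors and ceilings on evaluation metrics. How `retrain.py` applies them is not shown here, so the following is only an illustrative sketch of that kind of check: the function `check_guardrails`, the metric names, and the example metric values are all assumptions, while the threshold numbers are copied from the command above.

```python
def check_guardrails(metrics, minimums, maximums):
    """Return readable violations; an empty list means every check passed."""
    violations = []
    for name, floor in minimums.items():
        value = metrics.get(name)
        # Metrics that were not reported are skipped rather than failed.
        if value is not None and value < floor:
            violations.append(f"{name}={value} is below the minimum {floor}")
    for name, ceiling in maximums.items():
        value = metrics.get(name)
        if value is not None and value > ceiling:
            violations.append(f"{name}={value} is above the maximum {ceiling}")
    return violations


# Thresholds taken from the CLI flags; the metric keys are hypothetical names.
minimums = {"sharpe": -0.5, "tail_return": -0.5}
maximums = {
    "max_drawdown": 0.8,
    "gross_exposure": 2.0,
    "turnover": 5.0,
    "tail_frequency": 0.5,
}

# Invented values, purely to exercise the check.
example_metrics = {"sharpe": 0.3, "max_drawdown": 0.85, "turnover": 1.2}
print(check_guardrails(example_metrics, minimums, maximums))
```

With thresholds this loose, a violation in a real run would more likely point to a broken pipeline than to a weak strategy.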


✅ **Checklist update:** Ran the retraining plan and produced evaluation artefacts.

## 3. Capture a reference return series

Monitoring compares live ROI to a baseline. For the MVP we recycle the `realised_return` column from the evaluation output (`predictions.parquet`). In production you would reference an audited benchmark catalogue.

In [None]:
import pandas as pd

predictions_path = automation_dir / 'evaluation' / 'predictions.parquet'
reference_path = automation_dir / 'evaluation' / 'reference_returns.parquet'

predictions = pd.read_parquet(predictions_path)
reference = predictions[["realised_return"]].dropna()
reference.to_parquet(reference_path)
print(reference.head())
print(f"Saved reference series to {reference_path}")
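
`dropna()` removes missing values but leaves infinities in place, so a quick sanity check on the reference series may be worthwhile before handing it to monitoring. The sketch below uses only the standard library and accepts any iterable of numbers; the helper name `summarise_returns` is illustrative.

```python
import math


def summarise_returns(values):
    """Count finite and non-finite entries and report the range of the finite ones."""
    values = list(values)
    # Keep only real, finite numbers; None, NaN, and +/-inf are excluded.
    finite = [float(v) for v in values if v is not None and math.isfinite(v)]
    return {
        "count": len(values),
        "finite": len(finite),
        "min": min(finite) if finite else None,
        "max": max(finite) if finite else None,
        "mean": sum(finite) / len(finite) if finite else None,
    }


# Invented sample values, purely for illustration.
print(summarise_returns([0.01, -0.02, float("inf")]))
```

In this notebook the call would be `summarise_returns(reference["realised_return"].tolist())`; a `finite` count lower than `count` indicates values the monitoring step may not handle well.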


✅ **Checklist update:** Captured the reference return series.

## 4. Generate a monitoring snapshot

The monitoring CLI fuses guardrail metrics with drift diagnostics. Feed it both the reference series and the evaluation directory so alerts stay contextualised.

In [None]:
monitoring_output = automation_dir / 'evaluation' / 'monitoring_snapshot.json'
monitoring_cmd = [
    'python', 'scripts/monitoring/live_monitor.py',
    str(reference_path),
    '--evaluation-dir', str(automation_dir / 'evaluation'),
    '--output', str(monitoring_output),
]
print(' '.join(str(arg) for arg in monitoring_cmd))
monitoring_result = subprocess.run(monitoring_cmd, check=True, capture_output=True, text=True)
print(monitoring_result.stdout)
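
The cells in this notebook call `subprocess.run` with `check=True` and `capture_output=True`. When a command fails, the resulting `CalledProcessError` message reports only the command and exit code; the captured stderr is available on the exception but is not printed. A small wrapper along these lines (the name `run_step` is hypothetical) would surface it:

```python
import subprocess
import sys


def run_step(cmd):
    """Run a command and return its stdout, echoing stderr if it fails."""
    try:
        result = subprocess.run(
            [str(arg) for arg in cmd],
            check=True,
            capture_output=True,
            text=True,
        )
    except subprocess.CalledProcessError as exc:
        # Show the captured error output before re-raising, so the
        # notebook cell displays the underlying failure.
        print(exc.stderr, file=sys.stderr)
        raise
    return result.stdout


print(run_step([sys.executable, "-c", "print('ok')"]))
```

Any of the command lists above, such as `monitoring_cmd`, could be passed to it unchanged.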


✅ **Checklist update:** Generated the monitoring snapshot.

## 5. Review profitability & guardrails

Summarise the operations output and the monitoring snapshot to confirm the pipeline surfaces ROI, drawdowns, and alert flags. Update the checklist when verification is complete.

In [None]:
import json

operations_path = automation_dir / 'evaluation' / 'operations_summary.json'
operations = json.loads(operations_path.read_text())
monitoring_payload = json.loads(monitoring_output.read_text())

print('Operations Summary Keys:', list(operations))
print('Sharpe:', operations.get('risk', {}).get('sharpe'))
print('Max Drawdown:', operations.get('risk', {}).get('max_drawdown'))
print('Alerts:', operations.get('alerts'))

print()
print('Monitoring Snapshot Keys:', list(monitoring_payload))
print('Guardrail Alerts:', monitoring_payload.get('alerts'))
print('Window Count:', monitoring_payload.get('window_count'))
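
Chained lookups such as `operations.get('risk', {}).get('sharpe')` raise `AttributeError` if `risk` is present but `null` in the JSON, because `.get` then returns `None` rather than the default. A tolerant lookup could be sketched as follows; `get_nested` is a hypothetical helper and the sample payload is invented for illustration.

```python
def get_nested(payload, *keys, default=None):
    """Walk nested dictionaries, returning default if any level is missing."""
    current = payload
    for key in keys:
        # Stop as soon as the current level is not a dict or lacks the key.
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Invented payload; the real operations_summary.json may differ.
sample = {"risk": {"sharpe": 0.4}, "alerts": None}
print("Sharpe:", get_nested(sample, "risk", "sharpe"))
print("Max Drawdown:", get_nested(sample, "risk", "max_drawdown", default="n/a"))
print("Alert count:", get_nested(sample, "alerts", "count", default=0))
```

The same helper could be applied to `operations` and `monitoring_payload` so that a missing or empty section does not interrupt the review cell.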


✅ **Checklist update:** Reviewed profitability and guardrail outputs.