v0.5.0
evo 0.5.0 makes the loop optimize the whole system (the model weights and the harness) in one run, against one objective. It also adds a new Claude Code workflow driver with a live meta-controller, subagents, and a much richer dashboard.
Optimize the model, not just the harness
- evo can now fine-tune the base model (SFT / LoRA / RL) as a move inside the optimization loop, alongside the prompts, scaffold, and skills it already tuned. You hand it the whole stack and it decides what to spend the budget on.
- New evo:finetuning skill: picks or diagnoses a training move (SFT, LoRA, DPO/KTO/ORPO, RFT, GRPO/PPO/RLOO) with a reward-shape decision tree, a smoke-run gate, and failure diagnostics. Warm-start from the parent policy by default (EVO_PARENT_POLICY).
Workflow driver + live meta-controller (Claude Code)
- A dynamic-workflow driver for the optimize loop — now the default on Claude Code (prose orchestration is opt-out).
- A concurrent meta-controller that watches a run and can restructure the loop live: set knobs, toggle phases, rewrite prompts, inject steps — plus a STOP signal with a gated enforcer. The autonomous stop-nudge is suppressed under the workflow driver.
- Scan now clusters experiments by failure class, a context capsule loads category skills and known learnings, and cross-history pattern recognition runs before proposing.
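As a rough illustration of what clustering by failure class means, the sketch below groups experiment records by a label. The record shape, the `failure_class` field, and the label values are assumptions made for this example; these notes do not describe evo's actual schema or how it assigns classes.

```python
from collections import defaultdict

# Hypothetical experiment records; evo's real schema is not described in these notes.
experiments = [
    {"id": "exp-1", "failure_class": "timeout"},
    {"id": "exp-2", "failure_class": "oom"},
    {"id": "exp-3", "failure_class": "timeout"},
    {"id": "exp-4", "failure_class": None},  # did not fail, so it has no class
]


def cluster_by_failure_class(records):
    """Group experiment ids by failure class, skipping records that did not fail."""
    clusters = defaultdict(list)
    for record in records:
        label = record.get("failure_class")
        if label is not None:
            clusters[label].append(record["id"])
    return dict(clusters)


clusters = cluster_by_failure_class(experiments)
```

Grouping first means a proposal step can look at one representative per cluster instead of re-reading every failed run.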
Subagents
- evo:verifier and evo:ideator now run as subagents.
- New benchmark-reviewer subagent; the discover baseline is gated on its review.
Dashboard
- Live log tail, trackio link/sparkline in the node drawer, and per-experiment annotations.
- Cleaner tabs/logs; committed-experiment trace handling improvements.
- EVO_DASHBOARD_HOST to bind 0.0.0.0 for Modal/cloud.
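A minimal sketch of how a server might resolve its bind host from this variable. Only the variable name and the 0.0.0.0 value come from the notes above; the function name and the loopback fallback are assumptions, not evo's documented behavior.

```python
import os


def dashboard_bind_host(environ=os.environ):
    """Return the host to bind to.

    Uses EVO_DASHBOARD_HOST when it is set and non-empty; otherwise falls back
    to loopback. The 127.0.0.1 fallback is an assumption for this sketch.
    """
    return environ.get("EVO_DASHBOARD_HOST") or "127.0.0.1"


# Inside a container or cloud sandbox, loopback is not reachable from outside,
# which is why binding all interfaces (0.0.0.0) is needed there.
cloud_host = dashboard_bind_host({"EVO_DASHBOARD_HOST": "0.0.0.0"})
local_host = dashboard_bind_host({})
```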
CLI & hooks
- evo wait gained process / log / GPU probes and a --for ideators selector so the loop can block on proposals.
- --per-exp-timeout on init with a --timeout per-call override; a PostToolUse hint when the agent starts a long-running command.
- evo abort now finds the subprocess tree cross-platform (Windows included), so detached benchmark/training children don't survive as orphans.
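To make the subprocess-tree idea concrete, here is a platform-neutral sketch: given a snapshot of (pid, parent pid) pairs, it collects every descendant of a root process. How evo abort actually discovers and terminates the tree is not described in these notes; the function name and the sample pids are hypothetical.

```python
from collections import defaultdict, deque


def descendants(process_table, root_pid):
    """Return all descendant pids of root_pid in breadth-first order.

    process_table is an iterable of (pid, parent_pid) pairs, i.e. a snapshot
    of the OS process list obtained by whatever platform-specific means.
    """
    children = defaultdict(list)
    for pid, ppid in process_table:
        children[ppid].append(pid)

    found = []
    seen = {root_pid}
    queue = deque([root_pid])
    while queue:
        for child in children.get(queue.popleft(), []):
            if child not in seen:  # guard against cycles in a stale snapshot
                seen.add(child)
                found.append(child)
                queue.append(child)
    return found


# Hypothetical snapshot: 100 is the run, 101 a benchmark, 102 a training worker.
snapshot = [(100, 1), (101, 100), (102, 101), (200, 1)]
tree = descendants(snapshot, 100)
```

A walk like this has to happen while the parent is still alive: on POSIX systems a child whose parent has already exited is reparented, so the parent link no longer leads back to the run.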
Integrity & config
- task-skills config: discover resolves category skills and agents load them on demand.
- Literature research is required before the first experiment; training on the benchmark set is banned.
Fixes
- hook-drain staging honors CLAUDE_CONFIG_DIR and from-path installs (fixes the SessionStart exit-127 warning).
Install
uv tool install evo-hq-cli==0.5.0
evo install claude-code # or codex / cursor / openclaw / pi
Also published: evo-hq-agent 0.5.0 (PyPI), @evo-hq/evo-agent and @evo-hq/pi-evo 0.5.0 (npm).
Full changelog: v0.4.5...v0.5.0