Release v0.5.0 · evo-hq/evo

evo 0.5.0 makes the loop optimize the whole system — the model weights and the harness — in one run, against one objective. Plus a new Claude Code workflow driver with a live meta-controller, subagents, and a much richer dashboard.

Optimize the model, not just the harness

evo can now fine-tune the base model (SFT / LoRA / RL) as a move inside the optimization loop, alongside the prompts, scaffold, and skills it already tuned. You hand it the whole stack and it decides what to spend the budget on.
New evo:finetuning skill: picks or diagnoses a training move (SFT, LoRA, DPO/KTO/ORPO, RFT, GRPO/PPO/RLOO) with a reward-shape decision tree, a smoke-run gate, and failure diagnostics. Training warm-starts from the parent policy by default (EVO_PARENT_POLICY).
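To make the idea of a reward-shape decision tree concrete, here is a minimal sketch of how the shape of the available training signal can narrow the choice of method. It is not the logic of the evo:finetuning skill; the `Signal` enum, the `pick_training_move` function, and both flags are hypothetical names invented for illustration. It covers only methods whose data requirements are well established, and leaves out RFT and LoRA because these notes do not say how the skill treats them.

```python
from enum import Enum


class Signal(Enum):
    """Shape of the training signal for a task (hypothetical categories)."""
    DEMONSTRATIONS = "demonstrations"          # gold (prompt, response) pairs
    PAIRED_PREFERENCES = "paired_preferences"  # (prompt, chosen, rejected) triples
    UNPAIRED_LABELS = "unpaired_labels"        # (prompt, response, good/bad) rows
    SCALAR_REWARD = "scalar_reward"            # a scorer for freshly sampled outputs


def pick_training_move(signal, can_hold_reference_model=True, can_train_critic=False):
    """Map a signal shape to one candidate method (illustrative, not evo's rules)."""
    if signal is Signal.DEMONSTRATIONS:
        # Plain supervised fine-tuning imitates the gold responses.
        return "SFT"
    if signal is Signal.PAIRED_PREFERENCES:
        # DPO compares pairs against a frozen reference policy; ORPO is reference-free.
        return "DPO" if can_hold_reference_model else "ORPO"
    if signal is Signal.UNPAIRED_LABELS:
        # KTO learns from per-example desirable/undesirable labels, no pairing needed.
        return "KTO"
    if signal is Signal.SCALAR_REWARD:
        # PPO trains a value model (critic); GRPO uses a group-relative baseline instead.
        return "PPO" if can_train_critic else "GRPO"
    raise ValueError(f"unknown signal shape: {signal!r}")
```

For example, `pick_training_move(Signal.PAIRED_PREFERENCES, can_hold_reference_model=False)` returns `"ORPO"`. A real decision tree would presumably also weigh budget and data volume, which this sketch ignores.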

Workflow driver + live meta-controller (Claude Code)

A dynamic-workflow driver for the optimize loop, now the default on Claude Code (prose orchestration remains available as the opt-out).
A concurrent meta-controller that watches a run and can restructure the loop live: set knobs, toggle phases, rewrite prompts, inject steps — plus a STOP signal with a gated enforcer. The autonomous stop-nudge is suppressed under the workflow driver.
Scan now clusters experiments by failure class, a context capsule loads category skills and known learnings, and the loop runs cross-history pattern recognition before proposing.

Subagents

evo:verifier and evo:ideator now run as subagents.
New benchmark-reviewer subagent; the discover baseline is gated on its review.

Dashboard

Live log tail, trackio link/sparkline in the node drawer, and per-experiment annotations.
Cleaner tabs/logs; committed-experiment trace handling improvements.
New EVO_DASHBOARD_HOST setting lets the dashboard bind 0.0.0.0 for Modal/cloud deployments.
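The host setting matters because a server bound to the loopback address is unreachable from outside a container, while 0.0.0.0 listens on every interface. The sketch below shows the general pattern with the standard library only. It assumes EVO_DASHBOARD_HOST is read as an environment variable and that the fallback is 127.0.0.1; neither detail is confirmed by these notes, and `open_listener` is a hypothetical helper, not part of evo.

```python
import os
import socket


def open_listener(port=0, env=None):
    """Open a listening TCP socket on the host named by EVO_DASHBOARD_HOST.

    Assumption: an unset or empty variable falls back to loopback only.
    Port 0 asks the OS for any free port, which keeps the sketch safe to run.
    """
    env = os.environ if env is None else env
    host = env.get("EVO_DASHBOARD_HOST") or "127.0.0.1"
    return socket.create_server((host, port))


if __name__ == "__main__":
    # Bind all interfaces, as one would inside a cloud container.
    with open_listener(env={"EVO_DASHBOARD_HOST": "0.0.0.0"}) as sock:
        print("listening on", sock.getsockname())
```

Binding 0.0.0.0 exposes the port to anything that can route to the machine, so it is usually paired with whatever access control the platform provides.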

CLI & hooks

evo wait gained process / log / GPU probes and a --for ideators selector so the loop can block on proposals.
--per-exp-timeout on init with a --timeout per-call override; a PostToolUse hint when the agent starts a long-running command.
evo abort now finds the subprocess tree cross-platform (Windows included), so detached benchmark/training children don't survive as orphans.
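The timeout override and the orphan cleanup described above can be sketched together in a few lines of standard-library Python. This is a generic illustration of the technique, not evo's implementation: `spawn_in_own_group`, `kill_tree`, and `run_with_timeout` are hypothetical names, and the approach (a POSIX process group, `taskkill /T` on Windows) is one common way to reach detached children rather than the one evo necessarily uses.

```python
import os
import signal
import subprocess
import sys


def spawn_in_own_group(cmd):
    """Start cmd so that it and its children can be terminated as one unit."""
    if os.name == "nt":
        return subprocess.Popen(cmd, creationflags=subprocess.CREATE_NEW_PROCESS_GROUP)
    # A new session makes the child the leader of a fresh process group.
    return subprocess.Popen(cmd, start_new_session=True)


def kill_tree(proc, grace=5.0):
    """Force-kill proc and its descendants, then reap it and return the exit code."""
    if os.name == "nt":
        # /T walks the child tree, /F forces termination.
        subprocess.run(
            ["taskkill", "/PID", str(proc.pid), "/T", "/F"],
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False,
        )
    else:
        try:
            # The child leads its own group, so its pid is also the group id.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the whole group has already exited
    return proc.wait(timeout=grace)


def run_with_timeout(cmd, per_exp_timeout, timeout=None):
    """Run cmd; a per-call timeout, when given, overrides the per-experiment default.

    Returns the exit code, or None if the limit was hit and the tree was killed.
    """
    limit = per_exp_timeout if timeout is None else timeout
    proc = spawn_in_own_group(cmd)
    try:
        return proc.wait(timeout=limit)
    except subprocess.TimeoutExpired:
        kill_tree(proc)
        return None


if __name__ == "__main__":
    sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(sleeper, per_exp_timeout=60, timeout=0.5))
```

A child that deliberately moves itself into another session escapes a process-group kill on POSIX, which is one reason a real tool may prefer to walk the process table instead.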

Integrity & config

task-skills config: discover resolves category skills and agents load them on demand.
Literature research is required before the first experiment; training on the benchmark set is banned.

Fixes

hook-drain staging honors CLAUDE_CONFIG_DIR and from-path installs (fixes the SessionStart exit-127 warning).

Install

uv tool install evo-hq-cli==0.5.0
evo install claude-code   # or codex / cursor / openclaw / pi

Also published: evo-hq-agent 0.5.0 (PyPI), @evo-hq/evo-agent and @evo-hq/pi-evo 0.5.0 (npm).

Full changelog: v0.4.5...v0.5.0
