v0.5.0

github-actions released this 06 Jun 09:40 · 19 commits to main since this release · 0090ce9

evo 0.5.0 makes the loop optimize the whole system, both the model weights and the harness, in one run against one objective. It also adds a new Claude Code workflow driver with a live meta-controller, subagents, and a much richer dashboard.

Optimize the model, not just the harness

  • evo can now fine-tune the base model (SFT / LoRA / RL) as a move inside the optimization loop, alongside the prompts, scaffold, and skills it already tuned. You hand it the whole stack and it decides what to spend the budget on.
  • New evo:finetuning skill: picks or diagnoses a training move (SFT, LoRA, DPO/KTO/ORPO, RFT, GRPO/PPO/RLOO) with a reward-shape decision tree, a smoke-run gate, and failure diagnostics. Training warm-starts from the parent policy by default (EVO_PARENT_POLICY).

Workflow driver + live meta-controller (Claude Code)

  • A dynamic-workflow driver for the optimize loop, now the default on Claude Code (prose orchestration is available by opting out).
  • A concurrent meta-controller that watches a run and can restructure the loop live: set knobs, toggle phases, rewrite prompts, and inject steps. It also adds a STOP signal with a gated enforcer. The autonomous stop-nudge is suppressed under the workflow driver.
  • Scan clusters experiments by failure class; a context capsule loads category skills and known learnings; cross-history pattern recognition runs before proposing.
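
To make "clusters experiments by failure class" concrete, here is a minimal sketch of the grouping idea. It is not evo's implementation: the record shape (`id`, `failure_class`), the `unclassified` fallback, and the sample records are all hypothetical and exist only for illustration.

```python
from collections import defaultdict


def cluster_by_failure_class(experiments):
    """Group experiment ids under their failure class, preserving input order."""
    clusters = defaultdict(list)
    for exp in experiments:
        # Hypothetical record shape: {"id": ..., "failure_class": ...}.
        # Records without a class fall into an assumed "unclassified" bucket.
        clusters[exp.get("failure_class", "unclassified")].append(exp["id"])
    return dict(clusters)


# Made-up sample records, not real evo output.
experiments = [
    {"id": "exp-1", "failure_class": "timeout"},
    {"id": "exp-2", "failure_class": "oom"},
    {"id": "exp-3", "failure_class": "timeout"},
    {"id": "exp-4"},
]
clusters = cluster_by_failure_class(experiments)
```

Grouping first means a proposal step can address a whole class of failures at once rather than reacting to each failed experiment in isolation.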

Subagents

  • evo:verifier and evo:ideator now run as subagents.
  • New benchmark-reviewer subagent; the discover baseline is gated on its review.

Dashboard

  • Live log tail, trackio link/sparkline in the node drawer, and per-experiment annotations.
  • Cleaner tabs/logs; committed-experiment trace handling improvements.
  • EVO_DASHBOARD_HOST lets the dashboard bind to 0.0.0.0 for Modal/cloud.
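
For readers unfamiliar with why the bind host matters on Modal or other cloud machines, the sketch below shows the general pattern of reading a host from the environment and binding a listener to it. Only the variable name EVO_DASHBOARD_HOST comes from these notes; the loopback default, the function names, and the use of a raw socket are assumptions, not a description of how evo's dashboard is built.

```python
import os
import socket


def resolve_dashboard_host(env=None):
    """Return the host to bind, preferring EVO_DASHBOARD_HOST when it is set."""
    env = os.environ if env is None else env
    # Assumption: a loopback default. The release notes do not state evo's default.
    return env.get("EVO_DASHBOARD_HOST", "127.0.0.1")


def bind_listener(host, port=0):
    """Bind a TCP listener on host; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    return sock
```

A listener bound to 127.0.0.1 only accepts connections from the same machine, while 0.0.0.0 accepts them on every interface. That is what a container or remote sandbox generally needs before a forwarded port can reach the service.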

CLI & hooks

  • evo wait gained process / log / GPU probes and a --for ideators selector so the loop can block on proposals.
  • --per-exp-timeout on init with a --timeout per-call override; a PostToolUse hint when the agent starts a long-running command.
  • evo abort now finds the subprocess tree cross-platform (Windows included), so detached benchmark/training children don't survive as orphans.
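
The timeout and abort items above can be pictured with a small sketch: a per-call timeout that overrides an init-time default, and a kill that targets the whole process tree instead of only the direct child. This is an illustration under stated assumptions, not evo's code. The function names are hypothetical, and the approach shown (a new session plus `os.killpg` on POSIX, `taskkill /T /F` on Windows) is one common technique that evo may or may not use.

```python
import os
import signal
import subprocess
import sys


def effective_timeout(per_exp_timeout, call_timeout=None):
    """A per-call value wins over the init-time default when it is given."""
    return per_exp_timeout if call_timeout is None else call_timeout


def run_with_tree_kill(cmd, per_exp_timeout, call_timeout=None):
    """Run cmd; on timeout, terminate the whole process tree and return None."""
    timeout = effective_timeout(per_exp_timeout, call_timeout)
    if os.name == "nt":
        proc = subprocess.Popen(cmd, creationflags=subprocess.CREATE_NEW_PROCESS_GROUP)
    else:
        # A new session makes the child a process-group leader (pgid == pid).
        proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        if os.name == "nt":
            # taskkill /T walks the child tree; /F forces termination.
            subprocess.run(["taskkill", "/PID", str(proc.pid), "/T", "/F"], check=False)
        else:
            # Signal the whole group so grandchildren are terminated too.
            os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()  # reap the direct child so it does not linger as a zombie
        return None
```

One limitation of this sketch: on POSIX, a descendant that starts its own session leaves the group and will not receive the signal, so a production tool would likely need to walk the process table as well.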

Integrity & config

  • task-skills config: discover resolves category skills and agents load them on demand.
  • Literature research is required before the first experiment; training on the benchmark set is banned.

Fixes

  • hook-drain staging honors CLAUDE_CONFIG_DIR and from-path installs (fixes the SessionStart exit-127 warning).

Install

uv tool install evo-hq-cli==0.5.0
evo install claude-code   # or codex / cursor / openclaw / pi

Also published: evo-hq-agent 0.5.0 (PyPI), @evo-hq/evo-agent and @evo-hq/pi-evo 0.5.0 (npm).

Full changelog: v0.4.5...v0.5.0