Releases: bitwise-media-group/evolve
v0.4.0
Immutable release. Only release title and notes can be modified.
Changelog
New Features
- dc0f000: feat(checks): add non-blocking skill-quality signals (@dmccaffery)
- 1250054: feat(report)!: case-major plugin reports and normalized results schema (v5) (@dmccaffery)
- eacd6d0: feat(report): add --migrate to upgrade stored results to the latest schema (@dmccaffery)
- a8d72c8: feat(view): add web report viewer (@dmccaffery)
Bug Fixes
- a09f6a9: refactor(evalspec): split triggers and evals into separate files (@dmccaffery)
Documentation Updates
- 6f27c65: docs: add Authoring skills guide (@dmccaffery)
- e41f31f: docs: add results and reports reference pages (@dmccaffery)
Other Work
- 1104644: ci: migrate workflow callers to github-workflows v3 (@dmccaffery)
v0.3.1
Immutable release. Only release title and notes can be modified.
Changelog
New Features
- 6687789: feat(tui): scroll the dashboard rollup pane (@dmccaffery)
- 34a8885: feat: add tool_call assertion to verify tool and MCP invocations (@dmccaffery)
Documentation Updates
- e2752f7: docs(readme): separate harnesses from providers, add docs CTA (@dmccaffery)
- e560de1: docs: add eval assertions reference and restructure evaluations docs (@dmccaffery)
- 8f7d111: docs: add triggers, behavioral-eval, and execution authoring pages (@dmccaffery)
- a9fc2f0: docs: clarify and split the TUI keyboard-shortcut tables (@dmccaffery)
- 9566ce6: docs: point config reference links at docs/config/index.md (@dmccaffery)
Other Work
- d758a4b: style(docs): keycap shortcuts and brand-tinted inline-code chips (@dmccaffery)
v0.3.0
Immutable release. Only release title and notes can be modified.
Changelog
New Features
- f500d83: feat!: honest eval token reporting and default_models-scoped results (@dmccaffery)
- e1d00b6: feat(cli): add hidden --profile flag for cpu/memory pprof (@dmccaffery)
- 5da5c7c: feat(cli): wire plain-output run commands with selection and sandbox flags (@dmccaffery)
- da8e085: feat(config): generate .evolve config JSON Schema from viper keys (@dmccaffery)
- 50671ce: feat(docs): publish a zensical documentation site (@dmccaffery)
- 3d7782b: feat(provider): add Antigravity (agy) CLI provider (@dmccaffery)
- 5c4cbb7: feat(provider): add GitHub Copilot CLI provider (@dmccaffery)
- b18a336: feat(results): record eval assertion counts and auto-upgrade older schemas (@dmccaffery)
- f356f52: feat(run): add --failed flag to select previously-failing triggers/evals (@dmccaffery)
- f81fd3f: feat(run): add --plugin filter and multi-value --skill/--model (@dmccaffery)
- 6595070: feat(run): add Reporter observer and Catalog/Plan/Filter selection seam (@dmccaffery)
- 5f635e5: feat(run): add interleaved per-skill sweep (@dmccaffery)
- 4933dfb: feat(run): baseline benchmarks, run history, and regression/improvement deltas (@dmccaffery)
- 8468d80: feat(run): per-case trigger/eval selection with preselection reasons (@dmccaffery)
- 6167995: feat(run): retain run-scoped workspaces and surface output/log paths (@dmccaffery)
- ee97c6d: feat(runner): sandbox agent runs to protect source repositories (@dmccaffery)
- cecd01e: feat(telemetry): OpenTelemetry instrumentation and --telemetry-dir flag (@dmccaffery)
- ba8ee34: feat(tui): drive the selection form from a stateful plan.Session (@dmccaffery)
- c4c4ce8: feat(tui): interactive selection form and live run dashboard (@dmccaffery)
- ad6c545: feat(tui): reclaim selection-form space and mirror dashboard styling (@dmccaffery)
- 46e4e08: feat(tui): render the EVOLVE wordmark in per-letter colour (@dmccaffery)
- ad2d01b: feat(tui): rework the run dashboard with a navigable execution tree (@dmccaffery)
- 4024d6c: feat(tui): seed queued rows from prior results and update them live (@dmccaffery)
- 78e19a8: feat(tui): show prior results for cases a partial run does not re-run (@dmccaffery)
- fbc3e49: feat(tui): tint pane headings with each pane's accent color (@dmccaffery)
- 82f9863: feat(tui): upgrade to bubbletea v2 and recolor to cyberdream (@dmccaffery)
- 8fda8f4: feat: add --modified flag to rerun cases whose content changed (@dmccaffery)
- 1162049: feat: raise default max_turns from 10 to 20 (@dmccaffery)
Bug Fixes
- 276d7fa: fix(make): build golangci-lint before linting (@dmccaffery)
- 126fa16: fix(plan): restrict both tiers when a skill is queued via one (@dmccaffery)
- 46ab5e2: fix(provider)!: replace Cursor models with Composer 2.5 (@dmccaffery)
- 360849d: fix(provider): disable agent CLIs' own OS sandbox under evolve's (@dmccaffery)
- 50b6c06: fix(provider): drop the antigravity Claude Sonnet 4.6 model (@dmccaffery)
- 8622320: fix(provider): split codex cached input tokens off the headline input figure (@dmccaffery)
- 62ef48b: fix(run): don't pre-select unfillable units under --new (@dmccaffery)
- 2fff1a8: fix(run): serialize PlainReporter writes for concurrent use (@dmccaffery)
- 912a0a0: fix(run): surface claude stdout errors on the runtime-error path (@dmccaffery)
- 67d84ee: fix(tui): spin only running rows; mark queued rows pending (@dmccaffery)
- 5ae2cff: fix: strip ANSI escape sequences from captured execution output (@dmccaffery)
- cb24c5e: refactor(configdoc): emit the config reference as an embeddable fragment (@dmccaffery)
- 1afe7ea: refactor(plan): own run planning in one package the form, engine, and dashboard share (@dmccaffery)
- aadbfaf: refactor(provider)!: split provider into harness, provider, and model (@dmccaffery)
- 27c5e17: refactor: drop leftover default_models references (@dmccaffery)
Documentation Updates
- 82afacc: docs: document the interactive TUI with a dashboard screenshot (@dmccaffery)
- 79b74ce: docs: regenerate CLI and config reference (@dmccaffery)
Other Work
- dd920b0: ci: checkout repo with lfs for docs (@dmccaffery)
- 6f95bb8: ci: emit JUnit test results for Codecov upload (@dmccaffery)
- a25cdf7: ci: migrate github-workflows callers to v2 (@dmccaffery)
- 8743990: perf(tui): render only on-screen dashboard rows (@dmccaffery)
- 67c78d9: style: format generated config examples (@dmccaffery)
- 54bb9b2: style: reformat DESIGN.md with prettier (@dmccaffery)
v0.2.1
Immutable release. Only release title and notes can be modified.
Changelog
New Features
- e08452b: feat(evals): surface runtime failures and isolate token-counting credentials (@dmccaffery)
- 1003ac9: feat(provider): accept EVOLVE_CLAUDE_CODE_OAUTH_TOKEN for token counting (@dmccaffery)
Documentation Updates
- c6ff572: docs: add SECURITY.md and CodeQL code-scanning triage reports (#8) (@dmccaffery)
Other Work
- e7ea202: ci: adopt org reusable workflows and add fast-forward merge family (#11) (@dmccaffery)
- b59ec3f: ci: allow manual release via workflow_dispatch (@dmccaffery)
v0.2.0
Immutable release. Only release title and notes can be modified.
Changelog
New Features
- 6ed3dd7: feat(evalspec)!: adopt the skill-creator eval superset (#4) (@dmccaffery)
Other Work
- ca6ab8a: ci(release): build the release PR only after the release tag exists (@dmccaffery)
- eb27d3b: ci: add CodeQL analysis and code coverage (#6) (@dmccaffery)
v0.1.0
Immutable release. Only release title and notes can be modified.
Changelog
New Features
- 29c7d13: feat(checks)!: make the SKILL.md license rule opt-in
- 55bb90d: feat(release): group release notes by kind with author credit
- 5b1ab4b: feat(release): windows builds, cosign signing, homebrew cask, attestations
- fbe89a3: feat(runner): split process-tree kill into per-platform files
- ae3c896: feat: add evolve, a CLI that evaluates coding-agent plugins
Documentation Updates
- e626cfc: docs: add README, design notes, and generated reference docs