Skip to content

v3.10.5: CLI fleet commands fixed, prompt caching, adversarial QCSD reviews, qe-arena

Choose a tag to compare

@proffesor-for-testing proffesor-for-testing released this 10 Jun 12:31
· 33 commits to main since this release

What's New

Your CLI fleet commands work now. aqe memory store/get/search, aqe agent list, aqe task, aqe team, and aqe pipeline previously always failed with "Fleet not initialized" — fixed; they work from the terminal for the first time.

Cheaper LLM calls. Prompt caching was a dead flag — the Anthropic provider now sets a real cache breakpoint (cached input bills at ~10% of full price) and the cost tracker accounts for cache reads/writes honestly.

Cleaner self-learning under reasoning-heavy models. Extended-thinking scratchpads are scrubbed before trajectories are stored or embedded, so your pattern bank learns what agents did, not how they deliberated. Fresh installs no longer silently skip post-task learning, and stores created by older versions are migrated in place.

Runaway-loop protection. If the same Bash command fails 3× in a row the hooks warn; at 5× they recommend a block with a recovery hint (AQE_STRICT_TOOL_LOOP=1 makes it a hard deny).

Adversarially verified QCSD reviews. The development-phase swarm runs as a deterministic workflow: every finding faces 3 blind refuters, and only confirmed findings reach your report — killed findings are kept with their refutations for audit.

aqe arena — test strategies that compete. Tournaments score strategies on real mutation kill rate and coverage, reproducible byte-for-byte under --seed:

aqe arena run --strategies 4 --target fixtures/arena-demo --seed 42

Ready for nested subagents. Anthropic announced depth-5 nested subagents; AQE records spawn provenance today and ships a probe script that tells you the moment the capability goes live.

Getting Started

npx agentic-qe init --auto

See CHANGELOG and the verification records on #520 for full details (ADR-099 through ADR-104).