v3.11.3: Oracle evals, durable-first tests & the conservation layer

@proffesor-for-testing released this 28 Jun 06:56
2196647

What's New

Tests and evals you can actually trust. AQE now grades a generated test by running it, both against your real code (where it must pass) and against deliberately broken versions (which it must catch), instead of checking for keywords. Test generation leads with durable tests (invariants, contracts, properties) that survive a rewrite, and quality reports add a mutation-score / regenerability signal next to coverage.
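To make the "must pass on real code, must catch broken code" idea concrete, here is a minimal sketch of that grading rule. It is not AQE's implementation: `gradeTest`, `OracleVerdict`, the toy `realAdd` function, and its two hand-written mutants are all hypothetical names invented for illustration, and it assumes a test is accepted only when it catches every mutant.

```typescript
// Illustrative sketch only; none of these names come from AQE itself.
type Impl = (a: number, b: number) => number;
// A generated test receives an implementation and throws when it fails.
type GeneratedTest = (impl: Impl) => void;

interface OracleVerdict {
  passesOnReal: boolean;  // the test must pass against the real code
  mutantsCaught: number;  // broken versions on which the test failed
  mutantsTotal: number;
  accepted: boolean;      // assumption: accept only if every mutant is caught
}

// Returns true when the test runs to completion without throwing.
function runsClean(candidate: GeneratedTest, impl: Impl): boolean {
  try {
    candidate(impl);
    return true;
  } catch (_err) {
    return false;
  }
}

function gradeTest(
  candidate: GeneratedTest,
  real: Impl,
  mutants: Impl[],
): OracleVerdict {
  const passesOnReal = runsClean(candidate, real);
  // A mutant counts as caught when the test fails against it.
  const mutantsCaught = mutants.filter((m) => !runsClean(candidate, m)).length;
  return {
    passesOnReal,
    mutantsCaught,
    mutantsTotal: mutants.length,
    accepted: passesOnReal && mutantsCaught === mutants.length,
  };
}

// Toy subject under test and two deliberately broken versions of it.
const realAdd: Impl = (a, b) => a + b;
const mutantAdds: Impl[] = [(a, b) => a - b, (a, b) => a * b];

// A weak test: it mentions the right function but checks almost nothing.
const weakTest: GeneratedTest = (impl) => {
  if (impl(0, 0) !== 0) throw new Error("add(0, 0) should be 0");
};

// A stronger test: its inputs distinguish addition from the mutants.
const strongTest: GeneratedTest = (impl) => {
  if (impl(2, 3) !== 5) throw new Error("add(2, 3) should be 5");
  if (impl(0, 4) !== 4) throw new Error("add(0, 4) should be 4");
};

console.log(gradeTest(weakTest, realAdd, mutantAdds));
console.log(gradeTest(strongTest, realAdd, mutantAdds));
```

In this sketch the weak test passes on the real code but catches neither mutant, so it is rejected; the stronger test catches both and is accepted. The ratio `mutantsCaught / mutantsTotal` is one simple way to read a mutation score, though the release does not say how AQE computes its own signal.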

A new conservation guard keeps AQE's own CLI commands, output formats, dashboard API, and skill trees stable so your scripts don't break silently. Deliberate changes go through a deprecation policy and are announced first.
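One way to picture such a guard is a comparison between a recorded baseline of the public surface and the current one, flagging anything that vanished without an announced deprecation. The sketch below is an assumption about the general technique, not the mechanism described in ADR-114: `findSilentRemovals` and every command name other than `init` are hypothetical.

```typescript
// Illustrative sketch only; not AQE's conservation-layer code.
// Returns baseline entries that are missing from the current surface
// and were never announced as deprecated.
function findSilentRemovals(
  baseline: string[],
  current: string[],
  announcedDeprecations: string[],
): string[] {
  const stillPresent = new Set(current);
  const announced = new Set(announcedDeprecations);
  return baseline.filter(
    (entry) => !stillPresent.has(entry) && !announced.has(entry),
  );
}

// Hypothetical command lists; only "init" appears in these release notes.
const baselineCommands = ["init", "generate", "report"];
const currentCommands = ["init", "generate", "audit"];

// "report" disappeared with no announcement, so the guard flags it.
const silent = findSilentRemovals(baselineCommands, currentCommands, []);
console.log(silent);

// Once the removal is announced, the same change is allowed through.
const allowed = findSilentRemovals(baselineCommands, currentCommands, ["report"]);
console.log(allowed);
```

Additions such as the hypothetical `audit` command are never flagged here, which matches the "additive and bug-fix only" framing of this release: new surface is fine, silent removal is not.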

Plus: cheaper model lanes. Choose a budget or local model for test generation and prove it clears the oracle (npm run providers:health, eval:models, eval:live:*). This release also fixes the OpenRouter and Gemini provider routers.

This release is additive and bug-fix only: no breaking CLI/API changes.

Getting Started

```bash
npx agentic-qe init --auto
```

See the CHANGELOG, oracle-evals plan (ADR-113), conservation-layer policy (ADR-114), and cheaper-model guide for details.