v3.10.6: Pattern Space cross-pollination — evidence labels, safety eval, invariant CI
What's New
A quality review of a community framework surfaced disciplines worth turning inward on AQE itself. Six improvements (ADR-105–110):
- Evidence-class labels on findings: every QE finding now states what kind of truth it is (`EXECUTED`/`STATIC`/`INFERRED`/`CONJECTURE`). Quality gates block only on verified evidence; unverified inference routes to adversarial verification first.
- Behavioral safety eval: the rules that protect your learning database (never `rm` a `.db`, always back up first, verify row counts) are now a real pass/fail gate, tested against the live model tiers under temptation, both as openers and mid-task.
- Shipped-agent invariant CI: `verify:invariants` makes sure shipped agents never silently lose their non-negotiable sections, enforced on every PR and release.
- Benchmark lineage + pre-registered rubrics: benchmark claims are now auditable from the repo alone (rubric committed, and its hash recorded, before any data exists).
- Interaction benchmark for qualitative agents: trust-tier-3 eval infrastructure for agents whose output is prose, with hidden-test ground truth and a non-same-family judge.
- Kept-nulls in the learning loop: when a learned pattern fails, the failure is remembered and weighted by context, so "worked elsewhere" no longer outranks "failed here."
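
To make the evidence-class routing concrete, here is a minimal sketch of how a gate might treat a labelled finding. The type names, the `route` function, and the choice of which classes count as "verified" (`EXECUTED` and `STATIC` here) are assumptions for illustration only; they are not taken from AQE's code or from ADR-105–110, so check the ADRs for the actual rules.

```typescript
// Hypothetical sketch: none of these names are AQE's real API.
type EvidenceClass = "EXECUTED" | "STATIC" | "INFERRED" | "CONJECTURE";

interface Finding {
  id: string;
  evidence: EvidenceClass;
}

type Route = "block" | "adversarial-verification";

// ASSUMPTION: EXECUTED and STATIC are treated as verified evidence.
// The release notes do not say where the line is drawn.
const VERIFIED: ReadonlySet<EvidenceClass> = new Set<EvidenceClass>([
  "EXECUTED",
  "STATIC",
]);

// Verified findings may block the gate; everything else is sent to
// adversarial verification before it is allowed to block anything.
function route(finding: Finding): Route {
  return VERIFIED.has(finding.evidence) ? "block" : "adversarial-verification";
}

const findings: Finding[] = [
  { id: "flaky-login-test", evidence: "EXECUTED" },
  { id: "possible-race-in-cache", evidence: "INFERRED" },
];

for (const f of findings) {
  console.log(`${f.id} -> ${route(f)}`);
}
```

The point of the design, as the notes describe it, is that an unverified inference never fails a build directly; it first has to survive an attempt to disprove it.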
Getting Started
```bash
npx agentic-qe init --auto
```
See CHANGELOG and release notes for full details.