
v3.10.6: Pattern Space cross-pollination — evidence labels, safety eval, invariant CI


@proffesor-for-testing released this 11 Jun 17:37 · 10 commits to main since this release · d8745cf

What's New

A quality review of a community framework surfaced disciplines worth turning inward on AQE itself. Six improvements (ADR-105–110):

  • Evidence-class labels on findings — every QE finding now says what kind of truth it is (EXECUTED / STATIC / INFERRED / CONJECTURE). Quality gates block only on verified evidence; unverified inference routes to adversarial verification first.
  • Behavioral safety eval — the rules that protect your learning database (never rm a .db, always back up first, verify row counts) are now a real pass/fail gate, tested against the live model tiers under temptation — as openers and mid-task.
  • Shipped-agent invariant CI: `verify:invariants` makes sure shipped agents never silently lose their non-negotiable sections, enforced on every PR and release.
  • Benchmark lineage + pre-registered rubrics — benchmark claims are now auditable from the repo alone (rubric committed, and its hash recorded, before any data exists).
  • Interaction benchmark for qualitative agents — trust-tier-3 eval infrastructure for agents whose output is prose, with hidden-test ground truth and a non-same-family judge.
  • Kept-nulls in the learning loop — when a learned pattern fails, the failure is remembered and weighted by context, so "worked elsewhere" no longer outranks "failed here."
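
To make the evidence-class gating rule concrete, here is a minimal sketch of how a gate might treat labelled findings. The type names, the `evaluateGate` function, and the choice to count EXECUTED and STATIC as "verified" are all assumptions made for illustration; the release notes do not say where AQE draws that line, and none of this is taken from the AQE codebase.

```typescript
// Hypothetical sketch: names and the verified/unverified split are assumptions.
type EvidenceClass = "EXECUTED" | "STATIC" | "INFERRED" | "CONJECTURE";

interface Finding {
  id: string;
  evidence: EvidenceClass;
}

interface GateResult {
  blocked: boolean;      // true if any verified finding is present
  blocking: string[];    // ids of findings that block the gate
  toVerify: string[];    // ids routed to adversarial verification first
}

// Assumption: only EXECUTED and STATIC evidence counts as verified.
const IS_VERIFIED: Record<EvidenceClass, boolean> = {
  EXECUTED: true,
  STATIC: true,
  INFERRED: false,
  CONJECTURE: false,
};

function evaluateGate(findings: Finding[]): GateResult {
  const blocking: string[] = [];
  const toVerify: string[] = [];
  for (const f of findings) {
    // Verified evidence may block; anything else is queued for verification.
    if (IS_VERIFIED[f.evidence]) {
      blocking.push(f.id);
    } else {
      toVerify.push(f.id);
    }
  }
  return { blocked: blocking.length > 0, blocking, toVerify };
}
```

Under these assumptions, a run containing only INFERRED or CONJECTURE findings never blocks the gate; those findings are queued for verification instead.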

Getting Started

```bash
npx agentic-qe init --auto
```

See CHANGELOG and release notes for full details.