Release v3.9.31: Self-Learning Loop — Kernel-Side Dream Cycles, Three-Signal Routing · proffesor-for-testing/agentic-qe

What's New

The QE fleet's self-learning loop now closes end-to-end. Previously the pattern catalog would converge on whichever pattern won the first routing decision, hooks would never trigger dream cycles, and the Q-learning table filled up with data nothing consumed. This release fixes all of that.

Headline features

aqe learning loop-health — new operator dashboard. Shows liveness of the three pipeline components (CapturedExperienceBridge, LearningConsolidationWorker, DreamScheduler) plus a 7-day routing-diversification view (exploit/explore counts, avg quality per bucket, avg mincut multiplier, avg Q-weight). One command to know if the loop is closing.
Three-signal agent routing (ADR-095). Routing decisions now blend three signals: static scoring (existing), a Q-value bonus from rl_q_values (sigmoid-normalized, ramps from 0 to 0.4 weight over 20 visits per state×agent), and crypto-random ε-greedy exploration (default 5%) gated by a mincut safety multiplier that dampens exploration 5x when swarm topology is critical. Set AQE_ROUTER_EXPLORATION_RATE=0 to disable exploration as a rollback knob.
Kernel-side dream cycles (ADR-094). Dream cycles used to run inside short-lived hook subprocesses, holding the SQLite write transaction for up to 10 seconds and losing errors to stderr. They now run inside the long-lived kernel via the DreamScheduler. Hook subprocesses keep only the cheap experience-counter bump. A new boundary-enforcement test in CI prevents future regressions.
aqe init --auto daemon script now probes for globally-installed agentic-qe directly via require.resolve, so the daemon pidfile points at the real MCP server (not the npx wrapper that exits immediately).

Reliability fixes

dream_insights table now gets a retention sweep on every worker tick (30-day default; applied insights stay forever as audit trail).
Bridge cursor is monotonic — cross-process races can't regress the cursor and cause duplicate event re-publication.
qe_pattern_usage audit table is now structurally consistent with qe_patterns.usage_count (single shared writer).
Pre-task bridge now writes regardless of pattern match count, so Q-learning gets signal for low-confidence prompts.

Closes

#480 (post-edit dream trigger)
#486 (mineExperiences auto-trigger + qe_pattern_usage dual-write)
#487 (Q-learning starved for low-confidence prompts)
#488 (production-readiness blind spots — Phases 1-4)

Getting Started

```bash
npx agentic-qe@3.9.31 init --auto
aqe learning loop-health
```

See the CHANGELOG and v3.9.31 release notes for full details. Architecture decisions documented in ADR-094 and ADR-095.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

v3.9.31: Self-Learning Loop — Kernel-Side Dream Cycles, Three-Signal Routing

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

What's New

Headline features

Reliability fixes

Closes

Getting Started

Uh oh!