v0.9.22
[0.9.22] - 2026-03-11
Added
- Reflection feedback loop: Watches for repeated tool errors with the same semantic signature, invokes the LLM as a diagnostician to analyze root cause and suggest a fix, persists the learning as an
ErrorSolution, and verifies on next successful call. - Evidence state tracking: Records concrete observations per turn, supports pre-execution evidence gates blocking premature writes, and provides contradiction detection.
- Execution state machine: Selects a budget tier at turn start, tracks plan version, idempotency keys, and background-handoff flags, and can suspend its own budget for a force-text closeout.
- Validation state: Records matched success criteria, failed checks, and exposes helpers for building structured step results and blocker payloads.
- Pre-execution planning and critique gate: LLM-driven planning pass before the first tool call generates a
PlanState, a critique pass checks it, and both are logged as decision events. - Execution replay notes and diagnose surfacing:
LearningContextaccumulatesReplayNoteentries during a turn;DiagnoseToolbuilds anExecutionReplaySummaryfrom decision-point events. - Execution checkpoint in context window: When history is trimmed, an
[SYSTEM] EXECUTION CHECKPOINTmessage is injected summarising the active request, completed work, and latest evidence. - Scheduled run health tracking: Persisted in
scheduled_run_stateand used for budget auto-extension decisions. - New system directives:
GlobalDailyBudgetAutoExtended,EvidenceGroundingRequired,StructuredToolResultSynthesis,ReflectionDiagnosis. - Execution failure taxonomy: Errors classified as
ToolContractFailure,ToolInvocationFailure,EnvironmentFailure, orLogicFailurefor targeted recovery coaching. - Executor handoff for CLI agents: CLI agent invocations acting as task executors now persist
ExecutorHandoffcontext and write structuredExecutorStepResultback to the task. - Resume checkpoint reconstructs execution snapshot:
build_resume_checkpoint()readsExecutionStateSnapshotdecision events to reconstruct last known execution state.
Changed
- Stopping phase integrates execution and validation state for budget decisions
- Scheduled run auto-extension uses health metrics
- Repetitive call guard context-aware for API tools
- Result learning returns semantic failure info
- Web tool budget caps raised
- Read-saturation thresholds raised
- Force-text closeout prohibits future-tense promises
Fixed
- Repeated API call guard falsely claimed requests were "blocked"
manage_memoriesschedule actions called withoutgoal_id- Session memory contaminated by non-owner traffic
- Scheduled run budget extended despite unproductive runs
- CLI agent delegation lost task context on failure
- Execution state not recoverable after crash/restart
Full Changelog: v0.9.21...v0.9.22
Full Changelog: v0.9.21...v0.9.22