Skip to content

v0.9.22

Choose a tag to compare

@davo20019 davo20019 released this 12 Mar 08:56
· 190 commits to master since this release

[0.9.22] - 2026-03-11

Added

  • Reflection feedback loop: Watches for repeated tool errors with the same semantic signature, invokes the LLM as a diagnostician to analyze root cause and suggest a fix, persists the learning as an ErrorSolution, and verifies on next successful call.
  • Evidence state tracking: Records concrete observations per turn, supports pre-execution evidence gates blocking premature writes, and provides contradiction detection.
  • Execution state machine: Selects a budget tier at turn start, tracks plan version, idempotency keys, and background-handoff flags, and can suspend its own budget for a force-text closeout.
  • Validation state: Records matched success criteria, failed checks, and exposes helpers for building structured step results and blocker payloads.
  • Pre-execution planning and critique gate: LLM-driven planning pass before the first tool call generates a PlanState, a critique pass checks it, and both are logged as decision events.
  • Execution replay notes and diagnose surfacing: LearningContext accumulates ReplayNote entries during a turn; DiagnoseTool builds an ExecutionReplaySummary from decision-point events.
  • Execution checkpoint in context window: When history is trimmed, an [SYSTEM] EXECUTION CHECKPOINT message is injected summarising the active request, completed work, and latest evidence.
  • Scheduled run health tracking: Persisted in scheduled_run_state and used for budget auto-extension decisions.
  • New system directives: GlobalDailyBudgetAutoExtended, EvidenceGroundingRequired, StructuredToolResultSynthesis, ReflectionDiagnosis.
  • Execution failure taxonomy: Errors classified as ToolContractFailure, ToolInvocationFailure, EnvironmentFailure, or LogicFailure for targeted recovery coaching.
  • Executor handoff for CLI agents: CLI agent invocations acting as task executors now persist ExecutorHandoff context and write structured ExecutorStepResult back to the task.
  • Resume checkpoint reconstructs execution snapshot: build_resume_checkpoint() reads ExecutionStateSnapshot decision events to reconstruct last known execution state.

Changed

  • Stopping phase integrates execution and validation state for budget decisions
  • Scheduled run auto-extension uses health metrics
  • Repetitive call guard context-aware for API tools
  • Result learning returns semantic failure info
  • Web tool budget caps raised
  • Read-saturation thresholds raised
  • Force-text closeout prohibits future-tense promises

Fixed

  • Repeated API call guard falsely claimed requests were "blocked"
  • manage_memories schedule actions called without goal_id
  • Session memory contaminated by non-owner traffic
  • Scheduled run budget extended despite unproductive runs
  • CLI agent delegation lost task context on failure
  • Execution state not recoverable after crash/restart

Full Changelog: v0.9.21...v0.9.22

Full Changelog: v0.9.21...v0.9.22