Skip to content

v0.23.0

Choose a tag to compare

@github-actions github-actions released this 10 Jun 06:58
· 7 commits to main since this release

[0.23.0] - 2026-06-10

Schema Compatibility

  • Two new schemas added (backward compatible, additive only):
    golden-trace-case (.harness/schemas/golden-trace-case.schema.json) and
    structural-test-result (.harness/schemas/structural-test-result.schema.json).
    Both are registered in the stable schema policy (now 17 contracts). No
    existing schema changed.

Added

  • migrate-settings command (alias migrate-config): repairs older
    installs — moves legacy harness.config.json, restores the
    Edit/Write/MultiEdit baseline permissions, merges unmerged hooks and
    compiled permission hints into .claude/settings.json. Doctor now fails
    with the remediation command when it detects this drift.
  • Golden-trace regression gate (check:golden-traces): 7 recorded cases
    replay deterministic harness surfaces hermetically and exit 2 naming the
    case and assertion when behavior regresses; wired into the generated CI
    workflow.
  • Structural-test --json on all six language adapters (TypeScript,
    Python, Go, Rust, Swift, Kotlin) emitting one structural-test-result
    object for dashboards and PR annotations.
  • Loop detection (PostToolUse): repeated edits to one file inside a
    window inject a doom-loop warning and a loop_detected telemetry row.
  • Symbol index (symbol-index.mjs): where-is-X navigation to file:line
    across 8 languages with incremental PostToolUse updates.
  • Context budget (context-budget.mjs + contextBudget.totalChars):
    explicit per-source allocation for SessionStart context with line-boundary
    truncation and a usage report.
  • Observational memory (observation-compress.mjs): PreCompact and
    SessionEnd compress transcripts into dense observations; SessionStart
    re-injects the three most recent.
  • MCP skill server (mcp-skill-server.mjs): dependency-free MCP stdio
    server exposing list_skills/load_skill to any MCP client.
  • Terrain-correct value-prop benches: eval tasks 05–07 (weak-model,
    greenfield, adversarial) plus docs/benchmarks/value-prop-protocol.md
    replacing the 1-shot A/B design that returned five consecutive nulls.
  • Doctor checks for eslint-plugin-boundaries presence/version and for
    settings drift (baseline permissions, unmerged hooks).

Fixed

  • Codex advisor runner retries transient failures (timeout, non-zero
    exit) with jittered backoff and records each attempt in the runtime-proof
    JSONL; a hung Codex CLI no longer blocks the task until manual recovery.
  • Mistargeted advisor decisions are rejected: any decision whose
    taskId differs from the active task fails instead of being persisted
    into the active task's decision path.
  • Telemetry rotation is concurrency-safe: mkdir-as-lock with PID-unique
    temp files and stale-lock recovery; concurrent rotations no longer clobber
    each other's snapshot.
  • Task contracts with typo'd layer names now get a denial naming the
    valid layers from the domains config instead of a confusing scope message.
  • Advisor decision writes are proof-checked at write time: a decision
    referencing a missing or invented runtime proof is denied by the
    PreToolUse guard with contextual guidance instead of failing later at the
    Stop gate.
  • SubagentStop structural checks are cached on a tree fingerprint, so
    unchanged trees skip the workspace-wide rerun (failed runs are never
    cached).

Changed

  • src/core/upgrade.mjs split into phase modules under src/core/upgrade/
    behind an unchanged re-export facade.

Full history: CHANGELOG.md
Install: npx agent-harness-kit@0.23.0 init