v0.18.0

github-actions released this 30 May 03:39

[0.18.0] - 2026-05-30

Schema Compatibility

  • Added a machine-readable schema policy and checker that require durable
    harness schemas to declare compatibility, migration, deprecation, and
    changelog rules before 1.0.
  • Added stable schemas for permission compile output, explain block records,
    runtime parity reports, and state export manifests; runtime parity JSON now
    carries schemaVersion: 1.
  • Explain block records now accept mode: "bypass" and an optional
    fingerprint field for bypass-audit diagnostics.
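
A minimal sketch of how a consumer might check an explain block record against the rules above. Only mode: "bypass" and the optional fingerprint field come from these notes. The validateExplainBlock helper, the reuse of the explain mode names as record modes, and the requirement that fingerprint be a non-empty string are assumptions for illustration, not the published schema.

```javascript
// Hypothetical validator; the real schema ships with the kit and may differ.
// Mode names other than "bypass" are assumed from the explain modes in these notes.
const KNOWN_MODES = ["last-block", "task", "permission", "evidence", "readiness", "bypass"];

function validateExplainBlock(record) {
  const errors = [];
  if (!KNOWN_MODES.includes(record.mode)) {
    errors.push(`unknown mode: ${record.mode}`);
  }
  // fingerprint is optional; when present it is assumed to be a non-empty string.
  if (
    record.fingerprint !== undefined &&
    (typeof record.fingerprint !== "string" || record.fingerprint.length === 0)
  ) {
    errors.push("fingerprint must be a non-empty string when present");
  }
  return errors;
}
```

An empty array means the record passed these illustrative checks; any other result lists the reasons it did not.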

Documentation

  • Added generated guides for policy pack authoring, runtime parity scorecards,
    and team/PR adoption.
  • Documented the Codex native hook-fire probe and the
    AHK_E2E_CODEX_REQUIRE_HOOKS=1 release switch for making missing lifecycle
    hook artifacts blocking.
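
The effect of the release switch can be sketched as a small decision function. The AHK_E2E_CODEX_REQUIRE_HOOKS name comes from these notes; the hookProbeSeverity helper, its return values, and the choice to accept only the exact value "1" are assumptions, not the kit's actual implementation.

```javascript
// Hypothetical helper: decide how a missing lifecycle hook artifact is reported.
// `env` is passed in (for example process.env) so the logic stays testable.
function hookProbeSeverity(artifactsFound, env) {
  if (artifactsFound) return "pass";
  // Assumption: only the exact string "1" opts in to blocking behavior.
  return env.AHK_E2E_CODEX_REQUIRE_HOOKS === "1" ? "fail" : "warn";
}
```

Passing the environment as an argument keeps the default (warning-only) and release (blocking) paths easy to exercise side by side.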

Runtime Parity

  • Added codex-parity-probes.mjs and npm run check:codex-parity-probes
    as the release wrapper for strict Codex hook and reviewer artifact probes.
  • The strict Codex parity wrapper now emits per-probe JSON outcomes so release
    logs identify whether hook-fire or reviewer-artifact parity failed, passed,
    remained planned, or was not reached before the driver failed.
  • The kit repo release gate now validates root-local failure records in
    addition to generated-template failure record fixtures, so observed runtime
    gaps can no longer escape the readiness checks.
  • Runtime parity JSON now includes per-runtime status counts plus category,
    promotion criteria, and next-step metadata for partial capability rows.
  • The real Codex E2E smoke now probes generated SessionStart and SessionEnd
    lifecycle hook artifacts. The probe is non-blocking by default while Codex
    hook-fire parity remains partial, and becomes a hard failure when
    AHK_E2E_CODEX_REQUIRE_HOOKS=1 is set. The smoke now initializes a git
    workspace and captures Codex feature/JSONL diagnostics when lifecycle
    artifacts are missing.
  • The same E2E smoke now probes Codex reviewer decision artifact creation and
    can make it blocking with AHK_E2E_CODEX_REQUIRE_REVIEWER_ARTIFACT=1.
  • Added runtime-parity-report.mjs --fail-partial so release lanes can turn
    partial runtime capabilities into blocking failures without changing the
    default warning-only scorecard behavior.
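
As a rough illustration of per-runtime status counts and the --fail-partial behavior, the sketch below tallies statuses and derives an exit code. The row shape, the status names, and both function names are hypothetical; only the idea that partial rows warn by default and block under --fail-partial is taken from the notes.

```javascript
// Hypothetical row shape: { runtime: "codex", capability: "hook-fire", status: "partial" }.
function countStatuses(rows) {
  const counts = {};
  for (const { runtime, status } of rows) {
    counts[runtime] ??= {};
    counts[runtime][status] = (counts[runtime][status] ?? 0) + 1;
  }
  return counts;
}

// Returns a process exit code: partial rows only block when failPartial is set.
function exitCodeFor(rows, { failPartial = false } = {}) {
  const hasPartial = rows.some((row) => row.status === "partial");
  return failPartial && hasPartial ? 1 : 0;
}
```

Keeping the flag as an explicit option mirrors the described design: the default scorecard stays warning-only, and release lanes opt in to strictness.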

Explain Layer

  • Added agent-harness-kit explain and generated .harness/scripts/explain.mjs
    diagnostics for last-block, task, permission, evidence, and readiness modes.
  • agent-harness-kit explain --bypass <fingerprint> now explains bypass audit
    rows and approved request coverage from the same central diagnostics surface.
  • agent-harness-kit explain --task <id> now points at
    task-evidence-check --task=<id> instead of the Stop-hook-only
    --active-task mode.
  • agent-harness-kit explain --permission <tool> --task <id> no longer reports
    .harness/permissions.json as missing when the task contract supplies the
    relevant allow/deny decision.
  • agent-harness-kit explain --task <id> now reports linked evidence pass-proof
    gaps such as missing diffSummary or UI artifacts, matching
    explain --evidence diagnostics.
  • agent-harness-kit explain now reports repo-escaping evidence paths as unsafe
    path fields instead of mislabeling them as ordinary missing files.
  • Task-scoped permission explanations now keep sourceRule on the task contract
    even when .harness/permissions.json exists but was not part of the decision.
  • agent-harness-kit explain --last-block now skips remediation telemetry such
    as block_remediated so resolved blocks do not hide the last active block.
  • agent-harness-kit explain --last-block now recognizes canonical block
    telemetry events and routes task-evidence or permission-denied blocks to the
    most direct repair command.
  • Generated installs now include .harness/docs/explain.md, documenting JSON
    output, repair commands, and override expectations.
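
The --last-block behavior of skipping remediation telemetry can be sketched as a backward scan over telemetry events. The block_remediated name comes from these notes; the event shape, the "block_" prefix convention, and the other event names are placeholders rather than the kit's canonical telemetry format.

```javascript
// Hypothetical event shape: { event: string, ... }.
// Only `block_remediated` is named in the release notes.
const REMEDIATION_EVENTS = new Set(["block_remediated"]);

// Assumption: block events share a "block_" prefix, so remediation rows
// would be mistaken for blocks unless they are excluded explicitly.
function isBlockEvent(name) {
  return name.startsWith("block_") && !REMEDIATION_EVENTS.has(name);
}

// Walk backward and return the most recent real block, or null if none exists.
function findLastBlock(events) {
  for (let i = events.length - 1; i >= 0; i--) {
    if (isBlockEvent(events[i].event)) return events[i];
  }
  return null;
}
```

Without the exclusion set, a trailing remediation row would be returned instead of the block a user still needs to repair.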

Evidence Attestation

  • Added check-evidence-attestation.mjs as a strict readiness gate for passing
    evidence bundles. It requires attested pass checks with command metadata,
    stdout/stderr sidecar hashes, and a replay plan produced by
    task-evidence-check --verify-hashes --replay-plan.
  • Generated installs now include harness:evidence:attestation,
    evidenceAttestation config, and the evidence-attestation readiness gate.
  • verify-ui summaries now include route, assertion, and DOM snapshot hash
    metadata, and passing UI evidence must carry that browser proof metadata.
  • Generated installs now include .harness/docs/evidence-attestation.md.
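
A minimal sketch of the kind of completeness check a strict attestation gate might run over a bundle. The requirements (attested pass checks, command metadata, stdout/stderr sidecar hashes, a replay plan) follow the notes; every field name here, including the SHA-256 naming, is an assumption and not the real evidence bundle format.

```javascript
// Hypothetical check shape; field names are illustrative only.
function attestationGaps(check) {
  const gaps = [];
  if (check.status !== "pass") return gaps; // the gate described above targets passing checks
  if (!check.attested) gaps.push("not attested");
  if (!check.command) gaps.push("missing command metadata");
  if (!check.stdoutSha256) gaps.push("missing stdout sidecar hash");
  if (!check.stderrSha256) gaps.push("missing stderr sidecar hash");
  return gaps;
}

// Collects gaps for every check, then verifies the bundle carries a replay plan.
function bundleGaps(bundle) {
  const gaps = bundle.checks.flatMap((check, i) =>
    attestationGaps(check).map((gap) => `checks[${i}]: ${gap}`)
  );
  if (!bundle.replayPlan) gaps.push("missing replay plan");
  return gaps;
}
```

This sketch only checks that the metadata is present; verifying that the hashes match the sidecar files would be a separate step.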

Permission Compiler

  • Added task-aware permission compilation, permissions diff, and
    permissions explain decision-chain output for task, skill, and default
    policies.
  • Added check-permissions-drift.mjs and wired generated readiness plus npm
    scripts to use the dedicated drift wrapper.
  • High-risk task contracts now fail permission compilation when they use
    wildcard or broad Bash permissions.
  • Permission compilation now emits runtime hook expectations for Claude and
    Codex, and fails when generated hook matchers drift from the compiled runtime
    contract, including Codex apply_patch mutation coverage.
  • Generated installs now write .harness/permissions.compiled.json during
    rendering and merge compiler-derived Claude permission hints into
    .claude/settings.json.
  • harness-report now runs the permission compiler, surfaces compiled skill and
    task contract counts, and fails release JSON output on compiler-detected
    permission drift or high-risk task permission errors.
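
The high-risk rule above could look roughly like the following check. The contract shape, the risk field, and the exact definition of a "broad" Bash permission are assumptions made for illustration; the compiler's real criteria may be stricter or differently worded.

```javascript
// Hypothetical contract shape: { risk: "high", permissions: { allow: ["Bash(npm test)"] } }.
function isBroadPermission(rule) {
  if (rule === "*" || rule === "Bash") return true; // global wildcard or bare tool name
  const match = /^Bash\((.*)\)$/.exec(rule);
  if (!match) return false;
  const pattern = match[1].trim();
  return pattern === "" || pattern === "*" || pattern === ":*";
}

// Only high-risk contracts are rejected for broad permissions in this sketch.
function compileErrors(contract) {
  if (contract.risk !== "high") return [];
  return contract.permissions.allow
    .filter(isBroadPermission)
    .map((rule) => `high-risk task may not use broad permission: ${rule}`);
}
```

Narrow rules such as a single named command pass through untouched, so the failure points directly at the entries that need tightening.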

Bypass Governance

  • Added the structured bypass request workflow: bypass request,
    bypass audit --strict, and bypass explain now share the same strict audit
    engine used by release readiness.
  • Strict bypass audit now accepts only approved, unexpired request scopes, rejects
    scope mismatches, and requires failure-record links for
    converted-to-failure-record acknowledgements.
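
The strict audit rules above can be sketched as a single function. The three rules (approved and unexpired requests, matching scopes, failure-record links for converted-to-failure-record acknowledgements) come from the notes; the record shapes, field names, and message strings are hypothetical.

```javascript
// Hypothetical shapes:
//   request: { status: "approved", expiresAt: ISO string, scope: string }
//   bypass:  { scope: string, acknowledgement?: string, failureRecord?: string }
function auditBypass(bypass, request, now = new Date()) {
  const problems = [];
  if (!request || request.status !== "approved") {
    problems.push("no approved request");
  } else {
    if (new Date(request.expiresAt) <= now) problems.push("request expired");
    if (request.scope !== bypass.scope) problems.push("scope mismatch");
  }
  if (bypass.acknowledgement === "converted-to-failure-record" && !bypass.failureRecord) {
    problems.push("missing failure-record link");
  }
  return problems;
}
```

Accepting now as a parameter keeps expiry checks deterministic in tests; a caller would normally rely on the default.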

Harness Noise Reporting

  • Added report-harness-noise.mjs and generated npm scripts for ranking noisy
    rules from block telemetry, bypass records, false-positive acknowledgements,
    review latency, and Stop-hook loop-guard activations.
  • Statusline last-block alerts now filter telemetry for real block records, so
    idle or permission prompt notifications do not masquerade as blocking gates.
  • harness-report now embeds harness-noise status so release dashboards expose
    false-positive and override pressure instead of burying it in logs.
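
To make the idea of ranking noisy rules concrete, here is a small scoring sketch. The inputs loosely mirror the signals listed above, but the weights, the field names, and the omission of review latency are all assumptions; the notes do not describe how report-harness-noise.mjs actually scores rules.

```javascript
// Hypothetical weights; the real report's scoring is not described in these notes.
const WEIGHTS = { blocks: 1, bypasses: 3, falsePositives: 5, loopGuards: 2 };

// statsByRule maps a rule name to per-signal counts, e.g. { blocks: 4, bypasses: 1 }.
function rankNoisyRules(statsByRule) {
  return Object.entries(statsByRule)
    .map(([rule, stats]) => ({
      rule,
      score: Object.entries(WEIGHTS).reduce(
        (sum, [key, weight]) => sum + weight * (stats[key] ?? 0),
        0
      ),
    }))
    .sort((a, b) => b.score - a.score);
}
```

Weighting false positives above raw block counts reflects one plausible reading of "noise": a rule that blocks often but correctly is less of a problem than one that is frequently overridden.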

Upgrade

  • Upgrade now reconciles executable bits for unchanged managed scripts recorded
    in the install lockfile, while preserving user-modified sidecar targets.

Eval Tasks

  • The kit repo check:eval-tasks release gate now uses a deterministic Node
    wrapper instead of shell chaining, with aggregate JSON output covering both
    root-local and generated-template eval task directories.
  • Added a package-script regression test that blocks unquoted shell chaining in
    npm scripts, keeping multi-directory gates on deterministic Node wrappers.
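
A regression check for unquoted shell chaining might resemble the sketch below. Which operators the real test rejects is not stated in these notes, so treating &&, ||, and ; as chaining is an assumption, as are both function names.

```javascript
// Returns true when a script string contains &&, ||, or ; outside quotes.
function hasUnquotedChaining(script) {
  let quote = null;
  for (let i = 0; i < script.length; i++) {
    const ch = script[i];
    if (quote) {
      if (ch === "\\" && quote === '"') i++; // skip an escaped char inside double quotes
      else if (ch === quote) quote = null;
      continue;
    }
    if (ch === '"' || ch === "'") {
      quote = ch;
      continue;
    }
    if (ch === ";") return true;
    const pair = script.slice(i, i + 2);
    if (pair === "&&" || pair === "||") return true;
  }
  return false;
}

// Lists the names of package.json scripts that chain commands in the shell.
function chainedScripts(pkg) {
  return Object.entries(pkg.scripts ?? {})
    .filter(([, command]) => hasUnquotedChaining(command))
    .map(([name]) => name);
}
```

A test could load package.json, call chainedScripts, and fail when the returned list is non-empty.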

Failure Learning

  • The kit repo check:failure-records release gate now uses a deterministic
    Node wrapper instead of shell chaining, with aggregate JSON output covering
    both generated-template and root-local failure record directories.
  • check-failure-records now reports the records directory it validated in
    both text and JSON output, making multi-directory release gates easier to
    audit.
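
One way to picture aggregate JSON output across several record directories is sketched below. The per-directory result shape, the aggregate field names, and the placeholder directory labels are hypothetical and do not reflect the actual output of check-failure-records.

```javascript
// Hypothetical per-directory result: { directory: string, ok: boolean, errors: string[] }.
function aggregateResults(results) {
  return {
    ok: results.every((result) => result.ok),
    directories: results.map((result) => result.directory), // which directories were validated
    errorCount: results.reduce((total, result) => total + result.errors.length, 0),
    results,
  };
}
```

Reporting the validated directories alongside the verdict is what makes a multi-directory gate auditable: a passing result that silently skipped a directory would be visible in the output.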

Full history: CHANGELOG.md
Install: npx agent-harness-kit@0.18.0 init