v0.7.0

github-actions released this 21 Jun 01:53

f73036d

Changed

Deepened event-log append handling behind a transaction interface so session start, stop, provider import, manual markers, and filesystem watcher appends share one locked replay-and-append path.
Deepened Provider Evidence handling behind a typed module so Codex and Claude adapters construct the shared event-log shape in one place, while review, session confidence, and watch token baselines read provider commands, results, risk signals, labels, and token totals through one interface.
Refactored replay-safe evidence extraction into internal/evidence so reviewer replay and future verifier-facing replay can reuse deterministic event-derived summary, confidence, risk, gaps, and timeline logic without invoking git commands.
Added artifact-only receipt verification in internal/receipt so bundle and local verification share a single artifact-hash/signature validation path while local checks continue to include workspace diff parity validation.
Documented the production replay evaluator contract in README and docs/replay-evaluator-contract.md, covering verification, trust, quality gates, policy checks, privacy, claims, and outcome semantics.

Added

Added evaluator-loop replay implementation tracking (PLAN.md Step 0).
Added local replay signer trust policy support (PLAN.md Step 2): configuration-level trust.trusted_signer_key_ids, agentreceipt replay --trusted-signer-key-id, and deterministic trust status reporting (trust_status, signer_trusted, policy_valid).
Added replay evaluator scoring signals (PLAN.md Step 4): additive evaluator_signals counters for command activity, risk-relevant command classes, and changed-file category signals (read_command_count, network_command_count, changed_test_file_count, and related fields).
Added replay quality gate evidence (PLAN.md Step 5): top-level quality_gates summarizing command-classified quality checks (format/lint/tests/race_tests/typecheck/security/coverage/build/smoke/verify), failed_command_details for failed commands with redacted outputs and evidence, and command metadata (cwd, time) for richer verifier context.
Added replay patch semantic summaries (PLAN.md Step 6): top-level patch_summary with category counts, additions/deletions, semantic changed-file entries, Go symbol hints, and test/production relationship signals for final patch review.
Added replay policy checks and review focus prompts (PLAN.md Step 7): top-level policy_checks with deterministic pass/fail/warn/not_applicable/unknown statuses, and review_focus prompts synthesized from verification gaps, quality gates, patch summary, policy checks, and failed commands.
Added replay privacy reporting, claim confidence, and outcome classification (PLAN.md Step 8): top-level privacy redaction metadata, claims for verification/authenticity/trust/gates/policies/outcome, and outcome states for completed, completed_with_gaps, failed, abandoned, committed, and needs_human_review sessions.
Added replay implementation progress tracking (PROGRESS.md) and committed the first planning-control milestone for verifier-facing replay work.
Added replay evaluator characterization coverage to ensure replay output does not leak raw provider risk_signals.
Added verifier-facing replay report construction in internal/replay, including command pairing, command risk mapping, evidence gaps, risk-to-evidence references, and artifact hash metadata.
Added agentreceipt replay CLI command to emit machine-readable verifier JSON with required --session validation and JSON output mode.
Added portable replay bundle generation for agentreceipt replay via --bundle, including required artifact packaging, normalized Codex trace copying, and replay.json manifest emission.
Added smoke-level replay coverage for agentreceipt replay JSON and bundle outputs, plus validation that replay requires --session and emits machine-readable output without raw provider logs.
Added replay workflow documentation updates in README and PRD/TECH_SPEC for verifier-only usage, artifact requirements, explicit-session behavior, and privacy constraints.
Added replay acceptance coverage in internal/replay for tampered events.jsonl, manifest.json, receipt.json, and final.patch to keep replay verification invalidation behavior explicit.
Added component-level replay verification fields in verifier output (event_chain_valid, final_patch_hash_valid, manifest_hash_valid, receipt_hash_valid) plus stable signature failure context (signature_error_code) for actionable replay review.
Added factual replay contract and smoke assertions clarifying that agentreceipt replay reports evidence facts only; no policy recommendations or scoring.
Split replay verification output into explicit integrity/authenticity and outcome verdict signals (integrity_valid, authenticity_valid, authenticity_status, overall_verdict, component_results) to support evaluator-safe consumption without overloading valid.
Hardened signer portability for replay verification by ensuring embedded public-key metadata is treated as the canonical path for signature checks and by codifying legacy behavior when signer material is missing (legacy_missing_embedded_signer).
Fixed filesystem watcher shutdown robustness so stale or already-exited watcher processes no longer produce filesystem watcher did not stop cleanly.

Assets 7