v0.7.0
Changed
- Deepened event-log append handling behind a transaction interface so session start, stop, provider import, manual markers, and filesystem watcher appends share one locked replay-and-append path.
- Deepened Provider Evidence handling behind a typed module so Codex and Claude adapters construct the shared event-log shape in one place, while review, session confidence, and watch token baselines read provider commands, results, risk signals, labels, and token totals through one interface.
- Refactored replay-safe evidence extraction into
internal/evidenceso reviewer replay and future verifier-facing replay can reuse deterministic event-derived summary, confidence, risk, gaps, and timeline logic without invoking git commands. - Added artifact-only receipt verification in
internal/receiptso bundle and local verification share a single artifact-hash/signature validation path while local checks continue to include workspace diff parity validation. - Documented the production replay evaluator contract in README and
docs/replay-evaluator-contract.md, covering verification, trust, quality gates, policy checks, privacy, claims, and outcome semantics.
Added
-
Added evaluator-loop replay implementation tracking (
PLAN.mdStep 0). -
Added local replay signer trust policy support (
PLAN.mdStep 2): configuration-leveltrust.trusted_signer_key_ids,agentreceipt replay --trusted-signer-key-id, and deterministic trust status reporting (trust_status,signer_trusted,policy_valid). -
Added replay evaluator scoring signals (
PLAN.mdStep 4): additiveevaluator_signalscounters for command activity, risk-relevant command classes, and changed-file category signals (read_command_count,network_command_count,changed_test_file_count, and related fields). -
Added replay quality gate evidence (
PLAN.mdStep 5): top-levelquality_gatessummarizing command-classified quality checks (format/lint/tests/race_tests/typecheck/security/coverage/build/smoke/verify),failed_command_detailsfor failed commands with redacted outputs and evidence, and command metadata (cwd,time) for richer verifier context. -
Added replay patch semantic summaries (
PLAN.mdStep 6): top-levelpatch_summarywith category counts, additions/deletions, semantic changed-file entries, Go symbol hints, and test/production relationship signals for final patch review. -
Added replay policy checks and review focus prompts (
PLAN.mdStep 7): top-levelpolicy_checkswith deterministic pass/fail/warn/not_applicable/unknown statuses, andreview_focusprompts synthesized from verification gaps, quality gates, patch summary, policy checks, and failed commands. -
Added replay privacy reporting, claim confidence, and outcome classification (
PLAN.mdStep 8): top-levelprivacyredaction metadata,claimsfor verification/authenticity/trust/gates/policies/outcome, andoutcomestates for completed, completed_with_gaps, failed, abandoned, committed, and needs_human_review sessions. -
Added replay implementation progress tracking (
PROGRESS.md) and committed the first planning-control milestone for verifier-facing replay work. -
Added replay evaluator characterization coverage to ensure replay output does not leak raw provider
risk_signals. -
Added verifier-facing replay report construction in
internal/replay, including command pairing, command risk mapping, evidence gaps, risk-to-evidence references, and artifact hash metadata. -
Added
agentreceipt replayCLI command to emit machine-readable verifier JSON with required--sessionvalidation and JSON output mode. -
Added portable replay bundle generation for
agentreceipt replayvia--bundle, including required artifact packaging, normalized Codex trace copying, andreplay.jsonmanifest emission. -
Added smoke-level replay coverage for
agentreceipt replayJSON and bundle outputs, plus validation that replay requires--sessionand emits machine-readable output without raw provider logs. -
Added replay workflow documentation updates in README and PRD/TECH_SPEC for verifier-only usage, artifact requirements, explicit-session behavior, and privacy constraints.
-
Added replay acceptance coverage in
internal/replayfor tamperedevents.jsonl,manifest.json,receipt.json, andfinal.patchto keep replay verification invalidation behavior explicit. -
Added component-level replay verification fields in verifier output (
event_chain_valid,final_patch_hash_valid,manifest_hash_valid,receipt_hash_valid) plus stable signature failure context (signature_error_code) for actionable replay review. -
Added factual replay contract and smoke assertions clarifying that
agentreceipt replayreports evidence facts only; no policy recommendations or scoring. -
Split replay verification output into explicit integrity/authenticity and outcome verdict signals (
integrity_valid,authenticity_valid,authenticity_status,overall_verdict,component_results) to support evaluator-safe consumption without overloadingvalid. -
Hardened signer portability for replay verification by ensuring embedded public-key metadata is treated as the canonical path for signature checks and by codifying legacy behavior when signer material is missing (
legacy_missing_embedded_signer). -
Fixed filesystem watcher shutdown robustness so stale or already-exited watcher processes no longer produce
filesystem watcher did not stop cleanly.