Phase 4: Policy versioning + audit replay #123

@hanwencheng

Context

Policies change. A vendor tightens their default spend cap; a parent loosens family-memory access; a regulation forces a new constraint. Without versioning, every change is destructive — old audit events can't be re-evaluated against the new policy, and there's no "what would have happened" sandbox.

This issue ships policy versioning (every policy update creates an immutable version with a timestamp) plus audit replay (given a time window + target policy version, recompute what the decisions WOULD have been).

Per milestones-roadmap.md §5, this is M4 depth: regulator-grade reproducibility + safe policy iteration for vendors.

Scope (M4)

Policy versioning

  • Every policy update (per-vendor template, per-actor override, system-default) creates a new version
  • Version metadata: policy_id, version_number, timestamp, actor_who_changed_it, change_summary
  • Old versions retained immutably; can be referenced by version number
  • Policy version is recorded on every audit row that the policy applied to
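A minimal sketch of the versioning rules above, assuming an in-memory append-only store. The metadata field names come from this issue; `PolicyStore`, `body`, and `stamp_audit_row` are hypothetical names used only for illustration, and real storage would live in the database.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: a stored version cannot be mutated in place
class PolicyVersion:
    policy_id: str
    version_number: int
    timestamp: datetime
    actor_who_changed_it: str
    change_summary: str
    body: str  # hypothetical: the policy text itself


class PolicyStore:
    """Append-only store: an update adds a version, nothing is overwritten."""

    def __init__(self) -> None:
        self._versions: dict[str, list[PolicyVersion]] = {}

    def update(self, policy_id: str, body: str, actor: str, summary: str) -> PolicyVersion:
        history = self._versions.setdefault(policy_id, [])
        version = PolicyVersion(
            policy_id=policy_id,
            version_number=len(history) + 1,  # versions count up from 1
            timestamp=datetime.now(timezone.utc),
            actor_who_changed_it=actor,
            change_summary=summary,
            body=body,
        )
        history.append(version)
        return version

    def get(self, policy_id: str, version_number: int) -> PolicyVersion:
        # Old versions stay addressable by version number.
        return self._versions[policy_id][version_number - 1]

    def latest(self, policy_id: str) -> PolicyVersion:
        return self._versions[policy_id][-1]


def stamp_audit_row(row: dict, version: PolicyVersion) -> dict:
    """Return a copy of the audit row carrying the policy version that applied."""
    return {**row, "policy_id": version.policy_id, "policy_version": version.version_number}
```

Stamping the version onto each audit row at write time is what lets a later replay or regulator export say which policy text produced the original decision.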

Audit replay endpoint

  • Endpoint: POST /v1/audit/replay { time_window, target_policy_version, actor_omni? }
  • For each event in the window: re-evaluate the policy at the target version + report "what would the decision have been?"
  • Returns: { event_id, original_decision, simulated_decision, divergence: bool, simulated_reason? }
  • Aggregated view: how many events would have flipped under the new policy
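The replay loop can be sketched roughly as follows. The row fields mirror the response shape listed above; the event shape (`event_id`, `decision`, `amount`), the `Evaluator` signature, the decision strings, and `cap_policy` are assumptions for illustration, not the broker's actual policy engine.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

# Hypothetical: a policy evaluator maps one event to (decision, reason).
Evaluator = Callable[[dict], tuple[str, Optional[str]]]


@dataclass(frozen=True)
class ReplayRow:
    event_id: str
    original_decision: str
    simulated_decision: str
    divergence: bool
    simulated_reason: Optional[str] = None


def replay(events: Iterable[dict], evaluate: Evaluator) -> list[ReplayRow]:
    """Re-evaluate each event under the target policy only, and compare."""
    rows = []
    for event in events:
        decision, reason = evaluate(event)
        rows.append(
            ReplayRow(
                event_id=event["event_id"],
                original_decision=event["decision"],
                simulated_decision=decision,
                divergence=decision != event["decision"],
                simulated_reason=reason,
            )
        )
    return rows


def count_flipped(rows: Iterable[ReplayRow]) -> int:
    """Aggregated view: number of events whose decision would have changed."""
    return sum(1 for row in rows if row.divergence)


def cap_policy(cap: int) -> Evaluator:
    """Hypothetical payment-cap policy: amounts above the cap need approval."""
    def evaluate(event: dict) -> tuple[str, Optional[str]]:
        if event["amount"] > cap:
            return "approval_required", f"amount {event['amount']} exceeds cap {cap}"
        return "allow", None
    return evaluate


# Events originally decided under a cap of 500, replayed under a cap of 300.
events = [
    {"event_id": "e1", "amount": 250, "decision": "allow"},
    {"event_id": "e2", "amount": 350, "decision": "allow"},
    {"event_id": "e3", "amount": 600, "decision": "approval_required"},
]
rows = replay(events, cap_policy(300))
flipped = count_flipped(rows)
```

In this toy window only `e2` diverges: it was allowed under the 500 cap and would require approval under 300.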

Use cases

  • Vendor evaluating a stricter policy before deploying it: "If I drop the default payment cap from ¥500 to ¥300, how many devices would have hit approval-required last month?"
  • Parent reviewing "if I had set this limit yesterday, how many requests would have been denied?"
  • Regulator export with policy version stamp on every event — supports compliance reconstruction

Diff report

For a replay run, a diff report shows:

Out of scope (defer)

  • Auto-rollout of new policy versions (M5 — for M4, vendor manually triggers deployment after reviewing the diff)
  • ML-suggested policy changes ("you might want to tighten X based on patterns") — M5
  • Cross-vendor policy comparison (M5 — privacy-bound)
  • Policy version branching ("test policy version 7.2 on actor X only") — M5

Acceptance criteria

Risks

| Risk | Mitigation |
| --- | --- |
| Replay computation is too slow under load | Background job for large replays; UI shows "computing…" with ETA; result cached for re-fetch |
| Policy version storage grows unbounded | Versions are immutable text blobs; cheap to store; retention policy: keep all for 3 years (regulator alignment) |
| Replay's simulated decision differs from actual decision in non-obvious ways (cross-version race conditions) | Replay treats the target policy as the SOLE policy for that run: no half-and-half states; documented |
| Vendor accidentally deploys a policy version that breaks all their devices | Vendor portal requires explicit "Deploy" action after reviewing the diff; rollback to previous version is one-click |
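One way the "result cached for re-fetch" mitigation could work is to key cached results on the replay request itself. The sketch below is an assumption, not a settled design: `replay_cache_key` is a hypothetical helper, and the `time_window` shape (`start`/`end` strings) and an integer `target_policy_version` are guesses at the request format.

```python
import hashlib
import json
from typing import Optional


def replay_cache_key(
    time_window: dict,
    target_policy_version: int,
    actor_omni: Optional[str] = None,
) -> str:
    """Derive a stable cache key from the replay request parameters."""
    payload = {
        "time_window": time_window,
        "target_policy_version": target_policy_version,
        "actor_omni": actor_omni,
    }
    # Canonical JSON (sorted keys, fixed separators) so that equal requests
    # hash identically regardless of the key order they arrived in.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because policy versions are immutable, a key like this stays valid for as long as the set of audit events in the window does not change, which would hold for windows that end in the past.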

References

Effort

~1-2 weeks. Sequencing:

  1. (Days 1-3) Policy versioning storage + version metadata + audit-row version stamp
  2. (Days 3-6) Audit replay endpoint + decision re-evaluation engine
  3. (Days 6-9) Diff report aggregation + UI rendering (extends Phase 2: Audit dashboard (two-tier visible: real-time feed + chain anchor) #115)
  4. (Days 9-11) Performance pass + caching + background-job pattern
  5. (Days 11-14) Regulator export updates + acceptance tests

Pickup notes for the next agent / developer

Labels

  • area/audit: Audit worker, two-tier audit (off-chain feed + on-chain anchor)
  • area/broker: Broker server, cap-token issuance, OIDC issuance
