Reduce repeated high-token auto-review runs on stale snapshots #375

@shiny-code-bot

Problem

Auto-review has been spending substantial tokens on repeated reviews of detached or stale snapshots. In cbusillo/codex-skills, the auto-review ledger reported:

  • tokens=2036776t across recent runs
  • high_burn=5
  • run 168d4bd0-e694-4023-b7f6-1188095bae06: tokens=1467380t, findings later verified stale against current main
  • run d494a552-6cf2-40b0-963c-dc1effbd4a24: tokens=190756t, useful but reviewing a detached snapshot rather than a clearly scoped active target

This makes stale review prompts costly and increases noise around readiness decisions.

Desired Behavior

Reduce token burn for repeated or stale auto-review contexts.

Possible approaches:

  • Cache review context and reuse summaries when the target snapshot has not changed.
  • Detect when a finding's snapshot is no longer active and avoid launching another full review unless requested.
  • Add a low-cost freshness/applicability preflight before high-context review.
  • Cap repeated stale review prompts for the same finding/run family once the finding has been verified against current HEAD.
  • Prefer compact diff/snapshot comparison before loading broad repo context.
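
The approaches above could be combined into one cheap gate that runs before any high-context review. The following is a minimal sketch under stated assumptions, not the project's implementation: `ReviewGate`, `Decision`, `snapshot_key`, `max_stale_reviews`, and the notion of a "family" string are all hypothetical names, and the sketch assumes the caller has already determined whether the snapshot is still active (for example, by comparing it against current HEAD).

```python
from dataclasses import dataclass, field
from enum import Enum
import hashlib


class Decision(Enum):
    REUSE_CACHED = "reuse_cached"  # unchanged snapshot: return the stored summary
    SKIP_STALE = "skip_stale"      # superseded snapshot: do not launch a full review
    FULL_REVIEW = "full_review"    # new or explicitly forced: run the full review


def snapshot_key(commit_id: str, diff_text: str) -> str:
    """Hypothetical cache key: hash of the snapshot commit plus its compact diff."""
    digest = hashlib.sha256()
    digest.update(commit_id.encode())
    digest.update(b"\0")  # separator so ("a", "bc") and ("ab", "c") differ
    digest.update(diff_text.encode())
    return digest.hexdigest()


@dataclass
class ReviewGate:
    # Assumed default: stale snapshots never get a full review unless forced.
    max_stale_reviews: int = 0
    summaries: dict = field(default_factory=dict)       # snapshot key -> summary
    stale_reviews: dict = field(default_factory=dict)   # family -> full reviews spent

    def record(self, key: str, summary: str) -> None:
        """Store the summary produced by a completed full review."""
        self.summaries[key] = summary

    def decide(self, key: str, family: str, snapshot_is_active: bool,
               force_fresh: bool = False) -> Decision:
        # An explicit request always wins, so agents can re-check suspicious findings.
        if force_fresh:
            return Decision.FULL_REVIEW
        # Same key means same snapshot content, so the earlier summary still applies.
        if key in self.summaries:
            return Decision.REUSE_CACHED
        # Stale snapshots are capped per finding/run family.
        if not snapshot_is_active:
            spent = self.stale_reviews.get(family, 0)
            if spent >= self.max_stale_reviews:
                return Decision.SKIP_STALE
            self.stale_reviews[family] = spent + 1
        return Decision.FULL_REVIEW
```

The checks are ordered from cheapest to most expensive, so the only path that loads broad repo context is the final `FULL_REVIEW` return. Whether a cached summary should still be surfaced for a stale snapshot is a design choice this sketch leaves open.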

Acceptance Criteria

  • Repeated auto-review runs against unchanged detached snapshots consume little or no additional model budget.
  • Stale/superseded findings do not trigger high-token review loops by default.
  • The ledger or UI exposes enough token/cache diagnostics to confirm whether caching is working.
  • Agents can still request a fresh full review explicitly when a stale finding appears suspicious or safety-critical.
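
To make the diagnostics criterion concrete, here is a hedged sketch of a per-run ledger summary. The `RunRecord` fields `cache_hit` and `stale`, the `summarize` function, and the `high_burn_threshold` value are assumptions made for illustration; the actual ledger schema and the rule behind `high_burn` are not stated in this issue. The two example records reuse the run ids and token counts reported above, with `cache_hit=False` assumed because no cache exists yet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRecord:
    run_id: str
    tokens: int
    cache_hit: bool  # hypothetical: True if the run reused a cached summary
    stale: bool      # True if findings were later verified stale


def summarize(runs, high_burn_threshold=100_000):
    """Aggregate token and cache diagnostics; the threshold is an assumed value."""
    total = sum(r.tokens for r in runs)
    hits = sum(1 for r in runs if r.cache_hit)
    return {
        "runs": len(runs),
        "tokens": total,
        "stale_tokens": sum(r.tokens for r in runs if r.stale),
        "cache_hits": hits,
        "cache_hit_rate": hits / len(runs) if runs else 0.0,
        "high_burn": sum(1 for r in runs if r.tokens >= high_burn_threshold),
    }


example_runs = [
    RunRecord("168d4bd0-e694-4023-b7f6-1188095bae06", 1467380, cache_hit=False, stale=True),
    RunRecord("d494a552-6cf2-40b0-963c-dc1effbd4a24", 190756, cache_hit=False, stale=False),
]
summary = summarize(example_runs)
print(summary)
```

If caching works, `cache_hit_rate` should rise and `stale_tokens` should stay near zero across repeated runs on the same snapshot.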

Why This Matters

The auto-review system caught a real issue in the later run, so the feature is valuable. But repeated stale high-token runs make review expensive and noisy. A cheap applicability/cache layer would preserve signal while keeping the system practical for long-running coding sessions.
