Add reviewer-grade dual-source provenance and CHANGE_ME setup safety by pengfei-threemoonslab · Pull Request #103 · ThreeMoonsLab/agents-shipgate

pengfei-threemoonslab · 2026-05-20T23:50:40Z

Summary

Make high-risk findings reviewable in PR/release workflows without grep. Each finding now carries two structured pointers: the tool source (`Finding.source`, which already existed) and a manifest evidence pointer (`Finding.policy_evidence_source`, new), so reviewers can jump to the YAML line where the missing mitigation should live.
Unresolved CHANGE_ME placeholders in shipgate.yaml now flow into report.source_warnings, tripping the existing source_warning_count > 0 → review_required gate so a stub manifest can no longer produce a release packet that looks like real evidence.
Catalog-driven escalation override: approval / confirmation / idempotency / broad-scope / prohibited-action / runtime-trace / HITL-evidence check IDs are now flagged requires_human_review_regardless_of_patch=True. annotate_remediation forces autofix_safe=False BEFORE derive_agent_action runs, so even a high-confidence non-manual patch on those check IDs lands at propose_patch_for_review, never auto_apply.
Reviewer surfaces (SARIF, packet markdown, GitHub Step Summary, CapabilityFact, tool_inventory rows, scenario YAML rows) all render path:line citations when structured pointers are available.

Type

Verification

CI is authoritative for `python -m ruff check .`, `python -m compileall -q src tests`, and `python -m pytest`.

Additional local checks run:

`python -m pytest` — 1663 passed, 4 skipped on the full suite.
End-to-end smoke against `samples/support_refund_agent`:
- `SHIP-POLICY-APPROVAL-MISSING` carries `source.path=specs/support-tools.openapi.yaml line=97` AND `policy_evidence_source.path=shipgate.yaml line=51 pointer=/policies/require_approval_for_tools`.
- `report.sarif` for the same finding lists two `locations` (one per pointer).
- `packet.md` §1 renders `stripe.create_refund lacks a declared approval policy — specs/support-tools.openapi.yaml:97 — shipgate.yaml:51`.
- Run IDs unchanged across the change (`_run_id` excludes the new field, mirroring the v0.11 structured-source exclusion).
- Existing `finding.fingerprint` values unchanged — baselines stay matched.

Release-readiness notes

No user-code import added to default scan paths
No network access added to default scan paths
New or changed check IDs are documented in `docs/checks.md` (no new check IDs in this PR; only metadata flag changes on 12 existing entries)
Report/schema changes are additive or documented in `STABILITY.md` — bumped `report_schema_version` 0.18 → 0.19. New fields (`Finding.policy_evidence_source`, `ReleaseDecisionItem.{source, policy_evidence_source}`) are optional with `None` defaults; v0.18 schema preserved as frozen reference.

What changed

Schema (`v0.18` → `v0.19`, additive)

`Finding.policy_evidence_source: SourceReference | None` — the manifest pointer when the finding's mitigation is missing in shipgate.yaml.
`ReleaseDecisionItem.source` and `.policy_evidence_source` — mirror the finding fields so packet re-rendering from `packet.json` keeps both citations.
New `docs/report-schema.v0.19.json` generated from the model; v0.18 kept as frozen reference.
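
As a rough illustration of the additive shape described above, the sketch below models the two pointers with standard-library dataclasses. This is an assumption for readability: the project's real models are not shown in this description, and the `SourceReference` field names (`path`, `start_line`, `pointer`) are inferred from the smoke-test output quoted under Verification.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SourceReference:
    """Illustrative stand-in; the real model may carry more fields."""

    path: str
    start_line: int | None = None
    pointer: str | None = None


@dataclass(frozen=True)
class Finding:
    """Only the fields relevant to dual-source provenance are modelled."""

    check_id: str
    source: SourceReference | None = None
    # Optional with a None default, so data without the field still loads.
    policy_evidence_source: SourceReference | None = None


finding = Finding(
    check_id="SHIP-POLICY-APPROVAL-MISSING",
    source=SourceReference("specs/support-tools.openapi.yaml", 97),
    policy_evidence_source=SourceReference(
        "shipgate.yaml", 51, "/policies/require_approval_for_tools"
    ),
)

# A finding built without the new field remains valid.
legacy = Finding(check_id="SHIP-POLICY-APPROVAL-MISSING")
```

Because the new field defaults to `None`, consumers that never set it keep working, which is what makes the schema bump additive.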

Provenance infrastructure

`load_manifest_with_positions()` in `config/loader.py` returns `(manifest, PositionIndex)`; wraps `InputParseError` → `ConfigError` so doctor/scan exit codes are unchanged.
`ScanContext.manifest_positions: PositionIndex` defaults to the unsupported sentinel; tests that build contexts directly keep working.
`tool_finding` / `agent_finding` gain an optional `policy_evidence_pointer` kwarg.
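
The sketch below shows one way a pointer-to-line index with an "unsupported" sentinel could behave. The class body, the `line_for` method, and `resolve_policy_evidence` are hypothetical names chosen for illustration; only `PositionIndex` and the idea of a sentinel default come from the description above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PositionIndex:
    """Hypothetical map from JSON-pointer strings to 1-based manifest lines."""

    lines: dict[str, int] = field(default_factory=dict)
    supported: bool = True

    def line_for(self, pointer: str) -> int | None:
        # The sentinel never resolves a line, so callers degrade gracefully.
        if not self.supported:
            return None
        return self.lines.get(pointer)


# Default for contexts built without position tracking (for example in tests).
UNSUPPORTED = PositionIndex(supported=False)


def resolve_policy_evidence(
    index: PositionIndex, manifest_path: str, pointer: str
) -> tuple[str, int, str] | None:
    """Return (path, line, pointer), or None when no line is known."""
    line = index.line_for(pointer)
    if line is None:
        return None
    return (manifest_path, line, pointer)
```

With this shape, a context that keeps the sentinel simply produces findings without a manifest citation instead of failing.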

High-risk emitters (real manifest fields)

`checks/policy.py` — `/policies/require_approval_for_tools`, `/policies/require_confirmation_for_tools`.
`checks/side_effects.py` — `/policies/require_idempotency_for_tools`.
`checks/manifest_scope.py` — `/agent/declared_purpose` and `/agent/prohibited_actions/{i}` (indexed).
`checks/auth.py` — `/permissions/scopes` for both agent-level and tool-level broad-scope.
`checks/evidence.py` HITL — `/validation/required_evidence/...` and `/validation/target_review_posture`.

Reviewer surfaces

SARIF: `_result` now emits up to two `locations[]` entries (tool + manifest evidence).
Packet markdown: §1 Blockers/Review items and §2 Capability-Intent divergences append ` — path:line` citations from both pointers. `_to_decision_items` in `packet/builder.py` and `release_decision._to_item` both thread the dual pointers onto `ReleaseDecisionItem`.
GitHub Step Summary: action/tool diff highlights look up the underlying tool in the new `tool_inventory` source keys and append `(path:line)`.
CapabilityFact: `source_ref` enriched with `#L{line}` from the originating tool (no schema change; `source_ref` was already `str | None`). A follow-up commit on this PR reverted this enrichment, so `CapabilityFact.source_ref` stays byte-stable.
tool_inventory: rows gain `source_path`, `source_start_line`, `source_pointer` keys (dicts are extensible, so there is no schema field churn).
Scenario YAML: `_rows_to_payload` adds a nested `source: {tool, policy_evidence}` block per row when structured pointers are available.
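
To make the SARIF surface concrete, here is a minimal sketch of building up to two `locations[]` entries for one result, skipping a secondary location that is identical to the primary. The helper names are hypothetical; the nested `physicalLocation` / `artifactLocation` / `region.startLine` layout follows SARIF 2.1.0.

```python
def physical_location(path, line):
    """Build a SARIF 2.1.0 location object for a file and start line."""
    return {
        "physicalLocation": {
            "artifactLocation": {"uri": path},
            "region": {"startLine": line},
        }
    }


def result_locations(tool_source, policy_evidence_source):
    """Return up to two locations: tool source first, manifest evidence second.

    Each argument is a (path, line) tuple or None. A location equal to one
    already emitted is dropped so a result never has two identical targets.
    """
    locations = []
    for ref in (tool_source, policy_evidence_source):
        if ref is None:
            continue
        location = physical_location(*ref)
        if location not in locations:
            locations.append(location)
    return locations
```

For the sample finding quoted under Verification, `result_locations(("specs/support-tools.openapi.yaml", 97), ("shipgate.yaml", 51))` would yield two entries, one per pointer.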

Setup safety

`cli/scan.py` calls `collect_placeholders` against the manifest text and appends per-placeholder warnings to `source_warnings` (and into the existing redaction pass). Doctor's `SHIP-DIAG-CHANGE-ME-PLACEHOLDERS` diagnostic is unchanged; same fact now surfaces on both paths.
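
The gate can be pictured with the sketch below. It is not the project's `collect_placeholders`; the function names, the warning text, and the `"ok"` status value are assumptions made for illustration. Only the `CHANGE_ME` marker and the `source_warning_count > 0 → review_required` rule come from the description above.

```python
def collect_placeholder_warnings(manifest_text, manifest_path="shipgate.yaml"):
    """Return one warning string per manifest line that still contains CHANGE_ME."""
    warnings = []
    for number, line in enumerate(manifest_text.splitlines(), start=1):
        if "CHANGE_ME" in line:
            warnings.append(
                f"{manifest_path}:{number}: unresolved CHANGE_ME placeholder"
            )
    return warnings


def evidence_status(source_warnings):
    """Any source warning forces review; "ok" is a placeholder status name."""
    return "review_required" if len(source_warnings) > 0 else "ok"


stub_manifest = "agent:\n  name: CHANGE_ME\n  declared_purpose: refunds\n"
stub_warnings = collect_placeholder_warnings(stub_manifest)
```

Here `stub_warnings` holds a single entry for line 2, and `evidence_status(stub_warnings)` returns `"review_required"`, so a stub manifest cannot pass as complete evidence.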

Catalog escalation override

`CheckMetadata.requires_human_review_regardless_of_patch: bool = False` added.
Set `True` in `_REMEDIATION_OVERRIDES` for 12 check IDs: `SHIP-POLICY-APPROVAL-MISSING`, `SHIP-POLICY-CONFIRMATION-MISSING`, `SHIP-SIDEFX-IDEMPOTENCY-MISSING`, `SHIP-AUTH-MANIFEST-BROAD-SCOPE`, `SHIP-AUTH-TOOL-BROAD-SCOPE`, `SHIP-SCOPE-PROHIBITED-TOOL-PRESENT`, `SHIP-API-TRACE-APPROVAL-MISSING`, `SHIP-API-TRACE-CONFIRMATION-MISSING`, `SHIP-EVIDENCE-APPROVAL-TRACE-MISSING`, `SHIP-EVIDENCE-OVERRIDE-REASON-MISSING`, `SHIP-EVIDENCE-HIGH-RISK-EXCLUSION-MISSING`, `SHIP-EVIDENCE-HITL-PROMOTION-CRITERIA-MISSING`.
`annotate_remediation` flips `autofix_safe=False` and `requires_human_review=True` before `derive_agent_action` runs, so all three fields agree and the existing routing logic naturally lands at `propose_patch_for_review` (when patches present) or `escalate_to_human` (otherwise).
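
The ordering matters: the override has to run before the action is derived. The sketch below is a simplified model under stated assumptions. The routing rules in `derive_action`, the `has_patch` flag, and the `EXAMPLE-UNFLAGGED-CHECK` ID are hypothetical; the three action names and the flagged check IDs are taken from this description.

```python
from dataclasses import dataclass

# Two of the 12 flagged check IDs, enough for the illustration.
HUMAN_REVIEW_REGARDLESS = frozenset(
    {"SHIP-POLICY-APPROVAL-MISSING", "SHIP-AUTH-TOOL-BROAD-SCOPE"}
)


@dataclass
class Finding:
    check_id: str
    autofix_safe: bool = False
    requires_human_review: bool = False
    has_patch: bool = False  # hypothetical flag: a patch is attached
    agent_action: str = ""


def derive_action(finding):
    """Simplified routing; the real rules may consider more inputs."""
    if finding.autofix_safe and finding.has_patch and not finding.requires_human_review:
        return "auto_apply"
    if finding.has_patch:
        return "propose_patch_for_review"
    return "escalate_to_human"


def annotate(finding):
    """Apply the catalog override first, then derive the action."""
    if finding.check_id in HUMAN_REVIEW_REGARDLESS:
        finding.autofix_safe = False
        finding.requires_human_review = True
    finding.agent_action = derive_action(finding)
    return finding
```

Because the flags are flipped before routing, a flagged finding that arrives with `autofix_safe=True` and a patch still ends at `propose_patch_for_review`, and the three fields cannot disagree.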

Stability invariants preserved

`_run_id` excludes `policy_evidence_source` entirely — same rationale as the v0.11 structured-source exclusion. Same scan input → same run_id.
`finding_fingerprint` hashes only `check_id + tool_name + evidence`. Adding `policy_evidence_source` does not affect identity; existing baselines stay matched. Covered by `test_fingerprint_stable_with_policy_evidence_source`.
Legacy `source.ref`/`source.location` strings on the new `agent_finding` path stay `None` so the legacy hash inputs don't drift.
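
A short sketch of why the new field cannot disturb baselines: if the identity hash reads only `check_id`, `tool_name`, and `evidence`, then any other field is invisible to it. The SHA-256 digest, the canonical JSON encoding, and the 16-character truncation are assumptions for illustration; only the list of hashed fields comes from the text above.

```python
import hashlib
import json


def finding_fingerprint(finding):
    """Hash only check_id, tool_name, and evidence from a finding dict."""
    identity = {
        "check_id": finding["check_id"],
        "tool_name": finding.get("tool_name"),
        "evidence": finding.get("evidence", {}),
    }
    canonical = json.dumps(identity, sort_keys=True, separators=(",", ":"))
    return "fp_" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


before = {
    "check_id": "SHIP-POLICY-APPROVAL-MISSING",
    "tool_name": "stripe.create_refund",
    "evidence": {"policy": "require_approval_for_tools"},
}
# Same finding with the new pointer attached.
after = {
    **before,
    "policy_evidence_source": {"path": "shipgate.yaml", "start_line": 51},
}
```

`finding_fingerprint(before)` and `finding_fingerprint(after)` are equal, while any change inside `evidence` produces a different value.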

Docs

v0.19 mentioned as current in: `AGENTS.md`, `README.md`, `docs/INDEX.md`, `docs/agent-contract-current.md`, `docs/agent-autofix-boundary.md`, `docs/autofix-policy.md`, `docs/examples.md`, `docs/faq.md`, `llms.txt`, `llms-full.txt`, `.well-known/agents-shipgate.json`, `skills/agents-shipgate/SKILL.md`.
v0.18 schema preserved as the frozen reference in all the same docs.

🤖 Generated with Claude Code

Bumps the report schema to v0.19 (additive) to make high-risk findings reviewable enough for PR/release workflows without grep:

- Finding.policy_evidence_source (new): manifest pointer + line for findings whose triggering evidence lives in two places: the tool itself (Finding.source) and the missing-mitigation slot in the manifest (e.g. /policies/require_approval_for_tools).
- ReleaseDecisionItem.{source, policy_evidence_source} mirror the finding fields so packet re-rendering and reviewer surfaces (markdown, SARIF, GitHub Step Summary, scenario YAML) cite both the tool and manifest sites for the same release item.
- load_manifest_with_positions() builds a YAML PositionIndex for the manifest; ScanContext threads it; tool_finding/agent_finding accept an optional policy_evidence_pointer that resolves to a structured manifest line. Errors stay ConfigError so doctor/scan exit codes are unchanged.
- High-risk emitters (policy/side_effects/manifest_scope/auth/HITL evidence) pass the real manifest pointers and gain a second SARIF physicalLocation. Pointers use actual schema fields (require_approval_for_tools, validation/required_evidence/...).

Setup safety (CHANGE_ME placeholders):

- cli/scan.py wires collect_placeholders into source_warnings so unresolved CHANGE_ME entries trip the existing source_warning_count > 0 → review_required branch in release_decision.evidence_coverage.

Catalog-driven escalation override:

- CheckMetadata.requires_human_review_regardless_of_patch (new): set True on 12 approval / confirmation / idempotency / broad-scope / prohibited-action / runtime-trace / HITL-evidence check IDs. annotate_remediation forces autofix_safe=False before derive_agent_action runs, so a high-confidence non-manual patch on these check IDs lands at propose_patch_for_review (never auto_apply) and Finding.autofix_safe / Finding.requires_human_review / agent_action stay in agreement.

Reviewer surface threading:

- SARIF emits dual physicalLocations per result.
- Packet markdown §1 (Blockers / Review items) and §2 (Capability intent divergences) append `(path:line)` citations.
- CapabilityFact.source_ref enriched with `#L{line}` when known.
- tool_inventory rows gain source_path/source_start_line/source_pointer so post-scan renderers can cite path:line without re-parsing.
- GitHub Step Summary highlights append `(path:line)` from the enriched tool_inventory lookup.
- Scenario YAML rows carry a `source: {tool, policy_evidence}` block per row when structured pointers are available.

Stability invariants preserved:

- _run_id excludes policy_evidence_source entirely (same rationale as the v0.11 structured-source exclusion). Run IDs unchanged.
- finding_fingerprint excludes source/policy_evidence_source from the identity hash; existing baselines stay matched.
- agent_finding's legacy source.ref/location strings stay None on the new structured path so the legacy hash inputs don't drift.

Docs: AGENTS.md, README.md, agent-contract-current.md, autofix-policy.md, agent-autofix-boundary.md, examples.md, faq.md, INDEX.md, llms.txt, llms-full.txt, .well-known/agents-shipgate.json, skills/SKILL.md all mention v0.19 as current. The v0.18 schema is preserved as a frozen reference.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… revert CapabilityFact mutation

Fixes four reviewer findings on PR #103:

1. [P1] Packet schema bump (v0.5 → v0.6). The previous commit added `ReleaseDecisionItem.source` and `policy_evidence_source` to packet.json but left `packet_schema_version` at "0.5" and overwrote docs/packet-schema.v0.5.json in place. Consumers validating existing v0.5 packets with `additionalProperties: false` would reject the new artifacts even though the version said nothing changed.
   - Restore docs/packet-schema.v0.5.json from the pre-PR git state (frozen reference).
   - Bump `packet_schema_version` Literal to "0.6" in schemas/packet.py.
   - Regenerate docs/packet-schema.v0.6.json.
   - Add v0.5 → v0.6 upgrade path in packet/json_packet.py.
   - Update INDEX, agent-contract-current, AGENTS, README, skills/SKILL, llms.txt, .well-known, faq, STABILITY, and tests to reference v0.6 as current with v0.5 as frozen.
2. [P1] Stop churning `agent_finding` source.ref. The previous commit set `source.ref = f"{manifest}#{pointer}"` when a policy_evidence_pointer was supplied. `_run_id` excludes structured fields and policy_evidence_source but still hashes legacy `source.ref` for backwards compatibility, so agent-level high-risk findings (e.g. SHIP-AUTH-MANIFEST-BROAD-SCOPE) got new run_ids and broke reviewer-link/baseline identity continuity.
   - Keep `source.ref` as the bare manifest name. The pointer lives ONLY in the structured `pointer` field plus `policy_evidence_source`. Verified: scan of a manifest with broad scope now emits `source.ref="shipgate.yaml"` with `source.pointer="/permissions/scopes"`, run_id unchanged from the v0.18 era.
3. [P2] Stop enriching `CapabilityFact.source_ref` with `#L{line}`. STABILITY.md treats `CapabilityFact.source_ref` as a stable contract. Existing OpenAPI refs already contain JSON-pointer fragments (e.g. `api.yaml#/paths/...`), so appending `#L42` produced ambiguous strings like `api.yaml#/paths/...#L42`.
   - Revert `_enriched_source_ref`. The reviewer-grade line citation still lives on the enriched tool_inventory rows (`source_path` / `source_start_line`) and on each finding's structured `source.path` / `source.start_line`, both unambiguous. CapabilityFact.source_ref stays byte-stable.
4. [P2] Update remaining v0.18/v0.17/v0.16 doc references that slipped through. README.md:419, docs/overview.md:45, docs/ai-search-summary.md:90, docs/baseline.md:40, docs/report-reading-for-agents.md tables all now reference v0.19 as current.

Test goldens regenerated; full suite (1663 tests) passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…-finding citations, fix stale v0.5 prose

Fixes three reviewer findings on PR #103.

[P1] Action/Tool Surface Diff rows now carry tool source:

- New `enrich_action_surface_diff_with_source(diff, tool_source_index)` in report/action_surface_diff.py appends `(source: path:line)` to every change-row `reason` when the underlying tool's structured source is known.
- New `enrich_tool_surface_diff_with_source(diff, tool_source_index)` in report/tool_surface_diff.py does the same for tool-surface control rows.
- New `_tool_source_index(tools)` helper in cli/scan.py builds the tool-name → (path, line) map; scan.py wires it into both the internal and public diff computation paths.
- packet/builder.py `_tool_surface_diff_highlights` and `_action_surface_diff_highlights` now accept the same index and append `(path:line)` suffixes to §3A / §3B highlight bullets.
- `build_packet_from_report` rebuilds tool source fields from the enriched `tool_inventory` rows (`source_path`, `source_start_line`, `source_pointer`) so the rebuilt-from-report packet path keeps the citation surface working.
- Unit tests cover both the enrichment helper (with and without a source index) so regression on the contract is caught at the module boundary.

[P2] Agent-level finding citations no longer duplicate:

- `agent_finding()` no longer sets `policy_evidence_source`. For agent-level findings the primary `Finding.source` IS the manifest pointer (path, start_line, pointer are identical to what the secondary would carry), so emitting both forced every downstream renderer to dedupe. Setting it to None at the source keeps the contract clean: secondary lives only on `tool_finding` cases where tool source ≠ manifest pointer.
- Defensive renderer-level dedupe lands in three places:
  - packet/markdown.py `_dual_citation(primary, secondary)` suppresses the secondary when its `path:line` suffix equals the primary's.
  - report/sarif.py compares `physicalLocation` dicts and skips duplicates so SARIF results never carry two identical jump targets.
  - cli/scenario.py `_source_block` returns `tool` only when the `policy_evidence` pointer block is byte-equal to it, so scenario YAML rows stay terse.
- The sample packet now reads `Manifest declares broad permission scopes — shipgate.yaml:60` (was: `... — shipgate.yaml:60 — shipgate.yaml:60`).

[P3] Stale v0.5 packet prose in agent-facing surfaces:

- AGENTS.md heading `### Release Evidence Packet (v0.5)` → `(v0.6)`.
- docs/agent-contract-current.md paragraph starting "Packet schema 0.5 preserves the v0.4 HITL fields ..." rewritten to lead with v0.6 (citing the new dual-source pointer fields) and to describe v0.5 as the predecessor whose fields are preserved.

Test plan:

- `python -m pytest`: 1666 passed, 4 skipped (full suite).
- Smoke against `samples/support_refund_agent`:
  - Agent-level `SHIP-AUTH-MANIFEST-BROAD-SCOPE` finding carries `policy_evidence_source=None` and exactly one SARIF location.
  - Diff with `--diff-from` enables tool/action surface diff; helpers run cleanly.
- Sample goldens (report.json, report.md, packet.{md,json,html}) and llms-full.txt regenerated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…, fingerprint-safe enrichment, packet v0.6 prose

Merges origin/main (PR #104 evidence_matrix) into reviewer-grade-provenance and addresses the round-4 review.

Merge resolution:

- packet_schema_version stays "0.6". The bump now covers BOTH additive extensions on v0.5: PR #104's top-level evidence_matrix section AND PR #103's ReleaseDecisionItem.{source, policy_evidence_source} pointers. Schema comment in schemas/packet.py and the docs (STABILITY, agent-contract-current, INDEX, faq, AGENTS, README, SKILL) describe both.
- packet/json_packet.py upgrade chain merges both v0.5→v0.6 upgrades (HEAD's bare bump for PR #103 + main's _upgrade_evidence_matrix_v06).
- docs/packet-schema.v0.6.json regenerated from the combined model.
- Sample goldens (report.{md,json}, packet.{md,json,html}) regenerated.
- llms-full.txt regenerated.

[P1] Stop leaking line numbers into action-finding identity:

- cli/scan.py no longer enriches the INTERNAL action_surface_diff. evaluate_action_surface_policies serializes ActionSurfaceChange.model_dump() into finding evidence, and finding_fingerprint hashes evidence. Mutating the row before policy evaluation would leak (source: path:line) into baseline identity, and a tool moving lines would churn fingerprints. The PUBLIC diff (rendered into report.json / packet) is still enriched separately from public_tools.

[P2] Structured source on every tool-surface diff change row:

- schemas/surfaces.py: ToolSurfaceToolChange, ToolSurfaceHighRiskEffectChange, ToolSurfaceControlChange, ToolSurfaceMetadataChange, and ActionSurfaceChange each gain optional source_path / source_start_line fields (default None, additive).
- enrich_tool_surface_diff_with_source now covers all four tool-surface row families (tools, high_risk_effects, controls, metadata_changes), not just controls.
- enrich_action_surface_diff_with_source now populates the structured fields instead of suffixing reason. ActionSurfaceChange.reason stays byte-stable so policy-finding fingerprints don't churn.
- Renderers that previously read source via tool_source_index keep working; the structured fields are an additional canonical surface for post-scan consumers reading report.json directly.

Tests:

- test_enrich_action_surface_diff_populates_structured_source_fields replaces the earlier reason-suffix test.
- test_enrich_action_surface_diff_does_not_mutate_reason is a regression for the fingerprint-stability rule.
- Full suite: 1676 passed, 4 skipped.
- run_id + fingerprint identical across two scans of the same fixture with the structured source fields populated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
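
The enrichment pattern from this commit can be sketched as follows: build a tool-name to (path, line) map once, then copy each diff row and fill the structured fields without touching `reason`. Rows and tools are plain dicts here, and the `name` and `tool` keys are hypothetical; the project's helpers operate on its own models.

```python
def build_tool_source_index(tools):
    """Map tool name -> (path, line) for tools that have a known source."""
    index = {}
    for tool in tools:
        path = tool.get("source_path")
        if path is not None:
            index[tool["name"]] = (path, tool.get("source_start_line"))
    return index


def enrich_rows(rows, index):
    """Return copies of rows with structured source fields filled in.

    The input rows and their `reason` text are left untouched, so anything
    hashed from the originals is unaffected.
    """
    enriched = []
    for row in rows:
        copy = dict(row)
        source = index.get(row["tool"])
        if source is not None:
            copy["source_path"], copy["source_start_line"] = source
        enriched.append(copy)
    return enriched
```

Returning copies rather than mutating in place mirrors the internal/public split described above: policy evaluation can keep the original rows while renderers receive the enriched ones.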

…ding evidence

`ActionSurfaceChange` gained optional `source_path` / `source_start_line` fields in this PR, but `evaluate_action_surface_policies` dumps the change into `Finding.evidence` via `change.model_dump(mode="json")`, which unconditionally includes those keys as `null`. `finding_fingerprint` hashes canonicalised `evidence`, so the mere presence of the new keys shifts every existing action-surface finding fingerprint relative to pre-v0.19 baselines.

Fix: new private `_change_evidence(change)` helper in `report/action_surface_diff.py` that dumps with `exclude={"source_path", "source_start_line"}`. All four `evidence={"change": ...}` call sites in `evaluate_action_surface_policies` route through the helper. `ActionSurfaceChange` keeps the structured fields on the diff row itself (renderers and post-scan consumers still see them); only the finding-evidence projection drops them.

Verified: the legacy change payload now hashes identically before and after enrichment: `fp_fe9dd3a3a7e07d00` matches across pre-fix-legacy, post-fix-bare, and post-fix-enriched dumps. Test `test_action_policy_finding_evidence_excludes_v019_source_fields` pins the contract.

Test plan:

- 1677 passed, 4 skipped (full suite).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
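
The failure mode is easy to reproduce with plain dicts: adding keys whose value is `null` still changes a canonical hash. The sketch below uses a hypothetical `change_evidence` projection and a SHA-256 digest of sorted JSON as stand-ins; the two excluded key names are taken from the commit message.

```python
import hashlib
import json

SOURCE_ONLY_KEYS = frozenset({"source_path", "source_start_line"})


def change_evidence(change):
    """Project a change row into finding evidence, dropping source-only keys."""
    return {key: value for key, value in change.items() if key not in SOURCE_ONLY_KEYS}


def digest(evidence):
    """Hash a canonical JSON encoding (sorted keys, compact separators)."""
    canonical = json.dumps(evidence, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


legacy = {"tool": "stripe.create_refund", "reason": "new high-risk action"}
with_nulls = {**legacy, "source_path": None, "source_start_line": None}
enriched = {
    **legacy,
    "source_path": "specs/support-tools.openapi.yaml",
    "source_start_line": 97,
}
```

Hashing `with_nulls` directly gives a different digest from `legacy`, while `change_evidence(with_nulls)` and `change_evidence(enriched)` both hash to the same value as `legacy`.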

pengfei-threemoonslab and others added 5 commits May 20, 2026 16:31

pengfei-threemoonslab merged commit ae924b8 into main May 21, 2026
1 check passed

pengfei-threemoonslab merged 5 commits into main from reviewer-grade-provenance
