[codex] Add evidence matrix to release packet #104
Merged
pengfei-threemoonslab added a commit that referenced this pull request on May 21, 2026
…, fingerprint-safe enrichment, packet v0.6 prose

Merges origin/main (PR #104 evidence_matrix) into reviewer-grade-provenance and addresses the round-4 review.

Merge resolution:
- packet_schema_version stays "0.6". The bump now covers BOTH additive extensions on v0.5: PR #104's top-level evidence_matrix section AND PR #103's ReleaseDecisionItem.{source, policy_evidence_source} pointers. Schema comment in schemas/packet.py and the docs (STABILITY, agent-contract-current, INDEX, faq, AGENTS, README, SKILL) describe both.
- packet/json_packet.py upgrade chain merges both v0.5→v0.6 upgrades (HEAD's bare bump for PR #103 + main's _upgrade_evidence_matrix_v06).
- docs/packet-schema.v0.6.json regenerated from the combined model.
- Sample goldens (report.{md,json}, packet.{md,json,html}) regenerated.
- llms-full.txt regenerated.

P1 - Stop leaking line numbers into action-finding identity:
- cli/scan.py no longer enriches the INTERNAL action_surface_diff. evaluate_action_surface_policies serializes ActionSurfaceChange.model_dump() into finding evidence, and finding_fingerprint hashes evidence. Mutating the row before policy evaluation would leak (source: path:line) into baseline identity, and a tool moving lines would churn fingerprints. The PUBLIC diff (rendered into report.json / packet) is still enriched separately from public_tools.

P2 - Structured source on every tool-surface diff change row:
- schemas/surfaces.py: ToolSurfaceToolChange, ToolSurfaceHighRiskEffectChange, ToolSurfaceControlChange, ToolSurfaceMetadataChange, and ActionSurfaceChange each gain optional source_path / source_start_line fields (default None, additive).
- enrich_tool_surface_diff_with_source now covers all four tool-surface row families (tools, high_risk_effects, controls, metadata_changes), not just controls.
- enrich_action_surface_diff_with_source now populates the structured fields instead of suffixing reason. ActionSurfaceChange.reason stays byte-stable so policy-finding fingerprints don't churn.
- Renderers that previously read source via tool_source_index keep working; the structured fields are an additional canonical surface for post-scan consumers reading report.json directly.

Tests:
- test_enrich_action_surface_diff_populates_structured_source_fields replaces the earlier reason-suffix test.
- test_enrich_action_surface_diff_does_not_mutate_reason is a regression for the fingerprint-stability rule.
- Full suite: 1676 passed, 4 skipped.
- run_id + fingerprint identical across two scans of the same fixture with the structured source fields populated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
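The fingerprint-stability rule in P1 can be illustrated with a small sketch. This is not the project's code: the finding_fingerprint below is a hypothetical stand-in that hashes canonical JSON, enrich_public_copy is an invented helper, and the row contents and file path are made up. It only shows why enriching a separate public copy keeps the hashed internal evidence independent of line numbers.

```python
import copy
import hashlib
import json


def finding_fingerprint(evidence: dict) -> str:
    """Hypothetical stand-in: SHA-256 over the canonical JSON form of the evidence."""
    canonical = json.dumps(evidence, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def enrich_public_copy(change: dict, source_path: str, source_start_line: int) -> dict:
    """Hypothetical helper: return an enriched COPY; the internal row is never mutated."""
    enriched = copy.deepcopy(change)
    enriched["source_path"] = source_path
    enriched["source_start_line"] = source_start_line
    return enriched


# Internal row that policy evaluation would serialize into finding evidence (made-up values).
internal_change = {"tool": "send_email", "reason": "new high-risk effect"}

before = finding_fingerprint(internal_change)

# Public rows for the rendered report: same change, tool located at different lines.
public_at_line_10 = enrich_public_copy(internal_change, "tools/email.py", 10)
public_at_line_42 = enrich_public_copy(internal_change, "tools/email.py", 42)

after = finding_fingerprint(internal_change)

# The internal fingerprint is unaffected by where the tool sits in the file.
print(before == after)
```

Had the line number been written into the internal row (or appended to reason) before hashing, moving the tool from line 10 to line 42 would have produced a different fingerprint for the same finding.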
Summary
- medium confidence because source presence is declared coverage, not runtime proof.
- findings[].blocks_release is surfaced as an Action-surface policy source only for findings already classified by SHIP-ACTION-* or category == "action_surface"; non-action blockers do not cross-classify into that row.

Validation
- python scripts/generate_schemas.py --check
- git diff --check
- python -m ruff check .
- python -m pytest tests/test_evidence_packet.py -q
- python -m pytest
- PYTHONPATH=src python -m agents_shipgate self-check --json
- PYTHONPATH=src python -m agents_shipgate contract --json
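
The classification rule from the Summary (only findings already classified as action-surface feed the Action-surface policy row) could be sketched roughly as follows. The check_id field name, the helper names, and the sample findings are assumptions for illustration; only blocks_release, category, and the SHIP-ACTION- prefix come from the description above.

```python
def is_action_surface_finding(finding: dict) -> bool:
    """True if the finding is classified as action-surface.

    Assumes a hypothetical `check_id` field carrying identifiers such as SHIP-ACTION-*.
    """
    check_id = finding.get("check_id", "")
    return check_id.startswith("SHIP-ACTION-") or finding.get("category") == "action_surface"


def action_surface_blockers(findings: list[dict]) -> list[dict]:
    """Blocking findings that count as an Action-surface policy source."""
    return [
        f for f in findings
        if f.get("blocks_release") and is_action_surface_finding(f)
    ]


# Made-up findings for illustration only.
findings = [
    {"check_id": "SHIP-ACTION-EXAMPLE", "category": "policy", "blocks_release": True},
    {"check_id": "OTHER-EXAMPLE", "category": "action_surface", "blocks_release": True},
    {"check_id": "OTHER-EXAMPLE-2", "category": "secrets", "blocks_release": True},
    {"check_id": "SHIP-ACTION-EXAMPLE-2", "category": "action_surface", "blocks_release": False},
]

blockers = action_surface_blockers(findings)
print([f["check_id"] for f in blockers])
```

The third finding blocks release but is not action-surface, so it does not cross-classify into the row; the fourth is action-surface but does not block.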