chore(physics): evidence matrix + validator check 9 (Task 7) #432
neuron7xLab merged 1 commit into main
Closes Task 7 of the Physics-Invariant Rocketization Protocol.
What's new:
1. tools/physics_evidence_matrix.py — deterministic markdown table
generator. Reads .claude/physics/INVARIANTS.yaml via load_invariants
and emits a per-invariant evidence row (tier, priority, source path,
unit-test path, integration-test path, runtime status) plus on-disk
resolution checks (✓ / ✗ / —). Re-running produces byte-identical
output. CLI: --out PATH (default stdout). Exit 0/1/4.
2. .claude/physics/INVARIANTS.yaml — added integration_test +
runtime_evaluable: yes to the three ANCHORED runtime invariants
that the substrate-gate-chain integration suite already exercises:
- INV-BEKENSTEIN-COGNITIVE
- INV-ARROW-OF-TIME
- INV-ANCHORED-SUBSTRATE-GATE
Also added the missing `provenance: ANCHORED` field to
bekenstein_cognitive — it was treated as ANCHORED everywhere
(CLAUDE.md §0, related lists, gate composition) but never explicitly
tier-labeled in the registry. Discovered by check 9.
3. .claude/physics/validate_tests.py — new self-check #9. For each
invariant where provenance == ANCHORED and runtime_evaluable in
{yes, true}, the validator confirms an integration_test field is
declared AND resolves on disk. Same shape as check 7 (path
integrity) and check 8 (related cross-refs). Catches the failure
mode where an invariant claims runtime evaluability but has no
integration evidence.
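
The rule described in item 3 can be sketched roughly as follows. This is a minimal illustration, not the code in `validate_tests.py`: the function names, the dict-of-dicts shape of the registry, and the message wording are assumptions, and `runtime_evaluable` is normalised because a YAML loader may hand back either a boolean or a string for `yes`.

```python
from pathlib import Path
from typing import Any


def is_runtime_evaluable(value: Any) -> bool:
    """Accept YAML booleans as well as the strings 'yes' / 'true'."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in {"yes", "true"}


def missing_integration_evidence(
    invariants: dict[str, dict[str, Any]], repo_root: Path
) -> list[str]:
    """Return one message per ANCHORED runtime invariant lacking evidence."""
    failures: list[str] = []
    for inv_id in sorted(invariants):  # sorted so the report order is stable
        entry = invariants[inv_id]
        if entry.get("provenance") != "ANCHORED":
            continue
        if not is_runtime_evaluable(entry.get("runtime_evaluable")):
            continue
        integration = entry.get("integration_test")
        if not integration:
            failures.append(f"{inv_id}.integration_test is not declared")
            continue
        # Drop an optional '::symbol' suffix before resolving the path.
        full = repo_root / str(integration).split("::", 1)[0]
        if not full.exists():
            failures.append(
                f"{inv_id}.integration_test = {integration} does not resolve"
            )
    return failures
```

An empty return value corresponds to the check passing; any entry in the list would map to a non-zero exit in the caller.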
Reproduced evidence:
Clean baseline:
$ python .claude/physics/validate_tests.py --self-check
9. Evidence matrix OK: every ANCHORED runtime invariant has
a resolving integration_test path
✅ Self-check PASSED (exit 0)
Negative injection (corrupt one integration_test path):
9. FAIL: 1 ANCHORED runtime invariants miss integration evidence:
["INV-BEKENSTEIN-COGNITIVE.integration_test = ... DOES_NOT_EXIST.py
does not resolve"]
❌ Self-check FAILED (exit 1)
Determinism: two consecutive runs of physics_evidence_matrix.py
produced byte-identical 95-line output (diff clean).
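
One way to get byte-identical output is to make the rendering independent of input order and then compare digests of two runs. The sketch below assumes that approach; `render_matrix`, `digest`, and the three-column layout are hypothetical stand-ins and not the contents of `physics_evidence_matrix.py`.

```python
import hashlib


def render_matrix(invariants: dict[str, dict[str, str]]) -> str:
    """Render a markdown table with rows in sorted-ID order."""
    columns = ["tier", "priority", "source"]  # hypothetical subset of columns
    lines = [
        "| invariant | " + " | ".join(columns) + " |",
        "|---" * (len(columns) + 1) + "|",
    ]
    for inv_id in sorted(invariants):
        entry = invariants[inv_id]
        # "-" is a placeholder for a field the entry does not declare.
        cells = [entry.get(col, "-") for col in columns]
        lines.append(f"| {inv_id} | " + " | ".join(cells) + " |")
    return "\n".join(lines) + "\n"


def digest(text: str) -> str:
    """SHA-256 of the UTF-8 encoded text, for comparing two runs."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```

Comparing `digest(render_matrix(...))` across two runs is equivalent to a clean `diff` of the two outputs.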
Quality gates:
- ruff format: clean
- ruff check: clean
- black --check: clean
- mypy --strict tools/physics_evidence_matrix.py: 0 issues
- mypy --strict .claude/physics/validate_tests.py: 0 issues
Closure: CLOSED. Validator now fails on intentional broken evidence;
matrix is regenerable, deterministic, and not editable by hand.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7fbc36bb09
    full = repo_root / integration.split("::", 1)[0]
    if not full.exists():
Require integration_test to point to a file
The new check 9 treats any existing filesystem path as valid integration evidence, so a directory like tests/integration would pass even though no concrete test file is referenced. Because --self-check is enforced in the physics gate, this can silently bypass the intended guarantee that each ANCHORED runtime invariant names a specific integration test artifact. Validate with is_file() (after stripping any ::symbol suffix) to avoid false positives.
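
A minimal sketch of the suggested tightening, assuming the same `repo_root` / `integration` inputs as the reviewed lines; the helper name `resolves_to_file` is hypothetical. `Path.is_file()` returns `False` for directories, which closes the false positive described above.

```python
from pathlib import Path


def resolves_to_file(repo_root: Path, integration: str) -> bool:
    """True only when the path part of `integration` names a regular file."""
    # Strip an optional '::symbol' suffix, as the reviewed code already does.
    relative = integration.split("::", 1)[0]
    return (repo_root / relative).is_file()
```

With this helper, a value such as `tests/integration` (a directory) would be rejected, while a path to an existing test file, with or without a `::symbol` suffix, would still pass.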
Summary
Closes Task 7 of the Physics-Invariant Rocketization Protocol — final task in the 7-task chain (T1 PR #426 → T2 #427 → T3 #428 → T4 #429 → T5 #430 → T6 #431 → T7).
- `tools/physics_evidence_matrix.py` — deterministic markdown table generator over `.claude/physics/INVARIANTS.yaml`. Per-invariant row: tier · priority · source · unit test · integration test · runtime status · on-disk ✓/✗/—. Byte-identical re-runs.
- `.claude/physics/INVARIANTS.yaml` — added `integration_test:` + `runtime_evaluable: yes` to the three ANCHORED runtime invariants exercised by the substrate-gate-chain suite (BEKENSTEIN-COGNITIVE, ARROW-OF-TIME, ANCHORED-SUBSTRATE-GATE). Also added the missing `provenance: ANCHORED` to `bekenstein_cognitive` — it was treated as ANCHORED everywhere but never explicitly tier-labeled (caught by check 9).
- `.claude/physics/validate_tests.py` — new check 9: for every invariant with `provenance == ANCHORED` and `runtime_evaluable` ∈ {`yes`, `true`}, the validator confirms `integration_test:` is declared AND resolves on disk. Same shape as checks 7/8.
- `docs/physics/evidence_matrix.md` — generated artefact (95 lines), 87 invariants tabulated.

Reproduced evidence
Determinism: two consecutive `python tools/physics_evidence_matrix.py` runs → `diff` clean.

Test plan
- `python .claude/physics/validate_tests.py --self-check` → 9/9 PASS, exit 0
- `ruff format --check` clean
- `ruff check` clean
- `black --check` clean
- `mypy --strict tools/physics_evidence_matrix.py` → 0 issues
- `mypy --strict .claude/physics/validate_tests.py` → 0 issues

Closure
CLOSED. Validator now fails on intentional broken evidence; matrix regenerable and not editable by hand.
🤖 Generated with Claude Code