feat: close the CMR table — impressed + toolmark validation, all three reductions source-disjoint by erichare · Pull Request #75 · erichare/verity

erichare · 2026-06-09T15:36:15Z

Context

The CMR unification — one algorithm reducing to the field's standard per-modality methods — is Verity's core scientific novelty, but the whitepaper demonstrated it with numbers for only the 1-D bullet modality. The impressed (cartridge) result already shipped in cartridge_fadul.npz but was never surfaced, and the toolmark reduction had no committed reference at all.

This PR closes the CMR table: all three reductions are now demonstrated, source-disjoint, under one scorer config — each recomputed from a committed reference via the same compute_validation_summary path, all sharing scorer_config_hash ea4ddd51… (cryptographic proof of one config, not three tuned pipelines).
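As a rough illustration of why a shared hash is evidence of a single configuration, the sketch below hashes a config dictionary through a canonical JSON encoding. Everything in it is an assumption made for illustration: the PR does not say how scorer_config_hash is computed, and the `config_hash` and `share_one_config` helpers and the config keys are hypothetical, not Verity's API.

```python
import hashlib
import json


def config_hash(config: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, fixed separators).

    Hypothetical helper: the real scorer_config_hash derivation is not
    described in the PR.
    """
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def share_one_config(reference_hashes: dict[str, str]) -> bool:
    """True when every reference records the same config hash."""
    return len(set(reference_hashes.values())) == 1


# Hypothetical config keys, for illustration only.
cfg_a = {"window": 64, "metric": "ccf", "detrend": True}
cfg_b = {"detrend": True, "metric": "ccf", "window": 64}  # same content, different key order
cfg_c = {"window": 128, "metric": "ccf", "detrend": True}  # one value changed

hashes = {
    "bullet": config_hash(cfg_a),
    "cartridge": config_hash(cfg_b),
    "toolmark": config_hash(cfg_a),
}
print(share_one_config(hashes))
```

Under this scheme, equal hashes mean the encoded configs are byte-identical (up to SHA-256 collision resistance), and any per-modality tuning would produce a different hash.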

| Reduction | Reference | Source-disjoint Cllr | AUC | Recovers |
| --- | --- | --- | --- | --- |
| Bullet lands (1-D) | pooled, 4 studies | 0.186 ± 0.126 | 0.989 | bulletxtrctr / CMS |
| Cartridge breech (2-D) | Fadul, 10 slides | 0.385 ± 0.187 | 0.991 | Song CMC (`cmcR`) |
| Screwdriver toolmarks (1-D) | tmaRks, 56 edges | 0.328 ± 0.050 | 0.957 | Chumbley (`toolmaRk`) |
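For readers unfamiliar with the two metrics in the table, here is a minimal sketch of their standard definitions: the log-likelihood-ratio cost (Cllr) and the area under the ROC curve (AUC). It is not Verity's compute_validation_summary; the function names and the toy inputs are assumptions for illustration.

```python
import math


def cllr(same_source_lrs, diff_source_lrs):
    """Log-likelihood-ratio cost.

    Cllr = 0.5 * ( mean over same-source pairs of log2(1 + 1/LR)
                 + mean over different-source pairs of log2(1 + LR) )
    Lower is better; an uninformative system (LR = 1 everywhere) scores 1.
    """
    ss = sum(math.log2(1 + 1 / lr) for lr in same_source_lrs) / len(same_source_lrs)
    ds = sum(math.log2(1 + lr) for lr in diff_source_lrs) / len(diff_source_lrs)
    return 0.5 * (ss + ds)


def auc(same_source_scores, diff_source_scores):
    """Probability that a same-source score exceeds a different-source score.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    wins = sum(
        1.0 if s > d else 0.5 if s == d else 0.0
        for s in same_source_scores
        for d in diff_source_scores
    )
    return wins / (len(same_source_scores) * len(diff_source_scores))


# Toy inputs, not data from the references.
print(cllr([1.0], [1.0]))          # uninformative system
print(auc([2.0, 3.0], [1.0, 2.0]))
```

AUC measures only how well the scores rank same-source above different-source pairs, while Cllr also penalizes miscalibrated likelihood ratios, which is why a row can pair a high AUC with a noticeably non-zero Cllr.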

What's here

Engine

_reference_io.py: load_reference + barrels_from_clusters; write_reference now uses savez_compressed (toolmark npz 10 MB → 159 KB, no change to the data).
report_validation.py: report_from_reference + verity-validation-report-ref — validation PDF/JSON from any committed reference, no catalog/network.
build_toolmark_reference.py: runs the deployed CMR-1D scorer on the full tmaRks set (580 marks / 56 tool-edges), wired into verity-build-references; adds the new references/toolmark_tmaRks.npz (+ provenance).
scorer.py: documents diag_contrast as inter-land congruence (the bullet-level instance of the same CMR principle).
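The size drop with unchanged data follows from the container format. Assuming write_reference calls NumPy's savez_compressed, an .npz file is a zip archive: numpy.savez stores members uncompressed, while numpy.savez_compressed applies DEFLATE. The standard-library sketch below mimics that difference without NumPy; the member name and payload are made up for illustration.

```python
import io
import struct
import zipfile


def pack(payload: bytes, compression: int) -> bytes:
    """Write one member into an in-memory zip archive and return the archive bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=compression) as zf:
        zf.writestr("scores.bin", payload)  # hypothetical member name
    return buf.getvalue()


def unpack(archive: bytes) -> bytes:
    """Read the member back out of the archive."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return zf.read("scores.bin")


# A highly repetitive float64 payload, standing in for a sparse score array.
payload = struct.pack("<10000d", *([0.0] * 9000 + [1.0] * 1000))

stored = pack(payload, zipfile.ZIP_STORED)      # analogous to numpy.savez
deflated = pack(payload, zipfile.ZIP_DEFLATED)  # analogous to numpy.savez_compressed

print(len(stored), len(deflated))
print(unpack(stored) == unpack(deflated) == payload)
```

The compression is lossless, so a reference reloaded from the compressed archive yields the same arrays; only the on-disk size changes.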

Whitepaper data (reproducible from the committed references)

docs/whitepaper/data/{cartridge_fadul,toolmark_tmaRks,cmr_table}.json
fig_cllr_cartridge.pdf (areal CCF 0.91 → CMR-2D 0.997 → cmcR 1.00; recorded baselines hatched)
builders verity-{cartridge,toolmark}-validation-data, verity-cmr-table-data

Web

services/web/lib/cmr-validation.json + a "Three reductions, validated source-disjoint" section on /method.

Test plan

Engine suite: 111 passed, 3 data-gated skips; ruff clean
Impressed + toolmark validation PDFs render (6 pages, correct numbers)
/method renders the new section live (Next 16.2.7), correct values + shared hash, no console errors
Toolmark reference rebuilt deterministically; all three references share one scorer-config hash
Baselines (cartridge areal/cmcR, toolmark Chumbley) recorded with provenance; re-lock via R in a follow-up

Notes

Baselines are recorded with provenance flags, not recomputed here (need R/network). Scope-guard advertising a live striated-toolmark API domain is intentionally deferred (needs an applicability_thresholds sidecar).

…e reductions source-disjoint

The CMR unification claim (one algorithm reducing to the field's per-modality methods) was demonstrated with numbers only for bullets. This surfaces the impressed result that already shipped and adds the missing toolmark reduction, so all three are validated source-disjoint under one scorer config.

The three reductions, each computed from a committed reference via the same compute_validation_summary path and all sharing scorer_config_hash ea4ddd51…:
- bullet lands (1-D, pooled 4 studies): source-disjoint Cllr 0.186±0.126, AUC 0.989
- cartridge breech (2-D / CMC): Cllr 0.385±0.187, AUC 0.991 (in-sample AUC 0.997)
- screwdriver toolmarks (1-D / Chumbley): Cllr 0.328±0.050, AUC 0.957 (NEW)

Engine:
- _reference_io.py: load_reference + barrels_from_clusters (recover per-side source IDs from the A|B pair-source-set clusters); write_reference now uses savez_compressed (toolmark npz 10MB -> 159KB).
- report_validation.py: report_from_reference + verity-validation-report-ref, which renders a validation PDF/JSON from any committed reference, no catalog/network.
- build_toolmark_reference.py: the deployed CMR-1D scorer on the full tmaRks set (580 marks, 56 tool-edges), keyed by tool-edge; wired into verity-build-references. references/toolmark_tmaRks.npz (+ provenance).
- scorer.py: document diag_contrast as inter-land congruence, the bullet-level instance of the same CMR principle (composes with within-land CMR-1D).

Whitepaper data (source of truth, reproducible from the committed references):
- docs/whitepaper/data/{cartridge_fadul,toolmark_tmaRks,cmr_table}.json
- fig_cllr_cartridge.pdf: global areal CCF 0.91 -> CMR-2D 0.997 -> cmcR 1.00 (recorded baselines hatched; CMR-2D reproducible from the npz)
- new builders: verity-{cartridge,toolmark}-validation-data, verity-cmr-table-data

Web:
- services/web/lib/cmr-validation.json + a "Three reductions, validated source-disjoint" section on /method (the CMR table, live).

Tests: engine 111 passed (3 data-gated skips); ruff clean; /method verified rendering with correct numbers and no console errors.
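To make "source-disjoint" and the A|B pair labels concrete, the sketch below assumes each comparison is labeled with a string of the form "A|B" naming the two sources being compared, and checks that held-out pairs share no source with training pairs. The label format details, the helper names, and the example IDs are assumptions for illustration; this is not the barrels_from_clusters implementation.

```python
def split_pair_label(label: str) -> tuple[str, str]:
    """Recover the per-side source IDs from an 'A|B' pair label (assumed format)."""
    left, right = label.split("|", 1)
    return left, right


def sources_of(labels) -> set[str]:
    """All source IDs that appear on either side of any pair."""
    return {source for label in labels for source in split_pair_label(label)}


def is_source_disjoint(train_labels, test_labels) -> bool:
    """True when no source seen in training appears in any held-out pair."""
    return sources_of(train_labels).isdisjoint(sources_of(test_labels))


# Hypothetical source IDs.
train = ["barrel1|barrel1", "barrel1|barrel2"]
held_out_ok = ["barrel3|barrel4", "barrel3|barrel3"]
held_out_leaky = ["barrel2|barrel3"]  # barrel2 was seen in training

print(is_source_disjoint(train, held_out_ok))
print(is_source_disjoint(train, held_out_leaky))
```

A check of this kind guards against the optimistic bias that arises when the same physical source contributes to both fitting and evaluation, which is presumably why the commit reports the cartridge in-sample AUC (0.997) separately from the source-disjoint one (0.991).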

vercel · 2026-06-09T15:36:21Z

Project	Deployment	Actions	Updated (UTC)
verity	Ready	Preview, Comment	Jun 9, 2026 3:36pm

…rk validation (#77)

The whitepaper claimed the three-modality CMR unification but showed numbers for bullets only. Now that the impressed and toolmark reductions are measured (#75, #76), this adds the cross-modality validation so the paper backs every row of its own CMR table. Not submitted anywhere; a polished draft.

- Abstract + contribution 1: the unification is now demonstrated source-disjoint under one scorer config (CMR recovers Song's CMC on breech faces and the Chumbley statistic on toolmarks); the 3-D fractured case is explicitly the unvalidated general form.
- New "Closing the CMR table" subsection in Validation: a three-reductions table (bullets pooled 0.186±0.126 / Hamby-252 0.113; cartridge 0.385±0.187; toolmark 0.328±0.050; all source-disjoint, one shared config hash), the cartridge figure (areal 0.911 -> CMR-2D 0.997 -> cmcR 1.000), and the toolmark result (CMR-1D beats the same-pipeline global baseline on tmaRks; Chumbley weak on the small ameslab set).
- Datasets: add the tmaRks toolmark set.
- Limitations: impressed/toolmark now demonstrated (few-sources caveat kept); learned representation reframed with the concrete SSL-on-LAPD path.
- Platform: NBTRD as the first open harvester; reproducibility lists every per-modality data file + verity-relock-baselines.

Builds clean (latexmk, 12 pp, no undefined references).

vercel Bot deployed to Preview June 9, 2026 15:36 View deployment

erichare enabled auto-merge (squash) June 9, 2026 15:36

erichare merged commit 76cc93c into main Jun 9, 2026
7 checks passed

erichare deleted the feat/close-cmr-table branch June 9, 2026 15:39

This was referenced Jun 9, 2026

feat: re-lock measured comparison baselines for the CMR validation figures/tables #76

Merged

docs(whitepaper): demonstrate the full CMR table — impressed + toolmark validation #77

Merged
