
feat: close the CMR table — impressed + toolmark validation, all three reductions source-disjoint (#75)

Merged: erichare merged 1 commit into main from feat/close-cmr-table on Jun 9, 2026


erichare (Owner) commented Jun 9, 2026

Context

The CMR unification — one algorithm reducing to the field's standard per-modality methods — is Verity's core scientific novelty, but the whitepaper demonstrated it with numbers for only the 1-D bullet modality. The impressed (cartridge) result already shipped in cartridge_fadul.npz but was never surfaced, and the toolmark reduction had no committed reference at all.

This PR closes the CMR table: all three reductions are now demonstrated, source-disjoint, under one scorer config — each recomputed from a committed reference via the same compute_validation_summary path, all sharing scorer_config_hash ea4ddd51… (cryptographic proof of one config, not three tuned pipelines).

| Reduction | Reference | Source-disjoint Cllr | AUC | Recovers |
|---|---|---|---|---|
| Bullet lands (1-D) | pooled, 4 studies | 0.186 ± 0.126 | 0.989 | bulletxtrctr / CMS |
| Cartridge breech (2-D) | Fadul, 10 slides | 0.385 ± 0.187 | 0.991 | Song CMC (cmcR) |
| Screwdriver toolmarks (1-D) | tmaRks, 56 edges | 0.328 ± 0.050 | 0.957 | Chumbley (toolmaRk) |
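
For readers unfamiliar with the Cllr column, here is a minimal sketch of the standard log-likelihood-ratio cost and of a mean ± spread summary over folds. It is not the code behind compute_validation_summary: the function name `cllr`, the fold layout, and the likelihood ratios are hypothetical, and reading the ± as a standard deviation across source-disjoint folds is an assumption the PR does not state.

```python
import math
from statistics import mean, stdev


def cllr(lr_same, lr_diff):
    """Log-likelihood-ratio cost: 0 is perfect, 1 matches an uninformative system."""
    # Same-source pairs are penalised when their LR is small.
    penalty_same = mean(math.log2(1 + 1 / lr) for lr in lr_same)
    # Different-source pairs are penalised when their LR is large.
    penalty_diff = mean(math.log2(1 + lr) for lr in lr_diff)
    return 0.5 * (penalty_same + penalty_diff)


# Hypothetical likelihood ratios for two held-out folds (not data from the PR).
folds = [
    ([20.0, 8.0, 50.0], [0.05, 0.2, 0.01]),
    ([5.0, 30.0, 2.0], [0.1, 0.5, 0.02]),
]
per_fold = [cllr(same, diff) for same, diff in folds]
summary = f"Cllr {mean(per_fold):.3f} ± {stdev(per_fold):.3f}"
print(summary)
```

A system that always reports LR = 1 scores exactly 1.0, which is why values well below 1 in the table indicate informative, reasonably calibrated likelihood ratios.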

What's here

Engine

  • _reference_io.py: adds load_reference + barrels_from_clusters; write_reference now uses savez_compressed (toolmark npz 10 MB → 159 KB, no change to the data).
  • report_validation.py: report_from_reference + verity-validation-report-ref — validation PDF/JSON from any committed reference, no catalog/network.
  • build_toolmark_reference.py: runs the deployed CMR-1D scorer on the full tmaRks set (580 marks / 56 tool-edges), wired into verity-build-references; adds the new references/toolmark_tmaRks.npz (+ provenance).
  • scorer.py: documents diag_contrast as inter-land congruence (the bullet-level instance of the same CMR principle).
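
The "no change to the data" point about savez_compressed can be illustrated with a small round-trip. This is a sketch using NumPy's public `np.savez` and `np.savez_compressed`, not the project's write_reference; the array names and contents are hypothetical stand-ins for a reference file.

```python
import io

import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for a reference: coarse, repetitive values compress well.
arrays = {
    "scores": np.round(rng.random(200_000), 1),
    "labels": np.zeros(200_000, dtype=np.int8),
}


def dump(saver):
    """Write the arrays to an in-memory npz using the given NumPy saver."""
    buffer = io.BytesIO()
    saver(buffer, **arrays)
    return buffer


plain = dump(np.savez)
packed = dump(np.savez_compressed)

# Reload the compressed archive and compare every array element-wise.
packed.seek(0)
with np.load(packed) as restored:
    identical = all(np.array_equal(restored[name], value) for name, value in arrays.items())

plain_size = plain.getbuffer().nbytes
packed_size = packed.getbuffer().nbytes
print(f"uncompressed {plain_size} bytes, compressed {packed_size} bytes, identical={identical}")
```

The compression is lossless, so anything recomputed from the compressed reference should match what the uncompressed file would give; only the on-disk size changes.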

Whitepaper data (reproducible from the committed references)

  • docs/whitepaper/data/{cartridge_fadul,toolmark_tmaRks,cmr_table}.json
  • fig_cllr_cartridge.pdf (areal CCF 0.91 → CMR-2D 0.997 → cmcR 1.00; recorded baselines hatched)
  • builders verity-{cartridge,toolmark}-validation-data, verity-cmr-table-data

Web

  • services/web/lib/cmr-validation.json + a "Three reductions, validated source-disjoint" section on /method.

Test plan

  • Engine suite: 111 passed, 3 data-gated skips; ruff clean
  • Impressed + toolmark validation PDFs render (6 pages, correct numbers)
  • /method renders the new section live (Next 16.2.7), correct values + shared hash, no console errors
  • Toolmark reference rebuilt deterministically; all three references share one scorer-config hash
  • Baselines (cartridge areal/cmcR, toolmark Chumbley) recorded with provenance; re-lock via R in a follow-up
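
The shared-hash check in the test plan could look roughly like the sketch below. How scorer_config_hash is actually computed is not stated in this PR; SHA-256 over canonical JSON, the helper names `config_hash` and `shared_hash`, the provenance layout, and the config fields are all assumptions made for illustration.

```python
import hashlib
import json


def config_hash(config):
    """Hash a config dict via canonical JSON so that key order does not matter."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def shared_hash(provenance_by_reference):
    """Return the single hash shared by all references, or raise if they differ."""
    hashes = {entry["scorer_config_hash"] for entry in provenance_by_reference.values()}
    if len(hashes) != 1:
        raise ValueError(f"references disagree on scorer config: {sorted(hashes)}")
    return hashes.pop()


# Hypothetical scorer config and provenance records (field names are invented).
config = {"window": 64, "min_overlap": 0.5}
digest = config_hash(config)
provenance = {
    "bullet": {"scorer_config_hash": digest},
    "cartridge": {"scorer_config_hash": digest},
    "toolmark": {"scorer_config_hash": digest},
}
print(shared_hash(provenance)[:8])
```

A matching hash shows the three references were built with byte-identical configs; it says nothing about the data or code versions, which is presumably what the provenance files record.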

Notes

Baselines are recorded with provenance flags, not recomputed here (need R/network). Scope-guard advertising a live striated-toolmark API domain is intentionally deferred (needs an applicability_thresholds sidecar).

…e reductions source-disjoint

The CMR unification claim (one algorithm reducing to the field's per-modality
methods) was demonstrated with numbers only for bullets. This surfaces the
impressed result that already shipped and adds the missing toolmark reduction,
so all three are validated source-disjoint under one scorer config.

The three reductions — each computed from a committed reference via the same
compute_validation_summary path; all share scorer_config_hash ea4ddd51…:
- bullet lands (1-D, pooled 4 studies): source-disjoint Cllr 0.186±0.126, AUC 0.989
- cartridge breech (2-D / CMC): Cllr 0.385±0.187, AUC 0.991 (in-sample AUC 0.997)
- screwdriver toolmarks (1-D / Chumbley): Cllr 0.328±0.050, AUC 0.957 — NEW

Engine:
- _reference_io.py: load_reference + barrels_from_clusters (recover per-side
  source IDs from the A|B pair-source-set clusters); write_reference now uses
  savez_compressed (toolmark npz 10MB -> 159KB).
- report_validation.py: report_from_reference + verity-validation-report-ref —
  renders a validation PDF/JSON from any committed reference, no catalog/network.
- build_toolmark_reference.py: the deployed CMR-1D scorer on the full tmaRks set
  (580 marks, 56 tool-edges), keyed by tool-edge; wired into verity-build-references.
  references/toolmark_tmaRks.npz (+ provenance).
- scorer.py: document diag_contrast as inter-land congruence — the bullet-level
  instance of the same CMR principle (composes with within-land CMR-1D).
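
A rough sketch of what recovering per-side source IDs from "A|B" cluster labels, and then splitting source-disjoint, might look like. The label format, the helper names, and the choice to drop mixed pairs are assumptions for illustration; the real barrels_from_clusters may work differently.

```python
def sources_from_cluster(cluster):
    """Split a hypothetical 'A|B' pair label into its two source IDs."""
    left, right = cluster.split("|")
    return left, right


def source_disjoint_split(pairs, held_out):
    """Partition (cluster, score) pairs so no source appears on both sides."""
    train, test = [], []
    for cluster, score in pairs:
        left, right = sources_from_cluster(cluster)
        left_held, right_held = left in held_out, right in held_out
        if left_held and right_held:
            test.append((cluster, score))
        elif not left_held and not right_held:
            train.append((cluster, score))
        # Pairs mixing a held-out and a training source are dropped,
        # since they would leak a test source into training.
    return train, test


# Hypothetical pair scores keyed by source-pair label.
pairs = [("b1|b2", 0.9), ("b1|b3", 0.2), ("b3|b4", 0.8)]
train, test = source_disjoint_split(pairs, held_out={"b3", "b4"})
print(train, test)
```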

Whitepaper data (source of truth, reproducible from the committed references):
- docs/whitepaper/data/{cartridge_fadul,toolmark_tmaRks,cmr_table}.json
- fig_cllr_cartridge.pdf: global areal CCF 0.91 -> CMR-2D 0.997 -> cmcR 1.00
  (recorded baselines hatched; CMR-2D reproducible from the npz)
- new builders: verity-{cartridge,toolmark}-validation-data, verity-cmr-table-data

Web:
- services/web/lib/cmr-validation.json + a "Three reductions, validated
  source-disjoint" section on /method (the CMR table, live).

Tests: engine 111 passed (3 data-gated skips); ruff clean; /method verified
rendering with correct numbers and no console errors.

vercel Bot commented Jun 9, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
|---|---|---|---|
| verity | Ready | Preview, Comment | Jun 9, 2026 3:36pm |


erichare enabled auto-merge (squash) June 9, 2026 15:36
erichare merged commit 76cc93c into main Jun 9, 2026
7 checks passed
erichare deleted the feat/close-cmr-table branch June 9, 2026 15:39
erichare added a commit that referenced this pull request Jun 9, 2026
…rk validation (#77)

The whitepaper claimed the three-modality CMR unification but showed numbers for
bullets only. Now that the impressed and toolmark reductions are measured (#75, #76),
this adds the cross-modality validation so the paper backs every row of its own CMR
table. Not submitted anywhere — a polished draft.

- Abstract + contribution 1: the unification is now demonstrated source-disjoint under
  one scorer config (CMR recovers Song's CMC on breech faces and the Chumbley statistic
  on toolmarks); the 3-D fractured case is explicitly the unvalidated general form.
- New "Closing the CMR table" subsection in Validation: a three-reductions table
  (bullets pooled 0.186±0.126 / Hamby-252 0.113; cartridge 0.385±0.187; toolmark
  0.328±0.050 — all source-disjoint, one shared config hash), the cartridge figure
  (areal 0.911 -> CMR-2D 0.997 -> cmcR 1.000), and the toolmark result (CMR-1D beats the
  same-pipeline global baseline on tmaRks; Chumbley weak on the small ameslab set).
- Datasets: add the tmaRks toolmark set.
- Limitations: impressed/toolmark now demonstrated (few-sources caveat kept); learned
  representation reframed with the concrete SSL-on-LAPD path.
- Platform: NBTRD as the first open harvester; reproducibility lists every per-modality
  data file + verity-relock-baselines.

Builds clean (latexmk, 12 pp, no undefined references).