PR6 — i8 hot-path measure + ε/recall/fidelity safety net (LOC-64)#68

Merged: dev07060 merged 7 commits into main from feat/loc-64-i8-measure-parity-net on May 30, 2026

@dev07060

Owner

Applies the PR1 "measure before changing" pattern to the shipped int8 hot path — every prior review/bench scrutinized only the f32 fallback, while the release build (vector_faer,vector_quant_i8) actually scores candidates via the i8 kernel. Non-destructive: vector_math.rs 0-diff; vector_quant.rs changes only inside mod tests; everything else additive.

What

  • Measure (i8 micro + i8 scan benches): the shipped i8 scan is ~15× faster than the f32-faer fallback — exact_scan_i8 29.98 µs vs exact_scan[faer] 452.82 µs @ 2000×768.
  • Numeric ε net: i8 cosine kernel ≈ independent f64 reference, ε=1e-4 (catches kernel/SIMD-rewrite bugs).
  • recall@k floor: top-k(i8) vs top-k(f32, f64 ground truth) recall@10 ≥ 0.98 — measured 0.996875 (i8@768 is ~lossless). Deterministic: f64 GT + integer-exact i8, total-order (score, index) tie-break.
  • cosine fidelity: max|cosine_i8 − cosine_f32_true| ≤ 0.005 — measured 0.00121. Ranking-free, fully deterministic, sensitive backstop against a lossier future quantizer.
  • CI fail-closed: the nets run on the shipped vector_faer,vector_quant_i8 tree, with per-net name guards so cfg-excluding a net fails closed.
  • Thresholds are pre-locked from measured baselines and protected by const _: () = assert!(...) compile guards.
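For readers without the diff open, the numeric ε net and the const-assert guard can be sketched roughly as follows. This is a minimal illustration, not the code in vector_quant.rs: `cosine_i8`, `cosine_ref_f64`, `within_epsilon`, and the guard bounds are hypothetical names and values chosen for the sketch; only ε=1e-4 comes from the description above.

```rust
/// Parity tolerance (value from the PR description).
const EPSILON: f64 = 1e-4;
// Compile-time guard (bounds are an assumption for this sketch): zeroing the
// tolerance or loosening it past 1e-3 fails the build instead of passing quietly.
const _: () = assert!(EPSILON > 0.0 && EPSILON <= 1e-3);

/// Hypothetical i8 cosine kernel: integer-exact accumulation, f32 only at the end.
fn cosine_i8(a: &[i8], b: &[i8]) -> f32 {
    assert_eq!(a.len(), b.len());
    let (mut dot, mut na, mut nb) = (0i64, 0i64, 0i64);
    for (&x, &y) in a.iter().zip(b) {
        let (x, y) = (x as i64, y as i64);
        dot += x * y;
        na += x * x;
        nb += y * y;
    }
    if na == 0 || nb == 0 {
        return 0.0; // convention for zero vectors in this sketch
    }
    dot as f32 / ((na as f32).sqrt() * (nb as f32).sqrt())
}

/// Independent f64 reference over the same quantized inputs.
fn cosine_ref_f64(a: &[i8], b: &[i8]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(&x, &y)| x as f64 * y as f64).sum();
    let na = a.iter().map(|&x| (x as f64).powi(2)).sum::<f64>().sqrt();
    let nb = b.iter().map(|&y| (y as f64).powi(2)).sum::<f64>().sqrt();
    if na == 0.0 || nb == 0.0 {
        return 0.0;
    }
    dot / (na * nb)
}

/// The net itself: kernel and reference must agree within EPSILON.
fn within_epsilon(a: &[i8], b: &[i8]) -> bool {
    (cosine_i8(a, b) as f64 - cosine_ref_f64(a, b)).abs() <= EPSILON
}
```

Because both sides see identical i8 inputs, any gap beyond f32 rounding points at the kernel rather than at quantization loss, which is what makes this net useful against a SIMD rewrite.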

Finding: the shipped i8 path is both ~15× faster than the f32 fallback and ~lossless on recall@10 — the right call to ship, now CI-guarded.
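A rough sketch of how a deterministic recall@k with a total-order (score, index) tie-break might look. `top_k` and `recall_at_k` are hypothetical helpers written for this illustration and are not taken from the branch; the real net presumably aggregates over many queries, while this shows a single query.

```rust
use std::collections::HashSet;

/// Indices of the k highest scores, ordered by (score descending, index ascending).
/// `total_cmp` gives a total order over f64, so equal scores never reorder between runs.
fn top_k(scores: &[f64], k: usize) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..scores.len()).collect();
    idx.sort_by(|&i, &j| scores[j].total_cmp(&scores[i]).then(i.cmp(&j)));
    idx.truncate(k);
    idx
}

/// Fraction of the ground-truth top-k that the approximate scores also rank in their top-k.
fn recall_at_k(approx: &[f64], truth: &[f64], k: usize) -> f64 {
    let want: HashSet<usize> = top_k(truth, k).into_iter().collect();
    if want.is_empty() {
        return 1.0; // nothing to recall; treated as perfect in this sketch
    }
    let hits = top_k(approx, k)
        .into_iter()
        .filter(|i| want.contains(i))
        .count();
    hits as f64 / want.len() as f64
}
```

With the tie-break in place, two candidates with identical scores always resolve to the lower index on both sides, so a tie cannot flip recall between platforms.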

Spec / plan / journal: docs/perf/vector-math-refactor/PR6-spec-…md, PR6-plan-…md, PR6.md.

Note: build emits one pre-existing unrelated warning (source_rag.rs:2280 dead_code, 0 diff on this branch).

dev07060 added 7 commits May 31, 2026 04:11
…et (LOC-64)

Brainstorming output for the next work-stream: apply the PR1 "measure
before changing" pattern to the SHIPPED hot path (i8), which every prior
review/bench scrutinized only on the f32 fallback. Non-destructive —
kernels unchanged. Captures i8 bench baseline + numeric ε parity + a
recall@k quantization-quality gate, all fail-closed in CI on the shipped
faer+quant compile tree. Implementation follows via writing-plans.
…elity, measured baselines) (LOC-64)

Plan and spec for the i8 measure-first + safety-net work, finalized after an
adversarial pre-flight that (a) compiled the test/bench code on the shipped
vector_quant_i8,vector_faer tree and (b) measured real baselines:
recall@10 = 0.996875 (i8@768 is ~lossless), max|cosine_i8 - cosine_f32_true|
= 0.00121. Net 2 redesigned to recall floor (>= 0.98) + deterministic
cosine-fidelity backstop (<= 0.005), f64 ground truth (kills x86/ARM ULP
jitter), const-assert + CI name guards against vacuous gates.
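The cosine-fidelity backstop described in this commit can be pictured along these lines. This is a sketch under stated assumptions: `quantize_i8` is a hypothetical symmetric max-abs quantizer standing in for whatever vector_quant.rs actually does, and `cosine_f64` and `max_cosine_error` are illustrative names. Both cosines are evaluated in f64 so the comparison does not depend on platform-specific f32 rounding.

```rust
/// Hypothetical symmetric quantizer: scale by max |x| so values land in [-127, 127].
fn quantize_i8(v: &[f32]) -> Vec<i8> {
    let max_abs = v.iter().fold(0.0f32, |m, &x| m.max(x.abs()));
    if max_abs == 0.0 {
        return vec![0; v.len()];
    }
    let scale = 127.0 / max_abs;
    v.iter().map(|&x| (x * scale).round() as i8).collect()
}

/// Cosine similarity accumulated in f64, usable for both f32 and i8 inputs.
fn cosine_f64<T: Copy + Into<f64>>(a: &[T], b: &[T]) -> f64 {
    let (mut dot, mut na, mut nb) = (0.0f64, 0.0f64, 0.0f64);
    for (&x, &y) in a.iter().zip(b) {
        let (x, y): (f64, f64) = (x.into(), y.into());
        dot += x * y;
        na += x * x;
        nb += y * y;
    }
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na.sqrt() * nb.sqrt())
    }
}

/// max |cosine_i8 - cosine_f32_true| over a set of vector pairs.
fn max_cosine_error(pairs: &[(Vec<f32>, Vec<f32>)]) -> f64 {
    pairs
        .iter()
        .map(|(a, b)| {
            let truth = cosine_f64(a.as_slice(), b.as_slice());
            let (qa, qb) = (quantize_i8(a), quantize_i8(b));
            (cosine_f64(qa.as_slice(), qb.as_slice()) - truth).abs()
        })
        .fold(0.0, f64::max)
}
```

A gate would then compare `max_cosine_error(..)` against the 0.005 ceiling named above. Since no ranking is involved, a lossier quantizer shows up as a larger error even when recall@10 happens to survive.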
@linear-code

linear-code Bot commented May 30, 2026


LOC-64

@dev07060 dev07060 merged commit 5d32b91 into main May 30, 2026
6 checks passed
@dev07060 dev07060 self-assigned this May 30, 2026