fix: rewrite zeckbf17 — golden-step traversal, i16 base, L1 distance#21

Merged: AdaWorldAPI merged 1 commit into main from feat/codec-research on Mar 21, 2026

@AdaWorldAPI
Owner

Critical fixes from PR review:

  1. MATH BUG: Fibonacci mod 17 visits only 13 of 17 residues (missing
    {6,7,10,11}). Replaced with golden-ratio STEP (step=11, gcd(11,17)=1),
    which visits ALL 17 before repeating. The architectural reasoning (prime
    base, X-Trans aperiodic sampling) was correct; only the traversal
    generator was wrong.

  2. BF16 → i16: Base pattern now stores i16 fixed-point (×256, i.e. 8
    fractional bits) instead of BF16. Same 34 bytes (17 × 2), 256× finer
    quantization. i16 captures sub-unit averages that BF16's 7-bit mantissa
    destroys, and it is a native SIMD lane width.

  3. BF16 Hamming → L1: Distance metric now uses L1 (Manhattan) on i16 base
    patterns, matching the production ZeckF64 pipeline (BitVec Hamming →
    quantile → L1 on bytes). Added zeckf64_from_base() that produces a
    full ZeckF64 u64 from two ZeckBF17 edges.

  4. Removed unused deps: hound, half (code rolled its own BF16 which is
    now replaced by i16 fixed-point).

  5. Improved synthetic data: 70/30 signal/noise ratio with per-node
    deterministic base patterns, so the tests exercise rank correlation
    better.
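
The coverage claim in item 1 can be checked by brute force. The following is a minimal sketch, not the code from this PR: the function names `fibonacci_residues` and `step_traversal` are hypothetical, and the step traversal is assumed to start at position 0 and add 11 modulo 17 each time.

```rust
use std::collections::BTreeSet;

/// Residues visited by the Fibonacci sequence modulo `m`.
fn fibonacci_residues(m: u32) -> BTreeSet<u32> {
    let mut seen = BTreeSet::new();
    let (mut a, mut b) = (0u32, 1u32);
    // The pair (a, b) determines the rest of the sequence, so the period
    // is at most m * m; iterating that many times covers a full period.
    for _ in 0..m * m {
        seen.insert(a);
        let next = (a + b) % m;
        a = b;
        b = next;
    }
    seen
}

/// Positions visited by repeatedly adding `step` modulo `m`, starting at 0.
/// (Assumed form of the "golden-ratio step" traversal.)
fn step_traversal(step: u32, m: u32) -> Vec<u32> {
    (0..m).map(|k| (k * step) % m).collect()
}
```

Calling `fibonacci_residues(17)` and `step_traversal(11, 17)` and comparing the number of distinct values reproduces the 13-versus-17 difference. Any step coprime to 17 gives a full cycle; since 17 is prime, that is every step from 1 to 16.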

643 lines (was 712). All BF16 helpers removed. Zero floating-point in
the distance path — pure integer L1.
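
As a rough illustration of items 2 and 3, the sketch below quantizes values to i16 fixed-point with a ×256 scale and computes an integer-only L1 distance over a 17-slot base pattern. It is written under assumptions: `BASE_DIM`, `to_fixed`, and `l1_distance` are hypothetical names, rounding and saturation behaviour are guesses, and the PR's actual `zeckf64_from_base()` is not reproduced here.

```rust
/// Number of base-pattern slots (assumed: 17 slots * 2 bytes = 34 bytes).
const BASE_DIM: usize = 17;
/// Fixed-point scale: 8 fractional bits, so one step is 1/256.
const SCALE: f32 = 256.0;

/// Quantize a value to i16 fixed-point, saturating at the i16 range
/// (roughly -128.0 to just under 128.0 before scaling).
fn to_fixed(x: f32) -> i16 {
    (x * SCALE)
        .round()
        .clamp(i16::MIN as f32, i16::MAX as f32) as i16
}

/// Integer-only L1 (Manhattan) distance between two base patterns.
/// Differences are widened to i32 so that MIN - MAX cannot overflow.
fn l1_distance(a: &[i16; BASE_DIM], b: &[i16; BASE_DIM]) -> u32 {
    a.iter()
        .zip(b.iter())
        .map(|(&x, &y)| (x as i32 - y as i32).unsigned_abs())
        .sum()
}
```

The worst-case distance is 17 × 65535, which fits comfortably in a `u32`, so no floating point or wider accumulator is needed on the distance path.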

@AdaWorldAPI merged commit 75c9280 into main on Mar 21, 2026

AdaWorldAPI pushed a commit that referenced this pull request on Apr 30, 2026:
…ached_dtos + cohort_similarity_z

SmbStack assembled facade closes the last review-open item from PR #21
(96fb069 + bb3df0b on smb-office-rs claude/review-csharp-rust-transcode-9ygcR).
27 smb-realtime tests pass, clippy clean.

REQUEST: three upstream improvements to retire copy-pasted consumer
caches in medcare-rs + smb-office-rs:
1. &'static Ontology factories (LazyLock-once, not allocate-per-call)
2. OntologyDto::cached_dtos -> CachedOntologyDtos { de, en }
3. lance_graph_contract::distance::cohort_similarity_z<F: Distance>

Both consumers (medcare PR #73, smb PR #22) invented identical caches
locally; promoting upstream eliminates the drift class.
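
Request 1 (a `&'static` factory that initializes once instead of allocating per call) could look roughly like the sketch below. The `Ontology` struct here is a stand-in with made-up fields, and `example_ontology` is a hypothetical factory name; only `std::sync::LazyLock` is a real API.

```rust
use std::sync::LazyLock;

/// Stand-in for the real ontology type (fields are hypothetical).
#[derive(Debug)]
pub struct Ontology {
    pub name: &'static str,
    pub labels: Vec<String>,
}

/// Built on first access, then shared for the lifetime of the process.
static EXAMPLE_ONTOLOGY: LazyLock<Ontology> = LazyLock::new(|| Ontology {
    name: "example",
    labels: vec!["a".to_string(), "b".to_string()],
});

/// Factory returning a shared `&'static` reference rather than a fresh value.
pub fn example_ontology() -> &'static Ontology {
    LazyLock::force(&EXAMPLE_ONTOLOGY)
}
```

Every call to `example_ontology()` returns the same address, so consumers no longer need their own local caches around the factory.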