refactor: heel_f64x8 uses crate::simd::F64x8 polyfill, add SIMD cosine #75

Merged
AdaWorldAPI merged 3 commits into master from claude/setup-embedding-pipeline-Fa65C on Apr 3, 2026

Conversation

@AdaWorldAPI
Owner

No description provided.

claude added 3 commits April 3, 2026 13:16
Consumers (bgz-tensor) need read access to Welford σ tracking state
for quarter-sigma band computation. Fields remain private; accessors
provide read-only access.
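
As a rough illustration of the accessor pattern described above, here is a minimal sketch. The `Welford` struct, its field names, and the `quarter_sigma_band` helper are all hypothetical (the PR does not show the real type), and the band definition used here, `floor((x - mean) / (sigma / 4))`, is only an assumption about what a quarter-sigma band might mean for a consumer.

```rust
/// Hypothetical running-statistics tracker (Welford's online algorithm).
/// Fields stay private; only read-only accessors are exposed.
#[derive(Debug, Default, Clone)]
pub struct Welford {
    count: u64,
    mean: f64,
    m2: f64, // running sum of squared deviations from the mean
}

impl Welford {
    /// Fold one observation into the running statistics.
    pub fn update(&mut self, x: f64) {
        self.count += 1;
        let delta = x - self.mean;
        self.mean += delta / self.count as f64;
        self.m2 += delta * (x - self.mean);
    }

    /// Number of observations seen so far.
    pub fn count(&self) -> u64 {
        self.count
    }

    /// Running mean.
    pub fn mean(&self) -> f64 {
        self.mean
    }

    /// Population standard deviation (0.0 before any observation).
    pub fn sigma(&self) -> f64 {
        if self.count == 0 {
            0.0
        } else {
            (self.m2 / self.count as f64).sqrt()
        }
    }
}

/// Consumer-side helper (assumed semantics): index of the quarter-sigma
/// band containing `x`, using only the public accessors.
pub fn quarter_sigma_band(w: &Welford, x: f64) -> Option<i64> {
    let sigma = w.sigma();
    if sigma == 0.0 {
        return None; // no spread yet, so bands are undefined
    }
    Some(((x - w.mean()) / (sigma / 4.0)).floor() as i64)
}
```

A consumer can compute bands this way without being able to mutate the tracker's internal state.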

https://claude.ai/code/session_01ChLvBfpJS8dQhHxRD4pYNp
8 HEEL planes × 1 distance each = 8 f64 = one F64x8 register.
LazyLock dispatch: AVX-512 (vmulpd + vreducepd) → AVX2 (2×vmulpd) → scalar.

Functions:
  heel_weighted_distance(&[f64;8], &[f64;8]) → f64  (weighted dot)
  heel_plane_distances(&[u64;8], &[u64;8]) → [f64;8] (Hamming per plane)
  heel_weighted_hamming(a, b, weights) → f64  (full pipeline)

Predefined weights:
  UNIFORM_WEIGHTS = [1.0; 8]
  HEEL_7PLUS1_WEIGHTS = [1,1,1,1,1,1,1, 0.5] (contradiction at half)

6 tests passing.
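
For reference, a scalar sketch of what the three functions listed above might compute. The function names, signatures, and weight constants come from the commit message; the bodies are assumptions (one `u64` per plane, raw unnormalized bit counts) and correspond only to the scalar fallback tier, not the SIMD paths.

```rust
pub const UNIFORM_WEIGHTS: [f64; 8] = [1.0; 8];
pub const HEEL_7PLUS1_WEIGHTS: [f64; 8] = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.5];

/// Hamming distance per plane: popcount of the XOR of each pair of words.
/// Assumption: distances are raw bit counts, not normalized.
pub fn heel_plane_distances(a: &[u64; 8], b: &[u64; 8]) -> [f64; 8] {
    let mut out = [0.0f64; 8];
    for i in 0..8 {
        out[i] = (a[i] ^ b[i]).count_ones() as f64;
    }
    out
}

/// Weighted dot product of eight per-plane distances with eight weights.
pub fn heel_weighted_distance(distances: &[f64; 8], weights: &[f64; 8]) -> f64 {
    distances.iter().zip(weights).map(|(d, w)| d * w).sum()
}

/// Full pipeline: per-plane Hamming distances, then the weighted sum.
pub fn heel_weighted_hamming(a: &[u64; 8], b: &[u64; 8], weights: &[f64; 8]) -> f64 {
    heel_weighted_distance(&heel_plane_distances(a, b), weights)
}
```

With `HEEL_7PLUS1_WEIGHTS`, differing bits in the eighth plane contribute half as much as those in the first seven.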

https://claude.ai/code/session_01ChLvBfpJS8dQhHxRD4pYNp
Replace raw std::arch intrinsics with crate::simd::F64x8 polyfill.
Automatic dispatch: AVX-512 (native __m512d) → AVX2 (2×__m256d) → scalar.
Consumer writes crate::simd::F64x8 — polyfill handles tier selection.
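
To make the tier-selection idea concrete, here is a hedged sketch of how such a polyfill could be structured. This is not the actual `crate::simd` code: the `Tier` enum, `detect_tier`, the `F64x8` method names, and `weighted_dot` are hypothetical, and the vector type is backed by a plain array so that only the scalar tier is actually implemented. A real polyfill would presumably branch on the detected tier inside each operation.

```rust
use std::sync::LazyLock;

/// Hypothetical dispatch tiers, in preference order.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Avx512,
    Avx2,
    Scalar,
}

/// Probe CPU features at runtime; non-x86 targets fall through to scalar.
fn detect_tier() -> Tier {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("avx512f") {
            return Tier::Avx512;
        }
        if std::arch::is_x86_feature_detected!("avx2") {
            return Tier::Avx2;
        }
    }
    Tier::Scalar
}

/// Detected once on first use, then cached for the process lifetime.
pub static TIER: LazyLock<Tier> = LazyLock::new(detect_tier);

/// Stand-in for the polyfill type: eight f64 lanes in a plain array.
#[derive(Debug, Clone, Copy)]
pub struct F64x8([f64; 8]);

impl F64x8 {
    pub fn from_array(a: [f64; 8]) -> Self {
        F64x8(a)
    }

    /// Lane-wise multiply (scalar implementation only in this sketch).
    pub fn mul(self, rhs: Self) -> Self {
        let mut out = [0.0f64; 8];
        for i in 0..8 {
            out[i] = self.0[i] * rhs.0[i];
        }
        F64x8(out)
    }

    /// Horizontal sum of all eight lanes.
    pub fn reduce_sum(self) -> f64 {
        self.0.iter().sum()
    }
}

/// Consumer code names only F64x8 and never touches intrinsics directly.
pub fn weighted_dot(d: &[f64; 8], w: &[f64; 8]) -> f64 {
    F64x8::from_array(*d).mul(F64x8::from_array(*w)).reduce_sum()
}
```

The point of the design is that callers like `weighted_dot` stay identical across tiers; only the bodies of the `F64x8` methods would differ.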

Added SIMD cosine kernels using F64x8 FMA:
  cosine_f64_simd() — single-pass dot + norm_a + norm_b via F64x8
  cosine_f32_to_f64_simd() — f32 input, f64 precision cosine
  dot_f64_simd() — F64x8 FMA dot product on f64 slices
  sum_sq_f64_simd() — F64x8 sum of squares

12 tests passing (6 HEEL + 6 cosine).
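
The single-pass structure of the cosine kernels can be sketched in plain scalar Rust as follows. This is an illustration rather than the merged code: the function names drop the `_simd` suffix to mark them as hypothetical, eight array slots stand in for the lanes of an F64x8 register, and returning 0.0 for a zero-norm input is an assumption.

```rust
/// Accumulate dot, |a|^2 and |b|^2 in one pass over the pairs, using eight
/// independent accumulators per quantity to mimic eight SIMD lanes.
fn cosine_from_pairs(pairs: impl Iterator<Item = (f64, f64)>) -> f64 {
    let mut dot = [0.0f64; 8];
    let mut norm_a = [0.0f64; 8];
    let mut norm_b = [0.0f64; 8];
    for (i, (x, y)) in pairs.enumerate() {
        let lane = i % 8;
        // mul_add computes x * y + acc as a fused multiply-add.
        dot[lane] = x.mul_add(y, dot[lane]);
        norm_a[lane] = x.mul_add(x, norm_a[lane]);
        norm_b[lane] = y.mul_add(y, norm_b[lane]);
    }
    // Horizontal reduction of the lanes.
    let d: f64 = dot.iter().sum();
    let a2: f64 = norm_a.iter().sum();
    let b2: f64 = norm_b.iter().sum();
    let denom = (a2 * b2).sqrt();
    // Assumption: a zero vector yields similarity 0.0 rather than NaN.
    if denom == 0.0 {
        0.0
    } else {
        d / denom
    }
}

/// Cosine similarity of two f64 slices of equal length.
pub fn cosine_f64(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len(), "slices must have equal length");
    cosine_from_pairs(a.iter().zip(b).map(|(&x, &y)| (x, y)))
}

/// f32 inputs widened to f64 before any arithmetic, so accumulation
/// happens at f64 precision.
pub fn cosine_f32_to_f64(a: &[f32], b: &[f32]) -> f64 {
    assert_eq!(a.len(), b.len(), "slices must have equal length");
    cosine_from_pairs(a.iter().zip(b).map(|(&x, &y)| (f64::from(x), f64::from(y))))
}
```

Computing all three sums in one loop reads each input element once, which is the main advantage over separate calls to a dot product and two sum-of-squares routines.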

https://claude.ai/code/session_01ChLvBfpJS8dQhHxRD4pYNp
@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

@AdaWorldAPI AdaWorldAPI merged commit 3bc3651 into master Apr 3, 2026
5 of 14 checks passed
2 participants