add bulk off-heap scoring for uint8 quantized vectors using panama vector api by iprithv · Pull Request #16203 · apache/lucene

iprithv · 2026-06-05T11:38:03Z

added 4-wide bulk dot product and square distance for uint8 quantized vectors using panama vector api with memorysegment-based access. reduced reduceLanes calls by 4x which helps under icache pressure during hnsw graph traversal.
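For readers who want the shape of the 4-wide bulk path, here is a minimal scalar sketch: one query is scored against four candidates in a single pass, with one accumulator per candidate. It is not the code in this PR, which uses the Panama Vector API over MemorySegment. The class name, method names, and plain `byte[]` inputs are assumptions made for illustration.

```java
import java.util.Arrays;

// Scalar illustration only. The PR's implementation is SIMD (Panama Vector API)
// and reads from MemorySegment; the names and byte[] inputs here are hypothetical.
final class BulkUint8DotSketch {

  /** Scores one query against four candidates in a single pass over the dimensions. */
  static int[] dotProduct4(byte[] q, byte[] v0, byte[] v1, byte[] v2, byte[] v3) {
    int s0 = 0, s1 = 0, s2 = 0, s3 = 0; // one accumulator per candidate
    for (int i = 0; i < q.length; i++) {
      int qi = q[i] & 0xFF; // widen as unsigned; read once, reused for all four
      s0 += qi * (v0[i] & 0xFF);
      s1 += qi * (v1[i] & 0xFF);
      s2 += qi * (v2[i] & 0xFF);
      s3 += qi * (v3[i] & 0xFF);
    }
    return new int[] {s0, s1, s2, s3};
  }

  /** Single-vector baseline for comparison. */
  static int dotProduct(byte[] q, byte[] v) {
    int sum = 0;
    for (int i = 0; i < q.length; i++) {
      sum += (q[i] & 0xFF) * (v[i] & 0xFF);
    }
    return sum;
  }

  public static void main(String[] args) {
    byte[] q = {1, 2, 3, (byte) 200};
    byte[] a = {1, 1, 1, 1};
    byte[] b = {0, 0, 0, (byte) 255};
    byte[] c = {2, 0, 0, 0};
    System.out.println(Arrays.toString(dotProduct4(q, a, b, c, q)));
  }
}
```

Each query element is loaded and widened once per dimension and reused for all four candidates, which is the part of the bulk layout that carries over from this scalar form to the SIMD one.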

benchmarks (amd ryzen 7 7800x3d, avx-512):

float32 bulk: ~1.5x speedup
uint8 bulk (clean icache): ~5-7% slower
uint8 bulk (polluted icache): ~2x faster

related: #15155 #15257 #14980
also I tried this for int4 bulk but it was ~2.3x slower than single-vector scoring on avx-512 due to nibble unpacking overhead.
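To show where the nibble unpacking cost comes from, here is a minimal scalar sketch of a dot product against packed 4-bit values. The layout (two values per byte, low nibble first) and all names are assumptions for illustration; Lucene's actual int4 layout may differ, and this is not the code that was benchmarked.

```java
// Hypothetical layout: value 2*i sits in the low nibble of byte i,
// value 2*i+1 in the high nibble. Not necessarily Lucene's packing.
final class Int4UnpackSketch {

  /** Packs an even number of 4-bit values (0..15) two per byte. */
  static byte[] pack(byte[] values) {
    byte[] packed = new byte[values.length / 2];
    for (int i = 0; i < packed.length; i++) {
      int lo = values[2 * i] & 0x0F;
      int hi = values[2 * i + 1] & 0x0F;
      packed[i] = (byte) (lo | (hi << 4));
    }
    return packed;
  }

  /** Dot product of an unpacked query against a packed candidate. */
  static int dotProductPacked(byte[] query, byte[] packed) {
    int sum = 0;
    for (int i = 0; i < packed.length; i++) {
      // Every packed byte costs a mask and a shift before any multiply can happen.
      int lo = packed[i] & 0x0F;
      int hi = (packed[i] >>> 4) & 0x0F;
      sum += query[2 * i] * lo + query[2 * i + 1] * hi;
    }
    return sum;
  }

  public static void main(String[] args) {
    byte[] packed = pack(new byte[] {1, 15, 7, 0});
    System.out.println(dotProductPacked(new byte[] {2, 3, 4, 5}, packed));
  }
}
```

In a 4-wide bulk variant the mask and shift would be repeated for each of the four candidates, which is consistent with the overhead described above.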

…ctor API

Adds 4-wide bulk dot product and square distance operations for uint8 quantized vectors using the Panama Vector API with MemorySegment-based data access. This reduces reduceLanes calls by 4x compared to single-vector scoring, which helps under icache pressure during HNSW graph traversal.

Key implementation details:
- Uint8DotProduct and Uint8SqrDistance inner classes in MemorySegmentBulkVectorOps with platform-specific widening:
  - 128-bit: bytes -> shorts -> ints (2 convertShape parts)
  - 256-bit: bytes -> ints directly
  - 512-bit: bytes -> shorts -> ints (single part)
- UINT8_NEEDS_PART1 guard prevents convertShape(..., 1) call on 512-bit where all 16 shorts fit in 16 ints in a single part
- Int4 bulk scoring is explicitly not implemented; int4 scorers fall back to the existing single-vector Panama/Native paths via !isUint8() guard in bulkScoreBody

Tests cover: basic uint8 (int7), large dimensions (128), odd dimensions (97), SIMD boundaries (15/16/17), tail paths (0-3 nodes), updateable scorer (MemorySegment query), and int4 fallback verification.

Benchmarks (AMD Ryzen 7 7800X3D, AVX-512):
- Float32 bulk: ~1.5x speedup (reference)
- Uint8 bulk (clean icache): ~5-7% regression
- Uint8 bulk (polluted icache): ~2x speedup

Int4 bulk was evaluated but benchmarked ~2.3x slower than single-vector and has been removed from this PR.
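The tail paths (0-3 nodes) mentioned in the test list can be pictured with a minimal scalar sketch: nodes are scored in groups of four, and whatever is left over goes through the single-vector routine. The flat `byte[]` storage, the ordinal-based addressing, and every name below are assumptions for illustration, not the PR's API. The `& 0xFF` masks stand in for the unsigned widening that the SIMD code does with convertShape.

```java
// Scalar illustration only; storage layout and names are hypothetical.
final class BulkTailSketch {

  /** Single-vector square distance between q and the vector at ordinal ord. */
  static int sqrDistance(byte[] q, byte[] data, int ord) {
    int base = ord * q.length;
    int sum = 0;
    for (int j = 0; j < q.length; j++) {
      int d = (q[j] & 0xFF) - (data[base + j] & 0xFF); // unsigned widening
      sum += d * d;
    }
    return sum;
  }

  /** Scores the first count nodes: groups of four first, then a 0-3 node tail. */
  static void bulkSqrDistance(byte[] q, byte[] data, int[] nodes, int count, int[] out) {
    int dim = q.length;
    int limit = count & ~3; // largest multiple of 4 that is <= count
    int i = 0;
    for (; i < limit; i += 4) {
      int b0 = nodes[i] * dim, b1 = nodes[i + 1] * dim;
      int b2 = nodes[i + 2] * dim, b3 = nodes[i + 3] * dim;
      int s0 = 0, s1 = 0, s2 = 0, s3 = 0;
      for (int j = 0; j < dim; j++) {
        int qj = q[j] & 0xFF;
        int d0 = qj - (data[b0 + j] & 0xFF);
        int d1 = qj - (data[b1 + j] & 0xFF);
        int d2 = qj - (data[b2 + j] & 0xFF);
        int d3 = qj - (data[b3 + j] & 0xFF);
        s0 += d0 * d0;
        s1 += d1 * d1;
        s2 += d2 * d2;
        s3 += d3 * d3;
      }
      out[i] = s0;
      out[i + 1] = s1;
      out[i + 2] = s2;
      out[i + 3] = s3;
    }
    for (; i < count; i++) { // tail: 0 to 3 remaining nodes
      out[i] = sqrDistance(q, data, nodes[i]);
    }
  }
}
```

A check in the spirit of the listed tests is to run `bulkSqrDistance` for every `count` from 0 up to a small maximum and compare each output with `sqrDistance` on the same node.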

github-actions Bot added the module:core/other label Jun 5, 2026

github-actions Bot added this to the 10.5.0 milestone Jun 5, 2026

iprithv force-pushed the bulk-offheap-scoring branch 4 times, most recently from 2336b8b to 94e8a44 Compare June 5, 2026 18:29

iprithv changed the title ~~Add bulk off-heap scoring for quantized vectors using Panama Vector API.~~ add bulk off-heap scoring for uint8 quantized vectors using panama vector api Jun 5, 2026

iprithv force-pushed the bulk-offheap-scoring branch 3 times, most recently from c47bf55 to 8a127b8 Compare June 5, 2026 18:54

iprithv marked this pull request as ready for review June 5, 2026 19:09

iprithv force-pushed the bulk-offheap-scoring branch from 8a127b8 to fdab794 Compare June 5, 2026 19:11

iprithv wants to merge 1 commit into apache:main from iprithv:bulk-offheap-scoring
