Replace AlignedBoxWithSlice with Vec in PQScratch and disk fp vector caches#960
Merged
Replace AlignedBoxWithSlice with Vec in PQScratch and disk fp vector caches#960
AlignedBoxWithSlice with Vec in PQScratch and disk fp vector caches#960Conversation
- Remove dead field `aligned_query_float` (written in `set()` but never read). - Replace `rotated_query: AlignedBoxWithSlice<f32>` with `Vec<f32>`: the downstream SIMD consumers (`populate_chunk_distances` via `diskann-vector`, `TransposedTable::process_into` via `diskann-quantization`) use `_mm256_loadu_ps` / `read_unaligned`, so the 32-byte AVX-aligned-load alignment is not load-bearing. - Rename `PQScratch::new`'s `aligned_dim` parameter to `dim`: the production caller passes `graph_header.metadata().dims` (raw dim), no padding applied. - Drop the unused `norm: f32` parameter from `PQScratch::set`. The only production caller passed `1.0_f32`, so the `query_val / norm` branch was never taken. Simplify the body to a single `copy_from_slice` after `T::as_f32` bulk conversion. The remaining three `AlignedBoxWithSlice` buffers (128-byte alignment for the L2 Adjacent Cache Line Prefetcher) are left untouched — that claim needs benchmarking before being removed.
Contributor
There was a problem hiding this comment.
Pull request overview
This PR simplifies PQ query scratch handling in diskann-disk by removing unused scratch fields and alignment assumptions, and by tightening the PQScratch API to match actual production usage.
Changes:
- Remove the unused
aligned_query_floatbuffer and drop the unusednormargument fromPQScratch::set. - Replace
rotated_queryfromAlignedBoxWithSlice<f32>toVec<f32>and update callers/tests accordingly. - Rename
PQScratch::newparameter fromaligned_dimtodimand adjust call sites.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
diskann-disk/src/search/provider/disk_provider.rs |
Updates the PQScratch::set call site to match the new signature (drops norm). |
diskann-disk/src/search/pq/pq_scratch.rs |
Removes dead scratch field, switches rotated_query storage to Vec<f32>, and simplifies set() implementation and tests. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
The refactor in the previous commit used `self.rotated_query[..dim]` as the destination for `copy_from_slice`, assuming `dim` equaled the decompressed f32 length. That holds for `T = f32/f16/i8/u8` (per-element conversion) but breaks for `T = MinMaxElement`, where `T::as_f32` decompresses each element into multiple f32s — the source length differs from `dim`. Size the destination slice by the actual f32 length returned by `as_f32`. Matches the original loop semantics (write indices 0..query.len()). Caught by test_disk_minmax_index_builder::use_sharded_build_2_true.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #960 +/- ##
==========================================
- Coverage 89.31% 89.31% -0.01%
==========================================
Files 447 447
Lines 83260 83250 -10
==========================================
- Hits 74367 74354 -13
- Misses 8893 8896 +3
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
added 3 commits
April 20, 2026 12:18
Return DimensionMismatchError (matching the function's ANNResult signature) when dim > query.len() or when the decompressed f32 vector does not fit in rotated_query. Previously both conditions would panic via slice indexing and copy_from_slice.
DiskVertexProvider::aligned_vector_buf and Cache::vectors used AlignedBoxWithSlice<T> with FP_VECTOR_MEM_ALIGN (32 bytes) as a leftover from AVX aligned loads (_mm256_load_ps). diskann-vector now uses unaligned loads (_mm256_loadu_ps), so the alignment no longer carries weight. Neither buffer participates in concurrent insert+search on the same buffer (aligned_vector_buf is per-scratch-exclusive, Cache::vectors is read-only after warmup), so the 64-byte cache-line rationale used by FastMemoryVectorProviderAsync does not apply either. Replace both with plain Vec<T> and remove the now-unused FP_VECTOR_MEM_ALIGN constant.
memory_aligned_dimension = dims.next_multiple_of(8) was paired with the now-removed FP_VECTOR_MEM_ALIGN to feed an 8-wide AVX aligned-load SIMD kernel (_mm256_load_ps, 8 f32 at a time). SIMD has since moved to _mm256_loadu_ps and the buffer-start alignment has been removed, so the padding no longer serves any hardware purpose — it only makes get_vector return a slice longer than the raw dim, which the distance kernel silently tolerates because the padding is zero. Replace memory_aligned_dimension with raw dim on DiskVertexProvider and on the Cache it feeds via disk_vertex_provider_factory. Rename the field aligned_vector_buf -> vector_buf, drop the unused public memory_aligned_dimension() accessor, and size vector_buf and Cache::vectors to max_batch_size * dim / capacity * dim respectively.
set() in diskann-diskAlignedBoxWithSlice with Vec in PQScratch and disk fp vector caches
JordanMaples
approved these changes
Apr 20, 2026
metajack
approved these changes
Apr 20, 2026
6 tasks
Merged
arkrishn94
added a commit
that referenced
this pull request
Apr 22, 2026
Bumping to 0.50.1 to propagate changes to consumers. Changes since previous bump: ## What's Changed * Add more agentic guard rails by @hildebrandmw in #871 * Cleanup `diskann-benchmark-runner` and friends. by @hildebrandmw in #865 * Use `--all-targets` for the no-default-features CI run. by @hildebrandmw in #874 * Remove unused `normalizing_util.rs` from `diskann-providers` by @Copilot in #902 * Benchmark Support for A/B Tests by @hildebrandmw in #900 * [diskann-garnet] Bump diskann-garnet to 1.0.26 by @tiagonapoli in #925 * Remove the `AdjacencyList` from `diskann-providers` by @hildebrandmw in #915 * [PQ cleanup] Part 1: Move pq_scratch, quantizer_preprocess and pq_dataset to `diskann-disk` by @arkrishn94 in #930 * Forbid Debug in diskann-benchmark by @arrayka in #914 * Remove DebugProvider by @JordanMaples in #923 * [diskann-garnet] Create workflow to publish to nuget by @tiagonapoli in #926 * Move k-means implementation from diskann-providers to diskann-disk by @Copilot in #933 * Inline minmax distance evaluations by @arkrishn94 in #935 * Use `rust-toolchain.toml` in CI by @hildebrandmw in #934 * Add a globally blocking CI gate. by @hildebrandmw in #932 * Remove `utils/math_util.rs` from `diskann-providers` by @Copilot in #921 * Bump rand from 0.9.2 to 0.9.3 by @dependabot[bot] in #945 * Remove OPQ and friends by @arkrishn94 in #947 * Migrate test_flaky_consolidate from diskann_providers to diskann by @JordanMaples in #942 * Remove GraphDataType from diskann-providers by @wuw92 in #950 * Remove unused method extract_best_l_candidates in NeighborPriorityQueue by @doliawu in #951 * Add `Debug` bounds to `VectorRepr`'s distance GATs. by @hildebrandmw in #948 * Add benchmark pipeline with Rust-native A/B validation by @YuanyuanTian-hh in #912 * Remove unnecessary `Default` bound from `Neighbor`'s `VectorIdType` by @doliawu in #956 * Replace `AlignedBoxWithSlice` with plain `Vec` / `Matrix` where alignment is unused by @wuw92 in #955 * [minmax] 8-bit benchmark by @arkrishn94 in #959 * Add `MultiInsertStrategy` implementations for `BfTreeProvider` by @hildebrandmw in #949 * Replace `AlignedBoxWithSlice` with `Vec` in PQScratch and disk fp vector caches by @wuw92 in #960 * Adding unit tests for paged_search by @JordanMaples in #962 * Remove AlignedBoxWithSlice wrapper and add alias to Poly<[T], AlignedAllocator> by @JordanMaples in #965 * Remove synthetic/structured data generation from diskann-providers by @JordanMaples in #963 * added tests and some baselines for range_search by @JordanMaples in #961 ## New Contributors * @JordanMaples made their first contribution in #923 * @wuw92 made their first contribution in #950 * @doliawu made their first contribution in #951 * @YuanyuanTian-hh made their first contribution in #912 **Full Changelog**: v0.50.0...v0.50.1
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Replaces
AlignedBoxWithSlicewith plainVecfor three disk-side buffers whose consumers don't require alignment.1.
PQScratch.rotated_query(diskann-disk/src/search/pq/pq_scratch.rs)AlignedBoxWithSlice<f32>withVec<f32>. Consumers (populate_chunk_distancesviadiskann-vectorSIMD,TransposedTable::process_intoviadiskann-quantizationscalar PQ table lookups) handle unaligned / arbitrary-length data, so the alignment is not load-bearing.PQScratch::setnow toleratesquery.len() >= dim, following the inmemTableL2::populateconvention, instead of requiring strict equality.2.
DiskVertexProvider::aligned_vector_bufandCache::vectors(FP_VECTOR_MEM_ALIGN)AlignedBoxWithSlice<T>(32-byteFP_VECTOR_MEM_ALIGN) with plainVec<T>. The fp distance kernels reached viaget_vectorgo throughdiskann-vectorSIMD, which handles unaligned data, so the alignment is not load-bearing.FastMemoryVectorProviderAsync-style 64-byte cache-line alignment:aligned_vector_bufis per-scratch-exclusive;Cache::vectorsis built once via BFS then shared read-only throughArc<Cache>.memory_aligned_dimension = dims.next_multiple_of(8)padding — it was paired with the 32-byte alignment to feed 8-wide AVX tiles and is no longer justified.DiskVertexProviderandCachenow size slots to rawdim;get_vectorreturns a slice of lengthdiminstead of a padded length.