Replace AlignedBoxWithSlice with Vec in PQScratch and disk fp vector caches #960

Merged
wuw92 merged 7 commits into main from wewu2/cleanup-pq-scratch-rotated-query on Apr 21, 2026
Conversation

@wuw92 wuw92 commented Apr 20, 2026

Summary

Replaces AlignedBoxWithSlice with plain Vec for three disk-side buffers whose consumers don't require alignment.

1. PQScratch.rotated_query (diskann-disk/src/search/pq/pq_scratch.rs)

  • Replace AlignedBoxWithSlice<f32> with Vec<f32>. Consumers (populate_chunk_distances via diskann-vector SIMD, TransposedTable::process_into via diskann-quantization scalar PQ table lookups) handle unaligned / arbitrary-length data, so the alignment is not load-bearing.
  • PQScratch::set now tolerates query.len() >= dim, following the inmem TableL2::populate convention, instead of requiring strict equality.
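The relaxed length check described above can be sketched as follows. This is a minimal illustration, not the crate's code: `Scratch`, `SetError`, and the `set(dim, query)` signature are hypothetical stand-ins, and the sketch assumes that trailing elements beyond `dim` are ignored and that a too-short query is reported as an error instead of panicking.

```rust
/// Hypothetical error type; the real crate reports a dimension mismatch through its own result type.
#[derive(Debug, PartialEq)]
enum SetError {
    DimensionMismatch { expected: usize, actual: usize },
}

/// Hypothetical stand-in for the scratch space: a plain `Vec<f32>`, no alignment wrapper.
struct Scratch {
    rotated_query: Vec<f32>,
}

impl Scratch {
    fn new(dim: usize) -> Self {
        Self { rotated_query: vec![0.0; dim] }
    }

    /// Accepts any query with at least `dim` elements (assumption: extras are ignored).
    fn set(&mut self, dim: usize, query: &[f32]) -> Result<(), SetError> {
        if query.len() < dim {
            // Too-short query: return an error instead of panicking on slice indexing.
            return Err(SetError::DimensionMismatch { expected: dim, actual: query.len() });
        }
        if dim > self.rotated_query.len() {
            // The scratch buffer itself is too small for the requested dimension.
            return Err(SetError::DimensionMismatch {
                expected: dim,
                actual: self.rotated_query.len(),
            });
        }
        self.rotated_query[..dim].copy_from_slice(&query[..dim]);
        Ok(())
    }
}
```

With a 4-dimensional scratch, a 6-element query fills the first four slots, while a 2-element query yields `DimensionMismatch { expected: 4, actual: 2 }`.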

2. DiskVertexProvider::aligned_vector_buf and Cache::vectors (FP_VECTOR_MEM_ALIGN)

  • Replace AlignedBoxWithSlice<T> (32-byte FP_VECTOR_MEM_ALIGN) with plain Vec<T>. The fp distance kernels reached via get_vector go through diskann-vector SIMD, which handles unaligned data, so the alignment is not load-bearing.
  • Neither buffer has the concurrent insert+search pattern that justifies the FastMemoryVectorProviderAsync-style 64-byte cache-line alignment: aligned_vector_buf is per-scratch-exclusive; Cache::vectors is built once via BFS then shared read-only through Arc<Cache>.
  • Drop the companion memory_aligned_dimension = dims.next_multiple_of(8) padding — it was paired with the 32-byte alignment to feed 8-wide AVX tiles and is no longer justified. DiskVertexProvider and Cache now size slots to raw dim; get_vector returns a slice of length dim instead of a padded length.
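The build-once, share-read-only pattern described for `Cache::vectors` can be sketched as below. The `Cache` type here is a hypothetical simplification (a flat `Vec<f32>` with `dim` values per slot and a `push` helper that is not part of the PR); it only illustrates why a plain `Vec` behind an `Arc` is enough when no writer runs concurrently with readers.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for the disk-side cache: one flat buffer, `dim` values per slot.
struct Cache {
    dim: usize,
    vectors: Vec<f32>,
}

impl Cache {
    fn new(dim: usize, capacity: usize) -> Self {
        // Slots are sized to the raw `dim`; no padding to a multiple of 8.
        Self { dim, vectors: Vec::with_capacity(capacity * dim) }
    }

    /// Illustrative helper: append one vector while the cache is still being built.
    fn push(&mut self, v: &[f32]) {
        assert_eq!(v.len(), self.dim);
        self.vectors.extend_from_slice(v);
    }

    /// Returns a slice of exactly `dim` elements.
    fn get_vector(&self, slot: usize) -> &[f32] {
        &self.vectors[slot * self.dim..(slot + 1) * self.dim]
    }
}

/// Build the cache on one thread, then read it from several threads through `Arc`.
fn build_and_share() -> Vec<f32> {
    let dim = 5; // deliberately not a multiple of 8
    let mut cache = Cache::new(dim, 3);
    for slot in 0..3 {
        let v: Vec<f32> = (0..dim).map(|i| (slot * 10 + i) as f32).collect();
        cache.push(&v);
    }

    // After this point the cache is immutable; readers only need `&Cache`.
    let cache = Arc::new(cache);
    let handles: Vec<_> = (0..3)
        .map(|slot| {
            let cache = Arc::clone(&cache);
            thread::spawn(move || cache.get_vector(slot).iter().sum::<f32>())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Each reader thread sums its own slot, so `build_and_share()` returns one sum per slot in slot order.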

- Remove dead field `aligned_query_float` (written in `set()` but never read).
- Replace `rotated_query: AlignedBoxWithSlice<f32>` with `Vec<f32>`:
  the downstream SIMD consumers (`populate_chunk_distances` via
  `diskann-vector`, `TransposedTable::process_into` via `diskann-quantization`)
  use `_mm256_loadu_ps` / `read_unaligned`, so the 32-byte AVX-aligned-load
  alignment is not load-bearing.
- Rename `PQScratch::new`'s `aligned_dim` parameter to `dim`: the production
  caller passes `graph_header.metadata().dims` (raw dim), no padding applied.
- Drop the unused `norm: f32` parameter from `PQScratch::set`. The only
  production caller passed `1.0_f32`, so the `query_val / norm` branch was
  never taken. Simplify the body to a single `copy_from_slice` after
  `T::as_f32` bulk conversion.
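To illustrate why unaligned loads make the buffer alignment irrelevant, here is a small sketch that reads eight `f32` values from an arbitrary offset inside a plain `Vec<f32>`. The function `sum8` is hypothetical and is not taken from `diskann-vector`; it only uses the standard `_mm256_loadu_ps` / `_mm256_storeu_ps` intrinsics, with a scalar fallback on other targets or when AVX is unavailable.

```rust
/// Sum eight consecutive `f32` values starting at `data[offset]`.
/// Panics if fewer than eight elements are available from `offset`.
fn sum8(data: &[f32], offset: usize) -> f32 {
    let lane = &data[offset..offset + 8]; // bounds-checked slice of exactly 8 values

    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") {
            use std::arch::x86_64::{_mm256_loadu_ps, _mm256_storeu_ps};
            let mut out = [0.0f32; 8];
            // SAFETY: AVX support was checked at runtime, `lane` and `out` each hold
            // exactly 8 f32 values, and the `loadu`/`storeu` variants place no
            // alignment requirement on the pointers.
            unsafe {
                let v = _mm256_loadu_ps(lane.as_ptr());
                _mm256_storeu_ps(out.as_mut_ptr(), v);
            }
            return out.iter().sum();
        }
    }

    // Scalar fallback for non-x86_64 targets or CPUs without AVX.
    lane.iter().sum()
}
```

Calling `sum8(&data, 3)` starts 12 bytes past the start of the allocation, an address that is generally not 32-byte aligned, yet the load is still valid because it is the unaligned variant.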

The remaining three `AlignedBoxWithSlice` buffers (128-byte alignment for
the L2 Adjacent Cache Line Prefetcher) are left untouched — that claim
needs benchmarking before being removed.
@wuw92 wuw92 requested review from a team and Copilot April 20, 2026 02:17
@wuw92 wuw92 self-assigned this Apr 20, 2026
@wuw92 wuw92 moved this to In Progress in DiskANN backlog Apr 20, 2026
@wuw92 wuw92 added this to the 2026-04 milestone Apr 20, 2026
Copilot AI left a comment

Pull request overview

This PR simplifies PQ query scratch handling in diskann-disk by removing unused scratch fields and alignment assumptions, and by tightening the PQScratch API to match actual production usage.

Changes:

  • Remove the unused aligned_query_float buffer and drop the unused norm argument from PQScratch::set.
  • Change rotated_query from AlignedBoxWithSlice<f32> to Vec<f32> and update callers/tests accordingly.
  • Rename PQScratch::new parameter from aligned_dim to dim and adjust call sites.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| diskann-disk/src/search/provider/disk_provider.rs | Updates the PQScratch::set call site to match the new signature (drops norm). |
| diskann-disk/src/search/pq/pq_scratch.rs | Removes dead scratch field, switches rotated_query storage to Vec<f32>, and simplifies set() implementation and tests. |

Comment thread diskann-disk/src/search/pq/pq_scratch.rs Outdated
Comment thread diskann-disk/src/search/pq/pq_scratch.rs Outdated
wuw92 and others added 2 commits April 20, 2026 10:25
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
The refactor in the previous commit used `self.rotated_query[..dim]` as the
destination for `copy_from_slice`, assuming `dim` equaled the decompressed
f32 length. That holds for `T = f32/f16/i8/u8` (per-element conversion) but
breaks for `T = MinMaxElement`, where `T::as_f32` decompresses each element
into multiple f32s — the source length differs from `dim`.

Size the destination slice by the actual f32 length returned by `as_f32`.
Matches the original loop semantics (write indices 0..query.len()).

Caught by test_disk_minmax_index_builder::use_sharded_build_2_true.
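The fix described in this commit message can be sketched as follows. Everything here is illustrative: the `AsF32` trait stands in for the real `T::as_f32`, and `PackedPair` is a made-up element type that expands into two `f32` values; it does not reflect the actual `MinMaxElement` layout. The point is only that the destination slice is sized by the converted length, not by the number of source elements.

```rust
/// Hypothetical conversion trait standing in for the crate's `T::as_f32`.
trait AsF32: Sized {
    fn as_f32(src: &[Self]) -> Vec<f32>;
}

/// Per-element conversion: output length equals input length.
impl AsF32 for u8 {
    fn as_f32(src: &[Self]) -> Vec<f32> {
        src.iter().map(|&x| x as f32).collect()
    }
}

/// Made-up packed element: each byte decompresses into two f32 values
/// (low nibble, then high nibble). Illustrative only.
#[derive(Clone, Copy)]
struct PackedPair(u8);

impl AsF32 for PackedPair {
    fn as_f32(src: &[Self]) -> Vec<f32> {
        src.iter()
            .flat_map(|p| [(p.0 & 0x0F) as f32, (p.0 >> 4) as f32])
            .collect()
    }
}

/// Copy the converted query into the scratch buffer and return how many
/// f32 values were written. The destination is sized by `converted.len()`.
/// Panics if `rotated_query` is shorter than the converted query.
fn fill_rotated_query<T: AsF32>(rotated_query: &mut [f32], query: &[T]) -> usize {
    let converted = T::as_f32(query);
    rotated_query[..converted.len()].copy_from_slice(&converted);
    converted.len()
}
```

With two `PackedPair` elements the function writes four values, whereas slicing the destination to the source element count (two) would make `copy_from_slice` panic on the length mismatch.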
codecov-commenter commented Apr 20, 2026

Codecov Report

❌ Patch coverage is 72.72727% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.31%. Comparing base (e497ed8) to head (f316071).
⚠️ Report is 8 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| diskann-disk/src/search/pq/pq_scratch.rs | 52.63% | 9 Missing ⚠️ |
Additional details and impacted files

@@            Coverage Diff             @@
##             main     #960      +/-   ##
==========================================
- Coverage   89.31%   89.31%   -0.01%     
==========================================
  Files         447      447              
  Lines       83260    83250      -10     
==========================================
- Hits        74367    74354      -13     
- Misses       8893     8896       +3     
| Flag | Coverage Δ | |
| --- | --- | --- |
| miri | 89.31% <72.72%> (-0.01%) | ⬇️ |
| unittests | 89.15% <72.72%> (-0.01%) | ⬇️ |

| Files with missing lines | Coverage Δ | |
| --- | --- | --- |
| diskann-disk/src/data_model/cache.rs | 100.00% <100.00%> (ø) | |
| diskann-disk/src/search/provider/disk_provider.rs | 90.89% <100.00%> (+0.09%) | ⬆️ |
| ...n-disk/src/search/provider/disk_vertex_provider.rs | 85.27% <100.00%> (+1.31%) | ⬆️ |
| ...rc/search/provider/disk_vertex_provider_factory.rs | 95.65% <100.00%> (-0.02%) | ⬇️ |
| diskann-disk/src/search/pq/pq_scratch.rs | 78.57% <52.63%> (-21.43%) | ⬇️ |
Wei Wu (from Dev Box) added 3 commits April 20, 2026 12:18
Return DimensionMismatchError (matching the function's ANNResult
signature) when dim > query.len() or when the decompressed f32 vector
does not fit in rotated_query. Previously both conditions would panic
via slice indexing and copy_from_slice.

DiskVertexProvider::aligned_vector_buf and Cache::vectors used
AlignedBoxWithSlice<T> with FP_VECTOR_MEM_ALIGN (32 bytes) as a leftover
from AVX aligned loads (_mm256_load_ps). diskann-vector now uses
unaligned loads (_mm256_loadu_ps), so the alignment no longer carries
weight. Neither buffer participates in concurrent insert+search on the
same buffer (aligned_vector_buf is per-scratch-exclusive, Cache::vectors
is read-only after warmup), so the 64-byte cache-line rationale used by
FastMemoryVectorProviderAsync does not apply either. Replace both with
plain Vec<T> and remove the now-unused FP_VECTOR_MEM_ALIGN constant.

memory_aligned_dimension = dims.next_multiple_of(8) was paired with the
now-removed FP_VECTOR_MEM_ALIGN to feed an 8-wide AVX aligned-load SIMD
kernel (_mm256_load_ps, 8 f32 at a time). SIMD has since moved to
_mm256_loadu_ps and the buffer-start alignment has been removed, so the
padding no longer serves any hardware purpose — it only makes get_vector
return a slice longer than the raw dim, which the distance kernel silently
tolerates because the padding is zero.
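A short worked example of why zero padding was silently tolerated: for a squared-L2 distance, padded positions contribute `(0 - 0)^2 = 0`, so the result over the padded length equals the result over the raw `dim`. The helpers below are hypothetical and scalar; they are not the `diskann-vector` kernels.

```rust
/// Scalar squared-L2 distance over two equal-length slices.
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Pad a vector with zeros up to the next multiple of 8, mirroring the
/// removed `dims.next_multiple_of(8)` slot size.
fn pad_to_multiple_of_8(v: &[f32]) -> Vec<f32> {
    let mut padded = v.to_vec();
    padded.resize(v.len().next_multiple_of(8), 0.0);
    padded
}
```

For `a = [1, 2, 3, 4, 5]` and `b = [5, 4, 3, 2, 1]`, the distance is `16 + 4 + 0 + 4 + 16 = 40` both before and after padding to length 8. The padding therefore only cost memory and returned a slice longer than `dim`.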

Replace memory_aligned_dimension with raw dim on DiskVertexProvider and on
the Cache it feeds via disk_vertex_provider_factory. Rename the field
aligned_vector_buf -> vector_buf, drop the unused public
memory_aligned_dimension() accessor, and size vector_buf and Cache::vectors
to max_batch_size * dim and capacity * dim, respectively.
@wuw92 wuw92 changed the title Clean up PQScratch.rotated_query and set() in diskann-disk Clean up obsolete AVX-era alignment in diskann-disk search path Apr 20, 2026
@wuw92 wuw92 changed the title Clean up obsolete AVX-era alignment in diskann-disk search path Replace AlignedBoxWithSlice with Vec in PQScratch and disk fp vector caches Apr 20, 2026
@wuw92 wuw92 enabled auto-merge (squash) April 21, 2026 00:21
@wuw92 wuw92 merged commit cfb5927 into main Apr 21, 2026
25 of 26 checks passed
@wuw92 wuw92 deleted the wewu2/cleanup-pq-scratch-rotated-query branch April 21, 2026 00:36
@github-project-automation github-project-automation Bot moved this from In Progress to Done in DiskANN backlog Apr 21, 2026
@wuw92 wuw92 removed this from the 2026-04 milestone Apr 21, 2026
Copilot AI added a commit that referenced this pull request Apr 21, 2026
@wuw92 wuw92 removed this from DiskANN backlog Apr 22, 2026
@arkrishn94 arkrishn94 mentioned this pull request Apr 22, 2026
arkrishn94 added a commit that referenced this pull request Apr 22, 2026
Bumping to 0.50.1 to propagate changes to consumers.

Changes since previous bump: 

## What's Changed
* Add more agentic guard rails by @hildebrandmw in
#871
* Cleanup `diskann-benchmark-runner` and friends. by @hildebrandmw in
#865
* Use `--all-targets` for the no-default-features CI run. by
@hildebrandmw in #874
* Remove unused `normalizing_util.rs` from `diskann-providers` by
@Copilot in #902
* Benchmark Support for A/B Tests by @hildebrandmw in
#900
* [diskann-garnet] Bump diskann-garnet to 1.0.26 by @tiagonapoli in
#925
* Remove the `AdjacencyList` from `diskann-providers` by @hildebrandmw
in #915
* [PQ cleanup] Part 1: Move pq_scratch, quantizer_preprocess and
pq_dataset to `diskann-disk` by @arkrishn94 in
#930
* Forbid Debug in diskann-benchmark by @arrayka in
#914
* Remove DebugProvider by @JordanMaples in
#923
* [diskann-garnet] Create workflow to publish to nuget by @tiagonapoli
in #926
* Move k-means implementation from diskann-providers to diskann-disk by
@Copilot in #933
* Inline minmax distance evaluations by @arkrishn94 in
#935
* Use `rust-toolchain.toml` in CI by @hildebrandmw in
#934
* Add a globally blocking CI gate. by @hildebrandmw in
#932
* Remove `utils/math_util.rs` from `diskann-providers` by @Copilot in
#921
* Bump rand from 0.9.2 to 0.9.3 by @dependabot[bot] in
#945
* Remove OPQ and friends by @arkrishn94 in
#947
* Migrate test_flaky_consolidate from diskann_providers to diskann by
@JordanMaples in #942
* Remove GraphDataType from diskann-providers by @wuw92 in
#950
* Remove unused method extract_best_l_candidates in
NeighborPriorityQueue by @doliawu in
#951
* Add `Debug` bounds to `VectorRepr`'s distance GATs. by @hildebrandmw
in #948
* Add benchmark pipeline with Rust-native A/B validation by
@YuanyuanTian-hh in #912
* Remove unnecessary `Default` bound from `Neighbor`'s `VectorIdType` by
@doliawu in #956
* Replace `AlignedBoxWithSlice` with plain `Vec` / `Matrix` where
alignment is unused by @wuw92 in
#955
* [minmax] 8-bit benchmark by @arkrishn94 in
#959
* Add `MultiInsertStrategy` implementations for `BfTreeProvider` by
@hildebrandmw in #949
* Replace `AlignedBoxWithSlice` with `Vec` in PQScratch and disk fp
vector caches by @wuw92 in #960
* Adding unit tests for paged_search by @JordanMaples in
#962
* Remove AlignedBoxWithSlice wrapper and add alias to Poly<[T],
AlignedAllocator> by @JordanMaples in
#965
* Remove synthetic/structured data generation from diskann-providers by
@JordanMaples in #963
* added tests and some baselines for range_search by @JordanMaples in
#961

## New Contributors
* @JordanMaples made their first contribution in
#923
* @wuw92 made their first contribution in
#950
* @doliawu made their first contribution in
#951
* @YuanyuanTian-hh made their first contribution in
#912

**Full Changelog**:
v0.50.0...v0.50.1
Development

Successfully merging this pull request may close these issues.

Replace AlignedBoxWithSlice with plain Vec/Matrix where alignment is not needed

5 participants