
PQ: tighten dim contract; right-size scratch buffer#1044

Open
wuw92 wants to merge 4 commits into main from wewu2/tighten-pq-populate-contract

Conversation

@wuw92
Contributor

@wuw92 wuw92 commented May 8, 2026

Background

Historically, query buffers came from AlignedBoxWithSlice, which silently rounded length up to a multiple of 8 for SIMD alignment. Downstream populate functions therefore had to accept query.len() >= dim instead of query.len() == dim — pre-PR comment in TableL2::populate:

Alignment means that the size of query gets increased ...
This makes is VERY hard to do error checking on dimension propagation.

With #960 removing AlignedBoxWithSlice from the PQ path, the PQ subtree can now handle dims with a "convert at the boundary, trust internally" idiom.
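To make the background concrete, here is a minimal sketch of why a padded buffer forces a `>=` check. The helpers `padded_len`, `pad_query`, and `old_style_check` are hypothetical stand-ins, not the real `AlignedBoxWithSlice` API, and zero-filling the tail is an assumption.

```rust
/// Round `len` up to the next multiple of `lanes` (8 in the description above).
fn padded_len(len: usize, lanes: usize) -> usize {
    (len + lanes - 1) / lanes * lanes
}

/// Hypothetical padded query buffer: copies `query` and zero-fills the tail
/// (the zero fill is an assumption made for this sketch).
fn pad_query(query: &[f32], lanes: usize) -> Vec<f32> {
    let mut buf = vec![0.0; padded_len(query.len(), lanes)];
    buf[..query.len()].copy_from_slice(query);
    buf
}

/// The only check a downstream function can make once the length is padded:
/// it cannot tell a correct dim from one that is merely smaller than the buffer.
fn old_style_check(query: &[f32], dim: usize) -> bool {
    query.len() >= dim
}
```

A 100-element query becomes a 104-element buffer, and that buffer passes the check for `dim = 100` as well as for an unrelated `dim = 97`, which is the error-checking gap the quoted comment complains about.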

Three-layer dim contract

| Layer | Where | Action | Failure |
| --- | --- | --- | --- |
| Boundary (inmem) | QueryComputer::new, MultiQueryComputer::new, DistanceComputer::evaluate_similarity | Validate len == dim | Result::Err / assert_eq! |
| Boundary (disk) | PQScratch::set | Validate len >= dim, slice [..dim] | Result::Err on undersize |
| Internal | TableL2/IP/Cosine::{new, populate} | Trusted, no re-validation | |
| Inner kernel | preprocess_query, populate_chunk_distances_impl, direct_distance_impl | Contract boundary debug_assert_eq! | dev/CI panic, release zero-cost; also fail-loud for direct OSS callers |

Boundary methods take &[f32]. Quantized inputs are decoded via VectorRepr::as_f32 once at the caller boundary; the PQ subtree is f32-only internally.
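The layering can be sketched roughly as follows. This is an illustration only: `DimMismatch`, `Table`, `kernel_sq_l2`, and `query_distance` are hypothetical names, and the single-centroid table stands in for the real PQ tables.

```rust
/// Hypothetical error type returned by the boundary layer.
#[derive(Debug, PartialEq)]
struct DimMismatch {
    expected: usize,
    actual: usize,
}

/// Inner kernel: checks the contract only in debug builds. In release builds
/// the assertion compiles away and `zip` would stop at the shorter slice.
fn kernel_sq_l2(query: &[f32], centroid: &[f32]) -> f32 {
    debug_assert_eq!(query.len(), centroid.len(), "kernel dim contract violated");
    query.iter().zip(centroid).map(|(a, b)| (a - b) * (a - b)).sum()
}

/// Hypothetical stand-in for a PQ table with a single centroid.
struct Table {
    centroid: Vec<f32>,
}

impl Table {
    fn dim(&self) -> usize {
        self.centroid.len()
    }

    /// Internal layer: trusts that the boundary already validated `query`.
    fn populate(&self, query: &[f32]) -> f32 {
        kernel_sq_l2(query, &self.centroid)
    }
}

/// Boundary layer: the only place that validates and returns an error.
fn query_distance(table: &Table, query: &[f32]) -> Result<f32, DimMismatch> {
    if query.len() != table.dim() {
        return Err(DimMismatch {
            expected: table.dim(),
            actual: query.len(),
        });
    }
    Ok(table.populate(query))
}
```

Callers going through `query_distance` get a recoverable error on a bad length, while anything that calls the kernel directly with mismatched slices panics in debug builds.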

Why entries take &[f32]

Into<f32> is per-element. MinMaxElement<8> is a single byte that can't decode without the full slice's trailing metadata — it cannot implement Into<f32>. Production callers supporting MinMax were therefore always pre-decoding via VectorRepr::as_f32 and passing &[f32] upstream; the previous <U: Into<f32>> generic on PQ entries was effectively orphan-only.
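A small sketch of why a per-element conversion cannot work for this kind of encoding. The slot layout used here (`dim` code bytes followed by a little-endian `f32` min and max) is an assumption made for illustration; it is not the actual `MinMaxElement` layout, and `decode_slot` is a hypothetical helper, not `VectorRepr::as_f32`.

```rust
/// Decode a hypothetical min-max encoded slot laid out as
/// `[code bytes (dim)] [min: f32 LE] [max: f32 LE]`.
/// Returns `None` if the slot is too short to hold the trailing metadata.
fn decode_slot(slot: &[u8]) -> Option<Vec<f32>> {
    // Logical dim is the slot byte count minus 8 bytes of trailing metadata.
    let dim = slot.len().checked_sub(8)?;
    let (codes, meta) = slot.split_at(dim);
    let min = f32::from_le_bytes([meta[0], meta[1], meta[2], meta[3]]);
    let max = f32::from_le_bytes([meta[4], meta[5], meta[6], meta[7]]);
    let scale = (max - min) / 255.0;
    // Every element needs `min` and `scale`, which live outside the element
    // itself, so a single byte has nothing to implement `Into<f32>` with.
    Some(codes.iter().map(|&c| min + f32::from(c) * scale).collect())
}
```

The decoded vector is shorter than the slot (`slot.len() - 8` elements in this sketch), which mirrors the distinction between slot byte count and PQ logical dim drawn elsewhere in this PR.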

Follow-up to @hildebrandmw's review on #960.

- `PQScratch::rotated_query` is sized by `PQData::get_dim()` (PQ
  logical dim) instead of `graph_header.metadata().dims` (slot byte
  count, exceeds logical dim for `MinMaxElement` due to trailing
  min/max metadata).
- PQ entries take `&[f32]`, accept `len >= dim`, slice `[..dim]`.
  Callers decode via `VectorRepr::as_f32` once at the boundary;
  PQ subtree is f32-only internally.
- Kernels (`preprocess_query`, `populate_chunk_distances_impl`,
  `direct_distance_impl`) `debug_assert_eq!` on entry, matching
  `pq_dist_lookup_single`. The two `_impl` helpers become private.
- `DirectCosine::populate` uses `copy_from_slice` (the previous zip
  silently truncated, no longer applicable).
- Drop redundant `Copy` and `U: Into<f32>` bounds on touched fns.
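For the `copy_from_slice` bullet, the difference between the two copy styles is standard-library behavior and can be shown in isolation. The helper names `copy_zip` and `copy_strict` are hypothetical; they are not the code in `DirectCosine::populate`.

```rust
/// Copy via `zip`: iteration stops at the shorter side, so a length
/// mismatch is silently ignored and the tail of `dst` is left untouched.
fn copy_zip(dst: &mut [f32], src: &[f32]) {
    for (d, s) in dst.iter_mut().zip(src) {
        *d = *s;
    }
}

/// Copy via `copy_from_slice`: panics if the two lengths differ,
/// so a dim mismatch cannot pass unnoticed.
fn copy_strict(dst: &mut [f32], src: &[f32]) {
    dst.copy_from_slice(src);
}
```

With a 4-element destination and a 2-element source, `copy_zip` returns normally after writing two elements, while `copy_strict` panics.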
@codecov-commenter

codecov-commenter commented May 8, 2026

Codecov Report

❌ Patch coverage is 99.20635% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 90.60%. Comparing base (c50fb2b) to head (bc20758).
⚠️ Report is 5 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| diskann-disk/src/search/pq/quantizer_preprocess.rs | 80.00% | 1 Missing ⚠️ |

@@            Coverage Diff             @@
##             main    #1044      +/-   ##
==========================================
+ Coverage   89.51%   90.60%   +1.08%     
==========================================
  Files         460      461       +1     
  Lines       85466    85678     +212     
==========================================
+ Hits        76508    77631    +1123     
+ Misses       8958     8047     -911     
| Flag | Coverage Δ |
| --- | --- |
| miri | 90.60% <99.20%> (+1.08%) ⬆️ |
| unittests | 90.56% <99.20%> (+1.20%) ⬆️ |


| Files with missing lines | Coverage Δ |
| --- | --- |
| diskann-disk/src/search/pq/pq_scratch.rs | 89.18% <100.00%> (+9.64%) ⬆️ |
| diskann-disk/src/search/provider/disk_provider.rs | 90.88% <100.00%> (-0.01%) ⬇️ |
| diskann-disk/src/storage/quant/pq/pq_dataset.rs | 97.33% <100.00%> (+0.19%) ⬆️ |
| ...aph/provider/async_/experimental/multi_pq_async.rs | 96.40% <100.00%> (ø) |
| ...ovider/async_/fast_memory_quant_vector_provider.rs | 98.46% <100.00%> (ø) |
| ...ph/provider/async_/memory_quant_vector_provider.rs | 98.26% <100.00%> (ø) |
| diskann-providers/src/model/pq/distance/cosine.rs | 98.76% <100.00%> (-1.24%) ⬇️ |
| diskann-providers/src/model/pq/distance/dynamic.rs | 95.59% <100.00%> (+4.29%) ⬆️ |
| ...nn-providers/src/model/pq/distance/innerproduct.rs | 100.00% <100.00%> (ø) |
| diskann-providers/src/model/pq/distance/l2.rs | 100.00% <100.00%> (ø) |

... and 4 more

... and 54 files with indirect coverage changes

@wuw92 wuw92 changed the title from "PQ: size scratch by logical dim, formalize dim contract" to "PQ: tighten dim contract; right-size scratch buffer" May 8, 2026
PR moved entries to &[f32] so test_X_inner helpers no longer need to
generate per-T data — drop the type parameter, generate Vec<f32>
directly. Removes turbofish at all call sites and the rstest values
parameterization.
@wuw92 wuw92 marked this pull request as ready for review May 8, 2026 08:06
@wuw92 wuw92 requested review from a team and Copilot May 8, 2026 08:06
Contributor

Copilot AI left a comment

Pull request overview

This PR tightens and clarifies the PQ query-dimension contract by moving conversion/validation to boundary APIs, making PQ internals f32-only, and sizing scratch buffers to the PQ table’s logical dimension (rather than alignment-padded lengths). It also hides previously public PQ internals that were effectively low-level/FFI-oriented.

Changes:

  • Switch PQ “entry points” (e.g., QueryComputer / MultiQueryComputer) to accept &[f32], validate len >= dim, and internally slice to [..dim].
  • Add debug-only dimension assertions inside inner kernels and right-size PQ scratch/query buffers to the PQ table’s logical dimension.
  • Remove public exposure of some PQ internals (direct_distance_impl, populate_chunk_distances_impl, inner_product_raw) and adjust callers/tests accordingly.

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 7 comments.

Show a summary per file
| File | Description |
| --- | --- |
| diskann-providers/src/model/pq/mod.rs | Stops re-exporting direct_distance_impl from the PQ module surface. |
| diskann-providers/src/model/pq/fixed_chunk_pq_table.rs | Makes several PQ internals private and adds debug-only dim assertions in kernels. |
| diskann-providers/src/model/pq/distance/test_utils.rs | Updates PQ distance test helpers to generate/use f32 queries directly. |
| diskann-providers/src/model/pq/distance/multi.rs | Updates MultiQueryComputer::new to take &[f32] and adjusts tests. |
| diskann-providers/src/model/pq/distance/l2.rs | Changes TableL2 constructor/populate path to accept &[f32]; updates tests. |
| diskann-providers/src/model/pq/distance/innerproduct.rs | Changes TableIP constructor/populate path to accept &[f32]; updates tests. |
| diskann-providers/src/model/pq/distance/dynamic.rs | Enforces boundary dim checks/slicing in QueryComputer::new; slices FP input in DistanceComputer. Adds tests for undersized query handling. |
| diskann-providers/src/model/pq/distance/cosine.rs | Changes cosine query handling to copy from &[f32]; updates tests. |
| diskann-providers/src/model/mod.rs | Removes re-export of direct_distance_impl from the top-level model API. |
| diskann-providers/src/model/graph/provider/async_/memory_quant_vector_provider.rs | Moves query decoding to the provider boundary via VectorRepr::as_f32. |
| diskann-providers/src/model/graph/provider/async_/fast_memory_quant_vector_provider.rs | Aligns query constraint to T: VectorRepr (dropping Copy). |
| diskann-providers/src/model/graph/provider/async_/experimental/multi_pq_async.rs | Decodes queries to f32 at the boundary and passes &[f32] into PQ code. |
| diskann-providers/src/model/graph/provider/async_/bf_tree/quant_vector_provider.rs | Aligns query constraint to T: VectorRepr (dropping Copy). |
| diskann-disk/src/storage/quant/pq/pq_dataset.rs | Adds PQData::get_dim() to expose PQ logical dimension for sizing buffers. |
| diskann-disk/src/search/provider/disk_provider.rs | Uses PQ logical dim and decodes query to f32 before PQScratch::set. |
| diskann-disk/src/search/pq/quantizer_preprocess.rs | Removes manual query slicing and relies on right-sized rotated_query. |
| diskann-disk/src/search/pq/pq_scratch.rs | Updates PQScratch::set to accept &[f32] and copy exactly dim elements into a right-sized buffer. |

Comments suppressed due to low confidence (1)

diskann-providers/src/model/pq/distance/multi.rs:377

  • MultiQueryComputer::new now requires &[f32] rather than &[U: Into<f32> + Copy], which is a breaking change for downstreams passing non-f32 query types directly. If this is intended, consider a transition path (deprecated overload or helper) consistent with the new boundary decode pattern.
    /// Construct a new `MultiQueryComputer` with the requested metric and query.
    pub fn new(table: MultiTable<T, I>, metric: Metric, query: &[f32]) -> ANNResult<Self> {
        let s = match table {
            MultiTable::One { table, version } => Self::One {
                computer: { QueryComputer::new(table, metric, query, None)? },
                version,

Comment thread diskann-providers/src/model/mod.rs
Comment thread diskann-providers/src/model/pq/mod.rs
Comment thread diskann-providers/src/model/pq/fixed_chunk_pq_table.rs
Comment thread diskann-providers/src/model/pq/fixed_chunk_pq_table.rs
Comment thread diskann-providers/src/model/pq/fixed_chunk_pq_table.rs
Comment thread diskann-providers/src/model/pq/distance/dynamic.rs
Comment on lines 218 to 235
      fn evaluate_similarity(&self, fp: &[f32], q: &[u8]) -> f32 {
          // Accept oversized `fp` (only the first `dim` elements are used) for
          // backwards compatibility with callers that hold alignment-padded buffers.
          let dim = self.table.get_dim();
          assert!(
              fp.len() >= dim,
              "DistanceComputer: full-precision query length {} < dim {}",
              fp.len(),
              dim,
          );
          assert_eq!(
              q.len(),
              self.table.get_num_chunks(),
              "{}",
              INVALID_PQ_DIMENSION
          );
    -     (self.vtable.distance_fn)(&self.table, fp, q)
    +     (self.vtable.distance_fn)(&self.table, &fp[..dim], q)
      }

          table: T,
          metric: Metric,
    -     query: &[U],
    +     query: &[f32],
Contributor

I'd support requiring hard equality query.len() == table.get_dim(). Yes, this rejects arguments that it previously accepted, but accepting such inputs is a bug to begin with.

Contributor Author

Done, I pushed strict == to QueryComputer::new (returns Result::Err), and DistanceComputer::evaluate_similarity (assert_eq!).
I left PQScratch::set (the disk-side entry) on query.len() >= dim + [..dim] slicing for now since the disk path tends to have more external surface.

Per #1044 review: callers passing oversized queries to inmem PQ
entries was a bug worth surfacing. QueryComputer::new (and via
delegation MultiQueryComputer::new) return Result::Err on mismatch;
DistanceComputer::evaluate_similarity asserts equality since the
trait method has no Result return.

PQScratch::set kept on >= dim tolerance for now — disk-side surface.
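For contrast with the strict in-memory entries, the tolerant disk-side behavior described above (accept `len >= dim`, use `[..dim]`, error on undersize) could look roughly like this. `Scratch` and its `String` error are hypothetical simplifications; the real `PQScratch::set` signature and error type are not shown in this thread.

```rust
/// Hypothetical stand-in for the disk-side scratch space.
struct Scratch {
    /// Sized by the PQ logical dim, not by the slot byte count.
    rotated_query: Vec<f32>,
}

impl Scratch {
    fn new(dim: usize) -> Self {
        Self {
            rotated_query: vec![0.0; dim],
        }
    }

    /// Tolerant boundary: oversized queries are accepted and truncated to
    /// `dim`; undersized queries are rejected without touching the buffer.
    fn set(&mut self, query: &[f32]) -> Result<(), String> {
        let dim = self.rotated_query.len();
        if query.len() < dim {
            return Err(format!("query length {} < dim {}", query.len(), dim));
        }
        self.rotated_query.copy_from_slice(&query[..dim]);
        Ok(())
    }
}
```

An alignment-padded 8-element query against `dim = 3` is accepted and only its first three elements are stored, whereas a 2-element query returns `Err`.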