PQ: tighten dim contract; right-size scratch buffer#1044
Conversation
Follow-up to @hildebrandmw's review on #960. - `PQScratch::rotated_query` is sized by `PQData::get_dim()` (PQ logical dim) instead of `graph_header.metadata().dims` (slot byte count, exceeds logical dim for `MinMaxElement` due to trailing min/max metadata). - PQ entries take `&[f32]`, accept `len >= dim`, slice `[..dim]`. Callers decode via `VectorRepr::as_f32` once at the boundary; PQ subtree is f32-only internally. - Kernels (`preprocess_query`, `populate_chunk_distances_impl`, `direct_distance_impl`) `debug_assert_eq!` on entry, matching `pq_dist_lookup_single`. The two `_impl` helpers become private. - `DirectCosine::populate` uses `copy_from_slice` (the previous zip silently truncated, no longer applicable). - Drop redundant `Copy` and `U: Into<f32>` bounds on touched fns.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1044 +/- ##
==========================================
+ Coverage 89.51% 90.60% +1.08%
==========================================
Files 460 461 +1
Lines 85466 85678 +212
==========================================
+ Hits 76508 77631 +1123
+ Misses 8958 8047 -911
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
PR moved entries to &[f32] so test_X_inner helpers no longer need to generate per-T data — drop the type parameter, generate Vec<f32> directly. Removes turbofish at all call sites and the rstest values parameterization.
There was a problem hiding this comment.
Pull request overview
This PR tightens and clarifies the PQ query-dimension contract by moving conversion/validation to boundary APIs, making PQ internals f32-only, and sizing scratch buffers to the PQ table’s logical dimension (rather than alignment-padded lengths). It also hides previously public PQ internals that were effectively low-level/FFI-oriented.
Changes:
- Switch PQ “entry points” (e.g.,
QueryComputer/MultiQueryComputer) to accept&[f32], validatelen >= dim, and internally slice to[..dim]. - Add debug-only dimension assertions inside inner kernels and right-size PQ scratch/query buffers to the PQ table’s logical dimension.
- Remove public exposure of some PQ internals (
direct_distance_impl,populate_chunk_distances_impl,inner_product_raw) and adjust callers/tests accordingly.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| diskann-providers/src/model/pq/mod.rs | Stops re-exporting direct_distance_impl from the PQ module surface. |
| diskann-providers/src/model/pq/fixed_chunk_pq_table.rs | Makes several PQ internals private and adds debug-only dim assertions in kernels. |
| diskann-providers/src/model/pq/distance/test_utils.rs | Updates PQ distance test helpers to generate/use f32 queries directly. |
| diskann-providers/src/model/pq/distance/multi.rs | Updates MultiQueryComputer::new to take &[f32] and adjusts tests. |
| diskann-providers/src/model/pq/distance/l2.rs | Changes TableL2 constructor/populate path to accept &[f32]; updates tests. |
| diskann-providers/src/model/pq/distance/innerproduct.rs | Changes TableIP constructor/populate path to accept &[f32]; updates tests. |
| diskann-providers/src/model/pq/distance/dynamic.rs | Enforces boundary dim checks/slicing in QueryComputer::new; slices FP input in DistanceComputer. Adds tests for undersized query handling. |
| diskann-providers/src/model/pq/distance/cosine.rs | Changes cosine query handling to copy from &[f32]; updates tests. |
| diskann-providers/src/model/mod.rs | Removes re-export of direct_distance_impl from the top-level model API. |
| diskann-providers/src/model/graph/provider/async_/memory_quant_vector_provider.rs | Moves query decoding to the provider boundary via VectorRepr::as_f32. |
| diskann-providers/src/model/graph/provider/async_/fast_memory_quant_vector_provider.rs | Aligns query constraint to T: VectorRepr (dropping Copy). |
| diskann-providers/src/model/graph/provider/async_/experimental/multi_pq_async.rs | Decodes queries to f32 at the boundary and passes &[f32] into PQ code. |
| diskann-providers/src/model/graph/provider/async_/bf_tree/quant_vector_provider.rs | Aligns query constraint to T: VectorRepr (dropping Copy). |
| diskann-disk/src/storage/quant/pq/pq_dataset.rs | Adds PQData::get_dim() to expose PQ logical dimension for sizing buffers. |
| diskann-disk/src/search/provider/disk_provider.rs | Uses PQ logical dim and decodes query to f32 before PQScratch::set. |
| diskann-disk/src/search/pq/quantizer_preprocess.rs | Removes manual query slicing and relies on right-sized rotated_query. |
| diskann-disk/src/search/pq/pq_scratch.rs | Updates PQScratch::set to accept &[f32] and copy exactly dim elements into a right-sized buffer. |
Comments suppressed due to low confidence (1)
diskann-providers/src/model/pq/distance/multi.rs:377
MultiQueryComputer::newnow requires&[f32]rather than&[U: Into<f32> + Copy], which is a breaking change for downstreams passing non-f32 query types directly. If this is intended, consider a transition path (deprecated overload or helper) consistent with the new boundary decode pattern.
/// Construct a new `MultiQueryComputer` with the requested metric and query.
pub fn new(table: MultiTable<T, I>, metric: Metric, query: &[f32]) -> ANNResult<Self> {
let s = match table {
MultiTable::One { table, version } => Self::One {
computer: { QueryComputer::new(table, metric, query, None)? },
version,
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| fn evaluate_similarity(&self, fp: &[f32], q: &[u8]) -> f32 { | ||
| // Accept oversized `fp` (only the first `dim` elements are used) for | ||
| // backwards compatibility with callers that hold alignment-padded buffers. | ||
| let dim = self.table.get_dim(); | ||
| assert!( | ||
| fp.len() >= dim, | ||
| "DistanceComputer: full-precision query length {} < dim {}", | ||
| fp.len(), | ||
| dim, | ||
| ); | ||
| assert_eq!( | ||
| q.len(), | ||
| self.table.get_num_chunks(), | ||
| "{}", | ||
| INVALID_PQ_DIMENSION | ||
| ); | ||
| (self.vtable.distance_fn)(&self.table, fp, q) | ||
| (self.vtable.distance_fn)(&self.table, &fp[..dim], q) | ||
| } |
| table: T, | ||
| metric: Metric, | ||
| query: &[U], | ||
| query: &[f32], |
There was a problem hiding this comment.
I'd support requiring hard equality query.len() == table.get_dim(). Yes, this rejects arguments that it previously accepted, but accepting such inputs is a bug to begin with.
There was a problem hiding this comment.
Done, I pushed strict == to QueryComputer::new (returns Result::Err), and DistanceComputer::evaluate_similarity (assert_eq!).
I left PQScratch::set (the disk-side entry) on query.len() >= dim + [..dim] slicing for now since the disk path tends to have more external surface.
Per #1044 review: callers passing oversized queries to inmem PQ entries was a bug worth surfacing. QueryComputer::new (and via delegation MultiQueryComputer::new) return Result::Err on mismatch; DistanceComputer::evaluate_similarity asserts equality since the trait method has no Result return. PQScratch::set kept on >= dim tolerance for now — disk-side surface.
Background
Historically, query buffers came from
AlignedBoxWithSlice, which silently rounded length up to a multiple of 8 for SIMD alignment. Downstream populate functions therefore had to acceptquery.len() >= diminstead ofquery.len() == dim— pre-PR comment inTableL2::populate:With #960 removing
AlignedBoxWithSlicefrom the PQ path, the subtree can refactor dim handling along the boundary convert / internal trust idiom.Three-layer dim contract
QueryComputer::new,MultiQueryComputer::new,DistanceComputer::evaluate_similaritylen == dimResult::Err/assert_eq!PQScratch::setlen >= dim, slice[..dim]Result::Erron undersizeTableL2/IP/Cosine::{new, populate}preprocess_query,populate_chunk_distances_impl,direct_distance_impldebug_assert_eq!Boundary methods take
&[f32]. Quantized inputs are decoded viaVectorRepr::as_f32once at the caller boundary; the PQ subtree is f32-only internally.Why entries take
&[f32]Into<f32>is per-element.MinMaxElement<8>is a single byte that can't decode without the full slice's trailing metadata — it cannot implementInto<f32>. Production callers supporting MinMax were therefore always pre-decoding viaVectorRepr::as_f32and passing&[f32]upstream; the previous<U: Into<f32>>generic on PQ entries was effectively orphan-only.