DiskANN v0.52.0
DiskANN v0.52.0 Release Notes
Breaking Changes
An AI-generated, human-reviewed summary of the changes follows.
get_degree_stats signature changed (#998)
DiskANNIndex::get_degree_stats now takes an explicit iterator of IDs instead of requiring the data provider to implement IntoIterator.
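To illustrate the shape of the new calling convention, here is a minimal self-contained sketch. `ToyGraph`, `DegreeStats`, and `degree_stats` are hypothetical stand-ins invented for this example, not DiskANN types, and the real method's accessor argument and return type may differ. The point is only that the caller chooses which IDs to visit, so the container does not need to be iterable.

```rust
use std::collections::HashMap;

/// Summary of out-degrees over the visited nodes (hypothetical shape).
#[derive(Debug, PartialEq)]
pub struct DegreeStats {
    pub min: usize,
    pub max: usize,
    pub mean: f64,
    pub count: usize,
}

/// Hypothetical stand-in for a graph accessor: node ID -> neighbor list.
pub struct ToyGraph {
    pub adjacency: HashMap<u32, Vec<u32>>,
}

impl ToyGraph {
    /// The caller supplies the IDs to inspect. IDs that are not present
    /// in the graph are skipped; `None` is returned if nothing was visited.
    pub fn degree_stats<I>(&self, ids: I) -> Option<DegreeStats>
    where
        I: IntoIterator<Item = u32>,
    {
        let (mut min, mut max, mut sum, mut count) = (usize::MAX, 0usize, 0usize, 0usize);
        for id in ids {
            if let Some(neighbors) = self.adjacency.get(&id) {
                let degree = neighbors.len();
                min = min.min(degree);
                max = max.max(degree);
                sum += degree;
                count += 1;
            }
        }
        if count == 0 {
            return None;
        }
        Some(DegreeStats {
            min,
            max,
            mean: sum as f64 / count as f64,
            count,
        })
    }
}
```

Under this shape, a caller can pass a range such as `0..num_points`, a filtered iterator, or a sampled subset of IDs without any change to the provider.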
// Before — provider had to impl IntoIterator
index.get_degree_stats(&mut accessor)?;
// After — caller supplies the ID iterator
index.get_degree_stats(&mut accessor, id_iter)?;
PQ dimension contract tightened; entries now &[f32] only (#1044)
With AlignedBoxWithSlice removed from the PQ path, the dimension handling has been refactored into a three-layer contract:
| Layer | Where | Contract |
|---|---|---|
| Boundary (inmem) | QueryComputer::new, MultiQueryComputer::new, DistanceComputer::evaluate_similarity | len == dim (returns Err on mismatch) |
| Boundary (disk) | PQScratch::set | len >= dim, slices to [..dim] |
| Internal | TableL2/IP/Cosine::{new, populate} | Trusted; no re-validation |
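The three layers can be pictured with a small self-contained sketch. Every name here (`DimError`, `check_exact`, `truncate_to_dim`, `squared_l2_trusted`, `distance_strict`, `distance_padded`) is hypothetical and chosen for illustration; none of it is the crate's code. In particular, returning an error when a padded buffer is shorter than `dim` is an assumption, since the table only states the `len >= dim` requirement.

```rust
#[derive(Debug, PartialEq)]
pub enum DimError {
    Mismatch { expected: usize, actual: usize },
    TooShort { expected: usize, actual: usize },
}

/// Strict boundary: the slice length must equal `dim` exactly.
pub fn check_exact(query: &[f32], dim: usize) -> Result<&[f32], DimError> {
    if query.len() == dim {
        Ok(query)
    } else {
        Err(DimError::Mismatch { expected: dim, actual: query.len() })
    }
}

/// Lenient boundary: the buffer may be padded; keep only the first `dim`
/// elements. Erroring on a short buffer is an assumption of this sketch.
pub fn truncate_to_dim(query: &[f32], dim: usize) -> Result<&[f32], DimError> {
    query
        .get(..dim)
        .ok_or(DimError::TooShort { expected: dim, actual: query.len() })
}

/// Internal layer: trusts that both slices already have the same length
/// and performs no validation of its own.
fn squared_l2_trusted(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Entry point with the strict contract.
pub fn distance_strict(query: &[f32], centroid: &[f32]) -> Result<f32, DimError> {
    let q = check_exact(query, centroid.len())?;
    Ok(squared_l2_trusted(q, centroid))
}

/// Entry point with the lenient contract for padded buffers.
pub fn distance_padded(query: &[f32], centroid: &[f32]) -> Result<f32, DimError> {
    let q = truncate_to_dim(query, centroid.len())?;
    Ok(squared_l2_trusted(q, centroid))
}
```

Validating once at the boundary keeps length checks out of the inner distance loop, which is the usual motivation for a layered contract of this kind.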
Other changes:
- PQ table populate/distance methods now accept &[f32] instead of <U: Into<f32>>. Callers must pre-decode quantized vectors via VectorRepr::as_f32.
- Generic trampoline impls (&Vec<u8>, &&[u8]) on QueryComputer/DistanceComputer have been removed.
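The pre-decoding step could look roughly like the sketch below. `DecodeF32` and `decode_f32` are hypothetical stand-ins written for this example; the actual signature of VectorRepr::as_f32 is not shown in these notes and may differ (for instance, it may avoid allocating for data that is already f32). `squared_l2` stands in for any API that now accepts only `&[f32]`.

```rust
/// Hypothetical stand-in for a "decode to f32" trait.
pub trait DecodeF32 {
    fn decode_f32(&self) -> Vec<f32>;
}

impl DecodeF32 for [u8] {
    fn decode_f32(&self) -> Vec<f32> {
        self.iter().map(|&b| f32::from(b)).collect()
    }
}

impl DecodeF32 for [i8] {
    fn decode_f32(&self) -> Vec<f32> {
        self.iter().map(|&b| f32::from(b)).collect()
    }
}

impl DecodeF32 for [f32] {
    fn decode_f32(&self) -> Vec<f32> {
        self.to_vec()
    }
}

/// Stand-in for an API that accepts only `&[f32]`, with no generic
/// `Into<f32>` element conversion inside the loop.
pub fn squared_l2(query: &[f32], centroid: &[f32]) -> f32 {
    query
        .iter()
        .zip(centroid)
        .map(|(x, y)| (x - y) * (x - y))
        .sum()
}
```

A caller holding `u8` data would decode once, for example `let q = raw.as_slice().decode_f32();`, and then pass `&q` to every distance call.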
calculate_chunk_offsets relocated to ChunkOffsets constructors (#976)
The free functions calculate_chunk_offsets and calculate_chunk_offsets_auto have been moved into constructors on ChunkOffsets / ChunkOffsetsView in diskann-quantization::views.
// Before
let offsets = calculate_chunk_offsets(dim, num_chunks);
// After (allocating)
let offsets = ChunkOffsets::partition(dim, num_chunks)?;
// After (zero-alloc, borrows caller-owned scratch)
let view = ChunkOffsetsView::partition_into(dim, &mut scratch)?;
Additionally, get_chunk_from_training_data is no longer part of the public API.
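For intuition about what a chunk partition computes, here is a minimal sketch with an allocating and a caller-buffer variant. The function names, the error type, and the balancing rule (the first `dim % num_chunks` chunks are one element wider) are assumptions made for this example; the constructors in diskann-quantization may use different rules and return types. As in the snippet above, the buffer variant is assumed to infer the chunk count from the scratch length.

```rust
#[derive(Debug, PartialEq)]
pub enum PartitionError {
    ZeroChunks,
    TooManyChunks { dim: usize, num_chunks: usize },
}

/// Writes `num_chunks + 1` offsets into `scratch`, where
/// `num_chunks = scratch.len() - 1`. Chunk `i` covers
/// `scratch[i]..scratch[i + 1]`. No allocation is performed.
pub fn partition_offsets_into(dim: usize, scratch: &mut [usize]) -> Result<(), PartitionError> {
    if scratch.len() < 2 {
        return Err(PartitionError::ZeroChunks);
    }
    let num_chunks = scratch.len() - 1;
    if num_chunks > dim {
        return Err(PartitionError::TooManyChunks { dim, num_chunks });
    }
    let base = dim / num_chunks;
    let extra = dim % num_chunks;
    scratch[0] = 0;
    for i in 0..num_chunks {
        // The first `extra` chunks absorb the remainder, one element each.
        let width = base + usize::from(i < extra);
        scratch[i + 1] = scratch[i] + width;
    }
    Ok(())
}

/// Allocating variant built on top of the buffer variant.
pub fn partition_offsets(dim: usize, num_chunks: usize) -> Result<Vec<usize>, PartitionError> {
    if num_chunks == 0 {
        return Err(PartitionError::ZeroChunks);
    }
    let mut offsets = vec![0usize; num_chunks + 1];
    partition_offsets_into(dim, &mut offsets)?;
    Ok(offsets)
}
```

With this rule, ten dimensions split into three chunks yield widths 4, 3, and 3. Returning a `Result` mirrors the `?` in the migrated call sites, where invalid inputs surface as errors.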
CachingProvider removed (#1052)
The entire diskann_providers::model::graph::provider::async_::caching module has been deleted.
Why: The CachingProvider was an experiment in transparent caching over DataProvider. In practice it required double monomorphization of the indexing code, didn't save integration work for bulk methods like on_elements_unordered/distances_unordered, and was complex to maintain. An internal user who …migrated off it removed ~1,000 lines of code, improved compile times by ~20%, and substantially reduced complexity.
Upgrade: Manage caching directly in your DataProvider implementation.
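One way to manage caching directly is to let the provider own its cache, as in the sketch below. `CachedVectorStore` is a hypothetical type invented for this example and does not implement the DataProvider trait, whose methods are not shown in these notes. The closure stands in for whatever slow backing store an integration reads from, and a production cache would likely also need eviction and concurrency control.

```rust
use std::collections::HashMap;

/// Hypothetical provider-side cache: vectors are fetched through `load`
/// on a miss and served from memory afterwards.
pub struct CachedVectorStore<F>
where
    F: FnMut(u32) -> Option<Vec<f32>>,
{
    load: F,
    cache: HashMap<u32, Vec<f32>>,
    /// Number of times the backing store was actually consulted
    /// and returned a vector.
    pub misses: usize,
}

impl<F> CachedVectorStore<F>
where
    F: FnMut(u32) -> Option<Vec<f32>>,
{
    pub fn new(load: F) -> Self {
        Self { load, cache: HashMap::new(), misses: 0 }
    }

    /// Returns the vector for `id`, loading and caching it on first use.
    /// IDs unknown to the backing store yield `None` and are not cached.
    pub fn get(&mut self, id: u32) -> Option<&[f32]> {
        if !self.cache.contains_key(&id) {
            let vector = (self.load)(id)?;
            self.misses += 1;
            self.cache.insert(id, vector);
        }
        self.cache.get(&id).map(|v| v.as_slice())
    }
}
```

Because the cache lives inside one concrete type, the indexing code is instantiated only once for it rather than once per wrapper layer.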
New Features
AVX-512 4-bit distance kernels (#1045)
Native V4 (AVX-512) specializations for 4-bit packed vector distance computations:
- SquaredL2: 16 × u32 lanes per iteration via _mm512_madd_epi16.
- InnerProduct: AVX-512 VNNI (_mm512_dpbusd_epi32) over u8x64/i8x64 operands.
Previously, V4 hardware fell back to two AVX2 (V3) kernel invocations per 512-bit chunk. The native kernels double per-instruction throughput. No API changes — existing code benefits automatically on AVX-512 capable hardware.
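A scalar reference is a convenient way to see what these kernels compute and to check SIMD results against. The sketch below is not the AVX-512 code. It assumes unsigned 4-bit values (0 to 15) packed two per byte with the even-indexed element in the low nibble; both the packing order and the function names are assumptions of this example, so the library's layout may differ.

```rust
/// Packs 4-bit values two per byte, even index in the low nibble.
/// An odd trailing value is paired with an implicit zero.
pub fn pack_4bit(values: &[u8]) -> Vec<u8> {
    values
        .chunks(2)
        .map(|pair| {
            let lo = pair[0] & 0x0F;
            let hi = pair.get(1).copied().unwrap_or(0) & 0x0F;
            lo | (hi << 4)
        })
        .collect()
}

/// Splits a byte into its (low, high) nibbles.
fn nibbles(byte: u8) -> (u8, u8) {
    (byte & 0x0F, byte >> 4)
}

/// Scalar squared L2 distance between two packed 4-bit vectors.
pub fn squared_l2_4bit(a: &[u8], b: &[u8]) -> u32 {
    assert_eq!(a.len(), b.len(), "packed vectors must have equal length");
    a.iter()
        .zip(b)
        .map(|(&x, &y)| {
            let (xl, xh) = nibbles(x);
            let (yl, yh) = nibbles(y);
            let dl = i32::from(xl) - i32::from(yl);
            let dh = i32::from(xh) - i32::from(yh);
            (dl * dl + dh * dh) as u32
        })
        .sum()
}

/// Scalar inner product between two packed 4-bit vectors.
pub fn inner_product_4bit(a: &[u8], b: &[u8]) -> u32 {
    assert_eq!(a.len(), b.len(), "packed vectors must have equal length");
    a.iter()
        .zip(b)
        .map(|(&x, &y)| {
            let (xl, xh) = nibbles(x);
            let (yl, yh) = nibbles(y);
            u32::from(xl) * u32::from(yl) + u32::from(xh) * u32::from(yh)
        })
        .sum()
}
```

For example, the vectors [1, 15, 7, 0] and [0, 15, 3, 2] have a squared L2 distance of 21 and an inner product of 246 under this layout.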
Merged PRs
- Deprecate 32-bit targets by @suhasjs in #1022
- Add a fast path to Map::prepare. by @hildebrandmw in #1023
- Add boundary checks in gen_associated_data_from_range() by @Copilot in #847
- [deps] Don't pull rayon as a dependency of diskann. by @hildebrandmw in #1024
- Bump openssl from 0.10.78 to 0.10.79 by @dependabot[bot] in #1026
- Cleaning up test work and changing the get_degree_stats signature. by @JordanMaples in #998
- Reduce scalar-quantization benchmark monomorphization by @suri-kumkaran in #1041
- [diskann-vector] Support truly unaligned distances. by @hildebrandmw in #981
- rename spherical.json to graph index with spherical quantization by @harsha-simhadri in #1042
- [PQ Cleanup] Part 2: Consolidate calculate_chunk_offsets* by @arkrishn94 in #976
- PQ: tighten dim contract; right-size scratch buffer by @wuw92 in #1044
- Add v4 distance kernels (4-bit SquaredL2 / InnerProduct) by @m3hm3t in #1045
- Remove the Caching Provider by @hildebrandmw in #1052
New Contributors
Full Changelog: v0.51.0...v0.52.0