[benchmark/filtered-search prep] Make benchmarks stateful#995
[benchmark/filtered-search prep] Make benchmarks stateful#995hildebrandmw wants to merge 3 commits intomainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR updates the benchmark framework to support stateful benchmarks by switching Benchmark/Regression APIs from type-level (static) methods to instance methods (&self) and updating the registry to register benchmark values (enabling future dynamic “search plugin” registration per benchmark instance).
Changes:
- Convert
BenchmarkandRegressiontrait methods (try_match,description,run,check) to take&self. - Update
Benchmarks::register/register_regressionto accept a benchmark instance by value and store it behind a type-erased wrapper. - Refactor benchmark implementations across
diskann-benchmarkanddiskann-benchmark-simdto remove the prior'static/dispatcher indirection patterns.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| diskann-benchmark/src/utils/mod.rs | Updates stub benchmark registration and trait method signatures to the new &self API. |
| diskann-benchmark/src/backend/index/spherical.rs | Refactors spherical index benchmarks to unit-like/stateful benchmarks and inlines prior dispatch indirection. |
| diskann-benchmark/src/backend/index/scalar.rs | Refactors scalar quantized index benchmarks to constructable/stateful benchmark instances. |
| diskann-benchmark/src/backend/index/product.rs | Refactors PQ index benchmark to a constructable/stateful benchmark instance. |
| diskann-benchmark/src/backend/index/benchmarks.rs | Removes BuildAndSearch/BuildAndDynamicRun indirection and moves logic directly into Benchmark::run(&self, ...). |
| diskann-benchmark/src/backend/filters/benchmark.rs | Converts metadata index benchmark into a stateful/unit-like benchmark and extracts run logic into a free function. |
| diskann-benchmark/src/backend/exhaustive/spherical.rs | Refactors exhaustive spherical benchmarks to unit-like/stateful benchmarks. |
| diskann-benchmark/src/backend/exhaustive/product.rs | Refactors exhaustive product benchmarks to unit-like/stateful benchmarks. |
| diskann-benchmark/src/backend/exhaustive/minmax.rs | Refactors exhaustive minmax benchmarks to unit-like/stateful benchmarks. |
| diskann-benchmark/src/backend/disk_index/benchmarks.rs | Refactors disk index benchmark/regression to stateful benchmarks and updates registration accordingly. |
| diskann-benchmark-simd/src/lib.rs | Updates SIMD regression benchmarks to be instance-based and adjusts kernel execution to pass arch at call time. |
| diskann-benchmark-runner/src/test/typed.rs | Updates typed test benchmarks/regressions to instance-based implementations (adds constructors). |
| diskann-benchmark-runner/src/test/mod.rs | Updates benchmark registration in test harness to pass benchmark instances. |
| diskann-benchmark-runner/src/test/dim.rs | Updates dim test benchmarks/regression to the new &self trait signatures. |
| diskann-benchmark-runner/src/registry.rs | Changes registry APIs to accept benchmark instances and stores them in the wrapper. |
| diskann-benchmark-runner/src/benchmark.rs | Changes core traits to &self methods and updates internal type-erased wrapper and regression plumbing. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #995 +/- ##
==========================================
+ Coverage 89.48% 90.64% +1.15%
==========================================
Files 448 448
Lines 84081 84153 +72
==========================================
+ Hits 75239 76279 +1040
+ Misses 8842 7874 -968
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| impl<T> Regression for T | ||
| where | ||
| T: super::Regression, | ||
| { |
There was a problem hiding this comment.
The blanket implementation impl<T> Regression for T where T: super::Regression is self-referential and will overlap with any concrete impl Regression for SomeType, causing coherence conflicts. Additionally, calling self.check(...) inside this impl risks unbounded recursion because it resolves to the same Regression::check being implemented. Remove this blanket impl; instead, keep regression behavior on the concrete benchmark types and have the internal JSON/check adapter call the benchmark’s check via the wrapped benchmark instance (e.g., wrapper.benchmark.check(...)) or via a dedicated internal adapter type/extension trait that is not an impl of Regression itself.
| input: &IndexOperation, | ||
| checkpoint: Checkpoint<'_>, | ||
| output: &mut dyn Output, | ||
| mut output: &mut dyn Output, |
There was a problem hiding this comment.
mut output: &mut dyn Output makes the binding mutable, but the binding doesn’t appear to be reassigned in this function. This commonly triggers unused_mut warnings (including on parameters) and provides no benefit because interior mutability is already available through &mut. Drop the mut on the parameter unless you actually rebind output.
| mut output: &mut dyn Output, | |
| output: &mut dyn Output, |
| #[cfg(target_arch = "x86_64")] | ||
| { | ||
| dispatcher.register_regression( | ||
| "simd-op-f32xf32-x86_64_V4", | ||
| Kernel::<diskann_wide::arch::x86_64::V4, f32, f32>::new(), | ||
| ); |
There was a problem hiding this comment.
The change removes the small register! macro and introduces repeated #[cfg(target_arch = ...)] blocks with mostly identical registration calls. This increases the chance of drift (e.g., missing an arch variant or name mismatch) when updating the registration list. Consider reinstating a small helper macro/function for arch-gated registration to keep the list declarative while still passing benchmark instances (e.g., macro that expands to dispatcher.register_regression($name, Kernel::<...>::new()) under the appropriate cfg).
A recurring problem with our current benchmark infrastructure is the
SearchPhaseenum (selecting what kind of search is conducted) does its job a little too well: every time a new variant is added, we need to either update all users ofSearchPhase(bloating compile times) or explicitly opt-out of a particular search phase, which is brittle especially with respect toBenchmark::try_matchconsistency.For example see:
This PR takes the first step towards systematically solving this problem by allowing benchmarks registered with
diskann_benchmark_runner::Benchmarksto have state ratherthan being purely type-level constructs. Stateful benchmarks can have "search plugins" dynamically registered at construction time. These plugins participate in
Benchmark::try_match,Benchmark::description, andBenchmark::run, allowing individual benchmarks to opt into new search-phase variants without requiring changes across allbenchmarks. See #996 as a follow-up implementing this idea
Suggested Reviewing Order
In
diskann-benchmark-runner:benchmark.rs: This is the main change. It simply changes theBenchmarkandRegressiontraits to receive by&self.registry.rs: Change the signatures ofBenchmarks::registerandBenchmarks::register_regressionto receive the benchmark type by-value.In
diskann-benchmark: The main changes involve cleaning up the'statichack and removing theBuildAndSearch/BuildAndDynamicRunindirection traits that are no longer necessary.In
diskann-benchmark-simd: Feel free to skip.