RFC: Query Planner for Per-Filter-Bitmap Disk Search by YuanyuanTian-hh · Pull Request #1102 · microsoft/DiskANN

YuanyuanTian-hh · 2026-05-25T09:12:39Z

Does this PR have a descriptive title that could go in our release notes?
Does this PR add any new dependencies?
Does this PR modify any existing APIs?
Is the change to the API backwards compatible?
Should this result in any changes to our documentation, either updating existing docs or adding new ones?

Reference Issues/PRs

What does this implement/fix? Briefly explain your changes.

Any other comments?

Copilot

Pull request overview

Introduces a new RFC proposing a “Query Planner” that chooses between flat scan and beta-filtered disk graph search based on filter bitmap match rate and index size, aiming to improve recall/latency tradeoffs for per-filter searches.

Changes:

Adds an RFC describing the planner’s motivation, decision thresholds, and proposed APIs (QueryPlannerConfig, QueryPlanner, QueryStrategy).
Documents benchmark-driven threshold choices and outlines future extensions (adaptive beta, more strategies).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@@ -0,0 +1,346 @@
+# Query Planner for Per-Filter-Bitmap Disk Search


+| | |
+|---|---|
+| **Authors** | tianyuanyuan |
+| **Created** | 2026-05-25 |
+| | |


+1. Provide a `QueryPlanner` that automatically selects the optimal search strategy (flat scan vs. beta-filtered graph search) based on index size and filter match rate.
+2. Derive thresholds from benchmark experiments on datasets covering the 150K-10M index sizes.
+3. Compose cleanly with the `SearchPlan` / `GraphMode` API from the [disk beta filter RFC](https://github.com/dyhyfu/DiskANN/blob/c3ae608683531765920f0844d70750efa731946a/rfcs/01101-disk-beta-filter.md) — the planner produces `SearchPlan` values, nothing else.
+4. Allow callers to override thresholds via `QueryPlannerConfig` for tuning.


+    pub fn plan(&self, matching_count: u64) -> QueryStrategy {
+        if self.total_points <= self.config.total_points_threshold {
+            return QueryStrategy::FlatSearch;
+        }
+
+        let match_rate = matching_count as f64 / self.total_points as f64;
+        if match_rate <= self.config.flat_search_threshold {
+            QueryStrategy::FlatSearch
+        } else {
+            QueryStrategy::BetaFilter
+        }
+    }


+    /// Plan and produce a `SearchPlan` with the appropriate predicate wiring.
+    ///
+    /// The caller provides the bitmap (as an `Arc<RoaringBitmap>`) and the
+    /// matching count. The planner selects the strategy and constructs the
+    /// `SearchPlan` with the closure already wired to the bitmap.
+    pub fn plan_search(
+        &self,
+        bitmap: Arc<RoaringBitmap>,
+        matching_count: u64,
+    ) -> SearchPlan {


codecov-commenter · 2026-05-25T09:48:44Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.46%. Comparing base (e3139b4) to head (a35c14e).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1102      +/-   ##
==========================================
- Coverage   89.49%   89.46%   -0.03%     
==========================================
  Files         479      482       +3     
  Lines       90337    91082     +745     
==========================================
+ Hits        80844    81491     +647     
- Misses       9493     9591      +98

Flag	Coverage Δ
miri	`89.46% <ø> (-0.03%)`	⬇️
unittests	`89.11% <ø> (-0.02%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.
see 8 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Replace total_points threshold with matching_count threshold based on benchmark analysis across 150K-10M vectors. The hybrid approach (matching_count <= 200K OR match_rate <= 25%) achieves 9/109 mistakes vs 11/109 for the original total_points-based approach. Key insight: flat scan latency depends on matching_count, not total_points (for <=1M indexes). The beta recall dip zone is a match-rate phenomenon (2-10%) that requires a rate-based guard.

suri-kumkaran

Overall the planner is a solid addition — the benchmark methodology is thorough, the hybrid threshold is well-justified against the data, and the routing layer cleanly composes with the beta-filter RFC's SearchPlan shape. Leaving inline comments on a few design points worth resolving before this lands.

One general note: several of the code examples carry small impl-detail issues (unnecessary Arc::clone() where ownership transfer would do, direct struct construction, hard-pinned bitmap type, etc.). These are distracting at the RFC stage — readers spend cycles on whether the code is right, rather than whether the design is right. Happy to handle these in the impl PR; the RFC just needs to nail the design decisions that need consensus.

suri-kumkaran · 2026-05-26T09:46:34Z

+
+### Background
+
+When a query is scoped to a specific filter category, the caller extracts a list of items matching that category, maps them to DiskANN's internal vector IDs, and constructs a bitmap. The caller wraps the bitmap in a `Predicate` closure and passes it to the disk searcher.


Line 17 makes the caller responsible for mapping to internal IDs:

maps them to DiskANN's internal vector IDs, and constructs a bitmap

Once the beta-filter RFC flips Predicate to be keyed on external IDs (with translation inside DiskANN), the planner should follow the same contract — the caller just builds a bitmap of vertex IDs as seen from search outputs, no internal-ID mapping at the call site.

Suggested edit to line 17: drop the "internal vector IDs" phrasing. Bitmap is over external IDs; DiskANN translates internally at the predicate boundary.

Fixed — dropped the 'internal vector IDs' phrasing.

suri-kumkaran · 2026-05-26T09:49:52Z

+    /// `SearchPlan` with the closure already wired to the bitmap.
+    pub fn plan_search(
+        &self,
+        bitmap: Arc<RoaringBitmap>,


plan_search is pinned to RoaringBitmap, which contradicts the beta-filter RFC it builds on.

Line 219:

pub fn plan_search(&self, bitmap: Arc<RoaringBitmap>, matching_count: u64) -> SearchPlan

But line 48 in this RFC's own §Prerequisites says:

The Predicate closure can wrap any bitmap type (RoaringBitmap, BitSet, HashSet<u32>)

The whole point of Predicate = Box<dyn Fn(u32) -> bool> is bitmap-agnosticism — callers using BitSet, HashSet<u32>, or any custom backing structure can't use the planner as written.

Cleaner shape — take the predicate abstraction the beta-filter RFC already chose, with matching_count as a separate caller-provided hint (predicates aren't size-aware, so the count must come from outside regardless):

pub fn plan_search<F>(&self, predicate: F, matching_count: u64) -> SearchPlan where F: Fn(u32) -> bool + Send + Sync + 'static + Clone,

The planner clones the predicate into whichever SearchPlan arm it picks. Bitmap implementation stays the caller's choice; the planner only cares about "is there a predicate and how many points does it match."

Fixed — The planner is bitmap-agnostic — callers wrap any bitmap type into a closure before calling. Bounds match the beta-filter RFC's constructor signatures exactly.

suri-kumkaran · 2026-05-26T09:55:53Z

+                SearchPlan::FlatScan {
+                    filter: Some(Box::new(move |id| bm.contains(id))),
+                }


plan_search builds variants via direct struct construction (lines 228-230, 234-237):

SearchPlan::FlatScan { filter: Some(Box::new(move |id| bm.contains(id))) } SearchPlan::Graph(GraphMode::BetaFilter { predicate: Box::new(...), beta })

This bypasses the constructors (SearchPlan::flat_filtered, GraphMode::beta_filter) that the beta-filter RFC introduced precisely to centralize predicate boxing and β ∈ (0, 1] validation. As written, QueryPlannerConfig::beta is unvalidated — a caller passing beta = 2.0 produces a plan that increases distances on matching vectors (semantic inversion), silently.

Use the constructors and propagate the Result:

QueryStrategy::FlatSearch => Ok(SearchPlan::flat_filtered(move |id| bitmap.contains(id))), QueryStrategy::BetaFilter => GraphMode::beta_filter(move |id| bitmap.contains(id), beta) .map(SearchPlan::graph_with),

plan_search returns Result<SearchPlan, BetaError>. Validation discipline established by the beta-filter RFC stays intact through the planner.

Fixed — now uses SearchPlan::flat_filtered() and GraphMode::beta_filter() constructors.

suri-kumkaran · 2026-05-26T09:59:28Z

+            to caller
+```
+
+### 3. Strategy Selection


plan_search only produces FlatScan or BetaFilter — GraphMode::PostFilter and Graph::Unfiltered (and any future GraphMode variant) are unreachable through the planner.

Two questions worth answering explicitly:

Why is PostFilter never selected? The beta-filter RFC keeps it as a first-class mode. If it's strictly dominated by BetaFilter at every (matching_count, match_rate) cell on the benchmark surface, say so — and consider deprecating it from the underlying API. If it's not dominated, the planner is missing a case.

What's the extension story? When future graph algorithms land as new GraphMode variants (the beta-filter RFC explicitly leaves room — Multihop, etc.), how does the planner pick them up? Today every new variant needs a new threshold derivation and a new QueryStrategy arm. Worth at least sketching: does each GraphMode self-describe its viable (matching_count, match_rate) region, with the planner aggregating? Or does each new variant land with a planner update?

Even a brief §Future Work note ("planner currently covers two of three modes; PostFilter dominance to be validated") would close the gap. Right now a reader can't tell whether the omission is intentional or oversight.

Good catch — the omission is intentional, not an oversight.

Why PostFilter is not selected: PostFilter applies the same hard post-filter as BetaFilter but without the beta bias on PQ distances. It is strictly dominated — BetaFilter does everything PostFilter does plus biases beam traversal toward matching vectors, improving recall at high match rates. At low match rates where beta bias hurts (the 2–10% dip zone), the planner routes to FlatScan instead of PostFilter, since flat scan has ~100% recall. There is no (matching_count, match_rate) cell in the benchmark data where PostFilter outperforms both FlatScan and BetaFilter.

Extension story: Each new GraphMode variant lands with its own threshold derivation and a new QueryStrategy arm — the planner update is part of the variant's integration. Self-describing viable regions is an interesting future direction but over-engineers the current two-strategy planner. Will add a Future Work note.

suri-kumkaran · 2026-05-26T10:03:09Z

+            to caller
+```
+
+### 3. Strategy Selection


The current decision logic has a lower bound on flat scan (matching_count ≤ 200K → flat) but no upper bound. Consider: 10M index at 20% match rate. matching_count = 2M, match_rate < 25%, so the planner picks flat scan.

From Experiment 5: on a 10M index, flat scan at 1M matching vectors is ~970ms; at 5M it's ~3.4s. 2M matching extrapolates to ~1.5–2s. Latency tracks matching_count almost linearly — at 2M it's already 5× the ~315ms beta plateau.

Real workloads will have a latency budget. If a caller's budget is, say, 500ms, the planner shouldn't route to flat scan when matching_count is large enough that flat scan blows past the budget — even if the recall would be better. Accept the recall hit; meet the SLA.

A symmetric max_brute_force_matching_count (or, more precisely, a latency-budget-aware threshold) closes this gap:

if matching_count <= matching_count_threshold → FlatScan else if matching_count >= max_brute_force_count → BetaFilter // (new) else if match_rate <= flat_search_threshold → FlatScan else → BetaFilter

Today's use cases may not hit it (smaller indexes), but the decision logic should cover the full surface so larger indexes don't need a re-derivation. Even a conservative default (e.g. max_brute_force_count = 1M) caps worst-case flat-scan latency at ~1s on a 10M index.

Added max_brute_force_count (default: 1M) as a third threshold:

\
if matching_count <= 200K → FlatScan
else if matching_count >= 1M → BetaFilter // NEW
else if match_rate <= 25% → FlatScan
else → BetaFilter
\\

Validated across 7 index sizes (1M–10M, dim=384). Where flat scan crosses 500ms:

- Drop 'internal vector IDs' phrasing (comment 1) - Generic predicate instead of Arc<RoaringBitmap> in plan_search (comment 2) - Validate beta at QueryPlanner::new(), plan_search returns Result (comment 3) - Use SearchPlan/GraphMode constructors instead of direct struct construction

Add third threshold (max_brute_force_count = 1M) that caps flat scan latency. Validated across 7 index sizes (1M-10M, dim=384): - 1M index: flat crosses 500ms at ~1M matching - 3M index: flat crosses 500ms at ~800K matching - 5M index: flat crosses 500ms at ~600K matching - 10M index: flat crosses 500ms at ~100K matching Default 1M is conservative for indexes up to ~3M; callers with larger indexes can lower it via QueryPlannerConfig.

YuanyuanTian-hh requested review from a team and Copilot May 25, 2026 09:12

Copilot started reviewing on behalf of YuanyuanTian-hh May 25, 2026 09:12 View session

Copilot AI reviewed May 25, 2026

View reviewed changes

RFC: Query Planner for Per-Filter-Bitmap Disk Search

be14cf4

YuanyuanTian-hh force-pushed the rfc_query_planner_v2 branch from ab21a40 to be14cf4 Compare May 25, 2026 09:29

RFC: rename file to match PR #1102

dcfbd61

YuanyuanTian-hh added the RFC Request For Comments label May 25, 2026

Yuanyuan Tian (from Dev Box) added 2 commits May 26, 2026 14:31

RFC: remove internal dataset names (MERB, Enron, FBV8, FBV4)

a9c50d1

suri-kumkaran reviewed May 26, 2026

View reviewed changes

Yuanyuan Tian (from Dev Box) added 3 commits May 26, 2026 19:23

RFC: address comment 4 - explain PostFilter omission in Future Work

b852980

		@@ -0,0 +1,346 @@
		# Query Planner for Per-Filter-Bitmap Disk Search


		### Background

		When a query is scoped to a specific filter category, the caller extracts a list of items matching that category, maps them to DiskANN's internal vector IDs, and constructs a bitmap. The caller wraps the bitmap in a `Predicate` closure and passes it to the disk searcher.

Conversation

YuanyuanTian-hh commented May 25, 2026

Reference Issues/PRs

What does this implement/fix? Briefly explain your changes.

Any other comments?

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

codecov-commenter commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

suri-kumkaran left a comment

Choose a reason for hiding this comment

Uh oh!

suri-kumkaran May 26, 2026

Choose a reason for hiding this comment

Uh oh!

YuanyuanTian-hh May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

suri-kumkaran May 26, 2026

Choose a reason for hiding this comment

Uh oh!

YuanyuanTian-hh May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

suri-kumkaran May 26, 2026

Choose a reason for hiding this comment

Uh oh!

YuanyuanTian-hh May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

suri-kumkaran May 26, 2026

Choose a reason for hiding this comment

Uh oh!

YuanyuanTian-hh May 26, 2026

Choose a reason for hiding this comment

Uh oh!

suri-kumkaran May 26, 2026

Choose a reason for hiding this comment

Uh oh!

YuanyuanTian-hh May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

codecov-commenter commented May 25, 2026 •

edited

Loading

YuanyuanTian-hh May 26, 2026 •

edited

Loading

YuanyuanTian-hh May 26, 2026 •

edited

Loading

YuanyuanTian-hh May 26, 2026 •

edited

Loading

YuanyuanTian-hh May 27, 2026 •

edited

Loading