RFC: Query Planner for Per-Filter-Bitmap Disk Search#1102
RFC: Query Planner for Per-Filter-Bitmap Disk Search#1102YuanyuanTian-hh wants to merge 7 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
Introduces a new RFC proposing a “Query Planner” that chooses between flat scan and beta-filtered disk graph search based on filter bitmap match rate and index size, aiming to improve recall/latency tradeoffs for per-filter searches.
Changes:
- Adds an RFC describing the planner’s motivation, decision thresholds, and proposed APIs (
QueryPlannerConfig,QueryPlanner,QueryStrategy). - Documents benchmark-driven threshold choices and outlines future extensions (adaptive beta, more strategies).
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| @@ -0,0 +1,346 @@ | |||
| # Query Planner for Per-Filter-Bitmap Disk Search | |||
| | | | | ||
| |---|---| | ||
| | **Authors** | tianyuanyuan | | ||
| | **Created** | 2026-05-25 | | ||
| | | | |
| 1. Provide a `QueryPlanner` that automatically selects the optimal search strategy (flat scan vs. beta-filtered graph search) based on index size and filter match rate. | ||
| 2. Derive thresholds from benchmark experiments on datasets covering the 150K-10M index sizes. | ||
| 3. Compose cleanly with the `SearchPlan` / `GraphMode` API from the [disk beta filter RFC](https://github.com/dyhyfu/DiskANN/blob/c3ae608683531765920f0844d70750efa731946a/rfcs/01101-disk-beta-filter.md) — the planner produces `SearchPlan` values, nothing else. | ||
| 4. Allow callers to override thresholds via `QueryPlannerConfig` for tuning. |
| pub fn plan(&self, matching_count: u64) -> QueryStrategy { | ||
| if self.total_points <= self.config.total_points_threshold { | ||
| return QueryStrategy::FlatSearch; | ||
| } | ||
|
|
||
| let match_rate = matching_count as f64 / self.total_points as f64; | ||
| if match_rate <= self.config.flat_search_threshold { | ||
| QueryStrategy::FlatSearch | ||
| } else { | ||
| QueryStrategy::BetaFilter | ||
| } | ||
| } |
| /// Plan and produce a `SearchPlan` with the appropriate predicate wiring. | ||
| /// | ||
| /// The caller provides the bitmap (as an `Arc<RoaringBitmap>`) and the | ||
| /// matching count. The planner selects the strategy and constructs the | ||
| /// `SearchPlan` with the closure already wired to the bitmap. | ||
| pub fn plan_search( | ||
| &self, | ||
| bitmap: Arc<RoaringBitmap>, | ||
| matching_count: u64, | ||
| ) -> SearchPlan { |
ab21a40 to
be14cf4
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1102 +/- ##
==========================================
- Coverage 89.49% 89.46% -0.03%
==========================================
Files 479 482 +3
Lines 90337 91082 +745
==========================================
+ Hits 80844 81491 +647
- Misses 9493 9591 +98
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
Replace total_points threshold with matching_count threshold based on benchmark analysis across 150K-10M vectors. The hybrid approach (matching_count <= 200K OR match_rate <= 25%) achieves 9/109 mistakes vs 11/109 for the original total_points-based approach. Key insight: flat scan latency depends on matching_count, not total_points (for <=1M indexes). The beta recall dip zone is a match-rate phenomenon (2-10%) that requires a rate-based guard.
suri-kumkaran
left a comment
There was a problem hiding this comment.
Overall the planner is a solid addition — the benchmark methodology is thorough, the hybrid threshold is well-justified against the data, and the routing layer cleanly composes with the beta-filter RFC's SearchPlan shape. Leaving inline comments on a few design points worth resolving before this lands.
One general note: several of the code examples carry small impl-detail issues (unnecessary Arc::clone() where ownership transfer would do, direct struct construction, hard-pinned bitmap type, etc.). These are distracting at the RFC stage — readers spend cycles on whether the code is right, rather than whether the design is right. Happy to handle these in the impl PR; the RFC just needs to nail the design decisions that need consensus.
|
|
||
| ### Background | ||
|
|
||
| When a query is scoped to a specific filter category, the caller extracts a list of items matching that category, maps them to DiskANN's internal vector IDs, and constructs a bitmap. The caller wraps the bitmap in a `Predicate` closure and passes it to the disk searcher. |
There was a problem hiding this comment.
Line 17 makes the caller responsible for mapping to internal IDs:
maps them to DiskANN's internal vector IDs, and constructs a bitmap
Once the beta-filter RFC flips Predicate to be keyed on external IDs (with translation inside DiskANN), the planner should follow the same contract — the caller just builds a bitmap of vertex IDs as seen from search outputs, no internal-ID mapping at the call site.
Suggested edit to line 17: drop the "internal vector IDs" phrasing. Bitmap is over external IDs; DiskANN translates internally at the predicate boundary.
There was a problem hiding this comment.
Fixed — dropped the 'internal vector IDs' phrasing.
| /// `SearchPlan` with the closure already wired to the bitmap. | ||
| pub fn plan_search( | ||
| &self, | ||
| bitmap: Arc<RoaringBitmap>, |
There was a problem hiding this comment.
plan_search is pinned to RoaringBitmap, which contradicts the beta-filter RFC it builds on.
Line 219:
pub fn plan_search(&self, bitmap: Arc<RoaringBitmap>, matching_count: u64) -> SearchPlanBut line 48 in this RFC's own §Prerequisites says:
The
Predicateclosure can wrap any bitmap type (RoaringBitmap,BitSet,HashSet<u32>)
The whole point of Predicate = Box<dyn Fn(u32) -> bool> is bitmap-agnosticism — callers using BitSet, HashSet<u32>, or any custom backing structure can't use the planner as written.
Cleaner shape — take the predicate abstraction the beta-filter RFC already chose, with matching_count as a separate caller-provided hint (predicates aren't size-aware, so the count must come from outside regardless):
pub fn plan_search<F>(&self, predicate: F, matching_count: u64) -> SearchPlan
where F: Fn(u32) -> bool + Send + Sync + 'static + Clone,The planner clones the predicate into whichever SearchPlan arm it picks. Bitmap implementation stays the caller's choice; the planner only cares about "is there a predicate and how many points does it match."
There was a problem hiding this comment.
Fixed — The planner is bitmap-agnostic — callers wrap any bitmap type into a closure before calling. Bounds match the beta-filter RFC's constructor signatures exactly.
| SearchPlan::FlatScan { | ||
| filter: Some(Box::new(move |id| bm.contains(id))), | ||
| } |
There was a problem hiding this comment.
plan_search builds variants via direct struct construction (lines 228-230, 234-237):
SearchPlan::FlatScan { filter: Some(Box::new(move |id| bm.contains(id))) }
SearchPlan::Graph(GraphMode::BetaFilter { predicate: Box::new(...), beta })This bypasses the constructors (SearchPlan::flat_filtered, GraphMode::beta_filter) that the beta-filter RFC introduced precisely to centralize predicate boxing and β ∈ (0, 1] validation. As written, QueryPlannerConfig::beta is unvalidated — a caller passing beta = 2.0 produces a plan that increases distances on matching vectors (semantic inversion), silently.
Use the constructors and propagate the Result:
QueryStrategy::FlatSearch =>
Ok(SearchPlan::flat_filtered(move |id| bitmap.contains(id))),
QueryStrategy::BetaFilter =>
GraphMode::beta_filter(move |id| bitmap.contains(id), beta)
.map(SearchPlan::graph_with),plan_search returns Result<SearchPlan, BetaError>. Validation discipline established by the beta-filter RFC stays intact through the planner.
There was a problem hiding this comment.
Fixed — now uses SearchPlan::flat_filtered() and GraphMode::beta_filter() constructors.
| to caller | ||
| ``` | ||
|
|
||
| ### 3. Strategy Selection |
There was a problem hiding this comment.
plan_search only produces FlatScan or BetaFilter — GraphMode::PostFilter and Graph::Unfiltered (and any future GraphMode variant) are unreachable through the planner.
Two questions worth answering explicitly:
-
Why is
PostFilternever selected? The beta-filter RFC keeps it as a first-class mode. If it's strictly dominated byBetaFilterat every (matching_count, match_rate) cell on the benchmark surface, say so — and consider deprecating it from the underlying API. If it's not dominated, the planner is missing a case. -
What's the extension story? When future graph algorithms land as new
GraphModevariants (the beta-filter RFC explicitly leaves room —Multihop, etc.), how does the planner pick them up? Today every new variant needs a new threshold derivation and a newQueryStrategyarm. Worth at least sketching: does eachGraphModeself-describe its viable (matching_count, match_rate) region, with the planner aggregating? Or does each new variant land with a planner update?
Even a brief §Future Work note ("planner currently covers two of three modes; PostFilter dominance to be validated") would close the gap. Right now a reader can't tell whether the omission is intentional or oversight.
There was a problem hiding this comment.
Good catch — the omission is intentional, not an oversight.
Why PostFilter is not selected: PostFilter applies the same hard post-filter as BetaFilter but without the beta bias on PQ distances. It is strictly dominated — BetaFilter does everything PostFilter does plus biases beam traversal toward matching vectors, improving recall at high match rates. At low match rates where beta bias hurts (the 2–10% dip zone), the planner routes to FlatScan instead of PostFilter, since flat scan has ~100% recall. There is no (matching_count, match_rate) cell in the benchmark data where PostFilter outperforms both FlatScan and BetaFilter.
Extension story: Each new GraphMode variant lands with its own threshold derivation and a new QueryStrategy arm — the planner update is part of the variant's integration. Self-describing viable regions is an interesting future direction but over-engineers the current two-strategy planner. Will add a Future Work note.
| to caller | ||
| ``` | ||
|
|
||
| ### 3. Strategy Selection |
There was a problem hiding this comment.
The current decision logic has a lower bound on flat scan (matching_count ≤ 200K → flat) but no upper bound. Consider: 10M index at 20% match rate. matching_count = 2M, match_rate < 25%, so the planner picks flat scan.
From Experiment 5: on a 10M index, flat scan at 1M matching vectors is ~970ms; at 5M it's ~3.4s. 2M matching extrapolates to ~1.5–2s. Latency tracks matching_count almost linearly — at 2M it's already 5× the ~315ms beta plateau.
Real workloads will have a latency budget. If a caller's budget is, say, 500ms, the planner shouldn't route to flat scan when matching_count is large enough that flat scan blows past the budget — even if the recall would be better. Accept the recall hit; meet the SLA.
A symmetric max_brute_force_matching_count (or, more precisely, a latency-budget-aware threshold) closes this gap:
if matching_count <= matching_count_threshold → FlatScan
else if matching_count >= max_brute_force_count → BetaFilter // (new)
else if match_rate <= flat_search_threshold → FlatScan
else → BetaFilterToday's use cases may not hit it (smaller indexes), but the decision logic should cover the full surface so larger indexes don't need a re-derivation. Even a conservative default (e.g. max_brute_force_count = 1M) caps worst-case flat-scan latency at ~1s on a 10M index.
There was a problem hiding this comment.
- Drop 'internal vector IDs' phrasing (comment 1) - Generic predicate instead of Arc<RoaringBitmap> in plan_search (comment 2) - Validate beta at QueryPlanner::new(), plan_search returns Result (comment 3) - Use SearchPlan/GraphMode constructors instead of direct struct construction
Add third threshold (max_brute_force_count = 1M) that caps flat scan latency. Validated across 7 index sizes (1M-10M, dim=384): - 1M index: flat crosses 500ms at ~1M matching - 3M index: flat crosses 500ms at ~800K matching - 5M index: flat crosses 500ms at ~600K matching - 10M index: flat crosses 500ms at ~100K matching Default 1M is conservative for indexes up to ~3M; callers with larger indexes can lower it via QueryPlannerConfig.

Reference Issues/PRs
What does this implement/fix? Briefly explain your changes.
Any other comments?