[RFC/diskann] Overhaul paged search #1078

Merged: hildebrandmw merged 12 commits into main from mhildebr/paged on May 21, 2026
@hildebrandmw hildebrandmw commented May 15, 2026

Paged search has been causing all kinds of issues for our code base and is actively getting in the way of the simplifications in #1067 because of interactions with PagedSearchState. The TL;DR is that PagedSearchState requires types to be 'static and introduces the need to "pause" and "resume" search state in a way that is complex to describe in trait bounds.

Since our code is already async, we can lean into that and use the usual Rust machinery to embed a non-'static paged searcher inside an otherwise 'static future. The recommended way to interact with paged search is now via channels.

Rendered RFC

API Migration Guide

| Old pattern | New pattern |
| --- | --- |
| `index.start_paged_search(s, ctx, q, l).await` | `index.paged_search(s, ctx, q, l).await` |
| `index.next_search_results(ctx, &mut state, k, &mut buf).await` | `search.next_page(k).await` |
| `SearchState<Id, (S, C)>` | `PagedSearch<'a, DP, S, T>` |
| `PagedSearchState<DP, S, C>` | `PagedSearch<'a, DP, S, T>` |
| Check return count for exhaustion | Check `page.is_empty()` |

If existing code embedded the SearchState in some 'static container, that is no longer viable: PagedSearch<'a, ...> borrows the index and the context, so it cannot be stored in a 'static container. Instead, channels can be used for this communication:

// Types are illustrative; adapt names to your crate.
use std::sync::Arc;
use tokio::sync::mpsc;

type PageResult = ANNResult<Vec<Neighbor<ExternalId>>>;

/// Spawn a paged search session. The index is held by Arc so the task is 'static.
///
/// Returns a request channel and a result channel. The caller sends the desired
/// page size (`k`) and awaits the corresponding result on the other end.
fn spawn_paged_session(
    index: Arc<DiskANNIndex<DP>>,
    context: Arc<DP::Context>,
    strategy: S, // search strategy, moved into the task
    query: T,
    l: usize,
) -> (mpsc::Sender<usize>, mpsc::Receiver<PageResult>) {
    let (req_tx, mut req_rx) = mpsc::channel::<usize>(1);
    let (res_tx, res_rx) = mpsc::channel::<PageResult>(1);

    tokio::spawn(async move {
        // Borrow from the Arc: these references are scoped to the task.
        let mut search = match index.paged_search(strategy, &*context, query, l).await {
            Ok(search) => search,
            Err(err) => {
                // Report a failed setup as the first result instead of panicking the task.
                let _ = res_tx.send(Err(err)).await;
                return;
            }
        };

        while let Some(k) = req_rx.recv().await {
            let page = search.next_page(k).await;
            if res_tx.send(page).await.is_err() {
                break; // caller dropped the result receiver
            }
        }
        // Request channel closed -> caller dropped sender -> clean shutdown.
    });

    (req_tx, res_rx)
}
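
The ownership idea behind this pattern can be tried without the DiskANN types. The following is a minimal sketch under stated assumptions: `MockIndex`, `MockSession`, and `spawn_session` are hypothetical stand-ins, and `std::thread` with blocking `std::sync::mpsc` channels replaces `tokio::spawn` and tokio's channels so that the example needs only the standard library. It shows a handle that borrows from the index living entirely inside a worker that owns the `Arc`, while the caller only ever holds the two channel ends.

```rust
use std::sync::mpsc;
use std::sync::Arc;
use std::thread;

/// Hypothetical index: owns the data that a session borrows from.
struct MockIndex {
    ids: Vec<u32>,
}

/// Hypothetical handle that borrows from the index, in the spirit of
/// `PagedSearch<'a, ..>`.
struct MockSession<'a> {
    remaining: &'a [u32],
}

impl MockIndex {
    fn paged_search(&self) -> MockSession<'_> {
        MockSession { remaining: &self.ids }
    }
}

impl<'a> MockSession<'a> {
    /// Return up to `k` further ids; an empty page means exhaustion.
    fn next_page(&mut self, k: usize) -> Vec<u32> {
        let (page, rest) = self.remaining.split_at(k.min(self.remaining.len()));
        self.remaining = rest;
        page.to_vec()
    }
}

/// Start a worker that owns the `Arc` and serves one page per request.
fn spawn_session(index: Arc<MockIndex>) -> (mpsc::SyncSender<usize>, mpsc::Receiver<Vec<u32>>) {
    let (req_tx, req_rx) = mpsc::sync_channel::<usize>(1);
    let (res_tx, res_rx) = mpsc::sync_channel::<Vec<u32>>(1);

    thread::spawn(move || {
        // The borrow never leaves this closure, which owns `index`.
        let mut session = index.paged_search();
        // `recv` fails once the caller drops the request sender.
        while let Ok(k) = req_rx.recv() {
            if res_tx.send(session.next_page(k)).is_err() {
                break; // caller dropped the result receiver
            }
        }
    });

    (req_tx, res_rx)
}

fn main() {
    let index = Arc::new(MockIndex { ids: (10..15).collect() });
    let (req, res) = spawn_session(index);

    // Request pages until an empty one signals exhaustion.
    loop {
        req.send(2).unwrap();
        let page = res.recv().unwrap();
        if page.is_empty() {
            break;
        }
        println!("{page:?}");
    }
}
```

Dropping the request sender is what shuts the worker down in this sketch, mirroring the clean-shutdown comment in the snippet above.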

If code was already explicitly using a .await loop with SearchState, then minimal changes should be needed.
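
For reference, the loop shape implied by the migration table (request a page, stop on the first empty one) might look like the sketch below. It is illustrative only: `MockPagedSearch` is a hypothetical synchronous stand-in, so the `.await` on `next_page` and the `ANNResult` error type are simplified away and the example runs with only the standard library.

```rust
/// Hypothetical stand-in for the real `PagedSearch` handle: it serves
/// already-ranked ids page by page. The real `next_page` is async.
struct MockPagedSearch {
    remaining: std::vec::IntoIter<u32>,
}

impl MockPagedSearch {
    fn new(ids: Vec<u32>) -> Self {
        Self { remaining: ids.into_iter() }
    }

    /// Return up to `k` further results; an empty page means exhaustion.
    fn next_page(&mut self, k: usize) -> Result<Vec<u32>, String> {
        Ok(self.remaining.by_ref().take(k).collect())
    }
}

/// Drain a search in pages of size `k`, stopping on the first empty page
/// rather than comparing a returned count against `k`.
fn collect_pages(search: &mut MockPagedSearch, k: usize) -> Result<Vec<Vec<u32>>, String> {
    let mut pages = Vec::new();
    loop {
        let page = search.next_page(k)?;
        if page.is_empty() {
            break;
        }
        pages.push(page);
    }
    Ok(pages)
}

fn main() {
    let mut search = MockPagedSearch::new((0..7).collect());
    let pages = collect_pages(&mut search, 3).unwrap();
    println!("{pages:?}");
}
```

Note that a short final page is not treated as exhaustion here; only an empty page ends the loop.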

For Users of Paged Search via wrapped_async

Users of paged search via wrapped_async::DiskANNIndex that know their inner futures will never suspend can use the new wrapped_async::DiskANNIndex::paged_search_no_await. This will use the new API transparently via wrapped_async::noawait::PagedSearch and efficiently run paged searches with minimal synchronization overhead.

This should only be used if the implementations of Accessor, BuildQueryComputer, SearchExt, DataProvider, and ExpandBeam are known to never yield and to always complete with Poll::Ready.
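
To make the "always completes with Poll::Ready" requirement concrete, here is a small standard-library sketch of what that contract means. It is not the implementation of `wrapped_async::noawait`, which this page does not show; `poll_once`, `NoopWake`, and `ready_page` are hypothetical names. A future is polled exactly once with a waker that does nothing, and any `Poll::Pending` is reported as a violation of the assumption.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing: there is no executor to notify.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Poll `fut` exactly once. Returns `None` if the future suspended, which
/// means the "never yields" assumption does not hold for it.
fn poll_once<F: Future>(fut: F) -> Option<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(out) => Some(out),
        Poll::Pending => None,
    }
}

/// An async fn with no suspension points completes on its first poll.
async fn ready_page(k: usize) -> Vec<usize> {
    (0..k).collect()
}

fn main() {
    // Completes immediately, so a single poll is enough.
    println!("{:?}", poll_once(ready_page(3)));
    // A future that suspends cannot be driven this way.
    println!("{:?}", poll_once(std::future::pending::<()>()));
}
```

If any of the listed trait implementations performs real asynchronous I/O, its futures behave like the second case, which is why the regular async API is the safer default.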

codecov-commenter commented May 15, 2026

Codecov Report

❌ Patch coverage is 81.69935% with 56 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.57%. Comparing base (5443ca0) to head (8598804).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| diskann-providers/src/index/wrapped_async.rs | 75.42% | 43 Missing ⚠️ |
| diskann/src/graph/search/paged.rs | 86.36% | 12 Missing ⚠️ |
| diskann-providers/src/index/diskann_async.rs | 80.00% | 1 Missing ⚠️ |
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1078      +/-   ##
==========================================
+ Coverage   89.46%   90.57%   +1.10%     
==========================================
  Files         473      474       +1     
  Lines       89653    89740      +87     
==========================================
+ Hits        80212    81278    +1066     
+ Misses       9441     8462     -979     
| Flag | Coverage Δ |
| --- | --- |
| miri | 90.57% <81.69%> (+1.10%) ⬆️ |
| unittests | 90.53% <81.69%> (+1.42%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
| --- | --- |
| diskann/src/graph/index.rs | 96.15% <100.00%> (+0.84%) ⬆️ |
| diskann/src/graph/search/scratch.rs | 98.21% <ø> (ø) |
| diskann/src/graph/test/cases/paged_search.rs | 95.09% <100.00%> (-0.85%) ⬇️ |
| diskann/src/provider.rs | 95.14% <ø> (ø) |
| diskann-providers/src/index/diskann_async.rs | 95.98% <80.00%> (-0.02%) ⬇️ |
| diskann/src/graph/search/paged.rs | 86.36% <86.36%> (ø) |
| diskann-providers/src/index/wrapped_async.rs | 59.87% <75.42%> (+13.99%) ⬆️ |

... and 45 files with indirect coverage changes

@hildebrandmw hildebrandmw changed the title from "[diskann] Overhaul paged search" to "[RFC/diskann] Overhaul paged search" May 18, 2026
@hildebrandmw hildebrandmw marked this pull request as ready for review May 18, 2026 19:35
@hildebrandmw hildebrandmw requested review from a team and Copilot May 18, 2026 19:35
Copilot AI left a comment

Pull request overview

This PR overhauls DiskANN’s paged (iterative) search API to remove the SearchState<..., ExtraState: 'static> pattern and instead return a lifetime-bound PagedSearch<'a, ...> handle, enabling non-'static query computers/strategies and reducing trait-bound complexity. It also updates downstream wrappers/tests and adds an RFC documenting a channel-based pattern for crossing tokio::spawn/FFI boundaries.

Changes:

  • Remove the 'static bound from BuildQueryComputer::QueryComputer.
  • Replace the start_paged_search/next_search_results API with DiskANNIndex::paged_search{_with_init_ids} returning a PagedSearch handle with next_page.
  • Update diskann-providers sync wrapper + test cases, and add an RFC describing the new model and migration guidance.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| rfcs/01078-paged-search.md | RFC describing the motivation, new API shape, and a channel-based spawned-task usage pattern. |
| diskann/src/provider.rs | Drops 'static from BuildQueryComputer::QueryComputer to allow borrowed query computers. |
| diskann/src/graph/test/cases/paged_search.rs | Updates paged-search tests to use PagedSearch::next_page. |
| diskann/src/graph/search/scratch.rs | Gates SearchScratch::search_l() behind #[cfg(test)]. |
| diskann/src/graph/search/paged.rs | Introduces the new PagedSearch handle implementation and paging logic. |
| diskann/src/graph/search/mod.rs | Wires the new paged module and re-exports PagedSearch. |
| diskann/src/graph/index.rs | Removes old SearchState/paged-search API and adds paged_search{_with_init_ids} constructors returning PagedSearch. |
| diskann-providers/src/index/wrapped_async.rs | Updates synchronous wrapper to return a blocking PagedSearch wrapper around the async handle. |
| diskann-providers/src/index/diskann_async.rs | Updates async provider tests/helpers to use PagedSearch::next_page. |
Comments suppressed due to low confidence (2)

diskann-providers/src/index/wrapped_async.rs:356

  • These synchronous wrapper methods still require S: SearchStrategy<DP, T> + 'static, but the underlying DiskANNIndex::paged_search no longer needs 'static. Keeping this bound unnecessarily restricts callers from using non-'static strategies (the main goal of this RFC). Consider dropping the + 'static bound here as well.
    pub fn paged_search<'a, S, T>(
        &'a self,
        strategy: S,
        context: &'a DP::Context,
        query: T,
        l_value: usize,
    ) -> ANNResult<PagedSearch<'a, DP, S, T>>
    where
        S: SearchStrategy<DP, T> + 'static,
        T: Copy + Send + 'a,
    {

diskann/src/graph/index.rs:2211

  • computed_result is initialized with vec![Neighbor::default(); l_value] and next_result_index is set to l_value to represent an empty cache. Since PagedSearch::next_page now returns an owned Vec, you can avoid the O(l_value) initialization cost by using Vec::with_capacity(l_value) (or Vec::new()) and starting next_result_index at 0.
            ANNResult::Ok(PagedSearch {
                index: self,
                context,
                scratch,
                computed_result: vec![Neighbor::default(); l_value],
                next_result_index: l_value,
                search_param_l: l_value,
                strategy,
                computer,
                _query: std::marker::PhantomData,
            })
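
The suggestion above (represent an empty cache as an empty vector with reserved capacity and a cursor at 0) can be sketched in isolation. This is a hypothetical `PageCache`, not the actual `PagedSearch` fields or paging logic; `u32` stands in for `Neighbor`, and the refill step is assumed to replace the cache wholesale.

```rust
/// Hypothetical result cache; `u32` stands in for `Neighbor`.
struct PageCache {
    results: Vec<u32>,
    next: usize,
}

impl PageCache {
    /// Empty cache: reserves room for `l` results without writing `l`
    /// placeholder values first.
    fn empty(l: usize) -> Self {
        Self { results: Vec::with_capacity(l), next: 0 }
    }

    /// True when every cached result has been handed out.
    fn is_drained(&self) -> bool {
        self.next >= self.results.len()
    }

    /// Replace the cache contents with a freshly computed batch.
    fn refill(&mut self, batch: impl IntoIterator<Item = u32>) {
        self.results.clear();
        self.results.extend(batch);
        self.next = 0;
    }

    /// Hand out up to `k` cached results as an owned page.
    fn take(&mut self, k: usize) -> Vec<u32> {
        let end = self.next.saturating_add(k).min(self.results.len());
        let page = self.results[self.next..end].to_vec();
        self.next = end;
        page
    }
}

fn main() {
    let mut cache = PageCache::empty(4);
    assert!(cache.is_drained()); // "empty" needs no sentinel index
    cache.refill(vec![7, 8, 9]);
    let first = cache.take(2);
    let second = cache.take(2);
    println!("{first:?} {second:?}");
}
```

With this representation the drained check compares the cursor against the vector's length instead of against `l_value`, so no default-initialized entries are needed.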

@hildebrandmw hildebrandmw enabled auto-merge (squash) May 21, 2026 18:05
@hildebrandmw hildebrandmw merged commit c667a3c into main May 21, 2026
23 of 24 checks passed
@hildebrandmw hildebrandmw deleted the mhildebr/paged branch May 21, 2026 18:08