
Port range search and multihop filtered search tests to diskann crate#854

Draft
Copilot wants to merge 5 commits into main from copilot/port-unit-tests-to-diskann

Contributor

Copilot AI commented Mar 19, 2026

Search tests (range, multihop/filtered) live in diskann-providers in a monolithic diskann_async.rs using inmem providers, with limited metrics visibility. These should use the baseline testing infrastructure in diskann directly.

New test modules

  • grid_range_search.rs — Range search over 1D/3D/4D grids: outer-radius-only, annular (inner+outer radius), varying query positions, and a low-starting_l case that triggers two-round range search. Baselines track comparisons, hops, and get_vector/get_neighbors counts; the tests also validate that all returned distances are within bounds and check for duplicate IDs in results.

  • grid_multihop_search.rs — Multihop label-filtered search with six filter implementations:

    • EvenFilter — baselined; validates two-hop expansion returns only matching IDs
    • RejectAll — asserts only the start point (seeded pre-filter) survives
    • TerminateAfterN — uses a Mutex-based visit counter and asserts visit_count <= max_visits + 10, plus a comparison-bounded check against a full search
    • DistanceAdjuster — asserts on_visit distance scaling affects ranking (broad even/odd test)
    • CallbackFilter with CallbackMetrics — mirrors test_multihop_callback_enforces_filtering from diskann-providers; tracks total_visits, rejected_count, adjusted_count, visited_ids; validates blocked ID exclusion and distance adjustment
    • AcceptAll — baseline for comparison in termination and adjustment tests
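
The filter behaviors listed above can be illustrated with a minimal, self-contained sketch. The `HopFilter` trait, the `Decision` enum, and the `walk` driver are hypothetical stand-ins invented for illustration and are not the diskann crate's actual API; only the filter names, the Mutex-based counter, and the `visit_count <= max_visits + 10` bound come from this PR.

```rust
use std::sync::Mutex;

/// Hypothetical outcome of visiting one candidate (not the diskann API).
#[derive(Debug, PartialEq)]
enum Decision {
    Accept,
    Reject,
    Terminate,
}

/// Hypothetical filter trait: called once per visited candidate.
trait HopFilter {
    fn on_visit(&self, id: u32) -> Decision;
}

/// Accepts only even IDs.
struct EvenFilter;

impl HopFilter for EvenFilter {
    fn on_visit(&self, id: u32) -> Decision {
        if id % 2 == 0 { Decision::Accept } else { Decision::Reject }
    }
}

/// Requests termination once `max_visits` candidates have been seen.
/// The counter sits behind a Mutex so it can be updated through `&self`.
struct TerminateAfterN {
    max_visits: usize,
    visits: Mutex<usize>,
}

impl TerminateAfterN {
    fn new(max_visits: usize) -> Self {
        Self { max_visits, visits: Mutex::new(0) }
    }

    fn visit_count(&self) -> usize {
        *self.visits.lock().unwrap()
    }
}

impl HopFilter for TerminateAfterN {
    fn on_visit(&self, _id: u32) -> Decision {
        let mut visits = self.visits.lock().unwrap();
        *visits += 1;
        if *visits >= self.max_visits { Decision::Terminate } else { Decision::Accept }
    }
}

/// Toy driver standing in for graph traversal: visits candidates in order
/// and stops as soon as the filter asks to terminate.
fn walk(filter: &dyn HopFilter, candidates: &[u32]) -> Vec<u32> {
    let mut accepted = Vec::new();
    for &id in candidates {
        match filter.on_visit(id) {
            Decision::Accept => accepted.push(id),
            Decision::Reject => {}
            Decision::Terminate => break,
        }
    }
    accepted
}

fn main() {
    let ids: Vec<u32> = (0..100).collect();

    // EvenFilter: every surviving ID must match the predicate.
    let even = walk(&EvenFilter, &ids);
    assert!(even.iter().all(|id| id % 2 == 0));

    // TerminateAfterN: the visit count stays within the slack used in the PR.
    let term = TerminateAfterN::new(5);
    let accepted = walk(&term, &ids);
    assert!(term.visit_count() <= term.max_visits + 10);
    println!("even: {}, accepted before termination: {}", even.len(), accepted.len());
}
```

In a real graph search the traversal may visit a few extra candidates before honoring a termination request, which is presumably why the ported assertion allows a slack of 10 rather than requiring an exact count.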

Infrastructure

  • Added bool to the VerboseEq impl_via_partial_eq! list (it was missing and is needed by the range_search_second_round field).
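
For readers unfamiliar with the pattern, the sketch below shows roughly how a macro can derive a verbose comparison from `PartialEq` for a list of primitive types. The trait definition, method name, error format, and macro body here are assumptions made for illustration; the real `VerboseEq` and `impl_via_partial_eq!` in diskann may look different.

```rust
/// Hypothetical comparison trait that describes a mismatch instead of
/// returning a bare `false`.
trait VerboseEq {
    fn verbose_eq(&self, other: &Self) -> Result<(), String>;
}

/// Hypothetical macro: implements `VerboseEq` for types whose `PartialEq`
/// and `Debug` impls are enough to compare and report values.
macro_rules! impl_via_partial_eq {
    ($($t:ty),* $(,)?) => {
        $(
            impl VerboseEq for $t {
                fn verbose_eq(&self, other: &Self) -> Result<(), String> {
                    if self == other {
                        Ok(())
                    } else {
                        Err(format!("expected {:?}, got {:?}", self, other))
                    }
                }
            }
        )*
    };
}

// Adding `bool` to the list is the kind of one-word change described above.
impl_via_partial_eq!(u32, usize, f32, bool);

fn main() {
    assert!(true.verbose_eq(&true).is_ok());
    let err = true.verbose_eq(&false).unwrap_err();
    println!("{err}");
}
```

Without the `bool` entry, a baseline struct with a boolean field such as `range_search_second_round` would have no comparison impl for that field under this scheme.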

Tests not portable

The following tests remain in diskann-providers because they depend on provider-specific infrastructure:

  • Beta filtering (BetaFilter strategy)
  • SIFT-based build+search tests (file I/O, quantization training)
  • Spherical data search, diversity search, flaky build/consolidation, infinity values

Test count

22 existing → 35 total (+3 range search, +12 multihop).

Original prompt

This section details the original issue you should resolve

<issue_title>Indexing unit tests should go in diskann</issue_title>
<issue_description>For historical reasons, unit tests exercising search and its various flavors (range, multihop filtered etc) live outside of the core diskann crate and instead live in diskann-providers in the large diskann_async.rs and are implemented using the inmem providers.

Further, these tests perform basic recall checks, but do not provide much visibility into more useful metrics like exact neighbors returned, number of get-vector calls etc.

These tests should be ported to the baseline testing infrastructure in diskann and all future such tests should be written there instead of diskann-providers. Ideally, I'd like the diskann crate on its own to have high test coverage and not be dependent on diskann-providers for its testing needs.

Any infrastructure developed within the diskann crate to achieve this can likely be reused to make writing future tests even easier.</issue_description>

Comments on the Issue (you are @copilot in this section)



Copilot AI and others added 3 commits March 19, 2026 22:26
…o diskann

Co-authored-by: harsha-simhadri <5590673+harsha-simhadri@users.noreply.github.com>
Port search test patterns from diskann-providers to diskann's baseline
testing infrastructure:

- grid_range_search.rs: Range search tests across 1D/3D/4D grids with
  outer radius, inner/outer (annular) radius, and varying query positions.
  Validates result distances are within bounds and tracks metrics.

- grid_multihop_search.rs: Multihop (label-filtered) search tests with
  EvenFilter (baseline), RejectAll, TerminateAfterN, and DistanceAdjuster
  filters. Validates filter correctness, early termination, and distance
  adjustment ranking effects.

- Add bool to VerboseEq impl_via_partial_eq list for baseline support.

All tests use the baseline caching infrastructure and record detailed
metrics (comparisons, hops, get_vector calls) for regression detection.

Co-authored-by: harsha-simhadri <5590673+harsha-simhadri@users.noreply.github.com>
Co-authored-by: harsha-simhadri <5590673+harsha-simhadri@users.noreply.github.com>
Copilot AI changed the title [WIP] Port indexing unit tests to diskann crate Port range search and multihop filtered search tests to diskann crate Mar 19, 2026
Copilot AI requested a review from harsha-simhadri March 19, 2026 22:37
codecov-commenter commented Mar 19, 2026

Codecov Report

❌ Patch coverage is 94.86486% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.44%. Comparing base (65e20ec) to head (43eab3b).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...skann/src/graph/test/cases/grid_multihop_search.rs | 94.90% | 13 Missing ⚠️ |
| diskann/src/graph/test/cases/grid_range_search.rs | 94.78% | 6 Missing ⚠️ |

@@            Coverage Diff             @@
##             main     #854      +/-   ##
==========================================
+ Coverage   89.29%   90.44%   +1.14%     
==========================================
  Files         443      445       +2     
  Lines       83441    83811     +370     
==========================================
+ Hits        74512    75802    +1290     
+ Misses       8929     8009     -920     
| Flag | Coverage Δ |
|---|---|
| miri | 90.44% <94.86%> (+1.14%) ⬆️ |
| unittests | 90.40% <94.86%> (+1.26%) ⬆️ |


| Files with missing lines | Coverage Δ |
|---|---|
| diskann/src/test/cmp.rs | 98.30% <ø> (+0.56%) ⬆️ |
| diskann/src/graph/test/cases/grid_range_search.rs | 94.78% <94.78%> (ø) |
| ...skann/src/graph/test/cases/grid_multihop_search.rs | 94.90% <94.90%> (ø) |

... and 39 files with indirect coverage changes


@harsha-simhadri
Contributor

@copilot review this PR and check specifically whether the new tests cover all aspects of the older version of the tests it is migrating

Range search:
- Add low starting_l test case to trigger two-round range search
- Add deduplication check (no duplicate IDs in results)
- Fix metric assertions for two-round range search

Multihop search:
- Add CallbackFilter with full metrics tracking (total_visits,
  rejected_count, adjusted_count, visited_ids) matching the
  diskann-providers test_multihop_callback_enforces_filtering test
- Improve TerminateAfterN to use Mutex-based visit counter matching
  old test pattern, with visit_count <= max_visits + 10 assertion
- Add separate visit-count-bounded test and comparison-bounded test

Test count: 22 existing → 35 total (+3 range, +12 multihop).

Co-authored-by: harsha-simhadri <5590673+harsha-simhadri@users.noreply.github.com>
Agent-Logs-Url: https://github.com/microsoft/DiskANN/sessions/38bcf170-21bb-4b8b-90cb-2af4f5382c7e
Contributor Author

Copilot AI commented Mar 20, 2026

@copilot review this PR and check specifically whether the new tests cover all aspects of the older version of the tests it is migrating

I performed a detailed comparison of the old tests in diskann-providers/src/index/diskann_async.rs against the new tests. Here's the gap analysis and what I've fixed in cf8febe:

Gaps found and addressed:

  1. Range search: Added deduplication check (no duplicate IDs in results) and a low-starting_l test case that triggers the two-round range search path — both were covered in the old test_sift_build_and_range_search but missing here.

  2. Callback metrics tracking: Added CallbackFilter with full CallbackMetrics (total_visits, rejected_count, adjusted_count, visited_ids) mirroring the old test_multihop_callback_enforces_filtering. Validates blocked ID exclusion and distance adjustment.

  3. Termination visit count bounds: Improved TerminateAfterN to use a Mutex-based counter (matching the old pattern) with the visit_count <= max_visits + 10 assertion from the old test_multihop_terminate_stops_traversal.
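
The range-search checks in item 1 (distance bounds and deduplication) can be sketched as a small standalone helper. The `Hit` type, the function name, and the choice of inclusive bounds are assumptions for illustration; the actual assertions in grid_range_search.rs may be structured differently.

```rust
use std::collections::HashSet;

/// Hypothetical result entry: (id, distance to the query).
type Hit = (u32, f32);

/// Validates range-search results: no duplicate IDs, every distance at most
/// `outer`, and (for annular searches) at least `inner`.
/// Inclusive bounds are an assumption of this sketch.
fn validate_range_results(hits: &[Hit], inner: Option<f32>, outer: f32) -> Result<(), String> {
    let mut seen = HashSet::new();
    for &(id, dist) in hits {
        // `insert` returns false when the ID was already present.
        if !seen.insert(id) {
            return Err(format!("duplicate id {id}"));
        }
        if dist > outer {
            return Err(format!("id {id}: distance {dist} exceeds outer radius {outer}"));
        }
        if let Some(r) = inner {
            if dist < r {
                return Err(format!("id {id}: distance {dist} is inside inner radius {r}"));
            }
        }
    }
    Ok(())
}

fn main() {
    // Outer-radius-only search: both hits are within the radius.
    assert!(validate_range_results(&[(1, 1.0), (2, 2.0)], None, 2.0).is_ok());

    // A repeated ID is rejected, which is the failure a two-round search
    // could produce if results from both rounds were merged without deduplication.
    assert!(validate_range_results(&[(1, 1.0), (1, 1.5)], None, 2.0).is_err());

    // Annular search: a hit closer than the inner radius is rejected.
    assert!(validate_range_results(&[(1, 0.5)], Some(1.0), 2.0).is_err());

    println!("range result checks passed");
}
```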

Tests that cannot be ported (depend on diskann-providers infrastructure):

  • test_even_filtering_beta — uses BetaFilter strategy from diskann-providers
  • test_sift_build_and_range_search — uses SIFT data files + inmem providers
  • SIFT build+search tests (scalar/spherical/PQ quantization) — require quantization providers
  • test_sphere_search — uses spherical data generation from providers
  • Diversity search, flaky build/consolidation, infinity values — require specialized providers

Test count is now 22 existing → 35 total (+3 range search, +12 multihop).



Development

Successfully merging this pull request may close these issues.

Indexing unit tests should go in diskann

3 participants