
feat(rust): add search_with_filter to CAGRA Index #2019

Open
jamie8johnson wants to merge 1 commit into rapidsai:main from jamie8johnson:rust-search-filter

Conversation

@jamie8johnson

Summary

Add an Index::search_with_filter() method to the CAGRA Rust bindings that accepts a bitset filter via a DLPack ManagedTensor. The C API cuvsCagraSearch() already supports cuvsFilter with the BITSET type, but the Rust bindings hardcoded NO_FILTER. This exposes the existing capability to Rust consumers.

Details

The bitset is a 1-D uint32 device tensor with ceil(n_rows / 32) elements. Bit = 1 includes the row, bit = 0 excludes it. Filtering happens during CAGRA graph traversal, not post-retrieval, giving better recall than over-fetch-and-filter approaches.
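The layout above can be sketched on the host. `build_bitset` is an illustrative helper, not part of the cuvs crate; the resulting words would still need to be copied to the device and wrapped in a DLPack ManagedTensor before searching:

```rust
// Host-side sketch of the bitset layout described above.
// `build_bitset` is a hypothetical helper, not part of the cuvs API.
fn build_bitset(n_rows: usize, include: impl Fn(usize) -> bool) -> Vec<u32> {
    // ceil(n_rows / 32) words; all bits start at 0 (every row excluded)
    let mut words = vec![0u32; n_rows.div_ceil(32)];
    for row in 0..n_rows {
        if include(row) {
            words[row / 32] |= 1 << (row % 32); // bit = 1 includes the row
        }
    }
    words
}
```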

The existing search() method is unchanged (backward compatible). search_with_filter() is additive.

Motivation

We use CAGRA via the Rust crate for a code search tool. Currently we post-filter search results by metadata (chunk type, language), which requires 3x over-fetching to compensate for filtered-out candidates. Native bitset filtering would eliminate the over-fetch, reduce GPU work, and improve recall for filtered queries.

Test

test_cagra_search_with_filter: builds a 256-point index, creates a bitset that includes only even-indexed rows, searches with the filter, and verifies all returned neighbors are even-indexed.
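The verification step in that test can be sketched as a host-side check; `bit_is_set` is an illustrative helper mirroring the bitset layout, not part of the bindings:

```rust
// True when `row` is included by the bitset, using the same
// uint32 one-bit-per-row layout the filter expects.
fn bit_is_set(bitset: &[u32], row: usize) -> bool {
    (bitset[row / 32] >> (row % 32)) & 1 == 1
}
```

In the test, every returned neighbor id n must satisfy bit_is_set(&bitset, n), i.e. be even-indexed.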

Changes

  • rust/cuvs/src/cagra/index.rs: Added search_with_filter() method + test

Add `Index::search_with_filter()` that accepts a bitset filter via
DLPack ManagedTensor. The C API `cuvsCagraSearch()` already supports
`cuvsFilter` with BITSET type, but the Rust bindings hardcoded
`NO_FILTER`. This exposes the existing capability.

The bitset is a 1-D uint32 device tensor with ceil(n_rows / 32)
elements. Bit = 1 includes the row, bit = 0 excludes it. Filtering
happens during graph traversal, not post-retrieval.

Includes test: builds a 256-point index, filters to even-indexed
rows, verifies all returned neighbors pass the filter.
@copy-pr-bot

copy-pr-bot bot commented Apr 13, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@jamie8johnson
Author

Related to #1464 — this exposes the existing C-level CAGRA bitset filter to the Rust bindings. The C API already supports it via cuvsFilter (added in #452); this PR just stops hardcoding NO_FILTER in the Rust wrapper.

@jamie8johnson
Author

jamie8johnson commented Apr 13, 2026

Labels needed: improvement + non-breaking (as an external contributor, I cannot self-label).

jamie8johnson added a commit to jamie8johnson/cqs that referenced this pull request Apr 13, 2026
Override VectorIndex::search_with_filter for CagraIndex: builds a
bitset from the predicate on host, uploads to GPU, and passes it to
CAGRA's traversal-time filter via search_with_filter (patched cuvs).

Eliminates the 3x over-fetch workaround — k=100 goes directly to GPU
as k=100 with a bitset, not k=300 unfiltered + post-filter. Better
recall for type-filtered and language-filtered queries.

Also:
- Cargo.toml [patch.crates-io] pointing to ../cuvs-patched (local
  patched cuvs 26.4 with search_with_filter, upstream PR
  rapidsai/cuvs#2019)
- Demoted CAGRA itopk_size clamp warning to debug level

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jamie8johnson added a commit to jamie8johnson/cqs that referenced this pull request Apr 14, 2026
* feat: enrichment ablation + optimal routing + batch base index support

Two-arm eval at 78% summary coverage with per-category SPLADE:
- Base (no summaries): 42.3% R@1
- Enriched (with summaries): 41.9% R@1
- Oracle (best per category): 43.8% R@1 (+1.9pp)

Router updated based on per-category results:
- type_filtered → DenseBase (+8.4pp: 41.7% vs 33.3%)
- multi_step → DenseBase (+2.9pp: 23.5% vs 20.6%)
- structural/conceptual/cross_language stay enriched

Batch handler now supports base/enriched HNSW routing:
- Added base_hnsw field + base_vector_index() to BatchContext
- dispatch_search classifies queries and routes to base when appropriate
- CQS_FORCE_BASE_INDEX=1 env var for eval A/B testing
- Fixes daemon always using enriched regardless of classification

Other:
- Demote CAGRA itopk_size clamp warning to debug level
- Fix stale Cargo.toml comment about cuVS CUDA compatibility
- Cargo.lock updated for merged dep bumps (#935-#938)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: GPU-native CAGRA filtered search via bitset

Override VectorIndex::search_with_filter for CagraIndex: builds a
bitset from the predicate on host, uploads to GPU, and passes it to
CAGRA's traversal-time filter via search_with_filter (patched cuvs).

Eliminates the 3x over-fetch workaround — k=100 goes directly to GPU
as k=100 with a bitset, not k=300 unfiltered + post-filter. Better
recall for type-filtered and language-filtered queries.

Also:
- Cargo.toml [patch.crates-io] pointing to ../cuvs-patched (local
  patched cuvs 26.4 with search_with_filter, upstream PR
  rapidsai/cuvs#2019)
- Demoted CAGRA itopk_size clamp warning to debug level

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: clippy div_ceil + CI-safe cuvs patch via git dep

- Fix clippy::manual_div_ceil in CAGRA bitset construction
- Switch [patch.crates-io] from local path to git repo
  (CI can't access ../cuvs-patched)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
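The clippy fix above concerns the bitset word-count computation. A minimal sketch of what `clippy::manual_div_ceil` flags, assuming the count is computed as described:

```rust
// Number of u32 words needed to hold one bit per row.
// `(n_rows + 31) / 32` is what clippy::manual_div_ceil flags;
// usize::div_ceil (stable since Rust 1.73) is the idiomatic form.
fn bitset_words(n_rows: usize) -> usize {
    n_rows.div_ceil(32)
}
```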

* refactor: simplify CAGRA to non-consuming search (cuVS 26.4)

cuVS 26.4 changed Index::search(self) to search(&self). This
eliminates the entire take-rebuild cycle that was causing SIGABRT
in the daemon under sustained use (repeated GPU index rebuilds
corrupted CUDA state).

Removed:
- IndexRebuilder RAII guard
- rebuild_index_with_resources / ensure_index_rebuilt
- dataset field (was cached for rebuilds)
- Mutex<Option<Index>> → single Mutex<GpuState> (resources + index)

The index is now built once and reused for all searches.
search_impl() is shared between filtered and unfiltered paths.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
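The single-lock layout this commit describes can be sketched as follows; `Resources` and `Index` are stand-ins for the real cuvs types, and the field and method names are assumptions, not the actual cqs code:

```rust
use std::sync::Mutex;

// Stand-ins for cuvs::Resources and cagra::Index (hypothetical).
struct Resources;
struct Index;

// Resources and index live and die together under one lock,
// replacing the old Mutex<Option<Index>> plus rebuild machinery.
struct GpuState {
    resources: Resources,
    index: Index,
}

struct CagraIndex {
    gpu: Mutex<GpuState>,
}

impl CagraIndex {
    // With cuVS 26.4's search(&self), a shared borrow of the locked
    // state suffices; no take-and-rebuild cycle is needed.
    fn with_index<R>(&self, f: impl FnOnce(&Resources, &Index) -> R) -> R {
        let state = self.gpu.lock().expect("GPU state lock poisoned");
        f(&state.resources, &state.index)
    }
}
```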

* docs: eval results, tears, roadmap updates

- 8 eval run results (enrichment ablation, CAGRA filtering, routing)
- Updated roadmap: marked completed items, added HyDE/CAGRA items
- Updated tears with session summary and next priorities
- Added cuvs-fork-push to .gitignore

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* release: v1.24.0 — GPU-native CAGRA filtering + daemon stability

- Bump version 1.23.0 → 1.24.0
- CHANGELOG: add v1.24.0 entry, fold stale [Unreleased] into v1.23.0
- README: note cuVS 26.04 conda requirement + patched crate

Highlights:
- CAGRA native bitset filtering (GPU-side, replaces 3x over-fetch)
- Batch/daemon base index routing fix
- Router: type_filtered + multi_step → DenseBase
- cuVS 26.4: fixes daemon SIGABRT under sustained CAGRA load
- cagra.rs simplified (−357 lines via non-consuming search)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: jamie8johnson <jamie8johnson@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>