Update cuvs-bench docs to show both CLI and Python API usage #2084

Merged
rapids-bot[bot] merged 5 commits into rapidsai:main from jnke2016:update_docs
May 14, 2026
Conversation

@jnke2016
Contributor

No description provided.

@coderabbitai

coderabbitai Bot commented May 14, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Summary by CodeRabbit

  • Documentation

    • Added Python API examples showing how to programmatically perform the "build and search index" step for both 10M and 100M vector benchmark workflows, including algorithm selection, build/search parameter sets, tuning mode, and batch sizing.
  • Refactor

    • Centralized benchmark config loading to unify dataset handling, discover backend algorithm groups, pre-expand parameter combinations, support tune-mode flows, and aggregate index builds per backend executable.

Walkthrough

This PR adds Python API documentation examples to the benchmark guide and refactors ConfigLoader.load() into a template-method flow that discovers backend algo groups, expands build/search parameter grids, and delegates BenchmarkConfig construction to backend-specific hooks (with CppGBench updated accordingly).
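
The template-method flow described above can be sketched roughly as follows. This is a minimal illustration, not the cuvs_bench implementation: the hook names `_discover_algo_groups()` and `_build_benchmark_configs()` come from the walkthrough, but their signatures, the `SketchConfigLoader` and `ToyLoader` classes, the `expand_grid` helper, and the parameter names `graph_degree` and `itopk` are all assumptions made for the example.

```python
from abc import ABC, abstractmethod
from itertools import product
from typing import Any


def expand_grid(grid: dict[str, list[Any]]) -> list[dict[str, Any]]:
    """Return the Cartesian product of a {name: [values]} grid as a list of dicts."""
    names = list(grid)
    return [dict(zip(names, combo)) for combo in product(*(grid[n] for n in names))]


class SketchConfigLoader(ABC):
    """Hypothetical stand-in for ConfigLoader; NOT the real cuvs_bench class."""

    def load(self, **kwargs: Any) -> list[dict[str, Any]]:
        # Shared workflow: discover groups, expand grids, delegate to the backend hook.
        expanded_groups = []
        for algo, group, build_grid, search_grid in self._discover_algo_groups(**kwargs):
            expanded_groups.append(
                (algo, group, expand_grid(build_grid), expand_grid(search_grid))
            )
        return self._build_benchmark_configs(expanded_groups)

    @abstractmethod
    def _discover_algo_groups(self, **kwargs: Any) -> list[tuple]:
        """Backend hook: return (algo, group, build_grid, search_grid) tuples."""

    @abstractmethod
    def _build_benchmark_configs(self, expanded_groups: list[tuple]) -> list[dict[str, Any]]:
        """Backend hook: turn pre-expanded groups into config entries."""


class ToyLoader(SketchConfigLoader):
    """Toy backend with one algorithm group and made-up parameter names."""

    def _discover_algo_groups(self, **kwargs: Any) -> list[tuple]:
        return [("toy_algo", "base", {"graph_degree": [32, 64]}, {"itopk": [64, 128, 256]})]

    def _build_benchmark_configs(self, expanded_groups: list[tuple]) -> list[dict[str, Any]]:
        return [
            {"algo": algo, "group": group, "build": build, "search": search}
            for algo, group, build, search in expanded_groups
        ]


configs = ToyLoader().load()
```

With this split, the base class owns the ordering of the steps and the grid expansion, and a backend only decides which groups exist and how expanded groups become configs.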

Changes

Programmatic Benchmark Alternatives

| Layer / File(s) | Summary |
| --- | --- |
| Python API examples for workflow steps (`docs/source/cuvs_bench/index.rst`) | Adds Python code examples for both small-scale (10M vectors) and large-scale (100M vectors) workflows demonstrating BenchmarkOrchestrator instantiation and run_benchmark calls with dataset, algorithm, count/batch, and build/search parameters. |

Orchestrator ConfigLoader refactor

All layers below are in `python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py`.

| Layer | Summary |
| --- | --- |
| ConfigLoader docstring and public hook signatures | Updates ConfigLoader docstring and backend hook signatures to document the new template-method flow and expanded_groups contract. |
| `ConfigLoader.load()` shared workflow and expansion | Implements `load()` to call `_discover_algo_groups()`, expand build/search parameter grids per group, and pass expanded_groups to `_build_benchmark_configs()`. |
| CppGBench `_discover_algo_groups()` and filtering | Adds backend-specific discovery that extracts kwargs, loads algorithms, applies filters/validation (including GPU gating), and returns per-(algo, group) metadata with param grids. |
| Build BenchmarkConfig grouping by executable | Reworks `_build_benchmark_configs()` to iterate expanded_groups, select tune vs Cartesian params, call `prepare_indexes()`, and group produced indexes by executable into single BenchmarkConfig entries; small backend_config assembly tweak. |
| `prepare_indexes()` signature and tune-mode handling | Changes `prepare_indexes()` to accept pre-expanded all_build_params/all_search_params, updates docs, and treats tune-mode params as exact (skipping constraint/search validation). |
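
The "group produced indexes by executable" step can be illustrated with a small sketch. The `group_by_executable` helper, the dictionary keys, and the `BENCH_A`/`BENCH_B` names are hypothetical; the real code builds BenchmarkConfig objects whose fields are not shown in this PR summary.

```python
from collections import defaultdict
from typing import Any


def group_by_executable(indexes: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Merge index entries that share an executable into one config entry each."""
    grouped: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for index in indexes:
        grouped[index["executable"]].append(index)
    # dicts preserve insertion order, so entries follow first-seen executable order.
    return [{"executable": exe, "indexes": items} for exe, items in grouped.items()]


# Hypothetical index entries produced for two algorithms.
indexes = [
    {"name": "algo_a.k1", "executable": "BENCH_A"},
    {"name": "algo_b.k1", "executable": "BENCH_B"},
    {"name": "algo_a.k2", "executable": "BENCH_A"},
]
grouped_configs = group_by_executable(indexes)
```

Grouping this way means each benchmark executable would be launched once with all of its indexes, rather than once per index.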

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ❓ Inconclusive | No description was provided by the author, making it impossible to assess relevance to the changeset. | Add a pull request description explaining the changes, such as what CLI and Python API examples were added and why. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main change: adding Python API usage examples alongside existing CLI documentation for cuvs-bench. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Comment @coderabbitai help to get the list of available commands and usage tips.

@jnke2016 requested a review from a team as a code owner on May 14, 2026 19:18

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py`:
- Around line 533-557: The code currently uses tune_mode as a blanket flag when
calling prepare_indexes even if tune_build_params or tune_search_params are
missing, causing validation to be skipped for fallback grids; change the call
site so that prepare_indexes only receives tune_mode for the specific lists that
came from Optuna: compute booleans (e.g., build_from_tune = tune_mode and
tune_build_params is not None; search_from_tune = tune_mode and
tune_search_params is not None), set actual_build/actual_search as you already
do, and modify the prepare_indexes call to accept and use these finer-grained
flags (or pass separate parameters like tune_build_flag and tune_search_flag) so
validation runs for the fallback build_combos/search_combos while being skipped
only for the real tuned param lists.
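
One way to read the suggestion above is sketched below. The `select_params` helper is hypothetical and exists only to make the flag logic testable in isolation; the variable names mirror those in the comment, and the parameter names `graph_degree` and `itopk` are made up for the example.

```python
from typing import Any, Optional

Params = dict[str, Any]


def select_params(
    tune_mode: bool,
    tune_build_params: Optional[Params],
    tune_search_params: Optional[Params],
    build_combos: list[Params],
    search_combos: list[Params],
) -> tuple[list[Params], list[Params], bool, bool]:
    """Pick tuned or fallback params per side and report which side was tuned."""
    build_from_tune = tune_mode and tune_build_params is not None
    search_from_tune = tune_mode and tune_search_params is not None
    # Copy tuned dicts so later mutation does not leak back to the caller.
    actual_build = [dict(tune_build_params)] if build_from_tune else build_combos
    actual_search = [dict(tune_search_params)] if search_from_tune else search_combos
    return actual_build, actual_search, build_from_tune, search_from_tune


# Only build params were tuned, so the search side falls back to the grid
# and should still be validated downstream.
build, search, build_flag, search_flag = select_params(
    tune_mode=True,
    tune_build_params={"graph_degree": 64},
    tune_search_params=None,
    build_combos=[{"graph_degree": 32}],
    search_combos=[{"itopk": 64}, {"itopk": 128}],
)
```

The two returned flags are what would be forwarded to `prepare_indexes()` so that each side's validation is skipped independently.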

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 7cbdab07-950d-4a0d-a260-08cb56e697b5

📥 Commits

Reviewing files that changed from the base of the PR and between 7453082 and 1708b56.

📒 Files selected for processing (1)
  • python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py

Comment thread python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py Outdated

@coderabbitai coderabbitai Bot left a comment


♻️ Duplicate comments (1)
python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py (1)

525-549: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Tune flow still uses a blanket validation bypass.

build_from_tune is passed as tune_mode into prepare_indexes(), which skips both build and search validation. This still bypasses search validation when only build params are tuned. Split build/search skip flags and only skip the side that actually came from Optuna.

💡 Minimal fix
-            build_from_tune = tune_mode and tune_build_params is not None
-            if build_from_tune:
+            skip_build_validation = (
+                tune_mode and tune_build_params is not None
+            )
+            skip_search_validation = (
+                tune_mode and tune_search_params is not None
+            )
+
+            if skip_build_validation:
                 actual_build = [tune_build_params.copy()]
-                actual_search = (
-                    [tune_search_params.copy()]
-                    if tune_search_params
-                    else [{}]
-                )
             else:
                 actual_build = build_combos
-                actual_search = search_combos
+
+            if skip_search_validation:
+                actual_search = [tune_search_params.copy()]
+            else:
+                actual_search = search_combos

             indexes = self.prepare_indexes(
                 actual_build,
                 actual_search,
@@
-                tune_mode=build_from_tune,
+                skip_build_validation=skip_build_validation,
+                skip_search_validation=skip_search_validation,
             )
-        tune_mode: bool = False,
+        skip_build_validation: bool = False,
+        skip_search_validation: bool = False,
@@
-            if not tune_mode:
+            if not skip_build_validation:
                 if not self.validate_constraints(
@@
-            if tune_mode:
+            if skip_search_validation:
                 index["search_params"] = list(all_search_params)
             else:
                 index["search_params"] = self.validate_search_params(

As per coding guidelines, "Ensure missing validation does not cause crashes on invalid input through proper size/type checks".

Also applies to: 786-856
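
The effect of splitting the flag inside `prepare_indexes()` can be shown with a stand-alone sketch. It assumes nothing about the real method beyond what the diff above shows: `prepare_indexes_sketch`, the `build_ok`/`search_ok` callables (stand-ins for `validate_constraints` and `validate_search_params`, whose real signatures are not visible here), and the parameter names are all hypothetical.

```python
from typing import Any, Callable

Params = dict[str, Any]


def prepare_indexes_sketch(
    all_build_params: list[Params],
    all_search_params: list[Params],
    *,
    build_ok: Callable[[Params], bool],
    search_ok: Callable[[Params, Params], bool],
    skip_build_validation: bool = False,
    skip_search_validation: bool = False,
) -> list[dict[str, Any]]:
    """Build index entries, validating each side unless its skip flag is set."""
    indexes = []
    for build in all_build_params:
        # Tuned build params are treated as exact; grid params must pass the check.
        if not skip_build_validation and not build_ok(build):
            continue
        if skip_search_validation:
            search = list(all_search_params)
        else:
            search = [s for s in all_search_params if search_ok(build, s)]
        indexes.append({"build_param": build, "search_params": search})
    return indexes


# Tuned build params (validation skipped) combined with a fallback search grid
# (validation still applied). The limits used here are invented for the example.
result = prepare_indexes_sketch(
    [{"graph_degree": 128}],
    [{"itopk": 16}, {"itopk": 64}],
    build_ok=lambda b: b["graph_degree"] <= 64,
    search_ok=lambda b, s: s["itopk"] >= 32,
    skip_build_validation=True,
    skip_search_validation=False,
)
```

With a single blanket flag, the `itopk: 16` entry would have been passed through unvalidated; with separate flags it is filtered out while the tuned build params are kept as given.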

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py` around lines 525
- 549, The tune flow currently passes a single tune_mode (build_from_tune) into
prepare_indexes which causes both build and search validations to be skipped;
change the logic to compute two separate flags (e.g., build_tune_mode =
bool(tune_build_params) and search_tune_mode = bool(tune_search_params)), use
actual_build/actual_search as before, and update the prepare_indexes signature
and all its call sites (including the other occurrence noted) to accept and use
these two flags so only the side coming from Optuna skips validation; inside
prepare_indexes, ensure it only bypasses build validation when build_tune_mode
is true and only bypasses search validation when search_tune_mode is true.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 408ce488-f8a5-4da4-94f9-16f1329a36b6

📥 Commits

Reviewing files that changed from the base of the PR and between a55eba8 and b0d83b7.

📒 Files selected for processing (1)
  • python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py

@cjnolet added the doc (Improvements or additions to documentation) and non-breaking (Introduces a non-breaking change) labels on May 14, 2026

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py (1)

530-537: 💤 Low value

Rename unused loop variable group_conf to _group_conf.

The variable is unpacked but never referenced in the loop body. Prefix with underscore to signal intentional disuse.

         for (
             algo,
             group,
-            group_conf,
+            _group_conf,
             build_combos,
             search_combos,
             group_meta,
         ) in expanded_groups:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py` around lines 530
- 537, The unpacking in the for-loop over expanded_groups uses the variable name
group_conf which is never referenced; rename it to _group_conf in the tuple
unpacking (the for (algo, group, _group_conf, build_combos, search_combos,
group_meta) in expanded_groups:) to indicate intentional unused variable and
silence linters while leaving surrounding logic (algo, group, build_combos,
search_combos, group_meta) unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 698351c6-0833-4b84-8d51-a48a66a17aa2

📥 Commits

Reviewing files that changed from the base of the PR and between b0d83b7 and d92822d.

📒 Files selected for processing (1)
  • python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py

@cjnolet
Member

cjnolet commented May 14, 2026

/merge

@rapids-bot rapids-bot Bot merged commit 49ce810 into rapidsai:main May 14, 2026
71 of 72 checks passed
