Update cuvs-bench docs to show both CLI and Python API usage by jnke2016 · Pull Request #2084 · rapidsai/cuvs

jnke2016 · 2026-05-14T17:29:52Z

No description provided.

coderabbitai · 2026-05-14T17:31:32Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

📝 Walkthrough

Summary by CodeRabbit

Documentation
- Added Python API examples showing how to programmatically perform the "build and search index" step for both 10M and 100M vector benchmark workflows, including algorithm selection, build/search parameter sets, tuning mode, and batch sizing.
Refactor
- Centralized benchmark config loading to unify dataset handling, discover backend algorithm groups, pre-expand parameter combinations, support tune-mode flows, and aggregate index builds per backend executable.

Walkthrough

This PR adds Python API documentation examples to the benchmark guide and refactors ConfigLoader.load() into a template-method flow that discovers backend algo groups, expands build/search parameter grids, and delegates BenchmarkConfig construction to backend-specific hooks (with CppGBench updated accordingly).
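The template-method flow described above can be illustrated with a minimal, self-contained sketch. It is not the code from `config_loaders.py`: the class names `ConfigLoaderSketch` and `ToyBackendLoader`, the four-element group tuple, and the `_expand` helper are assumptions made for illustration. Only the hook names `_discover_algo_groups()` and `_build_benchmark_configs()` and the `expanded_groups` argument come from the walkthrough.

```python
from abc import ABC, abstractmethod
from itertools import product


class ConfigLoaderSketch(ABC):
    """Hypothetical stand-in for a shared loader base class."""

    def load(self, **kwargs):
        # Shared workflow: discover groups, expand their grids, then
        # delegate config construction to the backend-specific hook.
        expanded_groups = []
        for algo, group, build_grid, search_grid in self._discover_algo_groups(**kwargs):
            expanded_groups.append(
                (algo, group, self._expand(build_grid), self._expand(search_grid))
            )
        return self._build_benchmark_configs(expanded_groups)

    @staticmethod
    def _expand(grid):
        # Cartesian product of a {name: [values]} grid.
        # An empty grid yields a single empty parameter set: [{}].
        keys = list(grid)
        return [dict(zip(keys, combo)) for combo in product(*(grid[k] for k in keys))]

    @abstractmethod
    def _discover_algo_groups(self, **kwargs):
        """Yield (algo, group, build_grid, search_grid) tuples."""

    @abstractmethod
    def _build_benchmark_configs(self, expanded_groups):
        """Turn pre-expanded groups into backend-specific configs."""


class ToyBackendLoader(ConfigLoaderSketch):
    """Illustrative backend; the algorithm and parameters are made up."""

    def _discover_algo_groups(self, **kwargs):
        yield ("toy_algo", "base", {"nlist": [1024, 2048]}, {"nprobe": [10, 20, 50]})

    def _build_benchmark_configs(self, expanded_groups):
        return [
            {"algo": algo, "group": group, "build": build, "search": search}
            for algo, group, build, search in expanded_groups
        ]


configs = ToyBackendLoader().load()
```

With this shape, a backend never expands a grid itself; it only describes its groups and consumes the combinations that `load()` hands back.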

Changes

Programmatic Benchmark Alternatives

| Layer / File(s) | Summary |
| --- | --- |
| Python API examples for workflow steps `docs/source/cuvs_bench/index.rst` | Adds Python code examples for both small-scale (10M vectors) and large-scale (100M vectors) workflows demonstrating `BenchmarkOrchestrator` instantiation and `run_benchmark` calls with dataset, algorithm, count/batch, and build/search parameters. |

Orchestrator ConfigLoader refactor

| Layer / File(s) | Summary |
| --- | --- |
| ConfigLoader docstring and public hook signatures `python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py` | Updates ConfigLoader docstring and backend hook signatures to document the new template-method flow and `expanded_groups` contract. |
| ConfigLoader.load() shared workflow and expansion `python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py` | Implements `load()` to call `_discover_algo_groups()`, expand build/search parameter grids per group, and pass `expanded_groups` to `_build_benchmark_configs()`. |
| CppGBench _discover_algo_groups() and filtering `python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py` | Adds backend-specific discovery that extracts kwargs, loads algorithms, applies filters/validation (including GPU gating), and returns per-(algo,group) metadata with param grids. |
| Build BenchmarkConfig grouping by executable `python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py` | Reworks `_build_benchmark_configs()` to iterate `expanded_groups`, select tune vs Cartesian params, call `prepare_indexes()`, and group produced indexes by executable into single `BenchmarkConfig` entries; small `backend_config` assembly tweak. |
| prepare_indexes() signature and tune-mode handling `python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py` | Changes `prepare_indexes()` to accept pre-expanded `all_build_params`/`all_search_params`, updates docs, and treats tune-mode params as exact (skipping constraint/search validation). |
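The "group produced indexes by executable" step can be sketched on its own. The snippet below is an illustration under stated assumptions, not the PR's implementation: the function name `group_indexes_by_executable`, the `(executable, index)` pair layout, the plain-dict config shape, and the executable names `bench_a` and `bench_b` are all hypothetical.

```python
from collections import defaultdict


def group_indexes_by_executable(indexes):
    """Collect (executable, index) pairs into one config entry per executable.

    Hypothetical helper: a real loader would likely build richer config objects.
    """
    grouped = defaultdict(list)
    for executable, index in indexes:
        grouped[executable].append(index)
    # Dicts preserve insertion order, so executables appear in first-seen order.
    return [
        {"executable": executable, "indexes": grouped_indexes}
        for executable, grouped_indexes in grouped.items()
    ]


# Made-up executables and index names, for illustration only.
produced = [
    ("bench_a", {"name": "algo_x.nlist1024"}),
    ("bench_b", {"name": "algo_y.degree32"}),
    ("bench_a", {"name": "algo_x.nlist2048"}),
]

grouped_configs = group_indexes_by_executable(produced)
```

Grouping this way means each executable is described once, with all of its indexes attached, rather than once per parameter combination.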

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ❓ Inconclusive | No description was provided by the author, making it impossible to assess relevance to the changeset. | Add a pull request description explaining the changes, such as what CLI and Python API examples were added and why. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main change: adding Python API usage examples alongside existing CLI documentation for cuvs-bench. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py`:
- Around line 533-557: The code currently uses tune_mode as a blanket flag when
calling prepare_indexes even if tune_build_params or tune_search_params are
missing, causing validation to be skipped for fallback grids; change the call
site so that prepare_indexes only receives tune_mode for the specific lists that
came from Optuna: compute booleans (e.g., build_from_tune = tune_mode and
tune_build_params is not None; search_from_tune = tune_mode and
tune_search_params is not None), set actual_build/actual_search as you already
do, and modify the prepare_indexes call to accept and use these finer-grained
flags (or pass separate parameters like tune_build_flag and tune_search_flag) so
validation runs for the fallback build_combos/search_combos while being skipped
only for the real tuned param lists.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 7cbdab07-950d-4a0d-a260-08cb56e697b5

📥 Commits

Reviewing files that changed from the base of the PR and between 7453082 and 1708b56.

📒 Files selected for processing (1)

python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py

coderabbitai

♻️ Duplicate comments (1)

python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py (1)

525-549: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Tune flow still uses a blanket validation bypass.

build_from_tune is passed as tune_mode into prepare_indexes(), which skips both build and search validation. This still bypasses search validation when only build params are tuned. Split build/search skip flags and only skip the side that actually came from Optuna.

💡 Minimal fix

-            build_from_tune = tune_mode and tune_build_params is not None
-            if build_from_tune:
+            skip_build_validation = (
+                tune_mode and tune_build_params is not None
+            )
+            skip_search_validation = (
+                tune_mode and tune_search_params is not None
+            )
+
+            if skip_build_validation:
                 actual_build = [tune_build_params.copy()]
-                actual_search = (
-                    [tune_search_params.copy()]
-                    if tune_search_params
-                    else [{}]
-                )
             else:
                 actual_build = build_combos
-                actual_search = search_combos
+
+            if skip_search_validation:
+                actual_search = [tune_search_params.copy()]
+            else:
+                actual_search = search_combos

             indexes = self.prepare_indexes(
                 actual_build,
                 actual_search,
@@
-                tune_mode=build_from_tune,
+                skip_build_validation=skip_build_validation,
+                skip_search_validation=skip_search_validation,
             )

-        tune_mode: bool = False,
+        skip_build_validation: bool = False,
+        skip_search_validation: bool = False,
@@
-            if not tune_mode:
+            if not skip_build_validation:
                 if not self.validate_constraints(
@@
-            if tune_mode:
+            if skip_search_validation:
                 index["search_params"] = list(all_search_params)
             else:
                 index["search_params"] = self.validate_search_params(

As per coding guidelines, "Ensure missing validation does not cause crashes on invalid input through proper size/type checks".

Also applies to: 786-856
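The split requested in this comment can be sketched in isolation. The following is a simplified illustration, not the code in `config_loaders.py`: `select_params` and `keep_valid` are hypothetical helpers, and the `nprobe <= 4096` constraint is an invented rule used only to show validation still running on the fallback search grid when only build params came from Optuna.

```python
def select_params(tune_mode, tune_build_params, tune_search_params,
                  build_combos, search_combos):
    """Pick tuned params or fallback grids independently for each side."""
    skip_build_validation = tune_mode and tune_build_params is not None
    skip_search_validation = tune_mode and tune_search_params is not None

    actual_build = (
        [dict(tune_build_params)] if skip_build_validation else build_combos
    )
    actual_search = (
        [dict(tune_search_params)] if skip_search_validation else search_combos
    )
    return actual_build, actual_search, skip_build_validation, skip_search_validation


def keep_valid(params_list, is_valid, skip_validation):
    """Treat tuned params as exact; filter fallback grids through is_valid."""
    if skip_validation:
        return list(params_list)
    return [params for params in params_list if is_valid(params)]


# Only the build side was tuned; the search side falls back to the grid.
build, search, skip_build, skip_search = select_params(
    tune_mode=True,
    tune_build_params={"nlist": 4096},
    tune_search_params=None,
    build_combos=[{"nlist": 1024}],
    search_combos=[{"nprobe": 10}, {"nprobe": 5000}],
)

# Invented constraint, for illustration only.
search = keep_valid(search, lambda p: p["nprobe"] <= 4096, skip_search)
```

Here the tuned build params pass through untouched, while the out-of-range fallback search combination is still filtered out, which is the behavior a single blanket flag would lose.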

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py` around lines 525
- 549, The tune flow currently passes a single tune_mode (build_from_tune) into
prepare_indexes which causes both build and search validations to be skipped;
change the logic to compute two separate flags (e.g., build_tune_mode =
bool(tune_build_params) and search_tune_mode = bool(tune_search_params)), use
actual_build/actual_search as before, and update the prepare_indexes signature
and all its call sites (including the other occurrence noted) to accept and use
these two flags so only the side coming from Optuna skips validation; inside
prepare_indexes, ensure it only bypasses build validation when build_tune_mode
is true and only bypasses search validation when search_tune_mode is true.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 408ce488-f8a5-4da4-94f9-16f1329a36b6

📥 Commits

Reviewing files that changed from the base of the PR and between a55eba8 and b0d83b7.

📒 Files selected for processing (1)

python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py

coderabbitai

🧹 Nitpick comments (1)

python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py (1)

530-537: 💤 Low value

Rename unused loop variable group_conf to _group_conf.

The variable is unpacked but never referenced in the loop body. Prefix with underscore to signal intentional disuse.

         for (
             algo,
             group,
-            group_conf,
+            _group_conf,
             build_combos,
             search_combos,
             group_meta,
         ) in expanded_groups:
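As a small standalone illustration of the convention, the loop below unpacks a six-element tuple and marks the one unused element with a leading underscore. The tuple contents are made up; only the element names mirror the diff above.

```python
# Made-up data shaped like the six-element tuples in the diff above.
expanded_groups = [
    (
        "algo_a",
        "base",
        {"unused": True},
        [{"nlist": 1024}, {"nlist": 2048}],
        [{"nprobe": 10}],
        {"requires_gpu": False},
    ),
]

summary = []
for (
    algo,
    group,
    _group_conf,  # unpacked to keep positions aligned, intentionally unused
    build_combos,
    search_combos,
    group_meta,
) in expanded_groups:
    summary.append(
        (algo, group, len(build_combos), len(search_combos), group_meta["requires_gpu"])
    )
```

The underscore prefix changes nothing at runtime; it only tells readers and linters that the value is ignored on purpose.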

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py` around lines 530
- 537, The unpacking in the for-loop over expanded_groups uses the variable name
group_conf which is never referenced; rename it to _group_conf in the tuple
unpacking (the for (algo, group, _group_conf, build_combos, search_combos,
group_meta) in expanded_groups:) to indicate intentional unused variable and
silence linters while leaving surrounding logic (algo, group, build_combos,
search_combos, group_meta) unchanged.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 698351c6-0833-4b84-8d51-a48a66a17aa2

📥 Commits

Reviewing files that changed from the base of the PR and between b0d83b7 and d92822d.

📒 Files selected for processing (1)

python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py

cjnolet · 2026-05-14T22:55:23Z

/merge

Add BenchmarkOrchestrator Python API example back alongside CLI in docs.

7453082

jnke2016 requested a review from a team as a code owner May 14, 2026 17:29

github-project-automation Bot added this to Unstructured Data Processing May 14, 2026

Move parameter expansion into base ConfigLoader.load() so no config loader calls expand_param_grid directly

1708b56

jnke2016 requested a review from a team as a code owner May 14, 2026 19:18

coderabbitai Bot reviewed May 14, 2026

Comment thread python/cuvs_bench/cuvs_bench/orchestrator/config_loaders.py Outdated

jnke2016 added 2 commits May 14, 2026 19:22

update docstrings

a55eba8

Only skip constraint validation when params actually come from Optuna, not on fallback grids.

b0d83b7

coderabbitai Bot reviewed May 14, 2026

cjnolet approved these changes May 14, 2026

cjnolet added labels doc (Improvements or additions to documentation) and non-breaking (Introduces a non-breaking change) May 14, 2026

fix style

d92822d

coderabbitai Bot reviewed May 14, 2026

aamijar assigned jnke2016 May 14, 2026

rapids-bot Bot merged commit 49ce810 into rapidsai:main May 14, 2026
71 of 72 checks passed

github-project-automation Bot moved this to Done in Unstructured Data Processing May 14, 2026

jrbourbeau mentioned this pull request May 15, 2026

[REVIEW] Add OpenSearch backend to cuvs-bench #2012

Open

rapids-bot[bot] merged 5 commits into rapidsai:main from jnke2016:update_docs
