[None][test] local wheel installation support and add gb300 cases demo by fredricz-20070104 · Pull Request #11742 · NVIDIA/TensorRT-LLM

fredricz-20070104 · 2026-02-26T08:38:10Z

Summary by CodeRabbit

New Features
- Added --install-mode option to support source (default) or wheel installation methods during setup
- Added performance sanity test configurations for Deepseek models on GB300 GPUs with FP4 precision
Chores
- Updated performance testing framework with new benchmark configurations

Adds local wheel installation support for perf sanity runs.
Adds demo GB300 test cases.

Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>

Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>

coderabbitai · 2026-02-26T08:43:31Z

Walkthrough

These changes introduce an install mode selection mechanism for performance test scripts that supports both source and wheel-based installation, alongside new performance benchmarking configuration files for Deepseek models on GB300 hardware.

Changes

Cohort / File(s)	Summary
Install Mode Feature `jenkins/scripts/perf/local/README.md`, `jenkins/scripts/perf/local/slurm_install.sh`, `jenkins/scripts/perf/local/submit.py`	Adds an `--install-mode` CLI option with choices `source` (default) or `wheel`. `submit.py` implements the option and propagates the choice via the `INSTALL_MODE` environment variable. `slurm_install.sh` adds conditional logic that attempts a wheel installation from the `llmSrcNode/build` directory and falls back to source mode, and it logs the chosen mode explicitly.
Performance Test Configurations `tests/integration/defs/perf/disagg/test_configs/disagg/perf-sanity/gb300_deepseek-r1-fp4_1k1k_con1024_ctx1_dep4_gen1_dep32_eplb0_mtp3_ccb-UCX.yaml`, `tests/scripts/perf-sanity/gb300_deepseek_r1_fp4_v2_2_nodes_grace_blackwell.yaml`	Two new YAML configuration files defining performance benchmarking setups: one comprehensive single-config integration test with FP4 precision and UCX backends, and one multi-config sanity test with three server configurations (varying in workload size, tensor/expert parallelism, and client concurrency parameters) for 2-node Grace Blackwell deployments.
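
The walkthrough describes `submit.py` accepting `--install-mode` and forwarding the choice through the `INSTALL_MODE` environment variable. A minimal Python sketch of that pattern follows. It is not the PR's code: the helper names `build_parser` and `build_job_env` are hypothetical, and only the flag name, its two choices, its default, and the variable name come from the summary above.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the --install-mode flag (hypothetical stand-in for submit.py)."""
    parser = argparse.ArgumentParser(description="Sketch of an install-mode switch")
    parser.add_argument(
        "--install-mode",
        choices=["source", "wheel"],
        default="source",
        help="Install from source (default) or from a prebuilt wheel",
    )
    return parser


def build_job_env(argv, base_env=None):
    """Return a copy of the environment with INSTALL_MODE set from the CLI choice."""
    args = build_parser().parse_args(argv)
    env = dict(os.environ if base_env is None else base_env)
    env["INSTALL_MODE"] = args.install_mode
    return env


if __name__ == "__main__":
    # prints: wheel
    print(build_job_env(["--install-mode", "wheel"], base_env={})["INSTALL_MODE"])
```

Restricting the flag with `choices` means an unsupported mode is rejected at submission time rather than inside the install script.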

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description is incomplete and does not follow the repository's template, missing critical sections like detailed description, test coverage, and PR checklist.	Provide a detailed description explaining the issue, solution, and test coverage. Complete the PR checklist items to ensure all guidelines and best practices are followed.
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main changes: adding local wheel installation support and GB300 test cases for performance sanity testing.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

tests/integration/defs/perf/disagg/test_configs/disagg/perf-sanity/gb300_deepseek-r1-fp4_1k1k_con1024_ctx1_dep4_gen1_dep32_eplb0_mtp3_ccb-UCX.yaml (1)
7-10: Use a single source of truth for script_file.

Line 7 and Line 10 duplicate the same launch script value. Keeping this in two places increases drift risk if one is edited later.
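
One way to guard against the drift described above is to compare both locations when the config is loaded. The sketch below assumes the YAML has already been parsed into a dict with a top-level `script_file` key and a nested `slurm.script_file` key, as the comment states. `resolve_script_file` is a hypothetical helper, and the script names in the usage are made up for illustration.

```python
from typing import Any, Dict, Optional


def resolve_script_file(config: Dict[str, Any]) -> Optional[str]:
    """Return the launch script, rejecting configs where the two copies disagree."""
    top_level = config.get("script_file")
    nested = (config.get("slurm") or {}).get("script_file")
    if top_level is not None and nested is not None and top_level != nested:
        raise ValueError(
            f"script_file mismatch: top-level={top_level!r}, slurm={nested!r}"
        )
    # Prefer the nested value, falling back to the top-level one.
    return nested if nested is not None else top_level


if __name__ == "__main__":
    # Hypothetical config: both copies agree, so the shared value is returned.
    cfg = {"script_file": "launch.sh", "slurm": {"script_file": "launch.sh"}}
    print(resolve_script_file(cfg))
```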
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@tests/integration/defs/perf/disagg/test_configs/disagg/perf-sanity/gb300_deepseek-r1-fp4_1k1k_con1024_ctx1_dep4_gen1_dep32_eplb0_mtp3_ccb-UCX.yaml`
around lines 7 - 10, The file defines script_file in two places (top-level key
"script_file" and "slurm.script_file"); remove the duplicate and keep a single
source of truth (preferably under "slurm.script_file") so updates won’t
drift—delete the top-level "script_file" entry (or conversely remove
"slurm.script_file" if your codebase expects the top-level) and ensure any code
that reads this config (look for code that references "script_file" or
"slurm.script_file") uses the retained location.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@jenkins/scripts/perf/local/slurm_install.sh`:
- Around line 53-57: The WHEEL_FILE selection is nondeterministic because it
uses find ... | head -1; change the assignment of WHEEL_FILE (the variable set
where find "$llmSrcNode/build" -name "tensorrt_llm-*.whl" ... | head -1) to
deterministically pick the intended wheel (for example sort by version or
modification time) before piping to head: e.g., use find to list all matching
wheels and then select the newest (ls -t or sort -V) or explicitly filter for
the desired version, and keep the subsequent retry_command pip install
"$WHEEL_FILE" call unchanged so the install uses the deterministic selection.

In
`@tests/scripts/perf-sanity/gb300_deepseek_r1_fp4_v2_2_nodes_grace_blackwell.yaml`:
- Around line 96-99: The kv_cache_config block is missing the explicit
enable_block_reuse key; add kv_cache_config.enable_block_reuse with the same
boolean value used in the other two configs (so it matches their behavior) to
avoid relying on defaults and ensure reproducible perf comparisons; update the
kv_cache_config section containing dtype and free_gpu_memory_fraction to include
enable_block_reuse.
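
For the `slurm_install.sh` comment on wheel selection, the following Python sketch shows one deterministic policy: choose the most recently modified wheel and break ties by path. The shell script itself uses `find ... | head -1`. This is only an illustration of the suggested idea in Python, and `pick_wheel` is a hypothetical name.

```python
from pathlib import Path
from typing import Optional


def pick_wheel(build_dir: str, pattern: str = "tensorrt_llm-*.whl") -> Optional[Path]:
    """Return the newest matching wheel under build_dir, or None if there is none."""
    wheels = [p for p in Path(build_dir).rglob(pattern) if p.is_file()]
    if not wheels:
        return None
    # Newest modification time wins; the path string is a stable tie-breaker.
    return max(wheels, key=lambda p: (p.stat().st_mtime_ns, str(p)))
```

A caller could treat a `None` result as the signal to fall back to source mode, matching the fallback behaviour described in the walkthrough.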

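The `enable_block_reuse` comment can also be checked mechanically. The sketch below assumes each server configuration is a dict with an optional `kv_cache_config` mapping and an optional `name`. That layout, the helper `configs_missing_key`, and the sample values are assumptions for illustration, not the contents of the YAML file.

```python
from typing import Any, Dict, List


def configs_missing_key(
    server_configs: List[Dict[str, Any]], key: str = "enable_block_reuse"
) -> List[str]:
    """List the configs whose kv_cache_config does not set `key` explicitly."""
    missing = []
    for index, cfg in enumerate(server_configs):
        kv_cache = cfg.get("kv_cache_config") or {}
        if key not in kv_cache:
            missing.append(cfg.get("name", f"config[{index}]"))
    return missing


if __name__ == "__main__":
    # Hypothetical sample data: the second entry relies on the default.
    sample = [
        {"name": "cfg_a", "kv_cache_config": {"dtype": "fp8", "enable_block_reuse": False}},
        {"name": "cfg_b", "kv_cache_config": {"dtype": "fp8"}},
    ]
    print(configs_missing_key(sample))  # prints: ['cfg_b']
```
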
ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a93c56e and 2005523.

📒 Files selected for processing (5)

jenkins/scripts/perf/local/README.md
jenkins/scripts/perf/local/slurm_install.sh
jenkins/scripts/perf/local/submit.py
tests/integration/defs/perf/disagg/test_configs/disagg/perf-sanity/gb300_deepseek-r1-fp4_1k1k_con1024_ctx1_dep4_gen1_dep32_eplb0_mtp3_ccb-UCX.yaml
tests/scripts/perf-sanity/gb300_deepseek_r1_fp4_v2_2_nodes_grace_blackwell.yaml

fredricz-20070104 · 2026-02-26T08:44:07Z

/bot skip --comment "skip test as just modify local mode installation, no affect sanity check"

tensorrt-cicd · 2026-02-26T08:49:58Z

PR_Github #36896 [ skip ] triggered by Bot. Commit: 2005523 Link to invocation

Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>

tensorrt-cicd · 2026-02-26T09:33:14Z

PR_Github #36896 [ skip ] completed with state SUCCESS. Commit: 2005523
Release Check Pipeline #3257 failed

Link to invocation

fredricz-20070104 · 2026-02-26T09:39:39Z

/bot skip --comment "skip test as just modify local mode installation, no affect sanity check"

tensorrt-cicd · 2026-02-26T09:45:35Z

PR_Github #36906 [ skip ] triggered by Bot. Commit: 74ef45b Link to invocation

tensorrt-cicd · 2026-02-26T10:29:03Z

PR_Github #36906 [ skip ] completed with state SUCCESS. Commit: 74ef45b
Skipping testing for commit 74ef45b

Link to invocation

fredricz-20070104 · 2026-02-27T01:39:30Z

/bot reuse-pipeline

tensorrt-cicd · 2026-02-27T01:47:51Z

PR_Github #36994 [ reuse-pipeline ] triggered by Bot. Commit: ab95c0c Link to invocation

tensorrt-cicd · 2026-02-27T02:20:30Z

PR_Github #36994 [ reuse-pipeline ] completed with state SUCCESS. Commit: ab95c0c
Can't reuse PR_Github #0 with status: UNKNOWN

Link to invocation

fredricz-20070104 · 2026-02-27T02:31:16Z

/bot run --skip-test

tensorrt-cicd · 2026-02-27T02:37:11Z

PR_Github #37003 [ run ] triggered by Bot. Commit: ab95c0c Link to invocation

tensorrt-cicd · 2026-02-27T06:20:11Z

PR_Github #37003 [ run ] completed with state SUCCESS. Commit: ab95c0c
/LLM/main/L0_MergeRequest_PR pipeline #28651 (Partly Tested) completed with status: 'SUCCESS'

Link to invocation

fredricz-20070104 · 2026-02-27T09:54:36Z

/bot reuse-pipeline

tensorrt-cicd · 2026-02-27T10:00:25Z

PR_Github #37060 [ reuse-pipeline ] triggered by Bot. Commit: ab95c0c Link to invocation

tensorrt-cicd · 2026-02-27T10:57:41Z

PR_Github #37060 [ reuse-pipeline ] completed with state SUCCESS. Commit: ab95c0c
Reusing PR_Github #37003 (Partly Tested) for commit ab95c0c

Link to invocation

NVIDIA#11742) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com> Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com> Co-authored-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>

xinhe-nv and others added 3 commits February 26, 2026 08:34

add wheel installation support

b421e55

Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>

add gb300 case for deepseek r1

41afe43

Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>

add gb300 multi agg cases

2005523

Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>

fredricz-20070104 requested review from a team as code owners February 26, 2026 08:38

fredricz-20070104 requested review from chenfeiz0326, niukuo and ruodil February 26, 2026 08:38

coderabbitai bot reviewed Feb 26, 2026

jenkins/scripts/perf/local/slurm_install.sh (resolved)

tests/scripts/perf-sanity/gb300_deepseek_r1_fp4_v2_2_nodes_grace_blackwell.yaml (resolved)

ruodil approved these changes Feb 26, 2026

fredricz-20070104 enabled auto-merge (squash) February 26, 2026 08:51

fx pre-commit error

74ef45b

Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>

Merge branch 'main' into feature/local_wheel_install

2f63a05

Merge branch 'main' into feature/local_wheel_install

ab95c0c

fredricz-20070104 merged commit cb1a872 into NVIDIA:main Feb 27, 2026
5 checks passed


4 participants
