[None][test] local wheel installation support and add gb300 cases demo #11742

Merged
fredricz-20070104 merged 6 commits into NVIDIA:main from fredricz-20070104:feature/local_wheel_install
Feb 27, 2026

@fredricz-20070104
Collaborator

@fredricz-20070104 fredricz-20070104 commented Feb 26, 2026

Summary by CodeRabbit

  • New Features

    • Added --install-mode option to support source (default) or wheel installation methods during setup
    • Added performance sanity test configurations for Deepseek models on GB300 GPUs with FP4 precision
  • Chores

    • Updated performance testing framework with new benchmark configurations
  1. Perf sanity local wheel installation support
  2. GB300 test cases demo.

xinhe-nv and others added 3 commits February 26, 2026 08:34
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
@coderabbitai
Contributor

coderabbitai bot commented Feb 26, 2026

📝 Walkthrough

These changes introduce an install mode selection mechanism for performance test scripts that supports both source and wheel-based installation, alongside new performance benchmarking configuration files for Deepseek models on GB300 hardware.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Install Mode Feature: `jenkins/scripts/perf/local/README.md`, `jenkins/scripts/perf/local/slurm_install.sh`, `jenkins/scripts/perf/local/submit.py` | Adds --install-mode CLI option with choices source (default) or wheel, implemented in submit.py that propagates via INSTALL_MODE environment variable. slurm_install.sh script includes conditional logic to attempt wheel installation from llmSrcNode/build directory with fallback to source mode, plus explicit logging of chosen mode. |
| Performance Test Configurations: `tests/integration/defs/perf/disagg/test_configs/disagg/perf-sanity/gb300_deepseek-r1-fp4_1k1k_con1024_ctx1_dep4_gen1_dep32_eplb0_mtp3_ccb-UCX.yaml`, `tests/scripts/perf-sanity/gb300_deepseek_r1_fp4_v2_2_nodes_grace_blackwell.yaml` | Two new YAML configuration files defining performance benchmarking setups: one comprehensive single-config integration test with FP4 precision and UCX backends, and one multi-config sanity test with three server configurations (varying in workload size, tensor/expert parallelism, and client concurrency parameters) for 2-node Grace Blackwell deployments. |
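The install-mode flow summarized above can be sketched roughly as follows. This is a minimal illustration, not the code in submit.py: only the `--install-mode` flag, its two choices, the `source` default, and the `INSTALL_MODE` variable name come from the summary. The function names `build_parser` and `build_env` are hypothetical.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; only the flag name, choices, and default
    # are taken from the PR summary.
    parser = argparse.ArgumentParser(description="Sketch of install-mode selection")
    parser.add_argument(
        "--install-mode",
        choices=["source", "wheel"],
        default="source",
        help="Install from source (default) or from a prebuilt wheel",
    )
    return parser


def build_env(install_mode: str, base_env=None) -> dict:
    # Copy the environment and add INSTALL_MODE so that a child script
    # (for example an install script) can read the chosen mode.
    env = dict(os.environ if base_env is None else base_env)
    env["INSTALL_MODE"] = install_mode
    return env
```

A caller would parse the arguments with `build_parser().parse_args()` and pass `build_env(args.install_mode)` as the `env` of whatever process launches the install script. Restricting the flag with `choices` means an unsupported mode is rejected at parse time rather than inside the install script.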

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The PR description is incomplete and does not follow the repository's template, missing critical sections like detailed description, test coverage, and PR checklist. | Provide a detailed description explaining the issue, solution, and test coverage. Complete the PR checklist items to ensure all guidelines and best practices are followed. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main changes: adding local wheel installation support and GB300 test cases for performance sanity testing. |


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
tests/integration/defs/perf/disagg/test_configs/disagg/perf-sanity/gb300_deepseek-r1-fp4_1k1k_con1024_ctx1_dep4_gen1_dep32_eplb0_mtp3_ccb-UCX.yaml (1)

7-10: Use a single source of truth for script_file.

Line 7 and Line 10 duplicate the same launch script value. Keeping this in two places increases drift risk if one is edited later.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@tests/integration/defs/perf/disagg/test_configs/disagg/perf-sanity/gb300_deepseek-r1-fp4_1k1k_con1024_ctx1_dep4_gen1_dep32_eplb0_mtp3_ccb-UCX.yaml`
around lines 7 - 10, The file defines script_file in two places (top-level key
"script_file" and "slurm.script_file"); remove the duplicate and keep a single
source of truth (preferably under "slurm.script_file") so updates won’t
drift—delete the top-level "script_file" entry (or conversely remove
"slurm.script_file" if your codebase expects the top-level) and ensure any code
that reads this config (look for code that references "script_file" or
"slurm.script_file") uses the retained location.
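The duplicated `script_file` nitpick above could be checked mechanically. The sketch below assumes the YAML has already been loaded into a plain dict; the function name `script_file_status` and the `launch.sh` value are hypothetical, and only the two key locations (top-level `script_file` and `slurm.script_file`) come from the review comment.

```python
def script_file_status(config: dict) -> str:
    """Classify how script_file is defined in an already-loaded config dict."""
    top = config.get("script_file")
    nested = (config.get("slurm") or {}).get("script_file")
    if top is None or nested is None:
        return "single"  # at most one location defines the value
    # Both locations are set: identical values are redundant,
    # differing values mean the two copies have already drifted.
    return "duplicate" if top == nested else "mismatch"


# Placeholder file name, used only for illustration.
example = {"script_file": "launch.sh", "slurm": {"script_file": "launch.sh"}}
print(script_file_status(example))
```

Which of the two locations to keep depends on what the config reader expects, as the review comment itself notes.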
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@jenkins/scripts/perf/local/slurm_install.sh`:
- Around line 53-57: The WHEEL_FILE selection is nondeterministic because it
uses find ... | head -1; change the assignment of WHEEL_FILE (the variable set
where find "$llmSrcNode/build" -name "tensorrt_llm-*.whl" ... | head -1) to
deterministically pick the intended wheel (for example sort by version or
modification time) before piping to head: e.g., use find to list all matching
wheels and then select the newest (ls -t or sort -V) or explicitly filter for
the desired version, and keep the subsequent retry_command pip install
"$WHEEL_FILE" call unchanged so the install uses the deterministic selection.
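The deterministic selection suggested above is sketched here in Python rather than in the shell of slurm_install.sh, purely to illustrate the logic. It assumes "newest by modification time" is the intended rule, which the review offers only as one option; the function name `pick_wheel` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def pick_wheel(build_dir: str, pattern: str = "tensorrt_llm-*.whl") -> Optional[Path]:
    """Return the most recently modified matching wheel, or None if none exists."""
    # Search recursively, as find with -name does.
    candidates = list(Path(build_dir).rglob(pattern))
    if not candidates:
        return None  # the caller can fall back to source mode
    # Newest by mtime; the path string breaks ties so the result is stable.
    return max(candidates, key=lambda p: (p.stat().st_mtime, str(p)))
```

Unlike taking the first line of unsorted `find` output, the result here does not depend on directory traversal order. Sorting by version instead of modification time would be an equally valid rule if stale builds can have newer timestamps.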

In
`@tests/scripts/perf-sanity/gb300_deepseek_r1_fp4_v2_2_nodes_grace_blackwell.yaml`:
- Around line 96-99: The kv_cache_config block is missing the explicit
enable_block_reuse key; add kv_cache_config.enable_block_reuse with the same
boolean value used in the other two configs (so it matches their behavior) to
avoid relying on defaults and ensure reproducible perf comparisons; update the
kv_cache_config section containing dtype and free_gpu_memory_fraction to include
enable_block_reuse.
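A small consistency check along the lines of this comment might look like the sketch below. It assumes each server configuration has been loaded into a dict with a `kv_cache_config` mapping; the `name` field, the config names, and all values are placeholders, and only the key names `dtype`, `free_gpu_memory_fraction`, and `enable_block_reuse` come from the review.

```python
def missing_kv_keys(server_configs, required=("enable_block_reuse",)):
    """Map each config name to the required kv_cache_config keys it omits."""
    report = {}
    for cfg in server_configs:
        kv = cfg.get("kv_cache_config", {})
        missing = [key for key in required if key not in kv]
        if missing:
            report[cfg.get("name", "<unnamed>")] = missing
    return report


# Placeholder values; the real settings live in the YAML file.
configs = [
    {"name": "cfg_a", "kv_cache_config": {"dtype": "fp8", "free_gpu_memory_fraction": 0.8, "enable_block_reuse": False}},
    {"name": "cfg_b", "kv_cache_config": {"dtype": "fp8", "free_gpu_memory_fraction": 0.8, "enable_block_reuse": False}},
    {"name": "cfg_c", "kv_cache_config": {"dtype": "fp8", "free_gpu_memory_fraction": 0.8}},
]
print(missing_kv_keys(configs))
```

Here only `cfg_c` is reported, mirroring the situation the comment describes, where one of three configurations relies on the default.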


ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a93c56e and 2005523.

📒 Files selected for processing (5)
  • jenkins/scripts/perf/local/README.md
  • jenkins/scripts/perf/local/slurm_install.sh
  • jenkins/scripts/perf/local/submit.py
  • tests/integration/defs/perf/disagg/test_configs/disagg/perf-sanity/gb300_deepseek-r1-fp4_1k1k_con1024_ctx1_dep4_gen1_dep32_eplb0_mtp3_ccb-UCX.yaml
  • tests/scripts/perf-sanity/gb300_deepseek_r1_fp4_v2_2_nodes_grace_blackwell.yaml

@fredricz-20070104
Collaborator Author

/bot skip --comment "skip test as just modify local mode installation, no affect sanity check"

@tensorrt-cicd
Collaborator

PR_Github #36896 [ skip ] triggered by Bot. Commit: 2005523 Link to invocation

@fredricz-20070104 fredricz-20070104 enabled auto-merge (squash) February 26, 2026 08:51
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
@tensorrt-cicd
Collaborator

PR_Github #36896 [ skip ] completed with state SUCCESS. Commit: 2005523
Release Check Pipeline #3257 failed

Link to invocation

@fredricz-20070104
Collaborator Author

/bot skip --comment "skip test as just modify local mode installation, no affect sanity check"

@tensorrt-cicd
Collaborator

PR_Github #36906 [ skip ] triggered by Bot. Commit: 74ef45b Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #36906 [ skip ] completed with state SUCCESS. Commit: 74ef45b
Skipping testing for commit 74ef45b

Link to invocation

@fredricz-20070104
Collaborator Author

/bot reuse-pipeline

@tensorrt-cicd
Collaborator

PR_Github #36994 [ reuse-pipeline ] triggered by Bot. Commit: ab95c0c Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #36994 [ reuse-pipeline ] completed with state SUCCESS. Commit: ab95c0c
Can't reuse PR_Github #0 with status: UNKNOWN

Link to invocation

@fredricz-20070104
Collaborator Author

/bot run --skip-test

@tensorrt-cicd
Collaborator

PR_Github #37003 [ run ] triggered by Bot. Commit: ab95c0c Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #37003 [ run ] completed with state SUCCESS. Commit: ab95c0c
/LLM/main/L0_MergeRequest_PR pipeline #28651 (Partly Tested) completed with status: 'SUCCESS'

Link to invocation

@fredricz-20070104
Collaborator Author

/bot reuse-pipeline

@tensorrt-cicd
Collaborator

PR_Github #37060 [ reuse-pipeline ] triggered by Bot. Commit: ab95c0c Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #37060 [ reuse-pipeline ] completed with state SUCCESS. Commit: ab95c0c
Reusing PR_Github #37003 (Partly Tested) for commit ab95c0c

Link to invocation

@fredricz-20070104 fredricz-20070104 merged commit cb1a872 into NVIDIA:main Feb 27, 2026
5 checks passed
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Mar 9, 2026
NVIDIA#11742)

Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
Co-authored-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
4 participants