[None][test] Decrease P1 models number and merge sanity test list into core by yufeiwu-nv · Pull Request #14952 · NVIDIA/TensorRT-LLM

yufeiwu-nv · 2026-06-04T09:27:19Z

Added new performance tests for various models including qwen3 and llama_v3.1.
Removed the llm_perf_sanity.yml file as it is no longer needed.

Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>

Summary by CodeRabbit

Tests
- Updated LLM performance testing configurations across multiple GPU compute conditions
- Refined test coverage for model variants with different optimization formats
- Optimized test matrix to prioritize critical performance benchmarks

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

- PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
- PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
- Test cases are provided for new code paths (see test instructions)
- If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.
- Any new dependencies have been scanned for license and vulnerabilities
- CODEOWNERS updated if ownership changes
- Documentation updated as needed
- Update tava architecture diagram if there is a significant design change in PR.
- The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

coderabbitai · 2026-06-04T09:29:24Z

📝 Walkthrough

This PR updates the LLM performance test configurations by revising llm_perf_core.yml test entries across multiple GPU condition sections and removing llm_perf_sanity.yml entirely. Changes consolidate llama model coverage, add new model variants (gpt_oss_20b_fp4, nemotron_nano_12b_v2), and reduce timeout-prone deepseek test configurations.

Changes

Performance Test Configuration Updates

| Layer / File(s) | Summary |
|---|---|
| Common GPU baseline tests update `tests/integration/test_lists/qa/llm_perf_core.yml` | Replaces the initial "All GPUs common tests" entries with revised FP8/BF16 configurations, removing prior qwen/llama variants and adding qwen3_0.6b, qwen3_4b_eagle3 streaming, and llama_v3.1_nemotron_nano_8b_fp8 entries. |
| Llama model consolidation and streamlining `tests/integration/test_lists/qa/llm_perf_core.yml` | Removes llama_v3.1_8b test block from mid-range conditions, reduces llama_v3.3_70b_instruct FP8 streaming variants by removing many gpus:4 configurations and streamlining to smaller BF16/FP8 sets, and updates RTX-6000D condition entries with reduced BF16/FP4 variant set. |
| New model variants and additions `tests/integration/test_lists/qa/llm_perf_core.yml` | Introduces new test entries for gpt_oss_20b_fp4 (float4 format) and adds BF16 nemotron_nano_12b_v2 variant with updated I/O length configurations. |
| Advanced GPU condition optimizations `tests/integration/test_lists/qa/llm_perf_core.yml` | Updates GB200/B200/B300 condition group with streamlined FP8 streaming and FP4 entries, reduces kimi_k2_nvfp4 FP4 test coverage to fewer maxbs/config points, and removes multiple timeout-prone deepseek_v3.2_fp4 and deepseek_v3.2_fp8 input-output variants. |
| YAML formatting adjustment `tests/integration/test_lists/qa/llm_perf_core.yml` | Introduces spacing adjustment within the test list. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

NVIDIA/TensorRT-LLM#14749: Both PRs adjust llm_perf_core.yml to add and configure Nemotron model test coverage (e.g., nemotron_nano_12b_v2) alongside model-config handling updates for those variants.

Suggested reviewers

niukuo
StanleySun639

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ❓ Inconclusive | The description provides basic context but is incomplete. It lacks detail on the specific changes, rationale, and test coverage information required by the template. | Expand the description to include: what specific P1 models were reduced, why the sanity test list was merged into core, and detailed test coverage information for validation. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly describes the main changes: decreasing P1 models and merging sanity tests into core list. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

…y.yml file - Added new performance tests for various models including qwen3 and llama_v3.1. - Removed the llm_perf_sanity.yml file as it is no longer needed. Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>

….5 8-GPU perf (NVIDIA#14613) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>

yufeiwu-nv · 2026-06-04T09:34:51Z

/bot skip --comment "only test list modify"

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/integration/test_lists/qa/llm_perf_core.yml`:
- Line 28: The test matrix entry sets maxnt:2048 which is smaller than the
declared input_output_len (8000,1000) and will cause the workload to exceed the
token budget; update the perf/test_perf.py::test_perf[...] case so maxnt
(max_num_tokens) is increased to cover the larger sequence (e.g., >=8000 or
remove the explicit maxnt to rely on the default 8192) ensuring the token budget
matches input_output_len, and keep the test name/identifier intact.
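The mismatch described in this comment can be flagged mechanically before a test list is merged. The following is a minimal sketch, not part of the PR: it assumes test IDs carry hyphen-separated `key:value` fields inside the brackets of `perf/test_perf.py::test_perf[...]`, and that `input_output_len` holds a single `input,output` pair. The function names and the example ID (including `example_model`) are hypothetical. Whether a smaller `maxnt` is actually a problem may depend on runtime settings that are not visible in the test ID, so the check only surfaces entries for a human to review.

```python
import re


def parse_perf_params(test_id: str) -> dict:
    """Collect key:value fields from a perf test ID (ID layout is an assumption)."""
    match = re.search(r"\[(.*)\]", test_id)
    body = match.group(1) if match else test_id
    params = {}
    for field in body.split("-"):
        key, sep, value = field.partition(":")
        if sep:  # fields without a colon (model name, backend, dtype) are skipped
            params[key] = value
    return params


def token_budget_issue(test_id: str):
    """Return a message when maxnt is smaller than the input length, else None."""
    params = parse_perf_params(test_id)
    if "maxnt" not in params or "input_output_len" not in params:
        return None  # nothing to compare
    max_num_tokens = int(params["maxnt"])
    input_len = int(params["input_output_len"].split(",")[0])
    if max_num_tokens < input_len:
        return f"maxnt:{max_num_tokens} is below input length {input_len}"
    return None


# Hypothetical test ID shaped like the entry the comment describes.
example_id = (
    "perf/test_perf.py::test_perf[example_model-bench-pytorch-bfloat16-"
    "maxbs:512-maxnt:2048-input_output_len:8000,1000-reqs:500-con:250]"
)
print(token_budget_issue(example_id))
```

Running the same check over every entry in a test list gives a quick pre-review report; entries without an explicit `maxnt` are left alone because the default is not encoded in the ID.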

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 0f13f275-b6d1-4113-aeb0-8579272e1f17

📥 Commits

Reviewing files that changed from the base of the PR and between 8c39de8 and c206c0c.

📒 Files selected for processing (2)

tests/integration/test_lists/qa/llm_perf_core.yml
tests/integration/test_lists/qa/llm_perf_sanity.yml

💤 Files with no reviewable changes (1)

tests/integration/test_lists/qa/llm_perf_sanity.yml

tensorrt-cicd · 2026-06-04T09:41:48Z

PR_Github #52037 [ skip ] triggered by Bot. Commit: a6d5db1 Link to invocation

tensorrt-cicd · 2026-06-04T09:51:12Z

PR_Github #52037 [ skip ] completed with state SUCCESS. Commit: a6d5db1
Skipping testing for commit a6d5db1

Link to invocation

yufeiwu-nv · 2026-06-04T10:05:19Z

/bot skip --comment "only test list modify"

tensorrt-cicd · 2026-06-04T10:11:50Z

PR_Github #52044 [ skip ] triggered by Bot. Commit: e95537d Link to invocation

tensorrt-cicd · 2026-06-04T10:21:39Z

PR_Github #52044 [ skip ] completed with state SUCCESS. Commit: e95537d
Skipping testing for commit e95537d

Link to invocation

yufeiwu-nv requested review from a team as code owners June 4, 2026 09:27

yufeiwu-nv requested review from HuiGao-NV, JunyiXu-nv and dongxuy04 June 4, 2026 09:27

github-actions Bot assigned yufeiwu-nv Jun 4, 2026

yufeiwu-nv removed request for a team, HuiGao-NV, JunyiXu-nv and dongxuy04 June 4, 2026 09:28

yufeiwu-nv force-pushed the bug branch from c206c0c to bdd3b0e Compare June 4, 2026 09:31

yufeiwu-nv requested review from a team as code owners June 4, 2026 09:31

yufeiwu-nv requested review from mzweilz, niukuo and suyoggupta June 4, 2026 09:31

yufeiwu-nv and others added 3 commits June 4, 2026 09:34

[None][fix] Update dataset identifier for cnn_dailymail to use namespaced repo in quantization scripts

e6c1cd9

Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>

[None][infra] Check in most recent lock file from nightly pipeline

269ee92

Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>

[None][infra] Waive 11 failed cases for main in post-merge 2757 (NVIDIA#14925)

2586ccb

Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>

yufeiwu-nv removed request for a team, mzweilz, niukuo and suyoggupta June 4, 2026 09:34

xinhe-nv and others added 10 commits June 4, 2026 09:34

[None][test] update rtx6k test list (NVIDIA#14929)

1b8cd42

Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>

[https://nvbugs/5979673][fix] Unwaive test_agent_multi_backends.py::test_run_with_different_env (NVIDIA#14939)

d7e6029

Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>

[None][fix] Add nemotron-v3 as the proper nemotron-h reasoning parser (NVIDIA#14900)

1b67180

Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>

[TRTLLM-8236][infra] fix platform tag for public wheel (NVIDIA#14616)

fcfced4

Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>

[None][test] update bug ids in waives (NVIDIA#14946)

2d5c1eb

Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>

[https://nvbugs/6244474][fix] AutoDeploy: skip explicit shape-prop after MLIR elementwise fusion (NVIDIA#14795)

feb8441

Signed-off-by: tensorrt-cicd <90828364+tensorrt-cicd@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>

[None][infra] fix cbts json decode (NVIDIA#14928)

bdd3b0e

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>

Merge branch 'main' into bug

a6d5db1

coderabbitai Bot reviewed Jun 4, 2026

Comment thread tests/integration/test_lists/qa/llm_perf_core.yml

ruodil approved these changes Jun 4, 2026

Merge branch 'main' into bug

e95537d

yufeiwu-nv enabled auto-merge (squash) June 4, 2026 10:05

yufeiwu-nv merged commit 941c778 into NVIDIA:main Jun 4, 2026
8 checks passed

This was referenced Jun 5, 2026

[None][test] Fix the ci disagg perf local submit test scope too large issue to avoid HF Model not found #14989

Merged

[None][test] remove outdated model in perf test #14992

Merged

yufeiwu-nv merged 14 commits into NVIDIA:main from yufeiwu-nv:bug

9 participants
