[#10607][fix] moved AD perf regression tests to AD jobs pre and post merge by MrGeva · Pull Request #12461 · NVIDIA/TensorRT-LLM

MrGeva · 2026-03-23T16:44:44Z

The AutoDeploy (AD) perf regression test was in a PyTorch backend (PT BE) job, so it was not triggered for AD PRs. This PR moves the test to AD's pre-merge and post-merge jobs.

Summary by CodeRabbit

Tests
- Expanded performance testing coverage for B200 GPUs with new test entries for single-GPU and multi-GPU configurations.
- Added performance sanity tests for AutoDeploy backend across different system configurations and GPU counts.
- Reorganized test execution stages for improved testing pipeline efficiency.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

- PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
- PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
- Test cases are provided for new code paths (see test instructions)
- Any new dependencies have been scanned for license and vulnerabilities
- CODEOWNERS updated if ownership changes
- Documentation updated as needed
- Update tava architecture diagram if there is a significant design change in PR.
- The reviewers assigned automatically/manually are appropriate for the PR.
- Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>

coderabbitai · 2026-03-23T16:47:41Z

Walkthrough

These changes expand perf sanity test coverage for Blackwell GPUs by introducing new test entries across multiple test stages and configurations. A new single-GPU server configuration is added to support autodeploy backend testing, while test lists are updated to include corresponding post-merge test cases and GPU-specific conditions.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Test List Configuration: `tests/integration/test_lists/test-db/l0_b200.yml`, `tests/integration/test_lists/test-db/l0_dgx_b200.yml`, `tests/integration/test_lists/test-db/l0_gb200_multi_gpus_perf_sanity.yml` | Added AutoDeploy Perf Sanity test entries to pre-merge and post-merge stages with B200/GB200 GPU filters; removed equivalent entry from multi-GPU test list to consolidate coverage. |
| Performance Configuration: `tests/scripts/perf-sanity/aggregated/super_ad_blackwell.yaml` | Extended GPU support from GB200 to include B200; introduced new single-GPU server configuration (`super_ad_ws1_1k1k`) for `super_nvfp4` model with autodeploy backend and specified client workload parameters. |
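
To illustrate what "test entries with stage and GPU filters" means in practice, here is a minimal Python sketch of filter-based test selection. It is not the repository's actual test-db loader: the block layout, the field names (`gpu_patterns`, `gpu_count`, `stage`), and the test names are all hypothetical stand-ins, and the real YAML schema may differ.

```python
from fnmatch import fnmatch

# Hypothetical, simplified stand-in for test-list blocks.
# Field names and test names are illustrative assumptions, not the real schema.
TEST_BLOCKS = [
    {
        "gpu_patterns": ["*b200*"],   # wildcard filter on the GPU name
        "gpu_count": (1, 1),          # inclusive (min, max) GPU count
        "stage": "pre_merge",
        "tests": ["ad_perf_sanity_1gpu"],
    },
    {
        "gpu_patterns": ["*b200*"],
        "gpu_count": (4, 4),
        "stage": "post_merge",
        "tests": ["ad_perf_sanity_4gpu"],
    },
]


def select_tests(blocks, gpu, gpu_count, stage):
    """Return the tests from every block whose filters match the given job."""
    selected = []
    for block in blocks:
        lo, hi = block["gpu_count"]
        # Lowercase the GPU name so matching does not depend on case.
        gpu_ok = any(fnmatch(gpu.lower(), pat) for pat in block["gpu_patterns"])
        if gpu_ok and lo <= gpu_count <= hi and block["stage"] == stage:
            selected.extend(block["tests"])
    return selected
```

Under these assumptions, a single-GPU B200 pre-merge job picks up only the first block, and a job on a GPU that matches no pattern picks up nothing. This mirrors the problem the PR addresses: a test only runs if it sits in a list that the relevant job actually selects.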

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ❓ Inconclusive | PR description is minimal and lacks required sections. Only mentions what was moved but omits structured details about the change rationale, affected test cases, and verification steps. | Fill in the Description section explaining why tests were moved and the Impact section detailing affected test cases. Also complete the Test Coverage section with specific test identifiers being moved. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Title check | ✅ Passed | The PR title clearly indicates the main objective: moving AD (AutoDeploy) perf regression tests to AD jobs with pre and post merge stages, which directly matches the changeset modifications across test configuration files. |

MrGeva · 2026-03-24T08:00:04Z

/bot run --extra-stage "DGX_B200-4_GPUs-AutoDeploy-1, DGX_H100-4_GPUs-AutoDeploy-1" --disable-fail-fast

tensorrt-cicd · 2026-03-24T08:05:50Z

PR_Github #40090 [ run ] triggered by Bot. Commit: 5dca209 Link to invocation

tensorrt-cicd · 2026-03-24T14:19:28Z

PR_Github #40090 [ run ] completed with state SUCCESS. Commit: 5dca209
/LLM/main/L0_MergeRequest_PR pipeline #31242 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

- Please check the failed tests and fix your PR
- If you cannot view the failures, ask the CI triggerer to share details
- Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

MrGeva · 2026-03-24T15:30:16Z

/bot run --extra-stage "DGX_B200-4_GPUs-AutoDeploy-1, DGX_H100-4_GPUs-AutoDeploy-1" --disable-fail-fast

MrGeva · 2026-03-25T09:32:07Z

/bot run --extra-stage "DGX_B200-4_GPUs-AutoDeploy-1, DGX_H100-4_GPUs-AutoDeploy-1" --disable-fail-fast

tensorrt-cicd · 2026-03-25T09:38:45Z

PR_Github #40312 [ run ] triggered by Bot. Commit: 60e3d6a Link to invocation

tensorrt-cicd · 2026-03-25T20:42:18Z

PR_Github #40312 [ run ] completed with state SUCCESS. Commit: 60e3d6a
/LLM/main/L0_MergeRequest_PR pipeline #31423 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

- Please check the failed tests and fix your PR
- If you cannot view the failures, ask the CI triggerer to share details
- Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

MrGeva · 2026-03-30T08:34:00Z

/bot run --extra-stage "DGX_B200-4_GPUs-AutoDeploy-1, DGX_H100-4_GPUs-AutoDeploy-1" --disable-fail-fast

tensorrt-cicd · 2026-03-30T08:39:48Z

PR_Github #40700 [ run ] triggered by Bot. Commit: d528737 Link to invocation

tensorrt-cicd · 2026-03-30T16:45:22Z

PR_Github #40700 [ run ] completed with state SUCCESS. Commit: d528737
/LLM/main/L0_MergeRequest_PR pipeline #31729 completed with status: 'SUCCESS'

CI Report

Link to invocation

moved AD perf tests to AD jobs

5dca209

Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>

github-actions bot assigned MrGeva Mar 23, 2026

MrGeva changed the title ~~[None][fix] moved AD perf regression tests to AD jobs~~ [None][fix] moved AD perf regression tests to AD jobs pre and post merge Mar 23, 2026

MrGeva changed the title ~~[None][fix] moved AD perf regression tests to AD jobs pre and post merge~~ [#10607][fix] moved AD perf regression tests to AD jobs pre and post merge Mar 23, 2026

Merge branch 'main' into eg/perfci

60e3d6a

Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>

MrGeva requested a review from galagam March 25, 2026 09:32

MrGeva enabled auto-merge (squash) March 25, 2026 09:35

galagam reviewed Mar 25, 2026

tests/integration/test_lists/test-db/l0_b200.yml (outdated, resolved)

galagam approved these changes Mar 25, 2026

fixed cr

d528737

Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>

MrGeva merged commit d1a4af8 into NVIDIA:main Mar 30, 2026
5 checks passed
