[None][chore] Unwaive AutoDeploy accuracy tests by bmarimuthu-nv · Pull Request #14971 · NVIDIA/TensorRT-LLM

bmarimuthu-nv · 2026-06-04T18:58:11Z

Summary by CodeRabbit

Tests
- Refactored and simplified test configurations for accuracy validation; reduced parametrization and standardized YAML-driven settings.
- Removed several waived unit tests from the waiver lists.
Chores
- Enabled and updated model registry entries and adjusted example deployment configs for specific models.
- Updated an accuracy reference to specify bfloat16 for a listed model.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

bmarimuthu-nv · 2026-06-04T19:00:54Z

/bot run --disable-fail-fast

bmarimuthu-nv · 2026-06-04T19:01:13Z

@CodeRabbit summary

coderabbitai · 2026-06-04T19:01:18Z

✅ Action performed

Summary regeneration triggered.

coderabbitai · 2026-06-04T19:02:28Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 7a8d6d14-027a-4dc5-8e05-01b08f7745c4

📥 Commits

Reviewing files that changed from the base of the PR and between 910826b and 69f28bb.

📒 Files selected for processing (5)

examples/auto_deploy/model_registry/configs/gemma4_e2b.yaml
examples/auto_deploy/model_registry/models.yaml
tests/integration/defs/accuracy/references/gsm8k.yaml
tests/integration/defs/accuracy/test_llm_api_autodeploy.py
tests/integration/test_lists/waives.txt

💤 Files with no reviewable changes (1)

tests/integration/test_lists/waives.txt

📝 Walkthrough

Loads AutoDeploy config for GLM4Flash from a fixed YAML; simplifies test signatures by removing chunked-prefill/attn_backend parametrization; enables/uncomments model registry entries and disables piecewise CUDA-graph warmup for Gemma4 E2B; adds GSM8K dtype and trims waived-test entries.
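The YAML-driven pattern described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the PR: the helper names `load_yaml_config` and `with_overrides`, the placeholder keys, and the numeric values are hypothetical. Only `max_seq_len` and `max_num_tokens` come from the walkthrough. Parsing assumes PyYAML is installed.

```python
import copy
from typing import Any, Dict


def load_yaml_config(path: str) -> Dict[str, Any]:
    """Parse a YAML file into a dict (hypothetical helper; needs PyYAML)."""
    import yaml  # imported lazily so the merge helper works without PyYAML

    with open(path, encoding="utf-8") as f:
        # safe_load returns None for an empty file, so fall back to {}.
        return yaml.safe_load(f) or {}


def with_overrides(
    base: Dict[str, Any], max_seq_len: int, max_num_tokens: int
) -> Dict[str, Any]:
    """Return a copy of `base` with only the two test-specific keys replaced."""
    cfg = copy.deepcopy(base)  # leave the loaded config untouched
    cfg["max_seq_len"] = max_seq_len
    cfg["max_num_tokens"] = max_num_tokens
    return cfg


# Usage with an in-memory stand-in for the parsed YAML (placeholder keys).
base_cfg = {"placeholder_setting": {"nested": 1}, "max_seq_len": 1}
test_cfg = with_overrides(base_cfg, max_seq_len=4096, max_num_tokens=8192)
```

Keeping every other setting in the YAML file means the test and the example deployment config cannot drift apart, which appears to be the motivation for dropping the inline dict.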

Changes

AutoDeploy tests, model registry, references, and waivers

Layer / File(s)	Summary
TestGLM4Flash: YAML-driven AutoDeploy config `tests/integration/defs/accuracy/test_llm_api_autodeploy.py`	Replaces inline AutoDeploy dict with CONFIG_YAML; `get_default_kwargs` loads YAML and overrides only `max_seq_len` and `max_num_tokens`. `test_auto_dtype` and `test_nvfp4` no longer parametrize over chunked-prefill/attn_backend.
Model registry and CUDA-graph config updates `examples/auto_deploy/model_registry/models.yaml`, `examples/auto_deploy/model_registry/configs/gemma4_e2b.yaml`	Uncomments/enables `google/gemma-4-E2B-it`, `google/gemma-4-26B-A4B-it`, and `MiniMaxAI/MiniMax-M2`; sets `piecewise_enabled: false` for Gemma4 E2B CUDA-graph warmup.
GSM8K reference dtype update `tests/integration/defs/accuracy/references/gsm8k.yaml`	Adds `dtype: bfloat16` to the `Qwen/Qwen3.5-35B-A3B` accuracy reference entry.
Waived tests list cleanup `tests/integration/test_lists/waives.txt`	Removes multiple waived skip entries for `accuracy/test_llm_api_autodeploy.py` and deletes three previously waived unit-test entries under nvbugs `6189450`.
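For reviewers who want to check which waiver entries a cleanup like this would drop, a small filter can help. The sketch below is illustrative only: the function name `drop_waives` is hypothetical, and it assumes each waive-list line starts with the test ID as its first whitespace-separated token, which may not hold for every entry in the real `waives.txt`.

```python
from typing import Iterable, List


def drop_waives(lines: Iterable[str], test_prefix: str) -> List[str]:
    """Return the waive-list lines whose test ID does not start with `test_prefix`."""
    kept = []
    for line in lines:
        stripped = line.strip()
        # Assumption: the test ID is the first whitespace-separated token.
        test_id = stripped.split(maxsplit=1)[0] if stripped else ""
        if test_id.startswith(test_prefix):
            continue  # this entry would be unwaived
        kept.append(line)
    return kept


# Placeholder entries; the test names and reasons are made up for illustration.
sample = [
    "accuracy/test_llm_api_autodeploy.py::TestExample::test_case SKIP (placeholder)",
    "unittest/other/test_something.py::test_other SKIP (placeholder)",
]
remaining = drop_waives(sample, "accuracy/test_llm_api_autodeploy.py")
```

Comparing `remaining` against the original list shows exactly which tests start running again in CI.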

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

NVIDIA/TensorRT-LLM#14789: Also modifies tests/integration/test_lists/waives.txt and waived test cases.
NVIDIA/TensorRT-LLM#14857: Related waiver edits for accuracy/test_llm_api_autodeploy.py and neighboring tests.
NVIDIA/TensorRT-LLM#14791: Overlapping changes to waiver entries for autodeploy/accuracy tests.

Suggested reviewers

symphonylyh
xinhe-nv
jieli-matrix

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description is entirely empty, containing only template placeholders with no actual content explaining the changes, rationale, or test coverage.	Fill in the 'Description' and 'Test Coverage' sections with details about why tests were unwaived, which tests were affected, and how the changes ensure correctness.
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title '[None][chore] Unwaive AutoDeploy accuracy tests' clearly and concisely summarizes the main change: re-enabling previously waived AutoDeploy accuracy tests.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

bmarimuthu-nv · 2026-06-04T19:03:25Z

/bot kill

bmarimuthu-nv · 2026-06-04T19:05:37Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-06-04T19:06:07Z

PR_Github #52140 [ run ] triggered by Bot. Commit: 41e5de9 Link to invocation

tensorrt-cicd · 2026-06-04T19:08:11Z

PR_Github #52142 [ run ] triggered by Bot. Commit: 41e5de9 Link to invocation

tensorrt-cicd · 2026-06-04T19:09:47Z

PR_Github #52143 [ kill ] triggered by Bot. Commit: 41e5de9 Link to invocation

tensorrt-cicd · 2026-06-04T19:11:39Z

PR_Github #52144 [ run ] triggered by Bot. Commit: 41e5de9 Link to invocation

tensorrt-cicd · 2026-06-04T19:11:44Z

PR_Github #52143 [ kill ] completed with state ABORTED. Commit: 41e5de9

Link to invocation

tensorrt-cicd · 2026-06-04T19:12:55Z

PR_Github #52140 [ run ] completed with state ABORTED. Commit: 41e5de9

Link to invocation

tensorrt-cicd · 2026-06-04T19:15:29Z

PR_Github #52142 [ run ] completed with state ABORTED. Commit: 41e5de9

Link to invocation

bmarimuthu-nv · 2026-06-04T19:58:06Z

/bot run --stage-list "A30-AutoDeploy-1, H100_PCIe-AutoDeploy-1, DGX_B200-AutoDeploy-1, DGX_H100-4_GPUs-AutoDeploy-1, DGX_B200-4_GPUs-AutoDeploy-1"

tensorrt-cicd · 2026-06-04T20:03:43Z

PR_Github #52154 [ run ] triggered by Bot. Commit: 3fac256 Link to invocation

tensorrt-cicd · 2026-06-04T20:07:20Z

PR_Github #52144 [ run ] completed with state ABORTED. Commit: 41e5de9

Link to invocation

bmarimuthu-nv · 2026-06-04T20:27:30Z

/bot run --stage-list "A30-AutoDeploy-1, H100_PCIe-AutoDeploy-1, DGX_B200-AutoDeploy-1, DGX_H100-4_GPUs-AutoDeploy-1, DGX_B200-4_GPUs-AutoDeploy-1"

tensorrt-cicd · 2026-06-04T20:34:04Z

PR_Github #52159 [ run ] triggered by Bot. Commit: 69f28bb Link to invocation

tensorrt-cicd · 2026-06-04T20:37:35Z

PR_Github #52154 [ run ] completed with state ABORTED. Commit: 3fac256

Link to invocation

tensorrt-cicd · 2026-06-04T23:26:02Z

PR_Github #52159 [ run ] completed with state SUCCESS. Commit: 69f28bb
/LLM/main/L0_MergeRequest_PR pipeline #41480 (Partly Tested) completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

bmarimuthu-nv · 2026-06-04T23:34:19Z

/bot run --stage-list "A30-AutoDeploy-1, H100_PCIe-AutoDeploy-1, DGX_B200-AutoDeploy-1, DGX_H100-4_GPUs-AutoDeploy-1, DGX_B200-4_GPUs-AutoDeploy-1"

tensorrt-cicd · 2026-06-04T23:39:56Z

PR_Github #52182 [ run ] triggered by Bot. Commit: 69f28bb Link to invocation

tensorrt-cicd · 2026-06-05T00:21:17Z

PR_Github #52182 [ run ] completed with state FAILURE. Commit: 69f28bb
/LLM/main/L0_MergeRequest_PR pipeline #41503 (Partly Tested) completed with status: 'FAILURE'

bmarimuthu-nv · 2026-06-05T00:53:50Z

/bot run

tensorrt-cicd · 2026-06-05T01:00:24Z

PR_Github #52193 [ run ] triggered by Bot. Commit: 69f28bb Link to invocation

tensorrt-cicd · 2026-06-05T05:40:36Z

PR_Github #52193 [ run ] completed with state SUCCESS. Commit: 69f28bb
/LLM/main/L0_MergeRequest_PR pipeline #41513 completed with status: 'FAILURE'

bmarimuthu-nv · 2026-06-05T07:41:07Z

/bot run

tensorrt-cicd · 2026-06-05T07:46:45Z

PR_Github #52299 [ run ] triggered by Bot. Commit: 05a7be8 Link to invocation

tensorrt-cicd · 2026-06-05T10:54:54Z

PR_Github #52299 [ run ] completed with state SUCCESS. Commit: 05a7be8
/LLM/main/L0_MergeRequest_PR pipeline #41607 completed with status: 'FAILURE'

bmarimuthu-nv · 2026-06-05T17:06:39Z

/bot run

tensorrt-cicd · 2026-06-05T17:13:22Z

PR_Github #52393 [ run ] triggered by Bot. Commit: 0f40dbd Link to invocation

tensorrt-cicd · 2026-06-05T20:19:15Z

PR_Github #52393 [ run ] completed with state SUCCESS. Commit: 0f40dbd
/LLM/main/L0_MergeRequest_PR pipeline #41687 completed with status: 'FAILURE'

bmarimuthu-nv · 2026-06-05T22:04:07Z

/bot run

tensorrt-cicd · 2026-06-05T22:09:46Z

PR_Github #52442 [ run ] triggered by Bot. Commit: 6730533 Link to invocation

tensorrt-cicd · 2026-06-06T00:54:29Z

PR_Github #52442 [ run ] completed with state FAILURE. Commit: 6730533
/LLM/main/L0_MergeRequest_PR pipeline #41735 completed with status: 'FAILURE'

suyoggupta · 2026-06-06T03:28:41Z

/bot run

tensorrt-cicd · 2026-06-06T03:35:54Z

PR_Github #52469 [ run ] triggered by Bot. Commit: 4440796 Link to invocation

tensorrt-cicd · 2026-06-06T08:34:23Z

PR_Github #52469 [ run ] completed with state FAILURE. Commit: 4440796
/LLM/main/L0_MergeRequest_PR pipeline #41761 completed with status: 'FAILURE'

github-actions Bot assigned bmarimuthu-nv Jun 4, 2026

bmarimuthu-nv force-pushed the bala/ad-accuracy-tests-unwaive branch from 5fb1e65 to 41e5de9 Compare June 4, 2026 19:03

bmarimuthu-nv mentioned this pull request Jun 4, 2026

[None][fix] Fix AutoDeploy accuracy tests #13925

Merged

1 task

bmarimuthu-nv marked this pull request as ready for review June 4, 2026 20:28

bmarimuthu-nv requested review from a team as code owners June 4, 2026 20:28

bmarimuthu-nv requested a review from greg-kwasniewski1 June 4, 2026 20:28

nvchenghaoz approved these changes Jun 4, 2026

Comment thread examples/auto_deploy/model_registry/configs/gemma4_e2b.yaml

bmarimuthu-nv mentioned this pull request Jun 4, 2026

Piecewise Cudagraph support for Gemma4 VSWA cache pools #14975

Open

suyoggupta approved these changes Jun 5, 2026

bmarimuthu-nv force-pushed the bala/ad-accuracy-tests-unwaive branch from 69f28bb to 05a7be8 Compare June 5, 2026 07:37

bmarimuthu-nv force-pushed the bala/ad-accuracy-tests-unwaive branch from 05a7be8 to 947260f Compare June 5, 2026 17:01

bmarimuthu-nv added 4 commits June 5, 2026 10:04

[None][chore] unwaive AutoDeploy accuracy tests

f67dcd6

Signed-off-by: Balamurugan Marimuthu <246387390+bmarimuthu-nv@users.noreply.github.com>

disable piecewise for gemma2b

1e857ad

Signed-off-by: Balamurugan Marimuthu <246387390+bmarimuthu-nv@users.noreply.github.com>

uncomment gemma, minimax in models.yaml

6fd413a

Signed-off-by: Balamurugan Marimuthu <246387390+bmarimuthu-nv@users.noreply.github.com>

fixes

0f40dbd

Signed-off-by: Balamurugan Marimuthu <246387390+bmarimuthu-nv@users.noreply.github.com>

bmarimuthu-nv force-pushed the bala/ad-accuracy-tests-unwaive branch from 947260f to 0f40dbd Compare June 5, 2026 17:06

[None][test] waive llama fp8 AutoDeploy perf sanity

6730533

Signed-off-by: Balamurugan Marimuthu <246387390+bmarimuthu-nv@users.noreply.github.com>

[None][test] add bug id for perf sanity waive

Signed-off-by: Balamurugan Marimuthu <246387390+bmarimuthu-nv@users.noreply.github.com>

bmarimuthu-nv force-pushed the bala/ad-accuracy-tests-unwaive branch from 58d6ff7 to 6730533 Compare June 5, 2026 22:03

Merge branch 'main' into bala/ad-accuracy-tests-unwaive

4440796

Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
