[None][test] Remove RTX-6000 OOM test cases by yufeiwu-nv · Pull Request #12800 · NVIDIA/TensorRT-LLM

yufeiwu-nv · 2026-04-07T08:16:17Z

Summary by CodeRabbit

Release Notes

Tests
- Extended timeout for non-error performance test hangs from 10 to 30 minutes to improve test reliability
- Rebalanced performance test configurations across different GPU hardware setups to enhance coverage for B200, B300, GB300, and RTX6000 server environments

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>

yufeiwu-nv · 2026-04-07T08:18:23Z

/bot skip --comment "only test list modify"

github-actions · 2026-04-07T08:23:10Z

👎 Promotion blocked, new vulnerability found

Vulnerability report

Component	Vulnerability	Description	Severity
encode/uvicorn	CVE-2020-7694	This affects all versions of package uvicorn. The request logger provided by the package is vulnerable to ANSI escape sequence injection. Whenever any HTTP request is received, the default behaviour of uvicorn is to log its details to either the console or a log file. When attackers request crafted URLs with percent-encoded escape sequences, the logging component will log the URL after it's been processed with urllib.parse.unquote, therefore converting any percent-encoded characters into their single-character equivalent, which can have special meaning in terminal emulators. By requesting URLs with crafted paths, attackers can: * Pollute uvicorn's access logs, therefore jeopardising the integrity of such files. * Use ANSI sequence codes to attempt to interact with the terminal emulator that's displaying the logs (either in real time or from a file).	HIGH
encode/uvicorn	CVE-2020-7695	Uvicorn before 0.11.7 is vulnerable to HTTP response splitting. CRLF sequences are not escaped in the value of HTTP headers. Attackers can exploit this to add arbitrary headers to HTTP responses, or even return an arbitrary response body, whenever crafted input is used to construct HTTP headers.	MEDIUM
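
For readers unfamiliar with the first finding, the sketch below shows how `urllib.parse.unquote` turns a percent-encoded path into a raw ANSI escape sequence, and one way a logger could neutralize it. `sanitize_for_log` is a hypothetical helper written for this illustration; it is not uvicorn's actual fix and is unrelated to the changes in this PR.

```python
import re
from urllib.parse import unquote

# Matches ASCII control characters (including ESC, CR, LF) and DEL.
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def sanitize_for_log(raw_path: str) -> str:
    """Decode a percent-encoded path, then make control characters visible.

    Hypothetical helper for illustration only.
    """
    decoded = unquote(raw_path)
    # Replace each control character with a printable \xNN escape so a
    # terminal emulator never receives the raw byte.
    return _CONTROL_CHARS.sub(lambda m: f"\\x{ord(m.group()):02x}", decoded)


crafted = "/index%1b%5b31mRED"
# After decoding, the path contains a real ESC byte followed by "[31m".
print(repr(unquote(crafted)))
# The sanitized form keeps the information but is inert in a terminal.
print(sanitize_for_log(crafted))
```

Escaping rather than dropping the control characters keeps the log entry faithful to what was requested, which may matter when investigating an attack.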

coderabbitai · 2026-04-07T08:23:23Z

📝 Walkthrough

The stall timeout for command execution in performance utilities was increased from 600 to 1800 seconds, with documentation comments added. Performance test cases for deepseek_r1_0528_fp4 were reorganized across different GPU system configurations in the test list.

Changes

Cohort / File(s)	Summary
Utility Timeout Configuration `tests/integration/defs/perf/utils.py`	Increased stall timeout from 600 to 1800 seconds with added comments documenting 30-minute kill threshold for non-error hangs and 3-minute threshold for error states.
Performance Test Case Reorganization `tests/integration/test_lists/qa/llm_perf_core.yml`	Removed and repositioned multiple deepseek_r1_0528_fp4 test cases between GPU configuration sections; moved tests from RTX6000-Server to B200/GB200/B300/GB300 chunked prefill subsection; added new fp4 max throughput test for B200/B300 condition.
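
As a rough illustration of the two thresholds described for `utils.py`, the sketch below decides whether a stalled command should be killed. The constant names are borrowed from the review comment later in this thread; the value 180 is inferred from the "3-minute threshold" wording, and `should_kill` is a hypothetical function, not the code in the repository.

```python
# Thresholds in seconds. 1800 comes from the summary above; 180 is an
# assumption derived from the "3-minute threshold for error states".
_STALL_TIMEOUT = 1800  # no output for 30 minutes on a non-error hang
_ERROR_STALL_TIMEOUT = 180  # no output for 3 minutes after an error was seen


def should_kill(seconds_since_last_output: float, error_seen: bool) -> bool:
    """Return True when a silent process has exceeded its stall limit.

    Hypothetical helper: the real utility may track output differently.
    """
    limit = _ERROR_STALL_TIMEOUT if error_seen else _STALL_TIMEOUT
    return seconds_since_last_output > limit


# A run that has been silent for 15 minutes survives under the new limit,
# but would have been killed under the previous 600-second limit.
print(should_kill(900, error_seen=False))
# The same silence after an error is past the shorter limit.
print(should_kill(900, error_seen=True))
```

Keeping a shorter limit for error states lets a failed run release the GPU quickly, while the longer limit gives slow but healthy runs room to finish.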

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description is entirely empty, containing only the repository template with placeholders and no actual description of changes, rationale, or test coverage.	Provide a concrete description explaining the OOM test case changes, the rationale for moving/removing RTX-6000 tests, and what test coverage validates these changes.
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title is partially related to the changeset, referring to RTX-6000 OOM test removal, but the changes also include increased stall timeout and test reorganization across GPU sections.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/integration/defs/perf/utils.py`:
- Around line 135-138: The comments above the constants _STALL_TIMEOUT and
_ERROR_STALL_TIMEOUT violate PEP 8 E265 by missing a space after the '#'
characters; update the two comment lines that precede those constants so each
'#' is followed by a single space (e.g., "# if hang time > 30 mins, it will be
killed") to fix the lint error while leaving the constant names and values
unchanged.
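
The fix requested above can be sketched as a small helper that inserts the missing space. This only approximates what E265 flags and is not pycodestyle's implementation; `fix_block_comment` is a hypothetical name, and the unfixed comment text in the example is assumed from the corrected form quoted in the review.

```python
import re

# A block comment whose '#' is immediately followed by a character other
# than a space, another '#', or '!'. Inline comments (E262) are not handled.
_MISSING_SPACE = re.compile(r"^(\s*)#(?![#! ])(\S.*)$")


def fix_block_comment(line: str) -> str:
    """Insert a single space after '#' when a block comment lacks one."""
    return _MISSING_SPACE.sub(r"\1# \2", line)


# Assumed original form of the comment, before the fix.
print(fix_block_comment("#if hang time > 30 mins, it will be killed"))
# A compliant comment and a line of code are left unchanged.
print(fix_block_comment("# already fine"))
print(fix_block_comment("_STALL_TIMEOUT = 1800"))
```

In practice a formatter or linter with autofix would normally handle this; the helper only makes the rule concrete.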

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f8627007-c275-47aa-a5fc-f02cc3b1a2da

📥 Commits

Reviewing files that changed from the base of the PR and between 4e69c14 and e242fd5.

📒 Files selected for processing (2)

tests/integration/defs/perf/utils.py
tests/integration/test_lists/qa/llm_perf_core.yml

yufeiwu-nv · 2026-04-07T09:24:15Z

/bot skip --comment "only test list modify"

tensorrt-cicd · 2026-04-07T09:30:43Z

PR_Github #42119 [ skip ] triggered by Bot. Commit: 545982e Link to invocation

tensorrt-cicd · 2026-04-07T09:42:02Z

PR_Github #42119 [ skip ] completed with state SUCCESS. Commit: 545982e
Skipping testing for commit 545982e

Link to invocation

yufeiwu-nv added 2 commits April 7, 2026 08:10

increase stall handle time

10f66a3

Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>

improve to 30 mins

b52b93f

Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>

yufeiwu-nv requested review from a team as code owners April 7, 2026 08:16

github-actions bot assigned yufeiwu-nv Apr 7, 2026

adjust OOM test csases

e242fd5

Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>

coderabbitai bot reviewed Apr 7, 2026

View reviewed changes

Comment thread tests/integration/defs/perf/utils.py

Merge branch 'main' into fix_RTX

545982e

ruodil approved these changes Apr 8, 2026

View reviewed changes

ruodil merged commit 6175e2c into NVIDIA:main Apr 8, 2026
5 checks passed

suyoggupta pushed a commit to nv-auto-deploy/TensorRT-LLM that referenced this pull request Apr 8, 2026

[None][test] Remove RTX-6000 OOM test cases (NVIDIA#12800)

0d91c77

Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>

ruodil merged 4 commits into NVIDIA:main from yufeiwu-nv:fix_RTX

3 participants
