[https://nvbugs/5911788][fix] Isolate single_gpu ray orchestrator tests to avoid CI timeouts by shuyixiong · Pull Request #12616 · NVIDIA/TensorRT-LLM

shuyixiong · 2026-03-31T07:48:51Z

Summary by CodeRabbit

Tests
- Updated test selection configuration for GPU testing infrastructure, refining which test cases are executed in the test suite.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

- PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
- PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
- Test cases are provided for new code paths (see test instructions)
- Any new dependencies have been scanned for license and vulnerabilities
- CODEOWNERS updated if ownership changes
- Documentation updated as needed
- Update tava architecture diagram if there is a significant design change in PR.
- The reviewers assigned automatically/manually are appropriate for the PR.
- Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

shuyixiong · 2026-03-31T08:43:42Z

/bot run

tensorrt-cicd · 2026-03-31T09:52:08Z

PR_Github #40916 [ run ] triggered by Bot. Commit: e362e6e Link to invocation

tensorrt-cicd · 2026-03-31T10:12:50Z

PR_Github #40919 [ run ] triggered by Bot. Commit: e362e6e Link to invocation

tensorrt-cicd · 2026-03-31T10:12:52Z

PR_Github #40916 [ run ] completed with state ABORTED. Commit: e362e6e

Link to invocation

shuyixiong · 2026-03-31T10:43:53Z

/bot run

tensorrt-cicd · 2026-03-31T11:15:54Z

PR_Github #40930 [ run ] triggered by Bot. Commit: e362e6e Link to invocation

tensorrt-cicd · 2026-03-31T11:15:57Z

PR_Github #40919 [ run ] completed with state ABORTED. Commit: e362e6e

Link to invocation

shuyixiong · 2026-03-31T12:07:01Z

/bot run

tensorrt-cicd · 2026-03-31T12:12:41Z

PR_Github #40939 [ run ] triggered by Bot. Commit: 8c61bfd Link to invocation

tensorrt-cicd · 2026-03-31T12:12:43Z

PR_Github #40930 [ run ] completed with state ABORTED. Commit: e362e6e

Link to invocation

tensorrt-cicd · 2026-03-31T19:06:42Z

PR_Github #40939 [ run ] completed with state SUCCESS. Commit: 8c61bfd
/LLM/main/L0_MergeRequest_PR pipeline #31931 completed with status: 'SUCCESS'

CI Report

Link to invocation

coderabbitai · 2026-04-01T02:47:36Z

📝 Walkthrough

Walkthrough

This PR updates test selection and waive configurations for single-GPU ray orchestrator tests in the integration test suite. The change replaces a directory-level test selection with explicit individual test cases and removes waive (SKIP) entries for several parameterized test variants.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Test Selection and Waive Updates: `tests/integration/test_lists/test-db/l0_h100.yml`, `tests/integration/test_lists/waives.txt` | Updated test selection to run specific test cases instead of the entire `unittest/_torch/ray_orchestrator/single_gpu` directory. Removed SKIP waive entries for `test_llm_partial_update_weights` and `test_llm_update_weights_with_quant_config` cases across multiple model variants (TinyLlama, Qwen2.5, Qwen3, including FP8 variants). |
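
For readers unfamiliar with this kind of change, here is a minimal Python sketch of what the walkthrough describes: swapping one directory-level entry for explicit test cases, and dropping the matching waive lines. It is an illustration only, not the diff from this PR. The file name `test_example.py`, the `other_suite` entries, the `[model-a]` parameter id, the `SKIP (placeholder reason)` line format, and both helper functions are assumptions made up for the example; only the directory path and the two test names come from the summary above. The last commit in this PR is titled "Use pytest mark instead expanding all tests", so the merged change may well look different.

```python
# Hypothetical helpers; neither exists in TensorRT-LLM.

def expand_directory_entry(entries, directory, explicit_cases):
    """Replace one directory-level entry with explicit test cases, keeping order."""
    expanded = []
    for entry in entries:
        if entry.rstrip("/") == directory.rstrip("/"):
            expanded.extend(explicit_cases)
        else:
            expanded.append(entry)
    return expanded


def drop_waives(waive_lines, unwaived_names):
    """Keep only waive lines that mention none of the given test names."""
    return [
        line for line in waive_lines
        if not any(name in line for name in unwaived_names)
    ]


# Directory path and test names are taken from the walkthrough above.
DIRECTORY = "unittest/_torch/ray_orchestrator/single_gpu"
UNWAIVED = [
    "test_llm_partial_update_weights",
    "test_llm_update_weights_with_quant_config",
]

# Everything below is placeholder data (assumed file names and line format).
entries = ["unittest/_torch/other_suite", DIRECTORY]
explicit = [f"{DIRECTORY}/test_example.py::{name}" for name in UNWAIVED]
waives = [
    f"{DIRECTORY}/test_example.py::test_llm_partial_update_weights[model-a] "
    "SKIP (placeholder reason)",
    "unittest/_torch/other_suite/test_other.py::test_unrelated "
    "SKIP (placeholder reason)",
]

new_entries = expand_directory_entry(entries, DIRECTORY, explicit)
remaining_waives = drop_waives(waives, UNWAIVED)

for entry in new_entries:
    print(entry)
for line in remaining_waives:
    print(line)
```

The substring match in `drop_waives` is deliberately loose so that every parameterized variant of a test is unwaived at once; a real script would probably match on the full test id.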

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The PR description lacks substantive content in required sections: Description and Test Coverage are empty, with only the template checklist provided. | Add details to the Description section explaining why test isolation is needed to avoid CI timeouts, and document specific test cases in Test Coverage that validate the changes. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The pull request title clearly indicates the main change: isolating single_gpu ray orchestrator tests to resolve CI timeout issues. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


tests/integration/test_lists/test-db/l0_h100.yml

shuyixiong · 2026-04-01T04:57:45Z

/bot run

tensorrt-cicd · 2026-04-01T05:04:06Z

PR_Github #41108 [ run ] triggered by Bot. Commit: da139f4 Link to invocation

Superjomn

LGTM

tensorrt-cicd · 2026-04-01T07:03:27Z

PR_Github #41108 [ run ] completed with state FAILURE. Commit: da139f4
/LLM/main/L0_MergeRequest_PR pipeline #32082 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

- Please check the failed tests and fix your PR
- If you cannot view the failures, ask the CI triggerer to share details
- Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation


shuyixiong · 2026-04-01T07:58:26Z

/bot run

tensorrt-cicd · 2026-04-01T08:04:05Z

PR_Github #41156 [ run ] triggered by Bot. Commit: 1178d95 Link to invocation

tensorrt-cicd · 2026-04-01T11:16:25Z

PR_Github #41156 [ run ] completed with state SUCCESS. Commit: 1178d95
/LLM/main/L0_MergeRequest_PR pipeline #32125 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

- Please check the failed tests and fix your PR
- If you cannot view the failures, ask the CI triggerer to share details
- Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

shuyixiong · 2026-04-01T11:23:31Z

/bot run

tensorrt-cicd · 2026-04-01T11:29:05Z

PR_Github #41190 [ run ] triggered by Bot. Commit: 1178d95 Link to invocation

tensorrt-cicd · 2026-04-01T14:14:59Z

PR_Github #41190 [ run ] completed with state SUCCESS. Commit: 1178d95
/LLM/main/L0_MergeRequest_PR pipeline #32152 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

- Please check the failed tests and fix your PR
- If you cannot view the failures, ask the CI triggerer to share details
- Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

shuyixiong · 2026-04-01T14:30:13Z

/bot run

tensorrt-cicd · 2026-04-01T14:35:46Z

PR_Github #41205 [ run ] triggered by Bot. Commit: 1178d95 Link to invocation

tensorrt-cicd · 2026-04-01T21:07:33Z

PR_Github #41205 [ run ] completed with state SUCCESS. Commit: 1178d95
/LLM/main/L0_MergeRequest_PR pipeline #32166 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

- Please check the failed tests and fix your PR
- If you cannot view the failures, ask the CI triggerer to share details
- Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

shuyixiong · 2026-04-02T05:28:46Z

/bot run

tensorrt-cicd · 2026-04-02T05:34:15Z

PR_Github #41347 [ run ] triggered by Bot. Commit: 1178d95 Link to invocation

tensorrt-cicd · 2026-04-02T09:49:34Z

PR_Github #41347 [ run ] completed with state SUCCESS. Commit: 1178d95
/LLM/main/L0_MergeRequest_PR pipeline #32293 completed with status: 'SUCCESS'

CI Report

Link to invocation


github-actions bot assigned shuyixiong Mar 31, 2026

shuyixiong force-pushed the user/shuyix/isolate_ray_tests branch from 8a90efd to e362e6e Compare March 31, 2026 08:02

shuyixiong marked this pull request as ready for review April 1, 2026 02:44

shuyixiong requested a review from Superjomn April 1, 2026 02:49

Superjomn reviewed Apr 1, 2026

tests/integration/test_lists/test-db/l0_h100.yml (outdated)

shuyixiong requested a review from Superjomn April 1, 2026 04:57

Superjomn approved these changes Apr 1, 2026


shuyixiong added 3 commits April 1, 2026 15:58

Isolate single_gpu ray orchestrator tests to avoid CI timeouts

7fbdfec

Signed-off-by: shuyixiong <219646547+shuyixiong@users.noreply.github.com>

Unwaive test

37fd0f6

Signed-off-by: shuyixiong <219646547+shuyixiong@users.noreply.github.com>

Use pytest mark instead expanding all tests

1178d95

Signed-off-by: shuyixiong <219646547+shuyixiong@users.noreply.github.com>

shuyixiong force-pushed the user/shuyix/isolate_ray_tests branch from da139f4 to 1178d95 Compare April 1, 2026 07:58

shuyixiong enabled auto-merge (squash) April 1, 2026 14:30

shuyixiong merged commit fd09239 into NVIDIA:main Apr 2, 2026
5 checks passed

karen-sy pushed a commit to karen-sy/TensorRT-LLM that referenced this pull request Apr 7, 2026

[https://nvbugs/5911788][fix] Isolate single_gpu ray orchestrator tes…

c4f6218

…ts to avoid CI timeouts (NVIDIA#12616) Signed-off-by: shuyixiong <219646547+shuyixiong@users.noreply.github.com>
