@Shixiaowei02
Collaborator

@Shixiaowei02 Shixiaowei02 commented Dec 2, 2025

Summary by CodeRabbit

  • Chores
    • Enhanced CI/CD pipeline configuration to expand multi-GPU testing coverage.

@Shixiaowei02 Shixiaowei02 requested review from a team as code owners December 2, 2025 06:59
@Shixiaowei02 Shixiaowei02 requested review from kxdc and ruodil December 2, 2025 06:59
@Shixiaowei02
Collaborator Author

/bot run

@coderabbitai
Contributor

coderabbitai bot commented Dec 2, 2025

Walkthrough

This PR adds a test file path to the multi-GPU-related changed-file list in the Jenkins Groovy script, so that changes to that file now trigger the multi-GPU logic during CI/CD pipeline execution.

Changes

Cohort / File(s) | Change Summary
Jenkins Pipeline Configuration: jenkins/L0_MergeRequest.groovy | Added tests/integration/defs/accuracy/test_disaggregated_serving.py to the multi-GPU file detection list in getMultiGpuFileChanged
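
As a rough illustration of the change described above, the following Python sketch shows how a changed-file list can gate multi-GPU testing. It is not the Groovy code in jenkins/L0_MergeRequest.groovy: the names MULTI_GPU_RELATED_FILES and multi_gpu_file_changed, and the directory-prefix rule, are assumptions made for this example. Only the test file path is taken from the walkthrough.

```python
# Hypothetical sketch; the real logic lives in getMultiGpuFileChanged (Groovy).
MULTI_GPU_RELATED_FILES = [
    # Path added by this PR, as reported in the walkthrough.
    "tests/integration/defs/accuracy/test_disaggregated_serving.py",
]


def multi_gpu_file_changed(changed_files, related=MULTI_GPU_RELATED_FILES):
    """Return True if any changed file matches an entry in the related list.

    Entries ending in "/" are treated as directory prefixes (an assumption
    of this sketch); all other entries must match the path exactly.
    """
    for path in changed_files:
        for entry in related:
            if entry.endswith("/"):
                if path.startswith(entry):
                    return True
            elif path == entry:
                return True
    return False
```

With a check of this shape, a PR that touches only the listed test file would be flagged for multi-GPU testing, while a PR that touches unrelated files would not.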

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name | Status | Explanation | Resolution
Description check | ⚠️ Warning | The pull request has no description content beyond '@coderabbitai summary', missing required sections like Description, Test Coverage, and PR Checklist. | Add a proper description explaining what changes were made and why, document test coverage, and complete the PR Checklist as specified in the template.
Title check | ⚠️ Warning | The PR title mentions shortening time limits for dis-agg accuracy testing, but the actual change adds a test file to the multi-GPU file list, which is unrelated to time limits. | Update the title to accurately reflect the change: something like '[TRTLLM-6537][chore] Add disaggregated serving test to multi-GPU file list' or similar.
✅ Passed checks (1 passed)
Check name | Status | Explanation
Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

@tensorrt-cicd
Collaborator

PR_Github #26544 [ run ] triggered by Bot. Commit: b9e7c4f

@tensorrt-cicd
Collaborator

PR_Github #26544 [ run ] completed with state SUCCESS. Commit: b9e7c4f
/LLM/main/L0_MergeRequest_PR pipeline #20186 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

@Shixiaowei02 Shixiaowei02 force-pushed the user/xiaoweis/check_list branch from b9e7c4f to 961d8a1 on December 3, 2025 03:15
@Shixiaowei02
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #26702 [ run ] triggered by Bot. Commit: 961d8a1

@Shixiaowei02 Shixiaowei02 force-pushed the user/xiaoweis/check_list branch from 961d8a1 to f6299b0 on December 3, 2025 14:51
@Shixiaowei02
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #26800 [ run ] triggered by Bot. Commit: f6299b0

@tensorrt-cicd
Collaborator

PR_Github #26702 [ run ] completed with state ABORTED. Commit: 961d8a1
/LLM/main/L0_MergeRequest_PR pipeline #20318 completed with status: 'FAILURE'

@Shixiaowei02 Shixiaowei02 force-pushed the user/xiaoweis/check_list branch from f6299b0 to 21f2ba7 on December 3, 2025 15:05
@Shixiaowei02 Shixiaowei02 reopened this Dec 3, 2025
@Shixiaowei02 Shixiaowei02 force-pushed the user/xiaoweis/check_list branch 2 times, most recently from f6d9ac7 to 1f6ceee on December 3, 2025 15:11
@Shixiaowei02
Collaborator Author

/bot run --add-multi-gpu-test --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #26803 [ run ] triggered by Bot. Commit: 1f6ceee

@tensorrt-cicd
Collaborator

PR_Github #26800 [ run ] completed with state ABORTED. Commit: f6299b0
LLM/main/L0_MergeRequest_PR #20406 (Blue Ocean) completed with status: ABORTED

@tensorrt-cicd
Collaborator

PR_Github #26803 [ run ] completed with state SUCCESS. Commit: 1f6ceee
/LLM/main/L0_MergeRequest_PR pipeline #20409 completed with status: 'FAILURE'

@Shixiaowei02 Shixiaowei02 force-pushed the user/xiaoweis/check_list branch from 1f6ceee to 7cca46a on December 4, 2025 06:12
@Shixiaowei02
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #26893 [ run ] triggered by Bot. Commit: 7cca46a

@Shixiaowei02 Shixiaowei02 force-pushed the user/xiaoweis/check_list branch from 7cca46a to a07c39e on December 4, 2025 11:26
@Shixiaowei02
Collaborator Author

/bot help

@github-actions

github-actions bot commented Dec 4, 2025

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provide a user friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

Details

run [--reuse-test (optional)pipeline-id --disable-fail-fast --skip-test --stage-list "A10-PyTorch-1, xxx" --gpu-type "A30, H100_PCIe" --test-backend "pytorch, cpp" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" --detailed-log --debug(experimental)]

Launch build/test pipelines. All previously running jobs will be killed.

--reuse-test (optional)pipeline-id (OPTIONAL) : Allow the new pipeline to reuse build artifacts and skip successful test stages from a specified pipeline or the last pipeline if no pipeline-id is indicated. If the Git commit ID has changed, this option will be always ignored. The DEFAULT behavior of the bot is to reuse build artifacts and successful test results from the last pipeline.

--disable-reuse-test (OPTIONAL) : Explicitly prevent the pipeline from reusing build artifacts and skipping successful test stages from a previous pipeline. Ensure that all builds and tests are run regardless of previous successes.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-PyTorch-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-PyTorch-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--test-backend "pytorch, cpp" (OPTIONAL) : Skip test stages which don't match the specified backends. Only support [pytorch, cpp, tensorrt, triton]. Examples: "pytorch, cpp" (does not run test stages with tensorrt or triton backend). Note: Does NOT update GitHub pipeline status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests in addition to running L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx".

--detailed-log (OPTIONAL) : Enable flushing out all logs to the Jenkins console. This will significantly increase the log volume and may slow down the job.

--debug (OPTIONAL) : Experimental feature. Enable access to the CI container for debugging purpose. Note: Specify exactly one stage in the stage-list parameter to access the appropriate container environment. Note: Does NOT update GitHub check status.

kill

kill

Kill all running builds associated with pull request.

skip

skip --comment COMMENT

Skip testing for latest commit on pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.
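
The subcommand layout in the help text above can be modelled with a small argparse sketch. This is an illustration only, not the bot's actual implementation: the function names build_parser and parse_bot_comment are hypothetical, only a subset of the documented run flags is included, and the use of shlex for tokenizing the comment is an assumption.

```python
import argparse
import shlex


def _comma_list(value):
    # Split a value such as "A30, H100_PCIe" into ["A30", "H100_PCIe"].
    return [item.strip() for item in value.split(",") if item.strip()]


def build_parser():
    """Build a parser mirroring a subset of the documented /bot subcommands."""
    parser = argparse.ArgumentParser(prog="/bot")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run")
    for flag in ("--disable-fail-fast", "--skip-test", "--add-multi-gpu-test",
                 "--only-multi-gpu-test", "--disable-multi-gpu-test",
                 "--post-merge"):
        run.add_argument(flag, action="store_true")
    run.add_argument("--stage-list", type=_comma_list, default=[])
    run.add_argument("--gpu-type", type=_comma_list, default=[])

    sub.add_parser("kill")

    skip = sub.add_parser("skip")
    skip.add_argument("--comment", required=True)

    sub.add_parser("reuse-pipeline")
    return parser


def parse_bot_comment(comment):
    """Tokenize a PR comment with shell-like quoting and parse it."""
    tokens = shlex.split(comment)
    if not tokens or tokens[0] != "/bot":
        raise ValueError("not a /bot command")
    return build_parser().parse_args(tokens[1:])
```

For example, parse_bot_comment('/bot run --add-multi-gpu-test --disable-fail-fast') yields a namespace with command set to "run" and both flags set to True, while a skip command without --comment is rejected by argparse.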

@tensorrt-cicd
Collaborator

PR_Github #27066 [ run ] completed with state SUCCESS. Commit: 40328b0
/LLM/main/L0_MergeRequest_PR pipeline #20642 (Partly Tested) completed with status: 'FAILURE'

@Shixiaowei02
Collaborator Author

/bot run --only-multi-gpu-test

@tensorrt-cicd
Collaborator

PR_Github #27115 [ run ] triggered by Bot. Commit: 6057487

@tensorrt-cicd
Collaborator

PR_Github #27115 [ run ] completed with state SUCCESS. Commit: 6057487
/LLM/main/L0_MergeRequest_PR pipeline #20686 (Partly Tested) completed with status: 'FAILURE'

@Shixiaowei02
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #27234 [ run ] triggered by Bot. Commit: 22947ac

@tensorrt-cicd
Collaborator

PR_Github #27234 [ run ] completed with state SUCCESS. Commit: 22947ac
/LLM/main/L0_MergeRequest_PR pipeline #20793 completed with status: 'FAILURE'

Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
@Shixiaowei02 Shixiaowei02 force-pushed the user/xiaoweis/check_list branch from 22947ac to 39138e7 on December 8, 2025 12:47
@Shixiaowei02
Collaborator Author

/bot run --only-multi-gpu-test

@tensorrt-cicd
Collaborator

PR_Github #27309 [ run ] triggered by Bot. Commit: 39138e7

@tensorrt-cicd
Collaborator

PR_Github #27309 [ run ] completed with state SUCCESS. Commit: 39138e7
/LLM/main/L0_MergeRequest_PR pipeline #20860 (Partly Tested) completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

@Shixiaowei02 Shixiaowei02 requested a review from chzblych December 9, 2025 02:07
@Shixiaowei02
Collaborator Author

/bot skip --comment "CI was tested separately using 20186 (single GPU) and 20860 (multiple GPUs)"

@tensorrt-cicd
Collaborator

PR_Github #27388 Bot args parsing error: usage: /bot [-h]
{run,kill,skip,submit,reviewers,reuse-pipeline,reuse-review} ...
/bot: error: unrecognized arguments: CI was tested separately using 20186 (single GPU) and 20860 (multiple GPUs)

@Shixiaowei02 Shixiaowei02 enabled auto-merge (squash) December 9, 2025 02:08
@Shixiaowei02
Collaborator Author

/bot skip --comment "CI was tested separately using 20186 single GPU and 20860 multiple GPUs"

@Shixiaowei02 Shixiaowei02 changed the title from "[TRTLLM-6537][infra] extend multi-gpu tests related file list" to "[TRTLLM-6537][infra] Shorten the time limit for dis-agg accuracy testing" on Dec 9, 2025
@Shixiaowei02 Shixiaowei02 changed the title from "[TRTLLM-6537][infra] Shorten the time limit for dis-agg accuracy testing" to "[TRTLLM-6537][chore] Shorten the time limit for dis-agg accuracy testing" on Dec 9, 2025
@tensorrt-cicd
Collaborator

PR_Github #27390 [ skip ] triggered by Bot. Commit: 39138e7

@tensorrt-cicd
Collaborator

PR_Github #27391 [ skip ] triggered by Bot. Commit: 39138e7

@tensorrt-cicd
Collaborator

PR_Github #27390 [ skip ] completed with state ABORTED. Commit: 39138e7

@tensorrt-cicd
Collaborator

PR_Github #27391 [ skip ] completed with state SUCCESS. Commit: 39138e7
Skipping testing for commit 39138e7

@Shixiaowei02 Shixiaowei02 merged commit b050804 into NVIDIA:main Dec 9, 2025
9 checks passed
@Shixiaowei02 Shixiaowei02 deleted the user/xiaoweis/check_list branch December 9, 2025 05:38
usberkeley pushed a commit to usberkeley/TensorRT-LLM that referenced this pull request Dec 11, 2025
…#9614)

Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
codego7250 pushed a commit to codego7250/TensorRT-LLM that referenced this pull request Dec 11, 2025
…#9614)

Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
codego7250 pushed a commit to codego7250/TensorRT-LLM that referenced this pull request Dec 13, 2025
…#9614)

Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
3 participants