[TRTLLM-7335][infra] use JobBuilder to trigger downstream job#7079

Merged
niukuo merged 2 commits into NVIDIA:main from niukuo:downstream
Apr 2, 2026
@niukuo
Collaborator

@niukuo niukuo commented Aug 20, 2025

Summary by CodeRabbit

  • Chores
    • Enhanced build and test infrastructure with improved job execution consistency and streamlined testing workflows for better reliability and maintainability across the CI/CD pipeline.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provide a user-friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

Details

run [--reuse-test (optional)pipeline-id --disable-fail-fast --skip-test --stage-list "A10-PyTorch-1, xxx" --gpu-type "A30, H100_PCIe" --test-backend "pytorch, cpp" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" --detailed-log --debug(experimental)]

Launch build/test pipelines. All previously running jobs will be killed.

--reuse-test (optional)pipeline-id (OPTIONAL) : Allow the new pipeline to reuse build artifacts and skip successful test stages from a specified pipeline or the last pipeline if no pipeline-id is indicated. If the Git commit ID has changed, this option is always ignored. The DEFAULT behavior of the bot is to reuse build artifacts and successful test results from the last pipeline.

--disable-reuse-test (OPTIONAL) : Explicitly prevent the pipeline from reusing build artifacts and skipping successful test stages from a previous pipeline. Ensure that all builds and tests are run regardless of previous successes.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-PyTorch-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-PyTorch-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--test-backend "pytorch, cpp" (OPTIONAL) : Skip test stages which don't match the specified backends. Only supports [pytorch, cpp, tensorrt, triton]. Examples: "pytorch, cpp" (does not run test stages with tensorrt or triton backend). Note: Does NOT update GitHub pipeline status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests in addition to running L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx".

--detailed-log (OPTIONAL) : Enable flushing out all logs to the Jenkins console. This will significantly increase the log volume and may slow down the job.

--debug (OPTIONAL) : Experimental feature. Enable access to the CI container for debugging purposes. Note: Specify exactly one stage in the stage-list parameter to access the appropriate container environment. Note: Does NOT update GitHub check status.

For guidance on mapping tests to stage names, see docs/source/reference/ci-overview.md
and the scripts/test_to_stage_mapping.py helper.

kill

kill

Kill all running builds associated with pull request.

skip

skip --comment COMMENT

Skip testing for latest commit on pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

@niukuo niukuo requested review from a team as code owners August 20, 2025 07:16
@niukuo niukuo requested review from tburt-nv and xinhe-nv August 20, 2025 07:16
@coderabbitai
Contributor

coderabbitai bot commented Aug 20, 2025

Walkthrough

Replaced the old triggerJob flow in jenkins/L0_MergeRequest.groovy with a refactored launchJob that accepts a pipeline argument, normalizes downstream job names, and invokes downstream runs via com.nvidia.bloom.JobBuilder.build(...) using a Logger, returning buildStatus and erroring when status != "SUCCESS".

Changes

| Cohort / File(s) | Summary |
|---|---|
| Primary pipeline file: jenkins/L0_MergeRequest.groovy | Removed triggerJob(...). Updated launchJob(...) signature to launchJob(pipeline, jobName, ...). Added Logger import and usage. Job name normalization (trim leading / or prepend job dir). Downstream invocation replaced with JobBuilder.build(pipeline, logger, jobName, parameters, 1, false) and returns buildStatus; errors when buildStatus != "SUCCESS". Replaced prior direct build/trigger calls across Build and Test stages to use new launchJob(pipeline, ...). Test stage labels simplified to “Remote Run” unconditionally. |
| Dependencies / imports: jenkins/L0_MergeRequest.groovy (top-level changes) | Added imports: com.nvidia.bloom.Logger and com.nvidia.bloom.JobBuilder. |
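
The normalize-then-check flow summarized above can be sketched in plain Groovy. This is only an illustration under stated assumptions: `normalizeJobName`, `launchJobSketch`, the `currentJobDir` argument, and the job names are hypothetical, the reading of "trim leading / or prepend job dir" is a guess at the intent, and the downstream call is replaced by a caller-supplied closure so the sketch runs without Jenkins or `com.nvidia.bloom`.

```groovy
// Hypothetical helper: the real normalization lives inside launchJob in
// jenkins/L0_MergeRequest.groovy and may differ in detail.
String normalizeJobName(String jobName, String currentJobDir) {
    if (jobName.startsWith('/')) {
        // Absolute name: drop the leading slash.
        return jobName.substring(1)
    }
    // Relative name: resolve it against the folder of the current job.
    return "${currentJobDir}/${jobName}".toString()
}

// Stand-in for launchJob: `runDownstream` replaces the JobBuilder call and
// must return the downstream build status as a String.
String launchJobSketch(String jobName, String currentJobDir, Closure<String> runDownstream) {
    String normalized = normalizeJobName(jobName, currentJobDir)
    String buildStatus = runDownstream(normalized)
    if (buildStatus != 'SUCCESS') {
        // A real pipeline would more likely call the `error` step here.
        throw new IllegalStateException("Downstream job ${normalized} finished with ${buildStatus}")
    }
    return buildStatus
}

// Usage with a fake downstream job that always succeeds.
assert launchJobSketch('SomeDownstreamJob', 'LLM/main') { name -> 'SUCCESS' } == 'SUCCESS'
```

Passing the downstream call in as a closure keeps the status check testable on its own; in the actual pipeline that role is played by `JobBuilder.build(...)`, whose behavior is not reproduced here.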

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor DevPipeline
  participant L0 as L0_MergeRequest.launchJob
  participant Logger as Logger
  participant JB as JobBuilder
  participant Down as DownstreamJob

  DevPipeline->>L0: launchJob(pipeline, jobName, params...)
  L0->>L0: normalize jobName (trim '/' or prepend dir)
  L0->>Logger: new Logger(pipeline)
  L0->>JB: JobBuilder.build(pipeline, logger, jobName, parameters, 1, false)
  JB->>Down: start downstream build
  Down-->>JB: build result (buildStatus)
  JB-->>L0: return buildStatus
  alt buildStatus == "SUCCESS"
    L0-->>DevPipeline: return "SUCCESS"
  else
    L0-->>DevPipeline: error / fail pipeline
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e72ade3 and 46b85d7.

📒 Files selected for processing (1)
  • jenkins/L0_MergeRequest.groovy (13 hunks)

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
jenkins/L0_MergeRequest.groovy (2)

940-969: Support Jenkins URL/credentials via parameters and guard against missing credentials before remote trigger

The current implementation only reads jenkinsUrl/credentials from env, which doesn’t align with the stated goal of “allowing Jenkins URL in parameters,” and it may attempt a remote trigger with a URL but no credentials, causing failures.

  • Honor optional keys in parameters (e.g., jenkinsUrl/downstreamJenkinsUrl and jobCredentials/downstreamJobCredentials/credentials).
  • Avoid null-vs-empty pitfalls by using truthiness checks.
  • Only use triggerRemoteJob when both jenkinsUrl and credentials are present; otherwise fall back to local build.
  • Do not forward the URL/credentials keys downstream.

Apply this diff:

 def triggerJob(jobName, parameters)
 {
-    def jenkinsUrl = env.downstreamJenkinsUrl
-    def credentials = env.localJobCredentials
-    if (jenkinsUrl == "" && credentials) {
-        jenkinsUrl = env.JENKINS_URL
-    }
-    def status = ""
-    if (jenkinsUrl != "") {
-        def jobPath = trtllm_utils.resolveFullJobName(jobName).replace('/', '/job/').substring(1)
-        def handle = triggerRemoteJob(
-            job: "${jenkinsUrl}${jobPath}/",
-            auth: CredentialsAuth(credentials: credentials),
-            parameters: trtllm_utils.toRemoteBuildParameters(parameters),
-            pollInterval: 60,
-            abortTriggeredJob: true,
-        )
-        status = handle.getBuildResult().toString()
-    } else {
-        def handle = build(
-            job: jobName,
-            parameters: trtllm_utils.toBuildParameters(parameters),
-            propagate: false,
-        )
-        echo "Triggered job: ${handle.absoluteUrl}"
-        status = handle.result
-    }
-    return status
+    // Make a defensive copy and allow overrides from parameters
+    def paramMap = parameters ? new LinkedHashMap(parameters) : [:]
+    def paramJenkinsUrl = (paramMap.remove('jenkinsUrl') ?: paramMap.remove('downstreamJenkinsUrl')) as String
+    def paramCredentials = (paramMap.remove('jobCredentials') ?: paramMap.remove('downstreamJobCredentials') ?: paramMap.remove('credentials')) as String
+
+    def jenkinsUrl = paramJenkinsUrl ?: env.downstreamJenkinsUrl
+    def credentials = paramCredentials ?: env.localJobCredentials
+    if (!jenkinsUrl?.trim() && credentials?.trim()) {
+        // Allow remote trigger to same controller only if credentials are present
+        jenkinsUrl = env.JENKINS_URL
+    }
+
+    def status = ""
+    boolean useRemote = jenkinsUrl?.trim() && credentials?.trim()
+    if (useRemote) {
+        def jobPath = trtllm_utils.resolveFullJobName(jobName).replace('/', '/job/').substring(1)
+        echo "Triggering remote job: ${jenkinsUrl}${jobPath}/"
+        def handle = triggerRemoteJob(
+            job: "${jenkinsUrl}${jobPath}/",
+            auth: CredentialsAuth(credentials: credentials),
+            parameters: trtllm_utils.toRemoteBuildParameters(paramMap),
+            pollInterval: 60,
+            abortTriggeredJob: true,
+        )
+        status = handle.getBuildResult().toString()
+    } else {
+        def handle = build(
+            job: jobName,
+            parameters: trtllm_utils.toBuildParameters(paramMap),
+            propagate: false,
+        )
+        echo "Triggered job: ${handle.absoluteUrl}"
+        status = handle.result
+    }
+    return status
 }
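
One detail in the suggested diff above is worth checking before applying it: Groovy's Elvis operator short-circuits, so `paramMap.remove('jenkinsUrl') ?: paramMap.remove('downstreamJenkinsUrl')` removes the second key only when the first is missing or empty. The plain-Groovy sketch below runs without Jenkins; the map contents and the `takeFirst` helper are made up for illustration and are not part of the PR.

```groovy
// Illustrative parameters; the URLs and the 'branch' key are placeholders.
def parameters = [
    jenkinsUrl          : 'https://ci-a.example/',
    downstreamJenkinsUrl: 'https://ci-b.example/',
    branch              : 'main',
]

// Same pattern as the suggested diff: the right-hand remove() runs only
// when the left-hand side is null or empty.
def shortCircuit = new LinkedHashMap(parameters)
def url = shortCircuit.remove('jenkinsUrl') ?: shortCircuit.remove('downstreamJenkinsUrl')
assert url == 'https://ci-a.example/'
assert shortCircuit.containsKey('downstreamJenkinsUrl')   // would still be forwarded

// Hypothetical helper: remove every alias first, then pick the first truthy value.
def takeFirst(Map map, List<String> keys) {
    def values = keys.collect { map.remove(it) }
    return values.find { it }
}

def eager = new LinkedHashMap(parameters)
def eagerUrl = takeFirst(eager, ['jenkinsUrl', 'downstreamJenkinsUrl'])
assert eagerUrl == 'https://ci-a.example/'
assert eager == [branch: 'main']   // no URL alias is left to forward downstream
```

Removing all aliases eagerly matches the stated goal of not forwarding the URL or credentials keys downstream, at the cost of one small helper.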

948-957: Prevent remote trigger without credentials

If env.downstreamJenkinsUrl is set but credentials are missing, this path will call triggerRemoteJob with empty credentials, likely failing.

Gate the remote path on both jenkinsUrl and credentials being non-blank. Example:

-    if (jenkinsUrl != "") {
+    if (jenkinsUrl?.trim() && credentials?.trim()) {
         // remote trigger
     } else {
         // local build
     }
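
As a rough illustration of this gate combined with the job-path rewrite used in the remote branch of triggerJob, here is a standalone Groovy sketch. `remoteJobUrl` is a hypothetical helper, `fullJobName` is assumed to be an already-resolved absolute name (what `trtllm_utils.resolveFullJobName` is expected to return), and the URL, credentials ID, and job name are placeholders.

```groovy
// Hypothetical helper: returns the remote job URL, or null when the caller
// should fall back to a local build.
String remoteJobUrl(String jenkinsUrl, String credentials, String fullJobName) {
    if (!(jenkinsUrl?.trim() && credentials?.trim())) {
        return null   // URL or credentials missing or blank: use the local build step
    }
    // "/a/b" -> "/job/a/job/b" -> "job/a/job/b"
    String jobPath = fullJobName.replace('/', '/job/').substring(1)
    // Assumes jenkinsUrl ends with '/', as JENKINS_URL normally does.
    return "${jenkinsUrl}${jobPath}/".toString()
}

assert remoteJobUrl('https://ci.example/', 'cred-id', '/LLM/main/SomeJob') ==
        'https://ci.example/job/LLM/job/main/job/SomeJob/'
assert remoteJobUrl('https://ci.example/', null, '/LLM/main/SomeJob') == null
```

Returning null for the local case keeps the decision in one place, so the caller only needs a single truthiness check before choosing between triggerRemoteJob and build.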
🧹 Nitpick comments (1)
jenkins/L0_MergeRequest.groovy (1)

942-946: Fix null/blank handling for env.downstreamJenkinsUrl

Comparing with an empty string misses the null case. If env.downstreamJenkinsUrl is null, the fallback to env.JENKINS_URL won’t execute, changing behavior unintentionally.

Use truthiness checks:

-    if (jenkinsUrl == "" && credentials) {
+    if (!jenkinsUrl?.trim() && credentials?.trim()) {
         jenkinsUrl = env.JENKINS_URL
     }
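
The difference between the two conditions can be checked in plain Groovy, outside Jenkins. The variable names and the `isBlank` helper below are illustrative only; the sketch assumes an unset `env` entry reads as null, which is the case this comment describes.

```groovy
String unset = null
String empty = ''
String blank = '   '

// The original comparison matches only the empty string.
assert !(unset == '')
assert empty == ''
assert !(blank == '')

// Groovy truth with safe navigation treats null, '' and whitespace alike.
assert !unset?.trim()    // null?.trim() is null, which is falsy
assert !empty?.trim()
assert !blank?.trim()    // trims to '', which is falsy
assert 'https://ci.example/'?.trim()

// Hypothetical helper wrapping the same check.
boolean isBlank(String s) { !s?.trim() }
assert isBlank(unset) && isBlank(empty) && isBlank(blank)
```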
📜 Review details


💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 020fed9 and 687f1d1.

📒 Files selected for processing (1)
  • jenkins/L0_MergeRequest.groovy (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Pre-commit Check
🔇 Additional comments (1)
jenkins/L0_MergeRequest.groovy (1)

970-1007: No outdated triggerJob call sites found

A repository-wide search confirms that triggerJob is only defined with two parameters and invoked with two arguments (e.g., in jenkins/L0_MergeRequest.groovy:1002). No three-argument usages remain.

@niukuo
Collaborator Author

niukuo commented Aug 21, 2025

/bot run --skip-test

@tensorrt-cicd
Collaborator

PR_Github #15970 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #15970 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #12003 (Partly Tested) completed with status: 'FAILURE'

@tensorrt-cicd
Collaborator

PR_Github #16107 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #16109 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #16107 [ run ] completed with state ABORTED

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
jenkins/L0_MergeRequest.groovy (1)

1027-1027: Stage labels tied to credentials are consistent with the updated trigger logic.

Using env.downstreamJobCredentials ? "Remote Run" : "Run" keeps the UI aligned with the proposed “remote only when credentials exist” behavior in triggerJob. Looks good.

Two tiny follow-ups:

  • SBSA block still defines def jenkinsUrl = "" and def credentials = "" above Line 1121 but no longer uses them; consider removing to avoid noise.
  • If you want to de-dup the label logic, add a small helper and use it in all four places. Example (outside this block):
// near other helpers
def remoteLabel() { (env.downstreamJobCredentials ? "Remote Run" : "Run") }

Then:

def testStageName = "[Test-x86_64-Single-GPU] ${remoteLabel()}"
// ...and similarly for the other three

Also applies to: 1084-1084, 1121-1121, 1188-1188

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 9787b1b and 3b858dc.

📒 Files selected for processing (1)
  • jenkins/L0_MergeRequest.groovy (5 hunks)
🔇 Additional comments (1)
jenkins/L0_MergeRequest.groovy (1)

940-968: No stray downstreamJenkinsUrl references detected; changes approved.

Scan results confirm that the only place env.downstreamJenkinsUrl is read is within the triggerJob method (lines 942–945), immediately alongside env.downstreamJobCredentials. No other code paths set or use downstreamJenkinsUrl without pairing it with downstreamJobCredentials. The hardening diff therefore covers the single trigger point and safely falls back to a local build when credentials are missing.

@tensorrt-cicd
Collaborator

PR_Github #16109 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #12117 (Partly Tested) completed with status: 'FAILURE'

@tensorrt-cicd
Collaborator

PR_Github #16116 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #16116 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #12123 (Partly Tested) completed with status: 'FAILURE'

@tensorrt-cicd
Collaborator

PR_Github #16806 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #16806 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #12616 (Partly Tested) completed with status: 'SUCCESS'

@niukuo
Collaborator Author

niukuo commented Aug 28, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #16845 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #16845 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #12646 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

@chzblych changed the title from "[TRTLLM-7335][infra] allows jenkins url in parameters" to "[TRTLLM-7335][infra] allows Jenkins URL in parameters" on Sep 2, 2025
@niukuo
Collaborator Author

niukuo commented Sep 18, 2025

/bot run --skip-test

@tensorrt-cicd
Collaborator

PR_Github #21128 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #21128 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #15961 (Partly Tested) completed with status: 'FAILURE'

@tensorrt-cicd
Collaborator

PR_Github #21169 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #21169 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #15982 (Partly Tested) completed with status: 'FAILURE'

@tensorrt-cicd
Collaborator

PR_Github #21414 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #21414 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #16172 (Partly Tested) completed with status: 'FAILURE'

@tensorrt-cicd
Collaborator

PR_Github #21417 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #21417 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #16175 (Partly Tested) completed with status: 'FAILURE'

@tensorrt-cicd
Collaborator

PR_Github #21420 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #21420 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #16176 (Partly Tested) completed with status: 'FAILURE'

@tensorrt-cicd
Collaborator

PR_Github #21430 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #21430 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #16184 (Partly Tested) completed with status: 'SUCCESS'

@tensorrt-cicd
Collaborator

PR_Github #21667 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #21667 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #16326 (Partly Tested) completed with status: 'FAILURE'

@tensorrt-cicd
Collaborator

PR_Github #21668 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #21668 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #16327 (Partly Tested) completed with status: 'SUCCESS'

@niukuo
Collaborator Author

niukuo commented Oct 17, 2025

@coderabbitai summary

@coderabbitai
Contributor

coderabbitai bot commented Oct 17, 2025

✅ Actions performed

Summary regeneration triggered.

Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>
@niukuo
Collaborator Author

niukuo commented Apr 1, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #41209 [ run ] triggered by Bot. Commit: dd94657

@tensorrt-cicd
Collaborator

PR_Github #41209 [ run ] completed with state SUCCESS. Commit: dd94657
/LLM/main/L0_MergeRequest_PR pipeline #32171 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

@niukuo niukuo merged commit a3b686d into NVIDIA:main Apr 2, 2026
5 checks passed
karen-sy pushed a commit to karen-sy/TensorRT-LLM that referenced this pull request Apr 7, 2026
…#7079)

Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>