[None][ci] Fix misleading still running log when Slurm job is PENDING by QiJune · Pull Request #13586 · NVIDIA/TensorRT-LLM

QiJune · 2026-04-29T01:30:45Z

Summary by CodeRabbit

Bug Fixes
- Enhanced Slurm job monitoring log messages to display the actual job state instead of a generic status indicator, with graceful fallback to "UNKNOWN" when state information is unavailable.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

- PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
- PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
- Test cases are provided for new code paths (see test instructions)
- Any new dependencies have been scanned for license and vulnerabilities
- CODEOWNERS updated if ownership changes
- Documentation updated as needed
- Update tava architecture diagram if there is a significant design change in PR.
- The reviewers assigned automatically/manually are appropriate for the PR.
- Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

coderabbitai · 2026-04-29T01:32:11Z

📝 Walkthrough


Updates a Slurm monitoring log message in Jenkins test configuration to display the resolved sacct status value during polling, falling back to "UNKNOWN" when the status is unset, replacing a static message.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Slurm Monitoring Log Update `jenkins/L0_Test.groovy` | Modified the status polling log message to output the actual resolved `STATUS` value from `sacct` command output with an "UNKNOWN" fallback when status is empty, instead of a static "is still running" message. |
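
As a minimal sketch of the fallback described in this summary (not the actual code from `jenkins/L0_Test.groovy`, which this page does not show), the snippet below prints the resolved state and substitutes "UNKNOWN" when it is empty. The function name `format_poll_message` and the message wording are hypothetical.

```shell
# Hypothetical helper: build the polling log line from a job id ($1) and a
# possibly empty sacct state string ($2). The wording is an assumption.
format_poll_message() {
    # ${2:-UNKNOWN} expands to UNKNOWN when the state is unset or empty.
    echo "Slurm job $1 state: ${2:-UNKNOWN}"
}

# Example calls with made-up values.
format_poll_message 12345 PENDING
format_poll_message 12345 ""
```

The `${var:-default}` expansion keeps the fallback in one place, so the log line never shows a blank state.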

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ⚠️ Warning | The PR description is incomplete; it contains only the template with placeholders for Description and Test Coverage sections, which were not filled out by the author. | Fill in the Description section with an explanation of the issue and solution, and the Test Coverage section documenting relevant tests that validate the changes. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly summarizes the main change: fixing a misleading 'still running' log message for Slurm jobs in PENDING state, which directly relates to the code change in jenkins/L0_Test.groovy. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


coderabbitai

🧹 Nitpick comments (1)

jenkins/L0_Test.groovy (1)
1374-1383: ⚠️ Optional: make STATUS handling robust if sacct --allocations outputs multiple lines.

Right now STATUS=$(sacct ... --format=State -Pn --allocations) can potentially contain multiple newline-separated states (depending on Slurm accounting granularity). The subsequent check does direct string equality comparisons, which can behave unexpectedly if STATUS is multi-line (e.g., the else branch may trigger even though the job is still transitioning).

If in practice sacct --allocations sometimes returns multiple states, consider normalizing STATUS to a single token before both the condition and the log, e.g. selecting the last line:
Suggested tweak (normalize `STATUS` to one state)
-                        STATUS=\$(sacct -j \$jobId --format=State -Pn --allocations)
+                        STATUS=\$(sacct -j \$jobId --format=State -Pn --allocations | tail -n 1)
This is optional and only needed if you confirm multi-line STATUS occurs in your Slurm environment.
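
The normalization suggested above could be sketched roughly as follows, assuming the goal is to keep the last non-empty line of the `sacct` output. The function names `normalize_status` and `is_job_active` are hypothetical and do not come from `jenkins/L0_Test.groovy`; the three states treated as active are the ones this review comment mentions.

```shell
# Hypothetical helper: reduce a possibly multi-line state string to its
# last non-empty line. Prints an empty line when no state is present.
normalize_status() {
    printf '%s\n' "$1" | awk 'NF { last = $0 } END { print last }'
}

# Hypothetical helper: succeed while the job is in one of the states
# the polling loop keeps waiting on.
is_job_active() {
    case "$1" in
        RUNNING|PENDING|CONFIGURING) return 0 ;;
        *) return 1 ;;
    esac
}

# Example with made-up input: two states on separate lines.
raw=$(printf 'PENDING\nRUNNING\n')
state=$(normalize_status "$raw")
if is_job_active "$state"; then
    echo "job state: ${state:-UNKNOWN}"
fi
```

Compared with `tail -n 1`, the `awk` filter also skips blank trailing lines, which is only useful if such lines actually appear in your Slurm environment.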
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@jenkins/L0_Test.groovy` around lines 1374 - 1383, The STATUS variable from
the sacct call can be multi-line; update the assignment inside the loop (where
STATUS=\$(sacct ... --format=State -Pn --allocations) is set) to normalize to a
single token (e.g., extract the last non-empty line or last field) before the
if-check and echo, then use that normalized STATUS for the equality checks and
log messages so multi-line outputs won't break the RUNNING/PENDING/CONFIGURING
comparisons.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5bf25ec6-cd47-4674-9649-f00c3d848b03

📥 Commits

Reviewing files that changed from the base of the PR and between b5c41f2 and b20c863.

📒 Files selected for processing (1)

jenkins/L0_Test.groovy

QiJune · 2026-04-29T01:56:31Z

/bot skip --comment "trivial changes"

tensorrt-cicd · 2026-04-29T02:02:51Z

PR_Github #46022 [ skip ] triggered by Bot. Commit: b20c863

tensorrt-cicd · 2026-04-29T02:11:35Z

PR_Github #46022 [ skip ] completed with state SUCCESS. Commit: b20c863
Skipping testing for commit b20c863

Fix misleading still running log when Slurm job is PENDING

b20c863

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

QiJune requested review from a team as code owners April 29, 2026 01:30

QiJune requested review from niukuo and zeroepoch April 29, 2026 01:30

github-actions Bot assigned QiJune Apr 29, 2026

QiJune requested a review from ZhanruiSunCh April 29, 2026 01:30

coderabbitai Bot reviewed Apr 29, 2026


ZhanruiSunCh approved these changes Apr 29, 2026


QiJune enabled auto-merge (squash) April 29, 2026 01:56

QiJune merged commit c9e3d9a into NVIDIA:main Apr 29, 2026
11 of 12 checks passed

QiJune merged 1 commit into NVIDIA:main from QiJune:refine_log


3 participants
