[None][ci] Fix misleading still running log when Slurm job is PENDING #13586

Merged
QiJune merged 1 commit into NVIDIA:main from QiJune:refine_log
Apr 29, 2026

Conversation

@QiJune
Collaborator

@QiJune QiJune commented Apr 29, 2026

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced Slurm job monitoring log messages to display the actual job state instead of a generic status indicator, with graceful fallback to "UNKNOWN" when state information is unavailable.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
@QiJune QiJune requested review from a team as code owners April 29, 2026 01:30
@QiJune QiJune requested review from niukuo and zeroepoch April 29, 2026 01:30
@QiJune QiJune requested a review from ZhanruiSunCh April 29, 2026 01:30
@coderabbitai
Contributor

coderabbitai Bot commented Apr 29, 2026

📝 Walkthrough

Updates a Slurm monitoring log message in Jenkins test configuration to display the resolved sacct status value during polling, falling back to "UNKNOWN" when the status is unset, replacing a static message.
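
As an illustration of the fallback described above, here is a minimal shell sketch. It assumes the log line is built with the `${VAR:-default}` parameter expansion. The function name `poll_message` and the message wording are hypothetical and are not taken from jenkins/L0_Test.groovy.

```shell
# Hypothetical helper: build a polling log line from a job ID and a state.
# The wording is illustrative, not the exact message used in the PR.
poll_message() {
    job_id="$1"
    status="$2"
    # ${status:-UNKNOWN} expands to UNKNOWN when status is unset or empty.
    echo "Slurm job ${job_id} is in state: ${status:-UNKNOWN}"
}

poll_message 12345 "PENDING"   # prints: Slurm job 12345 is in state: PENDING
poll_message 12345 ""          # prints: Slurm job 12345 is in state: UNKNOWN
```

Logging the resolved state this way means a queued job is reported as PENDING rather than as still running.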

Changes

| Cohort / File(s) | Summary |
|---|---|
| Slurm Monitoring Log Update: jenkins/L0_Test.groovy | Modified the status polling log message to output the actual resolved STATUS value from sacct command output with an "UNKNOWN" fallback when status is empty, instead of a static "is still running" message. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ⚠️ Warning | The PR description is incomplete; it contains only the template with placeholders for Description and Test Coverage sections, which were not filled out by the author. | Fill in the Description section with an explanation of the issue and solution, and the Test Coverage section documenting relevant tests that validate the changes. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly summarizes the main change: fixing a misleading 'still running' log message for Slurm jobs in PENDING state, which directly relates to the code change in jenkins/L0_Test.groovy. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
jenkins/L0_Test.groovy (1)

1374-1383: ⚠️ Optional: make STATUS handling robust if sacct --allocations outputs multiple lines.

Currently, STATUS=$(sacct ... --format=State -Pn --allocations) may contain several newline-separated states, depending on Slurm accounting granularity. The subsequent check uses direct string equality, so a multi-line STATUS would match none of the expected states, and the else branch may trigger even though the job is still transitioning.

If in practice sacct --allocations sometimes returns multiple states, consider normalizing STATUS to a single token before both the condition and the log, e.g. selecting the last line:

Suggested tweak (normalize `STATUS` to one state)
-                        STATUS=\$(sacct -j \$jobId --format=State -Pn --allocations)
+                        STATUS=\$(sacct -j \$jobId --format=State -Pn --allocations | tail -n 1)

This is optional and only needed if you confirm multi-line STATUS occurs in your Slurm environment.
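
A variant of the same idea can be sketched as follows, under the assumption that blank lines may also appear in the output. The helper name `normalize_status` is hypothetical, and the sample input is made up for illustration rather than captured from a real sacct run.

```shell
# Hypothetical helper: reduce possibly multi-line state output to one token.
normalize_status() {
    # Drop blank lines, then keep only the last remaining line.
    printf '%s\n' "$1" | grep -v '^[[:space:]]*$' | tail -n 1
}

# Made-up sample standing in for multi-line sacct output.
raw=$(printf 'PENDING\nRUNNING')
STATUS=$(normalize_status "$raw")
echo "${STATUS:-UNKNOWN}"   # prints: RUNNING
```

With an empty input the helper returns an empty string, so the `${STATUS:-UNKNOWN}` fallback in the log message still applies.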

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@jenkins/L0_Test.groovy` around lines 1374 - 1383, The STATUS variable from
the sacct call can be multi-line; update the assignment inside the loop (where
STATUS=\$(sacct ... --format=State -Pn --allocations) is set) to normalize to a
single token (e.g., extract the last non-empty line or last field) before the
if-check and echo, then use that normalized STATUS for the equality checks and
log messages so multi-line outputs won't break the RUNNING/PENDING/CONFIGURING
comparisons.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5bf25ec6-cd47-4674-9649-f00c3d848b03

📥 Commits

Reviewing files that changed from the base of the PR and between b5c41f2 and b20c863.

📒 Files selected for processing (1)
  • jenkins/L0_Test.groovy

@QiJune
Collaborator Author

QiJune commented Apr 29, 2026

/bot skip --comment "trivial changes"

@QiJune QiJune enabled auto-merge (squash) April 29, 2026 01:56
@tensorrt-cicd
Collaborator

PR_Github #46022 [ skip ] triggered by Bot. Commit: b20c863

@tensorrt-cicd
Collaborator

PR_Github #46022 [ skip ] completed with state SUCCESS. Commit: b20c863
Skipping testing for commit b20c863

@QiJune QiJune merged commit c9e3d9a into NVIDIA:main Apr 29, 2026
11 of 12 checks passed
3 participants