[TRTLLM-12092][infra] Add PR Base Freshness Check Action #13430

Merged
crazydemo merged 1 commit into NVIDIA:main from crazydemo:TRTLLM-12092
Apr 28, 2026

@crazydemo
Collaborator

@crazydemo crazydemo commented Apr 24, 2026

Summary by CodeRabbit

Release Notes

  • Chores
    • Added automated validation to ensure pull request base branches stay current with target branches, detecting staleness based on commit history and age metrics.
    • New GitHub Actions workflow monitors PR base freshness with configurable threshold limits and optional enforcement modes for blocking stale PRs.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

@coderabbitai
Contributor

coderabbitai Bot commented Apr 24, 2026

📝 Walkthrough

Introduces a GitHub Actions-based PR base freshness check. A Python script evaluates whether a PR's merge base is sufficiently fresh relative to the target branch by comparing the commits-behind count and the merge-base age against configured thresholds. A workflow runs the check on PR lifecycle events, with optional enforcement.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| PR Base Freshness Validation: .github/scripts/pr_base_freshness_check.py, .github/workflows/pr-base-freshness.yml | New Python script that computes PR merge-base via git, calculates commits-behind interval count and base age in days, generates console and markdown reports, and exits with configurable enforcement (warning-only or blocking) based on staleness metrics. Paired with new workflow that fetches PR refs, manages git history depth, installs Python 3.12, and invokes the script with threshold environment variables and enforcement flag. |

Sequence Diagram

sequenceDiagram
    actor GHA as GitHub Actions
    participant WF as pr-base-freshness.yml
    participant GIT as Git
    participant PY as pr_base_freshness_check.py
    participant ENV as Environment/Report

    GHA->>WF: Trigger on PR events
    WF->>GIT: Fetch PR head ref
    WF->>GIT: Deepen target branch history<br/>(iterative fetch)
    WF->>GIT: Compute merge-base
    GIT-->>WF: merge-base SHA
    WF->>PY: Run script with PR_HEAD_SHA,<br/>TARGET_REF, thresholds
    PY->>GIT: Get merge-base commits
    GIT-->>PY: Commits between merge-base<br/>and target ref
    PY->>PY: Calculate staleness<br/>(commits behind + base age)
    PY->>ENV: Write report to<br/>GITHUB_STEP_SUMMARY
    PY->>GHA: Exit with status<br/>(0=pass/warn, 1=fail)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The PR description contains only the template with no actual content in the required sections (Description and Test Coverage are empty, only the checklist is marked complete). | Fill in the Description section explaining the issue/motivation and solution, and provide clear Test Coverage details documenting how the changes are validated. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding a PR base freshness check action, which is fully supported by the changeset. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
.github/scripts/pr_base_freshness_check.py (1)

28-29: Missing error handling for git command failures.

If any git command fails (e.g., invalid ref, corrupted repo state), subprocess.run(..., check=True) raises CalledProcessError, which propagates as an unhandled exception with a less informative traceback rather than a clean ::error:: message.

Regarding static analysis hints (S603/S607): These are acceptable here since the inputs are controlled by the workflow environment and git is guaranteed to be in PATH on GitHub-hosted runners.

♻️ Suggested improvement for cleaner error reporting
 def _git(*args: str) -> str:
-    return subprocess.run(["git", *args], capture_output=True, text=True, check=True).stdout.strip()
+    result = subprocess.run(["git", *args], capture_output=True, text=True)
+    if result.returncode != 0:
+        print(f"::error::git {' '.join(args)} failed: {result.stderr.strip()}")
+        raise SystemExit(1)
+    return result.stdout.strip()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/pr_base_freshness_check.py around lines 28 - 29, Change the
helper _git to catch subprocess.CalledProcessError and emit a GitHub Actions
friendly error message instead of letting an unhandled traceback percolate: wrap
the subprocess.run call in try/except for subprocess.CalledProcessError, and in
the except block print a formatted ::error:: line that includes the failed git
command (args), the return code, and the captured stderr/stdout for diagnostics,
then either re-raise or call sys.exit(1) to fail the workflow cleanly; reference
the _git(...) function to locate where to add this handling.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 9af26d39-a7b9-4981-81e8-c3260ca70d89

📥 Commits

Reviewing files that changed from the base of the PR and between c4b8e8e and ffd95ce.

📒 Files selected for processing (2)
  • .github/scripts/pr_base_freshness_check.py
  • .github/workflows/pr-base-freshness.yml

@crazydemo
Collaborator Author

Post-merge actions required by repo admins

This PR ships the freshness check in warn-only mode. Merging this PR alone is not enough to block stale-base PRs — the workflow
will log ::warning:: but stale PRs can still merge. To finish the rollout, a repo admin (someone with Admin access on
NVIDIA/TensorRT-LLM) needs to take the following steps. All configuration lives in the repo Settings UI; no code changes or
follow-up PRs are needed.

Step 1 (recommended): Observe in warn-only mode for ~2 weeks

Let the workflow run as-is. The computed commits_behind and age_days for every PR appear in the workflow run's step summary and in
the Actions log. Use that distribution to decide whether the default thresholds need tuning for real TRT-LLM PR volume.
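For readers unfamiliar with step summaries, here is a rough sketch of how a script can publish these two metrics. `GITHUB_STEP_SUMMARY` is the file path GitHub Actions provides for this purpose; the function names and the table layout are hypothetical and may differ from what the merged script emits.

```python
import os


def render_report(commits_behind: int, age_days: float,
                  commits_limit: int, age_limit_days: int) -> str:
    """Build a small markdown table with the computed metrics (layout is illustrative)."""
    rows = [
        "| Metric | Value | Limit |",
        "| --- | --- | --- |",
        f"| commits_behind | {commits_behind} | {commits_limit} |",
        f"| age_days | {age_days:.1f} | {age_limit_days} |",
    ]
    return "\n".join(rows) + "\n"


def write_step_summary(markdown: str) -> bool:
    """Append markdown to the step summary file; return False when not running in Actions."""
    path = os.environ.get("GITHUB_STEP_SUMMARY")
    if not path:
        return False
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(markdown)
    return True
```

Appending rather than overwriting matters because other steps in the same job may write to the same summary file.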

Step 2 (optional): Tune thresholds

Thresholds are read from repo-level Actions variables so they can be changed without another PR. Under:

Settings → Secrets and variables → Actions → Variables tab → Repository variables

| Variable | Fallback | Purpose |
| --- | --- | --- |
| PR_BASE_FRESHNESS_COMMITS_LIMIT | 150 | Max commits behind target before the PR is considered stale |
| PR_BASE_FRESHNESS_AGE_LIMIT_DAYS | 10 | Max age (in days) of the merge base before the PR is considered stale |

Leave them unset to use the fallbacks.
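As an illustration of the fallback behavior, the sketch below reads a threshold from the environment and falls back when the value is missing or empty (an unset Actions variable expands to an empty string in a workflow expression). It assumes the workflow exports the variables under the same names shown in the table; the helper name `read_int` is hypothetical.

```python
import os
from typing import Mapping, Optional


def read_int(name: str, fallback: int,
             env: Optional[Mapping[str, str]] = None) -> int:
    """Return the integer value of an environment variable, or the fallback.

    Empty or whitespace-only values count as unset.
    """
    env = os.environ if env is None else env
    raw = (env.get(name) or "").strip()
    return int(raw) if raw else fallback


# Fallbacks taken from the table above.
commits_limit = read_int("PR_BASE_FRESHNESS_COMMITS_LIMIT", 150)
age_limit_days = read_int("PR_BASE_FRESHNESS_AGE_LIMIT_DAYS", 10)
```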

Step 3: Turn on enforcement (two actions, both required)

(a) Set the enforcement variable in the same Repository variables page:

PR_BASE_FRESHNESS_ENFORCE = true

This flips the workflow exit code from 0 (warn) to 1 (error) on stale PRs.

(b) Add the check to branch protection for main:

Settings → Rules → Rulesets → the ruleset targeting main → under Require status checks to pass before merging, add:

PR Base Freshness

(The check must have run at least once before it shows up in the picker; any PR opened after this one merges will suffice.)

Both (a) and (b) are required. Without (a), the check stays green even on stale PRs. Without (b), a red check is informational only
and does not block merge. Only the combination produces a real merge block.
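The script side of (a) can be pictured roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, and treating only the literal string `true` (case-insensitive) as enabling enforcement is a guess at the parsing rule. The `::notice::`, `::warning::`, and `::error::` prefixes are standard GitHub Actions workflow commands.

```python
def parse_enforce(raw):
    """Interpret the enforcement variable; anything other than 'true' means warn-only."""
    return (raw or "").strip().lower() == "true"


def exit_code(stale: bool, enforce: bool) -> int:
    """Exit 1 only when the PR is stale AND enforcement is on; otherwise exit 0."""
    return 1 if (stale and enforce) else 0


def annotation(stale: bool, enforce: bool, message: str) -> str:
    """Pick the workflow-command level that matches the outcome."""
    if not stale:
        return f"::notice::{message}"
    level = "error" if enforce else "warning"
    return f"::{level}::{message}"
```

Even with exit code 1, the merge is blocked only if the check is also listed as a required status check, which is why (b) is needed as well.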

Rollback

If enforcement causes unforeseen problems, set PR_BASE_FRESHNESS_ENFORCE=false (or delete the variable) on the same Variables page.
The change takes effect on the next PR event — no PR, deploy, or revert needed.

@crazydemo
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #45387 [ run ] triggered by Bot. Commit: d41b039 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #45387 [ run ] completed with state FAILURE. Commit: d41b039
/LLM/main/L0_MergeRequest_PR pipeline #35628 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@crazydemo
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #45404 [ run ] triggered by Bot. Commit: e5e2575 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #45404 [ run ] completed with state SUCCESS. Commit: e5e2575
/LLM/main/L0_MergeRequest_PR pipeline #35642 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@crazydemo
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #45470 [ run ] triggered by Bot. Commit: 68034f5 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #45470 [ run ] completed with state SUCCESS. Commit: 68034f5
/LLM/main/L0_MergeRequest_PR pipeline #35702 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

Collaborator

@QiJune QiJune left a comment

LGTM

@crazydemo
Collaborator Author

/bot help

@github-actions

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provide a user friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

Details

run [--reuse-test (optional)pipeline-id --disable-fail-fast --skip-test --stage-list "A10-PyTorch-1, xxx" --gpu-type "A30, H100_PCIe" --test-backend "pytorch, cpp" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" --detailed-log --debug(experimental) --high-priority]

Launch build/test pipelines. All previously running jobs will be killed.

--reuse-test (optional)pipeline-id (OPTIONAL) : Allow the new pipeline to reuse build artifacts and skip successful test stages from a specified pipeline, or from the last pipeline if no pipeline-id is given. If the Git commit ID has changed, this option is always ignored. The DEFAULT behavior of the bot is to reuse build artifacts and successful test results from the last pipeline.

--disable-reuse-test (OPTIONAL) : Explicitly prevent the pipeline from reusing build artifacts and skipping successful test stages from a previous pipeline. Ensure that all builds and tests are run regardless of previous successes.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-PyTorch-1, xxx" (OPTIONAL) : Only run the specified test stages. Supports wildcard * for pattern matching (e.g., "*PerfSanity*" matches all stages containing PerfSanity). Examples: "A10-PyTorch-1, xxx", "PerfSanity". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--test-backend "pytorch, cpp" (OPTIONAL) : Skip test stages that don't match the specified backends. Only [pytorch, cpp, tensorrt, triton] are supported. Examples: "pytorch, cpp" (does not run test stages with tensorrt or triton backend). Note: Does NOT update GitHub pipeline status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests in addition to running L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Supports wildcard * for pattern matching. Examples: --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx", --extra-stage "Post-Merge".

--detailed-log (OPTIONAL) : Enable flushing out all logs to the Jenkins console. This will significantly increase the log volume and may slow down the job.

--debug (OPTIONAL) : Experimental feature. Enable access to the CI container for debugging purpose. Note: Specify exactly one stage in the stage-list parameter to access the appropriate container environment. Note: Does NOT update GitHub check status.

--high-priority (OPTIONAL) : Run the pipeline with high priority. This option is restricted to authorized users only and will route the job to a high-priority queue.

kill

kill

Kill all running builds associated with pull request.

skip

skip --comment COMMENT

Skip testing for latest commit on pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

Add a GitHub Actions workflow that detects PRs whose merge base is
too far behind the target branch and fails the check so the author
rebases before merging. This catches the "invisible conflict" class
of breakage: a PR can merge textually clean yet break main after
merge because semantic changes (renamed helper, tightened invariant,
replaced registration, etc.) landed on main while the PR was open.

The check is lightweight (pure git): compute merge-base, count
commits_behind, compute merge-base age, compare to thresholds.
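A minimal sketch of the comparison step, kept as a pure function so it can be reasoned about without a repository. Assumptions are labeled in the comments: the inputs would typically come from `git rev-list --count <merge-base>..<target>` and `git show -s --format=%ct <merge-base>`, and treating the PR as stale when either limit is exceeded (strictly greater than) is a guess at the rule, not confirmed by the commit message.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86400


@dataclass(frozen=True)
class Freshness:
    commits_behind: int
    age_days: float
    stale: bool


def evaluate_freshness(commits_behind: int, merge_base_ts: int, now_ts: int,
                       commits_limit: int = 150,
                       age_limit_days: int = 10) -> Freshness:
    """Compare the two staleness metrics against their thresholds.

    commits_behind: commits on the target branch that the merge base lacks.
    merge_base_ts:  committer timestamp (Unix seconds) of the merge base.
    now_ts:         current time (Unix seconds).
    """
    # Clamp at zero in case of clock skew between committer and runner.
    age_days = max(0.0, (now_ts - merge_base_ts) / SECONDS_PER_DAY)
    # ASSUMPTION: either metric over its limit marks the PR as stale.
    stale = commits_behind > commits_limit or age_days > age_limit_days
    return Freshness(commits_behind, age_days, stale)
```

The defaults mirror the fallbacks listed in this commit message (150 commits, 10 days).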

Configuration is via repo-level Actions variables so thresholds and
enforcement can be tuned without another PR:

  PR_BASE_FRESHNESS_COMMITS_LIMIT   (fallback: 150)
  PR_BASE_FRESHNESS_AGE_LIMIT_DAYS  (fallback: 10)
  PR_BASE_FRESHNESS_ENFORCE         (fallback: false, warn-only)

Implementation notes:
- Shallow (depth 500) + sparse checkout of the target branch
  materializes only the check script. Progressive deepening covers
  very old PRs without paying the cost up front.
- refs/pull/N/head is used to reach the PR head commit; no PR-side
  code is executed, so fork PRs are safe.
- auto_merge_enabled is an explicit trigger so a PR that sits in
  review long enough to go stale cannot auto-merge on a cached
  green result.
- Bootstrap-safe: if the check script is not yet present on the
  target branch (true for the PR that introduces this workflow),
  the step exits 0 with a notice instead of failing.
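The progressive deepening mentioned in the first note could look roughly like the sketch below. It is written in Python for illustration only; the actual workflow may do this in a shell step. The function names, the `origin`/`main` defaults, the step size, and the round limit are all assumptions. `git merge-base` and `git fetch --deepen=<n>` are standard git commands.

```python
import subprocess
from typing import Optional


def try_merge_base(head: str, target: str) -> Optional[str]:
    """Return the merge-base SHA, or None if git cannot find one in the current history."""
    result = subprocess.run(
        ["git", "merge-base", head, target], capture_output=True, text=True
    )
    sha = result.stdout.strip()
    return sha if result.returncode == 0 and sha else None


def deepen_until_merge_base(head: str, target: str, remote: str = "origin",
                            branch: str = "main", step: int = 500,
                            max_rounds: int = 6) -> Optional[str]:
    """Fetch more target-branch history in steps until a merge base is reachable."""
    for _ in range(max_rounds):
        sha = try_merge_base(head, target)
        if sha:
            return sha
        # Extend the shallow history by `step` more commits and retry.
        subprocess.run(
            ["git", "fetch", f"--deepen={step}", remote, branch],
            check=True, capture_output=True,
        )
    # One last attempt after the final deepen.
    return try_merge_base(head, target)
```

Recent PRs resolve on the first attempt and pay no extra fetch cost; only very old PRs trigger additional rounds.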

Rollout:
- Ships in warn-only mode; a stale PR logs ::warning:: and is not
  blocked.
- To enforce: set PR_BASE_FRESHNESS_ENFORCE='true' in repo variables
  AND add "PR Base Freshness" to main's required status checks under
  Settings -> Rules. Both steps are needed: exit 1 alone does not
  block merge without the required-check wiring.

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
@crazydemo
Collaborator Author

/bot run --skip-test

@tensorrt-cicd
Collaborator

PR_Github #45635 [ run ] triggered by Bot. Commit: ac243ec Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #45635 [ run ] completed with state SUCCESS. Commit: ac243ec
/LLM/main/L0_MergeRequest_PR pipeline #35848 (Partly Tested) completed with status: 'SUCCESS'

CI Report

Link to invocation

@crazydemo
Collaborator Author

/bot reuse-pipeline

@tensorrt-cicd
Collaborator

PR_Github #45662 [ reuse-pipeline ] triggered by Bot. Commit: ac243ec Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #45662 [ reuse-pipeline ] completed with state SUCCESS. Commit: ac243ec
Reusing PR_Github #45635 (Partly Tested) for commit ac243ec

Link to invocation

Collaborator

@niukuo niukuo left a comment

LGTM. Will add PR_BASE_FRESHNESS_COMMITS_LIMIT to variables on demand

@crazydemo crazydemo merged commit 1e8640c into NVIDIA:main Apr 28, 2026
6 checks passed