ci(babysitter): switch from check_run to workflow_run trigger #15627

Merged
pzelasko merged 1 commit into main from fix-pr-babysitter-1
Apr 21, 2026

@pzelasko pzelasko commented Apr 21, 2026

Important

The Update branch button must only be pressed on very rare occasions.
An outdated branch never blocks the merge of a PR.
Please reach out to the automation team before pressing that button.

What does this PR do?

The PR Babysitter never investigated real CI failures. Check runs produced by GHA jobs authenticated with the default `GITHUB_TOKEN` do not fire `check_run` events on downstream workflows (GitHub's recursion guard). As a result, `check-label-for-ci` was skipped on every CI failure it was meant to handle. This was verified on PR #15626, where all 31 recent `check_run`-triggered runs skipped every job despite multiple failing checks (Nemo_CICD_Test, Nemo_Linting_Test, etc.).

Replace the `check_run: [completed]` trigger with an explicit `workflow_run: [completed]` list covering the CI workflows that can fail on a PR (CICD NeMo, PyLint/flake8, wheel build, __init__ check, copyright, CI-Install-Check, CodeQL, secrets detector). Intentionally omitted: "Isort and Black Formatting" (auto-pushes fixes), the babysitter itself, and labeler/relabel bots.
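
As a rough illustration of the trigger change described above, the fragment below shows the general shape of a `workflow_run` trigger. It is a sketch, not the merged workflow file: the workflow names are copied from the informal list in this description and are assumptions, and each entry would have to match the `name:` of the corresponding workflow exactly.

```yaml
# Fragment of a workflow file (the `on:` block only); not the actual babysitter workflow.
# Previous trigger, which did not fire for check runs created with the default GITHUB_TOKEN:
#   on:
#     check_run:
#       types: [completed]
on:
  workflow_run:
    types: [completed]
    # Assumed names; replace each with the exact `name:` of the CI workflow.
    workflows:
      - "CICD NeMo"
      - "PyLint/flake8"      # hypothetical name
      - "CI-Install-Check"
      - "CodeQL"
      # ...remaining CI workflows that can fail on a PR.
      # "Isort and Black Formatting" is deliberately not listed,
      # so its runs never trigger this workflow.
```

Note that GitHub only fires `workflow_run` for a workflow file that exists on the repository's default branch, so a change like this takes effect only after merge.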

Update `check-label-for-ci`, the investigate prompt, and `ping-author-on-failure` to read `github.event.workflow_run.*` instead of `github.event.check_run.*` (conclusion, pull_requests, head_sha, name). Drop the `reformat_with_isort_and_black` name filter; filtering is now done at the trigger level by not listing that workflow.
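
A minimal sketch of how a job might read the `workflow_run` payload fields named above. The workflow name, job name, step name, and environment variable names are hypothetical; only the `github.event.workflow_run.*` fields (conclusion, pull_requests, head_sha, name) come from this description.

```yaml
# Hypothetical minimal workflow; the real babysitter jobs do more than this.
name: workflow-run-fields-sketch
on:
  workflow_run:
    workflows: ["CICD NeMo"]   # assumed workflow name
    types: [completed]

jobs:
  show-workflow-run-fields:
    runs-on: ubuntu-latest
    # Only continue when the triggering workflow concluded with 'failure'.
    if: github.event.workflow_run.conclusion == 'failure'
    steps:
      - name: Print the fields read from the event payload
        env:
          WORKFLOW_NAME: ${{ github.event.workflow_run.name }}
          HEAD_SHA: ${{ github.event.workflow_run.head_sha }}
          # Evaluates to an empty string when the payload lists no pull request.
          PR_NUMBER: ${{ github.event.workflow_run.pull_requests[0].number }}
        run: |
          if [ -z "$PR_NUMBER" ]; then
            echo "No pull request attached to this run; nothing to investigate."
            exit 0
          fi
          echo "Workflow '$WORKFLOW_NAME' failed on PR #$PR_NUMBER at $HEAD_SHA"
```

Passing the event fields through `env` instead of interpolating them directly into the script keeps values such as workflow names from being interpreted as shell code.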

Semantics change: one investigation per failing workflow (not per failing check). The existing prompt already instructs Claude to look at all failing checks on the PR, so coverage is unchanged and the noise is lower.

Collection: CI

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs in various areas.

Additional Information

  • Related to # (issue)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@pzelasko pzelasko requested a review from chtruong814 April 21, 2026 13:27
@github-actions github-actions Bot added the CI label Apr 21, 2026
@pzelasko pzelasko merged commit fa59a8e into main Apr 21, 2026
44 of 45 checks passed
@pzelasko pzelasko deleted the fix-pr-babysitter-1 branch April 21, 2026 13:32