Skip to content

Fix wrong changes produced by style/repo. check bot#46371

Merged
ydshieh merged 3 commits into
mainfrom
fix_pr_comment_ci
Jun 3, 2026
Merged

Fix wrong changes produced by style/repo. check bot#46371
ydshieh merged 3 commits into
mainfrom
fix_pr_comment_ci

Conversation

@ydshieh
Copy link
Copy Markdown
Collaborator

@ydshieh ydshieh commented Jun 3, 2026

What does this PR do?

When running repo/style checks, some utility scripts (e.g. utils/check_copies.py, setup.py)
were copied from main into the PR directory before running checks. If those scripts had changed
on main after the PR branched off, they would show up in git diff as modified and get
incorrectly included in the bot's fix commit — a false positive.

This PR fixes this by detecting whether the PR itself modifies any of those scripts. If it does,
checks are skipped entirely and a comment is posted instead:

`setup.py` or some script files under the `utils/` directory are modified in this PR. Please run style/repo. checks/fixes locally.

If the PR doesn't touch those scripts, they are used as-is from the PR checkout (no longer copied
from `main`), so the git diff stays clean.

@HuggingFaceDocBuilderDev
Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

If the PR modifies any of the trusted util scripts that get copied from
main (setup.py, utils/check_*.py, etc.), skip the repo/style checks and
post a comment asking the author to run checks locally instead. This avoids
false diffs caused by those files differing between the PR branch and main.

The check uses github.paginate(pulls.listFiles) following the same pattern
as pr_slow_ci_suggestion.yml to avoid template injection and E2BIG issues.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the pr-repo-consistency-bot GitHub Actions workflow to avoid unsafe or incorrect “fix” commits by detecting whether trusted utility scripts (used for repo/style checks) were modified in the PR, and if so, skipping the automated checks and posting a message asking the contributor to run them locally.

Changes:

  • Add a PR-file-list API check to detect whether specific trusted scripts (e.g. utils/check_copies.py, setup.py) are modified in the PR.
  • Gate the script-copy/install/check/run + artifact upload steps on that detection, and expose the result as a job output.
  • Update the final PR comment logic to report when checks were skipped due to modified trusted scripts.
Comments suppressed due to low confidence (1)

.github/workflows/pr-repo-consistency-bot.yml:286

  • This workflow still risks including the copied trusted scripts (from main) in modified-files.txt/the fix commit when a PR branch is behind main. The new util_scripts_modified check only detects scripts modified in the PR, but it doesn’t prevent files like setup.py / utils/check_copies.py from showing up as modified solely because they were overwritten from main before git diff --name-only is computed. To match the PR description (“avoid… file not changed in PR but changed on main”), exclude these copied paths from the diff/artifact list (or otherwise ensure copied scripts don’t affect the collected modified-files set).
        if: steps.check_util_scripts.outputs.util_scripts_modified != 'true' && (steps.run_repo_checks.outputs.changes_detected == 'true' || steps.run_style_checks.outputs.changes_detected == 'true')
        run: |
          cd pr-repo
          mkdir -p ../artifact-staging
          git diff --name-only > ../artifact-staging/modified-files.txt

Comment on lines +384 to +386
if [ "$UTIL_SCRIPTS_MODIFIED" = 'true' ]; then
echo "final_comment=${MESSAGE_PREFIX}: some script files under the \`utils\` directory are modified in this PR. Please run style/repo. checks/fixes locally." >> $GITHUB_OUTPUT
elif [ "$CHANGES_DETECTED" = 'true' ]; then
Since we already skip checks when the PR modifies any of the util scripts,
the PR's versions of those files are safe to use directly — no need to
overwrite them from main. This also eliminates the false-diff issue where
copied files (differing from main) would appear in the git diff.

Also fix the PR comment to mention setup.py alongside utils/ scripts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ydshieh ydshieh requested a review from zucchini-nlp June 3, 2026 08:50
Copy link
Copy Markdown
Member

@zucchini-nlp zucchini-nlp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I will trust you in this! Left a single q, not sure if it was true before or we were running checkers from PR branch

Comment on lines +365 to +367
if [ "$UTIL_SCRIPTS_MODIFIED" = 'true' ]; then
echo "final_comment=${MESSAGE_PREFIX}: \`setup.py\` or some script files under the \`utils/\` directory are modified in this PR. Please run style/repo. checks/fixes locally." >> $GITHUB_OUTPUT
elif [ "$CHANGES_DETECTED" = 'true' ]; then
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so iiuc if the PR has modified repo-consistency checkers, we don't run the CI with those "new" files due to safety reasons?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we don't run it, at least not for now.

Before this PR, we copied the scripts from main branch and run the checks anyway. But you already see the issue of misleading reports. But more importantly, such "copy from main branch and run" isn't really reliable (things could go wrong anyway).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep. As long as we know they aren't being run in CI, makes sense. Then we can nudge users or test that the checks are passing locally ourselves

@ydshieh ydshieh merged commit 74464e8 into main Jun 3, 2026
32 of 33 checks passed
@ydshieh ydshieh deleted the fix_pr_comment_ci branch June 3, 2026 12:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants