ci(pre-commit): auth tflint via GITHUB_TOKEN + retry transient flakes (closes #564) #697
…closes #564)

The `Run pre-commit` step in `.github/workflows/pre-commit.yml` intermittently fails because `terraform_tflint`'s `tflint --init` downloads ruleset plugins from the GitHub Releases API anonymously, hitting the 60-requests-per-hour per-IP limit that the runner shares with every other workflow running on the same NAT egress. Most recently observed on PR #696 (https://github.com/LeanerCloud/CUDly/actions/runs/26411029351).

Two-part fix:

1. Expose secrets.GITHUB_TOKEN to the step. tflint reads GITHUB_TOKEN natively; with it set the limit jumps to 5000/hr per-token, which matters because the token is unique per workflow run, not per IP. This alone fixes the rate-limit class.
2. Wrap the step with nick-fields/retry@v3.0.2 (SHA-pinned per project policy) at 3 attempts with a 90s wait. The token fix eliminates the AWS-plugin rate-limit case; the retry handles the residual flakes (GitHub Releases availability blips, plugin download timeouts, etc.) so a transient failure no longer requires a manual rerun.

Acceptance criteria from #564:

- GITHUB_TOKEN exposed to the `Run pre-commit` step
- Three consecutive `pre-commit` runs in the same hour all complete without a tflint rate-limit failure (the retry-wrapper also catches the residual flakes the token cannot)
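As a rough illustration of the two-part fix, the wrapped step might look like the sketch below. The step name, the `SKIP` entry, and the retry parameters are taken from this PR; the 15-minute per-attempt timeout is the value visible in the later review diff, and the action SHA is the one quoted in the PR summary. The exact indentation and key order are assumptions, so treat this as a sketch and not as the committed file.

```yaml
# Sketch only: a single step, not the full workflow file.
- name: Run pre-commit
  # SHA as quoted in the PR summary for v3.0.2; verify before reuse.
  uses: nick-fields/retry@ce71cc2ab81d554ebbe88c79ab5975992d79ba08 # v3.0.2
  env:
    SKIP: terraform_validate
    # Part 1: lets `tflint --init` authenticate its GitHub Releases requests.
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  with:
    # Part 2: retry wrapper for residual transient failures.
    timeout_minutes: 15        # applies per attempt, not to the step as a whole
    max_attempts: 3
    retry_wait_seconds: 90
    command: pre-commit run --all-files
```

Keeping the token in the step-level `env:` block limits its exposure to the one step that needs it.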
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough
The workflow job-level timeout was raised to 35 minutes. The …
Changes: Pre-commit Retry and Token Configuration
Sequence Diagram
sequenceDiagram
    participant Runner
    participant RetryAction as nick-fields/retry
    participant PreCommit as pre-commit
    participant GitHubAPI as GitHub_API
    Runner->>RetryAction: start step (env: GITHUB_TOKEN, SKIP: terraform_validate)
    RetryAction->>PreCommit: run "pre-commit run --all-files" (command)
    PreCommit->>GitHubAPI: fetch plugin releases (uses GITHUB_TOKEN for auth)
    RetryAction->>RetryAction: on failure, wait 90s and retry (max 3 attempts, timeout 10m per attempt)
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 5 passed
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/pre-commit.yml:
- Around line 202-204: The retry configuration is inconsistent: you set retry
plugin params timeout_minutes: 15, max_attempts: 3, retry_wait_seconds: 90 while
the overall job has timeout-minutes: 15, so retries may never run; either reduce
timeout_minutes in the nick-fields/retry block to a value that allows up to 3
attempts within the job cap (e.g., ~4 minutes per attempt accounting for
retry_wait_seconds) or increase the job-level timeout-minutes to cover
max_attempts * timeout_minutes + retry waits; update the keys timeout_minutes,
max_attempts, and/or the job's timeout-minutes accordingly to ensure the retry
policy can actually execute all attempts.
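The constraint behind this comment can be written as a worked inequality. This is a sketch that assumes every attempt runs to its full timeout and that the wait applies only between attempts; the symbols $T_{\text{job}}$ (job-level `timeout-minutes`), $n$ (`max_attempts`), $t$ (`timeout_minutes`), and $w$ (`retry_wait_seconds`, expressed in minutes) are introduced here for illustration only.

```math
\begin{aligned}
T_{\text{job}} &\ge n \cdot t + (n - 1) \cdot w \\
n = 3,\ t = 15,\ w = 1.5 &: \quad 3 \cdot 15 + 2 \cdot 1.5 = 48 > 15 \\
n = 3,\ t = 4,\ w = 1.5 &: \quad 3 \cdot 4 + 2 \cdot 1.5 = 15
\end{aligned}
```

With roughly 4 minutes per attempt, the worst case lands exactly on the 15-minute job cap, which leaves no headroom for any other steps in the job.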
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 490f2372-a613-4999-9f33-df2e4c9085f9
📒 Files selected for processing (1)
.github/workflows/pre-commit.yml
Note: Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it. The agent generated fixes only for
Lines 199–205
           SKIP: terraform_validate
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
         with:
-          timeout_minutes: 15
+          timeout_minutes: 4
           max_attempts: 3
           retry_wait_seconds: 90
           command: pre-commit run --all-files
CR pointed out the retry policy was ineffective: `nick-fields/retry` uses `timeout_minutes` PER ATTEMPT (not total), so with `max_attempts: 3` and `timeout_minutes: 15` the worst case is ~48 min, but the job-level `timeout-minutes: 15` killed any retry before attempt 2 could start.

Fix: tune both timeouts so the retry budget fits inside the job budget with safety margin.

- per-attempt timeout: 15 -> 10 minutes (normal pre-commit runs take ~3-5 min, 10 is comfortable margin without inflating hang-detection latency)
- job timeout: 15 -> 35 minutes

worst case = 3 attempts * 10 min + 2 * 90 s retry waits = 33 min, fits in 35 min with margin

Avoids CR's suggested 50-minute job cap because that would also delay killswitch on a genuinely hung step. 35 min is the tightest value that still lets all 3 attempts complete.
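The tuned values from this commit can be sketched as follows. The job id `pre-commit` and the `runs-on` value are hypothetical placeholders that are not stated anywhere in this PR; only the two timeouts, the attempt count, the wait, the `env:` entries, and the command come from the conversation.

```yaml
jobs:
  pre-commit:                  # hypothetical job id
    runs-on: ubuntu-latest     # assumed runner, not stated in this PR
    # Job cap: worst case is 3 * 10 min + 2 * 90 s = 33 min, so 35 leaves margin.
    timeout-minutes: 35
    steps:
      - name: Run pre-commit
        uses: nick-fields/retry@ce71cc2ab81d554ebbe88c79ab5975992d79ba08 # v3.0.2
        env:
          SKIP: terraform_validate
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          timeout_minutes: 10    # per attempt; normal runs take ~3-5 min
          max_attempts: 3
          retry_wait_seconds: 90
          command: pre-commit run --all-files
```

Note the two different spellings: the job-level key is `timeout-minutes` (hyphen), while the retry action's input is `timeout_minutes` (underscore).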
@coderabbitai review
✅ Actions performed: Review triggered.
Summary

Fixes the recurring `Run pre-commit` CI failure (issue #564) caused by `terraform_tflint`'s `tflint --init` hitting the GitHub Releases API anonymously. Two-part fix:

1. Expose `secrets.GITHUB_TOKEN` to the `Run pre-commit` step. `tflint --init` reads `GITHUB_TOKEN` natively; with it set the rate limit jumps from 60/hr per shared runner IP to 5000/hr per workflow-run token. Eliminates the rate-limit failure class entirely.
2. Wrap with `nick-fields/retry@v3.0.2` (SHA-pinned at `ce71cc2ab81d554ebbe88c79ab5975992d79ba08` per project policy `feedback_sha_pin_current_major.md`): 3 attempts with 90s wait, 15-minute timeout. Defense-in-depth for the residual transient flakes (GitHub Releases availability blips, plugin download timeouts) that the token cannot fix.

Most recent observation of this failure: PR #696's pre-commit job. The actual code in that PR was clean; only this CI-infra hook tripped.

Diff

`.github/workflows/pre-commit.yml` (+21/-1):

- Adds `GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}` to the step's `env:` block.
- Changes `run:` to `uses: nick-fields/retry@<SHA> # v3.0.2` with `command:` carrying the previous shell line.

Test plan

- YAML validation (`python3 -c "import yaml; yaml.safe_load(...)"`)
- `Run pre-commit` job for this PR is the first end-to-end test (this commit is intentionally tiny and self-validating)

Cross-references

`feedback_sha_pin_current_major.md` (SHA-pinned action), `feedback_ci_tool_version_pin.md` (tflint version is already pinned elsewhere in the workflow)