fix: ignore GitHub API rate limit errors in update and health commands#23032
fix: ignore GitHub API rate limit errors in update and health commands#23032
Conversation
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/dd66f32d-59a3-47a0-9423-faa3bee24db3 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
|
@copilot if you hit github rate limit in the rate limit test, assume it is canceled. |
Already handled by commit 04eaf16. Both affected jobs now treat rate limit errors as cancelled (non-fatal):
|
There was a problem hiding this comment.
Pull request overview
Adjusts CLI behavior to treat GitHub API rate limiting as a transient condition so CI jobs don’t fail or produce unusable output when the API is temporarily throttling requests.
Changes:
- Add a shared
gitutil.IsRateLimitErrorhelper to detect common GitHub rate-limit error strings. - Update
updateworkflow updates to be non-fatal when all per-workflow failures are rate-limit related. - Update
healthto warn and exit cleanly on rate limit; in detailed--jsonmode, emit a zero-run JSON payload.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| pkg/gitutil/gitutil.go | Adds IsRateLimitError helper for consistent rate-limit detection. |
| pkg/gitutil/gitutil_test.go | Adds table-driven unit tests for IsRateLimitError and IsAuthError. |
| pkg/cli/update_workflows.go | Treats “all failed due to rate limiting” as non-fatal; adds helper to detect that condition. |
| pkg/cli/health_command.go | Handles rate-limit errors from fetchWorkflowRuns by warning and exiting cleanly; emits detailed JSON when applicable. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if config.JSONOutput && config.WorkflowName != "" { | ||
| // Emit an empty-run JSON structure so callers can still parse the output. | ||
| return displayDetailedHealth(nil, config) |
There was a problem hiding this comment.
In the rate-limit branch, RunHealth only emits JSON when config.WorkflowName != "" (detailed view). If --json is used without a workflow (summary mode), this path returns nil without writing any JSON to stdout, which breaks callers that rely on --json always producing parseable output. Consider emitting an empty HealthSummary via outputHealthJSON(...) when config.JSONOutput is true, regardless of whether a workflow name was provided (while still printing the warning to stderr).
| if config.JSONOutput && config.WorkflowName != "" { | |
| // Emit an empty-run JSON structure so callers can still parse the output. | |
| return displayDetailedHealth(nil, config) | |
| if config.JSONOutput { | |
| if config.WorkflowName != "" { | |
| // Emit an empty-run JSON structure for the specific workflow so callers can still parse the output. | |
| return displayDetailedHealth(nil, config) | |
| } | |
| // In summary mode, emit an empty HealthSummary so JSON output is always parseable. | |
| if err := outputHealthJSON(&HealthSummary{}); err != nil { | |
| return err | |
| } |
| if gitutil.IsRateLimitError(err.Error()) { | ||
| // Rate limiting is a transient infrastructure condition, not a code error. | ||
| // Warn and exit cleanly so CI jobs are not marked as failed. | ||
| fmt.Fprintln(os.Stderr, console.FormatWarningMessage("Skipping health check: GitHub API rate limit exceeded")) | ||
| if config.JSONOutput && config.WorkflowName != "" { | ||
| // Emit an empty-run JSON structure so callers can still parse the output. | ||
| return displayDetailedHealth(nil, config) | ||
| } |
There was a problem hiding this comment.
The new rate-limit handling in RunHealth is important for CI stability, but there isn’t a unit test covering the new behavior (rate-limit error => non-fatal exit, and in --json mode still emitting valid JSON). Consider adding a test that forces fetchWorkflowRuns/listWorkflowRunsWithPagination to return a rate-limit error (e.g., via dependency injection through a package-level function var) and asserts stdout contains valid JSON.
| if len(successfulUpdates) == 0 { | ||
| // If all failures were due to GitHub API rate limiting, treat as non-fatal. | ||
| // Rate limiting is a transient infrastructure condition, not a code error. | ||
| if len(failedUpdates) > 0 && allFailuresAreRateLimited(failedUpdates) { | ||
| fmt.Fprintln(os.Stderr, console.FormatWarningMessage("All workflow updates skipped due to GitHub API rate limiting")) | ||
| return nil | ||
| } | ||
| return errors.New("no workflows were successfully updated") | ||
| } | ||
|
|
||
| return nil | ||
| } | ||
|
|
||
| // allFailuresAreRateLimited returns true if every failed workflow update was caused | ||
| // by a GitHub API rate limit error. Used to distinguish transient rate-limiting | ||
| // (non-fatal) from genuine update failures (fatal). | ||
| func allFailuresAreRateLimited(failures []updateFailure) bool { | ||
| for _, f := range failures { | ||
| if !gitutil.IsRateLimitError(f.Error) { | ||
| return false | ||
| } | ||
| } | ||
| return true | ||
| } |
There was a problem hiding this comment.
UpdateWorkflows now returns nil when all workflow update failures are rate-limit related. Since this changes the command’s exit behavior, it would be good to add a focused unit test for allFailuresAreRateLimited (and/or the len(successfulUpdates)==0 branch) to lock in the intended semantics for “all rate-limited” vs “mixed failures”.
CI jobs for
updateandhealth-smoke-copilotwere failing with HTTP 403 "API rate limit exceeded" — a transient infrastructure condition that should not mark CI as broken.Root causes
updatejob:UpdateWorkflowspropagated a fatal error when all per-workflow failures were rate-limit related ("no workflows were successfully updated"), even though there was nothing actionable to fix.health-smoke-copilotjob:RunHealthsurfaced thefetchWorkflowRunsrate-limit error directly, causinggh aw health --jsonto exit non-zero and produce no JSON output (breaking downstreamjqparsing).Changes
pkg/gitutil/gitutil.go— AddedIsRateLimitError(errMsg string) booldetecting "API rate limit exceeded", "rate limit exceeded", and "secondary rate limit" patterns (case-insensitive).pkg/cli/update_workflows.go— AddedallFailuresAreRateLimitedhelper.UpdateWorkflowsnow returnsnilwith a warning instead of a fatal error when every workflow failure is rate-limit related.pkg/cli/health_command.go—RunHealthdetects rate-limit errors fromfetchWorkflowRuns, prints a warning, and in--jsonmode emits a zero-runWorkflowHealthstruct so callers can still parse the output cleanly.pkg/gitutil/gitutil_test.go— Table-driven tests forIsRateLimitErrorandIsAuthError.