Optimize GitHub API call volume to prevent secondary rate limits #992

@Trecek

Problem

During pipeline runs with heavy GitHub interaction (posting inline review-pr comments, responding via resolve-review, and PR operations), we've hit GitHub's secondary rate limits (403/429 responses). Issue #892 addressed token fallback and retry resilience, but the root cause — excessive API call volume — was never addressed.

Context: GitHub Secondary Rate Limits

Per GitHub docs:

  • 80 content-generating requests per minute (broader than just POST/PATCH/PUT/DELETE — includes any content-generating action)
  • 900 points/minute for REST, 2000 points/minute for GraphQL
  • Mutating requests cost 5 points each (REST POST/PATCH/PUT/DELETE and GraphQL mutations)
  • GET/HEAD/OPTIONS cost 1 point each; GraphQL queries (no mutations) cost 1 point
  • Recommended: wait at least 1 second between POST/PATCH/PUT/DELETE operations (not a hard rule, but violating increases secondary rate limit risk)
  • Requests should be serialized, not concurrent
  • Use conditional requests (ETags/If-Modified-Since) for repeated GETs — 304 responses don't count against primary limits (note: ETags only work on REST GET requests, NOT on GraphQL POST endpoints)
  • No more than 100 concurrent requests allowed
  • GraphQL aliases count as 1 request — a single GraphQL request with N mutation aliases costs 5 points total, not N×5. This makes alias-based batching extremely high-leverage.
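The point model above can be turned into a quick estimate. The following is a minimal sketch that uses only the costs listed in this section; the `POINTS` table and `total_points` helper are illustrative names, not part of the codebase.

```python
from typing import Iterable

# Point costs as summarized in the list above.
POINTS = {
    "rest_read": 1,         # REST GET/HEAD/OPTIONS
    "rest_write": 5,        # REST POST/PATCH/PUT/DELETE
    "graphql_query": 1,     # GraphQL request without mutations
    "graphql_mutation": 5,  # GraphQL request containing mutations
}


def total_points(requests: Iterable[str]) -> int:
    """Sum the point cost of a sequence of request kinds."""
    return sum(POINTS[kind] for kind in requests)


# 15 threads resolved one mutation at a time vs. one aliased request.
individual = total_points(["graphql_mutation"] * 15)
batched = total_points(["graphql_mutation"])
print(individual, batched)
```

Under these assumptions the individual approach costs 75 points and the aliased request costs 5, which is why alias batching ranks high in the priorities below.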

Audit Results: Skills by API Call Volume

A full audit of all skills' GitHub API usage was performed and independently validated against the codebase. Here are the findings:

Zero API Call Skills (no concern)

These skills read only local files (sessions.jsonl, Channel B JSONL, plan files, source code):

  • audit-bugs, audit-friction, audit-arch, audit-cohesion
  • audit-gaps-gap-plan, audit-gaps-gap-review, audit-gaps-synthesize

Low API Call Skills (minimal concern)

These skills make a small number of GitHub API calls (issue list/view) but are not rate-limit-sensitive:

  • dry-walkthrough: gh issue list for context gathering
  • implement-worktree, implement-worktree-no-merge: occasional gh for issue references
  • validate-audit: gh issue list --state all --search
  • diagnose-ci: gh api .../actions/runs/.../jobs + gh api .../actions/jobs/.../logs

High-Volume Skills (primary concern)

review-pr — ~10-50 API calls per PR

| Call | Count | Points |
| --- | --- | --- |
| gh api user -q .login | 1 | 1 |
| gh pr list --head | 1 | 1 |
| GraphQL reviewThreads(first:100) | 1 | 1 |
| gh pr diff | 1 | 1 |
| gh repo view | 1 | 1 |
| gh pr view (headRefName/baseRefName) | 1 | 1 |
| gh api compare/{base}...{head} | 1 | 1 |
| gh pr view --json files | 1 | 1 |
| POST /pulls/{N}/reviews (batch inline comments) | 1 | 5 |
| POST /pulls/{N}/comments (Tier 1 fallback, per-finding) | 0-N | 5 each |
| gh pr review --approve / --request-changes | 1 | 5 |

Current state: The primary path (Step 6) already uses the batch endpoint POST /pulls/{N}/reviews with a comments[] array. This is correct.

Problem: The Tier 1 fallback (triggered when the batch POST fails) posts each finding individually via POST /pulls/{N}/comments — each costing 5 points. With 20 findings, that's 100 points in rapid succession with zero delay between calls.

Fallback trigger analysis (validated against 57 session logs)

The Tier 1 fallback fires in ~9% of runs (5/57 sessions). Two distinct causes were observed:

Cause A (own-PR REQUEST_CHANGES 422, 3/5 sessions): The bot always reviews its own PRs (the standard deployment model), so GitHub rejects event: REQUEST_CHANGES with HTTP 422 "Review Cannot request changes on your own pull request". The SKILL.md instructs a retry with event: COMMENT, but in some cases the model falls through to Tier 1 anyway. Since own-PR review is the expected default, this trigger is not an edge case: the REQUEST_CHANGES → COMMENT retry path must be hardened so it reliably succeeds without falling through to Tier 1. (Future deployments may review other users' PRs, where REQUEST_CHANGES would be valid, so the event logic should remain flexible.)

Cause B (response-parsing false alarm, 2/5 sessions, BUG): The batch POST succeeds (HTTP 200, review created), but the model checks len(response.get("comments", [])) and gets 0. GitHub's POST /pulls/{N}/reviews response does NOT echo back the comments array; it returns the review object without inline comments. The model misreads "0 comments in response" as "0 comments posted" and unnecessarily fires Tier 1, creating duplicate comments on GitHub.

Conclusion: The fallback cannot be removed — Cause A is the standard operating mode (own-PR reviews). But once both fixes land (Cause B bug fix + hardened COMMENT retry for Cause A), the Tier 1 fallback should almost never fire. The VALID_LINE_RANGES filter in Step 4 is effective — no 422 from invalid line numbers was observed in any of the 57 sessions. The remaining latent risk is race conditions (PR head changes between diff fetch and review POST), which is not currently guarded.
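One way to harden the Cause A path is to make the event downgrade a deterministic step rather than a model decision. The sketch below assumes a hypothetical `post` callable that sends the review payload and returns `(status, body)`; the fake transport and its 422 body are illustrative only.

```python
from typing import Callable, List, Tuple

# Hypothetical transport: takes the review payload, returns (status, body).
PostReview = Callable[[dict], Tuple[int, dict]]


def post_review_with_event_retry(post: PostReview, payload: dict) -> Tuple[int, dict]:
    """Post a batch review; on 422 for REQUEST_CHANGES, retry once as COMMENT."""
    status, body = post(payload)
    if status == 422 and payload.get("event") == "REQUEST_CHANGES":
        # Same comments[] array, only the event changes, so the retry is
        # still a single batch request rather than per-finding POSTs.
        status, body = post({**payload, "event": "COMMENT"})
    return status, body


# Usage with a fake transport that mimics the own-PR rejection.
events_seen: List[str] = []


def fake_post(payload: dict) -> Tuple[int, dict]:
    events_seen.append(payload["event"])
    if payload["event"] == "REQUEST_CHANGES":
        return 422, {"message": "illustrative 422 body"}
    return 200, {"id": 1, "state": "COMMENTED"}


status, body = post_review_with_event_retry(
    fake_post, {"event": "REQUEST_CHANGES", "body": "summary", "comments": []}
)
print(status, events_seen)
```

A 422 caused by something other than the event (for example an invalid line) would fail again on the retry, so the caller still needs to handle a non-200 result.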

Fixes:

  1. Fix the response-parsing bug: remove the len(d.get("comments", [])) check. The batch endpoint's response never includes this field, so the check always evaluates to 0; the presence of inline comments must be verified separately (e.g., via a follow-up GET) or inferred by trusting the 200 response.
  2. If the batch fails for legitimate reasons, retry with a filtered payload (offending comment removed) before falling back to individual POSTs.
  3. If individual POSTs are still needed, add 1-second delays between them.
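For fix 1, a success check that does not depend on the comments field could look like the following. This is a sketch: `batch_review_posted` is a hypothetical helper, and it assumes the review object in the response carries an integer `id`.

```python
def batch_review_posted(status: int, body: dict) -> bool:
    """Decide success from the HTTP status and review id only.

    The response does not echo the comments[] array, so
    len(body.get("comments", [])) is 0 even when every comment was posted.
    """
    return status == 200 and isinstance(body.get("id"), int)


# A successful response as described above: review object, no comments key.
ok_body = {"id": 80, "state": "COMMENTED"}
old_check_count = len(ok_body.get("comments", []))  # the misleading signal
print(batch_review_posted(200, ok_body), old_check_count)
```

With this check, a 200 response no longer triggers Tier 1, which removes the duplicate-comment path from Cause B.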

resolve-review — ~15-80 API calls per PR (CONFIRMED)

| Call | Count | Points |
| --- | --- | --- |
| gh pr list --head | 1 | 1 |
| gh repo view | 1 | 1 |
| GET /pulls/{N}/comments --paginate | 1-3 | 1 each |
| GET /pulls/{N}/reviews --paginate | 1-3 | 1 each |
| GraphQL reviewThreads(first:100) | 1 | 1 |
| POST /pulls/{N}/comments/{id}/replies | N per finding | 5 each |
| GraphQL resolveReviewThread mutation | N per thread | 5 each |

Confirmed: Replies are posted one at a time (Step 6.5). Thread resolutions are individual GraphQL mutations (Step 6). No delays between any mutating calls. With 15 threads and 15 replies, that's ~150 points in rapid succession.

Fixes:

  1. Batch thread resolution via GraphQL aliases: mutation { t1: resolveReviewThread(...) t2: resolveReviewThread(...) } — one request = 5 points total (confirmed: aliases don't multiply point cost)
  2. Add 1-second delays between reply POSTs (no batch API exists for comment replies)
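The aliased mutation from fix 1 can be generated from a list of thread node IDs. The sketch below passes IDs as GraphQL variables rather than interpolating them into the document; `build_resolve_threads_mutation` is a hypothetical helper name and the sample IDs are placeholders.

```python
from typing import Dict, List, Tuple


def build_resolve_threads_mutation(thread_ids: List[str]) -> Tuple[str, Dict[str, str]]:
    """Build one mutation document with an alias per review thread."""
    if not thread_ids:
        raise ValueError("thread_ids must not be empty")
    count = len(thread_ids)
    var_defs = ", ".join(f"$id{i}: ID!" for i in range(count))
    fields = "\n".join(
        f"  t{i}: resolveReviewThread(input: {{threadId: $id{i}}}) "
        f"{{ thread {{ id isResolved }} }}"
        for i in range(count)
    )
    query = f"mutation({var_defs}) {{\n{fields}\n}}"
    variables = {f"id{i}": tid for i, tid in enumerate(thread_ids)}
    return query, variables


query, variables = build_resolve_threads_mutation(["THREAD_A", "THREAD_B"])
print(query)
```

The resulting document and variables could then be sent in one call, for example through `gh api graphql` with `-f query=...` and one `-f idN=...` per variable. An upper bound on aliases per request is not established here, so chunking large lists may be prudent.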

analyze-prs — ~5 calls per PR × N PRs (CONFIRMED)

| Call | Count | Points |
| --- | --- | --- |
| gh pr list --base | 1 | 1 |
| gh pr diff {N} | N per PR | 1 each |
| gh pr view {N} --json files | N per PR | 1 each |
| gh pr view {N} --json body | N per PR | 1 each |
| gh pr checks {N} | N per PR | 1 each |
| gh pr view {N} --json reviews | N per PR | 1 each |

Confirmed: Step 1 parallelizes 3 reads/PR in batches of 8. Step 1.5 adds 2 more reads per PR in a sequential shell loop. pr_gates.py:partition_prs already accepts pre-fetched checks_by_number and reviews_by_number dicts — the Python layer is ready for batched input.

Fix: Use GraphQL to batch multiple PR queries: query { pr1: pullRequest(number:1){...} pr2: pullRequest(number:2){...} } — one call for all PRs. This works directly with pr_gates.py's existing interface.
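A sketch of the aliased read query follows. In GitHub's GraphQL schema `pullRequest` is a field of `repository`, so the aliases are nested one level deeper than in the shorthand above. The helper names and the selected fields (`number`, `body`, `reviewDecision`) are illustrative; the exact shape that partition_prs expects is not shown in this issue and would need to be mapped separately.

```python
from typing import Dict, List


def build_pr_batch_query(numbers: List[int]) -> str:
    """Build one query that fetches several PRs through aliases."""
    fields = "\n".join(
        f"    pr{int(n)}: pullRequest(number: {int(n)}) "
        f"{{ number body reviewDecision }}"
        for n in numbers
    )
    return (
        "query($owner: String!, $name: String!) {\n"
        "  repository(owner: $owner, name: $name) {\n"
        f"{fields}\n"
        "  }\n"
        "}"
    )


def index_by_number(response: dict) -> Dict[int, dict]:
    """Re-key the aliased response by PR number, skipping missing PRs."""
    repo = response["data"]["repository"]
    return {pr["number"]: pr for pr in repo.values() if pr is not None}


pr_query = build_pr_batch_query([1, 2])
sample_response = {
    "data": {
        "repository": {
            "pr1": {"number": 1, "body": "a", "reviewDecision": "APPROVED"},
            "pr2": {"number": 2, "body": "b", "reviewDecision": None},
        }
    }
}
by_number = index_by_number(sample_response)
print(sorted(by_number))
```

Keying the result by PR number keeps it close to the checks_by_number / reviews_by_number style of input mentioned above.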

open-integration-pr — ~5-10 calls per collapsed PR

| Call | Count | Points |
| --- | --- | --- |
| gh pr view {N} --json body (Step 3) | N per PR | 1 each |
| gh pr close {N} --comment (Step 10) | N per PR | 5 each |

Fix: Batch PR body fetches via GraphQL aliases. Add 1-second delay between gh pr close calls.

Medium-Volume Skills

triage-issues — N+2 mutating calls

  • gh issue list (1 bulk read)
  • gh label create --force (2 POSTs, idempotent)
  • gh issue edit {N} --add-label (1 PATCH per issue, no delay, 5 pts each)

enrich-issues — 2 mutating calls per issue

  • gh issue view N --json body + gh issue edit N --body-file per enriched issue

collapse-issues — 3 mutating calls per original

  • gh issue create + gh issue comment {orig} + gh issue close {orig} per collapsed original

issue-splitter — 3-5 mutating calls per split

  • gh issue create per sub-issue + gh issue edit --add-label "split" + gh issue comment

prepare-issue — 2-3 mutating calls

  • gh issue create or gh issue edit + 2-3 gh issue edit --add-label calls

Server Tools (Python — mutating call sites)

  • tools_pr_ops.py:bulk_close_issues — async _close_issues_sequentially fires sequential gh issue close with no delay between awaits
  • execution/github.py:DefaultGitHubFetcher — all mutating methods (create_issue, add_comment, add_labels, remove_label, ensure_label) are async and fire immediately with no rate-limit awareness
  • tools_issue_lifecycle.py:claim_issue — 3 async API calls per claim (fetch + ensure_label + add_labels). The ensure_label call could be session-cached since label existence is stable within a run (reduces to 2 calls after first invocation)

Cross-Cutting Finding: Zero Rate-Limit Awareness

No code in the entire codebase reads or acts on X-RateLimit-Remaining, X-RateLimit-Reset, or Retry-After headers. The only backoff mechanism is _jittered_sleep in ci.py, which is a CI-polling delay — not a rate-limit response. No ETags or conditional requests are used anywhere.
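Acting on these headers does not require much code. The following sketch computes a wait time from a response's headers; `rate_limit_wait_seconds` is a hypothetical helper, and it assumes Retry-After carries a number of seconds and X-RateLimit-Reset carries a UTC epoch timestamp.

```python
import time
from typing import Mapping, Optional


def rate_limit_wait_seconds(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how many seconds to wait before the next request (0 if none)."""
    now = time.time() if now is None else now
    lowered = {key.lower(): value for key, value in headers.items()}

    # An explicit Retry-After takes precedence.
    retry_after = lowered.get("retry-after")
    if retry_after is not None:
        return max(0.0, float(retry_after))

    # Budget exhausted: wait until the reset timestamp.
    if lowered.get("x-ratelimit-remaining") == "0" and "x-ratelimit-reset" in lowered:
        return max(0.0, float(lowered["x-ratelimit-reset"]) - now)

    return 0.0


print(rate_limit_wait_seconds({"Retry-After": "30"}))
print(rate_limit_wait_seconds(
    {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1060"}, now=1000.0
))
```

This only covers the header-driven cases. A 403/429 that carries neither header would still need a fallback delay with backoff, which GitHub's guidance puts at a minute or more.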

Implementation Safety Assessment

All proposed changes have been validated for safety:

  • Timeout risk: NONE. Hard session timeout is 7200s (2 hours). Stale/idle thresholds (1200s/600s) only fire on output silence. Adding 20-60 seconds of API delays is negligible. No recipe defines per-skill budgets.
  • Async compatibility: CONFIRMED. All server tool functions (bulk_close_issues, DefaultGitHubFetcher methods, claim_issue) are already async. Adding await asyncio.sleep(1) requires no signature changes.
  • Test impact: MINIMAL. Existing tests mock _run_subprocess with AsyncMock and don't assert timing. Tests will need asyncio.sleep patched to avoid slowdown, but won't break.
  • ETag limitation: GraphQL is POST-only. ETags only work on REST GET requests. merge_queue.py's GraphQL polling cannot benefit from conditional requests. Only ci.py REST GET polling is a candidate for ETags.
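The test-impact point can be illustrated with a small, self-contained example. `close_sequentially` is a hypothetical stand-in for a loop such as _close_issues_sequentially, not the real function; the test replaces asyncio.sleep with an AsyncMock so the delay is recorded but never actually waited.

```python
import asyncio
from unittest.mock import AsyncMock, patch


async def close_sequentially(numbers, close_one):
    """Hypothetical close loop with a 1-second gap between mutating calls."""
    for index, number in enumerate(numbers):
        if index > 0:
            await asyncio.sleep(1)  # no delay before the first call
        await close_one(number)


def run_without_real_delay():
    close_one = AsyncMock()
    # Patch the attribute that the loop looks up at call time.
    with patch("asyncio.sleep", new=AsyncMock()) as fake_sleep:
        asyncio.run(close_sequentially([101, 102, 103], close_one))
    return close_one.await_count, fake_sleep.await_count


print(run_without_real_delay())
```

Three closes produce two sleeps, and the test finishes immediately. If the production code imports sleep differently (for example `from asyncio import sleep`), the patch target would need to point at that module's name instead.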

Optimization Priorities

P0 — Fix response-parsing bug + harden own-PR retry (review-pr)

Three changes:

  1. Fix the response-parsing bug: Remove the len(d.get("comments", [])) check that causes false-alarm Tier 1 triggers and duplicate comments. Trust the HTTP 200 response from the batch endpoint.
  2. Harden the REQUEST_CHANGES → COMMENT retry: Since own-PR is the standard deployment, this retry path fires on every changes_requested verdict. Ensure it reliably completes without falling through to Tier 1.
  3. Throttle the fallback: If Tier 1 individual POSTs are still needed (future edge cases), add 1-second delays between them.
  • Impact: Eliminates ~40% of current fallback triggers (false alarms), prevents ~60% (hardened retry), throttles the remainder
  • Files: src/autoskillit/skills_extended/review-pr/SKILL.md

P1 — Batch thread resolution (resolve-review)

Use GraphQL mutation aliases to resolve multiple threads in one request.

  • Current: N individual resolveReviewThread mutations (5 pts each, no delay)
  • Target: 1 GraphQL request with N aliases (5 pts total)
  • Impact: Reduces N×5 pts to 5 pts per resolve cycle
  • Files: src/autoskillit/skills_extended/resolve-review/SKILL.md

P2 — Add delays between mutating calls (all skills + server tools)

Add 1-second delays between POST/PATCH/PUT/DELETE calls. All affected functions are already async — await asyncio.sleep(1) requires no interface changes.

  • Files:
    • src/autoskillit/server/tools_pr_ops.py (bulk_close_issues / _close_issues_sequentially)
    • src/autoskillit/execution/github.py — callers of mutating methods in DefaultGitHubFetcher
    • src/autoskillit/skills_extended/resolve-review/SKILL.md — reply POSTs (Step 6.5)
    • src/autoskillit/skills_extended/triage-issues/SKILL.md — label application
    • src/autoskillit/skills_extended/collapse-issues/SKILL.md — close+comment loops
    • src/autoskillit/skills_extended/issue-splitter/SKILL.md — create loops
    • src/autoskillit/skills_extended/enrich-issues/SKILL.md — edit loops
    • src/autoskillit/skills_extended/open-integration-pr/SKILL.md — close loops
  • Test note: Tests using AsyncMock for _run_subprocess should also mock asyncio.sleep to avoid slowdown
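For the Python call sites, a shared throttle may be easier to maintain than scattered sleep calls. The sketch below is one possible shape for the delay utility; `MutationThrottle` is a hypothetical name, and the clock and sleep functions are injectable so tests can run without waiting.

```python
import asyncio
import time
from typing import Optional


class MutationThrottle:
    """Keep at least `min_interval` seconds between mutating calls."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=asyncio.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None
        self._lock = asyncio.Lock()  # serializes concurrent callers

    async def wait(self) -> None:
        async with self._lock:
            now = self._clock()
            if self._last is not None:
                remaining = self._min_interval - (now - self._last)
                if remaining > 0:
                    await self._sleep(remaining)
                    now = self._clock()
            self._last = now


async def _demo():
    fake_now = [100.0]
    slept = []

    async def fake_sleep(seconds):
        slept.append(seconds)
        fake_now[0] += seconds

    throttle = MutationThrottle(clock=lambda: fake_now[0], sleep=fake_sleep)
    for _ in range(3):
        await throttle.wait()  # a mutating call would follow each wait
    return slept


demo_slept = asyncio.run(_demo())
print(demo_slept)
```

Because the lock serializes callers, this also lines up with the guidance that mutating requests should not run concurrently. The throttle is created inside the running event loop here, which avoids loop-binding issues on older Python versions.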

P3 — GraphQL multi-entity batching (analyze-prs, open-integration-pr)

Replace per-PR gh pr view loops with single GraphQL query using aliases.

  • Current: per-PR gh pr view calls issued sequentially (call count grows with the number of PRs, N)
  • Target: 1 GraphQL query with N aliases per batch
  • Impact: Reduces N calls to 1 per batch (up to ~50 entities per query)
  • Files:
    • src/autoskillit/skills_extended/analyze-prs/SKILL.md — Step 1.5 sequential loop
    • src/autoskillit/skills_extended/open-integration-pr/SKILL.md — Step 3 body fetches
  • Note: pipeline/pr_gates.py:partition_prs already accepts pre-fetched dicts — Python layer is ready

P4 — Pre-fetch entity lists + cache labels (triage/process/enrich + claim_issue)

Move gh issue list / gh pr list bulk fetches to pre-scan steps. Pass results via manifest files.

  • Already applied in audit-gaps recipe (pre-scan fetches all issues once)
  • Apply pattern to: triage-issues, process-issues, enrich-issues
  • Cache ensure_label results in DefaultGitHubFetcher (session-scoped set of (owner, repo, label) tuples) — reduces claim_issue from 3 API calls to 2 on all invocations after the first
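The session-scoped label cache could be as small as the following sketch. `LabelCache` and its `ensure_remote` parameter are hypothetical; in the real code the cache would presumably live inside DefaultGitHubFetcher and wrap its existing ensure_label call.

```python
import asyncio
from typing import Awaitable, Callable, List, Set, Tuple

LabelKey = Tuple[str, str, str]  # (owner, repo, label)


class LabelCache:
    """Remember which labels were already ensured during this session."""

    def __init__(self, ensure_remote: Callable[[str, str, str], Awaitable[None]]):
        self._ensure_remote = ensure_remote
        self._known: Set[LabelKey] = set()

    async def ensure_label(self, owner: str, repo: str, label: str) -> bool:
        """Return True if an API call was made, False on a cache hit."""
        key = (owner, repo, label)
        if key in self._known:
            return False
        await self._ensure_remote(owner, repo, label)
        self._known.add(key)  # cached only after the remote call succeeds
        return True


remote_calls: List[LabelKey] = []


async def fake_ensure(owner: str, repo: str, label: str) -> None:
    remote_calls.append((owner, repo, label))


async def _demo():
    cache = LabelCache(fake_ensure)
    return [await cache.ensure_label("o", "r", "in-progress") for _ in range(3)]


label_results = asyncio.run(_demo())
print(label_results, len(remote_calls))
```

Adding the key only after the remote call returns means a failed ensure is retried on the next claim rather than silently skipped.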

P5 — Conditional requests for REST GET polling (CI only)

Use ETags / If-Modified-Since for repeated REST GET calls when polling CI status.

  • 304 responses don't count against primary rate limits
  • Only viable for REST GET endpoints — GitHub does not support ETags on GraphQL POST requests
  • Files:
    • src/autoskillit/execution/ci.py — CI status polling (unconditional GETs today; _jittered_sleep pattern is reusable for the delay utility)
    • src/autoskillit/execution/merge_queue.py: NOT a candidate (GraphQL POST, ETags not supported)
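A conditional GET wrapper for the CI polling path might look like the sketch below. The `fetch` callable is a hypothetical transport returning `(status, response_headers, body)`; how ci.py actually issues its requests is not shown in this issue, so the wiring is an assumption.

```python
from typing import Callable, Dict, List, Mapping, Optional, Tuple

# Hypothetical transport: (url, request_headers) -> (status, response_headers, body)
Fetch = Callable[[str, Mapping[str, str]], Tuple[int, Mapping[str, str], Optional[dict]]]


class ConditionalGetCache:
    """Replay the last ETag via If-None-Match and reuse the body on 304."""

    def __init__(self, fetch: Fetch):
        self._fetch = fetch
        self._entries: Dict[str, Tuple[str, dict]] = {}  # url -> (etag, body)

    def get(self, url: str) -> Optional[dict]:
        headers: Dict[str, str] = {}
        cached = self._entries.get(url)
        if cached is not None:
            headers["If-None-Match"] = cached[0]
        status, response_headers, body = self._fetch(url, headers)
        if status == 304 and cached is not None:
            return cached[1]  # unchanged: serve the cached body
        etag = response_headers.get("ETag")
        if status == 200 and etag and body is not None:
            self._entries[url] = (etag, body)
        return body


requests_seen: List[dict] = []


def fake_fetch(url, headers):
    requests_seen.append(dict(headers))
    if headers.get("If-None-Match") == '"abc"':
        return 304, {}, None
    return 200, {"ETag": '"abc"'}, {"status": "completed"}


etag_cache = ConditionalGetCache(fake_fetch)
first = etag_cache.get("https://example.invalid/runs/1")
second = etag_cache.get("https://example.invalid/runs/1")
print(first == second, requests_seen[1])
```

A real transport may expose header names in a different case (for example `etag`), so the lookup would need to be case-insensitive there.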

Complete Affected Files List

| File | Priority | Change Type |
| --- | --- | --- |
| src/autoskillit/skills_extended/review-pr/SKILL.md | P0 | Fix response-parsing bug, throttle fallback |
| src/autoskillit/skills_extended/resolve-review/SKILL.md | P1 | Batch mutations via GraphQL aliases + reply delays |
| src/autoskillit/server/tools_pr_ops.py | P2 | Add asyncio.sleep(1) in _close_issues_sequentially |
| src/autoskillit/execution/github.py | P2 | Add delay utility; callers add delay between mutating calls |
| src/autoskillit/skills_extended/triage-issues/SKILL.md | P2/P4 | Label delays + pre-fetch pattern |
| src/autoskillit/skills_extended/collapse-issues/SKILL.md | P2 | Close delays |
| src/autoskillit/skills_extended/issue-splitter/SKILL.md | P2 | Create delays |
| src/autoskillit/skills_extended/enrich-issues/SKILL.md | P2/P4 | Edit delays + pre-fetch |
| src/autoskillit/skills_extended/open-integration-pr/SKILL.md | P3 | GraphQL batch body fetches + close delays |
| src/autoskillit/skills_extended/analyze-prs/SKILL.md | P3 | GraphQL batch Step 1.5 |
| src/autoskillit/pipeline/pr_gates.py | P3 | Already ready (accepts pre-fetched dicts) |
| src/autoskillit/server/tools_issue_lifecycle.py | P4 | Cache ensure_label per session |
| src/autoskillit/execution/ci.py | P5 | Add ETag/conditional GET for REST polling |

Post-Implementation: CLAUDE.md Rules

Once implemented, add the following to CLAUDE.md as §3.5 "GitHub API Call Discipline":

  • Batch inline review comments: Always use POST /pulls/{N}/reviews with a comments[] array. Never post individual comments via POST /pulls/{N}/comments — each is 5 rate-limit points. One batch POST = 5 points regardless of comment count.
  • Batch GraphQL mutations: Use GraphQL aliases to resolve multiple threads or fetch multiple entities in a single request. Never loop individual mutations.
  • Delay between mutating calls: Space POST/PATCH/PUT/DELETE requests at least 1 second apart to avoid secondary rate limits.
  • Pre-fetch entity lists: Use gh issue list/gh pr list with broad filters to get bulk data upfront. Pass results via manifest files — never have each subagent re-fetch the same list.
  • Use --json field selection: Always specify only the fields needed.
  • Prefer GraphQL for multi-entity reads: Replace per-entity gh pr view loops with a single gh api graphql query using aliases.

Labels: enhancement (New feature or request), human-review (Needs human review before implementation), staged (Implementation staged and waiting for promotion to main)
