fix(claude): graceful 429 rate limit handling with Retry-After support #378

Merged
robinebers merged 16 commits into robinebers:main from zergzorg:fix/claude-429-rate-limiting on Apr 18, 2026

Conversation

@zergzorg
Contributor

@zergzorg zergzorg commented Apr 13, 2026

Summary

Closes #376

  • Parse Retry-After header on 429 (supports both seconds and HTTP-date formats)
  • Show amber "Rate limited, retry in ~Xm" badge instead of throwing an error
  • ccusage data (Today / Yesterday / Last 30 Days) continues to display when rate limited
  • Add a Note line explaining that live usage data may be stale
  • Removed a fake retry loop that hammered the API 3× immediately on 429 with no actual delay (made rate limiting worse)

Changes

plugin.js

  • parseRetryAfterSeconds(headers) — parses Retry-After header
  • fetchUsageWithRetryAfter(ctx, accessToken) — single request, attaches _retryAfterSeconds on 429
  • probe() — on 429: sets rateLimited = true, shows Status badge + Note, skips throwing
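
For readers following the thread, here is a minimal sketch of what a `parseRetryAfterSeconds`-style helper could look like. It is not the code from this PR: the case-insensitive header lookup, the optional `nowMs` parameter, and the choice to clamp past HTTP-dates to 0 are all assumptions made for illustration. It accepts both forms of the header (delay-seconds and HTTP-date) and treats `0` as a valid value rather than as "missing".

```javascript
// Illustrative sketch only; not the plugin's actual implementation.
// Returns a non-negative number of seconds, or null when the header is absent or unparseable.
function parseRetryAfterSeconds(headers, nowMs = Date.now()) {
  if (!headers) return null
  // Header names are case-insensitive, so do not assume a particular casing.
  const key = Object.keys(headers).find((k) => k.toLowerCase() === "retry-after")
  if (key === undefined) return null
  const raw = String(headers[key] ?? "").trim()
  if (raw === "") return null

  // Form 1: delay-seconds, a non-negative integer. 0 means "retry now".
  if (/^\d+$/.test(raw)) return parseInt(raw, 10)

  // Form 2: HTTP-date. Convert it to a delay relative to nowMs.
  const dateMs = Date.parse(raw)
  if (Number.isNaN(dateMs)) return null
  // A date in the past is clamped to 0 (assumption: treat it as "retry now").
  return Math.max(0, Math.ceil((dateMs - nowMs) / 1000))
}
```

With this sketch, `parseRetryAfterSeconds({ "Retry-After": "120" })` returns `120` and `parseRetryAfterSeconds({ "retry-after": "0" })` returns `0`. Passing `nowMs` explicitly keeps the HTTP-date branch testable without depending on the real clock.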

plugin.test.js — 6 new tests covering:

  • 429 shows badge without throwing
  • Retry-After seconds parsed correctly
  • Fallback message when Retry-After missing
  • HTTP-date Retry-After format
  • Rate limited status with ccusage data still present
  • Generic "try again later" message

Test plan

  • Verify badge appears with correct wait time when API returns 429 + Retry-After
  • Verify ccusage rows (Today/Yesterday/Last 30 Days) still render during rate limit
  • Verify no error thrown on 429
  • Verify normal flow unaffected when API returns 200

🤖 Generated with Claude Code


Summary by cubic

Gracefully handle Claude API 429s by honoring Retry-After, backing off between probes, and showing a wait badge instead of throwing. Adds a 5‑minute minimum usage fetch interval with a bypass right after rate‑limit windows and caches the last usage so data stays visible.

  • Bug Fixes
    • Parse Retry-After as seconds or HTTP‑date; allow 0 (“retry now”); use Math.ceil and strict null checks; parse immediately after the 429 check.
    • On 429: don’t throw; record a backoff (Retry-After or 5m default); skip API calls until it expires; show amber “Rate limited, retry in ~Xm/~now” badge + Note; keep cached usage/plan data.
    • Polling guard: never hit the usage API more than once per 5 minutes; reuse cached response when the interval hasn’t elapsed; bypass the min‑interval right after a rate limit so short Retry-After windows aren’t swallowed.
    • Tests: add _resetState, use fake timers, stub ctx.util.requestJson, and cover seconds/HTTP‑date/0, missing header, no‑calls‑during‑backoff, resume‑after‑expiry with bypass, min‑interval skip, cached‑data during limits; tighten the resume‑after‑rate‑limit assertion to check for the amber badge correctly.
    • Settings: stop Base UI Checkbox click bubbling and restore onCheckedChange so each toggle saves exactly once; re‑query the checkbox between clicks and use pre‑normalised settings in tests to avoid an init save.

Written for commit a80dd86. Summary will update on new commits.

- Parse Retry-After header (seconds and HTTP-date formats)
- Show amber "Rate limited, retry in ~Xm" badge instead of throwing
- Continue displaying ccusage data (Today/Yesterday/Last 30 Days) when rate limited
- Add Note line explaining live data may be stale
- Remove fake retry loop that hammered API 3x immediately on 429

Closes robinebers#376

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@augmentcode

augmentcode Bot commented Apr 13, 2026

🤖 Augment PR Summary

Summary: Improves the Claude plugin’s handling of API rate limits so usage output degrades gracefully instead of erroring.

Changes:

  • Adds parsing of the Retry-After header (seconds or HTTP-date) and threads the retry delay through the usage fetch path.
  • Updates probe() to treat HTTP 429 as a non-fatal condition, showing an amber “Rate limited” status badge and an explanatory Note.
  • Keeps local ccusage-based aggregates (Today / Yesterday / Last 30 Days) visible even when live usage can’t be fetched.
  • Removes an immediate retry loop that previously re-hit the API on 429 without waiting.
  • Adds unit tests covering 429 behavior, Retry-After parsing, and fallback messaging.


@augmentcode augmentcode Bot left a comment

Review completed. 3 suggestions posted.

Comment thread plugins/claude/plugin.js Outdated
if (!str) return null
// Retry-After can be a delay-seconds or HTTP-date
const seconds = parseInt(str, 10)
if (Number.isFinite(seconds) && seconds > 0) return seconds

@augmentcode augmentcode Bot Apr 13, 2026

plugins/claude/plugin.js:433 — Retry-After allows a 0 delay (immediate retry), but seconds > 0 plus later truthy checks treat 0 as missing and will fall back to the generic “try again later” messaging instead of honoring the header.

Severity: medium

Other Locations
  • plugins/claude/plugin.js:446
  • plugins/claude/plugin.js:699

Comment thread plugins/claude/plugin.js Outdated
if (lines.length === 0) {
if (rateLimited) {
const waitText = retryAfterSeconds
? "Rate limited, retry in ~" + Math.round(retryAfterSeconds / 60) + "m"

@augmentcode augmentcode Bot Apr 13, 2026

plugins/claude/plugin.js:833 — Using Math.round(retryAfterSeconds / 60) can render ~0m for small Retry-After values (<30s) and can round down/up in a way that’s confusing for users (e.g., 89s -> 1m).

Severity: low
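
To make the rounding concern concrete, here is a small sketch of a wait-text formatter that rounds up and checks for `null` strictly. The function name `formatRetryText` and the exact fallback wording are assumptions for illustration, not strings taken from the plugin.

```javascript
// Illustrative sketch only; formatRetryText is a hypothetical name.
function formatRetryText(retryAfterSeconds) {
  // Strict null check: 0 is a valid delay and must not fall through to the generic text.
  if (retryAfterSeconds === null || retryAfterSeconds === undefined) {
    return "Rate limited, try again later"
  }
  if (retryAfterSeconds <= 0) return "Rate limited, retry in ~now"
  // Math.ceil never shows "~0m" for a positive delay and never under-reports the wait.
  return "Rate limited, retry in ~" + Math.ceil(retryAfterSeconds / 60) + "m"
}
```

Under this sketch a 20 s delay renders as `~1m` and an 89 s delay as `~2m`, whereas `Math.round` would give `~0m` and `~1m` respectively.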

Comment thread plugins/claude/plugin.test.js Outdated
const ctx = makeCtx()
ctx.host.fs.readText = () => JSON.stringify({ claudeAiOauth: { accessToken: "token" } })
ctx.host.fs.exists = () => true
const futureDate = new Date(Date.now() + 15 * 60 * 1000).toUTCString()

@augmentcode augmentcode Bot Apr 13, 2026

plugins/claude/plugin.test.js:1786 — This test derives futureDate from Date.now() without fake timers, so it can become flaky if enough real time elapses between computing the header and probe() (the badge may show 14m instead of 15m).

Severity: low

Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

1 issue found across 2 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="plugins/claude/plugin.js">

<violation number="1" location="plugins/claude/plugin.js:833">
P2: Retry-After text can incorrectly show `~0m` for valid short delays (<30s), misleading users about wait time.</violation>
</file>

Comment thread plugins/claude/plugin.js Outdated
zergzorg and others added 2 commits April 14, 2026 00:15
- Allow Retry-After: 0 (RFC 7231: retry immediately), show '~now'
- Use Math.ceil instead of Math.round to avoid '~0m' display
- Use strict null checks (?? and !== null) for Retry-After: 0
- Fix flaky HTTP-date test: use fixed date + regex assertion
- Add test for Retry-After: 0 case

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
…w' case

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
@zergzorg
Contributor Author

@augmentcode[bot] @cubic-dev-ai please re-review — issues #1 (Retry-After:0), #2 (Math.round), #3 (flaky test) are resolved in commits 7f9c0a9 and 2877eb2. Please dismiss stale review comments.

@cubic-dev-ai
Contributor

cubic-dev-ai Bot commented Apr 13, 2026

@zergzorg I have started the AI code review. It will take a few minutes to complete.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

No issues found across 2 files

@zergzorg
Contributor Author

@robinebers @validatedev @davidarny

CI blocks merge, but this is a pre-existing issue — not from this PR.

App.test.tsx:785: expected savePluginSettings to be called 2 times, but got 3 times — flaky mock count assertion.

Evidence:

  • CI on main also fails (run 24339426410 — 'feat: add Fireworks AI plugin')
  • cubic-dev-ai re-review: SUCCESS
  • All 7 rate-limiting tests in this PR: PASS

@robinebers safe to merge (CI unrelated) or fix the flaky test separately.

@validatedev validatedev requested review from Copilot April 14, 2026 12:28
@validatedev
Collaborator

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Already looking forward to the next diff.

Contributor

Copilot AI left a comment

Pull request overview

Improves the Claude plugin’s behavior under API rate limiting by handling HTTP 429 responses gracefully (including Retry-After parsing) so the UI continues to show locally-derived ccusage data instead of erroring.

Changes:

  • Add Retry-After parsing (seconds and HTTP-date) and formatting for a user-facing “retry in ~Xm” message.
  • On 429 responses, avoid throwing; show an amber Status badge + Note while continuing to render ccusage (Today/Yesterday/Last 30 Days).
  • Add/adjust tests to cover 429 handling and Retry-After parsing scenarios.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| plugins/claude/plugin.js | Adds Retry-After parsing and new 429 handling path that displays a status badge + note instead of throwing. |
| plugins/claude/plugin.test.js | Adds test coverage for 429 UI behavior and Retry-After parsing (seconds, missing header, zero, HTTP-date). |

Comment thread plugins/claude/plugin.test.js Outdated
Comment on lines +1802 to +1816
const ctx = makeCtx()
ctx.host.fs.readText = () => JSON.stringify({ claudeAiOauth: { accessToken: "token" } })
ctx.host.fs.exists = () => true
// Use a fixed HTTP-date to avoid flakiness
ctx.host.http.request.mockReturnValue({
status: 429,
bodyText: "",
headers: { "Retry-After": "Mon, 13 Apr 2026 12:30:00 GMT" },
})
const plugin = await loadPlugin()
const result = plugin.probe(ctx)
const noteLine = result.lines.find((line) => line.label === "Note")
expect(noteLine).toBeTruthy()
// Should show some minute value or "now" (depends on current time)
expect(noteLine.value).toMatch(/retry in ~(\d+m|now)/)
Comment thread plugins/claude/plugin.js Outdated
const retryAfter = parseRetryAfterSeconds(resp.headers)
if (retryAfter !== null) {
ctx.host.log.info("429 received, Retry-After: " + retryAfter + "s")
resp._retryAfterSeconds = retryAfter
- Remove fetchUsageWithRetryAfter helper that mutated the response
  object; call parseRetryAfterSeconds directly after the 429 check
- Use ?? instead of || when reading the Retry-After header to avoid
  treating a numeric 0 value as falsy
- Make the HTTP-date Retry-After test deterministic using vi fake
  timers instead of relying on the current clock

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@zergzorg
Contributor Author

CI failure is pre-existing on main

The failing test (src/App.test.tsx > App > toggles plugins in settings) is not caused by this PR. The same test fails on main with the same error in the Fireworks AI plugin PR merged before this one — see run 24339426410.

My changes only touch plugins/claude/plugin.js and plugins/claude/plugin.test.js.


Also addressed the bot review comments in the latest commit:

  • Removed fetchUsageWithRetryAfter helper that mutated the HTTP response object — parseRetryAfterSeconds is now called directly in the 429 handler
  • Changed `||` to `??` when reading the Retry-After header (safe for 0 values)
  • Made the HTTP-date test deterministic with vi.useFakeTimers() + a fixed future timestamp

The App normalises plugin settings on startup (adds newly-discovered
plugins to the stored order) which triggers an extra savePluginSettings
call before the test's clicks. Calling mockClear() after the settings
panel opens ensures the assertion counts only the two intentional
toggle saves.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions github-actions Bot added the core label Apr 14, 2026
zergzorg and others added 4 commits April 14, 2026 21:14
…ount

The previous fix called mockClear() before the async normalisation save
had fired, so the save still landed during the first-click assertion.
Now we waitFor the init save to complete, then clear, so the
toHaveBeenCalledTimes assertions count only the two toggle saves.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…le test

The beforeEach mock returns { order: ["a"], disabled: [] } which triggers a
normalisation save on startup because plugin "b" (not in DEFAULT_ENABLED_PLUGINS)
gets appended to order and disabled. Supplying the already-normalised form
{ order: ["a", "b"], disabled: ["b"] } makes arePluginSettingsEqual return true
so no init save fires, and the two toggle clicks produce exactly 2 saves.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Base UI's CheckboxRoot renders a visible <span> and dispatches a
synthetic PointerEvent('click') on a hidden <input> after each user
click. Both events bubble to the row <div onClick=onToggle>, causing
savePluginSettings to fire twice per click.

Wrap the Checkbox in a <span onClick=stopPropagation> so neither the
span click nor the hidden-input click reaches the row div. Restore
onCheckedChange on the Checkbox so the toggle still fires exactly once.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The Checkbox uses key={`${plugin.id}-${plugin.enabled}`} which causes a DOM
remount on every toggle. The previously cached element reference goes
stale after the first click, so the second userEvent.click was a no-op.
Re-query with findAllByRole before each click to always get the live node.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@zergzorg
Contributor Author

CI is now green 🟢

Turns out the toggles plugins in settings failure had three compounding causes, all pre-existing before this PR. Here's the full post-mortem in case it's useful:


Root cause 1 — settings.tsx double-fire (introduced in 57cc5bd)

Commit 57cc5bd ("clickable provider rows") moved the toggle from Checkbox.onCheckedChange to div.onClick so the entire row is clickable. The problem: Base UI's CheckboxRoot internally dispatches a synthetic PointerEvent('click') on its hidden <input> element after every user click (to keep the native input in sync). Both the visible <span> click and the hidden <input> click bubbled up to the row <div onClick={onToggle}>, firing onToggle (and therefore savePluginSettings) twice per click.

Fix: wrap <Checkbox> in a <span onClick={(e) => e.stopPropagation()}> so neither event reaches the row div. Restore onCheckedChange on the Checkbox to keep exactly one call per click. (src/pages/settings.tsx)


Root cause 2 — init normalisation save not accounted for in test

beforeEach seeds loadPluginSettings with { order: ["a"], disabled: [] }. On startup, normalizePluginSettings appends plugin "b" to order and to disabled (since "b" is not in DEFAULT_ENABLED_PLUGINS), producing { order: ["a","b"], disabled: ["b"] }. Because the stored and normalised forms differ, savePluginSettings fires once during bootstrap — before the test even clicks anything.

Fix: supply the already-normalised form { order: ["a","b"], disabled: ["b"] } directly in this test so arePluginSettingsEqual returns true and no init save fires. (src/App.test.tsx)
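
As a worked example of why the seeded settings triggered an init save, here is a sketch with stand-in versions of the helpers named above. The function names come from this thread, but their bodies, the `discoveredIds` parameter, and the value of `DEFAULT_ENABLED_PLUGINS` are assumptions made to reproduce the described behaviour, not the app's real code.

```javascript
// Stand-in implementations for illustration; the real helpers live in the app and may differ.
const DEFAULT_ENABLED_PLUGINS = ["a"] // assumed value for this example

function normalizePluginSettings(settings, discoveredIds) {
  const order = [...settings.order]
  const disabled = [...settings.disabled]
  for (const id of discoveredIds) {
    if (order.includes(id)) continue
    order.push(id)
    // Newly discovered plugins start disabled unless they are enabled by default.
    if (!DEFAULT_ENABLED_PLUGINS.includes(id) && !disabled.includes(id)) disabled.push(id)
  }
  return { order, disabled }
}

function arePluginSettingsEqual(a, b) {
  const same = (x, y) => x.length === y.length && x.every((v, i) => v === y[i])
  return same(a.order, b.order) && same(a.disabled, b.disabled)
}

// Seed used by beforeEach: differs from its normalised form, so an init save fires.
const stored = { order: ["a"], disabled: [] }
const normalised = normalizePluginSettings(stored, ["a", "b"])
const initSaveWithOldSeed = !arePluginSettingsEqual(stored, normalised)

// Seed used by the fixed test: already normalised, so no init save fires.
const preNormalised = { order: ["a", "b"], disabled: ["b"] }
const initSaveWithNewSeed = !arePluginSettingsEqual(
  preNormalised,
  normalizePluginSettings(preNormalised, ["a", "b"])
)
```

Here `initSaveWithOldSeed` is `true` and `initSaveWithNewSeed` is `false`, which matches the extra `savePluginSettings` call seen before the fix and its absence afterwards.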


Root cause 3 — stale DOM reference after Checkbox remount

<Checkbox key={`${plugin.id}-${plugin.enabled}`} .../> changes its key on every toggle, causing React to unmount the old node and mount a fresh one. The test was caching the checkbox reference before the first click, so the second userEvent.click(pluginCheckbox) was operating on a detached (stale) DOM element, which made it effectively a no-op.

Fix: re-query with screen.findAllByRole("checkbox") before each click so we always hold a live reference. (src/App.test.tsx)

Root cause: the plugin called the Anthropic usage API on every probe
invocation with no rate-guard, so at a short global refresh interval
(e.g. 10 s) it hammered the endpoint and got 429 every cycle.

Three-part fix (all state lives at module scope, survives re-invocations):

1. Minimum fetch interval — never call the usage API more than once per
   5 minutes regardless of the global auto-update setting.

2. Persistent rate-limit backoff — on a 429, record rateLimitedUntilMs
   (= now + Retry-After, or now + 5 min if the header is absent).
   Subsequent probe calls skip the API entirely until that timestamp
   passes, instead of retrying and hitting 429 again.

3. Response cache — the last successful API response is stored in
   cachedUsageData. Session/Weekly progress bars continue to render
   while the minimum interval or rate-limit window is active.

Together these mean: a user who polls every 10 s will make at most one
API call per 5 minutes, and a 429 causes at least a 5-minute pause.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

1 issue found across 2 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="plugins/claude/plugin.test.js">

<violation number="1" location="plugins/claude/plugin.test.js:1873">
P2: This retry test advances time only 90 seconds, but `probe()` still enforces the separate 5-minute fetch interval, so the second call will be skipped for the wrong reason.</violation>
</file>

Comment thread plugins/claude/plugin.test.js
Module-scope vars (rateLimitedUntilMs, lastUsageFetchMs, cachedUsageData)
persist across the single loadPlugin() call shared by all tests via
beforeAll. Without a reset, the min-fetch-interval check from test N
causes test N+1 to skip the API call entirely, breaking all subsequent
assertions.

Expose _resetState() on the plugin object (production host never calls
it) and call it in beforeEach so every test starts with clean state.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@cubic-dev-ai
Contributor

cubic-dev-ai Bot commented Apr 14, 2026

You're iterating quickly on this pull request. To help protect your rate limits, cubic has paused automatic reviews on new pushes for now—when you're ready for another review, comment @cubic-dev-ai review.

zergzorg and others added 4 commits April 14, 2026 22:33
When a 429 returned Retry-After shorter than MIN_USAGE_FETCH_INTERVAL_MS
(e.g. 60 s), the previous code fell through to the else-if branch that
checked the last-fetch timestamp.  Since lastUsageFetchMs was stamped at
the moment of the 429, the 5-minute poll-throttle was still active and
the retry after the rate-limit window expired was silently swallowed.

Track wasRateLimited before clearing rateLimitedUntilMs and skip the
min-interval guard when recovering from a rate limit, so the first probe
after any Retry-After window always reaches the API.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
beforeEach was used but not included in the named imports.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two issues:

1. "throws on http errors" had two probe() calls in the same test.
   After the first (500) throws, lastUsageFetchMs is set to now.
   The second probe call is made within milliseconds, so the
   min-interval guard skips the fetch and doesn't throw.
   Fix: call plugin._resetState() between the two assertions.

2. Rate-limiting tests that check toHaveBeenCalledTimes counted both
   usage API calls and Promoclock calls (which also go through
   ctx.host.http.request via ctx.util.requestJson).
   Fix: override ctx.util.requestJson in each affected test so
   Promoclock calls don't pollute the usage call count.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
After a successful 200 response with empty JSON, the plugin shows
"No usage data" status badge. Use a real usage body so the badge
does not appear, and tighten the assertion to check for the
rate-limit-specific amber badge rather than any Status badge.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@zergzorg
Contributor Author

Summary of changes

This PR fixes the root cause described in issue #377 — the Claude plugin was polling the Anthropic usage API too frequently (~every 10 s), consistently hitting the 429 rate limit on every refresh cycle.

Root cause fix

Added module-scope rate-limit state that persists across probe() calls:

  • MIN_USAGE_FETCH_INTERVAL_MS (5 min) — enforces a minimum interval between usage API calls, matching the reset cadence of the usage data itself. Intermediate probe() calls reuse the last cached response silently.
  • rateLimitedUntilMs — when a 429 is received, the next API call is blocked until the Retry-After window expires. If no Retry-After header is present, a 5-minute default backoff is applied.
  • cachedUsageData — last successful response is cached and shown to the user even while rate-limited, so the UI stays populated.

A key edge case is handled: when Retry-After is shorter than the 5-minute poll throttle (e.g. Retry-After: 60), the min-interval guard is bypassed on recovery so the plugin actually retries after the specified window instead of waiting the full 5 minutes.
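
The interaction between the three pieces of state and the bypass can be sketched roughly as follows. This is a simplified model, not the plugin's code: `getUsage`, `resetState`, the `fetchUsage` callback, the shape of its return value, and the use of `null` for "unset" are assumptions; only the three state variable names and `MIN_USAGE_FETCH_INTERVAL_MS` come from the description above.

```javascript
// Simplified model for illustration; names other than the three state variables
// and MIN_USAGE_FETCH_INTERVAL_MS are hypothetical.
const MIN_USAGE_FETCH_INTERVAL_MS = 5 * 60 * 1000
const DEFAULT_BACKOFF_MS = 5 * 60 * 1000

// Module-scope state, so it survives repeated calls.
let rateLimitedUntilMs = null
let lastUsageFetchMs = null
let cachedUsageData = null

// Test-only reset, in the spirit of the _resetState() hook mentioned in this thread.
function resetState() {
  rateLimitedUntilMs = null
  lastUsageFetchMs = null
  cachedUsageData = null
}

// fetchUsage is assumed to return { status, data, retryAfterSeconds }.
function getUsage(fetchUsage, nowMs) {
  // 1. Inside a rate-limit window: skip the API and serve the cache.
  if (rateLimitedUntilMs !== null && nowMs < rateLimitedUntilMs) {
    return { data: cachedUsageData, rateLimited: true, fetched: false }
  }
  // Remember that a window just expired before clearing it.
  const wasRateLimited = rateLimitedUntilMs !== null
  rateLimitedUntilMs = null

  // 2. Poll throttle, bypassed on the first call after a rate-limit window.
  const tooSoon =
    lastUsageFetchMs !== null && nowMs - lastUsageFetchMs < MIN_USAGE_FETCH_INTERVAL_MS
  if (tooSoon && !wasRateLimited) {
    return { data: cachedUsageData, rateLimited: false, fetched: false }
  }

  lastUsageFetchMs = nowMs
  const resp = fetchUsage()
  if (resp.status === 429) {
    // 3. Record the backoff: Retry-After if present (0 included), else the default.
    const hasRetryAfter = resp.retryAfterSeconds !== null && resp.retryAfterSeconds !== undefined
    const backoffMs = hasRetryAfter ? resp.retryAfterSeconds * 1000 : DEFAULT_BACKOFF_MS
    rateLimitedUntilMs = nowMs + backoffMs
    return { data: cachedUsageData, rateLimited: true, fetched: true }
  }
  cachedUsageData = resp.data
  return { data: cachedUsageData, rateLimited: false, fetched: true }
}
```

In this model, a 429 with `Retry-After: 60` blocks calls for 60 s, and the first call after that window reaches the API even though fewer than 5 minutes have passed since the last fetch. Without the `wasRateLimited` check, that call would be absorbed by the poll throttle.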

What was NOT the fix

Showing a "Rate limited" badge in the UI (the previous approach in earlier commits) was only masking the symptom. The plugin was still hammering the API on every tick — it just wasn't surfacing the error. The fix prevents the redundant requests from being made in the first place.

Closes #377

Owner

@robinebers robinebers left a comment

LGTM and thank you

@robinebers robinebers merged commit 0ffe3ad into robinebers:main Apr 18, 2026
2 checks passed
Development

Successfully merging this pull request may close these issues.

Refresh now button

4 participants