fix: graceful rate limit handling for LLM enhancement #53

Merged
harlan-zw merged 1 commit into main from fix/rate-limit-retry
Mar 24, 2026

@harlan-zw harlan-zw commented Mar 24, 2026

❓ Type of change

  • 📖 Documentation
  • 🐞 Bug fix
  • 👌 Enhancement
  • ✨ New feature
  • 🧹 Chore
  • ⚠️ Breaking change

📚 Description

Previously, when the LLM provider returned a 429 rate limit error, skilld retried once after 3s and then showed a raw API error dump. It now detects rate limit errors, parses the "reset after Xs" hint from the response, waits that long before retrying, and shows a user-friendly message ("Rate limited by LLM provider. Try again shortly or use a different model via skilld config") instead of the raw error.
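
A minimal sketch of the flow described above, not the code merged in this PR. The names `retryDelayMs`, `runWithOneRetry`, and `AttemptResult` are hypothetical, and the millisecond units are an assumption. The 3s stagger, the "reset after Xs" hint, the detection pattern, and the 10s fallback are taken from this conversation.

```typescript
// Hypothetical helper names; only the constants and patterns come from the PR discussion.
const FIXED_RETRY_DELAY_MS = 3_000 // the original fixed 3s stagger
const DEFAULT_RATE_LIMIT_DELAY_MS = 10_000 // fallback when no hint is present (assumed to be in ms here)

interface AttemptResult<T> {
  value?: T
  error?: string
}

/** True when the error text looks like a provider rate limit. */
function isRateLimited(error: string | undefined): boolean {
  if (!error)
    return false
  return /\b429\b|rate.?limit|exhausted.*capacity|quota.*reset/i.test(error)
}

/** Pick the wait before the retry: the provider's hint, a default, or the fixed stagger. */
function retryDelayMs(error: string | undefined): number {
  if (!isRateLimited(error))
    return FIXED_RETRY_DELAY_MS
  const match = error?.match(/reset\s+after\s+(\d+)s/i)
  return match ? Number(match[1]) * 1_000 : DEFAULT_RATE_LIMIT_DELAY_MS
}

/** Run `attempt`, and on failure wait the computed delay and try exactly once more. */
async function runWithOneRetry<T>(
  attempt: () => Promise<AttemptResult<T>>,
  sleep: (ms: number) => Promise<void> = ms => new Promise<void>(resolve => setTimeout(resolve, ms)),
): Promise<AttemptResult<T>> {
  const first = await attempt()
  if (!first.error)
    return first
  await sleep(retryDelayMs(first.error))
  return attempt()
}
```

Passing `sleep` as a parameter is only a convenience for this sketch: a test can substitute a no-op and assert on the requested delay without actually waiting.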

Summary by CodeRabbit

  • Improvements
    • Added rate-limit awareness with intelligent retry delays and improved error messaging
    • Enhanced LLM operation feedback to provide guidance when rate-limited, including instructions for configuration changes

Detect 429/rate-limit errors and wait the appropriate duration before
retrying instead of using the fixed 3s stagger. Shows a user-friendly
message when rate limited instead of dumping raw API errors.

coderabbitai bot commented Mar 24, 2026

Walkthrough

The retry logic for optimizeSection failures was enhanced to detect rate-limit errors and apply dynamic backoff delays derived from error messages. Error logging in LLM enhancement failures was updated to identify rate-limit conditions and provide specific guidance. New helper functions support error extraction, rate-limit detection, and delay parsing.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Rate-Limit-Aware Retry Logic (src/agent/clis/index.ts) | Enhanced retry loop with helper functions (getRetryError, isRateLimitError, parseRateLimitDelay) to detect rate-limit errors via pattern matching (429, "rate limit", "exhausted capacity", "quota reset") and apply dynamic backoff delays extracted from error messages, falling back to a fixed delay if none is available. |
| LLM Error Logging Enhancement (src/commands/sync-shared.ts) | Updated error logging in enhanceSkillWithLLM() to detect LLM rate-limit/quota/capacity messages and emit a dedicated "Rate limited by LLM provider…" message with retry guidance; other errors retain the original format. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A rabbit hops through retry streams,
Rate limits thwarted, patience gleams,
Backoff delays now parse with care,
Errors detected, burdens shared—
Swift recovery through patient pairs! 🚀

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: implementing graceful handling of rate limits for LLM enhancement operations. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%. |

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (3)
src/agent/clis/index.ts (2)

806-814: Hoist regex patterns to module scope and export for reuse.

Static analysis correctly flags that these regexes are recreated on every call. Additionally, the same pattern is duplicated inline in sync-shared.ts at line 1455. Consider:

  1. Moving patterns to module scope for performance
  2. Exporting isRateLimitError so sync-shared.ts can import it instead of duplicating
♻️ Proposed refactor
```diff
+// ── Rate limit detection ─────────────────────────────────────────────
+
+const RATE_LIMIT_429 = /\b429\b/
+const RATE_LIMIT_TEXT = /rate.?limit/i
+const RATE_LIMIT_CAPACITY = /exhausted.*capacity/i
+const RATE_LIMIT_QUOTA = /quota.*reset/i
+
 /** Check if an error string indicates a rate limit (429) */
-function isRateLimitError(error: string | undefined): boolean {
+export function isRateLimitError(error: string | undefined): boolean {
   if (!error)
     return false
-  return /\b429\b/.test(error)
-    || /rate.?limit/i.test(error)
-    || /exhausted.*capacity/i.test(error)
-    || /quota.*reset/i.test(error)
+  return RATE_LIMIT_429.test(error)
+    || RATE_LIMIT_TEXT.test(error)
+    || RATE_LIMIT_CAPACITY.test(error)
+    || RATE_LIMIT_QUOTA.test(error)
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/clis/index.ts` around lines 806 - 814, Extract the literal regular
expressions used in isRateLimitError into module-level constants (e.g.,
RATE_429_RE, RATE_LIMIT_RE, EXHAUSTED_CAPACITY_RE, QUOTA_RESET_RE) and replace
the inline regexes in the function with those constants to avoid recreating them
on every call; then export the isRateLimitError function from this module so
other files can import it, and update the duplicated usage in sync-shared.ts to
import and call isRateLimitError instead of duplicating the patterns.
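
As an illustration of the hoisting suggestion, here is a self-contained variant that keeps the four patterns in one module-scope array. The array name `RATE_LIMIT_PATTERNS` and the sample strings are invented for this sketch. The patterns themselves are the ones quoted in the review, and in the real module the function would carry `export` as proposed.

```typescript
// Module-scope patterns, created once. None of them use the `g` or `y` flag, so
// RegExp.prototype.test keeps no lastIndex state between calls and sharing them is safe.
const RATE_LIMIT_PATTERNS: readonly RegExp[] = [
  /\b429\b/,
  /rate.?limit/i,
  /exhausted.*capacity/i,
  /quota.*reset/i,
]

/** Check if an error string indicates a rate limit (429). Add `export` to share it across files. */
function isRateLimitError(error: string | undefined): boolean {
  if (!error)
    return false
  const message: string = error
  return RATE_LIMIT_PATTERNS.some(pattern => pattern.test(message))
}

// Invented sample messages, not real provider output.
const samples = [
  '429 Too Many Requests',
  'Rate-limit exceeded for this model',
  'You have exhausted your capacity on this model',
  'Quota will reset after 30s',
  'Request 14290 failed', // 429 embedded in a longer number: \b429\b does not match
  'ECONNRESET',
]

for (const sample of samples)
  console.log(isRateLimitError(sample), sample)
```

The text patterns are deliberately loose, so unrelated messages that happen to contain "rate limit" would also match; that trade-off is inherited from the original patterns.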

816-822: Consider expanding the delay parsing pattern.

The current regex only matches reset after Xs format. Some providers may use variations like "retry in 5 seconds" or "wait 10s". The 10s default fallback is reasonable, but you could broaden coverage:

```diff
-  const match = error.match(/reset\s+after\s+(\d+)s/i)
+  const match = error.match(/(?:reset\s+after|retry\s+in|wait)\s+(\d+)\s*s(?:ec(?:ond)?s?)?/i)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/clis/index.ts` around lines 816 - 822, The parseRateLimitDelay
function currently only recognizes "reset after Xs"; update its regex to also
match common variants like "retry in Xs", "retry after X seconds", "wait Xs",
and "wait X seconds", with optional plural/abbreviated units
(sec/second/seconds), optional whitespace before the unit, and optional decimals
(e.g., "5", "5.0"); keep using the isRateLimitError check, convert the captured
value to a Number, and fall back to 10 when nothing matches; keep the regex and
extraction inside parseRateLimitDelay so its callers get broader coverage
without changing other code.
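
One way the broadened parsing could look, offered as a sketch rather than the project's implementation. The names `RETRY_HINT` and `parseRetryDelaySeconds` are hypothetical, and returning seconds (not milliseconds) is an assumption, since the conversation does not show the real function's return unit. The pattern extends the proposed regex with "retry after" and decimal values, and adds a trailing `\b` so a unit such as "s" does not match the start of an unrelated word.

```typescript
const DEFAULT_RATE_LIMIT_DELAY_S = 10 // the 10s fallback mentioned in the review

// Verb phrase, a number with optional decimals, optional whitespace,
// then s / sec / secs / second / seconds as a whole word.
const RETRY_HINT = /(?:reset\s+after|retry\s+(?:in|after)|wait)\s+(\d+(?:\.\d+)?)\s*s(?:ec(?:ond)?s?)?\b/i

/** Extract a retry delay in seconds from an error message, or fall back to the default. */
function parseRetryDelaySeconds(error: string): number {
  const match = RETRY_HINT.exec(error)
  return match ? Number(match[1]) : DEFAULT_RATE_LIMIT_DELAY_S
}

// Invented sample messages, not real provider output.
for (const message of [
  'quota will reset after 12s',
  'Please retry in 5 seconds.',
  'retry after 2.5 sec',
  'wait 10s',
  'wait 3 minutes', // minutes are not handled, so this falls back to the default
]) {
  console.log(parseRetryDelaySeconds(message), message)
}
```

Hints expressed in minutes or as timestamps are outside what this pattern covers and fall through to the default delay.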
src/commands/sync-shared.ts (1)

1455-1458: Import isRateLimitError instead of duplicating the pattern.

This regex is identical to isRateLimitError in src/agent/clis/index.ts. Reuse the existing function to avoid divergence when patterns need updating.

♻️ Proposed refactor

After exporting isRateLimitError from index.ts, update this file:

```diff
 import {
   agents,
   buildAllSectionPrompts,
   createToolProgress,
   generateSkillMd,
   getAvailableModels,
   getModelLabel,
   getModelName,
+  isRateLimitError,
   optimizeDocs,
   SECTION_OUTPUT_FILES,
 } from '../agent/index.ts'
```

Then update the condition:

```diff
-    if (error && /\b429\b|rate.?limit|exhausted.*capacity|quota.*reset/i.test(error))
+    if (isRateLimitError(error))
       llmLog.error(`Rate limited by LLM provider. Try again shortly or use a different model via \`skilld config\``)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/commands/sync-shared.ts` around lines 1455 - 1458, Replace the inline
regex rate-limit check with the shared isRateLimitError helper: import
isRateLimitError from the agent CLI module (the file that exports it) and change
the condition in the enhancement error handling to call isRateLimitError(error)
instead of testing the regex; keep the same llmLog.error calls (both the
rate-limited message and the generic “Enhancement failed” message) so behavior
is unchanged but pattern logic is centralized in isRateLimitError.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f0374321-b51a-4574-b0d8-85ec70c30d72

📥 Commits

Reviewing files that changed from the base of the PR and between 491b3e5 and 2e85e97.

📒 Files selected for processing (2)
  • src/agent/clis/index.ts
  • src/commands/sync-shared.ts

@harlan-zw harlan-zw merged commit 4eecac6 into main Mar 24, 2026
2 checks passed