feat: Wire rate limiting & circuit breaker into watch command (#515) #522
tamirdresher wants to merge 6 commits into bradygaster:dev
…ygaster#515) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add non-null assertions for array indexing in PredictiveCircuitBreaker
- Add ./ralph/rate-limiting export entry to SDK package.json

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…aster#515)
Integrates PredictiveCircuitBreaker from squad-sdk into Ralph's watch loop:
- Pre-round rate limit check via gh api rate_limit
- Traffic light gating (GREEN/AMBER/RED)
- Predictive circuit opening before 429
- Exponential backoff on rate limit errors
- Half-open recovery with 2-success threshold
- State persistence to .squad/ralph-circuit-breaker.json
- Adds ghRateLimitCheck() and isRateLimitError() to gh-cli.ts

Depends on bradygaster#518 for the SDK module.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
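The pre-round check and traffic light gating listed in this commit could be sketched roughly as follows. The `resources.core` fields match the JSON that the GitHub REST `rate_limit` endpoint returns (and that `gh api rate_limit` prints); everything else is an assumption for illustration: the names `parseRateLimit` and `classifyRateLimit`, and the 20% / 5% thresholds, are not taken from this PR.

```typescript
// Hypothetical sketch: names and thresholds are illustrative, not the PR's actual code.
type TrafficLight = "GREEN" | "AMBER" | "RED";

interface CoreRateLimit {
  limit: number;     // maximum requests per window
  remaining: number; // requests left in the current window
  reset: number;     // window reset time, in epoch seconds
}

// Parse the JSON printed by `gh api rate_limit` and extract the core resource.
function parseRateLimit(json: string): CoreRateLimit {
  const parsed = JSON.parse(json) as { resources: { core: CoreRateLimit } };
  const { limit, remaining, reset } = parsed.resources.core;
  return { limit, remaining, reset };
}

// Map the remaining quota fraction to a traffic light.
// The default thresholds are assumed values, chosen only for this example.
function classifyRateLimit(
  rl: CoreRateLimit,
  amberBelow = 0.2,
  redBelow = 0.05,
): TrafficLight {
  if (rl.limit <= 0) return "RED"; // no usable quota information: be conservative
  const fraction = rl.remaining / rl.limit;
  if (fraction < redBelow) return "RED";
  if (fraction < amberBelow) return "AMBER";
  return "GREEN";
}
```

Keeping the parsing separate from the classification means the thresholds can be unit-tested without invoking the `gh` CLI.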
Adds ./core/gh-cli subpath export to CLI package.json so the rate limit utilities (isRateLimitError, ghRateLimitCheck) are accessible from tests and external consumers. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
9187404 to d97cd7c
- Add roundInProgress flag to prevent overlapping setInterval rounds
- Remove unused getRetryDelay import and dead variable (line 391)
- Remove unused shouldProceed import

Fixes Q review findings: race condition when executeRound() takes longer than the interval causes double-triaging. Now skips a round if the previous one is still running.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- roundInProgress prevents overlapping concurrent rounds
- roundInProgress resets to false after a round throws (finally block)
- Rate limit amber traffic light blocks P1/P2 agents
- Rate limit red traffic light blocks all agents
- PredictiveCircuitBreaker stays closed with no samples

Fixes Q review finding: only 7 tests for 534 lines of new code
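The amber and red gating rules covered by these tests might look something like the sketch below. Only the two rules (amber blocks P1/P2, red blocks everything) come from the test list; the `Agent` shape, the `agentsAllowed` name, the existence of a `P0` tier, and the behavior under green are assumptions.

```typescript
// Hypothetical sketch: Agent, Priority, and agentsAllowed are illustrative names,
// and the P0 tier is assumed. Only the AMBER and RED rules come from the test list.
type TrafficLight = "GREEN" | "AMBER" | "RED";
type Priority = "P0" | "P1" | "P2";

interface Agent {
  name: string;
  priority: Priority;
}

// Return the subset of agents that may run under the given traffic light.
function agentsAllowed(light: TrafficLight, agents: Agent[]): Agent[] {
  switch (light) {
    case "GREEN":
      return agents; // assumed: no gating when quota is healthy
    case "AMBER":
      // Amber blocks P1 and P2 agents; anything else still runs.
      return agents.filter((a) => a.priority !== "P1" && a.priority !== "P2");
    case "RED":
    default:
      return []; // red blocks all agents
  }
}
```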
bradygaster left a comment
The circuit breaker state machine and exponential backoff implementation are solid; this is clearly production-informed code. However, the delivery approach needs adjustment.

Main issue: Full rewrite of watch.ts. This PR deletes all 355 lines of the existing watch.ts and replaces them with 534 new lines. The actual new functionality (executeRound wrapper, CircuitBreakerState, circuit breaker lifecycle) is modest; it should be addable as a surgical patch on top of the existing file rather than a full rewrite.

Why this matters:
- Creates guaranteed merge conflicts with #520 (which also modifies watch.ts)
- Removes the existing try/catch error handling from `runCheck`, letting all errors from `ghIssueList` crash the round (behavior change)
- Makes future watch.ts changes harder to review

Request: Please rework this as an additive patch: keep the existing watch.ts structure and wrap the circuit breaker around the existing flow. The algorithms are great; just the delivery needs to change.

A few smaller notes:
- `saveCBState` is called with `await` but uses `writeFileSync`; harmless but inconsistent
- `./core/gh-cli` export added to CLI package.json exposes an internal module as public API. Is that intentional?

Happy to discuss the approach if you'd like to chat about it! 💬
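One way to read the "additive patch" request is a thin wrapper that leaves the existing check function untouched and only decides whether to call it. The following is a minimal sketch under stated assumptions: the `Breaker` interface, the `guardRound` name, the `RoundResult` type, and the zero-argument `runCheck` signature are all hypothetical and do not come from watch.ts or the SDK.

```typescript
// Hypothetical sketch: Breaker, guardRound, and the runCheck signature are assumed
// for illustration and are not the actual watch.ts or squad-sdk API.
interface Breaker {
  canProceed(): boolean;         // false while the circuit is open
  onSuccess(): void;             // record a successful round
  onFailure(err: unknown): void; // record a failed round
}

type RoundResult = "ran" | "skipped" | "failed";

// Wrap an existing check without modifying it: skip when the breaker is open,
// otherwise run the check and report the outcome back to the breaker.
async function guardRound(
  breaker: Breaker,
  runCheck: () => Promise<void>,
): Promise<RoundResult> {
  if (!breaker.canProceed()) return "skipped";
  try {
    await runCheck(); // existing function, called as-is
    breaker.onSuccess();
    return "ran";
  } catch (err) {
    breaker.onFailure(err); // the wrapper absorbs the error instead of crashing the loop
    return "failed";
  }
}
```

This only works if the wrapped function lets rate limit errors propagate (or rethrows them after its own handling); if it swallows every error internally, the breaker never sees a failure.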
Code Review: Race Condition Analysis
Hi @bradygaster, reviewing PR #522 as a downstream consumer of the Squad framework.
Race Condition Fix ✅
The
One Small Issue:
Closing this in favor of a reworked PR that addresses all feedback:
New PR incoming from `tamirdresher/515-rate-limit-watch-v2` branch.
…r#546) Consolidated 15 remaining issues into a single PR. Redesigns help UX with grouped categories, surfaces all /help commands, aligns docs with actual CLI behavior, adds first-run welcome, standardizes naming. Closes bradygaster#510, Closes bradygaster#513, Closes bradygaster#514, Closes bradygaster#515, Closes bradygaster#516, Closes bradygaster#517, Closes bradygaster#518, Closes bradygaster#521, Closes bradygaster#522, Closes bradygaster#523, Closes bradygaster#524, Closes bradygaster#525, Closes bradygaster#527, Closes bradygaster#528, Closes bradygaster#529 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Wires the rate limiting SDK (PR #518) into Ralph's watch command polling loop.
Changes
`runCheck()` with circuit breaker state machine (closed/open/half-open)
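For readers unfamiliar with the pattern, a closed/open/half-open breaker with exponential backoff and a success threshold for recovery can be sketched as below. This is a generic illustration, not the SDK's `PredictiveCircuitBreaker`: the class name, method names, and default delays are assumptions, and the predictive opening and state persistence are left out. Only the three states and the 2-success threshold are taken from this PR's commit messages.

```typescript
// Hypothetical sketch of a generic circuit breaker; not the squad-sdk implementation.
type CircuitState = "closed" | "open" | "half-open";

class SimpleCircuitBreaker {
  private state: CircuitState = "closed";
  private consecutiveFailures = 0; // drives the exponential backoff
  private halfOpenSuccesses = 0;   // successes seen since entering half-open
  private openUntil = 0;           // timestamp (ms) when the open period ends
  private readonly baseDelayMs: number;
  private readonly maxDelayMs: number;
  private readonly successThreshold: number;

  constructor(baseDelayMs = 1000, maxDelayMs = 60000, successThreshold = 2) {
    this.baseDelayMs = baseDelayMs;
    this.maxDelayMs = maxDelayMs;
    this.successThreshold = successThreshold;
  }

  // Current state; an open circuit becomes half-open once its delay has elapsed.
  getState(now: number): CircuitState {
    if (this.state === "open" && now >= this.openUntil) {
      this.state = "half-open";
      this.halfOpenSuccesses = 0;
    }
    return this.state;
  }

  canProceed(now: number): boolean {
    return this.getState(now) !== "open";
  }

  // In half-open, close the circuit after `successThreshold` successes.
  onSuccess(now: number): void {
    const state = this.getState(now);
    if (state === "half-open") {
      this.halfOpenSuccesses += 1;
      if (this.halfOpenSuccesses >= this.successThreshold) {
        this.state = "closed";
        this.consecutiveFailures = 0;
      }
    } else if (state === "closed") {
      this.consecutiveFailures = 0;
    }
  }

  // Open the circuit; the delay doubles with each consecutive failure, up to the cap.
  onRateLimitError(now: number): number {
    const delay = Math.min(this.baseDelayMs * 2 ** this.consecutiveFailures, this.maxDelayMs);
    this.consecutiveFailures += 1;
    this.state = "open";
    this.openUntil = now + delay;
    return delay;
  }
}
```

Passing `now` explicitly instead of reading the clock inside the class keeps the state transitions deterministic in tests.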
Fixes (Q Review)
`roundInProgress` flag to prevent overlapping `setInterval` rounds when a round takes longer than the interval
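The overlap guard can be illustrated with a small sketch. The `roundInProgress` flag and the reset in a `finally` block are described in this PR; the `createRoundRunner` factory, the `tick` function, and the `onError` callback are hypothetical names introduced only for this example.

```typescript
// Hypothetical sketch: createRoundRunner, tick, and onError are illustrative names.
function createRoundRunner(
  executeRound: () => Promise<void>,
  onError: (err: unknown) => void = () => {},
) {
  let roundInProgress = false;

  async function run(): Promise<void> {
    try {
      await executeRound();
    } catch (err) {
      onError(err);
    } finally {
      roundInProgress = false; // always reset, even when the round throws
    }
  }

  // Intended to be passed to setInterval. Returns false when the tick is skipped
  // because the previous round is still running.
  function tick(): boolean {
    if (roundInProgress) return false;
    roundInProgress = true;
    void run();
    return true;
  }

  return { tick, isRunning: () => roundInProgress };
}
```

A caller would then schedule it with something like `setInterval(runner.tick, intervalMs)`, so a slow round causes later ticks to be skipped rather than run concurrently.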
Architecture Note: Model Fallback
This circuit breaker handles GitHub API rate limits (429 from the `gh` CLI). It is intentionally separate from the SDK's `ModelFallbackExecutor`, which handles LLM model switching (trying cheaper models when premium ones fail). The watch command makes no LLM calls (it uses deterministic triage logic), so model fallback does not apply here. If LLM-based triage is added in the future, `ModelFallbackExecutor` should be integrated at that point.
Testing
Closes #515
Depends on #518