fix(session): retry network errors, cap at 3, add retry_exhausted status#28792
Open
OrShmuel22 wants to merge 13 commits into
When a parent agent (e.g. an orchestrator) has edit:deny and spawns a subagent (e.g. an editor) that has edit:allow, the parent's deny was unconditionally inherited into the subagent's session permission. Because permission evaluation is last-match-wins, the inherited deny overrode the subagent's own allow, removing the edit tool from the subagent's palette.

Fix: only inherit parent edit:deny rules when the subagent does NOT explicitly declare edit:allow. If a subagent says it can edit, the parent's self-restriction should not override that declared capability.

This preserves Plan Mode security: subagents without explicit edit declarations (like general, explore) still inherit the parent's edit:deny as before.

Relates to anomalyco#26700 anomalyco#26747 anomalyco#26758 anomalyco#27123
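The inheritance rule in this commit message can be sketched as follows. This is a minimal illustration, not the code in the PR: the `Rule` type and the `evaluate` and `inheritRules` helpers are hypothetical names, and the sketch assumes inherited rules are appended after the subagent's own rules, which is what lets an inherited deny win under last-match-wins.

```typescript
// Hypothetical rule shape; the real permission types in the codebase may differ.
type Action = "allow" | "deny";
interface Rule {
  permission: string;
  action: Action;
}

// Last-match-wins: the final rule naming the permission decides the outcome.
function evaluate(rules: Rule[], permission: string): Action | undefined {
  let result: Action | undefined;
  for (const rule of rules) {
    if (rule.permission === permission) result = rule.action;
  }
  return result;
}

// Build the subagent's session rules. The parent's edit:deny is inherited
// only when the subagent does not explicitly declare edit:allow.
function inheritRules(parent: Rule[], subagent: Rule[]): Rule[] {
  const declaresEditAllow = subagent.some(
    (r) => r.permission === "edit" && r.action === "allow",
  );
  const inherited = declaresEditAllow
    ? []
    : parent.filter((r) => r.permission === "edit" && r.action === "deny");
  // Assumption: inherited rules come last, so they override earlier rules.
  return [...subagent, ...inherited];
}

const orchestrator: Rule[] = [{ permission: "edit", action: "deny" }];
const editor: Rule[] = [{ permission: "edit", action: "allow" }];
const explore: Rule[] = []; // no explicit edit declaration

const editorResult = evaluate(inheritRules(orchestrator, editor), "edit");
const exploreResult = evaluate(inheritRules(orchestrator, explore), "edit");
```

Under these assumptions the editor keeps its edit tool, while a subagent with no edit declaration still ends up with the parent's deny.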
… max retries at 3
- Add RETRY_MAX_ATTEMPTS = 3 to prevent infinite retry loops
- Add NETWORK_ERROR_PATTERNS for ECONNRESET, ECONNREFUSED, ETIMEDOUT, fetch failed, socket hang up, network error, connection reset/refused/timeout
- Add nested error envelope inspection (server_error, upstream_error, stream_read_error, service_unavailable_error)
- Fix OpenRouter numeric code bug (typeof json.code === 'number')
- Add comprehensive test coverage for all new retry patterns

Closes anomalyco#20822, anomalyco#21716, anomalyco#21893, anomalyco#23287
Related anomalyco#19394, anomalyco#20466, anomalyco#22448, anomalyco#26369
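A rough sketch of the classification this commit describes might look like the block below. Only the pattern names and envelope type names come from the commit message. The regular expressions, the `isNetworkError` and `hasRetryableEnvelope` helpers, and the `{ error: { type } }` envelope shape are assumptions made for illustration, and the OpenRouter numeric `code` fix is not modeled here.

```typescript
// Patterns listed in the commit message; the exact regexes are an assumption.
const NETWORK_ERROR_PATTERNS: RegExp[] = [
  /ECONNRESET/i,
  /ECONNREFUSED/i,
  /ETIMEDOUT/i,
  /fetch failed/i,
  /socket hang up/i,
  /network error/i,
  /connection (reset|refused|timeout)/i,
];

// Envelope types listed in the commit message.
const RETRYABLE_ENVELOPE_TYPES = new Set<string>([
  "server_error",
  "upstream_error",
  "stream_read_error",
  "service_unavailable_error",
]);

// Hypothetical helper: true when the message matches any network pattern.
function isNetworkError(message: string): boolean {
  return NETWORK_ERROR_PATTERNS.some((pattern) => pattern.test(message));
}

// Hypothetical helper: inspects a JSON body for a nested error envelope.
// Assumes the shape { "error": { "type": "<envelope type>" } }.
function hasRetryableEnvelope(body: string): boolean {
  let parsed: unknown;
  try {
    parsed = JSON.parse(body);
  } catch {
    return false; // not JSON, so there is no envelope to inspect
  }
  if (typeof parsed !== "object" || parsed === null) return false;
  const error = (parsed as { error?: unknown }).error;
  if (typeof error !== "object" || error === null) return false;
  const type = (error as { type?: unknown }).type;
  return typeof type === "string" && RETRYABLE_ENVELOPE_TYPES.has(type);
}
```

For example, `isNetworkError("read ECONNRESET")` matches the first pattern, while a message such as "invalid api key" matches none and would not be retried on this basis.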
- Add retry_exhausted to SessionStatus.Info schema union
- Add retriesExhausted tracking to ProcessorContext
- Detect exhausted retries in halt() and set retry_exhausted status
- Return retry_exhausted from process() when retries are exhausted
- Preserve retry_exhausted status in run-state onIdle (don't reset to idle)
- Handle retry_exhausted in compaction.ts (treat as stop)
- Add tests for retry_exhausted status lifecycle
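The status lifecycle listed above could be modeled roughly as shown here. The names `retry_exhausted`, `retriesExhausted`, `halt` and `onIdle` come from the commit message, but every type, field and function signature below is an assumption for illustration. The real `SessionStatus.Info` schema and `ProcessorContext` are not shown in this PR page.

```typescript
// Hypothetical stand-in for the SessionStatus.Info union.
type SessionStatus =
  | { type: "idle" }
  | { type: "busy" }
  | { type: "retry_exhausted"; message: string };

// Hypothetical stand-in for the part of ProcessorContext that matters here.
interface ProcessorContext {
  retriesExhausted: boolean;
}

// On halt, report retry_exhausted instead of idle when retries ran out.
function halt(ctx: ProcessorContext, errorMessage: string): SessionStatus {
  return ctx.retriesExhausted
    ? { type: "retry_exhausted", message: errorMessage }
    : { type: "idle" };
}

// When the run state goes idle, keep retry_exhausted rather than resetting it.
function onIdle(current: SessionStatus): SessionStatus {
  return current.type === "retry_exhausted" ? current : { type: "idle" };
}

const exhausted = halt({ retriesExhausted: true }, "fetch failed");
const afterIdle = onIdle(exhausted); // still retry_exhausted
const normal = onIdle({ type: "busy" }); // resets to idle
```

Modeling the status as a discriminated union lets the compiler check that each consumer handles the new variant, which is one plausible reason for adding it to the schema union rather than reusing idle with an error flag.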
…t retry_exhausted hang
…try_exhausted flash
Contributor
Thanks for your contribution! This PR doesn't have a linked issue. All PRs must reference an existing issue. Please:
See CONTRIBUTING.md for details.
Contributor
Thanks for updating your PR! It now meets our contributing guidelines. 👍
Issue for this PR
Closes #20822, #21716, #21893, #23287
Related #19394, #20466, #22448, #26369
Type of change
What does this PR do?
Network errors (ECONNRESET, ECONNREFUSED, ETIMEDOUT, fetch failed, socket hang up) were never retried: `retryable()` only matched rate limits and 5xx. Sessions halted with no recovery path. Had to ESC and type "continue" every time the laptop slept or WiFi blinked.

Changes:
- Network error patterns in `retryable()`: Added ECONNRESET, ECONNREFUSED, ETIMEDOUT, ECONNABORTED, fetch failed, Failed to fetch, socket hang up, network error, connection reset/refused/timeout. Also handles nested error envelopes (server_error, upstream_error, stream_read_error, service_unavailable_error) and fixes the OpenRouter numeric `code` bug.
- Caps retries at 3 (`RETRY_MAX_ATTEMPTS = 3`, 2→4→8s backoff ≈14s budget). Prevents infinite-retry loops.
- `retry_exhausted` status: When retries run out on a retryable error, status is set to `retry_exhausted` instead of `idle`. TUI shows error message with `enter retry · esc dismiss`. Enter re-sends last user message. Escape dismisses. Subagents skip this status and fall through to idle+error so parent handles it.

Why this approach:
- `retryable()` whitelist too narrow) without over-engineering

How did you verify your code works?
- `bun test test/session/`: 364 pass, 0 fail
- `bun test test/session/retry.test.ts`: 49 pass, 0 fail
- `bun run typecheck`: clean, no errors

Checklist