fix: retry premature close and connection errors in core #11794

Closed
RomneyDa wants to merge 1 commit into main from fix/retry-premature-close

Conversation

@RomneyDa
Collaborator

@RomneyDa RomneyDa commented Mar 25, 2026

Summary

  • Add "premature close", "premature end", "connection reset", and "socket hang up" as retryable error patterns in core/util/withExponentialBackoff.ts and core/llm/utils/retry.ts
  • These patterns were already handled in the CLI (extensions/cli/src/util/exponentialBackoff.ts) but missing from core, which is used by the VS Code extension
  • Add 4 new test cases covering the new connection error patterns
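The pattern check described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the names `RETRYABLE_CONNECTION_PATTERNS` and `isRetryableConnectionError` are hypothetical, and only the four substrings and the lowercased `includes` comparison come from the diff shown in this conversation.

```typescript
// Hypothetical helper for illustration; in the PR the check is written inline
// as a chain of lowerMessage.includes(...) conditions.
const RETRYABLE_CONNECTION_PATTERNS: readonly string[] = [
  "premature close",
  "premature end",
  "connection reset",
  "socket hang up",
];

function isRetryableConnectionError(error: unknown): boolean {
  // Normalize to a lowercase string so the match is case-insensitive.
  const message = error instanceof Error ? error.message : String(error);
  const lowerMessage = message.toLowerCase();
  return RETRYABLE_CONNECTION_PATTERNS.some((pattern) =>
    lowerMessage.includes(pattern),
  );
}
```

Keeping the substrings in one list would let both retry utilities share the same definition, assuming they can import from a common module.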

Context

Five open issues report "premature close" errors across different providers (Ollama, OpenAI-compatible, xAI). All are from VS Code users. The CLI already retries these transient errors, but the core retry logic does not, so users see unrecoverable "Unknown error" messages instead of automatic recovery.

Fixes #11108, #11102, #10948, #9667, #9999

Test plan

  • All 29 retry tests pass (including 4 new ones for premature close, premature end, socket hang up, connection reset)
  • Manual: trigger a premature close with a local Ollama instance and verify retry behavior

Summary by cubic

Core now retries transient connection/stream errors (“premature close”, “premature end”, “connection reset”, “socket hang up”) to match the CLI. This prevents unrecoverable “Unknown error” in the VS Code extension.

Written for commit e9fccbc. Summary will update on new commits.

The CLI already retries transient connection errors like "premature close",
"socket hang up", etc., but the core retry logic (used by the VS Code
extension) did not. This caused unrecoverable errors for users hitting
transient network/stream interruptions with various providers.

Add connection error patterns to both core/util/withExponentialBackoff.ts
and core/llm/utils/retry.ts to match the CLI behavior.

Fixes #11108, #11102, #10948, #9667, #9999
@RomneyDa RomneyDa requested a review from a team as a code owner March 25, 2026 02:33
@RomneyDa RomneyDa requested review from sestinj and removed request for a team March 25, 2026 02:33
@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Mar 25, 2026
@continue
Contributor

continue bot commented Mar 25, 2026

Docs Review

No documentation update needed for this PR.

This change adds retry logic for transient connection errors (premature close, socket hang up, connection reset, etc.) in core—aligning it with the existing CLI behavior. This is an internal reliability improvement that:

  • Is completely transparent to users (no configuration required)
  • Doesn't change any APIs, commands, or user-facing workflows
  • Simply makes the extension more resilient to network hiccups

Users benefit automatically without needing to know or do anything differently. 👍

@continue
Contributor

continue bot commented Mar 25, 2026

Test Coverage Gap

The changes to core/llm/utils/retry.ts have excellent test coverage with 4 new test cases covering each connection error pattern.

However, the same patterns were also added to core/util/withExponentialBackoff.ts without corresponding tests. While a test file exists (withExponentialBackoff.test.ts), the entire test suite is currently skipped (describe.skip).

Suggestion: Consider adding test coverage for the new connection error patterns in withExponentialBackoff.ts:

  • premature close
  • premature end
  • connection reset
  • socket hang up

These tests would verify that the exponential backoff utility correctly retries on these connection errors, similar to the tests added for retry.ts.
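A framework-agnostic sketch of what such a check might look like is below. Everything here is an assumption made for illustration: `retryOnConnectionError` is a simplified stand-in, not the real `withExponentialBackoff` (whose signature is not shown in this conversation), and the injectable `sleep` parameter exists only so the check runs without real delays. In the actual suite the body of `checkRetriesEachPattern` would presumably live inside the project's test runner.

```typescript
type Sleep = (ms: number) => Promise<void>;

const CONNECTION_PATTERNS: readonly string[] = [
  "premature close",
  "premature end",
  "connection reset",
  "socket hang up",
];

// Simplified stand-in for the backoff utility (hypothetical signature).
async function retryOnConnectionError<T>(
  apiCall: () => Promise<T>,
  maxTries: number = 5,
  initialDelayMs: number = 1000,
  sleep: Sleep = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; attempt < maxTries; attempt++) {
    try {
      return await apiCall();
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      const lowerMessage = message.toLowerCase();
      const retryable = CONNECTION_PATTERNS.some((p) => lowerMessage.includes(p));
      // Give up on non-retryable errors or when the last attempt has failed.
      if (!retryable || attempt === maxTries - 1) {
        throw error;
      }
      // Double the delay after each failed attempt.
      await sleep(initialDelayMs * Math.pow(2, attempt));
    }
  }
  throw new Error("maxTries must be at least 1");
}

// Fails once with each pattern, then succeeds; returns the call count per pattern.
async function checkRetriesEachPattern(): Promise<number[]> {
  const callsPerPattern: number[] = [];
  for (const pattern of CONNECTION_PATTERNS) {
    let calls = 0;
    const result = await retryOnConnectionError(
      async () => {
        calls++;
        if (calls === 1) {
          throw new Error(pattern);
        }
        return "ok";
      },
      3,
      1,
      async () => {}, // no-op sleep keeps the check fast
    );
    if (result !== "ok") {
      throw new Error(`unexpected result for "${pattern}"`);
    }
    callsPerPattern.push(calls);
  }
  return callsPerPattern;
}
```

Each pattern should produce exactly two calls: one failure followed by one successful retry.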

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 3 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="core/util/withExponentialBackoff.ts">

<violation number="1" location="core/util/withExponentialBackoff.ts:26">
P3: The new retryable connection errors are logged as "Hit rate limit," which is inaccurate and can mislead troubleshooting. Use a generic retry message (or include the actual error type) for this shared retry branch.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

lowerMessage.includes("premature close") ||
lowerMessage.includes("premature end") ||
lowerMessage.includes("connection reset") ||
lowerMessage.includes("socket hang up")
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 25, 2026

P3: The new retryable connection errors are logged as "Hit rate limit," which is inaccurate and can mislead troubleshooting. Use a generic retry message (or include the actual error type) for this shared retry branch.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At core/util/withExponentialBackoff.ts, line 26:

<comment>The new retryable connection errors are logged as "Hit rate limit," which is inaccurate and can mislead troubleshooting. Use a generic retry message (or include the actual error type) for this shared retry branch.</comment>

<file context>
@@ -19,7 +19,11 @@ const withExponentialBackoff = async <T>(
+        lowerMessage.includes("premature close") ||
+        lowerMessage.includes("premature end") ||
+        lowerMessage.includes("connection reset") ||
+        lowerMessage.includes("socket hang up")
       ) {
         const retryAfter = (error as APIError).response?.headers.get(
</file context>
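One way to address the logging concern is to pick the message from the error itself rather than hard-coding "Hit rate limit" for the whole branch. The sketch below is hypothetical: `describeRetry` and the `RetryableErrorLike` shape are invented for illustration, and it assumes a rate limit can be recognized by HTTP status 429, which may not match how the real utility detects it.

```typescript
// Hypothetical error shape; the real code works with its own APIError type.
interface RetryableErrorLike {
  message: string;
  status?: number; // assumed field: 429 is HTTP "Too Many Requests"
}

// Builds a log line whose wording matches the kind of failure being retried.
function describeRetry(error: RetryableErrorLike, delayMs: number): string {
  const reason =
    error.status === 429
      ? "Hit rate limit"
      : `Transient error ("${error.message}")`;
  return `${reason}. Retrying in ${delayMs / 1000} seconds...`;
}
```

For example, `describeRetry({ message: "Premature close" }, 2000)` yields `Transient error ("Premature close"). Retrying in 2 seconds...`, while an error with `status: 429` keeps the rate-limit wording.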

@RomneyDa
Collaborator Author

Actually, never mind: a premature close can happen mid-stream.

@RomneyDa RomneyDa closed this Mar 25, 2026
@github-project-automation github-project-automation bot moved this from Todo to Done in Issues and PRs Mar 25, 2026
@github-actions github-actions bot locked and limited conversation to collaborators Mar 25, 2026

Labels

size:M This PR changes 30-99 lines, ignoring generated files.

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Error: DSllmO-Coder - Unknown error

1 participant