Add HTTP client retry logic with exponential backoff for MCP connections #6694

Closed
Copilot wants to merge 5 commits into main from copilot/add-http-client-retry-logic

Conversation

Contributor

Copilot AI commented Dec 17, 2025

Implementation Complete: HTTP Client Connection Retry Logic

  • 1. Create retry helper function with exponential backoff
    • 1.1. Add connectWithRetry function that wraps client.Connect() calls
    • 1.2. Add isTransientError function to identify retryable errors
    • 1.3. Implement exponential backoff (1s, then 2s) between the 3 attempts
    • 1.4. Ensure context cancellation is respected during retries
  • 2. Update HTTP connection code in mcp_inspect_mcp.go
    • 2.1. Replace client.Connect() call in connectHTTPMCPServer (line ~270)
    • 2.2. Keep stdio connections unchanged (no retry needed for local processes)
    • 2.3. Resolve merge conflict with main branch (MCP timeout constants)
  • 3. Add comprehensive tests
    • 3.1. Test successful connection after retry (integration test)
    • 3.2. Test permanent error fails immediately (integration test)
    • 3.3. Test context cancellation during retry (integration test)
    • 3.4. Test retry with exponential backoff timing (unit test)
    • 3.5. Test isTransientError function for various error types (unit test)
  • 4. Validate changes
    • 4.1. Run unit tests to ensure existing tests pass ✅
    • 4.2. Run build, lint, test, and recompile successfully ✅
    • 4.3. Fix deprecated Temporary() method warning ✅
    • 4.4. Verify no new security issues introduced ✅
    • 4.5. Merge main branch and resolve conflicts ✅
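
Items 1.2 and 4.3 above can be illustrated with a minimal sketch of a transient-error check that avoids the deprecated Temporary() method. This is not the code from pkg/cli/mcp_connect_retry.go, which is not shown in this conversation; the exact set of errors checked and the sample errors errRefused and errBadRequest are assumptions made for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"syscall"
)

// Sample errors for the demonstration below (hypothetical, not from the PR).
var (
	errRefused    = &net.OpError{Op: "dial", Net: "tcp", Err: syscall.ECONNREFUSED}
	errBadRequest = errors.New("400 bad request")
)

// isTransientError reports whether err looks like a temporary network failure
// that is worth retrying. The list of errno values is an assumption.
func isTransientError(err error) bool {
	if err == nil {
		return false
	}
	// The caller gave up, so retrying would be pointless.
	if errors.Is(err, context.Canceled) {
		return false
	}
	// Timeouts are detected via net.Error.Timeout(); the deprecated
	// net.Error.Temporary() is deliberately not consulted.
	var netErr net.Error
	if errors.As(err, &netErr) && netErr.Timeout() {
		return true
	}
	// Connection refused/reset and network/host unreachable.
	retryable := []error{
		syscall.ECONNREFUSED,
		syscall.ECONNRESET,
		syscall.ENETUNREACH,
		syscall.EHOSTUNREACH,
	}
	for _, target := range retryable {
		if errors.Is(err, target) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isTransientError(errRefused))    // true
	fmt.Println(isTransientError(errBadRequest)) // false
}
```

Using errors.Is and errors.As means the check still works when the underlying errno is wrapped by net.OpError or by fmt.Errorf with %w.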

Summary

Successfully implemented HTTP client connection retry logic with exponential backoff for the MCP client. The implementation includes:

Key Features

  • Retry Logic: 3 total attempts (initial + 2 retries) with exponential backoff delays (1s, 2s)
  • Smart Error Detection: Only retries on transient network errors (connection refused, network/host unreachable, timeouts, connection reset)
  • Context-Aware: Respects context cancellation during retry delays
  • Debug Logging: Comprehensive logging via the logger package for troubleshooting
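
The retry behavior listed above can be sketched independently of the MCP SDK. In the sketch below, retryConnect, runDemo, and errFlaky are hypothetical names, and the connect callback stands in for client.Connect(); the real connectWithRetry in this PR may be structured differently.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errFlaky simulates a transient failure (hypothetical, for the demo only).
var errFlaky = errors.New("connection refused (simulated)")

// retryConnect calls connect up to attempts times (attempts must be >= 1).
// Between attempts it waits base, 2*base, 4*base, ... and returns early if
// ctx is cancelled. Non-transient errors are returned immediately.
func retryConnect(ctx context.Context, attempts int, base time.Duration,
	connect func(context.Context) error, transient func(error) bool) error {
	var lastErr error
	for i := 0; i < attempts; i++ {
		lastErr = connect(ctx)
		if lastErr == nil || !transient(lastErr) {
			return lastErr
		}
		if i == attempts-1 {
			break // no delay after the final attempt
		}
		timer := time.NewTimer(base << i)
		select {
		case <-ctx.Done():
			timer.Stop()
			return ctx.Err()
		case <-timer.C:
			// Backoff elapsed; try again.
		}
	}
	return fmt.Errorf("connect failed after %d attempts: %w", attempts, lastErr)
}

// runDemo simulates a server that fails the first `failures` calls and
// reports how many connection attempts were made.
func runDemo(failures int) (calls int, err error) {
	connect := func(context.Context) error {
		calls++
		if calls <= failures {
			return errFlaky
		}
		return nil
	}
	isFlaky := func(e error) bool { return errors.Is(e, errFlaky) }
	err = retryConnect(context.Background(), 3, time.Millisecond, connect, isFlaky)
	return calls, err
}

func main() {
	calls, err := runDemo(2)
	fmt.Println(calls, err) // 3 <nil>
}
```

The demo uses a 1ms base delay so it finishes quickly; with a 1s base the waits would be 1s and 2s, matching the delays described above.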

Files Added/Modified

  • pkg/cli/mcp_connect_retry.go - Core retry logic implementation
  • pkg/cli/mcp_connect_retry_test.go - Unit tests (14 test cases)
  • pkg/cli/mcp_connect_retry_integration_test.go - Integration tests (3 test scenarios)
  • pkg/cli/mcp_inspect_mcp.go - Updated HTTP connection to use retry logic, merged with main branch changes

Merge Resolution

  • Merged main branch which introduced MCP timeout constants
  • Resolved conflict by using parent context timeout (30s) instead of creating inner timeout
  • Retry logic respects the 30s parent timeout which is sufficient for 3 attempts with exponential backoff
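
The timeout budget behind this claim can be checked with simple arithmetic. The sketch below assumes the 1s base delay and 3 attempts described in this PR; totalBackoff is a hypothetical helper, and how long each individual connection attempt takes is not covered here.

```go
package main

import (
	"fmt"
	"time"
)

// totalBackoff sums the delays between attempts: base, 2*base, 4*base, ...
// There are attempts-1 delays because nothing follows the final attempt.
func totalBackoff(attempts int, base time.Duration) time.Duration {
	var total time.Duration
	for i := 0; i < attempts-1; i++ {
		total += base << i
	}
	return total
}

func main() {
	const parentTimeout = 30 * time.Second
	wait := totalBackoff(3, time.Second)
	fmt.Println("time spent waiting:", wait)                  // time spent waiting: 3s
	fmt.Println("left for the attempts:", parentTimeout-wait) // left for the attempts: 27s
}
```

Under these assumptions the backoff consumes 3s of the 30s parent timeout, leaving 27s to be shared by the three connection attempts.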

Testing

  • ✅ All unit tests pass (including new retry tests)
  • ✅ All integration tests pass (validates retry behavior)
  • ✅ All existing MCP inspect tests pass
  • ✅ Build, lint, and recompile succeed
  • ✅ No new security vulnerabilities introduced

Acceptance Criteria Met

✅ Retry logic implemented with 3 attempts and exponential backoff
✅ Only transient errors trigger retries (permanent errors fail immediately)
✅ Context cancellation is respected during retry delays
✅ Tests added for retry behavior (success after retry, permanent failure, context cancellation)
✅ Existing tests pass
✅ Main branch merged successfully

The implementation makes the HTTP MCP client more resilient to temporary network issues while avoiding unnecessary retries on permanent failures.

Original prompt

This section details the original issue you should resolve

<issue_title>[plan] Implement HTTP client connection retry logic</issue_title>
<issue_description>## Objective

Add retry logic with exponential backoff for transient HTTP connection failures in the MCP client.

Context

The MCP SDK v1.1.1 includes fixes for transient error handling (#723) and HTTP connection reuse (#709). Adding retry logic will make the client more resilient to temporary network issues.

Approach

  1. Create a helper function with exponential backoff:
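import (
    "context"
    "errors"
    "fmt"
    "net"
    "syscall"
    "time"
    "github.com/modelcontextprotocol/go-sdk/mcp" // assumed import path for the MCP Go SDK
)

func connectWithRetry(ctx context.Context, client *mcp.Client, transport mcp.Transport) (*mcp.ClientSession, error) {
    var lastErr error
    for i := 0; i < 3; i++ {
        session, err := client.Connect(ctx, transport, nil)
        if err == nil || !isTransientError(err) {
            return session, err
        }
        lastErr = err
        if i == 2 {
            break // final attempt: nothing left to wait for
        }
        select {
        case <-ctx.Done():
            return nil, ctx.Err()
        case <-time.After(time.Second * time.Duration(1<<i)):
            // Retry with exponential backoff: 1s, then 2s
        }
    }
    return nil, fmt.Errorf("failed after 3 attempts: %w", lastErr)
}

func isTransientError(err error) bool {
    // Retry on timeouts (including DNS timeouts) and on refused or reset connections
    var netErr net.Error
    if errors.As(err, &netErr) && netErr.Timeout() {
        return true
    }
    return errors.Is(err, syscall.ECONNREFUSED) || errors.Is(err, syscall.ECONNRESET)
}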
  2. Replace direct Connect() calls with connectWithRetry() in HTTP client code

Files to Modify

  • pkg/cli/mcp_inspect_mcp.go (lines ~134, 226)

Acceptance Criteria

  • Retry logic implemented with 3 attempts and exponential backoff
  • Only transient errors trigger retries (permanent errors fail immediately)
  • Context cancellation is respected during retry delays
  • Tests added for retry behavior (success after retry, permanent failure, context cancellation)
  • Existing tests pass
Related to [plan] MCP SDK improvements from Go Fan review #6678

AI generated by Plan Command for discussion #6658

Comments on the Issue (you are @copilot in this section)



Copilot AI and others added 3 commits December 17, 2025 02:39
Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>
Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>
Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add HTTP client connection retry logic" to "Add HTTP client retry logic with exponential backoff for MCP connections" Dec 17, 2025
Copilot AI requested a review from mnkiefer December 17, 2025 03:09
@pelikhan marked this pull request as ready for review December 17, 2025 04:19
@pelikhan
Collaborator

@copilot merge main

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented Dec 17, 2025

@copilot merge main

Main branch merged successfully in e0cb804. Resolved the conflict in mcp_inspect_mcp.go by using the parent context timeout (30s), which is sufficient for the retry logic with exponential backoff.

Copilot AI requested a review from pelikhan December 17, 2025 04:30
@pelikhan closed this Dec 17, 2025
@pelikhan deleted the copilot/add-http-client-retry-logic branch December 18, 2025 01:44

Development

Successfully merging this pull request may close these issues.

[plan] Implement HTTP client connection retry logic

3 participants