Remote MCP servers disconnected by server-side idle timeout lack automatic reconnection #15209

@chenhaodongHD

Problem

Remote MCP servers (using SSEClientTransport or StreamableHTTPClientTransport) may have their connections closed by the remote server during idle periods due to server-side timeout policies. Currently, OpenCode has four gaps in how it handles this:

  1. No keepalive mechanism - There's no heartbeat/ping to keep connections alive during idle periods
  2. No automatic reconnection - When a connection drops, OpenCode doesn't automatically attempt to reconnect
  3. No pre-use connection check - When an agent runs and needs MCP tools, OpenCode doesn't verify if connections are still alive before attempting to use them
  4. Reactive failure handling only - Connection failures are only detected when a tool call fails, at which point the client is marked as "failed" and removed from the active clients pool

Current Behavior

From `packages/opencode/src/mcp/index.ts`:

```typescript
// Tool fetching fails if connection is dead
const toolsResult = await client.listTools().catch((e) => {
  log.error("failed to get tools", { clientName, error: e.message })
  s.status[clientName] = { status: "failed", error: e.message }
  delete s.clients[clientName]
  return undefined
})
```

The current flow:

  1. MCP servers connect at startup
  2. Connection sits idle (no keepalive)
  3. Remote server closes connection due to inactivity
  4. User runs an agent that tries to use MCP tool
  5. Tool call fails → client marked as "failed"
  6. User must manually reconnect via `opencode mcp connect ` or restart OpenCode

Expected Behavior

  1. Keepalive mechanism - Periodic ping/heartbeat to keep remote connections alive (configurable interval)
  2. Auto-reconnection - When connection drops, automatically attempt to reconnect with exponential backoff
  3. Pre-use health check - Before an agent session starts, verify all configured MCP servers are connected and attempt reconnection for failed ones
  4. Connection state monitoring - Detect disconnections proactively (e.g., SSE close events) rather than waiting for tool call failures
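
To make items 1 and 4 concrete, here is a minimal sketch of a keepalive loop combined with close-event detection. It is not OpenCode's code: `PingableClient`, `startKeepalive`, and `onDisconnect` are hypothetical names. It assumes the client exposes a `ping()` method and an `onclose` callback, which is my understanding of the MCP TypeScript SDK `Client`, but that should be verified against the SDK version in use.

```typescript
// Assumed minimal shape of an MCP client. Only the two members used here
// are declared, so any object with this shape works (including test fakes).
interface PingableClient {
  ping(): Promise<unknown>
  onclose?: () => void
}

interface KeepaliveHandle {
  stop(): void
}

// Hypothetical helper: pings the server every `intervalMs` and reports the
// first sign of a dead connection exactly once via `onDisconnect`.
function startKeepalive(
  client: PingableClient,
  intervalMs: number,
  onDisconnect: (reason: string) => void,
): KeepaliveHandle {
  let stopped = false

  // Periodic ping keeps the connection active and detects silent drops.
  const timer = setInterval(() => {
    client.ping().catch((e: unknown) => {
      fail(`ping failed: ${e instanceof Error ? e.message : String(e)}`)
    })
  }, intervalMs)

  function fail(reason: string): void {
    if (stopped) return // report only once
    stopped = true
    clearInterval(timer)
    onDisconnect(reason)
  }

  // Proactive detection: react to a close event instead of waiting for the
  // next tool call to fail. Any previously installed handler is preserved.
  const previousOnClose = client.onclose
  client.onclose = () => {
    previousOnClose?.()
    fail("transport closed")
  }

  return {
    stop() {
      stopped = true
      clearInterval(timer)
    },
  }
}
```

In this sketch `onDisconnect` is where a reconnection attempt would be triggered, and `intervalMs` would come from the configurable interval described above. The interval needs to be shorter than the server's idle timeout to be effective.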

Suggested Solution

  1. Add a configurable `keepalive` interval for remote MCP servers (e.g., `mcp.keepaliveInterval` in config)
  2. Implement periodic health checks using `client.ping()` or similar mechanism
  3. Listen for transport close/error events to detect disconnections immediately
  4. Add auto-reconnect logic with exponential backoff (configurable max retries)
  5. Before agent execution, check all MCP statuses and attempt to reconnect failed/disconnected servers
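
Steps 4 and 5 could look roughly like the following. This is only a sketch under stated assumptions: `reconnectWithBackoff`, `backoffDelay`, `ensureConnected`, `BackoffOptions`, and `McpStatus` are hypothetical names, and the `connect` / `reconnect` callbacks stand in for whatever OpenCode actually does to build a transport and client. The status shape loosely mirrors the `{ status: "failed", error }` object in the snippet quoted from `packages/opencode/src/mcp/index.ts`.

```typescript
// Hypothetical options; none of these names exist in OpenCode's config today.
interface BackoffOptions {
  maxRetries: number // retries after the first attempt (total = maxRetries + 1)
  baseDelayMs: number // delay before the first retry
  maxDelayMs: number // cap on any single delay
  sleep?: (ms: number) => Promise<void> // injectable so tests need no real timers
}

// Delay doubles with each attempt and is capped at maxDelayMs.
function backoffDelay(attempt: number, baseDelayMs: number, maxDelayMs: number): number {
  return Math.min(maxDelayMs, baseDelayMs * 2 ** attempt)
}

// Calls `connect` until it succeeds or retries are exhausted, then rethrows
// the last error.
async function reconnectWithBackoff<T>(
  connect: () => Promise<T>,
  opts: BackoffOptions,
): Promise<T> {
  const sleep = opts.sleep ?? ((ms: number) => new Promise<void>((r) => setTimeout(r, ms)))
  let lastError: unknown
  for (let attempt = 0; attempt <= opts.maxRetries; attempt++) {
    try {
      return await connect()
    } catch (e) {
      lastError = e
      if (attempt === opts.maxRetries) break
      await sleep(backoffDelay(attempt, opts.baseDelayMs, opts.maxDelayMs))
    }
  }
  throw lastError
}

type McpStatus = { status: "connected" } | { status: "failed"; error: string }

// Pre-use check: before an agent runs, retry every server currently marked
// as failed and update its status in place.
async function ensureConnected(
  status: Record<string, McpStatus>,
  reconnect: (name: string) => Promise<void>,
): Promise<void> {
  const failed = Object.entries(status)
    .filter(([, s]) => s.status === "failed")
    .map(([name]) => name)
  await Promise.all(
    failed.map(async (name) => {
      try {
        await reconnect(name)
        status[name] = { status: "connected" }
      } catch (e) {
        status[name] = { status: "failed", error: e instanceof Error ? e.message : String(e) }
      }
    }),
  )
}
```

A caller might pass `(name) => reconnectWithBackoff(() => connectServer(name), opts)` as `reconnect`, where `connectServer` is again a placeholder. A production version would probably also add jitter to the delays so that many clients do not retry in lockstep.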

Workaround

Currently, users need to do one of the following:

  • Manually reconnect: `opencode mcp connect `
  • Restart OpenCode to re-establish all MCP connections
  • Use a proxy/load balancer that keeps connections alive

Impact

This affects reliability when using remote MCP servers, especially:

  • Cloud-hosted MCP services with aggressive idle timeouts
  • MCP servers behind corporate firewalls/proxies
  • Long-running OpenCode sessions with sporadic MCP usage

Labels

core: Anything pertaining to core functionality of the application (opencode server stuff)