MCPToolset hangs indefinitely #3084

@akintunca

Describe the bug

MCPToolset hangs indefinitely after the server sits idle for several minutes, blocking the entire event loop. The issue occurs when MCPSessionManager returns a cached session with stale HTTP/SSE connections to MCP servers. When session.list_tools() is called on the stale session, it blocks forever instead of timing out or reconnecting. The hang is so severe that Ctrl+C doesn't work, requiring a kill signal to stop the process.

To Reproduce

Minimal code to reproduce:

from google.adk.agents import LlmAgent
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset, StreamableHTTPConnectionParams
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.memory.in_memory_memory_service import InMemoryMemoryService
from google.adk.artifacts import InMemoryArtifactService
from google.genai import types

# Create MCPToolset with remote MCP server
connection_params = StreamableHTTPConnectionParams(
    url="https://your-mcp-server.com/mcp",
    headers={},
    timeout=30,
)

toolset = MCPToolset(
    connection_params=connection_params,
    tool_filter=None,
)

# Create agent with the toolset
agent = LlmAgent(
    model="gpt-4o",
    name="test_agent",
    description="Test agent",
    instruction="You are a helpful assistant",
    tools=[toolset],
)

# Create runner
runner = Runner(
    app_name="test_agent",
    agent=agent,
    artifact_service=InMemoryArtifactService(),
    session_service=InMemorySessionService(),
    memory_service=InMemoryMemoryService(),
)

Steps to reproduce:

1. Start the server and send a request - works fine

2. Wait 5-10 minutes without sending any requests

3. Send another request

4. Observe that runner.run_async() hangs indefinitely

5. Ctrl+C doesn't work - must kill -9 the process

Expected behavior

After idle periods, MCPToolset should either:

- Detect stale connections and reconnect automatically
- Time out gracefully and raise an error that can be caught
- Implement keepalive to prevent connections from becoming stale
- Create fresh sessions instead of caching stale ones

The @retry_on_closed_resource decorator suggests this is a known issue, but it doesn't handle the timeout/hang scenario.
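
To illustrate the "time out gracefully" option, here is a minimal sketch that bounds the call with asyncio.wait_for. StaleSession and list_tools_with_timeout are hypothetical names used only for illustration, not ADK APIs. It assumes the hang is an await that never completes. If the event loop itself is blocked by synchronous code, as the Ctrl+C behavior may indicate, this timeout cannot fire.

```python
import asyncio


class StaleSession:
    """Hypothetical stand-in for a cached session whose connection went stale."""

    async def list_tools(self):
        # Simulate a request that never receives a response.
        await asyncio.Event().wait()


async def list_tools_with_timeout(session, timeout_s: float):
    """Await session.list_tools(), raising TimeoutError after timeout_s seconds."""
    try:
        return await asyncio.wait_for(session.list_tools(), timeout=timeout_s)
    except asyncio.TimeoutError as exc:
        # Re-raise as the builtin so callers can catch one exception type.
        raise TimeoutError(
            f"list_tools() did not respond within {timeout_s}s"
        ) from exc


async def demo() -> str:
    try:
        await list_tools_with_timeout(StaleSession(), timeout_s=0.1)
    except TimeoutError:
        return "timed out"
    return "completed"


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With an error like this, the caller could discard the cached session and retry on a fresh one instead of waiting forever.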

Root Cause Analysis

In mcp_toolset.py line 165:

session = await self._mcp_session_manager.create_session()

_mcp_session_manager is created once in __init__ and reused forever. After idle periods, it returns a session with stale connections. When session.list_tools() (line 168) is called, it blocks indefinitely.
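
One possible mitigation for the caching behavior described above is to rebuild the cached object once it has been idle too long. The sketch below is not ADK code: IdleExpiringCache, its factory argument, and the 300-second threshold are all hypothetical, and a real fix would also need to close the old session.

```python
import asyncio
import time
from typing import Awaitable, Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class IdleExpiringCache(Generic[T]):
    """Caches one object and rebuilds it after max_idle_s seconds without use."""

    def __init__(
        self,
        factory: Callable[[], Awaitable[T]],
        max_idle_s: float,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._factory = factory
        self._max_idle_s = max_idle_s
        self._clock = clock
        self._value: Optional[T] = None
        self._last_used = 0.0
        self._lock = asyncio.Lock()

    async def get(self) -> T:
        async with self._lock:
            now = self._clock()
            idle = now - self._last_used
            if self._value is None or idle > self._max_idle_s:
                # A real implementation would also close the old object here.
                self._value = await self._factory()
            self._last_used = now
            return self._value


async def demo() -> list:
    created = []
    fake_now = [0.0]  # Injected clock so the example needs no real waiting.

    async def make_session() -> int:
        created.append(len(created) + 1)
        return created[-1]

    cache = IdleExpiringCache(make_session, max_idle_s=300, clock=lambda: fake_now[0])
    first = await cache.get()   # Nothing cached yet: creates session 1.
    fake_now[0] = 100.0
    second = await cache.get()  # Idle for 100s: reuses session 1.
    fake_now[0] = 500.0
    third = await cache.get()   # Idle for 400s: creates session 2.
    return [first, second, third]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The clock is injected so the idle threshold can be exercised without sleeping. In production the default time.monotonic would be used.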

Desktop:

- OS: Linux
- Python version: Python 3.13
- ADK version: google-adk==1.15.1

Model Information:

- Are you using LiteLLM: Yes
- Which model is being used: gpt-4.1 (via LiteLLM proxy)

Additional context

This issue is specific to Google ADK. Other MCP client implementations (LangChain's MultiServerMCPClient, Anthropic's direct API approach) don't have this problem because they create fresh connections on each request or handle connection lifecycle properly.

The severity is high because:

- The entire event loop blocks (can't even Ctrl+C)
- Requires process kill to recover
- Happens consistently after any idle period (5-10 minutes)
- Makes long-running servers unreliable

Workaround: Restart the server process periodically, but this is not a sustainable solution for production deployments.

Labels: bot triaged, mcp
