Description
Describe the bug
MCPToolset hangs indefinitely after the server sits idle for several minutes, blocking the entire event loop. The issue occurs when MCPSessionManager returns a cached session with stale HTTP/SSE connections to MCP servers. When session.list_tools() is called on the stale session, it blocks forever instead of timing out or reconnecting. The hang is so severe that Ctrl+C doesn't work, requiring a kill signal to stop the process.
To Reproduce
Minimal code to reproduce:
from google.adk.agents import LlmAgent
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset, StreamableHTTPConnectionParams
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.memory.in_memory_memory_service import InMemoryMemoryService
from google.adk.artifacts import InMemoryArtifactService
from google.genai import types

# Create MCPToolset with remote MCP server
connection_params = StreamableHTTPConnectionParams(
    url="https://your-mcp-server.com/mcp",
    headers={},
    timeout=30,
)
toolset = MCPToolset(
    connection_params=connection_params,
    tool_filter=None,
)

# Create agent with the toolset
agent = LlmAgent(
    model="gpt-4o",
    name="test_agent",
    description="Test agent",
    instruction="You are a helpful assistant",
    tools=[toolset],
)

# Create runner
runner = Runner(
    app_name="test_agent",
    agent=agent,
    artifact_service=InMemoryArtifactService(),
    session_service=InMemorySessionService(),
    memory_service=InMemoryMemoryService(),
)
Steps to reproduce:
1. Start the server and send a request - works fine
2. Wait 5-10 minutes without sending any requests
3. Send another request
4. Observe that runner.run_async() hangs indefinitely
5. Ctrl+C doesn't work - must kill -9 the process
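Because Ctrl+C has no effect at step 5, it can be hard to see where the process is stuck. One way to capture that is a watchdog built on the standard library's faulthandler module, which dumps every thread's traceback from a separate watchdog thread and so should still fire while the event loop is blocked. This is a minimal diagnostic sketch, not part of ADK; `hang_watchdog` is a hypothetical helper name.

```python
import faulthandler
import sys
from contextlib import contextmanager


@contextmanager
def hang_watchdog(seconds: float, file=sys.stderr):
    """Dump all thread tracebacks to `file` if the block runs longer than `seconds`.

    Hypothetical helper for diagnosis only; it does not interrupt the hang.
    `file` must be a real file object (it needs a file descriptor).
    """
    faulthandler.dump_traceback_later(seconds, exit=False, file=file)
    try:
        yield
    finally:
        # Disarm the watchdog once the block finishes (or raises).
        faulthandler.cancel_dump_traceback_later()
```

Wrapping the second request (step 3) in `with hang_watchdog(60):` would print the stack of the blocked call to stderr after 60 seconds, which can then be attached to this issue.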
Expected behavior
After idle periods, MCPToolset should do at least one of the following:
- Detect stale connections and reconnect automatically
- Time out gracefully and raise an error that can be caught
- Implement keepalive to prevent connections from becoming stale
- Create fresh sessions instead of caching stale ones
The @retry_on_closed_resource decorator suggests this is a known issue, but it doesn't handle the timeout/hang scenario.
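As an illustration of the "time out gracefully" option, the call could be bounded with asyncio.wait_for so that a stale session surfaces as a catchable exception. This is a sketch under assumptions, not ADK's implementation: `list_tools_with_timeout` and `ToolListingTimeout` are hypothetical names, and `session` is any object with an async `list_tools()` method. It only helps if the hang is a pending await; if the loop is blocked by a synchronous call, the timeout cannot fire.

```python
import asyncio


class ToolListingTimeout(Exception):
    """Hypothetical error raised when list_tools() misses its deadline."""


async def list_tools_with_timeout(session, timeout_s: float = 10.0):
    # `session` is assumed to expose an async `list_tools()` method.
    try:
        # wait_for cancels the inner call when the deadline passes.
        return await asyncio.wait_for(session.list_tools(), timeout=timeout_s)
    except asyncio.TimeoutError as exc:
        raise ToolListingTimeout(
            f"list_tools() did not finish within {timeout_s}s"
        ) from exc
```

A caller could catch `ToolListingTimeout` and decide whether to rebuild the session or fail the request.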
Root Cause Analysis
In mcp_toolset.py line 165:
session = await self._mcp_session_manager.create_session()
_mcp_session_manager is created once in __init__ and reused forever. After idle periods, it returns a session with stale connections. When session.list_tools() (line 168) is called, it blocks indefinitely.
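One possible shape for a fix is a cache that treats a timed-out call as evidence of a stale session, discards it, and retries once with a fresh one. The sketch below is illustrative only and does not reflect how MCPSessionManager is written: `RefreshingSessionCache` and its `factory` argument are hypothetical, and a real implementation would also need to close the discarded session's transport.

```python
import asyncio
from typing import Any, Awaitable, Callable


class RefreshingSessionCache:
    """Caches one session; drops it and retries once if a call times out."""

    def __init__(self, factory: Callable[[], Awaitable[Any]], timeout_s: float = 10.0):
        self._factory = factory      # async callable that builds a new session
        self._timeout_s = timeout_s
        self._session = None

    async def _get(self):
        if self._session is None:
            self._session = await self._factory()
        return self._session

    async def list_tools(self):
        for attempt in range(2):
            session = await self._get()
            try:
                return await asyncio.wait_for(session.list_tools(), self._timeout_s)
            except asyncio.TimeoutError:
                # Assume the cached session is stale; forget it so the
                # next attempt builds a new one.
                self._session = None
                if attempt == 1:
                    raise
```

The single retry keeps the worst-case latency at roughly twice the timeout while still recovering from the idle-connection case described above.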
Desktop:
OS: Linux
Python version: Python 3.13
ADK version: google-adk==1.15.1
Model Information:
Are you using LiteLLM: Yes
Which model is being used: gpt-4.1 (via LiteLLM proxy)
Additional context
This issue is specific to Google ADK. Other MCP client implementations (LangChain's MultiServerMCPClient, Anthropic's direct API approach) don't have this problem because they create fresh connections on each request or handle connection lifecycle properly.
The severity is high because:
- The entire event loop blocks (can't even Ctrl+C)
- Requires process kill to recover
- Happens consistently after any idle period (5-10 minutes)
- Makes long-running servers unreliable
Workaround: Restart the server process periodically, but this is not a sustainable solution for production deployments.