Open
Initial Checks
- I confirm that I'm using the latest version of MCP Python SDK
- I confirm that I searched for my issue in https://github.com/modelcontextprotocol/python-sdk/issues before opening this issue
Description
To meet availability requirements, once my MCP client has connected to the server I use a separate thread to poll whether the client is still working. If the client cannot retrieve the tool list, I clean up manually and then reconnect. With the following sequence of operations, however, the client's call_tool call blocks forever:
1. The MCP client successfully connects to the server.
2. The MCP client calls call_tool on a tool that takes a long time to execute (assume 20 seconds).
3. The health check thread detects that the client cannot retrieve the tool list and starts the cleanup.
4. The health check thread finishes the cleanup, and the MCP server subprocess terminates (verified with the ps command).
5. The call_tool call from step 2 stays blocked forever and raises no exception, unless a read timeout is set.
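One possible client-side mitigation for step 5 is to track pending calls and cancel them before cleanup starts. The sketch below uses only asyncio. `InFlightCalls`, `run`, and `cancel_all` are hypothetical names, not part of the MCP SDK, and it assumes the pending call responds to ordinary asyncio cancellation.

```python
import asyncio
from typing import Any, Awaitable, List, Set


class InFlightCalls:
    """Hypothetical helper: tracks pending calls so cleanup can cancel them."""

    def __init__(self) -> None:
        self._tasks: Set[asyncio.Future] = set()

    async def run(self, awaitable: Awaitable[Any]) -> Any:
        # Wrap the call in a task so it can be cancelled from another task.
        task = asyncio.ensure_future(awaitable)
        self._tasks.add(task)
        try:
            return await task
        finally:
            self._tasks.discard(task)

    async def cancel_all(self) -> int:
        tasks = list(self._tasks)
        for task in tasks:
            task.cancel()
        # Wait until every cancelled task has actually finished.
        await asyncio.gather(*tasks, return_exceptions=True)
        return len(tasks)


async def demo() -> List[str]:
    calls = InFlightCalls()
    events: List[str] = []

    async def long_call() -> str:
        await asyncio.sleep(20)  # stands in for session.call_tool("test", {})
        return "hello world"

    async def caller() -> None:
        try:
            await calls.run(long_call())
        except asyncio.CancelledError:
            # The caller is unblocked instead of waiting forever.
            events.append("call cancelled")

    async def health_check() -> None:
        await asyncio.sleep(0.1)
        cancelled = await calls.cancel_all()  # a real client would clean up next
        events.append(f"cancelled {cancelled} call(s)")

    await asyncio.gather(caller(), health_check())
    return events


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In the reproduction, this would mean routing `session.call_tool(...)` through `calls.run(...)` and awaiting `calls.cancel_all()` just before `exit_stack.aclose()`. Whether this is an acceptable workaround depends on how callers should treat a cancelled tool call.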
Additional notes:
- If steps 3 and 4 are replaced with manually killing the MCP server subprocess, call_tool immediately responds with an error: "connection closed."
- For availability reasons, I cannot drop the health check.
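The first note describes the behavior I would expect from a deliberate cleanup as well: the pending call fails fast instead of hanging. A minimal sketch of that behavior on the client side, assuming a hypothetical `closed` event that the health check sets before it starts cleanup. `call_with_close_guard` and `ConnectionClosedError` are illustrative names, not SDK APIs.

```python
import asyncio


class ConnectionClosedError(Exception):
    """Hypothetical error raised when cleanup interrupts a pending call."""


async def call_with_close_guard(awaitable, closed: asyncio.Event):
    """Await `awaitable`, but fail fast if `closed` is set first."""
    call_task = asyncio.ensure_future(awaitable)
    close_task = asyncio.ensure_future(closed.wait())
    try:
        done, _ = await asyncio.wait(
            {call_task, close_task}, return_when=asyncio.FIRST_COMPLETED
        )
        if call_task in done:
            return call_task.result()
        raise ConnectionClosedError("connection closed during call")
    finally:
        # Cancel whichever side is still pending so nothing is left dangling.
        for task in (call_task, close_task):
            if not task.done():
                task.cancel()


async def demo() -> str:
    closed = asyncio.Event()

    async def slow_call() -> str:
        await asyncio.sleep(20)  # stands in for session.call_tool("test", {})
        return "hello world"

    async def health_check() -> None:
        await asyncio.sleep(0.1)
        closed.set()  # a real client would set this right before cleanup

    checker = asyncio.ensure_future(health_check())
    try:
        await call_with_close_guard(slow_call(), closed)
    except ConnectionClosedError as exc:
        return str(exc)
    finally:
        await checker
    return "completed"


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The guard races the call against the close signal, so the caller gets an exception as soon as cleanup begins, without waiting for the transport to report anything.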
Example Code
server.py
# server.py
import asyncio

from mcp.server.fastmcp import FastMCP

# Create server
mcp = FastMCP("Test Server")


@mcp.tool()
async def test() -> str:
    # Simulate a long-running tool: block for 20 seconds
    await asyncio.sleep(20)
    return "hello world"


if __name__ == "__main__":
    mcp.run(transport="stdio")

client.py
# client.py
import asyncio
import logging
from contextlib import AsyncExitStack
from datetime import timedelta  # used by the commented-out timeout variant below
from typing import Optional

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

logging.basicConfig(
    level=logging.DEBUG, format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)
logger = logging.getLogger(__name__)


class MCPClient:
    def __init__(self, params: dict):
        self.params = params
        self.session: Optional[ClientSession] = None
        self.exit_stack = AsyncExitStack()

    async def connect(self):
        # Start the server subprocess and open a session over stdio
        server_params = StdioServerParameters(**self.params)
        stdio_transport = await self.exit_stack.enter_async_context(
            stdio_client(server_params)
        )
        stdio, write = stdio_transport
        self.session = await self.exit_stack.enter_async_context(
            ClientSession(stdio, write)
        )
        await self.session.initialize()

    async def cleanup(self):
        # Close the session and the transport, which terminates the subprocess
        await self.exit_stack.aclose()


async def test_call_tool(client: MCPClient):
    # Give test_check_connection time to establish the connection
    await asyncio.sleep(5)
    try:
        tools = await client.session.list_tools()
        logger.debug(f"[call_tool] tools count: {len(tools.tools)}")
        # With a timeout, the call returns as expected:
        # result = await client.session.call_tool("test", {}, read_timeout_seconds=timedelta(seconds=60))
        # Without a timeout, the call blocks forever:
        result = await client.session.call_tool("test", {})
        logger.debug(f"[call_tool] result text length: {len(result.content[0].text)}")
    except asyncio.CancelledError:
        logger.debug("[test_call_tool] cancelled")
    except Exception as e:
        logger.debug(f"test_call_tool failed: {e}")


async def test_check_connection(client: MCPClient):
    try:
        # Initialize the connection
        await client.connect()
        logger.debug("[check_connection] connect completed")
        # List tools
        tools = await client.session.list_tools()
        logger.debug(f"[check_connection] tools count: {len(tools.tools)}")
        # Wait until call_tool has started
        await asyncio.sleep(10)
        # Simulate a cleanup while call_tool is still running
        await client.cleanup()
        logger.debug("[check_connection] cleanup completed")
        # Simulate a reconnect
        await client.connect()
        logger.debug("[check_connection] reconnect completed")
    except asyncio.CancelledError:
        logger.debug("[test_check_connection] cancelled")
    except Exception as e:
        logger.debug(f"[test_check_connection] failed: {e}")


async def main():
    params = {
        "command": "python",
        "args": ["server.py"],
    }
    client = MCPClient(params)
    try:
        tasks = [test_call_tool(client), test_check_connection(client)]
        await asyncio.gather(*tasks)
        logger.debug("[main] test completed")
    except Exception as e:
        logger.debug(f"[main] test failed: {e}")
    except asyncio.CancelledError:
        logger.debug("[main] test cancelled")
    finally:
        logger.debug("[main] cleanup")


if __name__ == "__main__":
    asyncio.run(main())

console output
2025-10-31 11:58:14,924 - __main__ - DEBUG - [check_connection] connect completed
2025-10-31 11:58:14,926 - __main__ - DEBUG - [check_connection] tools count: 1
2025-10-31 11:58:18,828 - __main__ - DEBUG - [call_tool] tools count: 1
2025-10-31 11:58:26,945 - __main__ - DEBUG - [check_connection] cleanup completed
2025-10-31 11:58:28,071 - __main__ - DEBUG - [check_connection] reconnect completed
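The log ends after the reconnect: call_tool never logs a result or an error. As the comment in client.py notes, passing read_timeout_seconds unblocks the call. An SDK-independent alternative is to bound the call with asyncio.wait_for. This is a minimal sketch, and `call_with_deadline` is a hypothetical helper name.

```python
import asyncio


async def call_with_deadline(awaitable, seconds: float):
    """Await `awaitable`, raising TimeoutError if it takes longer than `seconds`."""
    try:
        return await asyncio.wait_for(awaitable, timeout=seconds)
    except asyncio.TimeoutError:
        raise TimeoutError(f"no response within {seconds} seconds") from None


async def demo() -> str:
    async def stuck_call() -> str:
        # Never completes, like the blocked call_tool in the report.
        await asyncio.Event().wait()
        return "unreachable"

    try:
        return await call_with_deadline(stuck_call(), 0.1)
    except TimeoutError as exc:
        return str(exc)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

A timeout only bounds how long the caller waits. It does not explain why the pending request is never failed when the session is closed, which is the behavior this issue is about.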
Python & MCP Python SDK
Python: 3.12
MCP Python SDK: 1.20.0