[bug]: deadlock on self-hosted MCP servers during first-call #112

@gotsysdba

MCPTool.run_async() and MCPToolBox._get_tools_inner_async() call the synchronous AsyncRuntime.get_or_create_session() directly from an async context. During that setup, _create_long_lived_session() blocks the calling thread via self._portal.call(recv.receive). If the calling thread is the event loop thread (which it is when invoked from an ASGI handler), the event loop is blocked until the session initializes.

This causes a deadlock when the MCP server and client are co-hosted in the same process: the server cannot process the incoming HTTP request from the portal's session_runner while its event loop is blocked. The deadlock occurs only on the first call that requires session creation, and that call fails once the 60-second timeout (SessionParameters.read_timeout_seconds) expires. Subsequent calls reuse the cached session and complete normally.

Affected code

  • wayflowcore/mcp/tools.py lines 110-119
  • wayflowcore/mcp/tools.py lines 216-221

Both call the sync get_or_create_session() → _create_long_lived_session() → self._portal.call(recv.receive), which blocks the current thread.
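
One possible direction, sketched here with the standard library only, is to run the blocking session setup in a worker thread and cache the result so that only the first call pays for it. The `SessionCache` class and the `create_sync` callable are hypothetical stand-ins for illustration; they are not wayflowcore APIs, and the real fix may look quite different.

```python
import asyncio


class SessionCache:
    """Hypothetical cache: creates each session once, off the event loop thread."""

    def __init__(self, create_sync):
        self._create_sync = create_sync  # blocking factory (assumed, not a wayflowcore API)
        self._sessions = {}
        self._lock = asyncio.Lock()  # serializes first-time creation

    async def get_or_create(self, key):
        # Fast path: reuse a cached session without taking the lock.
        if key in self._sessions:
            return self._sessions[key]
        async with self._lock:
            # Re-check under the lock; another task may have created it meanwhile.
            if key not in self._sessions:
                # Run the blocking setup in a worker thread so the loop stays responsive.
                self._sessions[key] = await asyncio.to_thread(self._create_sync, key)
            return self._sessions[key]
```

With this shape, an async caller would `await cache.get_or_create(key)` instead of calling the sync function directly. The cache should be constructed from inside the running event loop so the lock is tied to the loop that uses it.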

Deadlock sequence

1. ASGI handler awaits flow execution
2. ToolExecutionStep calls MCPTool.run_async()
3. run_async() calls get_or_create_session() [sync, blocks event loop thread]
4. _create_long_lived_session() → portal.call(recv.receive) [blocks]
5. portal's session_runner sends HTTP POST to http://127.0.0.1:8000/mcp/
6. uvicorn cannot process the request — event loop is blocked at step 4
7. DEADLOCK → 60 second timeout → McpError
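
The sequence above can be imitated without any MCP or ASGI code. The following is a minimal simulation, not the wayflowcore implementation: `blocking_setup` is a hypothetical stand-in for a sync call that waits on work only the event loop can perform, and the short timeout stands in for the 60-second read timeout.

```python
import asyncio
import threading


def blocking_setup(loop, timeout):
    """Stand-in for a sync setup call that needs the event loop to respond."""
    answered = threading.Event()
    # The "server" reply is a callback that only runs when the loop is free.
    loop.call_soon_threadsafe(answered.set)
    # Block the current thread until the reply arrives or the timeout expires.
    return answered.wait(timeout)


async def handler_blocking(timeout=0.2):
    loop = asyncio.get_running_loop()
    # Direct call on the loop thread: the reply callback cannot run in time.
    return blocking_setup(loop, timeout)


async def handler_offloaded(timeout=0.2):
    loop = asyncio.get_running_loop()
    # Same call in a worker thread: the loop stays free to deliver the reply.
    return await asyncio.to_thread(blocking_setup, loop, timeout)


if __name__ == "__main__":
    print("direct call answered:   ", asyncio.run(handler_blocking()))
    print("offloaded call answered:", asyncio.run(handler_offloaded()))
```

The direct call reports `False` after the timeout, because the loop thread is the one being blocked. The offloaded call reports `True`, because the loop remains free to run the reply callback.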

Minimal reproduction

"""
Minimal reproduction: co-hosted FastAPI + FastMCP server with wayflowcore MCPTool.

Run with: uvicorn repro:app --port 8000
Then:     curl -X POST http://localhost:8000/run -H "Content-Type: application/json" -d '{}'

Expected: tool result returned
Actual:   hangs for 60 seconds, then McpError timeout
"""

from contextlib import asynccontextmanager

from fastapi import FastAPI
from fastmcp import FastMCP
from starlette.middleware import Middleware

from wayflowcore.mcp.clienttransport import StreamableHTTPTransport
from wayflowcore.mcp.tools import MCPTool
from wayflowcore.mcp._session_persistence import get_mcp_async_runtime

# -- MCP server (same process) --
mcp_server = FastMCP("repro")

@mcp_server.tool()
async def echo(message: str) -> str:
    """Echo a message back."""
    return f"echo: {message}"

# -- FastAPI app with mounted MCP --
mcp_app = mcp_server.http_app(path="/")

@asynccontextmanager
async def lifespan(app):
    yield

app = FastAPI(lifespan=lifespan)
app.mount("/mcp", mcp_app)

# -- MCPTool pointing back at self --
transport = StreamableHTTPTransport(
    url="http://127.0.0.1:8000/mcp/",
)

@app.post("/run")
async def run_tool():
    # This triggers MCPTool.run_async() which blocks the event loop
    mcp_tool = MCPTool(
        name="echo",
        client_transport=transport,
        _validate_server_exists=False,
        _validate_tool_exist_on_server=False,
        description="Echo tool",
        input_descriptors=[],
    )
    result = await mcp_tool.run_async(message="hello")
    return {"result": result}

Error output

ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
  ...
  mcp.shared.exceptions.McpError: Timed out while waiting for response to ClientRequest. Waited 60.0 seconds.
