# Managing Multiple MCP Servers with MCPClientSessionManager

**Author:** Priyanshu Deshmukh

This notebook demonstrates how to use the `MCPClientSessionManager` class to manage multiple MCP (Model Context Protocol) server sessions dynamically. The example builds a research assistant that can query both arXiv papers and Wikipedia articles by opening MCP sessions on demand inside a tool function.

## Key Features Demonstrated

- **Dynamic Session Management**: Open MCP sessions inside tool functions as needed
- **Multi-Transport Support**: Handle both stdio (arXiv) and SSE (Wikipedia) servers
- **Tool-Based Server Selection**: Let agents choose which server to use
- **Context Manager Pattern**: Automatic resource cleanup with `async with`
- **Agent Handoffs**: Coordinate between research assistant and server-specific agents

## What is MCPClientSessionManager?

The `MCPClientSessionManager` is a utility class that simplifies managing MCP client sessions. Unlike the traditional approach where you open a session at the start and keep it alive, `MCPClientSessionManager` allows you to:

- **Open sessions on-demand**: Create sessions only when needed within your workflow
- **Switch between servers**: Dynamically select which MCP server to connect to
- **Manage multiple transports**: Handle both `stdio` (process-based) and `SSE` (HTTP-based) protocols
- **Ensure clean resource management**: Automatic cleanup with async context managers

### Key Components

1. **StdioConfig**: Configuration for stdio-based MCP servers (local processes)
   - Starts a local process (a Python script in this notebook) that communicates via stdin/stdout
   - Example: Local arXiv paper search server

2. **SseConfig**: Configuration for SSE-based MCP servers (HTTP endpoints)
   - Connects to a remote server via Server-Sent Events
   - Example: Remote Wikipedia API server

3. **MCPConfig**: Container for multiple server configurations
   - Holds all available servers in one configuration object
   - Enables dynamic server selection at runtime

## Setup and Imports

Import all necessary libraries for our multi-server research assistant.

In [None]:
import os

# Only needed for Jupyter notebooks
import nest_asyncio

from autogen import ConversableAgent, LLMConfig
from autogen.agentchat.group import AgentTarget
from autogen.agentchat.group.llm_condition import StringLLMCondition
from autogen.agentchat.group.on_condition import OnCondition
from autogen.agentchat.group.reply_result import ReplyResult
from autogen.mcp.mcp_client import MCPClientSessionManager, MCPConfig, SseConfig, StdioConfig, create_toolkit
from autogen.tools import tool

nest_asyncio.apply()

## Configure LLM

Set up the language model configuration. Make sure to set your `OPENAI_API_KEY` environment variable.

In [None]:
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

llm_config = LLMConfig(
    config_list=[
        {
            "model": "gpt-4o",
            "api_type": "openai",
            "api_key": OPENAI_API_KEY,
        }
    ],
    temperature=0.7,
)

## Configure Multiple MCP Servers

Here we define two different MCP servers with different transport protocols:

1. **ArxivServer** (stdio transport):
   - Runs as a local Python process
   - Provides tools to search and retrieve arXiv papers
   - Stores papers in `/tmp/arxiv_papers`

2. **WikipediaServer** (SSE transport):
   - Connects to an HTTP endpoint
   - Provides tools to search and retrieve Wikipedia articles
   - Must be running separately (e.g., via `python mcp/mcp_wikipedia.py sse`)

The `MCPConfig` object holds both server configurations, allowing us to dynamically select which one to use at runtime.

In [None]:
# Configure a stdio-based MCP server (local process)
ArxivServer = StdioConfig(
    command="python3",
    args=["mcp/mcp_arxiv.py", "stdio", "--storage-path", "/tmp/arxiv_papers"],
    transport="stdio",
    server_name="ArxivServer",
)

# Configure an SSE-based MCP server (HTTP endpoint)
WikipediaServer = SseConfig(
    url="http://127.0.0.1:8000/sse",
    timeout=10,
    sse_read_timeout=60,
    server_name="WikipediaServer",
)

# Create an MCPConfig with both servers
mcp_config = MCPConfig(servers=[ArxivServer, WikipediaServer])

print(f"Configured {len(mcp_config.servers)} MCP servers:")
for server in mcp_config.servers:
    print(f"  - {server.server_name}")

## Helper Function: Get Server Config

This utility function looks up a server configuration by name in our `MCPConfig`. The tool defined later in this notebook uses it to resolve the server name chosen by the LLM into a concrete `StdioConfig` or `SseConfig`.

In [None]:
def get_server_config(mcp_config: MCPConfig, server_name: str) -> StdioConfig | SseConfig:
    """
    Return the server config (StdioConfig or SseConfig) matching the given server_name.
    Args:
        mcp_config: The MCP configuration containing all servers
        server_name: Name of the server to retrieve
    Returns:
        The matching server configuration
    Raises:
        KeyError: If the server name is not found
    """
    for server in mcp_config.servers:
        if getattr(server, "server_name", None) == server_name:
            return server
    # Build the list of known names only when the lookup fails
    existing_names = [getattr(server, "server_name", None) for server in mcp_config.servers]
    raise KeyError(f"Server '{server_name}' not found in MCPConfig. Existing servers: {existing_names}")
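
The linear scan in `get_server_config` is fine for a handful of servers. As a hedged alternative, the sketch below builds a name-to-config index once and rejects duplicate names up front. `FakeServerConfig` and `build_server_index` are hypothetical names introduced for illustration so the block runs without AG2 installed; they are not part of the library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeServerConfig:
    """Hypothetical stand-in for StdioConfig/SseConfig; only the name matters here."""

    server_name: str


def build_server_index(servers: list[FakeServerConfig]) -> dict[str, FakeServerConfig]:
    """Map each server_name to its config, failing fast on duplicate names."""
    index: dict[str, FakeServerConfig] = {}
    for server in servers:
        if server.server_name in index:
            raise ValueError(f"Duplicate server name: {server.server_name!r}")
        index[server.server_name] = server
    return index


index = build_server_index([FakeServerConfig("ArxivServer"), FakeServerConfig("WikipediaServer")])
print(index["WikipediaServer"].server_name)
```

With an index like this, an unknown name raises `KeyError` from the dictionary lookup itself, and a duplicate name is caught when the index is built rather than silently shadowing a later entry.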

## Define the Research Assistant Agent

The research assistant is the main agent that users interact with. It has access to a tool that can query either the arXiv server or the Wikipedia server, depending on the user's request.

In [None]:
RESEARCH_AGENT_PROMPT = """
You are a research assistant agent.
You will provide assistance for research tasks.
You have two MCP servers to use:
1. ArxivServer: to search for papers on arXiv
2. WikipediaServer: to search for articles on Wikipedia
"""

research_assistant = ConversableAgent(
    name="research_assistant",
    system_message=RESEARCH_AGENT_PROMPT,
    llm_config=llm_config,
    human_input_mode="NEVER",
)

print("Research assistant created successfully!")

## Create the MCP Tool: Dynamic Server Connection

This tool is the core of the example: it opens an MCP session on demand, creates a temporary agent equipped with the server's tools, executes the query, and returns the result.

### How It Works

1. **Server Selection**: Takes `server_name` as a parameter (ArxivServer or WikipediaServer)
2. **Session Management**: Uses `MCPClientSessionManager().open_session()` to connect
3. **Toolkit Creation**: Converts MCP tools to AG2 format
4. **Agent Creation**: Creates a temporary agent with access to the server's tools
5. **Query Execution**: Runs the agent with the user's query
6. **Result Extraction**: Returns the last message content
7. **Cleanup**: Session automatically closes when exiting the context manager
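
The open, initialize, query, close lifecycle in the steps above can be sketched without AG2 using `contextlib.asynccontextmanager`. This is a minimal illustration under stated assumptions: `FakeSession`, `open_fake_session`, and `run_query` are hypothetical stand-ins, not the real `MCPClientSessionManager` API. The point is only that the `finally` branch runs on both normal exit and failure.

```python
import asyncio
from contextlib import asynccontextmanager

events: list[str] = []  # records the order of lifecycle steps


class FakeSession:
    """Hypothetical stand-in for an MCP client session; it only records calls."""

    async def initialize(self) -> None:
        events.append("initialize")

    async def query(self, text: str) -> str:
        events.append("query")
        if not text:
            raise ValueError("empty query")
        return f"result for {text!r}"


@asynccontextmanager
async def open_fake_session(server_name: str):
    """Open a session, yield it, and always close it on exit."""
    events.append(f"open:{server_name}")
    try:
        yield FakeSession()
    finally:
        # Runs on normal exit and when the body raises.
        events.append(f"close:{server_name}")


async def run_query(server_name: str, text: str) -> str:
    async with open_fake_session(server_name) as session:
        await session.initialize()
        return await session.query(text)


print(asyncio.run(run_query("WikipediaServer", "Alan Turing")))
print(events)
```

Calling `run_query` with an empty string raises `ValueError`, yet the close event is still recorded, which is the behavior step 7 relies on.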

### Agent Handoffs

The tool also registers a handoff back to the research assistant, so control returns to it once the MCP agent has finished its task.

In [None]:
TOOL_PROMPT = """
You are an MCP server tool.
Your purpose is to identify the correct server to execute based on the user's query.

Inputs:
- query: (actual user query)
- server_name: (name of the server to execute)

You have two MCP servers to use:
1. ArxivServer: to search for papers on arXiv
2. WikipediaServer: to search for articles on Wikipedia

NOTE:
- Strictly return only the server name for server_name param (e.g., ArxivServer)
- TERMINATE after response from the server
"""


@tool(description=TOOL_PROMPT)
async def run_mcp_agent_to_client(query: str, server_name: str) -> ReplyResult:
    """
    Execute a query on the specified MCP server.

    This tool:
    1. Opens a session to the specified MCP server
    2. Creates a toolkit from available MCP tools
    3. Creates a temporary agent with those tools
    4. Executes the query
    5. Returns the result
    """
    # Get the server configuration by name
    server = get_server_config(mcp_config, server_name)

    # Create a session manager and open a session
    async with MCPClientSessionManager().open_session(server) as session:
        # Initialize the session
        await session.initialize()

        # List the server's tools; the listing is appended to the query below as a hint
        agent_tool_prompt = await session.list_tools()

        # Create a toolkit from the session
        toolkit = await create_toolkit(session=session)

        # Create a temporary agent for this server
        agent = ConversableAgent(
            name="mcp_agent",
            system_message=f"You are an agent with access to {server_name} tools. Use them to answer queries.",
            llm_config=llm_config,
            human_input_mode="NEVER",
        )

        # Register MCP tools with the agent
        toolkit.register_for_llm(agent)
        toolkit.register_for_execution(agent)

        # Configure handoff back to research assistant
        agent.handoffs.add_llm_conditions([
            OnCondition(
                target=AgentTarget(research_assistant),
                condition=StringLLMCondition(prompt="The research paper ids are fetched."),
            ),
        ])

        # Execute the query using the MCP tools
        result = await agent.a_run(
            message=query + " Use the following tools to answer the question: " + str(agent_tool_prompt),
            tools=toolkit.tools,
            max_turns=5,
        )

        # Process results
        res = await result.process()
        last_message = await res.last_message()

        # Return as ReplyResult with handoff to research assistant
        return ReplyResult(message=str(last_message["content"][-1]), target_agent=AgentTarget(research_assistant))


print("Tool 'run_mcp_agent_to_client' defined successfully!")
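
The tool above indexes `last_message["content"][-1]`, which selects the final element when `content` is a list but only the final character when it is a plain string. Which shape arrives may depend on the AG2 version and model client, so the helper below is a defensive sketch under that assumption. `extract_text` is a hypothetical name, not an AG2 function, and the `{"type": "text", "text": ...}` part layout is an assumed convention.

```python
from typing import Any


def extract_text(content: Any) -> str:
    """Return readable text from message content of unknown shape (assumed shapes only)."""
    if content is None:
        return ""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        if not content:
            return ""
        last = content[-1]
        # Assumed convention: multimodal parts look like {"type": "text", "text": "..."}
        if isinstance(last, dict) and "text" in last:
            return str(last["text"])
        return str(last)
    return str(content)


print(extract_text("three papers found"))
print(extract_text([{"type": "text", "text": "three papers found"}]))
```

If the content shape in your environment is known and fixed, the original indexing can stay as it is; the helper is only useful when the shape is uncertain.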

## Run the Research Assistant

Now let's use the research assistant to query Wikipedia. The workflow is:

1. The research assistant receives the user's query.
2. It calls the `run_mcp_agent_to_client` tool with the appropriate server name.
3. The tool opens a session to WikipediaServer.
4. The temporary MCP agent executes the query using Wikipedia's MCP tools.
5. The tool returns the results to the research assistant.

### Expected Workflow

```
User Query → Research Assistant → run_mcp_agent_to_client(query, "WikipediaServer")
                ↑                                          ↓
                └──────────── Results ←─────── MCP Agent (with Wikipedia tools)
```

> **Note**: Make sure the Wikipedia SSE server is running at `http://127.0.0.1:8000/sse` before executing this cell. You can start it with:
> ```bash
> python mcp/mcp_wikipedia.py sse --port 8000
> ```

In [None]:
# Run the research assistant with a query
result = research_assistant.run(
    message="Get me the latest news from Wikipedia using WikipediaServer",
    tools=[run_mcp_agent_to_client],
    max_turns=2,
).process()

print("\nResearch assistant completed!")

## Alternative: Query arXiv Papers

Let's query the arXiv server instead. The same tool is reused unchanged; only the `server_name` argument chosen by the LLM differs.

The arXiv server uses stdio transport (a local process) while the Wikipedia server uses SSE (HTTP), but `MCPClientSessionManager` handles both through the same `open_session` call.

In [None]:
# Query ArXiv for research papers
result_arxiv = research_assistant.run(
    message="Search for recent papers about large language models on ArxivServer",
    tools=[run_mcp_agent_to_client],
    max_turns=2,
).process()

print("\nArXiv query completed!")

## Key Takeaways

### Why Use MCPClientSessionManager This Way?

This pattern of opening sessions inside tools provides several advantages:

1. **Resource Efficiency**: Sessions are only opened when needed and automatically closed after use
2. **Dynamic Server Selection**: The agent can choose which server to use based on the query
3. **Isolation**: Each query gets a fresh session, preventing state pollution
4. **Flexibility**: Easy to add new servers without changing the core workflow
5. **Failure Isolation**: Because each call opens its own session, a failure on one server does not affect sessions to the others
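
The isolation point in the list above can be illustrated with `asyncio.gather(..., return_exceptions=True)`. This is a sketch under stated assumptions: `query_server` and `query_all` are hypothetical functions, and the Wikipedia failure is simulated rather than produced by a real MCP connection.

```python
import asyncio


async def query_server(server_name: str, query: str) -> str:
    """Hypothetical per-server call; WikipediaServer is simulated as unreachable."""
    await asyncio.sleep(0)  # yield control, as real I/O would
    if server_name == "WikipediaServer":
        raise ConnectionError("SSE endpoint unreachable")
    return f"{server_name}: results for {query!r}"


async def query_all(server_names: list[str], query: str) -> dict[str, str]:
    """Query every server concurrently and report failures per server."""
    outcomes = await asyncio.gather(
        *(query_server(name, query) for name in server_names),
        return_exceptions=True,  # return exceptions as results instead of raising the first one
    )
    report: dict[str, str] = {}
    for name, outcome in zip(server_names, outcomes):
        if isinstance(outcome, Exception):
            report[name] = f"failed: {outcome}"
        else:
            report[name] = outcome
    return report


report = asyncio.run(query_all(["ArxivServer", "WikipediaServer"], "transformers"))
for name, status in report.items():
    print(f"{name} -> {status}")
```

The arXiv entry carries its result while the Wikipedia entry carries the error text, so a caller can decide per server whether to retry, skip, or report.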

### The Tool-Based Pattern

By wrapping `MCPClientSessionManager` in a tool:
- The LLM decides which server to use
- Sessions are managed automatically
- The same pattern works for any number of servers
- Easy to extend with more server types

### Best Practices

1. **Use context managers**: Always use `async with` for proper cleanup
2. **Initialize sessions**: Call `await session.initialize()` after opening
3. **Unique server names**: Ensure each server has a distinct name in `MCPConfig`
4. **Error handling**: Add try-except blocks for production use
5. **Timeout configuration**: Set appropriate timeouts for SSE servers
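
Practices 4 and 5 can be combined by bounding a whole server call with `asyncio.wait_for` and converting an overrun into a readable message. The sketch below is illustrative only: `slow_server_call` and `call_with_timeout` are hypothetical, and this outer timeout is separate from the `timeout` and `sse_read_timeout` fields set on `SseConfig`.

```python
import asyncio


async def slow_server_call(delay: float) -> str:
    """Hypothetical stand-in for an MCP query that takes `delay` seconds."""
    await asyncio.sleep(delay)
    return "ok"


async def call_with_timeout(delay: float, timeout: float) -> str:
    """Return the result, or a readable error string if the call overruns."""
    try:
        return await asyncio.wait_for(slow_server_call(delay), timeout=timeout)
    except asyncio.TimeoutError:
        return f"timed out after {timeout}s"


print(asyncio.run(call_with_timeout(delay=0.01, timeout=1.0)))
print(asyncio.run(call_with_timeout(delay=1.0, timeout=0.05)))
```

Returning a string instead of raising keeps the tool's reply well formed for the calling agent; in other designs, re-raising may be the better choice.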

### Comparison with Traditional Approach

**Traditional (keep session open):**
```python
async with stdio_client(...) as (read, write):
    async with ClientSession(read, write) as session:
        # Do all work here
        # Session stays open entire time
```

**MCPClientSessionManager (on-demand):**
```python
async with MCPClientSessionManager().open_session(config) as session:
    # Do specific work
    # Session closes automatically
```

The `MCPClientSessionManager` pattern is ideal for:
- Multi-server workflows
- Long-running applications
- Dynamic server selection
- Resource-constrained environments

### Next Steps

- Add more MCP servers to `MCPConfig` (filesystem, database, APIs, etc.)
- Implement intelligent routing logic in the research assistant
- Add error handling and retry mechanisms
- Create specialized agents for different server types
- Build a server health check tool
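
As a starting point for the retry item in the list above, here is a minimal exponential-backoff sketch. `retry_async` and `flaky_query` are hypothetical names, the delays are kept tiny so the example finishes quickly, and retrying only on `ConnectionError` is an assumption about which failures are transient.

```python
import asyncio
from typing import Awaitable, Callable


async def retry_async(
    operation: Callable[[], Awaitable[str]],
    attempts: int = 3,
    base_delay: float = 0.01,
) -> str:
    """Retry `operation` on ConnectionError, doubling the delay after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await operation()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")  # the loop always returns or raises


calls = {"count": 0}


async def flaky_query() -> str:
    """Hypothetical query that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("server not ready")
    return "papers fetched"


print(asyncio.run(retry_async(flaky_query)))
```

In the notebook's tool, the retried operation would be the body that opens a session and runs the query, so each attempt starts from a fresh session.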