@dbschmigelski dbschmigelski commented Sep 19, 2025

Description

The implementation deviates slightly from the design proposed below. I am leaving the original proposal in place to make the history of the PR easier to follow.

Rather than having separate MCPToolProvider and MCPClient classes, MCPClient itself implements ToolProvider. This removes the need to wrap the client every time, as in MCPToolProvider(mcp_client), and allows a very clean syntax:

Agent(tools=[mcp_client])

This is what customers have wanted all along.


MCP Developer Experience Design Document

The Story Behind Our MCP Integration

When we first shipped MCP support, we knew the developer experience wasn't perfect, but we had chosen the idiomatic Python approach. Users would leverage context managers to connect to MCP servers, load tools, and build powerful agents. We expected some friction, but then the support tickets started rolling in faster than anticipated.

Customers kept getting tripped up by the context manager syntax. The most common mistake looked perfectly reasonable at first glance:

with mcp_client:
    agent = Agent(tools=mcp_client.list_tools_sync())
response = agent("Your prompt")  # Will fail with MCPClientInitializationError

What happens here? The MCP connection closes when you exit the with block, but the agent still needs it to execute tools. This pattern appeared repeatedly in issues #96 and #184. We tried to help by adding clearer exception messaging in PR #175, but this wasn't enough and certainly not the seamless developer experience Strands users deserve.
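The failure mode can be reproduced without Strands. The sketch below is only an illustration: FakeMCPClient and ConnectionClosedError are hypothetical stand-ins, not the real MCPClient or MCPClientInitializationError. It shows a tool call succeeding inside the with block and failing once the block has exited.

```python
class ConnectionClosedError(RuntimeError):
    """Hypothetical stand-in for MCPClientInitializationError."""


class FakeMCPClient:
    """Toy client (not the Strands API): tools work only inside `with`."""

    def __init__(self) -> None:
        self.connected = False

    def __enter__(self) -> "FakeMCPClient":
        self.connected = True  # connection opens on entry
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self.connected = False  # connection closes on exit

    def call_tool(self, name: str) -> str:
        if not self.connected:
            raise ConnectionClosedError(f"cannot call {name!r}: session is closed")
        return f"{name}: ok"


client = FakeMCPClient()

# Works: the tool call happens while the connection is open.
with client:
    inside = client.call_tool("echo")

# Fails: the connection closed when the `with` block exited.
try:
    client.call_tool("echo")
    outside = "ok"
except ConnectionClosedError as err:
    outside = f"failed: {err}"
```

With the current API, the fix is to keep every agent call inside the with block, which is exactly the burden this proposal tries to remove.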

Two other pain points emerged from the community. Issue #481 highlighted the need for tool filtering - users wanted to cherry-pick specific tools from MCP servers rather than importing everything. Issue #715 revealed that when multiple servers provided tools with identical names, Strands would silently override one with the other, leading to confusing debugging sessions.
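The name-collision problem from issue #715 can be illustrated with a plain dictionary registry. This is a minimal sketch under stated assumptions: register_naive and register_prefixed are hypothetical helpers, and the "<server>_<tool>" prefix scheme is one possible disambiguation strategy, not necessarily the one Strands uses.

```python
def register_naive(servers: dict[str, list[str]]) -> dict[str, str]:
    """Map tool name -> server; a later server silently replaces an earlier one."""
    registry: dict[str, str] = {}
    for server, tool_names in servers.items():
        for tool in tool_names:
            registry[tool] = server  # no collision check
    return registry


def register_prefixed(servers: dict[str, list[str]]) -> dict[str, str]:
    """Map '<server>_<tool>' -> server so identical tool names stay distinct."""
    registry: dict[str, str] = {}
    for server, tool_names in servers.items():
        for tool in tool_names:
            registry[f"{server}_{tool}"] = server
    return registry


# Two hypothetical servers that both expose a tool called "search".
servers = {"local": ["search", "echo"], "remote": ["search"]}
naive = register_naive(servers)
prefixed = register_prefixed(servers)
```

In the naive registry the local "search" tool disappears without any error, while the prefixed registry keeps all three tools addressable.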

These issues pointed to three fundamental requirements we needed to address: we had to offload context management from users as much as possible, build extensibility for future config-driven setups, and enable deeper integrations like filtering and disambiguation.

The Path Forward: Two Approaches

We considered two fundamentally different approaches to solve these problems.

Option 1: Direct MCP Parameter

The first is the most direct: add a new mcp parameter to the Agent constructor. This approach recognizes MCP's growing importance in the AI agent ecosystem by giving it first-class status alongside other core Agent parameters like model and tools. For agent builders, this elevation signals that MCP isn't just another tool source - it's a fundamental building block for modern AI applications.

This would give users a clean, declarative way to specify their MCP setup:

agent = Agent(
    mcp={
        "clients": [
            MCPClient(lambda: streamablehttp_client(...)),
            MCPClient(lambda: stdio_client(StdioServerParameters(...))),
        ],
        "configs": [
            "./path-to-config",
            {
                "mcpServers": {
                    "find-a-domain": {
                        "type": "http",
                        "url": "https://api.findadomain.dev/mcp",
                    }
                }
            },
        ],
    }
)

The appeal of this approach goes beyond syntax. By elevating MCP to the Agent constructor level, we acknowledge its role as a critical infrastructure component. Agent builders working with multiple MCP servers could configure everything in one place, making their code more readable and maintainable. The declarative nature also opens doors for configuration-driven agent setups, where MCP connections could be specified in YAML files or environment variables.

Option 2: ToolProvider Pattern (Recommended)

The broader option we'd been considering since before the beta release is the ToolProvider pattern. This approach treats MCP servers as just another type of tool source, fitting naturally into our existing architecture while providing powerful lifecycle management.

The ToolProvider interface is elegantly simple:

from abc import ABC, abstractmethod
from typing import Sequence

from strands.types.tools import AgentTool


class ToolProvider(ABC):
    """Interface for providing tools with lifecycle management."""

    @abstractmethod
    async def load_tools(self) -> Sequence[AgentTool]:
        """Load and return the tools in this provider."""

    @abstractmethod
    async def cleanup(self) -> None:
        """Clean up resources used by the tools in this provider."""

Using an MCPToolProvider would look like:

mcp_client = MCPClient(
    lambda: stdio_client(StdioServerParameters(...))
)

mcp_provider = MCPToolProvider(
    client=mcp_client, 
    tool_filters={"allowed": ["echo", "calculator"]}, 
    disambiguator="local_tools"
)

agent = Agent(tools=[mcp_provider])

The ToolProvider pattern offers several architectural advantages that expand our customers' toolkit of capabilities. It provides a consistent interface for any tool source that requires lifecycle management, not just MCP servers. This extensibility means future tool sources like database connections, API clients, or file system watchers could all implement the same pattern, giving customers a unified way to integrate any resource-intensive tool into their agents. The Agent can manage all ToolProvider lifecycles uniformly, eliminating the need for users to understand the specific cleanup requirements of each tool source while unlocking powerful new integration possibilities.
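To make the "uniform lifecycle" point concrete, here is a minimal sketch of a caller that treats every provider the same way. It is not the Strands Agent: Tool, FakeConnectionProvider, and run_session are hypothetical names, and the sketch cleans up at the end of one session purely to show the uniform handling. When cleanup should actually happen is a separate question.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Callable, Sequence

Tool = Callable[[str], str]  # stand-in for AgentTool (assumption)


class ToolProvider(ABC):
    @abstractmethod
    async def load_tools(self) -> Sequence[Tool]: ...

    @abstractmethod
    async def cleanup(self) -> None: ...


class FakeConnectionProvider(ToolProvider):
    """Hypothetical provider that owns a resource needing cleanup."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.open = False

    async def load_tools(self) -> Sequence[Tool]:
        self.open = True  # e.g. connect to a server
        return [lambda text: f"{self.name}:{text}"]

    async def cleanup(self) -> None:
        self.open = False  # e.g. close the connection


async def run_session(providers: Sequence[ToolProvider], prompt: str) -> list[str]:
    """Load every provider's tools, call them, then clean up all providers."""
    tools: list[Tool] = []
    try:
        for provider in providers:
            tools.extend(await provider.load_tools())
        return [tool(prompt) for tool in tools]
    finally:
        # The caller needs no provider-specific knowledge to release resources.
        for provider in providers:
            await provider.cleanup()


providers = [FakeConnectionProvider("db"), FakeConnectionProvider("fs")]
results = asyncio.run(run_session(providers, "ping"))
```

The caller never inspects what kind of resource a provider holds, which is the property that would let database connections or API clients plug in alongside MCP servers.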

Option 3: MCPToolProvider under the hood

Considering these approaches, a third option emerged. What if there didn't have to be a choice? The ToolProvider architecture could be implemented first, with the direct mcp parameter added later as syntactic sugar:

# Future possibility
agent = Agent(
    mcp={"clients": [client1, client2], "filters": {...}},  # Maps to MCPToolProviders internally
    tools=[other_tools]  # Direct tools
)

This hybrid approach provides architectural flexibility while preserving the option for simpler syntax. The key insight is that these approaches aren't mutually exclusive - the direct parameter could internally create MCPToolProvider instances.
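One way the direct parameter could be reduced to providers internally is sketched below. Everything here is an assumption for illustration: FakeProvider stands in for MCPToolProvider, expand_tools is a hypothetical helper, and the "clients" and "filters" keys simply mirror the example above.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class FakeProvider:
    """Hypothetical stand-in for MCPToolProvider."""

    client: Any
    tool_filters: dict = field(default_factory=dict)


def expand_tools(mcp: Optional[dict], tools: Optional[list]) -> list:
    """Turn the `mcp` shorthand into providers and append them to direct tools."""
    mcp = mcp or {}
    filters = mcp.get("filters", {})
    providers = [
        FakeProvider(client=client, tool_filters=filters)
        for client in mcp.get("clients", [])
    ]
    return list(tools or []) + providers


expanded = expand_tools(
    mcp={"clients": ["client1", "client2"], "filters": {"allowed": ["echo"]}},
    tools=["other_tool"],
)
```

After expansion the rest of the agent would only ever see a single list of tools and providers, so the shorthand adds no second code path.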

The decision to add the direct mcp parameter would be driven by community feedback, not assumptions. Starting with MCPToolProvider validates core functionality first, then adds syntactic sugar only if users consistently request it. This prevents API bloat and ensures we build what users actually need.

The Lifecycle Management Challenge

All approaches share a fundamental challenge: when should MCP connections be cleaned up? Unlike simple function calls, agents are designed for multi-turn conversations where the same agent instance handles multiple interactions over time. This means cleanup can't happen when agent("prompt") returns - the agent might be used again moments later.

Strands currently has no explicit lifecycle management. There's no agent.cleanup() method or built-in context management to signal when an agent is finished. This forces us into an uncomfortable choice:

  1. Rely on __del__: Python's garbage collector will eventually call __del__ on MCPToolProvider instances, triggering cleanup. But __del__ isn't guaranteed to be called promptly (or at all), making this both bad practice and potentially dangerous for resource management.

  2. Add explicit cleanup: Introduce an agent.cleanup() method that users must remember to call. This shifts the burden back to developers and reintroduces the manual lifecycle management we're trying to eliminate.

  3. Make Agent a context manager: Force users to use with Agent(...) as agent: syntax. While idiomatic Python and helpful for creating a single context manager for the entire agent lifecycle, this creates friction for the simple agent = Agent(); agent("prompt") pattern that most users expect and doesn't significantly improve the developer experience.

The proposal takes a "belt and braces" approach: automatic cleanup through __del__ with warnings to guide users toward explicit cleanup when needed. Not perfect, but pragmatic given Strands' current architecture.
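The "belt and braces" idea can be sketched in a few lines. ManagedResource is a hypothetical class, not the Strands implementation: explicit cleanup() is the preferred path, and __del__ acts as a fallback that warns when it has to do the work. The demonstration relies on CPython's reference counting to finalize the object immediately; other runtimes may delay or skip __del__, which is precisely the weakness noted above.

```python
import warnings


class ManagedResource:
    """Hypothetical resource with explicit cleanup and a __del__ fallback."""

    def __init__(self) -> None:
        self.closed = False

    def cleanup(self) -> None:
        """Preferred path: explicit, idempotent cleanup."""
        self.closed = True

    def __del__(self) -> None:
        # Fallback path: runs only if and when the object is collected.
        if not self.closed:
            warnings.warn(
                "ManagedResource was not cleaned up explicitly; call cleanup()",
                ResourceWarning,
            )
            self.cleanup()


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # ResourceWarning is ignored by default

    leaked = ManagedResource()
    del leaked  # no explicit cleanup: the fallback warns

    explicit = ManagedResource()
    explicit.cleanup()
    del explicit  # already cleaned up: no warning

messages = [str(w.message) for w in caught]
```

Only the resource that skipped cleanup() produces a warning, which is the nudge toward explicit cleanup that the proposal describes.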

The Proposal: Start with ToolProvider, Keep Options Open

The proposal is to implement Option 2 (ToolProvider Pattern) first, but with a crucial caveat: ship it as experimental to gather real user feedback before committing to the API.

Here's what we would build:

class MCPToolProvider(ToolProvider):
    def __init__(self, *, client: MCPClient, tool_filters: Optional[dict] = None, disambiguator: Optional[str] = None) -> None: ...
    async def load_tools(self) -> list[AgentTool]: ...
    async def cleanup(self) -> None: ...

The implementation would include tool filtering with string, regex, and callable patterns, automatic name disambiguation to prevent conflicts, and our "belt and braces" cleanup strategy.
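As a rough sketch of what filtering with string, regex, and callable patterns could look like, the helper below accepts all three rule types in one allow-list. The names matches and filter_tools are hypothetical, and the matching semantics (exact match for strings, search for regexes, truthiness for callables) are assumptions rather than the shipped behavior.

```python
import re
from typing import Callable, Iterable, Union

ToolFilter = Union[str, re.Pattern, Callable[[str], bool]]


def matches(name: str, rule: ToolFilter) -> bool:
    """Return True if a tool name satisfies one filter rule."""
    if isinstance(rule, str):
        return name == rule  # plain string: exact match
    if isinstance(rule, re.Pattern):
        return rule.search(name) is not None  # compiled regex: search
    return bool(rule(name))  # callable: user-defined predicate


def filter_tools(names: Iterable[str], allowed: Iterable[ToolFilter]) -> list[str]:
    """Keep the names that match at least one rule, preserving input order."""
    rules = list(allowed)
    return [name for name in names if any(matches(name, rule) for rule in rules)]


selected = filter_tools(
    ["echo", "calculator", "fs_read", "fs_write", "shell"],
    allowed=["echo", re.compile(r"^fs_"), lambda name: name.endswith("tor")],
)
```

Here "echo" matches the string rule, the two fs_ tools match the regex, "calculator" matches the callable, and "shell" is dropped.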

Just as important is what we propose not to build initially. We would hold back on the direct mcp parameter to avoid premature API commitment. We would reject making Agent a context manager based on user friction concerns.

Testing Our Assumptions

We propose shipping MCPToolProvider in strands.experimental.tools.mcp with a clear timeline: gather feedback for 2-3 weeks, then decide whether to promote it to the main API. This experimental period would let us validate our lifecycle management approach, assess whether users actually want the direct mcp parameter, and refine the filtering capabilities based on real usage patterns.

The beauty of this approach is that we wouldn't close any doors. If users love the ToolProvider pattern, we'll promote it. If they clamor for the simpler syntax, we can add the direct parameter that internally uses ToolProviders. We would learn from real usage before making irreversible API decisions.

Related Issues

#198
#481
#731

Documentation PR

To be created after approval

Type of Change

New feature

Testing

How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@dbschmigelski dbschmigelski marked this pull request as ready for review September 19, 2025 14:38
@dbschmigelski dbschmigelski changed the title from "feat(mcp): add experimental agent managed connection support" to "feat(mcp): add experimental agent managed connection via ToolProvider" Sep 25, 2025
@signoredems
Contributor

@dbschmigelski Any updates on this?

codecov bot commented Oct 8, 2025

Codecov Report

❌ Patch coverage is 99.42529% with 1 line in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
src/strands/tools/mcp/mcp_client.py | 98.94% | 0 Missing and 1 partial ⚠️

@dbschmigelski
Member Author

I added "src/strands/experimental/tools/mcp/mcp_tool_provider.py" to the ignore list, but I assume Codecov doesn't take that into account yet because the file isn't merged.

zastrowm previously approved these changes Oct 9, 2025