feat(mcp): add experimental agent managed connection via ToolProvider #895
Conversation
@dbschmigelski Any updates on this?
Codecov Report❌ Patch coverage is
Added "src/strands/experimental/tools/mcp/mcp_tool_provider.py" to the ignore list, but I'm assuming Codecov doesn't take it into account yet since the file isn't merged.
Description
There was a slight deviation from the design proposed in this description. The original text is left in place below to make the history of the PR easier to follow.
Rather than having both MCPToolProvider and MCPClient, MCPClient itself will implement ToolProvider. This removes the need to wrap the client every time, as in MCPToolProvider(mcp_client), and allows the very clean syntax
Agent(tools=[mcp_client])
which is what customers have wanted all along.
MCP Developer Experience Design Document
The Story Behind Our MCP Integration
When we first shipped MCP support, we knew the developer experience wasn't perfect, but we had chosen the idiomatic Python approach. Users would leverage context managers to connect to MCP servers, load tools, and build powerful agents. We expected some friction, but then the support tickets started rolling in faster than anticipated.
Customers kept getting tripped up by the context manager syntax. The most common mistake looked perfectly reasonable at first glance:
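As a minimal sketch of this mistake, using hypothetical stand-in classes (FakeMCPClient and MiniAgent) rather than the real Strands or MCP APIs:

```python
class FakeMCPClient:
    """Hypothetical stand-in for an MCP client whose session lives inside a with block."""

    def __init__(self):
        self.connected = False

    def __enter__(self):
        self.connected = True
        return self

    def __exit__(self, exc_type, exc, tb):
        self.connected = False  # the session closes as soon as the block exits
        return False

    def list_tools(self):
        # Each tool keeps a reference to the client and needs a live session to run.
        return [self._echo]

    def _echo(self, text):
        if not self.connected:
            raise RuntimeError("MCP session is closed")
        return f"echo: {text}"


class MiniAgent:
    """Hypothetical stand-in for an agent that calls its first tool on every prompt."""

    def __init__(self, tools):
        self.tools = list(tools)

    def __call__(self, prompt):
        return self.tools[0](prompt)


client = FakeMCPClient()
with client:
    agent = MiniAgent(tools=client.list_tools())
    inside = agent("hello")  # works: the session is still open

# The agent outlives the with block, but the connection does not.
try:
    agent("hello again")
    outside_error = None
except RuntimeError as err:
    outside_error = str(err)
```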
What happens here? The MCP connection closes when you exit the with block, but the agent still needs it to execute tools. This pattern appeared repeatedly in issues #96 and #184. We tried to help by adding clearer exception messaging in PR #175, but this wasn't enough and certainly not the seamless developer experience Strands users deserve.
Two other pain points emerged from the community. Issue #481 highlighted the need for tool filtering - users wanted to cherry-pick specific tools from MCP servers rather than importing everything. Issue #715 revealed that when multiple servers provided tools with identical names, Strands would silently override one with the other, leading to confusing debugging sessions.
These issues pointed to three fundamental requirements we needed to address: we had to offload context management from users as much as possible, build extensibility for future config-driven setups, and enable deeper integrations like filtering and disambiguation.
The Path Forward: Two Approaches
We considered two fundamentally different approaches to solve these problems.
Option 1: Direct MCP Parameter
The first is the most direct: add a new mcp parameter to the Agent constructor. This approach recognizes MCP's growing importance in the AI agent ecosystem by giving it first-class status alongside other core Agent parameters like model and tools. For agent builders, this elevation signals that MCP isn't just another tool source - it's a fundamental building block for modern AI applications.
This would give users a clean, declarative way to specify their MCP setup:
The appeal of this approach goes beyond syntax. By elevating MCP to the Agent constructor level, we acknowledge its role as a critical infrastructure component. Agent builders working with multiple MCP servers could configure everything in one place, making their code more readable and maintainable. The declarative nature also opens doors for configuration-driven agent setups, where MCP connections could be specified in YAML files or environment variables.
Option 2: ToolProvider Pattern (Recommended)
The broader option we'd been considering since before the beta release is the ToolProvider pattern. This approach treats MCP servers as just another type of tool source, fitting naturally into our existing architecture while providing powerful lifecycle management.
The ToolProvider interface is elegantly simple:
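As a sketch, the interface can be written as an abstract base class. The two method names used here, load_tools and cleanup, are assumptions made for illustration, and StaticToolProvider is a hypothetical provider included only to exercise them; the interface that ships may differ.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Sequence


class ToolProvider(ABC):
    """Sketch of a tool source whose lifecycle is managed by the agent.

    The method names are assumptions, not a confirmed Strands API.
    """

    @abstractmethod
    async def load_tools(self) -> Sequence[Any]:
        """Acquire any resources needed and return the tools to register."""

    @abstractmethod
    async def cleanup(self) -> None:
        """Release whatever load_tools acquired."""


class StaticToolProvider(ToolProvider):
    """Trivial hypothetical provider, included only to exercise the interface."""

    def __init__(self, tools: Sequence[Any]) -> None:
        self._tools = list(tools)
        self.cleaned_up = False

    async def load_tools(self) -> Sequence[Any]:
        return self._tools

    async def cleanup(self) -> None:
        self.cleaned_up = True


provider = StaticToolProvider(tools=[len, sorted])
loaded = asyncio.run(provider.load_tools())
asyncio.run(provider.cleanup())
```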
The MCPToolProvider implementation would look like:
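A hedged sketch of such a provider follows. It duck-types the two assumed methods (load_tools and cleanup) instead of subclassing, so that the block runs on its own, and FakeMCPClient is a hypothetical stand-in with start, stop, and list_tools methods, not the real client.

```python
import asyncio


class FakeMCPClient:
    """Hypothetical stand-in for an MCP client; not the real Strands MCPClient."""

    def __init__(self, tool_names):
        self._tool_names = list(tool_names)
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False

    def list_tools(self):
        if not self.running:
            raise RuntimeError("client is not connected")
        return list(self._tool_names)


class MCPToolProviderSketch:
    """Owns the client connection so callers never write a with block."""

    def __init__(self, client):
        self._client = client
        self._tools = None  # cached after the first load

    async def load_tools(self):
        # Connect lazily, on first use, and keep the session open afterwards.
        if self._tools is None:
            self._client.start()
            self._tools = self._client.list_tools()
        return self._tools

    async def cleanup(self):
        # Safe to call more than once.
        if self._client.running:
            self._client.stop()
        self._tools = None


async def demo():
    client = FakeMCPClient(["search", "fetch"])
    provider = MCPToolProviderSketch(client)
    tools = await provider.load_tools()
    connected_while_in_use = client.running
    await provider.cleanup()
    return tools, connected_while_in_use, client.running


tools, connected_while_in_use, connected_after_cleanup = asyncio.run(demo())
```

The point of the design is that the connection stays open between load_tools and cleanup, which is exactly the window in which the agent needs it.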
The ToolProvider pattern offers several architectural advantages that expand our customers' toolkit of capabilities. It provides a consistent interface for any tool source that requires lifecycle management, not just MCP servers. This extensibility means future tool sources like database connections, API clients, or file system watchers could all implement the same pattern, giving customers a unified way to integrate any resource-intensive tool into their agents. The Agent can manage all ToolProvider lifecycles uniformly, eliminating the need for users to understand the specific cleanup requirements of each tool source while unlocking powerful new integration possibilities.
Option 3: MCPToolProvider under the hood
Considering these approaches, a third option emerged. What if there didn't have to be a choice? The ToolProvider architecture could be implemented first, with the direct mcp parameter added later as syntactic sugar:
This hybrid approach provides architectural flexibility while preserving the option for simpler syntax. The key insight is that these approaches aren't mutually exclusive - the direct parameter could internally create MCPToolProvider instances.
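One way to picture the hybrid: the constructor wraps every entry passed via mcp in a provider and appends it to tools. Everything in this sketch is hypothetical (MiniAgent, ProviderStub, and the mcp keyword itself); it shows only the desugaring step, not a real Agent.

```python
class ProviderStub:
    """Hypothetical stand-in for MCPToolProvider; it only records what it wraps."""

    def __init__(self, client):
        self.client = client


class MiniAgent:
    """Hypothetical agent showing how an mcp argument could desugar into tools."""

    def __init__(self, tools=None, mcp=None):
        self.tools = list(tools or [])
        # Syntactic sugar: each mcp entry becomes a provider in the tools list,
        # so both spellings share one code path.
        for client in mcp or []:
            self.tools.append(ProviderStub(client))


explicit = MiniAgent(tools=[ProviderStub("server-a")])
sugared = MiniAgent(mcp=["server-a"])
```

Both agents end up with an equivalent tools list, which is why the parameter could be added later without changing the underlying architecture.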
The decision to add the direct mcp parameter would be driven by community feedback, not assumptions. Starting with MCPToolProvider validates core functionality first, then adds syntactic sugar only if users consistently request it. This prevents API bloat and ensures we build what users actually need.
The Lifecycle Management Challenge
All approaches share a fundamental challenge: when should MCP connections be cleaned up? Unlike simple function calls, agents are designed for multi-turn conversations where the same agent instance handles multiple interactions over time. This means cleanup can't happen when agent("prompt") returns - the agent might be used again moments later.
Strands currently has no explicit lifecycle management. There's no agent.cleanup() method or built-in context management to signal when an agent is finished. This forces us into an uncomfortable choice:
Rely on __del__: Python's garbage collector will eventually call __del__ on MCPToolProvider instances, triggering cleanup. But __del__ isn't guaranteed to be called promptly (or at all), making this both bad practice and potentially dangerous for resource management.
Add explicit cleanup: Introduce an agent.cleanup() method that users must remember to call. This shifts the burden back to developers and reintroduces the manual lifecycle management we're trying to eliminate.
Make Agent a context manager: Force users to use with Agent(...) as agent: syntax. While idiomatic Python and helpful for creating a single context manager for the entire agent lifecycle, this creates friction for the simple agent = Agent(); agent("prompt") pattern that most users expect and doesn't significantly improve the developer experience.
The proposal takes a "belt and braces" approach: automatic cleanup through __del__ with warnings to guide users toward explicit cleanup when needed. Not perfect, but pragmatic given Strands' current architecture.
The Proposal: Start with ToolProvider, Keep Options Open
The proposal is to implement Option 2 (ToolProvider Pattern) first, but with a crucial caveat: ship it as experimental to gather real user feedback before committing to the API.
Here's what we would build:
The implementation would include tool filtering with string, regex, and callable patterns, automatic name disambiguation to prevent conflicts, and our "belt and braces" cleanup strategy.
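These three pieces can be sketched together. The names below (matches, FilteredProvider, allowed, prefix), the exact-match rule for string filters, and the use of a name prefix for disambiguation are all assumptions; the text above says only that filters accept strings, regexes, and callables and that name conflicts are prevented automatically.

```python
import re
import warnings


def matches(name, pattern):
    """Return True if a tool name satisfies a string, regex, or callable filter."""
    if isinstance(pattern, str):
        return name == pattern  # assumption: plain strings match exactly
    if isinstance(pattern, re.Pattern):
        return pattern.search(name) is not None
    if callable(pattern):
        return bool(pattern(name))
    raise TypeError(f"unsupported filter: {pattern!r}")


class FilteredProvider:
    """Hypothetical provider combining filtering, prefixing, and cleanup."""

    def __init__(self, tool_names, allowed=None, prefix=None):
        self._tool_names = list(tool_names)
        self._allowed = allowed  # None means "keep everything"
        self._prefix = prefix
        self._open = True  # pretend a connection was opened

    def load_tools(self):
        kept = [
            n for n in self._tool_names
            if self._allowed is None or any(matches(n, p) for p in self._allowed)
        ]
        # Prefixing keeps same-named tools from different servers distinct.
        return [f"{self._prefix}_{n}" if self._prefix else n for n in kept]

    def cleanup(self):
        self._open = False  # explicit, deterministic cleanup path

    def __del__(self):
        # Fallback path: clean up at finalization and warn, nudging callers
        # toward calling cleanup() themselves.
        if getattr(self, "_open", False):
            warnings.warn("provider was never cleaned up", ResourceWarning)
            self._open = False


names = ["search", "fetch_page", "fetch_file", "delete"]
docs = FilteredProvider(names, allowed=["search", re.compile(r"^fetch_")], prefix="docs")
wiki = FilteredProvider(names, allowed=[lambda n: n != "delete"], prefix="wiki")
docs_tools = docs.load_tools()
wiki_tools = wiki.load_tools()
docs.cleanup()
wiki.cleanup()
```

With prefixes applied, two servers that both expose a tool named search no longer collide, which addresses the silent-override problem from issue #715.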
Just as important is what we propose not to build initially. We would hold back on the direct mcp parameter to avoid premature API commitment. We would reject making Agent a context manager based on user friction concerns.
Testing Our Assumptions
We propose shipping MCPToolProvider in strands.experimental.tools.mcp with a clear timeline: gather feedback for 2-3 weeks, then decide whether to promote it to the main API. This experimental period would let us validate our lifecycle management approach, assess whether users actually want the direct mcp parameter, and refine the filtering capabilities based on real usage patterns.
The beauty of this approach is that we wouldn't close any doors. If users love the ToolProvider pattern, we'll promote it. If they clamor for the simpler syntax, we can add the direct parameter that internally uses ToolProviders. We would learn from real usage before making irreversible API decisions.
Related Issues
#198
#481
#731
Documentation PR
To be created after approval
Type of Change
New feature
Testing
How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli
hatch run prepare
Checklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.