Description
Related Issues: #638, #600, #1305
Summary
Add support for passing custom HTTP headers on a per-request basis when calling MCP tools via ClientSession.call_tool(). This is needed for multi-tenant applications where different requests require different authentication tokens or request-specific metadata headers while maintaining a single persistent MCP connection.
Use Case: Multi-Tenant SaaS Applications
In multi-tenant deployments, a single application instance serves multiple users/tenants concurrently. Each user request needs to include tenant-specific authentication headers when calling MCP tools, but creating a new MCP connection per request introduces unacceptable latency:
- Connection establishment: ~500ms overhead per request
- At 1000 concurrent users (assuming one request per user per minute): 1000 × 0.5 s = 500 s, about 8.3 minutes of cumulative overhead per minute
- Connection pooling doesn't help because each user needs different headers
Current State: Headers can only be set at connection initialization in streamablehttp_client(), making them static for the connection lifetime.
Desired State: Ability to pass custom headers per call_tool() invocation while maintaining a single persistent connection.
Related Use Cases
This pattern appears in several scenarios:
- Per-request authentication - Different auth tokens per user (Issue #638, "FastMCP Auth Context in tools")
- Request tracing - Correlation IDs, trace IDs for distributed systems
- Rate limiting - User-specific rate limit tokens
- A/B testing - Feature flags or experiment IDs
- Tenant isolation - Tenant identifiers for data partitioning
Current Behavior
from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client
# Headers set at connection time - static for connection lifetime
async with streamablehttp_client(
    url="https://mcp.example.com",
    headers={"Authorization": "Bearer static-token"},  # Cannot change per-request
) as (read_stream, write_stream, get_session_id):
    async with ClientSession(read_stream, write_stream) as session:
        # User A's request
        result = await session.call_tool("list_sites", {})  # Uses static token
        # User B's request (needs different token)
        result = await session.call_tool("list_sites", {})  # Still uses static token
Proposed Solution
Option 1: Add extra_headers Parameter to call_tool()
Similar to how read_timeout_seconds was added for per-request timeout configuration (Issue #600), add an optional parameter for per-request headers:
async def call_tool(
    self,
    name: str,
    arguments: dict[str, Any] | None = None,
    read_timeout_seconds: timedelta | None = None,
    progress_callback: ProgressFnT | None = None,
    *,
    meta: dict[str, Any] | None = None,
    extra_headers: dict[str, str] | None = None,  # NEW
) -> types.CallToolResult:
    """
    Send a tools/call request.
    Args:
        extra_headers: Additional HTTP headers to include in this specific request.
            These are merged with connection-level headers, with extra_headers
            taking precedence for duplicate keys.
    """
Usage Example:
async with streamablehttp_client(
    url="https://mcp.example.com",
    headers={"Authorization": "Bearer org-token"},  # Organization-level auth
) as (read_stream, write_stream, get_session_id):
    async with ClientSession(read_stream, write_stream) as session:
        # User A's request
        result = await session.call_tool(
            "list_sites",
            {},
            extra_headers={"X-Auth-Token": "user-a-token", "X-Trace-Id": "trace-123"},
        )
        # User B's request
        result = await session.call_tool(
            "list_sites",
            {},
            extra_headers={"X-Auth-Token": "user-b-token", "X-Trace-Id": "trace-456"},
        )
Implementation Notes:
- Modify ClientSession.call_tool() signature to accept extra_headers
- Pass headers to transport layer via request context
- In StreamableHTTPTransport._handle_post_request(), merge extra_headers with base headers
- Extra headers take precedence over connection-level headers for duplicate keys
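The flow in these notes can be sketched with stand-in classes. This is a minimal illustration, not the SDK's implementation: SketchRequestContext, SketchTransport, and SketchSession are hypothetical names invented here, and the "transport" only records the headers it would send instead of issuing an HTTP request.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class SketchRequestContext:
    """Hypothetical per-request context (not the SDK's RequestContext)."""
    message: dict[str, Any]
    extra_headers: Optional[dict[str, str]] = None


@dataclass
class SketchTransport:
    """Hypothetical stand-in for an HTTP transport with connection-level headers."""
    base_headers: dict[str, str]
    sent: list[dict[str, str]] = field(default_factory=list)

    def post(self, ctx: SketchRequestContext) -> dict[str, str]:
        # Copy first so per-request headers never mutate the connection-level ones.
        headers = dict(self.base_headers)
        if ctx.extra_headers:
            headers.update(ctx.extra_headers)  # per-request values win on duplicates
        self.sent.append(headers)  # a real transport would send the POST here
        return headers


@dataclass
class SketchSession:
    """Hypothetical session that forwards extra_headers via the request context."""
    transport: SketchTransport

    def call_tool(
        self,
        name: str,
        arguments: Optional[dict[str, Any]] = None,
        *,
        extra_headers: Optional[dict[str, str]] = None,
    ) -> dict[str, str]:
        message = {
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments or {}},
        }
        return self.transport.post(SketchRequestContext(message, extra_headers))


transport = SketchTransport({"Authorization": "Bearer org-token"})
session = SketchSession(transport)
a = session.call_tool("list_sites", extra_headers={"X-Auth-Token": "user-a-token"})
b = session.call_tool("list_sites", extra_headers={"X-Auth-Token": "user-b-token"})
```

Because the headers travel with each request's context rather than being stored on the transport, one call's headers cannot carry over to the next, and the connection-level headers stay unchanged.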
Option 2: Extend Transport Context
Add headers to the existing RequestContext mechanism:
# In StreamableHTTPTransport
async def _handle_post_request(
    self,
    ctx: RequestContext,
    extra_headers: dict[str, str] | None = None,  # NEW
) -> None:
    headers = self._prepare_request_headers(ctx.headers)
    if extra_headers:
        headers.update(extra_headers)  # Merge per-request headers
    # message: the JSON-RPC payload for this request (construction elided)
    async with ctx.client.stream("POST", self.url, json=message, headers=headers):
        ...
Backward Compatibility
All proposed solutions are backward compatible:
- extra_headers parameter is optional (defaults to None)
- Existing code continues to work unchanged
- No breaking changes to MCP protocol or message format
- Headers are HTTP transport-specific, not part of JSON-RPC messages
Precedent
The SDK already supports per-request configuration for timeouts:
result = await session.call_tool(
    "slow_operation",
    {},
    read_timeout_seconds=timedelta(seconds=120),  # Per-request timeout
)
This establishes a pattern that certain aspects of tool invocation may need per-request customization beyond the JSON-RPC protocol itself.
Non-Solution: Per-Request Connections
Creating a new connection per request defeats the purpose of persistent connections and introduces significant latency overhead.
Implementation Considerations
Header Merging Strategy
Connection-level headers should be merged with per-request headers:
def _merge_headers(
    base_headers: dict[str, str],
    extra_headers: dict[str, str] | None,
) -> dict[str, str]:
    """Merge headers with extra_headers taking precedence."""
    merged = base_headers.copy()
    if extra_headers:
        merged.update(extra_headers)
    return merged
Transport-Specific
This feature should only affect HTTP-based transports (Streamable HTTP, SSE). Stdio transport would ignore extra_headers as it has no HTTP layer.
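One detail worth considering for the merge: HTTP header names are case-insensitive, so a plain dict update would keep both "Authorization" and "authorization" when the two sides differ only in casing. The sketch below shows one possible way to handle that, together with the stdio rule above. Both function names and the transport_kind strings are hypothetical and exist only for this illustration.

```python
from typing import Optional


def merge_headers_case_insensitive(
    base_headers: dict[str, str],
    extra_headers: Optional[dict[str, str]],
) -> dict[str, str]:
    """Merge headers; extra_headers win even when the name casing differs."""
    if not extra_headers:
        return dict(base_headers)
    overridden = {name.lower() for name in extra_headers}
    # Drop base entries that an extra header replaces under a different casing.
    merged = {k: v for k, v in base_headers.items() if k.lower() not in overridden}
    merged.update(extra_headers)
    return merged


def headers_for_transport(
    transport_kind: str,  # hypothetical label, e.g. "streamable_http", "sse", "stdio"
    base_headers: dict[str, str],
    extra_headers: Optional[dict[str, str]],
) -> Optional[dict[str, str]]:
    """Return the headers to send, or None for transports without an HTTP layer."""
    if transport_kind == "stdio":
        return None  # stdio has no HTTP layer, so extra_headers is ignored
    return merge_headers_case_insensitive(base_headers, extra_headers)
```

Whether stdio should silently ignore extra_headers, as here, or warn the caller is a design choice left to the maintainers.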
Security Considerations
Per-request headers enable proper security patterns:
- Least-privilege: Each request carries only the permissions it needs
- Token rotation: Different tokens can be used without reconnecting
- Audit trails: Request-specific correlation IDs for logging
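As a rough illustration of these patterns under concurrency, the sketch below builds a fresh header dict for each request and issues several calls at once. fake_call_tool is a stand-in that only echoes its headers, and build_request_headers is a hypothetical helper; neither is part of the SDK. The header names are taken from the usage example earlier in this issue.

```python
import asyncio
import uuid


def build_request_headers(user_token: str) -> dict[str, str]:
    """Hypothetical helper: a fresh header dict, with its own trace ID, per request."""
    return {"X-Auth-Token": user_token, "X-Trace-Id": uuid.uuid4().hex}


async def fake_call_tool(name: str, *, extra_headers: dict[str, str]) -> dict[str, str]:
    """Stand-in for the proposed call_tool(); echoes the headers it was given."""
    await asyncio.sleep(0)  # yield control so the concurrent calls interleave
    return dict(extra_headers)


async def main() -> list[dict[str, str]]:
    tokens = ["user-a-token", "user-b-token", "user-c-token"]
    # Each call gets its own headers as an argument; nothing is stored on shared state.
    return await asyncio.gather(
        *(fake_call_tool("list_sites", extra_headers=build_request_headers(t)) for t in tokens)
    )


results = asyncio.run(main())
```

Passing headers as a call argument, rather than mutating session or transport state before each call, is what keeps one tenant's token from leaking into another tenant's request when calls overlap.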
Related Issues
- #638 - FastMCP Auth Context in tools (same root problem)
- #600 - SDKs and other middleware SHOULD allow these timeouts to be configured on a per-request basis (per-request timeout configuration; precedent for per-request parameters)
- #1305 - Feature Proposal: Secure Tool/Resource/Prompt Decorators with Auth + Encrypted I/O (related auth concern)
I'm willing to contribute a pull request implementing this feature if the approach is acceptable to maintainers. Our production use case requires this functionality, and we believe it would benefit the broader MCP community.
Thanks,
Damian.