
Support Per-Request HTTP Headers in call_tool() #1509

@damianoneill

Description

Related Issues: #638, #600, #1305

Summary

Add support for passing custom HTTP headers on a per-request basis when calling MCP tools via ClientSession.call_tool(). This is needed for multi-tenant applications where different requests require different authentication tokens or request-specific metadata headers while maintaining a single persistent MCP connection.

Use Case: Multi-Tenant SaaS Applications

In multi-tenant deployments, a single application instance serves multiple users/tenants concurrently. Each user request needs to include tenant-specific authentication headers when calling MCP tools, but creating a new MCP connection per request introduces unacceptable latency:

  • Connection establishment: ~500ms overhead per request
  • At 1000 concurrent users: 8.3 minutes of cumulative overhead per minute
  • Connection pooling doesn't help because each user needs different headers
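The cumulative figure above can be reproduced with simple arithmetic. This is only a back-of-the-envelope sketch: it assumes each of the 1000 users issues one tool call per minute, which is not stated in the bullets, and that every call pays the full ~500 ms connection cost.

```python
# Back-of-the-envelope check of the overhead figure.
# Assumption (not from the issue text): one tool call per user per minute.
CONNECT_OVERHEAD_S = 0.5     # ~500 ms per new connection
REQUESTS_PER_MINUTE = 1000   # 1000 users, one request each per minute (assumed)


def cumulative_overhead_minutes(requests: int, overhead_s: float) -> float:
    """Total connection-setup time, in minutes, summed over all requests."""
    return requests * overhead_s / 60


overhead_min = cumulative_overhead_minutes(REQUESTS_PER_MINUTE, CONNECT_OVERHEAD_S)
print(f"{overhead_min:.1f} minutes of setup time per minute of traffic")
```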

Current State: Headers can only be set at connection initialization in streamablehttp_client(), making them static for the connection lifetime.

Desired State: Ability to pass custom headers per call_tool() invocation while maintaining a single persistent connection.

Related Use Cases

This pattern appears in several scenarios:

  1. Per-request authentication - Different auth tokens per user (Issue #638, "FastMCP Auth Context in tools")
  2. Request tracing - Correlation IDs, trace IDs for distributed systems
  3. Rate limiting - User-specific rate limit tokens
  4. A/B testing - Feature flags or experiment IDs
  5. Tenant isolation - Tenant identifiers for data partitioning
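To make the scenarios above concrete, here is a minimal sketch of how an application might assemble the headers for a single tool call. The helper `build_request_headers` and the `X-Tenant-Id` header name are hypothetical, chosen only for illustration; `X-Auth-Token` and `X-Trace-Id` follow the naming used in the usage example later in this issue.

```python
import uuid


def build_request_headers(user_token: str, tenant_id: str) -> dict[str, str]:
    """Assemble headers for one tool call (hypothetical helper)."""
    return {
        "X-Auth-Token": user_token,      # per-request authentication
        "X-Tenant-Id": tenant_id,        # tenant isolation (hypothetical header name)
        "X-Trace-Id": uuid.uuid4().hex,  # fresh correlation ID for request tracing
    }


headers_a = build_request_headers("user-a-token", "tenant-a")
headers_b = build_request_headers("user-b-token", "tenant-b")
```

Each call produces a fresh trace ID, so two requests from the same user can still be told apart in logs.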

Current Behavior

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

# Headers set at connection time - static for connection lifetime
async with streamablehttp_client(
    url="https://mcp.example.com",
    headers={"Authorization": "Bearer static-token"},  # Cannot change per-request
) as (read_stream, write_stream, get_session_id):
    async with ClientSession(read_stream, write_stream) as session:
        # User A's request
        result = await session.call_tool("list_sites", {})  # Uses static token
        
        # User B's request (needs different token)
        result = await session.call_tool("list_sites", {})  # Still uses static token

Proposed Solution

Option 1: Add extra_headers Parameter to call_tool()

Similar to how read_timeout_seconds was added for per-request timeout configuration (Issue #600), add an optional parameter for per-request headers:

async def call_tool(
    self,
    name: str,
    arguments: dict[str, Any] | None = None,
    read_timeout_seconds: timedelta | None = None,
    progress_callback: ProgressFnT | None = None,
    *,
    meta: dict[str, Any] | None = None,
    extra_headers: dict[str, str] | None = None,  # NEW
) -> types.CallToolResult:
    """
    Send a tools/call request.
    
    Args:
        extra_headers: Additional HTTP headers to include in this specific request.
                      These are merged with connection-level headers, with extra_headers
                      taking precedence for duplicate keys.
    """

Usage Example:

async with streamablehttp_client(
    url="https://mcp.example.com",
    headers={"Authorization": "Bearer org-token"},  # Organization-level auth
) as (read_stream, write_stream, get_session_id):
    async with ClientSession(read_stream, write_stream) as session:
        # User A's request
        result = await session.call_tool(
            "list_sites",
            {},
            extra_headers={"X-Auth-Token": "user-a-token", "X-Trace-Id": "trace-123"}
        )
        
        # User B's request
        result = await session.call_tool(
            "list_sites",
            {},
            extra_headers={"X-Auth-Token": "user-b-token", "X-Trace-Id": "trace-456"}
        )

Implementation Notes:

  1. Modify ClientSession.call_tool() signature to accept extra_headers
  2. Pass headers to transport layer via request context
  3. In StreamableHTTPTransport._handle_post_request(), merge extra_headers with base headers
  4. Extra headers take precedence over connection-level headers for duplicate keys
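The four steps above could be wired together roughly as follows. This is a self-contained sketch, not SDK code: `SketchRequestMetadata`, `SketchTransport`, and `call_tool_sketch` are hypothetical stand-ins for the session, the per-request context, and the HTTP transport, and the function returns the headers that would be sent instead of performing a request.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class SketchRequestMetadata:
    """Hypothetical per-request metadata travelling with the JSON-RPC message."""
    extra_headers: dict[str, str] = field(default_factory=dict)


@dataclass
class SketchTransport:
    """Hypothetical stand-in for an HTTP transport with connection-level headers."""
    base_headers: dict[str, str]

    def headers_for(self, metadata: Optional[SketchRequestMetadata]) -> dict[str, str]:
        # Step 3: start from a copy of the connection-level headers.
        headers = dict(self.base_headers)
        # Step 4: per-request headers win on duplicate keys.
        if metadata is not None:
            headers.update(metadata.extra_headers)
        return headers


def call_tool_sketch(
    transport: SketchTransport,
    name: str,
    arguments: Optional[dict[str, Any]] = None,
    *,
    extra_headers: Optional[dict[str, str]] = None,
) -> dict[str, str]:
    # Step 1: the session-level call accepts extra_headers.
    # Step 2: wrap them in per-request metadata handed to the transport.
    metadata = SketchRequestMetadata(dict(extra_headers)) if extra_headers else None
    return transport.headers_for(metadata)


transport = SketchTransport({"Authorization": "Bearer org-token"})
sent = call_tool_sketch(
    transport, "list_sites", {}, extra_headers={"X-Auth-Token": "user-a-token"}
)
```

Copying the base headers per request keeps concurrent calls from leaking headers into one another, which matters when many tenants share one connection.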

Option 2: Extend Transport Context

Add headers to the existing RequestContext mechanism:

# In StreamableHTTPTransport
async def _handle_post_request(
    self, 
    ctx: RequestContext,
    extra_headers: dict[str, str] | None = None  # NEW
) -> None:
    headers = self._prepare_request_headers(ctx.headers)
    if extra_headers:
        headers.update(extra_headers)  # Merge per-request headers
    
    # `message` is the outgoing JSON-RPC payload derived from ctx (elided in this sketch)
    async with ctx.client.stream("POST", self.url, json=message, headers=headers):
        ...

Backward Compatibility

Both proposed options are backward compatible:

  • extra_headers parameter is optional (defaults to None)
  • Existing code continues to work unchanged
  • No breaking changes to MCP protocol or message format
  • Headers are HTTP transport-specific, not part of JSON-RPC messages

Precedent

The SDK already supports per-request configuration for timeouts:

result = await session.call_tool(
    "slow_operation",
    {},
    read_timeout_seconds=timedelta(seconds=120)  # Per-request timeout
)

This establishes a pattern that certain aspects of tool invocation may need per-request customization beyond the JSON-RPC protocol itself.

Non-Solution: Per-Request Connections

Creating a new connection per request defeats the purpose of persistent connections and reintroduces the ~500ms connection-establishment overhead on every call.

Implementation Considerations

Header Merging Strategy

Connection-level headers should be merged with per-request headers:

def _merge_headers(
    base_headers: dict[str, str],
    extra_headers: dict[str, str] | None
) -> dict[str, str]:
    """Merge headers with extra_headers taking precedence."""
    merged = base_headers.copy()
    if extra_headers:
        merged.update(extra_headers)
    return merged
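One detail a plain `dict.update` does not handle: HTTP field names are case-insensitive, so `authorization` at connection level and `Authorization` per request would both survive the merge as separate keys. The variant below is a sketch of one way to handle that; the function name is hypothetical and this is not the SDK's behavior.

```python
from typing import Optional


def merge_headers_case_insensitive(
    base_headers: dict[str, str],
    extra_headers: Optional[dict[str, str]],
) -> dict[str, str]:
    """Merge headers, letting extra_headers win regardless of key casing."""
    merged = dict(base_headers)
    if not extra_headers:
        return merged
    # Map lowercase name -> the spelling currently stored in `merged`.
    stored_spelling = {key.lower(): key for key in merged}
    for name, value in extra_headers.items():
        existing = stored_spelling.get(name.lower())
        if existing is not None and existing != name:
            del merged[existing]  # drop the differently-cased duplicate
        merged[name] = value
        stored_spelling[name.lower()] = name
    return merged


merged_example = merge_headers_case_insensitive(
    {"authorization": "Bearer org-token", "Accept": "application/json"},
    {"Authorization": "Bearer user-a-token"},
)
```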

Transport-Specific

This feature should only affect HTTP-based transports (Streamable HTTP, SSE). Stdio transport would ignore extra_headers as it has no HTTP layer.
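A minimal sketch of that rule, assuming the session can tell which kind of transport it is attached to. The transport labels and the `effective_extra_headers` helper are hypothetical and only illustrate the "ignore on stdio" behavior described above.

```python
from typing import Optional

# Hypothetical labels for the transports that have an HTTP layer.
HTTP_TRANSPORTS = {"streamable-http", "sse"}


def effective_extra_headers(
    transport_kind: str,
    extra_headers: Optional[dict[str, str]],
) -> dict[str, str]:
    """Return the per-request headers that should actually be applied."""
    if transport_kind not in HTTP_TRANSPORTS:
        # No HTTP layer (e.g. stdio): per-request headers are ignored.
        return {}
    return dict(extra_headers or {})


stdio_headers = effective_extra_headers("stdio", {"X-Trace-Id": "trace-123"})
http_headers = effective_extra_headers("streamable-http", {"X-Trace-Id": "trace-123"})
```

Whether non-HTTP transports should ignore the parameter silently or log a warning is a design choice left open here.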

Security Considerations

Per-request headers enable proper security patterns:

  • Least-privilege: Each request carries only the permissions it needs
  • Token rotation: Different tokens can be used without reconnecting
  • Audit trails: Request-specific correlation IDs for logging

Related Issues

I'm willing to contribute a pull request implementing this feature if the approach is acceptable to maintainers. Our production use case requires this functionality, and we believe it would benefit the broader MCP community.

Thanks,
Damian.


Labels: P1, feature request, ready for work
