Support Per-Request HTTP Headers in call_tool()

### Description


**Related Issues**: #638, #600, #1305


Summary

Add support for passing custom HTTP headers on a per-request basis when calling MCP tools via `ClientSession.call_tool()`. This is needed for multi-tenant applications where different requests require different authentication tokens or request-specific metadata headers while maintaining a single persistent MCP connection.

Use Case: Multi-Tenant SaaS Applications

In multi-tenant deployments, a single application instance serves multiple users/tenants concurrently. Each user request needs to include tenant-specific authentication headers when calling MCP tools, but creating a new MCP connection per request introduces unacceptable latency:

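- Connection establishment: ~500 ms of overhead per request
- At 1000 requests per minute (for example, 1000 concurrent users each making one call per minute): roughly 500 s, or 8.3 minutes, of cumulative connection overhead per minute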
- Connection pooling doesn't help because each user needs different headers
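
The cumulative figure above can be checked with simple arithmetic. This sketch assumes one tool call per user per minute, each opening a fresh connection; that request rate is an assumption made for illustration.

```python
# Back-of-the-envelope check of the connection overhead figure.
# Assumption: each user makes one request per minute, and each request
# opens a new connection.
CONNECT_OVERHEAD_S = 0.5     # ~500 ms per connection
REQUESTS_PER_MINUTE = 1000   # one request per user per minute (assumed)

total_overhead_s = CONNECT_OVERHEAD_S * REQUESTS_PER_MINUTE
total_overhead_min = total_overhead_s / 60

print(f"{total_overhead_s:.0f} s = {total_overhead_min:.1f} min of overhead per minute")
```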

Current State: Headers can only be set at connection initialization in `streamablehttp_client()`, making them static for the connection lifetime.

Desired State: Ability to pass custom headers per `call_tool()` invocation while maintaining a single persistent connection.

Related Use Cases

This pattern appears in several scenarios:

1. Per-request authentication - Different auth tokens per user (Issue #638)
2. Request tracing - Correlation IDs, trace IDs for distributed systems
3. Rate limiting - User-specific rate limit tokens
4. A/B testing - Feature flags or experiment IDs
5. Tenant isolation - Tenant identifiers for data partitioning

Current Behavior

```python
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client


async def main() -> None:
    # Headers set at connection time - static for connection lifetime
    async with streamablehttp_client(
        url="https://mcp.example.com",
        headers={"Authorization": "Bearer static-token"},  # Cannot change per-request
    ) as (read_stream, write_stream, get_session_id):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()  # MCP handshake before any tool call

            # User A's request
            result = await session.call_tool("list_sites", {})  # Uses static token

            # User B's request (needs different token)
            result = await session.call_tool("list_sites", {})  # Still uses static token


asyncio.run(main())
```

Proposed Solution

Option 1: Add `extra_headers` Parameter to `call_tool()`

Similar to how `read_timeout_seconds` was added for per-request timeout configuration (Issue #600), add an optional parameter for per-request headers:

```python
async def call_tool(
    self,
    name: str,
    arguments: dict[str, Any] | None = None,
    read_timeout_seconds: timedelta | None = None,
    progress_callback: ProgressFnT | None = None,
    *,
    meta: dict[str, Any] | None = None,
    extra_headers: dict[str, str] | None = None,  # NEW
) -> types.CallToolResult:
    """
    Send a tools/call request.
    
    Args:
        extra_headers: Additional HTTP headers to include in this specific request.
                      These are merged with connection-level headers, with extra_headers
                      taking precedence for duplicate keys.
    """
```

**Usage Example**:

```python
async with streamablehttp_client(
    url="https://mcp.example.com",
    headers={"Authorization": "Bearer org-token"},  # Organization-level auth
) as (read_stream, write_stream, get_session_id):
    async with ClientSession(read_stream, write_stream) as session:
        # User A's request
        result = await session.call_tool(
            "list_sites",
            {},
            extra_headers={"X-Auth-Token": "user-a-token", "X-Trace-Id": "trace-123"}
        )
        
        # User B's request
        result = await session.call_tool(
            "list_sites",
            {},
            extra_headers={"X-Auth-Token": "user-b-token", "X-Trace-Id": "trace-456"}
        )
```

Implementation Notes:

1. Modify `ClientSession.call_tool()` signature to accept `extra_headers`
2. Pass headers to transport layer via request context
3. In `StreamableHTTPTransport._handle_post_request()`, merge extra_headers with base headers
4. Extra headers take precedence over connection-level headers for duplicate keys


Option 2: Extend Transport Context

Alternatively, thread the headers through the transport's existing `RequestContext` handling by giving the request handler an extra argument:

```python
# In StreamableHTTPTransport
async def _handle_post_request(
    self,
    ctx: RequestContext,
    extra_headers: dict[str, str] | None = None,  # NEW
) -> None:
    headers = self._prepare_request_headers(ctx.headers)
    if extra_headers:
        headers.update(extra_headers)  # Merge per-request headers

    # `message` stands for the outgoing JSON-RPC message carried by `ctx`
    # (how it is extracted and serialized is elided in this sketch).
    async with ctx.client.stream("POST", self.url, json=message, headers=headers):
        ...
```

Backward Compatibility

Both proposed options are backward compatible:

- `extra_headers` parameter is optional (defaults to `None`)
- Existing code continues to work unchanged
- No breaking changes to MCP protocol or message format
- Headers are HTTP transport-specific, not part of JSON-RPC messages

Precedent

The SDK already supports per-request configuration for timeouts:

```python
result = await session.call_tool(
    "slow_operation",
    {},
    read_timeout_seconds=timedelta(seconds=120)  # Per-request timeout
)
```

This establishes a pattern that certain aspects of tool invocation may need per-request customization beyond the JSON-RPC protocol itself.


Non-Solution: Per-Request Connections

Creating a new connection per request defeats the purpose of persistent connections and introduces significant latency overhead.


Implementation Considerations

Header Merging Strategy

Connection-level headers should be merged with per-request headers:

```python
def _merge_headers(
    base_headers: dict[str, str],
    extra_headers: dict[str, str] | None
) -> dict[str, str]:
    """Merge headers with extra_headers taking precedence."""
    merged = base_headers.copy()
    if extra_headers:
        merged.update(extra_headers)
    return merged
```

Transport-Specific

This feature should only affect HTTP-based transports (Streamable HTTP, SSE). Stdio transport would ignore `extra_headers` as it has no HTTP layer.

Security Considerations

Per-request headers enable proper security patterns:

- Least-privilege: Each request carries only the permissions it needs
- Token rotation: Different tokens can be used without reconnecting
- Audit trails: Request-specific correlation IDs for logging


Related Issues

- #638 - FastMCP Auth Context in tools (same root problem)
- #600 - Per-request timeout configuration (precedent for per-request parameters)
- #1305 - Secure Tool/Resource/Prompt Decorators with Auth (related auth concern)


I'm willing to contribute a pull request implementing this feature if the approach is acceptable to maintainers. Our production use case requires this functionality, and we believe it would benefit the broader MCP community.

Thanks,
Damian.

### References

_No response_
