URL Normalisation #2890

@therealnb

I had to add the following code to mcp-optimizer:

        if proxy_mode == ToolHiveProxyMode.STREAMABLE:
            # Strip fragments for streamable-http
            # (fragments not supported by streamable-http client)
            parsed = urlparse(url)

            # Fix path: streamable-http uses /mcp endpoint, not /sse
            path = parsed.path
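            if path.endswith("/sse"):
                # Swap only the trailing /sse segment; str.replace would also
                # rewrite earlier occurrences (e.g. /sse-proxy/sse)
                path = path[: -len("/sse")] + "/mcp"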
            elif not path.endswith("/mcp"):
                # Only add /mcp if the path doesn't already contain /mcp
                # This prevents double-adding /mcp to URLs like /mcp/test-server
                if "/mcp" not in path:
                    # If path doesn't end with /mcp or /sse, and doesn't contain /mcp,
                    # ensure it ends with /mcp
                    if path.endswith("/"):
                        path = path + "mcp"
                    else:
                        path = path + "/mcp"

            # Reconstruct URL without fragment and with corrected path
            normalized_tuple = (
                parsed.scheme,
                parsed.netloc,
                path,
                parsed.params,
                parsed.query,
                "",  # Empty fragment
            )
            normalized = str(urlunparse(normalized_tuple))
            if normalized != url:
                logger.debug(
                    "Normalized URL for streamable-http",
                    original_url=url,
                    normalized_url=normalized,
                    workload=self.workload.name,
                )
            return normalized

Why This Code Was Added

The normalization addresses inconsistencies in URLs returned by ToolHive:

  1. Fragment stripping: ToolHive’s GenerateMCPServerURL should not add fragments for streamable-http (line 42), but edge cases (legacy configs, bugs, remote URLs) can still include them. The streamable-http client library doesn’t support fragments.

  2. Path normalization:

    • Converting /sse → /mcp for streamable-http
    • Adding /mcp if missing (especially for remote URLs where ToolHive uses the remote path as-is)
    • Preventing double-adding /mcp (e.g., /mcp/test-server)
  3. SSE vs streamable-http: SSE needs fragments for container identification; streamable-http does not.
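
For reference, the three rules above can be condensed into a standalone function. This is a minimal sketch, not the optimizer's actual method: the name normalize_streamable_url is hypothetical, and the proxy-mode check and logging from the snippet at the top are left out.

```python
from urllib.parse import urlparse, urlunparse


def normalize_streamable_url(url: str) -> str:
    """Drop the fragment and make the path suitable for streamable-http.

    Hypothetical helper that mirrors the rules listed above.
    """
    parsed = urlparse(url)
    path = parsed.path

    if path.endswith("/sse"):
        # Swap only the trailing /sse segment for /mcp.
        path = path[: -len("/sse")] + "/mcp"
    elif "/mcp" not in path:
        # Append /mcp, avoiding a double slash when the path ends with "/".
        path += "mcp" if path.endswith("/") else "/mcp"
    # Paths that already contain /mcp (e.g. /mcp/test-server) are left alone.

    # Rebuild the URL with the corrected path and an empty fragment.
    return urlunparse(parsed._replace(path=path, fragment=""))


if __name__ == "__main__":
    for candidate in (
        "http://localhost:8080/sse#container",
        "http://localhost:8080/custom/endpoint",
        "http://localhost:8080/mcp/test-server",
    ):
        print(candidate, "->", normalize_streamable_url(candidate))
```

The example URLs are illustrative only; the fragment name "container" is made up.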

The URLs that need this normalization come from misconfigured remote MCP servers. Here's the breakdown:

The Problem

Looking at ToolHive's GenerateMCPServerURL (lines 41-42):

if isStreamable {
    return fmt.Sprintf("%s%s", base, path)  // Uses remote path as-is
}

When a remote streamable-http MCP server is configured with:

  • https://api.example.com/custom/endpoint → ToolHive generates http://localhost:8080/custom/endpoint
  • https://api.example.com/sse → ToolHive generates http://localhost:8080/sse

These are misconfigurations because:

  1. The remote server URL should match the transport type
  2. For streamable-http, it should end with /mcp
  3. For SSE, it should end with /sse
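
The suffix convention in the list above could be checked when a remote server is registered. The following is only a sketch: path_matches_transport and the EXPECTED_SUFFIX table are hypothetical names, and the transport strings are assumed to be "streamable-http" and "sse".

```python
from urllib.parse import urlparse

# Assumed mapping, taken from the list above: transport -> expected path suffix.
EXPECTED_SUFFIX = {"streamable-http": "/mcp", "sse": "/sse"}


def path_matches_transport(remote_url: str, transport: str) -> bool:
    """Return True if the remote URL's path ends with the transport's suffix."""
    # Ignore a trailing slash so /mcp/ is treated like /mcp.
    path = urlparse(remote_url).path.rstrip("/")
    return path.endswith(EXPECTED_SUFFIX[transport])


if __name__ == "__main__":
    # Both example URLs from this issue fail the check for streamable-http.
    for remote in ("https://api.example.com/custom/endpoint", "https://api.example.com/sse"):
        print(remote, path_matches_transport(remote, "streamable-http"))
```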

ToolHive Should Still Normalize

Even with misconfigurations, ToolHive should normalize proxy URLs because:

  1. ToolHive controls the proxy endpoint: The proxy should expose a consistent endpoint (/mcp for streamable-http) regardless of the remote server's path.
  2. The remote path is an implementation detail: ToolHive should route internally to the remote server's actual path, but expose a standard proxy endpoint.
  3. User error tolerance: Users may misconfigure remote URLs; ToolHive should handle it gracefully.

The Real Issue

The bug is in ToolHive's GenerateMCPServerURL function. It should:

  • For streamable-http: Always generate /mcp endpoint (normalize the proxy URL)
  • Route internally to the remote server's actual path
  • Not blindly copy the remote path to the proxy URL
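
To illustrate the proposed behaviour, here is a rough sketch in Python (to match the optimizer snippet; it is not ToolHive's Go code). ProxyRoute and build_streamable_route are hypothetical names: the idea is only that the proxy URL always ends in /mcp while the remote URL is kept separately for internal routing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProxyRoute:
    proxy_url: str     # endpoint exposed to clients
    upstream_url: str  # remote server URL the proxy forwards to


def build_streamable_route(proxy_base: str, remote_url: str) -> ProxyRoute:
    """Expose /mcp on the proxy regardless of the remote server's path."""
    # The proxy endpoint is fixed; the remote path is not copied into it.
    proxy_url = proxy_base.rstrip("/") + "/mcp"
    # The remote URL is kept untouched so requests can be routed to it internally.
    return ProxyRoute(proxy_url=proxy_url, upstream_url=remote_url)


if __name__ == "__main__":
    route = build_streamable_route(
        "http://localhost:8080", "https://api.example.com/custom/endpoint"
    )
    print(route.proxy_url)
    print(route.upstream_url)
```

With this split, a remote path such as /custom/endpoint never leaks into the URL clients see, so the client-side workaround would only be needed for legacy URLs.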

Recommendation

  1. Fix ToolHive: Update GenerateMCPServerURL to normalize proxy URLs for streamable-http (always use /mcp).
  2. Keep mcp-optimizer normalization: As a defensive measure for:
    • Legacy URLs from older ToolHive versions
    • Edge cases ToolHive might miss
    • Kubernetes mode where URLs come from CRDs

The normalization in mcp-optimizer is a workaround for ToolHive's bug, but it's still valuable as defense-in-depth.

Labels

api, bug, go, proxy
