Python query-stripping uses naive ? find #19

@AndresL230

Severity: Medium
Affected repos: middleware-python
Component boundary: middleware-python interceptor privacy

Symptom

_strip_query() in middleware-python/recost/_interceptor.py finds the first ? in the URL and truncates there, without parsing the URL. This fails on:

  • URLs with a # fragment before the query position: https://api.example.com/path#anchor?x=1 → keeps everything before ?, which now includes the fragment. Not actually a privacy leak, but inconsistent with Node.
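  • Encoded %3F in the path: https://api.example.com/path%3F/real?key=secret → reported to truncate at the encoded ?, leaving the real query string in place (privacy leak). Since url.find("?") matches only a literal ?, not %3F, this case needs a regression test to confirm the actual behavior.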
  • Unparseable URLs: returned as the raw string, with any query string still attached.

The Node SDK uses URL parsing (new URL(...)) and extracts pathname cleanly. Parity is broken.

Evidence

  • middleware-python/recost/_interceptor.py: _strip_query() uses url.find("?") and url[:idx].
  • middleware-node/src/core/interceptor.ts: uses new URL(url).pathname (correct).
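
To make the difference concrete, here is a minimal sketch contrasting first-? truncation with component-based parsing. naive_strip is a hypothetical reconstruction based only on the description above (url.find("?") and url[:idx]), not a copy of the code in _interceptor.py, and parsed_strip is an illustrative name as well.

```python
from urllib.parse import urlsplit, urlunsplit


def naive_strip(url: str) -> str:
    # Hypothetical reconstruction: cut at the first literal "?".
    idx = url.find("?")
    return url if idx == -1 else url[:idx]


def parsed_strip(url: str) -> str:
    # Illustrative alternative: keep only scheme, host, and path.
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


url = "https://api.example.com/path#anchor?x=1"
print(naive_strip(url))   # https://api.example.com/path#anchor
print(parsed_strip(url))  # https://api.example.com/path
```

The parser treats everything after # as the fragment, so the ? inside it is never mistaken for a query delimiter.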

Impact

  • For URLs with a malformed or encoded path, the Python SDK can send query parameters, including secrets such as API tokens, to the telemetry endpoint. This contradicts the SDK's own privacy promise.

Fix recommendation

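from urllib.parse import urlsplit, urlunsplit

def _strip_query(url: str) -> str:
    """Return url with its query string and fragment removed."""
    try:
        parts = urlsplit(url)
        # Rebuild from components so scheme-less or relative URLs stay intact.
        return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    except ValueError:
        # urlsplit raises ValueError for e.g. a malformed IPv6 host.
        # Never raise from a hot path, and never return the raw string:
        # cut at the first "?" or "#" so the query cannot leak.
        return url.split("?", 1)[0].split("#", 1)[0]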

Verification

  • Test cases:
    • https://api.example.com/foo?secret=abc → https://api.example.com/foo
    • https://api.example.com/p%3Fath?real=q → https://api.example.com/p%3Fath
    • https://api.example.com/foo#x?y=z → https://api.example.com/foo
