Skip to content

fix(compat): extract hostname from brand_manifest URLs with paths#679

Merged
bokelley merged 1 commit into
mainfrom
claude/issue-677-brand-manifest-url-path
May 12, 2026
Merged

fix(compat): extract hostname from brand_manifest URLs with paths#679
bokelley merged 1 commit into
mainfrom
claude/issue-677-brand-manifest-url-path

Conversation

@bokelley
Copy link
Copy Markdown
Contributor

Closes #677

Summary

strip_url_scheme only strips the scheme prefix and trailing slash — for a v2.5 brand_manifest URL like "https://acme.com/.well-known/brand.json" it returned "acme.com/.well-known/brand.json", which fails BrandReference.domain's hostname-only regex and surfaced as a confusing INVALID_REQUEST[brand.domain] error. The v2.5 spec documents brand_manifest as a URL to a JSON file, so a path is the expected input shape.

This PR adds extract_brand_domain() to _url.py using urlparse().hostname, which correctly strips both path and port. A bare-domain fallback via strip_url_scheme handles callers that pass a schemeless string. Both adapter call sites — get_products.adapt_request and _media_buy_helpers.adapt_brand_manifest_to_brand (used by create_media_buy) — are updated. update_media_buy is not affected (it carries no brand_manifest).

What was tested

  • pytest tests/test_legacy_adapter_v2_5_get_products.py tests/test_legacy_adapter_v2_5_media_buy.py tests/test_dispatcher_shape_detection.py -v54 passed
  • ruff check src/adcp/compat/ — no issues
  • mypy src/adcp/compat/ — no issues in 12 source files

Nits surfaced by pre-PR review (not fixed in this PR)

  • Schemeless URL with path (e.g. "acme.com/.well-known/brand.json") falls through to strip_url_scheme and produces an invalid domain — pre-existing gap, not introduced here; v2.5 spec defines brand_manifest as a full URL so this input shape is malformed
  • strip_url_scheme retains a public name despite now being an internal fallback inside extract_brand_domain; consider renaming to _strip_url_scheme in a follow-up
  • IPv6 literal hostnames are extracted without brackets by urlparse and will still fail BrandReference.domain's regex; documented in extract_brand_domain's docstring, callers must validate if IPv6 is possible (v2.5 spec makes this unrealistic for brand_manifest)

Pre-PR review

  • code-reviewer: approved — no blockers; schemeless-with-path fallback gap is pre-existing and documented
  • dx-expert: approved — fix is correct at both call sites; nits on strip_url_scheme naming and IPv6 noted above

Triage-managed PR. This bot does not currently iterate on
review comments or PR conversation threads (only on the source
issue). To unblock:

  • Push fixup commits directly: gh pr checkout <num>
    fix → push.
  • Or re-trigger: comment /triage execute on the source
    issue.

See adcp#3121
for context.

Session: https://claude.ai/code/session_01YX14H75zYWQ1BYuopyUL34


Generated by Claude Code

strip_url_scheme only stripped the scheme prefix; for URLs like
"https://acme.com/.well-known/brand.json" it produced
"acme.com/.well-known/brand.json", which fails BrandReference.domain's
regex and surfaces as a confusing INVALID_REQUEST[brand.domain] error.

Adds extract_brand_domain() to _url.py — uses urlparse().hostname to
isolate the hostname, falling back to strip_url_scheme for bare-domain
inputs (no scheme). Both adapter call sites (get_products.adapt_request
and _media_buy_helpers.adapt_brand_manifest_to_brand) now use the new
helper. Regression tests cover the exact URL-with-path case from #677
and URL-with-port.

Fixes #677

https://claude.ai/code/session_01YX14H75zYWQ1BYuopyUL34
@bokelley
Copy link
Copy Markdown
Contributor Author

Promoting out of draft — CI green across 3.10–3.13, both expert reviews approved, regression tests in place for both get_products and create_media_buy adapter call sites.

Follow-up nits (not blocking this PR):

  1. strip_url_scheme naming. Now that extract_brand_domain is the canonical helper for "URL → hostname", strip_url_scheme is reduced to the bare-domain fallback path. The name is misleading for the role it plays today; worth renaming or inlining in a follow-up.
  2. IPv6 silent failure. urlparse("https://[::1]/path").hostname returns "::1", which fails BrandReference.domain's regex without a clear error. Acceptable for now (brand_manifest URLs in the wild are hostname-based, not IPv6 literals), but if we ever see this in production we should validate the result and raise a typed error rather than punting to the regex.

Tracking both as ideas; happy to file follow-up issues if useful.

@bokelley
Copy link
Copy Markdown
Contributor Author

Thanks for the promotion and the CI confirmation. Both nits are worth tracking — please do file follow-up issues for them. Brief suggested scopes:

  1. strip_url_scheme rename_strip_url_scheme (or inline into extract_brand_domain) to match its reduced, internal role.
  2. IPv6 validated error — when urlparse().hostname returns a bare IPv6 literal, validate and raise a typed error rather than letting it fall through to the regex.

Neither is blocking; filing them now gives the next triage sweep a handle to act on them.


Triaged by Claude Code.


Generated by Claude Code

@bokelley
Copy link
Copy Markdown
Contributor Author

Expert review summary

code-reviewer: approved — no blockers. Confirmed urlparse().hostname is the right primitive; tests cover regression at both call sites. Minor nits: a bare-host-with-path (acme.com/path, no scheme) falls into strip_url_scheme and round-trips unchanged — fails downstream regex, which is acceptable fail-fast behaviour for an out-of-spec input.

ad-tech-protocol-expert: approved — confirmed the v2.5→v3 mapping is the only one v3's BrandReference schema permits (additionalProperties: false, domain-only). Surfaced two real spec-level follow-ups (non-blocking, filed as separate issues):

CI: 16/16 green. Merging.

@bokelley bokelley merged commit 6bb2c26 into main May 12, 2026
16 checks passed
@bokelley bokelley deleted the claude/issue-677-brand-manifest-url-path branch May 12, 2026 01:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

v2.5 brand_manifest adapter preserves URL path, breaks brand.domain regex

2 participants