Skip to content

feat: implement URLPattern.generate() (tentative spec)#21

Merged
chad-loder merged 1 commit into
mainfrom
feat/generate
May 13, 2026
Merged

feat: implement URLPattern.generate() (tentative spec)#21
chad-loder merged 1 commit into
mainfrom
feat/generate

Conversation

@chad-loder
Copy link
Copy Markdown
Owner

Summary

  • Implements URLPattern.generate(component, groups) — the last unimplemented WHATWG URLPattern method (tentative spec).
  • 19 / 19 upstream WPT cases (urlpattern-generate.tentative.any.js) now pass; previously skipped behind WHATWG_URLPATTERN_RUN_TENTATIVE=1.
  • Test count: 580 → 599 (+19).
  • Auxiliary-suites conformance: 84 / 84 → 103 / 103.

What generate() does

Reverses an exec(): given a component name and a mapping of named-group values, emit the canonical-form URL component string that fed back through exec() would yield the same groups.

pat = URLPattern({"pathname": "/users/:id"})
pat.generate("pathname", {"id": "42"})       # '/users/42'
pat.generate("pathname", {"id": "🍅"})       # '/users/%F0%9F%8D%85'
pat.generate("pathname", {"id": "bar/baz"})  # TypeError — would violate :id constraint

Useful for URL builders that want to round-trip values through the same pattern they match against.

Algorithm

Single pass over the per-component parsed part-list (already built at construction time). For each part:

  • FIXED_TEXT with no modifier → emit the canonical value verbatim.
  • Any part with a modifier (?, *, +) → TypeError (not uniquely reversible).
  • Standalone full-wildcard or anonymous regex group → TypeError (no named group to substitute into).
  • Missing required group → TypeError.
  • Otherwise: encode the supplied value via the component's encoding callback, validate the encoded value satisfies the part's own constraint (e.g. segment-wildcard rejects path delimiters), then emit prefix + encoded + suffix.

Per-part validators are compiled lazily — generate() is a cold path; paying compile cost on every constructor for a method most callers never reach is not worth the speedup.

Implementation impact on the hot path

Zero. Three small data fields are added to _ComponentMatcher (parts, encoder, segment_wildcard_regex) — all populated from values that were already being computed at compile time. test() and exec() are untouched.

Documentation updates

  • SPEC_DEVIATIONS.md — flipped generate() from "Not implemented; 19 cases xfail" to "Implemented; 19 / 19 WPT cases pass".
  • README.md — auxiliary-suites badge 84/84 → 103/103; tentative-API badge tracked → implemented; WPT runners table now shows 19 / 19 for the generate suite; API surface table marks generate() as ✓ Implemented; tentative-not-implemented legend entry removed.
  • docs/wpt-compliance.md — regenerated. All 19 generate cases flip from skip to pass. Footer legend simplified.
  • scripts/generate_compliance_report.py_run_generate_case now actually exercises the implementation (was hard-coded to skip).
  • tests/test_wpt_generate.py — dropped the WHATWG_URLPATTERN_RUN_TENTATIVE=1 env-var gate; suite runs by default.

Test plan

  • uv run pytest tests/test_wpt_generate.py -v — 19 / 19 pass
  • uv run pytest -q — 599 passed, 0 skipped (was 580 + 19 skipped)
  • just lint — all green (ruff, mypy, pyright, ty, semgrep, shellcheck, rumdl, codespell, interrogate, validate-pyproject)
  • just compliance-report — regenerates docs/wpt-compliance.md as 469 cases, 0 failing
  • Signed commit (ED25519)

Closes the v0.2.0 roadmap.

Reverses an exec: given a component name and a mapping of named-group
values, ``generate`` emits the canonical-form URL component string that
fed back through ``exec`` would produce those same groups. Useful for
URL builders that want to round-trip via the pattern.

Algorithm is a single pass over the per-component parsed part-list
(already built at construction time). For each part:

  - FIXED_TEXT with no modifier → emit the canonical value verbatim
  - Any part with a modifier (?, *, +) → TypeError (not uniquely
    reversible)
  - Standalone full-wildcard or anonymous (numeric-named) regexp →
    TypeError (no named group to substitute)
  - Missing required group → TypeError
  - Otherwise: encode the supplied value via the component's encoding
    callback, validate the encoded value satisfies the part's own
    constraint (segment-wildcard rejects path delimiters, custom regex
    fullmatches), then emit prefix + encoded + suffix.

Validation regexes are compiled lazily — ``generate`` is a cold path,
so paying the construction-time cost for every pattern is not worth
the speedup on a method most callers never reach.

The implementation needed three small data additions to
``_ComponentMatcher``:

  - ``parts``: the pre-parsed part list (was discarded after compile)
  - ``encoder``: the per-component literal-encoding callback
  - ``segment_wildcard_regex``: the ``[^<delim>]+?`` body for this
    component's options

All three are populated for free during ``_compile_component`` — no
extra work on the test()/exec() hot path.

Conformance: 19 / 19 cases in the upstream WPT corpus
(``urlpattern-generate.tentative.any.js``) pass. The test driver
previously skipped these unless ``WHATWG_URLPATTERN_RUN_TENTATIVE=1``;
now runs unconditionally alongside the rest of the WPT suite.

Updated ``SPEC_DEVIATIONS.md`` to reflect that the only tentative-spec
gap is now closed.

Test count: 580 → 599 (+19).

Closes the last v0.2.0 roadmap item.
@github-actions github-actions Bot added the enhancement New feature or request label May 13, 2026
@chad-loder chad-loder merged commit 6faa178 into main May 13, 2026
19 checks passed
@chad-loder chad-loder deleted the feat/generate branch May 13, 2026 03:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant