fix(domxref): build Web/API index to resolve API names#672
Conversation
Build a lazy en-US index of all `Web/API/*` page slugs keyed by both full sub-slug (e.g. `Window/structuredClone`) and leaf segment (e.g. `structuredClone`). `domxref` now resolves inputs against this map, using the canonical slug on hit and falling back to the previous capitalize-first-char URL construction on miss. Ambiguous lookups emit a `templ-ambiguous-arg` warning and pick the first candidate.
|
45c0148 was deployed to: https://rari-pr672.review.mdn.allizom.net/ |
This comment was marked as outdated.
This comment was marked as outdated.
The test fixture previously duplicated the per-slug indexing body of `build_index`. Adding a new alias rule in `build_index` would silently diverge from the fixture and tests would still pass. Pull the per-slug work into `index_one` and have both `build_index` and the test fixture call it.
When `Response/json` and `Response/json_static` collide under the shared `response/json` key, both have equal sort keys, so `min_by_key` falls back to insertion order. Document that this relies on the ascending slug order produced by `SubPagesSorter::Slug` in `build_index`.
The `_static` suffix strip is applied unconditionally to any slug matching the suffix. Document the implicit assumption that the suffix always denotes a static member, so future readers know the alias is not a structural guarantee.
Surface the `Window/*` leaf shortcut and the full-path requirement for other interfaces' members on the `domxref` template itself, so authors discover the behavior without reading the index implementation.
The pre-existing fallback used `api.char_indices().next()` to grab the byte offset of the first char (always 0) and then sliced `api[0..0]` to uppercase — i.e. uppercased the empty string, then appended the full input verbatim. The intended behavior never ran. Build the URL from the normalized input directly.
A failed index build silently degrades every subsequent `domxref` call to the legacy URL-construction fallback. Escalate from `warn!` to `error!` so the failure surfaces in `--deny-warnings` builds and is harder to miss in logs.
The function returns `&'static str` because it borrows from the `API_NAME_INDEX` `LazyLock`, which is never dropped. Spell that out so future readers don't wonder where the lifetime comes from.
The fixture now lists slugs in the same ascending order that `build_index` consumes them from `get_sub_pages`, matching the tie-break-correctness invariant documented in `resolve_from_map`.
✅ |
LeoMcA
left a comment
There was a problem hiding this comment.
Overall looks good, but I've got a few questions about the details:
| let indexable = sub_slug | ||
| .split_once("/Reference/") | ||
| .map(|(_, after)| after) | ||
| .unwrap_or(sub_slug); |
There was a problem hiding this comment.
Do we want to include the sub_slug in the map? Perhaps that should be inserted more explicitly, and we should wrap the insert_unique below with if let Some(...?
| Ok(pages) => pages, | ||
| Err(e) => { | ||
| error!("failed to build domxref API name index: {e}"); | ||
| return HashMap::new(); |
There was a problem hiding this comment.
Should this panic instead? Feels like this is a pretty catastrophic error which is worthy of killing the build
| if !entry.iter().any(|v| v == &value) { | ||
| entry.push(value); | ||
| } |
There was a problem hiding this comment.
Could we replace the Vec here with an IndexSet or a BTreeSet (if we don't care about preserving insertion order), to avoid having to manually dedup?
| // in iteration order. Correctness then relies on `build_index` iterating | ||
| // pages in ascending slug order (`SubPagesSorter::Slug`) so the canonical | ||
| // entry is inserted before its `_static`-suffixed sibling. |
There was a problem hiding this comment.
Can we avoid this by sorting by length, or explicitly check for strings ending with static?
An empty index silently degrades to the pre-index fallback, producing a flood of spurious `templ-redirected-link` issues without any clear signal of root cause. Failing loudly makes the problem immediately visible.
…index_one` The previous `split_once(...).unwrap_or(sub_slug)` idiom obscured the distinction between reorg pages (indexed by the post-`/Reference/` leaf) and normal pages (indexed by the full sub-slug). An explicit `if let Some` branch makes each case and its intent visible in the code structure.
…t<String>` `indexmap` is already a workspace dependency of `rari-doc`. `IndexSet` preserves insertion order (required for the resolution tie-break) and deduplicates automatically on `insert`, so the `insert_unique` helper can be replaced by a direct `.insert()` call.
…` check Replace the `v.to_lowercase() != key` secondary sort key with `v.ends_with(\"_static\")`, which is self-documenting: non-`_static` slugs are preferred over their `_static`-aliased counterparts. The previous formulation also carried a misleading comment claiming correctness depended on insertion order, which was not actually the case.
…ookup Covers the converse of `window_member_resolves_by_leaf`: non-Window members have no leaf-only shortcut and must be referenced by their full sub-path (e.g. `WorkerGlobalScope/structuredClone`).
Description
Updates the
domxrefmacro to resolve API names via an Web/API index (rather than just composingWeb/API/{arg}):Web/API/Window/foocan be referenced via{{domxref("Window/foo")}}or{{domxref("foo")}}Web/API/BackgroundFetchManager/fetchcan be referenced via{{domxref("BackgroundFetchManager/fetch")}}Web/API/Background_Synchronization_API/Reference/SyncEventcan be referenced via{{domxref("SyncEvent")}}Note: The argument of the
domxrefmacro continues to be normalized:" "→"_")"()"→"")".prototype."→".")"."→"/")Motivation
Support shortcuts like
{{domxref("structuredClone")}}avoidingtempl-redirected-linkissues.Additional details
Build comparison yielded:
domxref+templ-redirected-linkdomxref+templ-broken-link~13% of
domxrefredirected/broken issues cleared; other templates unchanged.Related issues and pull requests
Part of #683.