fix(docs): tighten onBrokenLinks to throw and fix surfaced broken links #40102
Pull request overview
This PR tightens the docs build by changing Docusaurus broken-link handling from warnings to failures, and updates surfaced stale or client-side-broken documentation links.
Changes:
- Sets `onBrokenLinks` to `throw` in Docusaurus config.
- Rewrites stale `/docs/...` links in the 6.0.0 docs snapshot to the active `/user-docs/...` or `/developer-docs/...` routes.
- Adds explicit file extensions to many relative developer-doc links and fixes several malformed or stale documentation references.
Reviewed changes
Copilot reviewed 37 out of 37 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| docs/docusaurus.config.ts | Fails docs builds on broken links. |
| docs/versioned_docs/version-6.0.0/using-superset/creating-your-first-dashboard.mdx | Updates stale versioned docs links. |
| docs/versioned_docs/version-6.0.0/quickstart.mdx | Updates quickstart internal links. |
| docs/versioned_docs/version-6.0.0/intro.md | Repoints feature flags link. |
| docs/versioned_docs/version-6.0.0/installation/kubernetes.mdx | Updates Kubernetes-related doc links. |
| docs/versioned_docs/version-6.0.0/installation/installation-methods.mdx | Fixes malformed installation links. |
| docs/versioned_docs/version-6.0.0/installation/docker-compose.mdx | Updates quickstart link prefix. |
| docs/versioned_docs/version-6.0.0/installation/architecture.mdx | Updates configuration and feature links. |
| docs/versioned_docs/version-6.0.0/faq.mdx | Updates FAQ documentation links. |
| docs/versioned_docs/version-6.0.0/contributing/guidelines.mdx | Updates testing guide link. |
| docs/versioned_docs/version-6.0.0/contributing/development.mdx | Updates feature flag and how-to links. |
| docs/versioned_docs/version-6.0.0/contributing/contributing.mdx | Updates documentation contribution link. |
| docs/versioned_docs/version-6.0.0/configuration/timezones.mdx | Updates database driver link. |
| docs/versioned_docs/version-6.0.0/configuration/sql-templating.mdx | Updates feature flag link. |
| docs/versioned_docs/version-6.0.0/configuration/networking-settings.mdx | Updates feature flag link. |
| docs/versioned_docs/version-6.0.0/configuration/databases.mdx | Bulk updates database section links. |
| docs/versioned_docs/version-6.0.0/configuration/configuring-superset.mdx | Updates Docker and feature flag links. |
| docs/versioned_docs/version-6.0.0/configuration/cache.mdx | Updates async query and feature flag links. |
| docs/versioned_docs/version-6.0.0/configuration/alerts-reports.mdx | Updates setup and feature flag links. |
| docs/developer_docs/testing/overview.md | Adds extensions to relative testing links. |
| docs/developer_docs/guidelines/frontend/component-style-guidelines.md | Adds extension to style guideline link. |
| docs/developer_docs/extensions/security.md | Adds extension to registry link. |
| docs/developer_docs/extensions/quick-start.md | Fixes GitHub source links and relative doc links. |
| docs/developer_docs/extensions/overview.md | Adds extensions to extension overview links. |
| docs/developer_docs/extensions/mcp.md | Adds extensions to next-step links. |
| docs/developer_docs/extensions/extension-points/sqllab.md | Adds extensions to editor links. |
| docs/developer_docs/extensions/extension-points/editors.md | Updates one extension-point next-step link. |
| docs/developer_docs/extensions/development.md | Converts source link to GitHub URL. |
| docs/developer_docs/extensions/dependencies.md | Adds extensions to next-step links. |
| docs/developer_docs/extensions/contribution-types.md | Fixes API decorator and MCP links. |
| docs/developer_docs/extensions/components/index.mdx | Fixes component and Storybook links. |
| docs/developer_docs/extensions/architecture.md | Adds extensions to next-step links. |
| docs/developer_docs/contributing/submitting-pr.md | Adds extensions to contributing links. |
| docs/developer_docs/contributing/release-process.md | Adds extension to overview link. |
| docs/developer_docs/contributing/overview.md | Adds extensions to contribution links. |
| docs/developer_docs/contributing/issue-reporting.md | Adds extensions to contribution links. |
| docs/developer_docs/contributing/code-review.md | Adds extension to issue-reporting link. |
- **[Contribution Types](../contribution-types)** - Explore other contribution types
- **[Development](../development)** - Set up your development environment
Good catch — fixed in 7feff5a, plus the systemic gap that let this through CI.
What CI was missing. onBrokenLinks: 'throw' only validates file-based references (.md / .mdx). Bare relative URLs like [Foo](../foo) skip the file resolver and get emitted as raw hrefs that the browser resolves against the current page URL — wrong directory for trailing-slash routes, instant 404 on SPA nav. The linkinator job in the docs workflow can catch these but is set continue-on-error: true, so the finding is advisory.
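To make the trailing-slash failure concrete, here is a minimal sketch of how a bare relative href resolves against the current page URL. It uses only the standard `URL` constructor; the origin is a placeholder and `resolveHref` is a hypothetical helper name, not something from the Superset docs tooling.

```typescript
// Resolve a relative href the way a browser does: against the current page URL.
function resolveHref(href: string, pageUrl: string): string {
  return new URL(href, pageUrl).pathname;
}

// Placeholder origin (assumption); only the path matters for this illustration.
const origin = "https://example.invalid";

// Trailing-slash route: "./mcp" lands inside the "overview/" directory.
const withSlash = resolveHref("./mcp", origin + "/developer-docs/extensions/overview/");

// Same route without the trailing slash: "./mcp" replaces the last path segment.
const withoutSlash = resolveHref("./mcp", origin + "/developer-docs/extensions/overview");

console.log(withSlash);    // /developer-docs/extensions/overview/mcp
console.log(withoutSlash); // /developer-docs/extensions/mcp
```

The same link text yields two different targets depending only on the trailing slash, which is why the bare form is fragile.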
What's in the commit.
- Fixed the two stragglers on `editors.md` you flagged. While there, swept the rest of the PR's modified files and the broader `developer_docs` tree and found 76 of the same pattern total (14 in PR-modified files including the two here, 62 in unchanged files). All 76 targets resolved to real files, so the fix is uniformly "append `.md`". Anchors / query strings preserved.
- Patched the component-page generator (`generate-superset-components.mjs`). 54 of the 76 lived in two auto-generated index files; without fixing the generator the next regeneration would silently undo the manual edits.
- Added `docs/scripts/lint-docs-links.mjs`, a fast source-level linter that scans `.md`/`.mdx` files under active content trees (skipping `versioned_docs/`), classifies link URLs as `doc-with-ext` / `asset` / `bare`, and exits 1 on any bare relative internal link. Excludes fenced code blocks and asset-style targets (`.png`/`.json`/etc.).
- Wired it into `superset-docs-verify.yml` as a blocking `Lint docs links` step ahead of the build, so future regressions fail in seconds.
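The classification step described in the list above could look roughly like the following. This is a sketch under stated assumptions, not the contents of `lint-docs-links.mjs`: the function names, the `not-relative` category, the asset-extension list, and the link regex are all illustrative.

```typescript
type LinkKind = "not-relative" | "doc-with-ext" | "asset" | "bare";

// Illustrative subset of asset extensions (assumption; the real list is not shown in this PR).
const ASSET_EXT = /\.(png|jpe?g|gif|svg|json)$/i;

// Classify one link URL by its form only (no filesystem access).
function classifyUrl(url: string): LinkKind {
  if (!/^\.{1,2}\//.test(url)) return "not-relative";
  const path = url.split(/[?#]/)[0]; // ignore anchor and query string
  if (/\.mdx?$/i.test(path)) return "doc-with-ext";
  if (ASSET_EXT.test(path)) return "asset";
  return "bare";
}

interface BareLink {
  line: number;
  url: string;
}

// Report bare relative links with 1-based line numbers, skipping fenced code blocks.
function findBareLinks(markdown: string): BareLink[] {
  const findings: BareLink[] = [];
  let inFence = false;
  markdown.split("\n").forEach((text, index) => {
    if (/^\s*(`{3,}|~{3,})/.test(text)) {
      inFence = !inFence; // toggle on every opening or closing fence line
      return;
    }
    if (inFence) return;
    const linkRe = /\[[^\]]*\]\(([^)\s]+)\)/g;
    let match: RegExpExecArray | null;
    while ((match = linkRe.exec(text)) !== null) {
      if (classifyUrl(match[1]) === "bare") {
        findings.push({ line: index + 1, url: match[1] });
      }
    }
  });
  return findings;
}
```

A CI wrapper would presumably print each finding as `file:line` and exit non-zero when the list is non-empty.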
The suggestion to add `.md` extensions to the `../contribution-types` and `../development` links in the Next Steps block aligns with the PR's goal of fixing SPA-navigation 404s by ensuring links point to `.md` files. This makes the links consistent with the updated `./sqllab.md` link in the same block.
Previously docusaurus.config.ts had `onBrokenLinks: 'warn'`, so broken
internal links produced advisory warnings during build but didn't gate
merges. Tightening to `throw` surfaces every broken internal route at
build time. Three classes of issue fell out:
1. Stale `/docs/...` and `/docs/6.0.0/...` references in the 6.0.0
versioned snapshot. The user-facing docs section was renamed
`docs` → `user-docs` (routeBasePath) at some point after 6.0.0 was
cut, but the snapshot's links still pointed at the old prefix. The
live site redirects /docs/* → /user-docs/* at runtime, but
Docusaurus's onBrokenLinks checker doesn't honor redirects.
Bulk-rewrote /docs/* → /user-docs/* across the snapshot (and one
/docs/api → /developer-docs/api).
2. Bare-relative MDX links like `[Label](./mcp)` (no .md/.mdx
extension). Docusaurus renders an absolute href in SSR HTML, so
static crawlers see correct links — BUT React Router's `<Link>`
component on the client side resolves the bare path relative to
the current URL on click, so when the page URL has a trailing
slash (e.g. /extensions/overview/), `./mcp` becomes
/extensions/overview/mcp (404). This is exactly the broken-flow a
user reported on /developer-docs/extensions/overview/. Added the
`.md`/`.mdx` extension to all 44 such links across 17 files; this
makes Docusaurus resolve them to the canonical doc URL at the
<Link> level, so SPA navigation works regardless of trailing slash.
3. Miscellaneous content fixes:
- 4 `/configuration/feature-flags` references in 6.0.0 snapshot
pointed at a page that doesn't exist in that version (the
dedicated feature-flags page was added later). Repointed to the
`#feature-flags` anchor inside `configuring-superset.mdx`.
- 3 references to `superset-core/src/superset_core/rest_api/decorators.py`
in extensions docs were rendered as relative URLs, resolving to
/developer-docs/extensions/superset-core/... (404). Converted to
absolute GitHub URLs.
- 1 `/storybook/?path=...` link in extensions/components/index.mdx
pointed at a non-existent route. Repointed to the existing
`/developer-docs/testing/storybook` page that explains how to
run Storybook locally.
- 4 unclosed-paren markdown links in 6.0.0 installation-methods.mdx
(pre-existing source bugs).
Build now passes with `onBrokenLinks: 'throw'`. Note that
`onBrokenAnchors` is still `'warn'` (default); a separate effort
should tighten that and fix the surviving anchor warnings (currently
~60 instances of `/community#superset-community-calendar`).
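For reference, the two settings discussed in this commit message would sit in `docs/docusaurus.config.ts` roughly as below. This is a fragment only: the local `ReportingSeverity` type is a stand-in for the one Docusaurus exports, and the `linkChecks` name is hypothetical (in the real file these are top-level fields of the config object).

```typescript
// Stand-in for the severity levels Docusaurus accepts for these options.
type ReportingSeverity = "ignore" | "log" | "warn" | "throw";

interface LinkCheckOptions {
  onBrokenLinks: ReportingSeverity;
  onBrokenAnchors: ReportingSeverity;
}

// Hypothetical holder object; in docusaurus.config.ts these keys live on the config itself.
const linkChecks: LinkCheckOptions = {
  onBrokenLinks: "throw",  // was "warn" before this PR; a broken internal route now fails the build
  onBrokenAnchors: "warn", // left at the default; tightening is deferred to a follow-up
};
```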
Copilot flagged two stragglers on editors.md where the previous file-by-file conversion stopped halfway. Sweeping for the same pattern across the active content tree found 76 bare relative internal links total: 14 in this PR's already-modified files (Copilot's two plus twelve more) and 62 in unchanged files.

Why the build doesn't catch this
─────────────────────────────────
`onBrokenLinks: 'throw'` (set in this PR) only validates *file-based* markdown references, that is, links whose URL ends in `.md` / `.mdx`. Those go through Docusaurus's file resolver, which can prove the target exists.

Bare relative URL paths like `[Foo](../foo)` skip that resolver entirely; Docusaurus emits them as raw hrefs. The browser then resolves them against the *current* page URL, and for trailing-slash routes that almost always lands in the wrong directory. The page navigates client-side and 404s.

The linkinator job in CI *can* catch these, but it's `continue-on-error: true` so findings are advisory.

What this commit does
──────────────────────
1. Fix all 76 bare relative internal links across the active docs tree by appending `.md` to each one (preserving anchors / query strings). All 76 targets resolved to real files; no link targets changed, only the form of the reference.
2. Fix the component-page generator. 54 of the 76 bare links lived in two auto-generated index files (`components/ui/index.mdx` and `components/design-system/index.mdx`). The next regeneration would have undone the manual fixes without this. The two emission sites in `generate-superset-components.mjs` now emit `.md`-suffixed links; a comment at the call site explains why.
3. Add `docs/scripts/lint-docs-links.mjs`, a fast source-level linter that scans `.md`/`.mdx` files under the active content trees (skipping `versioned_docs/` snapshots) and fails if it finds any markdown link whose URL starts with `./` or `../` and does not end in `.md`/`.mdx`. Excludes asset paths (.png, .json, etc.) and ignores fenced code blocks. Wired up as `yarn lint:docs-links`.
4. Add a `Lint docs links` step to `superset-docs-verify.yml`, running before the build step so PRs that introduce the pattern fail in seconds rather than at build time or not at all. Blocking, not advisory, which is exactly the gap linkinator's `continue-on-error` leaves open.

Verified
────────
- `yarn lint:docs-links` exits 0 on the cleaned tree
- Re-introducing one bare link makes the linter report the exact file:line with the offending URL, exit code 1
- All 76 originally-flagged targets resolved to real `.md` / `.mdx` files; only the form of the reference changed
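The "append the extension, preserve anchors / query strings" rewrite from step 1 can be sketched as a small pure function. `appendDocExtension` is a hypothetical name, and the extension is a parameter on the assumption that some targets are `.mdx` rather than `.md`; this is an illustration, not the script used for the sweep.

```typescript
// Insert a doc extension before any "?query" or "#anchor" part of a relative link.
function appendDocExtension(url: string, ext: ".md" | ".mdx" = ".md"): string {
  const cut = url.search(/[?#]/);
  const path = cut === -1 ? url : url.slice(0, cut);
  const suffix = cut === -1 ? "" : url.slice(cut);
  // Leave the link alone if it already has a doc extension or points at a directory.
  if (/\.mdx?$/i.test(path) || /\/$/.test(path)) return url;
  return path + ext + suffix;
}

console.log(appendDocExtension("../contribution-types")); // ../contribution-types.md
console.log(appendDocExtension("./mcp#setup"));           // ./mcp.md#setup
```

Choosing the extension still requires checking what is actually on disk; the function only handles the string surgery.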
CI caught a follow-on bug from the previous commit. The component files emitted by generate-superset-components.mjs are `.mdx`, but my generator patch appended `.md` to the link suffix. Result: the generated index pages (ui/index.mdx, design-system/index.mdx, and the build-time-generated extension/index.mdx) reference `./autocomplete.md`, `./dropdowncontainer.md`, etc., all targets that don't exist (the real files are `.mdx`). Docusaurus's `onBrokenMarkdownLinks: 'throw'` correctly bombs the build.

Worth calling out: this is exactly the failure mode the previous commit was trying to fix. Bare relative links 404'd silently; now that they go through the file resolver, the resolver catches the extension mismatch instead. The system is working; my fix was just wrong on this one detail.

Two corrections:
1. generate-superset-components.mjs now emits `.mdx` suffix (not `.md`) on the component links it produces, matching the actual page file format. Comment updated to make the intent explicit for future maintainers.
2. The two committed auto-generated index files (developer_docs/components/ui/index.mdx and developer_docs/components/design-system/index.mdx) had 46 + 7 = 53 `.md` link suffixes from the previous commit. Bulk-corrected to `.mdx` to match their actual targets. The third generated file (components/extension/index.mdx) is produced at build time and not committed; the generator fix above handles it.

Verified with a re-audit script: 92/92 markdown-link targets across the PR's modified files now resolve to real `.md` / `.mdx` files on disk. lint-docs-links still passes (it only fails on missing extensions, not on this category of mismatch; that's `onBrokenMarkdownLinks`'s job, and it WAS doing its job here).
…target
Previously the linter only flagged bare relative links (no .md/.mdx
extension). The Docusaurus build catches the other two classes
(`onBrokenMarkdownLinks: 'throw'`), but only after a multi-minute
compile. Now the source-level lint catches all three:
- bare: `[X](../foo)` (skips file resolver entirely)
- missing-target: `[X](./gone.md)` (target file doesn't exist)
- wrong-extension: `[X](./foo.md)` w/ .mdx on disk (the .md vs .mdx mismatch that broke the previous CI run on this branch)
Implementation:
- classifyLink() resolves the link target against the source file's
directory. If it exists → ok. If not but the other extension does →
wrong-extension (reports which extension is actually on disk). If
neither exists → missing-target.
- Output groups findings by kind with category-specific explanations
so the developer immediately knows whether to add an extension, fix
one, or chase a real missing target.
Verified end-to-end by injecting one of each failure mode in turn
and confirming the linter reports the right file:line / category;
restoring the file always returns the lint to green.
Build-time `onBrokenMarkdownLinks: 'throw'` stays in place — defense
in depth. The lint just makes the same finding visible in seconds
rather than minutes.
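The three-way check described above can be sketched as follows. This is an illustration under assumptions, not the real `classifyLink()`: the existence check is injected as a callback so the example stays free of filesystem access, paths are POSIX-style, and `resolveRelative` plus the result shape are hypothetical.

```typescript
type Finding = "ok" | "bare" | "wrong-extension" | "missing-target";

interface LinkResult {
  kind: Finding;
  onDisk?: string; // extension that actually exists, for wrong-extension findings
}

// Join a relative link onto the source file's directory (POSIX-style paths only).
function resolveRelative(sourceFile: string, link: string): string {
  const parts = sourceFile.split("/").slice(0, -1);
  link.split("/").forEach((segment) => {
    if (segment === "." || segment === "") return;
    if (segment === "..") parts.pop();
    else parts.push(segment);
  });
  return parts.join("/");
}

// Classify a relative doc link; `exists` stands in for a filesystem lookup.
function classifyLink(
  sourceFile: string,
  url: string,
  exists: (path: string) => boolean
): LinkResult {
  const path = url.split(/[?#]/)[0];
  const ext = /\.(mdx?)$/i.exec(path);
  if (!ext) return { kind: "bare" };
  const target = resolveRelative(sourceFile, path);
  if (exists(target)) return { kind: "ok" };
  // Try the sibling extension before declaring the target missing.
  const other = ext[1].toLowerCase() === "md" ? ".mdx" : ".md";
  if (exists(target.replace(/\.mdx?$/i, other))) {
    return { kind: "wrong-extension", onDisk: other };
  }
  return { kind: "missing-target" };
}
```

In a real script, `exists` would presumably wrap a filesystem check; injecting it also makes each failure mode easy to reproduce in a unit test.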
65580fc to 56d1dae
villebro left a comment
My god what a mess our links have been 😶‍🌫️ LGTM
Not any more!!!
To be clear, those weren't ALL broken... adding the file extension just helps to stabilize them.
…ents

Snapshots all four versioned Docusaurus sections at v6.1.0, cut from master after the version-cutting tooling (#39837) and broken-internal-links fixes (#40102) landed. Captures fresh auto-generated content and freezes data dependencies so the historical snapshot stays correct.

Versioning behavior: lastVersion stays at current for every section, so the canonical URLs (/docs/..., /admin-docs/..., /developer-docs/..., /components/...) continue to render content from master. The current version is consistently labeled "Next" with an unreleased banner, and 6.1.0 is a historical pin accessible only via its explicit version segment.

Component playground: previously disabled: true in versions-config.json, now enabled and versioned. The plugin block in docusaurus.config.ts was already gated only by the disabled flag, so no other code changes were needed to bring it back online.

Snapshot includes:
- All MDX content for the four sections.
- Auto-gen captured fresh: 74 database pages (engine spec metadata), ~1,800 API reference files (openapi.json), 59 component pages (Storybook stories).
- Data imports frozen at cut time into snapshot-local _versioned_data/ dirs:
  versioned_docs/version-6.1.0/_versioned_data/src/data/databases.json (canonical 80-database diagnostics from master, preserved by the generator's input-hash cache)
  admin_docs_versioned_docs/version-6.1.0/_versioned_data/data/countries.json
  admin_docs_versioned_docs/version-6.1.0/_versioned_data/static/feature-flags.json
  developer_docs_versioned_docs/version-6.1.0/_versioned_data/static/data/components.json
- Import paths in deeply-nested files rewritten so they still resolve from one directory deeper inside the snapshot.

Verified via full yarn build: exit 0, no broken links surfaced by onBrokenLinks: throw. Anchor warnings present are pre-existing on master (community#superset-community-calendar) and unrelated.
…ks (apache#40102) Co-authored-by: Claude Code <noreply@anthropic.com>
…omponents

Snapshots all four versioned Docusaurus sections at v6.1.0, cut from master after the version-cutting tooling (#39837), broken-internal-links fix (#40102), and user_docs rename (#40171) all landed. With the rename in place, all four sections now produce parallel-named files at the docs/ root (no more bare `versioned_docs/` outlier).

Versioning behavior: lastVersion stays at current for every section, so the canonical URLs (/user-docs/..., /admin-docs/..., /developer-docs/..., /components/...) continue to render content from master. The current version is consistently labeled "Next" with an unreleased banner, and 6.1.0 is a historical pin accessible only via its explicit version segment.

Component playground: previously disabled: true in versions-config.json, now enabled and versioned.

Snapshot includes:
- All MDX content for the four sections.
- Auto-gen captured fresh: 74 database pages (engine spec metadata), ~1,800 API reference files (openapi.json), 59 component pages (Storybook stories).
- Data imports frozen at cut time into snapshot-local _versioned_data/ dirs:
  user_docs_versioned_docs/version-6.1.0/_versioned_data/src/data/databases.json (canonical 80-database diagnostics from master, preserved by the generator's input-hash cache)
  admin_docs_versioned_docs/version-6.1.0/_versioned_data/data/countries.json
  admin_docs_versioned_docs/version-6.1.0/_versioned_data/static/feature-flags.json
  developer_docs_versioned_docs/version-6.1.0/_versioned_data/static/data/components.json
- Import paths in deeply-nested files rewritten so they still resolve from one directory deeper inside the snapshot.
- developer_docs/extensions/overview.md snapshot has the FIXED ./mcp.md form (from #40102), so the SPA-nav 404 isn't baked into the 6.1.0 version.

Verified via full yarn build: exit 0, no broken links surfaced by onBrokenLinks: throw.
SUMMARY
Tightens `onBrokenLinks: 'warn'` → `'throw'` so the docs build fails on any broken internal route, and fixes everything that fell out.

Three classes of issue surfaced and got fixed:

1. Bare-relative MDX links: the SPA-nav broken-flow (the "/extensions/overview/mcp" 404 a user reported)

Markdown links like `[Label](./mcp)` (no `.md`/`.mdx` extension) render an absolute href in SSR HTML, so static crawlers (and `curl`) see the correct link. But React Router's `<Link>` component on the client side resolves the bare path relative to the current URL on click, so when the page URL has a trailing slash (`/extensions/overview/`), `./mcp` resolves to `/extensions/overview/mcp` (404). Added the file extension to all 44 such links across 17 files; this lets Docusaurus's MDX loader resolve them to the canonical doc URL at the `<Link>` level, so SPA navigation works regardless of trailing slash.

This is what was breaking the "Next Steps" bullets on `/developer-docs/extensions/overview/`.

2. Stale `/docs/...` references in the 6.0.0 versioned snapshot

The user-facing docs section was renamed `docs` → `user-docs` (routeBasePath) at some point after 6.0.0 was cut, but the snapshot's links still pointed at the old prefix. The live site redirects `/docs/*` → `/user-docs/*` at runtime, but Docusaurus's `onBrokenLinks` checker doesn't honor redirect routes. Bulk-rewrote `/docs/*` → `/user-docs/*` across the snapshot (and one `/docs/api` → `/developer-docs/api`).

3. Miscellaneous content fixes

- 4 `/configuration/feature-flags` references in the 6.0.0 snapshot pointed at a page that doesn't exist in that version (the dedicated feature-flags page was added later). Repointed to the `#feature-flags` anchor inside `configuring-superset.mdx`.
- 3 references to `superset-core/src/superset_core/rest_api/decorators.py` in extensions docs were rendered as relative URLs, resolving to `/developer-docs/extensions/superset-core/...` (404). Converted to absolute GitHub URLs.
- 1 `/storybook/?path=...` link in `extensions/components/index.mdx` pointed at a non-existent `/storybook` route. Repointed to the existing `/developer-docs/testing/storybook` page that explains how to run Storybook locally.
- 4 unclosed-paren markdown links in `6.0.0/installation/installation-methods.mdx` (pre-existing source bugs the strict checker exposed).

Follow-up

`onBrokenAnchors` is still `'warn'` (default). The build currently surfaces ~60 instances of `/community#superset-community-calendar` (the anchor IS defined in `docs/src/pages/community.tsx` via `<BlurredSection id="...">` but Docusaurus's anchor checker doesn't seem to discover it on the custom React page). Worth a separate PR to tighten that and figure out the anchor detection.

Doesn't address the broader question of catching stale deployed links (Algolia stale index, external link rot); that'd be a periodic CI check against the live site, separate scope.
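The bulk prefix rewrite from item 2 of the summary can be sketched as a pure function. `rewriteLegacyDocsUrl` is a hypothetical name, and treating sub-paths of `/docs/api` the same as `/docs/api` itself is an assumption; the PR only mentions a single `/docs/api` link.

```typescript
// Map a stale /docs/... route onto the renamed sections.
function rewriteLegacyDocsUrl(url: string): string {
  // The one special case named in the PR: /docs/api moved to the developer docs.
  if (/^\/docs\/api(?=$|[\/?#])/.test(url)) {
    return url.replace(/^\/docs\/api/, "/developer-docs/api");
  }
  // Everything else under /docs moves to /user-docs. The lookahead leaves
  // unrelated prefixes such as /docs-old untouched.
  return url.replace(/^\/docs(?=$|[\/?#])/, "/user-docs");
}

console.log(rewriteLegacyDocsUrl("/docs/configuration/databases#presto"));
// /user-docs/configuration/databases#presto
console.log(rewriteLegacyDocsUrl("/docs/api")); // /developer-docs/api
```

Anchoring the pattern at the start of the string keeps external URLs that merely contain `/docs/` out of scope.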
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
N/A.
TESTING INSTRUCTIONS
    cd docs
    yarn install
    yarn build

Build should succeed. Visit `/developer-docs/extensions/overview/` (with trailing slash) locally via `yarn serve` and click the "MCP Integration" bullet; it should navigate to `/developer-docs/extensions/mcp` (not `/developer-docs/extensions/overview/mcp`).

ADDITIONAL INFORMATION