[v1.x] docs: publish llms.txt and markdown renditions of the docs #3029
Backport of the main-branch hook, adapted for the v1.x docs layout: the API reference here is the single mkdocstrings stub page api.md rather than a generated api/ tree. The hook generates the llmstxt.org artifacts into the site: llms.txt (an index of the prose pages grouped by nav section), a .md rendition of each prose page with snippet includes resolved and relative links made absolute, and llms-full.txt with every page concatenated. Incremental (--dirty) builds are rejected: they skip unmodified pages, which would silently truncate the generated artifacts. Anchor pymdownx.snippets base_path to the config directory so the extension and the hook resolve includes identically regardless of the build's working directory.
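The index layout described above can be sketched roughly as follows. This is a minimal illustration of the llmstxt.org shape (H1, blockquote summary, one H2 list per nav section), not the hook's code: `PageEntry`, `render_llms_txt`, and any URLs passed in are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageEntry:
    """Hypothetical record for one prose page; not the hook's real type."""

    section: str  # nav section the page sits under
    title: str
    md_url: str  # absolute URL of the page's .md rendition


def render_llms_txt(site_name: str, description: str, pages: list[PageEntry]) -> str:
    """Render an llms.txt index: H1, blockquote summary, one H2 list per section."""
    lines = [f"# {site_name}", "", f"> {description}", ""]
    sections: dict[str, list[PageEntry]] = {}
    for page in pages:
        # dicts preserve insertion order, so sections appear in first-seen nav order
        sections.setdefault(page.section, []).append(page)
    for section, entries in sections.items():
        lines.append(f"## {section}")
        lines.append("")
        lines.extend(f"- [{entry.title}]({entry.md_url})" for entry in entries)
        lines.append("")
    # Normalise to exactly one trailing newline
    return "\n".join(lines).rstrip("\n") + "\n"
```

Grouping through a plain dict keeps the output deterministic for a given nav order, which matters if the artifact is ever compared between builds.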
site_name and site_description both said 'MCP Server', which is what the deployed site header, browser title, and the generated llms.txt header displayed. Use the same name and description as the v2 docs.
Unresolvable relative .md links and pages that do not start with an H1 now fail the build instead of producing broken or malformed renditions. Embedded .py snippets gain a leading comment naming the source file, so the rendition still points at the file on disk.
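As a rough sketch of the two failure modes in this commit (missing H1, unresolvable relative `.md` link), assuming a simplified link pattern and a plain `ValueError` in place of the hook's build error: `absolutize_links`, `known`, and the regex below are illustrative names, and pointing the rewritten link at the `.md` path under the site URL is an assumption about the target.

```python
import posixpath
import re
from urllib.parse import urljoin

# Simplified pattern (assumption): inline links to relative .md targets,
# with an optional #fragment. Absolute URLs, root-relative paths, and
# pure-anchor links are skipped by the lookahead.
_MD_LINK = re.compile(r"\]\((?!https?://|#|/)([^)#\s]+\.md)(#[^)\s]*)?\)")


def absolutize_links(markdown: str, page_src_uri: str, known: set[str], site_url: str) -> str:
    """Rewrite relative .md links to absolute URLs, failing on bad input."""
    if not markdown.lstrip().startswith("# "):
        raise ValueError(f"{page_src_uri}: page does not start with an H1")

    def rewrite(match: re.Match) -> str:
        # Resolve the link relative to the directory of the page that contains it
        base_dir = posixpath.dirname(page_src_uri)
        target = posixpath.normpath(posixpath.join(base_dir, match.group(1)))
        if target not in known:
            raise ValueError(f"{page_src_uri}: unresolvable link to {match.group(1)}")
        fragment = match.group(2) or ""
        return f"]({urljoin(site_url, target)}{fragment})"

    return _MD_LINK.sub(rewrite, markdown)
```

Raising instead of leaving the link untouched means a renamed or deleted page surfaces at build time rather than as a dead link in a published rendition.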
1 issue found across 3 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="docs/hooks/llms_txt.py">
<violation number="1" location="docs/hooks/llms_txt.py:100">
P2: Check for unconsumed snippet markers after substitutions, not in the original page markdown. Included snippets can otherwise leak unresolved `--8<--` lines into llms artifacts.</violation>
</file>
content = "\n".join(indent + line if line else line for line in content.split("\n"))
return content

resolved, substitutions = _SNIPPET_LINE.subn(include, markdown)
P2: Check for unconsumed snippet markers after substitutions, not in the original page markdown. Included snippets can otherwise leak unresolved --8<-- lines into llms artifacts.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At docs/hooks/llms_txt.py, line 100:
<comment>Check for unconsumed snippet markers after substitutions, not in the original page markdown. Included snippets can otherwise leak unresolved `--8<--` lines into llms artifacts.</comment>
<file context>
@@ -0,0 +1,178 @@
+ content = "\n".join(indent + line if line else line for line in content.split("\n"))
+ return content
+
+ resolved, substitutions = _SNIPPET_LINE.subn(include, markdown)
+ if substitutions != sum("--8<--" in line for line in markdown.splitlines()):
+ raise PluginError(f"llms_txt: unresolved snippet include in {page.file.src_uri}")
</file context>
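One way to read the suggestion, as a minimal sketch: run the substitution first, then scan the resolved text for leftover markers. The pattern below is a simplified stand-in for the hook's `_SNIPPET_LINE`, the `files` mapping replaces real file reads, and `ValueError` stands in for `PluginError`; all three are assumptions made to keep the example self-contained.

```python
import re

# Simplified stand-in (assumption) for the hook's _SNIPPET_LINE pattern:
# a whole line of the form  --8<-- "path"  with optional indentation.
_SNIPPET_LINE = re.compile(
    r'^(?P<indent>[ \t]*)--8<-- "(?P<path>[^"]+)"[ \t]*$', re.MULTILINE
)


def resolve_snippets(markdown: str, files: dict[str, str], src_uri: str) -> str:
    """Inline snippet includes, then verify that no marker survived."""

    def include(match: re.Match) -> str:
        path = match.group("path")
        if path not in files:
            raise ValueError(f"llms_txt: missing snippet {path} in {src_uri}")
        indent = match.group("indent")
        content = files[path].rstrip("\n")
        # Re-indent the included text to match the marker line
        return "\n".join(indent + line if line else line for line in content.split("\n"))

    resolved = _SNIPPET_LINE.sub(include, markdown)
    # Check the output, not the input: an included file may itself carry a
    # marker that a single substitution pass never expands.
    if any("--8<--" in line for line in resolved.splitlines()):
        raise ValueError(f"llms_txt: unresolved snippet include in {src_uri}")
    return resolved
```

A page that quotes the marker deliberately (for example in a tutorial about snippets) would also trip this check, so it may need an escape hatch if such pages ever appear.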
hooks:
  - docs/hooks/llms_txt.py
🟡 The hook source docs/hooks/llms_txt.py lives inside the default docs_dir, and since mkdocs.yml sets no exclude_docs, MkDocs copies it verbatim into the built site, publishing it at /hooks/llms_txt.py as a stray static asset. Harmless, but easy to avoid by adding exclude_docs: hooks/ or moving the hook outside docs/ — note the same layout exists on main from #3024, so any cleanup should land on both branches to preserve parity.
What happens: MkDocs collects every file under docs_dir (the default docs/ here — mkdocs.yml sets no docs_dir override). Files that are not markdown pages and don't match exclude_docs (which is unset; the built-in defaults only exclude dot-prefixed paths and /templates/) are treated as media files and copied verbatim into site_dir. Listing the script under the hooks: key (mkdocs.yml:117-118) only tells MkDocs to import it as a build hook — it does not exempt it from the file collection. The MkDocs documentation's own hooks example places such scripts outside docs_dir for exactly this reason.
Concrete walk-through:
1. `mkdocs build --strict` runs; the file collector walks `docs/` and finds `docs/hooks/llms_txt.py`.
2. `.py` is not a recognized documentation page extension and the path matches no exclusion, so it's classified as a media file with `dest_uri = hooks/llms_txt.py`.
3. During the build it is copied byte-for-byte into `site/hooks/llms_txt.py`.
4. The deploy workflow publishes the v1.x build at the site root, so the hook source becomes reachable at `https://py.sdk.modelcontextprotocol.io/hooks/llms_txt.py`.
5. Strict mode raises no warning: `omitted_files`/`unrecognized_links` validation only applies to markdown pages and links, so the artifact ships silently.
Why nothing prevents it: the new hook adds itself to hooks: but the config adds no corresponding exclude_docs entry, and the hook itself only filters what goes into llms.txt/llms-full.txt (nav pages with markdown sources) — it has no effect on which static files MkDocs copies.
Impact: negligible. The source is already public on GitHub, the file isn't linked from any page, doesn't appear in nav/search/llms.txt, and can't break the build. It's purely an unintended deployment artifact.
On the parity argument: it's true this layout mirrors the already-merged #3024 on main (which has the same issue and ships its hooks under docs/hooks/ with no exclude_docs), and the PR explicitly aims for byte-identical parity with that hook. That makes this not a defect introduced by the backport's logic, but the artifact is still unintended on both branches. The right move is to fix it consistently — either add to both branches' mkdocs.yml:

    exclude_docs: |
      hooks/

or move the hooks to a top-level `hooks/` or `scripts/` directory (and update the `hooks:` paths) on main first, then carry that into this backport. Not blocking; flagging so it can be tidied whenever convenient.
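Whichever fix is chosen, a quick post-build check can confirm the stray file is gone. The helper below is a hypothetical sketch (the name `stray_sources` and the default suffix list are assumptions), intended to be pointed at the built `site/` directory.

```python
from pathlib import Path


def stray_sources(site_dir: Path, suffixes: tuple[str, ...] = (".py",)) -> list[str]:
    """Return site-relative paths of files in the built site with the given suffixes."""
    return sorted(
        path.relative_to(site_dir).as_posix()
        for path in site_dir.rglob("*")
        if path.is_file() and path.suffix in suffixes
    )
```

An empty list after `mkdocs build --strict` would indicate that no hook source was copied into the site.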
Backport of #3024 to the v1.x docs: publishes an llms.txt version of the documentation at the site root, generated at build time by a small MkDocs hook (no new dependencies):
- `/llms.txt`: a markdown index of the prose pages, grouped by nav section
- `.md` rendition of every prose page next to its HTML (e.g. `server/index.md`), with relative links rewritten to absolute URLs
- `/llms-full.txt`: every prose page concatenated for single-fetch consumption

Differences from the main-branch hook: the v1 API reference is the single mkdocstrings stub page `api.md` (not a generated `api/` tree), so that one page is linked as rendered HTML from the `## Optional` section. The snippet-include machinery is carried over for parity but is dormant: the v1 docs contain no `--8<--` includes.

One separate commit fixes the site metadata: `site_name`/`site_description` were both literally "MCP Server", which is what the deployed site header, browser title, and the generated llms.txt header would display; they now match the v2 docs ("MCP Python SDK"). Drop that commit if the rename is unwanted.

Motivation and Context
#3024 covers `/v2/` only; the deploy workflow builds the v1.x branch at the site root, so root `/llms.txt` requires the hook on this branch.

How Has This Been Tested?
`mkdocs build --strict` locally (this branch has no docs CI job on PRs); artifacts inspected: 12 prose pages indexed, ~153 KB llms-full.txt, nested nav sections (Experimental → Tasks) flatten correctly, `api.md` linked as HTML. The hook fails the build loudly on `--dirty` builds, unresolvable snippet includes or links, pages not starting with an H1, and snippet paths outside the repo root. These are the same failure modes as #3024.

Breaking Changes
None.
Types of changes
Checklist
Additional context
The hook is byte-identical to the main-branch version apart from the `api.md` adaptations; v1.x's ruff config bans `global` statements, which is why both versions hold state in a dataclass.

AI Disclaimer