docs: publish llms.txt and markdown renditions of the docs by maxisbey · Pull Request #3024 · modelcontextprotocol/python-sdk

maxisbey · 2026-06-29T18:42:59Z

Publishes an llms.txt version of the documentation, generated at build time by a small MkDocs hook (no new dependencies):

/llms.txt — a markdown index of the prose pages, grouped by nav section
a .md rendition of every prose page next to its HTML (e.g. tutorial/tools/index.md), with --8<-- snippet includes resolved inline and relative links rewritten to absolute URLs — this is what the llms.txt links point at
/llms-full.txt — every prose page concatenated for single-fetch consumption

The API reference pages are mkdocstrings stubs with no markdown source, so they are linked as rendered HTML from an ## Optional section instead of being embedded.
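
As a rough illustration of the index layout described above, the sketch below groups prose pages by nav section and lists HTML-only pages under an Optional heading. The `Page` dataclass, the `build_index` function, and the example.com URLs are hypothetical names chosen for this example; they are not taken from the hook itself.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    url: str  # absolute URL the index bullet links to
    section: str  # nav section the page belongs to


def build_index(site_name: str, pages: list[Page], optional: list[Page]) -> str:
    """Render an llms.txt-style index: H1 title, one H2 per section, then Optional."""
    lines = [f"# {site_name}", ""]
    sections: dict[str, list[Page]] = {}
    for page in pages:  # dicts preserve insertion order, so nav order is kept
        sections.setdefault(page.section, []).append(page)
    for name, members in sections.items():
        lines += [f"## {name}", ""]
        lines.extend(f"- [{p.title}]({p.url})" for p in members)
        lines.append("")
    if optional:
        # Pages with no markdown source are linked as rendered HTML instead.
        lines += ["## Optional", ""]
        lines.extend(f"- [{p.title}]({p.url})" for p in optional)
        lines.append("")
    return "\n".join(lines)
```

The bullets are title-only here, matching the note that pages carry no frontmatter description yet.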

Motivation and Context

The spec site already serves https://modelcontextprotocol.io/llms.txt (a Mintlify platform feature); none of the SDK doc sites do. The output format matches the spec site and the Pydantic ecosystem's mkdocs-llmstxt output. We generate from the markdown sources directly instead of using the mkdocs-llmstxt plugin: our prose pages are plain markdown, so the plugin's HTML-to-markdown round-trip (and its dependency chain) isn't needed.

How Has This Been Tested?

mkdocs build --strict (what the docs CI job runs) exercises the hook on every build; artifacts inspected manually: 41 indexed pages, ~440 KB llms-full.txt, all 118 snippet includes resolved, links absolute.
Edge cases exercised by building with synthetic docs changes: indented includes inside admonitions/tabs, pymdownx section syntax (clean error naming the page), included files whose content contains a literal --8<--, titled links, and --dirty builds (rejected with an error before any artifact is written, since dirty builds skip unmodified pages and would silently truncate the output).
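
One way the --dirty rejection could look is sketched below. MkDocs passes a `dirty` flag to its `on_startup` event; whether the hook performs the check in that event is an assumption, and `BuildError` is a stand-in for `mkdocs.exceptions.PluginError` so the sketch runs without MkDocs installed.

```python
class BuildError(Exception):
    """Stand-in for mkdocs.exceptions.PluginError (assumption, keeps the sketch stdlib-only)."""


def on_startup(*, command: str, dirty: bool) -> None:
    """Reject incremental builds before any artifact is written."""
    if dirty:
        # A dirty build skips unmodified pages, so the concatenated output
        # would be missing every page that did not change.
        raise BuildError(
            "llms_txt: --dirty builds skip unmodified pages and would truncate "
            "llms.txt and llms-full.txt; run a full build instead"
        )
```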

Breaking Changes

None. One behavioral note: pymdownx.snippets now sets base_path: !relative $config_dir — snippet resolution previously depended on the build's working directory (the extension default is the CWD, not the repo root as the old comment claimed). Builds run from the repo root behave identically.

Types of changes

Documentation update

Checklist

I have read the MCP Documentation
My code follows the repository's style guidelines
New and existing tests pass locally
I have added appropriate error handling
I have added or updated documentation as needed

Additional context

The deploy workflow builds v1.x at the site root, so root /llms.txt needs a [v1.x] backport of this hook; this PR covers /v2/. Pages have no frontmatter description: yet, so the index bullets are title-only; the hook picks descriptions up automatically if we add them later.

_{AI Disclaimer}

A build hook generates the llmstxt.org artifacts into the site: llms.txt (an index of the prose pages grouped by nav section), a .md rendition of each prose page with snippet includes resolved and relative links made absolute, and llms-full.txt with every page concatenated for single-fetch consumption. The API reference pages are mkdocstrings stubs with no markdown source, so they are linked as rendered HTML from an Optional section instead of being embedded. Incremental (--dirty) builds are rejected by the hook: they skip unmodified pages, which would silently truncate the generated artifacts. Anchor pymdownx.snippets base_path to the config directory so the extension and the hook resolve includes identically regardless of the build's working directory (the extension's default base_path is the CWD, not the repo root as the old comment claimed).

Mirrors the restrict_base_path default of pymdownx.snippets, so the hook and the extension accept the same set of include paths.

Same shape as the v1.x backport of this hook, whose ruff config bans global statements (PLW0603).

cubic-dev-ai

2 issues found and verified against the latest diff

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="docs/hooks/llms_txt.py">

<violation number="1" location="docs/hooks/llms_txt.py:44">
P2: Snippet parsing is narrower than pymdownx snippets syntax, so valid include directives can fail the build. Expand the regex to accept documented inline marker variants.</violation>

<violation number="2" location="docs/hooks/llms_txt.py:104">
P2: The unresolved-include guard treats literal/escaped `--8<--` text as failures. Check for remaining real, unescaped include directives instead of raw substring counts.</violation>
</file>


cubic-dev-ai · 2026-06-29T18:57:01Z

+        return content
+
+    resolved, substitutions = _SNIPPET_LINE.subn(include, markdown)
+    if substitutions != sum("--8<--" in line for line in markdown.splitlines()):


P2: The unresolved-include guard treats literal/escaped --8<-- text as failures. Check for remaining real, unescaped include directives instead of raw substring counts.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At docs/hooks/llms_txt.py, line 99: <comment>The unresolved-include guard treats literal/escaped `--8<--` text as failures. Check for remaining real, unescaped include directives instead of raw substring counts.</comment> <file context> @@ -0,0 +1,172 @@ + return content + + resolved, substitutions = _SNIPPET_LINE.subn(include, markdown) + if substitutions != sum("--8<--" in line for line in markdown.splitlines()): + raise PluginError(f"llms_txt: unresolved snippet include in {page.file.src_uri}") + </file context>

Suggested change

if substitutions != sum("--8<--" in line for line in markdown.splitlines()):

if re.search(r'^[ \t]*(?!;)-{1,}8<-{1,}(?:$|[ \t]+)', resolved, flags=re.MULTILINE):

This one's deliberate. The guard's job is to make any marker-bearing line the hook didn't consume fail the build rather than ship as junk in the renditions — including malformed directives that a "does it look like a directive" regex would miss. The cost is that a page mentioning --8<-- literally fails the build, but no page does today, and the error names the page, so whoever hits it first can adjust the hook with the actual case in front of them. A separate detection regex can drift from the consume regex, which is exactly the silent gap this avoids.
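
A minimal sketch of the guard being discussed, built around the `_SNIPPET_LINE` pattern and the count comparison quoted in the diff. The in-memory `files` mapping, the `resolve_snippets` name, and the `UnresolvedInclude` exception are assumptions made so the example is self-contained; the hook reads files from disk and raises `PluginError`.

```python
import re

# Whole-file include pattern quoted in the review thread.
_SNIPPET_LINE = re.compile(r'^(?P<indent>[ \t]*)--8<-- "(?P<path>[^"\n]+)"$', flags=re.MULTILINE)


class UnresolvedInclude(Exception):
    """Stand-in for the PluginError the hook raises (assumption)."""


def resolve_snippets(markdown: str, files: dict[str, str], page: str) -> str:
    """Inline whole-file includes; fail if any marker-bearing line was not consumed."""

    def include(match: re.Match) -> str:
        indent = match["indent"]
        content = files[match["path"]]
        # Re-apply the directive's indentation so includes inside admonitions stay nested.
        return "\n".join(indent + line if line else line for line in content.splitlines())

    resolved, substitutions = _SNIPPET_LINE.subn(include, markdown)
    # Count against the input page, so a literal marker inside an included file is not flagged.
    if substitutions != sum("--8<--" in line for line in markdown.splitlines()):
        raise UnresolvedInclude(f"llms_txt: unresolved snippet include in {page}")
    return resolved
```

Because detection is a plain substring count rather than a second regex, the block form, section syntax that the consume pattern rejects, and literal mentions of the marker in a page all fail the same way.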

_{AI Disclaimer}

The parent comment was wrong here: the guard is intentionally checking for any marker-bearing line the hook didn’t consume, including malformed directives and literal --8<-- text. That keeps the detection in lockstep with the consume regex and avoids a silent drift gap.

Thanks for the feedback! I've saved this as a new learning to improve future reviews.

cubic-dev-ai · 2026-06-29T18:57:01Z

+    ),
+]
+
+_SNIPPET_LINE = re.compile(r'^(?P<indent>[ \t]*)--8<-- "(?P<path>[^"\n]+)"$', flags=re.MULTILINE)


P2: Snippet parsing is narrower than pymdownx snippets syntax, so valid include directives can fail the build. Expand the regex to accept documented inline marker variants.

Prompt for AI agents

Check if this issue is valid; if so, understand the root cause and fix it. At docs/hooks/llms_txt.py, line 43: <comment>Snippet parsing is narrower than pymdownx snippets syntax, so valid include directives can fail the build. Expand the regex to accept documented inline marker variants.</comment> <file context> @@ -0,0 +1,172 @@ + ), +] + +_SNIPPET_LINE = re.compile(r'^(?P<indent>[ \t]*)--8<-- "(?P<path>[^"\n]+)"$', flags=re.MULTILINE) +_MD_LINK = re.compile(r'(\]\()([^)\s]+\.md)(#[^)\s]*)?( +"[^"]*")?(\))') + </file context>

Also deliberate — the docs only use whole-file includes of complete runnable examples from docs_src/ (the convention described in mkdocs.yml), so the hook supports exactly that pattern and fails the build with an error naming the page for anything else. Section syntax and the block form both die loudly rather than rendering wrong. If we ever adopt other variants, extending the hook then beats carrying an implementation of pymdownx's full semantics that nothing exercises.

_{AI Disclaimer}

That whole-file-only boundary is intentional, so the parent comment was too broad here. This hook should keep failing loud on section/block variants rather than growing full pymdownx snippet semantics.

claude

I didn't find any bugs in the hook logic, but this adds a ~180-line MkDocs build hook plus a snippet base_path change, so it's worth a maintainer's look at the design and the published artifacts.

Extended reasoning...

Overview

This PR adds a new MkDocs hook (docs/hooks/llms_txt.py, ~180 lines) that generates llms.txt, llms-full.txt, and per-page markdown renditions at build time, registers it in mkdocs.yml, sets an explicit base_path: !relative $config_dir for pymdownx.snippets, and adds a short pointer in docs/index.md. No runtime/library code is touched.

Security risks

Low. The hook runs only at docs build time and writes into the site output directory. It reads snippet files from disk but mirrors the snippets extension's base-path restriction (rejecting paths that resolve outside the repo root), and snippet paths come from the repo's own markdown sources, not user input. Generated links are absolute URLs to the project's own docs site.

Level of scrutiny

This is docs/build tooling rather than production library code, so the blast radius is limited to the documentation site and CI docs builds (the strict-build guards mean failures surface as build errors rather than silently wrong output). Still, it is a new ~180-line piece of logic with regex-based snippet/link rewriting, nav traversal, and a behavioral tweak to snippet resolution — not a mechanical change — so it falls outside what I'd auto-approve and merits a human pass over the approach and output format.

Other factors

The bug hunting system found no bugs. There are two outstanding P2 comments from another review bot (about the strictness of the snippet-include regex and the unresolved-include guard) that the author hasn't responded to yet; they read as robustness nits for future doc edits rather than current correctness problems, since the PR description indicates the current docs build cleanly with this hook. There is no automated test coverage for the hook itself beyond the strict docs build exercising it.

Unresolvable relative .md links and pages that do not start with an H1 now fail the build instead of producing broken or malformed renditions. Embedded .py snippets gain a leading comment naming the source file under docs_src/, so the rendition still points at the file on disk.
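
The link handling described in this commit message might look roughly like the sketch below. The pattern follows the shape of the `_MD_LINK` regex quoted in the review thread; the `renditions` mapping (source path to absolute URL), the `absolutize_links` name, and the `BrokenLink` exception are hypothetical and only serve the example.

```python
import posixpath
import re

# Matches "](target.md", an optional "#anchor", an optional ' "title"', then ")".
_MD_LINK = re.compile(r'(\]\()([^)\s]+\.md)(#[^)\s]*)?( +"[^"]*")?(\))')


class BrokenLink(Exception):
    """Stand-in for the build error the hook raises (assumption)."""


def absolutize_links(markdown: str, page_src: str, renditions: dict[str, str]) -> str:
    """Rewrite relative .md links to absolute URLs, failing on unknown targets."""

    def rewrite(match: re.Match) -> str:
        opener, target, anchor, title, closer = match.groups(default="")
        if "://" in target:
            return match.group(0)  # already absolute, leave untouched
        # Resolve the target relative to the directory of the linking page.
        resolved = posixpath.normpath(posixpath.join(posixpath.dirname(page_src), target))
        if resolved not in renditions:
            raise BrokenLink(f"llms_txt: unresolvable link {target!r} in {page_src}")
        return f"{opener}{renditions[resolved]}{anchor}{title}{closer}"

    return _MD_LINK.sub(rewrite, markdown)
```

Anchors and link titles are carried over unchanged, so only the path portion of each link is replaced.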

claude

I didn't find any bugs, but this adds a new ~180-line MkDocs build hook plus a snippets base_path change, and whether to publish llms.txt renditions at all is a maintainer-level decision — worth a human look.

Extended reasoning...

Overview

This PR adds a new MkDocs hook (docs/hooks/llms_txt.py, ~184 lines) that generates llms.txt, llms-full.txt, and per-page markdown renditions at docs build time, registers the hook in mkdocs.yml, sets an explicit base_path for pymdownx.snippets, and adds a pointer to the new artifacts in docs/index.md. It is documentation/build-tooling only — no runtime SDK code is touched.

Security risks

Low. The hook runs only during the docs build, not in shipped code. It reads files referenced by snippet includes, but explicitly resolves them against the repo root and rejects paths that escape it (mirroring the snippets extension's restrict_base_path), and it fails the build loudly on unresolved includes, unreadable files, or skipped pages. The base_path: !relative $config_dir change makes snippet resolution independent of the build CWD, which is a behavior-preserving hardening for builds run from the repo root.
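
For readers unfamiliar with this kind of containment check, a minimal sketch follows. It assumes Python 3.9+ for `Path.is_relative_to`; the function name and the `IncludeOutsideRoot` exception are hypothetical, and the hook's own implementation may differ.

```python
from pathlib import Path


class IncludeOutsideRoot(Exception):
    """Stand-in for the build error the hook raises (assumption)."""


def resolve_inside_root(include: str, root: Path) -> Path:
    """Resolve an include path against root and reject anything that escapes it."""
    root = root.resolve()
    candidate = (root / include).resolve()
    # resolve() collapses ".." and follows symlinks, so the check sees the real target.
    if not candidate.is_relative_to(root):
        raise IncludeOutsideRoot(f"snippet path {include!r} escapes {root}")
    return candidate
```

Checking the resolved path rather than the raw string means an include such as `docs_src/../../secret.py` is rejected even though it starts with an allowed directory name.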

Level of scrutiny

Moderate. Although nothing here is production-critical, this is not a mechanical change: it introduces a non-trivial new Python hook with its own parsing logic (snippet-include resolution, link rewriting, nav traversal), publishes new public artifacts on the docs site, and makes a deliberate scope decision (whole-file includes only, fail-loud otherwise) that the author defended in response to automated review comments. Whether the project wants to maintain this hook versus using the mkdocs-llmstxt plugin, and whether the published URLs/format are right, are judgment calls a maintainer should make.

Other factors

The automated bug-hunting pass found no bugs. The cubic-dev-ai review raised two P2 suggestions; the author responded explaining both behaviors are intentional fail-loud guards, and the bot accepted the rationale — there are no unresolved actionable comments. There is no test coverage for the hook itself; it is exercised only by mkdocs build --strict in docs CI. The PR description also notes a follow-up backport to v1.x will be needed for the root /llms.txt, which is a deployment consideration for maintainers.

maxisbey added 2 commits June 29, 2026 18:34

docs: restrict snippet includes to paths inside the repo root

e41829f

Mirrors the restrict_base_path default of pymdownx.snippets, so the hook and the extension accept the same set of include paths.

maxisbey marked this pull request as ready for review June 29, 2026 18:44

docs: hold llms_txt hook state in a dataclass instead of module globals

6af8e07

Same shape as the v1.x backport of this hook, whose ruff config bans global statements (PLW0603).

cubic-dev-ai Bot reviewed Jun 29, 2026

claude Bot reviewed Jun 29, 2026

maxisbey mentioned this pull request Jun 30, 2026

[v1.x] docs: publish llms.txt and markdown renditions of the docs #3029

Merged

6 tasks

felixweinberger approved these changes Jun 30, 2026

maxisbey merged commit 67d7593 into main Jun 30, 2026
34 checks passed

maxisbey deleted the llms-txt branch June 30, 2026 10:30

claude Bot reviewed Jun 30, 2026

maxisbey merged 4 commits into main from llms-txt
