chore: llms.txt updates #876
…ms.txt
- Extend plugin-generate-llms.js to scan the blog directory (opt-in via
`blog: true` plugin option) and write stripped .md copies for every post.
Mirrors existing docs behaviour: the ignored build step writes to
build/blog/{slug}.md so LLM crawlers can fetch raw markdown per post.
- Add a Case Studies section to the llms.txt root text linking the six
customer case studies as .md URLs with titles and descriptions, following
the same bullet format as the existing docs sections.
Co-authored-by: claude <noreply@anthropic.com>
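For readers wiring this up, the opt-in described in the commit message might look roughly like the excerpt below. This is a sketch, not the PR's actual config: only the plugin path and the `blog: true` option come from the commit message, and any other options the plugin takes are omitted here.

```javascript
// docusaurus.config.js (excerpt). A sketch under stated assumptions,
// not the configuration committed in this PR.
const config = {
  plugins: [
    [
      "./plugins/plugin-generate-llms.js",
      {
        // Opt in to blog scanning. Leaving this out keeps the
        // existing docs-only behaviour.
        blog: true,
      },
    ],
  ],
};

module.exports = config;
```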
📝 Walkthrough
Updates Docusaurus config: replaces the site tagline with documentation-focused wording and enables the blog. Extends the llms generator plugin to discover, parse, and include blog posts (md/mdx) in the generated llms.txt and related outputs.
Sequence Diagram
sequenceDiagram
participant FS as FileSystem
participant Plugin as LLMs Generator Plugin
participant Parser as gray-matter
participant Site as Docusaurus Config
participant Output as llms.txt Generator
Site->>Plugin: Provide blog options (path, routeBasePath)
Plugin->>FS: Resolve blog dir and glob **/*.{md,mdx}
loop for each file
Plugin->>FS: Read file content
Plugin->>Parser: Parse frontmatter
Parser-->>Plugin: title, description, slug
Plugin->>Plugin: Derive slug if missing (date-prefix / index folder logic)
Plugin->>Plugin: Build pageUrl using routeBasePath
Plugin->>Plugin: Append entry to collectedDocs
end
Plugin->>Output: Include collectedDocs (docs + blogs) in llms.txt generation
Output->>FS: Write llms*.txt and stripped markdown copies
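The "build pageUrl" and "append entry" steps in the diagram could be sketched as follows. The helper names `buildPageUrl` and `collectBlogEntries`, the `siteUrl` parameter, and the input shape (frontmatter already parsed, slug already resolved) are assumptions for illustration and are not taken from the plugin.

```javascript
// Hypothetical helper: join the site URL, the blog routeBasePath and a slug
// into one URL, tolerating stray leading or trailing slashes on any part.
function buildPageUrl(siteUrl, routeBasePath, slug) {
  const parts = [routeBasePath, slug]
    .map((p) => String(p).replace(/^\/+|\/+$/g, ""))
    .filter(Boolean);
  return `${siteUrl.replace(/\/+$/, "")}/${parts.join("/")}`;
}

// Hypothetical collector: takes posts whose frontmatter has already been
// parsed and returns entries shaped like the ones the diagram appends to
// collectedDocs (filePath, title, description, pageUrl).
function collectBlogEntries(posts, { siteUrl, routeBasePath = "blog" }) {
  return posts.map(({ file, data }) => ({
    filePath: file,
    title: data.title,
    description: data.description || "",
    pageUrl: buildPageUrl(siteUrl, routeBasePath, data.slug),
  }));
}
```

Stripping slashes before joining means a frontmatter slug written as `/hello` and a `routeBasePath` written as `/blog/` still produce a single clean URL.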
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 1
🧹 Nitpick comments (1)
plugins/plugin-generate-llms.js (1)
165-170: Inconsistent `filePath` shape between docs and blog entries.

Docs push `filePath: path.join(config.path, file)` (repo-relative, line 109), while blog posts push `filePath: fullPath` (absolute). `orderDocs` matches `includeOrder` globs against `filePath` via minimatch, so any future `includeOrder` pattern intended to reach blog posts (e.g. `blog/**`) will silently fail to match on absolute paths. Current config doesn't use `includeOrder` for blog, so this is latent, but it is worth aligning now to avoid a confusing debug later.

    -      collectedDocs.push({
    -        filePath: fullPath,
    +      collectedDocs.push({
    +        filePath: path.join(blogDir, file),
             title,
             description,
             pageUrl,
           });

Note: `writeMarkdownCopies` reads via `doc.filePath` (line 224), so ensure the relative path resolves from the build CWD (it will, since Docusaurus runs with `siteDir` as cwd); otherwise store both `filePath` and `absPath`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/plugin-generate-llms.js` around lines 165 - 170, The collectedDocs entries use an absolute fullPath for filePath (in the collectedDocs.push block) while docs use path.join(config.path, file) (repo-relative), causing orderDocs/includeOrder minimatch checks to fail for blog entries; change the blog collector to store repo-relative paths (e.g., use the same path.join(config.path, ...) logic or compute a relative path from fullPath) so that orderDocs, writeMarkdownCopies (which reads doc.filePath) and includeOrder globs operate consistently; if you cannot reliably produce a repo-relative path, store both filePath (relative) and absPath (absolute) on the doc object to preserve existing read semantics.
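The "store both `filePath` and `absPath`" fallback mentioned above might look like the sketch below. `makeDocEntry` is a hypothetical helper, and `absPath` is the field name proposed in the comment rather than something the plugin is confirmed to have.

```javascript
const path = require("path");

// Hypothetical helper: keep a site-relative filePath for glob matching and
// an absolute absPath for file reads, so neither depends on the build CWD.
function makeDocEntry(siteDir, absPath, meta) {
  // Use forward slashes so glob patterns such as "blog/**" behave the same
  // on every platform.
  const filePath = path.relative(siteDir, absPath).split(path.sep).join("/");
  return { filePath, absPath, ...meta };
}
```

With this shape, `orderDocs` would match against `filePath` while `writeMarkdownCopies` would read from `absPath`.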
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@plugins/plugin-generate-llms.js`:
- Around line 143-155: The slug fallback incorrectly uses the file basename
which yields "index" for folder-style posts; update the fallback in the block
that computes slug (where parsed.data.slug, file, path.basename, path.extname,
and the date-regex are used) so that if base === "index" you use the parent
directory name (path.basename(path.dirname(file))) instead of "index", then
apply the same /^\d{4}-\d{2}-\d{2}-(.+)$/ regex to strip the date prefix from
that parent-dir-derived base; keep the existing behavior when parsed.data.slug
is present or when no date prefix matches.
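A minimal sketch of the slug fallback described in this inline comment, assuming `file` is the path relative to the blog directory. `deriveSlug` and `DATE_PREFIX` are illustrative names, not identifiers from the plugin.

```javascript
const path = require("path");

// Same date-prefix pattern quoted in the review comment.
const DATE_PREFIX = /^\d{4}-\d{2}-\d{2}-(.+)$/;

// Hypothetical helper: an explicit frontmatter slug wins, otherwise derive
// the slug from the file name, or from the parent folder for index files.
function deriveSlug(file, frontmatterSlug) {
  if (frontmatterSlug) return frontmatterSlug;

  let base = path.basename(file, path.extname(file));
  if (base === "index") {
    // Folder-style post: "2024-05-01-launch/index.md" should use the
    // folder name rather than "index".
    base = path.basename(path.dirname(file));
  }

  const match = base.match(DATE_PREFIX);
  return match ? match[1] : base;
}
```

For example, `deriveSlug("2024-05-01-launch/index.md")` and `deriveSlug("2024-05-01-launch.mdx")` both yield `launch`. A top-level `index.md` with no parent folder is not handled by this sketch.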
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d136e2ae-60ff-488a-8fbb-02dfd65acbce
📒 Files selected for processing (2)
docusaurus.config.js
plugins/plugin-generate-llms.js
Co-authored-by: claude <noreply@anthropic.com>
🧹 Nitpick comments (2)
plugins/plugin-generate-llms.js (2)
108-114: `filePath` shape is inconsistent between the docs and blog loops.

The docs loop pushes a repo-relative path (`path.join(config.path, file)` → e.g. `"docs/foo.md"`), while the blog loop pushes an absolute path (`fullPath`). Two downstream consequences:
- `orderDocs` runs `minimatch(doc.filePath, pattern)`. A user-supplied `includeOrder` pattern such as `"**/*.md"` or `"**/case-study-*.md"` would silently start matching blog posts once `blog: true` is enabled, pulling them into any `llms-<name>.txt` output, not just via the curated `root` text.
- `writeMarkdownCopies` happens to work for both (relative paths resolve against CWD, absolute paths resolve directly), but this only holds as long as the build is invoked from the site root.

Normalizing both branches to the same representation (both relative to `context.siteDir`, or both absolute) removes the footgun.

♻️ Suggested normalization (blog loop)

    -      collectedDocs.push({
    -        filePath: fullPath,
    +      collectedDocs.push({
    +        filePath: path.join(blogDir, file),
             title,
             description,
             pageUrl,
           });

Also applies to: 173-178
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/plugin-generate-llms.js` around lines 108 - 114, The docs and blog collection branches produce different filePath shapes (docs use path.join(config.path, file) while the blog branch uses fullPath), causing orderDocs (which runs minimatch(doc.filePath, pattern)) and writeMarkdownCopies to behave inconsistently; update the blog loop to normalize filePath to the same form as docs (e.g., make blog entries relative to context.siteDir or make docs absolute) so collectedDocs entries are consistent, and ensure the change covers the other occurrence noted (lines referenced around 173-178); keep collectedDocs, path.join(config.path, file), fullPath, context.siteDir, orderDocs, and writeMarkdownCopies in mind when making the fix.
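One way to apply the normalization in a single place, instead of patching each loop, is a pass over the collected entries before ordering. The sketch below assumes entries carry a `filePath` that may be either absolute or relative to the site directory. `normalizeFilePaths` is a hypothetical name.

```javascript
const path = require("path");

// Hypothetical post-processing step: rewrite every entry's filePath to be
// relative to siteDir, whether it was collected as absolute or relative.
function normalizeFilePaths(collectedDocs, siteDir) {
  return collectedDocs.map((doc) => {
    const abs = path.isAbsolute(doc.filePath)
      ? doc.filePath
      : path.resolve(siteDir, doc.filePath);
    // Forward slashes keep glob matching consistent across platforms.
    const rel = path.relative(siteDir, abs).split(path.sep).join("/");
    return { ...doc, filePath: rel };
  });
}
```

Any code that reads files afterwards would then need to resolve `filePath` against `siteDir` explicitly rather than relying on the process CWD.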
120-171: Document or resolve the coupling between preset blog config and plugin options for future-proofing.

The preset's blog config (lines 249–256 in `docusaurus.config.js`) does not override `path` or `routeBasePath`, so both remain at Docusaurus defaults (`"blog"`). The plugin receives `blog: true` and defaults its `blogDir` and `blogRouteBasePath` to `"blog"` as well, so they match today. However, if the preset is later customized with a non-default `path` or `routeBasePath` without also updating the plugin's `options.blog` object, the plugin will:
- Generate incorrect `pageUrl`s pointing to the old route (broken links in `llms.txt`)
- Emit stripped `.md` copies under the wrong directory, causing 404s when the site serves posts elsewhere

Either add a comment in the plugin header documenting this coupling (e.g., "plugin blog options must be kept in sync with preset blog config"), or resolve the effective preset blog config at runtime to avoid manual synchronization.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/plugin-generate-llms.js` around lines 120 - 171, The plugin currently assumes defaults when options.blog === true (blogConfig, blogDir, blogRouteBasePath) which can diverge from the preset and break generated pageUrl and output locations; update the plugin so that when options.blog === true it resolves the effective preset blog config at runtime (merge the preset's blog path/routeBasePath into blogConfig before deriving blogDir/blogRouteBasePath) by reading the site/preset config from context (e.g., inspect context.siteConfig or context.presets) and falling back to the current defaults if nothing is found, or alternatively add a clear header comment above the options.blog handling stating "plugin blog options must be kept in sync with preset blog config" and reference the symbols options.blog, blogConfig, blogDir, blogRouteBasePath, url, and context.siteDir.
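A rough sketch of the runtime-resolution option follows. It assumes presets are declared in the usual `[name, options]` tuple form under `siteConfig.presets` and that the preset's `blog` value is an options object. `resolveBlogOptions` and `BLOG_DEFAULTS` are hypothetical names, and the precedence order (explicit plugin options, then preset, then defaults) is a design assumption, not something the review prescribes.

```javascript
// Defaults the review says Docusaurus uses for both settings.
const BLOG_DEFAULTS = { path: "blog", routeBasePath: "blog" };

// Hypothetical resolver: returns null when blog scanning is off, otherwise
// the effective { path, routeBasePath } for the plugin to use.
function resolveBlogOptions(siteConfig, pluginBlog) {
  if (!pluginBlog) return null;

  // Look for the first preset declared as [name, options] whose options
  // include a blog object.
  let presetBlog = {};
  const presets = (siteConfig && siteConfig.presets) || [];
  for (const preset of presets) {
    const opts = Array.isArray(preset) ? preset[1] : undefined;
    if (opts && opts.blog && typeof opts.blog === "object") {
      presetBlog = opts.blog;
      break;
    }
  }

  // `blog: true` carries no explicit values; an object may override some.
  const explicit = typeof pluginBlog === "object" ? pluginBlog : {};
  return {
    path: explicit.path ?? presetBlog.path ?? BLOG_DEFAULTS.path,
    routeBasePath:
      explicit.routeBasePath ??
      presetBlog.routeBasePath ??
      BLOG_DEFAULTS.routeBasePath,
  };
}
```

With this in place, changing `routeBasePath` in the preset alone would flow through to the generated URLs without a second edit in the plugin options.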
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1cc47944-f3ac-4cf0-8d58-d99acf97101a
📒 Files selected for processing (1)
plugins/plugin-generate-llms.js