Update llms.adoc for proper markdown filenames and AI crawler permissions by JakeSCahill · Pull Request #161 · redpanda-data/docs-site

JakeSCahill · 2026-03-21T19:50:27Z

Summary

Updates documentation and configuration to reflect the new AI-optimized markdown export structure:

Updates llms.adoc to document proper .md filenames (not index.md)
Adds explicit AI crawler permissions to production robots.txt
Fixes outdated netlify.toml redirects for markdown files

Changes

home/modules/ROOT/pages/llms.adoc

Removed references to /page/index.md structure
Updated to proper .md filenames (/page.md)
Added AI-Optimized Formats section documenting:
- llms.txt (curated overview)
- llms-full.txt (complete export)
- Component-specific exports (ROOT-full.txt, redpanda-cloud-full.txt, etc.)

antora-playbook.yml

Enhanced production robots.txt from robots: allow to explicit directives
Added permissions for 14 AI platforms including:
- GPTBot, ChatGPT-User
- Claude-Web, anthropic-ai
- Perplexity, PerplexityBot
- Google-Extended, GoogleOther
- CCBot, cohere-ai, and more
Added crawl-delay directive

netlify.toml

Updated redirect: /current.md → /current/home.md (was /current/home/index.md)
Removed outdated catch-all index.md redirect

Testing

Verified that llms.txt generation works correctly with updated documentation.

Related PRs

docs-extensions-and-macros#178 (AI optimization: frontmatter exports and component-specific full.txt files): Core extension changes
docs-ui#371 (Add AI-friendly meta tags and enhanced structured data): UI template enhancements

## Changes

- Remove references to "indexify convention" and index.md files
- Update markdown access instructions: replace .html with .md instead of appending /index.md
- Add AI-Optimized Formats section with:
  - llms.txt (curated overview)
  - llms-full.txt (complete export ~20MB)
  - Component-specific exports (ROOT-full.txt, redpanda-cloud-full.txt, etc.)
- Document YAML frontmatter in individual markdown pages
- Update versioning section to use proper markdown paths

## Benefits

- Accurate documentation of new markdown structure
- Clear guidance for AI agents on available formats
- Better discoverability with component-specific exports
- Matches actual implementation from docs-extensions-and-macros
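The updated access rule (swap the `.html` extension for `.md` rather than appending `/index.md`) can be sketched as a small URL helper. This is an illustrative sketch, not code from the PR: the function name `markdown_url` and the example page URL are assumptions.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(html_url: str) -> str:
    """Return the Markdown URL for an HTML page URL by swapping the extension.

    Hypothetical helper: only URLs whose path ends in ".html" are rewritten;
    anything else is returned unchanged.
    """
    parts = urlsplit(html_url)
    if not parts.path.endswith(".html"):
        return html_url
    md_path = parts.path[: -len(".html")] + ".md"
    # Drop any query string or fragment; they do not apply to the raw file.
    return urlunsplit((parts.scheme, parts.netloc, md_path, "", ""))


# The page URL below is assumed purely for illustration.
print(markdown_url("https://docs.redpanda.com/current/get-started/quickstarts.html"))
```

The helper deliberately leaves directory-style URLs alone, since the commit message only describes the `.html` to `.md` swap.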

…ming

Enhanced the production playbook (antora-playbook.yml) with explicit robots.txt directives for AI crawlers including GPTBot, Claude-Web, Perplexity, Google-Extended, and other platforms. This makes Redpanda's intent to welcome AI crawlers explicit and clear, following best practices for AI discoverability.

netlify · 2026-03-21T19:50:33Z

✅ Deploy Preview for redpanda-documentation ready!

Name	Link
🔨 Latest commit	`03e8ea6`
🔍 Latest deploy log	https://app.netlify.com/projects/redpanda-documentation/deploys/69ca995131d376000862f6ee
😎 Deploy Preview	https://deploy-preview-161--redpanda-documentation.netlify.app
Lighthouse	1 path audited. Performance: 85 (🟢 up 6 from production). Accessibility: 96 (no change from production). Best Practices: 100 (no change from production). SEO: 83 (🔴 down 9 from production). PWA: not scored.

Move Crawl-delay inside wildcard User-agent block for proper robots.txt syntax. Crawl-delay should be within a User-agent block, not standalone after all blocks.
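Why the placement matters can be checked with Python's standard `urllib.robotparser`, which only honors `Crawl-delay` when it appears inside a `User-agent` group. The two robots.txt bodies below are assumed examples (the delay value `1` and the exact layout are not taken from the PR); they only illustrate the before and after of this fix.

```python
from urllib.robotparser import RobotFileParser

# Assumed "before" layout: Crawl-delay sits alone after the last group.
MISPLACED = """\
User-agent: GPTBot
Allow: /

User-agent: *
Allow: /

Crawl-delay: 1
"""

# Assumed "after" layout: Crawl-delay is part of the wildcard group.
FIXED = """\
User-agent: GPTBot
Allow: /

User-agent: *
Allow: /
Crawl-delay: 1
"""


def parse_robots(text: str) -> RobotFileParser:
    """Parse a robots.txt body held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# An agent without its own group falls back to the wildcard group.
misplaced_delay = parse_robots(MISPLACED).crawl_delay("SomeOtherBot")
fixed_delay = parse_robots(FIXED).crawl_delay("SomeOtherBot")
print(misplaced_delay, fixed_delay)
```

Other parsers may treat a stray `Crawl-delay` differently, so this shows one parser's behavior rather than a universal rule.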

Replace all /index.md paths with proper .md filenames:

- /ai-agents/index.md → /ai-agents.md
- /console/index.md → /console.md
- /get-started/quickstarts/index.md → /get-started/quickstarts.md
- /api/doc/*.md → /api/*.md

This fixes afdocs check failures for broken links and ensures all URLs in llms.txt point to actual markdown files that exist.

Add catch-all redirect: /*/index.md → /:splat.md. This ensures old bookmarks and links to /page/index.md are redirected to /page.md in the new markdown structure.
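The intended effect of the catch-all rule can be modeled as a plain path rewrite. This sketch reproduces only the mapping described in the commit message; it does not emulate Netlify's redirect matching, and the function name `redirect_legacy_markdown_path` is hypothetical.

```python
def redirect_legacy_markdown_path(path: str) -> str:
    """Map a legacy /page/index.md path to the new /page.md form.

    Paths that do not end in /index.md, and the bare /index.md itself,
    are returned unchanged.
    """
    suffix = "/index.md"
    if path.endswith(suffix) and len(path) > len(suffix):
        return path[: -len(suffix)] + ".md"
    return path


for legacy in ("/ai-agents/index.md", "/get-started/quickstarts/index.md", "/console.md"):
    print(legacy, "->", redirect_legacy_markdown_path(legacy))
```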

Rewrite AI-Optimized Formats section to use flowing prose instead of bullet lists for better readability. Remove ~20MB size reference as it's dynamic and will change over time. Maintain all essential information while improving narrative flow.

Change 'markdown' to 'Markdown' (proper noun) throughout the Access Markdown content section. Revert to bullet list format as the prose conversion was unintended.

Removed deprecated and questionable user agents:

- Claude-Web, anthropic-ai (deprecated by Anthropic in 2026)
- Perplexity (duplicate/incorrect - PerplexityBot is correct)
- cohere-ai (undocumented)
- Omgilibot (commercial scraper, not AI development)

Added Anthropic's new three-bot framework (2026):

- ClaudeBot (model training)
- Claude-User (user requests)
- Claude-SearchBot (search optimization)

Verified remaining agents with official documentation:

- GPTBot, ChatGPT-User (OpenAI)
- PerplexityBot (Perplexity)
- Google-Extended, GoogleOther (Google)
- CCBot (Common Crawl)
- FacebookBot (Meta)

Added AI-CRAWLER-USER-AGENTS.md documentation with:

- Verification evidence for each user agent
- Official documentation links
- Maintenance procedures
- Change log
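One way to keep such an allow-list consistent is to generate the robots.txt groups from a single list and then verify the result with a parser. The sketch below is an assumption about how that could look, not the playbook's actual mechanism: `build_robots_txt` is a hypothetical helper, the delay value is invented, and the list contains only the agents named in this commit message.

```python
from urllib.robotparser import RobotFileParser

# Agent names as listed in the commit message above.
VERIFIED_AI_AGENTS = [
    "GPTBot", "ChatGPT-User",
    "ClaudeBot", "Claude-User", "Claude-SearchBot",
    "PerplexityBot",
    "Google-Extended", "GoogleOther",
    "CCBot",
    "FacebookBot",
]


def build_robots_txt(agents: list[str], crawl_delay: int = 1) -> str:
    """Emit one Allow group per agent, then a wildcard group with Crawl-delay."""
    groups = [f"User-agent: {agent}\nAllow: /\n" for agent in agents]
    # Crawl-delay lives inside the wildcard group, never on its own.
    groups.append(f"User-agent: *\nAllow: /\nCrawl-delay: {crawl_delay}\n")
    # Each group ends with a newline, so joining on "\n" leaves a blank
    # line between groups.
    return "\n".join(groups)


robots_txt = build_robots_txt(VERIFIED_AI_AGENTS)

# Round-trip check: every listed agent should be allowed to fetch a page.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
all_allowed = all(
    parser.can_fetch(agent, "https://docs.redpanda.com/current/home/")
    for agent in VERIFIED_AI_AGENTS
)
print(all_allowed)
```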

Updates llms.adoc to include comprehensive information about the Redpanda Documentation MCP (Model Context Protocol) server:

- MCP server URL: https://docs.redpanda.com/mcp
- Setup instructions for Claude Code (npx doc-tools setup-mcp)
- Complete list of available MCP tools:
  * generate_property_docs
  * generate_metrics_docs
  * generate_rpk_docs
  * generate_rpcn_connector_docs
  * generate_helm_docs
  * generate_crd_docs
  * generate_bundle_openapi
  * get_redpanda_version
  * get_console_version
  * get_antora_structure

Reorganized AI-Optimized Formats section:

- Interactive MCP Server (new subsection)
- Static Exports (existing content reorganized)

This makes the MCP server discoverable via llms.txt for AI agents and tools that follow the llms.txt standard.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Adds MCPcat analytics tracking to the Redpanda docs MCP server. MCPcat is an open-source analytics platform for monitoring MCP usage. The integration is optional and only activates if the MCPCAT_PROJECT environment variable is set.

Changes:

- Added mcpcat as a dependency in package.json
- Integrated MCPcat tracking in netlify/functions/mcp.mjs
- Includes error handling to prevent server crashes if analytics fail

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
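The opt-in pattern described here (activate only when `MCPCAT_PROJECT` is set, and never let an analytics failure take the server down) can be sketched generically. The real integration lives in `netlify/functions/mcp.mjs` and is not shown in this PR text; the Python below is an assumed illustration, and `enable_tracking` and its `track` callback are hypothetical names, not the mcpcat API.

```python
import os
from typing import Callable, Mapping


def enable_tracking(
    track: Callable[[str], None],
    env: Mapping[str, str] = os.environ,
) -> bool:
    """Start analytics only when MCPCAT_PROJECT is set; never raise.

    `track` stands in for whatever call registers the server with the
    analytics backend. Returns True only if tracking was started.
    """
    project_id = env.get("MCPCAT_PROJECT")
    if not project_id:
        return False  # Opt-in: no variable, no tracking.
    try:
        track(project_id)
    except Exception as exc:
        # Analytics must not crash the server, so log and continue.
        print(f"analytics disabled: {exc}")
        return False
    return True


def failing_backend(project_id: str) -> None:
    """Simulates an analytics backend that is unavailable."""
    raise RuntimeError("backend unavailable")


print(enable_tracking(failing_backend, env={"MCPCAT_PROJECT": "proj_demo"}))
```

Passing the environment in as a mapping keeps the behavior easy to exercise without mutating real environment variables.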

Major restructuring to address AI agent discovery challenges:

Changes:

- Added **Discovery** section at the top explaining how agents find docs
- Emphasized llms.txt as "minimum barrier for entry" (industry insight)
- Moved MCP server into Discovery section as primary method
- Clarified discovery methods: llms.txt, sitemap.xml, MCP server
- Reorganized static exports as fallback for non-MCP agents
- Enhanced markdown access patterns explanation

Key additions:

- "Why llms.txt Matters" - explains importance for AI consumption
- Primary discovery methods numbered clearly
- Sitemap reference (including sitemap-llms.xml)
- Emphasis that "This documentation is optimized for AI consumption"

Rationale: Research shows agents don't know about llms.txt by default and many sites don't provide it. This update makes discovery explicit and prominent, helping agents understand HOW to access our docs. Based on industry research about AI docs consumption patterns.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Added configuration comments to git-full-clone extension in all three playbooks (local, preview, production) to document available options:

- unshallowTimeout: Timeout in ms (default: 60000)
- skipUnshallow: Set true for air-gapped environments

These configuration options provide production-ready safeguards:

- Timeout protection prevents hanging on large repos
- Skip option enables air-gapped CI/CD builds

Also includes other playbook updates:

- All new extensions enabled (git-dates, faq, markdown, llms, sitemap)
- Log level adjustments for debugging
- Local UI bundle path for faster local builds

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Reverted the new extension configurations from preview-antora-playbook.yml and local-antora-playbook.yml to prevent build failures before the extensions are published to npm. Changes saved in playbook-extensions-for-release.patch for reapplication after npm publication.

What was reverted:

- git-full-clone extension configuration
- add-git-dates extension
- add-faq-structured-data extension
- convert-sitemap-to-markdown extension
- Local references in local-antora-playbook.yml

What remains:

- Production playbook (antora-playbook.yml) keeps new extensions for next release

To reapply after npm release: git apply playbook-extensions-for-release.patch

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Feediver1

Nice

JakeSCahill added 3 commits March 21, 2026 10:31

Remove outdated index.md redirects - markdown files now use proper na…

831834a

…ming

JakeSCahill requested a review from a team as a code owner March 21, 2026 19:50

JakeSCahill mentioned this pull request Mar 21, 2026

Add AI-friendly meta tags and enhanced structured data redpanda-data/docs-ui#371

Open

JakeSCahill and others added 12 commits March 21, 2026 20:10

Fix Crawl-delay placement in robots configuration

1c0b79d

Add redirect for old index.md URLs to new .md structure

0bf47fd

Fix Markdown capitalization in llms.adoc

8ea1699

Delete AI-CRAWLER-USER-AGENTS.md

3663ea5

Feediver1 approved these changes Mar 30, 2026

JakeSCahill added 2 commits March 30, 2026 16:39

Delete playbook-extensions-for-release.patch

016025b

Delete playbook-extensions-for-release.README

03e8ea6

JakeSCahill wants to merge 17 commits into main from update-llms-markdown-documentation
