Update llms.adoc for proper markdown filenames and AI crawler permissions #161
Open
JakeSCahill wants to merge 17 commits into main
## Changes

- Remove references to "indexify convention" and index.md files
- Update markdown access instructions: replace .html with .md instead of appending /index.md
- Add AI-Optimized Formats section with:
  - llms.txt (curated overview)
  - llms-full.txt (complete export ~20MB)
  - Component-specific exports (ROOT-full.txt, redpanda-cloud-full.txt, etc.)
- Document YAML frontmatter in individual markdown pages
- Update versioning section to use proper markdown paths

## Benefits

- Accurate documentation of new markdown structure
- Clear guidance for AI agents on available formats
- Better discoverability with component-specific exports
- Matches actual implementation from docs-extensions-and-macros
Enhanced the production playbook (antora-playbook.yml) with explicit robots.txt directives for AI crawlers including GPTBot, Claude-Web, Perplexity, Google-Extended, and other platforms. This makes Redpanda's intent to welcome AI crawlers explicit and clear, following best practices for AI discoverability.
✅ Deploy Preview for redpanda-documentation ready!
Move Crawl-delay inside wildcard User-agent block for proper robots.txt syntax. Crawl-delay should be within a User-agent block, not standalone after all blocks.
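The effect of this fix can be checked with a small sketch using Python's standard `urllib.robotparser`. The robots.txt snippets, the delay value `1`, and the agent name `ExampleBot` are illustrative assumptions, not the contents of the actual playbook; other crawlers may parse a stray directive differently.

```python
from urllib.robotparser import RobotFileParser


def wildcard_crawl_delay(robots_txt: str, agent: str = "ExampleBot"):
    """Return the Crawl-delay that applies to `agent`, or None if there is none."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(agent)


# Hypothetical "before": Crawl-delay sits alone after all User-agent blocks.
STANDALONE = """\
User-agent: GPTBot
Allow: /

User-agent: *
Allow: /

Crawl-delay: 1
"""

# Hypothetical "after": Crawl-delay is part of the wildcard block.
INSIDE_BLOCK = """\
User-agent: GPTBot
Allow: /

User-agent: *
Allow: /
Crawl-delay: 1
"""

delay_before = wildcard_crawl_delay(STANDALONE)   # directive is not attached to any block
delay_after = wildcard_crawl_delay(INSIDE_BLOCK)  # directive belongs to "User-agent: *"
```

With this parser, the standalone directive is ignored because it does not belong to any User-agent block, while the in-block version yields a delay for agents matched by the wildcard.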
Replace all /index.md paths with proper .md filenames:
- /ai-agents/index.md → /ai-agents.md
- /console/index.md → /console.md
- /get-started/quickstarts/index.md → /get-started/quickstarts.md
- /api/doc/*.md → /api/*.md

This fixes afdocs check failures for broken links and ensures all URLs in llms.txt point to actual markdown files that exist.
Add catch-all redirect: /*/index.md → /:splat.md This ensures old bookmarks and links to /page/index.md are redirected to /page.md in the new markdown structure.
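As a rough illustration of what the catch-all rule does to a request path, here is a minimal Python sketch. The function name is hypothetical, and treating the splat as requiring at least one path segment is an assumption about the redirect engine, not something stated in this PR.

```python
def redirect_legacy_markdown_path(path: str) -> str:
    """Map /<anything>/index.md to /<anything>.md; leave other paths untouched."""
    suffix = "/index.md"
    prefix = path[: -len(suffix)] if path.endswith(suffix) else ""
    # Assumption: the splat must match at least one segment, so a bare
    # "/index.md" is not rewritten.
    if prefix.strip("/"):
        return f"{prefix}.md"
    return path


examples = {
    old: redirect_legacy_markdown_path(old)
    for old in (
        "/ai-agents/index.md",
        "/get-started/quickstarts/index.md",
        "/console.md",
    )
}
```

Paths already in the new form, such as `/console.md`, pass through unchanged, so applying the mapping twice has no further effect.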
Rewrite AI-Optimized Formats section to use flowing prose instead of bullet lists for better readability. Remove ~20MB size reference as it's dynamic and will change over time. Maintain all essential information while improving narrative flow.
Change 'markdown' to 'Markdown' (proper noun) throughout the Access Markdown content section. Revert to bullet list format as the prose conversion was unintended.
Removed deprecated and questionable user agents:
- Claude-Web, anthropic-ai (deprecated by Anthropic in 2026)
- Perplexity (duplicate/incorrect - PerplexityBot is correct)
- cohere-ai (undocumented)
- Omgilibot (commercial scraper, not AI development)

Added Anthropic's new three-bot framework (2026):
- ClaudeBot (model training)
- Claude-User (user requests)
- Claude-SearchBot (search optimization)

Verified remaining agents with official documentation:
- GPTBot, ChatGPT-User (OpenAI)
- PerplexityBot (Perplexity)
- Google-Extended, GoogleOther (Google)
- CCBot (Common Crawl)
- FacebookBot (Meta)

Added AI-CRAWLER-USER-AGENTS.md documentation with:
- Verification evidence for each user agent
- Official documentation links
- Maintenance procedures
- Change log
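One way to sanity-check a user-agent list like this is to render explicit blocks and parse them back. The sketch below uses only the agent names from this commit; the `render_robots` helper, the `Allow: /` rule, and the wildcard block are assumptions for illustration and may not match the directives in antora-playbook.yml.

```python
from urllib.robotparser import RobotFileParser

# Agent names taken from the commit message above.
AI_USER_AGENTS = [
    "GPTBot", "ChatGPT-User",
    "ClaudeBot", "Claude-User", "Claude-SearchBot",
    "PerplexityBot",
    "Google-Extended", "GoogleOther",
    "CCBot",
    "FacebookBot",
]


def render_robots(agents):
    """Build one explicit allow block per agent, plus a wildcard block."""
    blocks = [f"User-agent: {agent}\nAllow: /" for agent in agents]
    blocks.append("User-agent: *\nAllow: /")
    return "\n\n".join(blocks) + "\n"


robots_txt = render_robots(AI_USER_AGENTS)

# Parse the rendered text back to confirm every listed agent is allowed.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
allowed = {agent: parser.can_fetch(agent, "/current/home.md") for agent in AI_USER_AGENTS}
```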
Updates llms.adoc to include comprehensive information about the Redpanda Documentation MCP (Model Context Protocol) server:
- MCP server URL: https://docs.redpanda.com/mcp
- Setup instructions for Claude Code (npx doc-tools setup-mcp)
- Complete list of available MCP tools:
  * generate_property_docs
  * generate_metrics_docs
  * generate_rpk_docs
  * generate_rpcn_connector_docs
  * generate_helm_docs
  * generate_crd_docs
  * generate_bundle_openapi
  * get_redpanda_version
  * get_console_version
  * get_antora_structure

Reorganized AI-Optimized Formats section:
- Interactive MCP Server (new subsection)
- Static Exports (existing content reorganized)

This makes the MCP server discoverable via llms.txt for AI agents and tools that follow the llms.txt standard.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Adds MCPcat analytics tracking to the Redpanda docs MCP server. MCPcat is an open-source analytics platform for monitoring MCP usage. The integration is optional and only activates if the MCPCAT_PROJECT environment variable is set.

Changes:
- Added mcpcat as a dependency in package.json
- Integrated MCPcat tracking in netlify/functions/mcp.mjs
- Includes error handling to prevent server crashes if analytics fail

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
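The opt-in pattern described here (activate only when the environment variable is set, and never let an analytics failure take the server down) can be sketched generically. This is a Python illustration, not the JavaScript in netlify/functions/mcp.mjs and not the MCPcat API: `enable_analytics` and the `start_tracking` callback are hypothetical names, and only `MCPCAT_PROJECT` comes from the commit.

```python
import logging
import os
from typing import Callable, Mapping, Optional

logger = logging.getLogger("mcp-analytics")


def enable_analytics(
    start_tracking: Callable[[str], None],
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Start analytics only when MCPCAT_PROJECT is set; never raise."""
    env = os.environ if env is None else env
    project_id = env.get("MCPCAT_PROJECT")
    if not project_id:
        return False  # variable unset or empty: analytics stays off
    try:
        # Hypothetical hook standing in for the real analytics client.
        start_tracking(project_id)
    except Exception:
        # Swallow the failure so the server keeps serving requests.
        logger.warning("analytics setup failed; continuing without it", exc_info=True)
        return False
    return True
```

Returning a boolean instead of raising lets the caller log whether tracking is active while keeping the request path independent of the analytics dependency.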
Major restructuring to address AI agent discovery challenges:

Changes:
- Added **Discovery** section at the top explaining how agents find docs
- Emphasized llms.txt as "minimum barrier for entry" (industry insight)
- Moved MCP server into Discovery section as primary method
- Clarified discovery methods: llms.txt, sitemap.xml, MCP server
- Reorganized static exports as fallback for non-MCP agents
- Enhanced markdown access patterns explanation

Key additions:
- "Why llms.txt Matters" - explains importance for AI consumption
- Primary discovery methods numbered clearly
- Sitemap reference (including sitemap-llms.xml)
- Emphasis that "This documentation is optimized for AI consumption"

Rationale: Research shows agents don't know about llms.txt by default and many sites don't provide it. This update makes discovery explicit and prominent, helping agents understand HOW to access our docs. Based on industry research about AI docs consumption patterns.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Added configuration comments to git-full-clone extension in all three playbooks (local, preview, production) to document available options:
- unshallowTimeout: Timeout in ms (default: 60000)
- skipUnshallow: Set true for air-gapped environments

These configuration options provide production-ready safeguards:
- Timeout protection prevents hanging on large repos
- Skip option enables air-gapped CI/CD builds

Also includes other playbook updates:
- All new extensions enabled (git-dates, faq, markdown, llms, sitemap)
- Log level adjustments for debugging
- Local UI bundle path for faster local builds

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Reverted the new extension configurations from preview-antora-playbook.yml and local-antora-playbook.yml to prevent build failures before the extensions are published to npm. Changes saved in playbook-extensions-for-release.patch for reapplication after npm publication.

What was reverted:
- git-full-clone extension configuration
- add-git-dates extension
- add-faq-structured-data extension
- convert-sitemap-to-markdown extension
- Local references in local-antora-playbook.yml

What remains:
- Production playbook (antora-playbook.yml) keeps new extensions for next release

To reapply after npm release:
    git apply playbook-extensions-for-release.patch

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Summary
Updates documentation and configuration to reflect the new AI-optimized markdown export structure:
Changes
home/modules/ROOT/pages/llms.adoc
- /page/index.md structure replaced by .md filenames (/page.md)

antora-playbook.yml
- robots: allow changed to explicit directives

netlify.toml
- /current.md → /current/home.md (was /current/home/index.md)

Testing
Verified that llms.txt generation works correctly with updated documentation.
Related PRs