Conversation

@aidankmcalister
Member

@aidankmcalister aidankmcalister commented Nov 7, 2025

Summary by CodeRabbit

  • New Features

    • Serve pre-rendered Markdown to recognized AI crawlers with fallback to normal handling and short public caching; improved crawler detection and handling.
  • Documentation

    • Integrated an LLM-focused docs plugin to generate Markdown/LLMS TXT outputs and restrict generation to docs content.
  • Chores

    • Added runtime typings for Pages-style functions and updated dependencies to support the new docs generation and middleware.

@cloudflare-workers-and-pages

cloudflare-workers-and-pages bot commented Nov 7, 2025

Deploying docs with Cloudflare Pages

Latest commit: 8d328be
Status: ✅  Deploy successful!
Preview URL: https://576ecb80.docs-51g.pages.dev
Branch Preview URL: https://dc-5820ai-agents-serve-md.docs-51g.pages.dev


@github-actions
Contributor

github-actions bot commented Nov 7, 2025

Dangerous URL check

No absolute URLs to prisma.io/docs found.
No local URLs found.

@coderabbitai
Contributor

coderabbitai bot commented Nov 7, 2025

Walkthrough

Adds Cloudflare Pages middleware to detect AI crawlers and serve corresponding Markdown assets, introduces Cloudflare Pages Functions TypeScript declarations (EventContext, PagesFunction), configures a Docusaurus plugin for LLMS TXT generation in docusaurus.config.ts, and adds the plugin dependency to package.json.

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| Docusaurus configuration: docusaurus.config.ts | Adds @signalwire/docusaurus-plugin-llms-txt plugin configuration with typed PluginOptions (uses satisfies), imports PluginOptions from @signalwire/docusaurus-plugin-llms-txt/public, and minor formatting tweaks (trailing commas). |
| Cloudflare Pages middleware: functions/_middleware.ts | New middleware export const onRequest: PagesFunction<Env> that detects AI crawlers using AI_CRAWLER_PATTERNS and isAICrawler, appends .md to request paths, and attempts context.env.ASSETS.fetch for the Markdown asset. On success it returns the fetched content as text/markdown; charset=utf-8 with public one-hour caching; on fetch failure or no match it calls next(). Logs the observed content-type. |
| Type declarations: functions/types.d.ts | New TypeScript declarations: exported EventContext<Env = unknown> interface and exported PagesFunction<Env = unknown> type used by the middleware. |
| Dependencies: package.json | Adds dependency @signalwire/docusaurus-plugin-llms-txt; reorders docusaurus-plugin-sass entry (no version change). |
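
The control flow summarized for functions/_middleware.ts can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the Env and Context shapes are local stand-ins, looksLikeAICrawler is a placeholder for the PR's isAICrawler, and max-age=3600 is taken from the "one-hour caching" description.

```typescript
// Local stand-ins for the Pages runtime types (assumed shapes, not the PR's declarations).
interface Env {
  ASSETS: { fetch: (input: string) => Promise<Response> };
}

interface Context {
  request: Request;
  env: Env;
  next: () => Promise<Response>;
}

// Placeholder detection; the PR uses a longer AI_CRAWLER_PATTERNS list.
const looksLikeAICrawler = (ua: string | null): boolean =>
  /gptbot|claudebot/i.test(ua ?? "");

async function onRequest(context: Context): Promise<Response> {
  const { request, env, next } = context;

  // Non-crawler traffic goes straight to normal handling.
  if (!looksLikeAICrawler(request.headers.get("user-agent"))) return next();

  // Append .md to the requested path, as the summary describes.
  const url = new URL(request.url);
  url.pathname = `${url.pathname}.md`;

  try {
    const asset = await env.ASSETS.fetch(url.toString());
    // No Markdown twin for this page: fall back to the HTML response.
    if (!asset.ok) return next();

    return new Response(asset.body, {
      headers: {
        "Content-Type": "text/markdown; charset=utf-8",
        "Cache-Control": "public, max-age=3600",
      },
    });
  } catch {
    // A failed asset fetch should never break the page for the crawler.
    return next();
  }
}
```

Falling back to next() on both a non-OK asset response and a thrown error keeps the middleware transparent: a crawler gets HTML whenever Markdown is unavailable.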

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Inspect functions/_middleware.ts for correct, case-insensitive User-Agent and Accept header checks, and ensure path normalization when appending .md (handle query strings, trailing slashes, index resolution).
  • Verify context.env.ASSETS.fetch usage, response handling, and headers (Content-Type, Cache-Control) are correct and secure.
  • Confirm functions/types.d.ts signatures match runtime expectations and are imported/used consistently.
  • Validate docusaurus.config.ts uses the correct PluginOptions import path and that the new plugin configuration aligns with the added dependency.
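
For the path-normalization point in the first bullet, one possible shape is sketched below. toMarkdownPath is a hypothetical helper, not something from the PR, and mapping the site root to /index.md is an assumption about how the Markdown files are laid out.

```typescript
// Hypothetical helper: maps a page URL to the path of its Markdown twin,
// or returns null when the request should not be rewritten.
function toMarkdownPath(requestUrl: string): string | null {
  // URL.pathname excludes the query string and fragment.
  let path = new URL(requestUrl).pathname;

  // Already a Markdown path: nothing to rewrite.
  if (path.endsWith(".md")) return path;

  // Drop a trailing slash so "/docs/orm/" and "/docs/orm" map to the same file.
  if (path.length > 1 && path.endsWith("/")) path = path.slice(0, -1);

  // The site root has no page name; assume an index file (assumption).
  if (path === "/") return "/index.md";

  // Heuristic: a dot in the last segment suggests a static asset such as "/img/logo.png".
  const lastSegment = path.slice(path.lastIndexOf("/") + 1);
  if (lastSegment.includes(".")) return null;

  return `${path}.md`;
}
```

The dot check is only a heuristic: a page whose last segment contains a dot (for example a version number) would be skipped and served as HTML.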

Pre-merge checks

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'DC-5820 AI Agents Served Markdown' accurately reflects the main change: enabling AI crawler detection and serving markdown content to these crawlers through middleware and Docusaurus plugin configuration.

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 68c23d9 and 8d328be.

📒 Files selected for processing (1)
  • functions/_middleware.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • functions/_middleware.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Check internal links
  • GitHub Check: Cloudflare Pages

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Contributor

github-actions bot commented Nov 7, 2025

Redirect check

This PR probably requires the following redirects to be added to static/_redirects:

  • This PR does not change any pages in a way that would require a redirect.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ddfc555 and daf7d45.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (4)
  • docusaurus.config.ts (3 hunks)
  • functions/_middleware.ts (1 hunks)
  • functions/types.d.ts (1 hunks)
  • package.json (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: aidankmcalister
Repo: prisma/docs PR: 7165
File: content/800-guides/550-test-guide.mdx:85-90
Timestamp: 2025-10-08T16:23:00.388Z
Learning: For .mdx files in the prisma/docs repository: Only flag code snippets for (1) exposed secrets with real-looking values that should be placeholders (e.g., API keys, database passwords, AWS credentials), or (2) blatant syntax errors like missing parentheses, brackets, braces, or mismatched delimiters. Do not flag code quality issues, anti-patterns, security vulnerabilities, missing error handling, unused variables, or any other bad practices, as documentation intentionally shows problematic code.
Learnt from: carlagn
Repo: prisma/docs PR: 7183
File: content/200-orm/800-more/600-help-and-troubleshooting/050-dataguide/050-database-glossary.mdx:561-567
Timestamp: 2025-10-14T11:41:08.844Z
Learning: Ignore the file `content/200-orm/800-more/600-help-and-troubleshooting/050-dataguide/050-database-glossary.mdx` in future reviews as per user request. Issues in this file were addressed in PR #7185.
Learnt from: aidankmcalister
Repo: prisma/docs PR: 7167
File: content/900-ai/prompts/astro.mdx:84-85
Timestamp: 2025-10-10T13:13:30.534Z
Learning: Do not review or comment on files in the `ai/prompts/` directory or matching the path pattern `content/900-ai/prompts/` in the prisma/docs repository.
🔇 Additional comments (5)
docusaurus.config.ts (1)

7-7: LGTM: Type-safe plugin import.

The import of PluginOptions from the new plugin enables type-safe configuration with the satisfies assertion on line 108.
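
As a rough illustration of why satisfies helps here: the PluginOptions type below is a local stand-in with an assumed shape, not the plugin's real exported type, and includePaths is a hypothetical field. Only enableMarkdownFiles is mentioned in this review.

```typescript
// Stand-in for the plugin's PluginOptions (assumed shape, NOT the real type).
type PluginOptions = {
  enableMarkdownFiles?: boolean;
  includePaths?: string[]; // hypothetical field
};

// `satisfies` checks the object against PluginOptions without widening it,
// so typos and wrong value types are still rejected at compile time.
const llmsTxtOptions = {
  enableMarkdownFiles: true,
  includePaths: ["/docs/**"],
} satisfies PluginOptions;

// The inferred type keeps enableMarkdownFiles as a plain boolean. With a
// `: PluginOptions` annotation it would be `boolean | undefined` instead.
const markdownEnabled: boolean = llmsTxtOptions.enableMarkdownFiles;
```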

functions/_middleware.ts (2)

7-34: LGTM: Comprehensive AI crawler detection.

The pattern list covers major AI crawlers including OpenAI, Anthropic, Google, Bing, and others. The case-insensitive matching in isAICrawler (line 40) ensures robust detection.
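
A minimal sketch of this kind of case-insensitive check is shown below. The pattern list is an illustrative subset chosen for the example and does not reproduce the PR's AI_CRAWLER_PATTERNS.

```typescript
// Illustrative subset of lower-cased user-agent fragments (not the PR's full list).
const AI_CRAWLER_PATTERNS: readonly string[] = [
  "gptbot",
  "claudebot",
  "perplexitybot",
  "ccbot",
];

function isAICrawler(userAgent: string | null): boolean {
  // Requests without a User-Agent header are treated as non-crawlers.
  if (!userAgent) return false;

  // Lower-case once, then substring-match against every pattern.
  const normalized = userAgent.toLowerCase();
  return AI_CRAWLER_PATTERNS.some((pattern) => normalized.includes(pattern));
}
```

Substring matching tolerates version suffixes and surrounding tokens, at the cost of matching any user agent that merely contains one of the fragments.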


43-74: Verify markdown file generation and serving path during deployment.

The middleware code is correct, but verification requires checking the actual deployment setup since:

  1. The plugin is configured with enableMarkdownFiles: true, which should generate individual .md files for each Docusaurus page. However, the post-build hook in docusaurus.config.ts only writes .txt files; markdown generation happens in the plugin's internal code.

  2. The directory served through the context.env.ASSETS binding is determined by the Cloudflare Pages project's build output setting (not visible in the repository). Ensure that the generated markdown files are included in the build output directory that Cloudflare Pages publishes as ASSETS.

  3. The plugin version (@signalwire/docusaurus-plugin-llms-txt@^2.0.0-alpha.2) is alpha, which may introduce instability.

Regarding max-age=3600: 1 hour is reasonable for documentation, but consider your update frequency and whether shorter or longer TTLs are more appropriate.

functions/types.d.ts (1)

1-20: LGTM: Well-structured Cloudflare Pages type definitions.

The ambient type declarations provide appropriate typing for Cloudflare Pages Functions:

  • Env interface correctly types the environment bindings including ASSETS.fetch and DOCUSAURUS_BASE_URL
  • EventContext properly models the request context with all necessary properties
  • PagesFunction type correctly defines the middleware signature

These ambient declarations will be available throughout the functions/ directory without explicit imports.
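
For orientation, declarations along these lines might look like the sketch below. The exact members of the PR's functions/types.d.ts are not shown in this thread, so the property lists (and the optionality of DOCUSAURUS_BASE_URL) are assumptions; the type parameter is named E here only to avoid shadowing the Env interface.

```typescript
// Environment bindings available to the function (member list assumed).
interface Env {
  ASSETS: { fetch: typeof fetch };
  DOCUSAURUS_BASE_URL?: string; // optionality is an assumption
}

// Request context handed to every Pages-style function (assumed subset).
interface EventContext<E = unknown> {
  request: Request;
  env: E;
  params: Record<string, string | string[]>;
  next: (input?: Request | string, init?: RequestInit) => Promise<Response>;
  waitUntil: (promise: Promise<unknown>) => void;
}

// A handler receives the context and returns a Response, sync or async.
type PagesFunction<E = unknown> = (
  context: EventContext<E>,
) => Response | Promise<Response>;

// Usage: a pass-through handler that type-checks against the declarations above.
const passThrough: PagesFunction<Env> = (context) => context.next();
```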

package.json (1)

27-27: Verify alpha dependency stability; consider pinning to stable version 1.2.2.

The @signalwire/docusaurus-plugin-llms-txt package is at version 2.0.0-alpha.2, which indicates pre-release software. The latest stable version is 1.2.2. Alpha versions may have bugs, breaking changes, or incomplete features that could impact production stability.

Verify the plugin's stability and review its changelog before deploying to production. If production stability is the priority, consider downgrading to the stable 1.2.2 release or documenting why the alpha version is needed.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
functions/_middleware.ts (1)

1-5: Remove duplicate Env typing

This file still redefines Env instead of reusing the ambient declaration from functions/types.d.ts, which already carries properties like DOCUSAURUS_BASE_URL. Keeping two conflicting versions means PagesFunction<Env> here will silently miss any fields added to the shared type, so the middleware can break as the contract evolves. Please drop the local interface and rely on the shared definition.

-interface Env {
-  ASSETS: {
-    fetch: typeof fetch;
-  };
-}
-
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between daf7d45 and 3fc91b3.

📒 Files selected for processing (1)
  • functions/_middleware.ts (1 hunks)

@github-actions
Contributor

github-actions bot commented Nov 7, 2025

🍈 Lychee Link Check Report

Note: Links are cached for 5 minutes. Failed links (timeouts, rate limits) are retried in a second run with longer timeout.

📊 Results Overview

Status Count
🔍 Total 2413
✅ Successful 2383
⏳ Timeouts 0
🔀 Redirected 7
👻 Excluded 22
❓ Unknown 0
🚫 Errors 0
⛔ Unsupported 1

Redirects per input

Redirects in 800-guides/070-cloudflare-d1.mdx

Redirects in 800-guides/090-nextjs.mdx

Redirects in 800-guides/170-react-router-7.mdx

Redirects in 800-guides/350-authjs-nextjs.mdx

Redirects in 800-guides/370-bun.mdx

Redirects in 800-guides/380-vercel-app-deployment.mdx

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
functions/_middleware.ts (1)

69-70: Consider removing debug logging or using structured logging.

The console.log will execute on every AI crawler request. In Cloudflare Pages Functions, console logs may not be easily accessible without Real-time Logs enabled, and could create noise in production.

Consider removing the debug log or making it conditional:

-      // Check what content type we actually got
-      const actualContentType = response.headers.get("content-type");
-      console.log(`Fetched ${markdownPath}, got content-type: ${actualContentType}`);
-
       return new Response(response.body, {
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3fc91b3 and 4cf6736.

📒 Files selected for processing (1)
  • functions/_middleware.ts (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-10-08T16:22:57.129Z
Learnt from: aidankmcalister
Repo: prisma/docs PR: 7165
File: content/800-guides/550-test-guide.mdx:50-66
Timestamp: 2025-10-08T16:22:57.129Z
Learning: In `.mdx` files, do NOT flag or suggest changes for the following code quality issues even if they represent poor practices: React anti-patterns (using var instead of useState, direct DOM manipulation), missing keys in .map() iterations, non-serializable props in getServerSideProps, unused variables, missing error handling, SQL injection vulnerabilities (unless actively showing how to fix them), insecure cookie settings, missing TypeScript types, PrismaClient instantiation patterns, or any other code quality, security, or performance issues. Documentation code snippets are copied from source code and often intentionally show "before" examples or common mistakes.

Applied to files:

  • functions/_middleware.ts
📚 Learning: 2025-10-08T16:23:00.388Z
Learnt from: aidankmcalister
Repo: prisma/docs PR: 7165
File: content/800-guides/550-test-guide.mdx:85-90
Timestamp: 2025-10-08T16:23:00.388Z
Learning: For .mdx files in the prisma/docs repository: Only flag code snippets for (1) exposed secrets with real-looking values that should be placeholders (e.g., API keys, database passwords, AWS credentials), or (2) blatant syntax errors like missing parentheses, brackets, braces, or mismatched delimiters. Do not flag code quality issues, anti-patterns, security vulnerabilities, missing error handling, unused variables, or any other bad practices, as documentation intentionally shows problematic code.

Applied to files:

  • functions/_middleware.ts
🔇 Additional comments (2)
functions/_middleware.ts (2)

43-48: LGTM: Crawler detection logic is sound.

The case-insensitive substring matching is appropriate for user-agent detection. The null check prevents errors, and the implementation is clean.


72-78: LGTM: Response construction is well-configured.

The response properly sets the markdown content-type with UTF-8 encoding and includes appropriate caching headers for CDN efficiency.

@github-actions
Contributor

github-actions bot commented Nov 7, 2025

🍈 Lychee Link Check Report

Note: Links are cached for 5 minutes. Failed links (timeouts, rate limits) are retried in a second run with longer timeout.

📊 Results Overview

Status Count
🔍 Total 2413
✅ Successful 2384
⏳ Timeouts 0
🔀 Redirected 6
👻 Excluded 22
❓ Unknown 0
🚫 Errors 0
⛔ Unsupported 1

Redirects per input

Redirects in 200-orm/800-more/500-development-environment/200-editor-setup.mdx

Redirects in 200-orm/800-more/600-help-and-troubleshooting/050-dataguide/050-database-glossary.mdx

Redirects in 800-guides/070-cloudflare-d1.mdx

Redirects in 800-guides/370-bun.mdx

Redirects in 800-guides/380-vercel-app-deployment.mdx

@aidankmcalister aidankmcalister merged commit 2daa27a into main Nov 10, 2025
6 checks passed
@aidankmcalister aidankmcalister deleted the DC-5820ai-agents-serve-md branch November 10, 2025 13:59
ankur-arch added a commit that referenced this pull request Nov 15, 2025
* feat(docs): add youtube embeded link to blog post (#7220)

Co-authored-by: Arthur Gamby <arthurgamby@Mac-002.lan>

* feat(docs): add quick  section to blog after the prompt (#7221)

Co-authored-by: Arthur Gamby <arthurgamby@Mac-002.lan>

* DC-5242 Astro Better-Auth Guide (#7215)

* doc created

* nextjs betterauth fixed

* guide broken down into manageable steps

* image added

* Optimised images with calibre/image-actions

* image updated

* Optimised images with calibre/image-actions

* lychee only comments on broken links

* config updated

* lychee updated based on CR comment

* chore: trigger CI checks

* ignore gnu

* fix: update naming to better-auth DC-6120

* Optimised images with calibre/image-actions

* chore: shorten word

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ankur Datta <64993082+ankur-arch@users.noreply.github.com>

* feat: restructure getting started side nav

* DC-5841 Removed Linkspector (#7231)

* removed linkspector

* retrigger

* retrigger

* retrigger

* Update label for Prisma Postgres tab (#7236)

* fix: content changes for getting started page (#7216)

* fix: content changes for getting started page

* fix: add redirects

* getting started checkpoint

* css styles fixed

* updates

---------

Co-authored-by: Aidan McAlister <aidankmcalister@gmail.com>
Co-authored-by: Aidan McAlister <105178005+aidankmcalister@users.noreply.github.com>

* DC-5820 AI Agents Served Markdown (#7237)

* ai crawler check successful

* ai crawlers checked

* update

* added anthropic

* middleware update

* improve detection

---------

Co-authored-by: Mike Hartington <mikehartington@gmail.com>

* feat: add prisma-orm quickstarts

* fix: update times and add proper links

* fix: instropspect changes

* feat: add get started from prisma orm page

* feat: add other orms

* fix: update other tools + ppg

* fix: add more clarity

* fix: add prisma postgres

* feat: clear migrate from early access

* fix: add to existing dbs sections

* Remove MCP server exploration tip (#7241)

Removed tip about using Cloudflare's AI Playground for MCP server exploration, as it no longer seems to reliably work.

* fix: clean-up docs files

* fix: typeorm missing urls

* fix: broken link

* fix: update titles

* fix: clear redirect loop (#7250)

* fix: add generate step + sqlite fixes

* fix: clean-up redirects file DC-6228

* fix: clean-up unnecessary file

* fix: remove unknown word

* fix: add links

---------

Co-authored-by: Arthur <arthur_gamby@hotmail.fr>
Co-authored-by: Arthur Gamby <arthurgamby@Mac-002.lan>
Co-authored-by: Aidan McAlister <105178005+aidankmcalister@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Petra Donka <donkapetra@gmail.com>
Co-authored-by: Aidan McAlister <aidankmcalister@gmail.com>
Co-authored-by: Mike Hartington <mikehartington@gmail.com>

4 participants