Added stateless llms.txt service with dependency injection#28042

Merged
ErisDS merged 11 commits into main from llms-txt-stateless
May 28, 2026
Conversation

@ErisDS
Member

@ErisDS ErisDS commented May 21, 2026

Summary

  • Implements llms.txt / llms-full.txt generation, per-entry .md export, Accept: text/markdown content negotiation, and Link/X-Llms-Txt discovery headers
  • Stateless, compute-on-request architecture — no in-memory cache, no event listeners, relies on HTTP Cache-Control for CDN/proxy caching
  • Full dependency injection via factory functions (createLlmsService, createLlmsHandler, createLlmsDiscovery) — no module-level singleton imports
  • Gated behind private labs flag (llmsTxt) + llms_enabled setting as user-facing toggle
  • Paginated DB queries for llms-full.txt (100 per batch, 5 MiB budget) with withRelated: ['tags', 'authors'] for correct permalink resolution
  • Scoped .md suffix routes on CollectionRouter and StaticPagesRouter (not a broad regex)
  • Structured logging ([llms] prefix) and Sentry reporting on handler errors
  • markdown.js is fully pure — no Ghost singleton imports
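
The batched, budget-bounded export mentioned in the summary might look roughly like the sketch below. Only the batch size (100) and the 5 MiB budget come from the PR description; `buildFullExport`, `fetchBatch`, and `renderEntry` are hypothetical names used for illustration, not the PR's actual functions.

```javascript
// Hypothetical sketch: build a bounded export from paginated batches.
// `fetchBatch(page, limit)` and `renderEntry(entry)` are injected stand-ins.
const BATCH_SIZE = 100;                  // batch size named in the summary
const FULL_TXT_BUDGET = 5 * 1024 * 1024; // 5 MiB budget named in the summary

async function buildFullExport({fetchBatch, renderEntry, batchSize = BATCH_SIZE, budget = FULL_TXT_BUDGET}) {
    const parts = [];
    let usedBytes = 0;
    let truncated = false;

    for (let page = 1; !truncated; page += 1) {
        const entries = await fetchBatch(page, batchSize);
        if (entries.length === 0) {
            break;
        }
        for (const entry of entries) {
            const chunk = renderEntry(entry);
            const chunkBytes = Buffer.byteLength(chunk, 'utf8');
            // Stop before the entry that would exceed the byte budget.
            if (usedBytes + chunkBytes > budget) {
                truncated = true;
                break;
            }
            parts.push(chunk);
            usedBytes += chunkBytes;
        }
        // A short batch means the last page has been read.
        if (entries.length < batchSize) {
            break;
        }
    }

    return {text: parts.join('\n'), usedBytes, truncated};
}
```

Only one batch of rows is fetched at a time, and the loop stops as soon as the budget would be exceeded, so later pages are never queried once the export is truncated.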

Ref #27984

Architecture

Routing approach

Per-entry .md URLs are registered as suffix routes directly on CollectionRouter and StaticPagesRouter, reusing the existing entryLookup pipeline. This avoids a double-fetch (the broad regex approach needed a separate fetchPublicEntry call) and respects all existing routing features (_respectDominantRouter, redirect handling, etc.).
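
As a rough illustration of the suffix-route idea, the sketch below derives a scoped `.md` pattern from an entry permalink and mounts it next to the normal route, so both share one controller. The route patterns match those asserted in the collection-router tests later in this thread; `toMarkdownRoute`, `mountEntryRoutes`, and the `wantsMarkdown` flag are hypothetical names, and the router is reduced to an object with a `mountRoute` method.

```javascript
// Hypothetical helper: derive the scoped markdown route for a permalink pattern.
// '/:slug/' -> '/:slug.md', '/blog/:year/:slug/' -> '/blog/:year/:slug.md'
function toMarkdownRoute(permalink) {
    const trimmed = permalink.replace(/\/+$/, '');
    return `${trimmed}.md`;
}

// `router` only needs a mountRoute(pattern, handler) method in this sketch.
function mountEntryRoutes(router, permalink, entryController) {
    router.mountRoute(permalink, entryController);
    router.mountRoute(toMarkdownRoute(permalink), function markdownEntry(req, res, next) {
        // Flag the request so the shared entry pipeline can render markdown
        // instead of HTML (the flag name is illustrative).
        res.locals = Object.assign({}, res.locals, {wantsMarkdown: true});
        return entryController(req, res, next);
    });
}
```

Because the markdown route wraps the same controller, the entry is looked up once and existing redirect handling stays in one place.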

DI and service wiring

createLlmsService() receives all of its dependencies as factory arguments. The service is created once at boot in site.js and exposed via siteApp.set('llmsService', llmsService); the entry controller accesses it via req.app.get('llmsService').
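
A minimal sketch of this wiring, assuming a much smaller dependency list than the real service: the dependency names (`labs`, `settingsCache`, `fetchEntries`, `renderIndex`) and the returned methods are illustrative, and `createAppStub` stands in for the Express app's `set`/`get` settings table.

```javascript
// Sketch of factory-based DI; dependency names and methods are illustrative.
function createLlmsService({labs, settingsCache, fetchEntries, renderIndex}) {
    // Both the private labs flag and the user-facing setting must be on.
    function isEnabled() {
        return Boolean(labs.isSet('llmsTxt')) && Boolean(settingsCache.get('llms_enabled'));
    }

    async function getLlmsTxt() {
        if (!isEnabled()) {
            return null; // the caller would respond with 404
        }
        const entries = await fetchEntries();
        return renderIndex(entries);
    }

    return {isEnabled, getLlmsTxt};
}

// Minimal stand-in for the Express app settings table (app.set / app.get).
function createAppStub() {
    const table = new Map();
    return {
        set: (key, value) => table.set(key, value),
        get: key => table.get(key)
    };
}
```

With this shape, a unit test can pass plain stubs for every dependency, and a controller only needs `req.app.get('llmsService')` rather than a module-level import.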

What is NOT included (intentionally)

Test plan

  • 10 service unit tests + 25 markdown unit tests passing
  • Config snapshot updated for llmsTxt labs flag
  • Lint clean across entire monorepo
  • Full CI green (unit, integration, acceptance, E2E, Playwright)
  • Labs flag off (default): /llms.txt returns 404, no discovery headers, .md URLs 404, Accept: text/markdown returns HTML
  • Labs flag on: all endpoints serve correct content with Cache-Control headers
  • Pretty URLs regression: /author/name.smith/ and /tag/v2.0/ still get trailing-slash redirects
  • Content negotiation: Accept: text/markdown on a public post returns markdown with Vary: Accept
  • Discovery headers: HTML responses include Link and X-Llms-Txt pointing to /llms.txt
  • Permalink patterns with :primary_tag/:primary_author resolve correctly in llms.txt index
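
The discovery-header behaviour in the test plan could be sketched as a small middleware factory. `createLlmsDiscovery` is the factory name used in the PR, but the argument shape, the `rel` value in the `Link` header, and the lack of a content-type check are assumptions made for this sketch.

```javascript
// Sketch of a discovery middleware. The options object and the rel value
// are assumptions; the real middleware may differ.
function createLlmsDiscovery({isEnabled, llmsTxtUrl = '/llms.txt'}) {
    return function llmsDiscovery(req, res, next) {
        // With the feature off, no discovery headers are added.
        if (isEnabled()) {
            res.setHeader('Link', `<${llmsTxtUrl}>; rel="llms-txt"`);
            res.setHeader('X-Llms-Txt', llmsTxtUrl);
        }
        next();
    };
}
```

This simplified version sets the headers on every response and overwrites any existing `Link` header; a fuller implementation would likely restrict itself to HTML responses and append to existing values.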

@coderabbitai
Contributor

coderabbitai Bot commented May 21, 2026

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b6f8e5ad-7fc8-4b6a-9dad-6a9d0fa3623a

📥 Commits

Reviewing files that changed from the base of the PR and between 16f7cad and 9cd67af.

⛔ Files ignored due to path filters (2)
  • ghost/core/test/e2e-api/admin/__snapshots__/config.test.js.snap is excluded by !**/*.snap
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (18)
  • apps/admin-x-settings/src/components/settings/advanced/labs/private-features.tsx
  • ghost/core/core/frontend/services/llms/handler.js
  • ghost/core/core/frontend/services/llms/markdown.js
  • ghost/core/core/frontend/services/llms/service.js
  • ghost/core/core/frontend/services/routing/collection-router.js
  • ghost/core/core/frontend/services/routing/controllers/entry.js
  • ghost/core/core/frontend/services/routing/static-pages-router.js
  • ghost/core/core/frontend/web/middleware/llms-discovery.js
  • ghost/core/core/frontend/web/middleware/static-theme.js
  • ghost/core/core/frontend/web/site.js
  • ghost/core/core/server/web/shared/middleware/pretty-urls.js
  • ghost/core/core/shared/config/defaults.json
  • ghost/core/core/shared/labs.js
  • ghost/core/package.json
  • ghost/core/test/unit/frontend/services/llms/markdown.test.js
  • ghost/core/test/unit/frontend/services/llms/service.test.js
  • ghost/core/test/unit/frontend/services/routing/collection-router.test.js
  • pnpm-workspace.yaml
✅ Files skipped from review due to trivial changes (3)
  • ghost/core/core/shared/config/defaults.json
  • pnpm-workspace.yaml
  • ghost/core/test/unit/frontend/services/llms/service.test.js
🚧 Files skipped from review as they are similar to previous changes (13)
  • ghost/core/core/shared/labs.js
  • ghost/core/test/unit/frontend/services/routing/collection-router.test.js
  • apps/admin-x-settings/src/components/settings/advanced/labs/private-features.tsx
  • ghost/core/core/frontend/services/routing/collection-router.js
  • ghost/core/core/frontend/web/middleware/static-theme.js
  • ghost/core/package.json
  • ghost/core/test/unit/frontend/services/llms/markdown.test.js
  • ghost/core/core/frontend/web/site.js
  • ghost/core/core/frontend/web/middleware/llms-discovery.js
  • ghost/core/core/frontend/services/routing/controllers/entry.js
  • ghost/core/core/frontend/services/llms/markdown.js
  • ghost/core/core/frontend/services/llms/service.js
  • ghost/core/core/frontend/services/llms/handler.js

Walkthrough

Adds LLMS support: a markdown conversion module, a createLlmsService factory producing llms.txt and llms-full.txt (index and bounded full export), an Express handler mounting plaintext endpoints, per-entry .md routes and controller branches to serve/redirect markdown, discovery middleware and header handling, labs/config defaults, tests, and package/workspace dependency updates.

Possibly related PRs

  • TryGhost/Ghost#27995: Introduces the llms_enabled site setting referenced by the LLMS service enablement checks.

Suggested reviewers

  • vershwal
  • EvanHahn
  • troyciesco
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 2.17% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Added stateless llms.txt service with dependency injection' directly summarizes the main change: introducing an LLMS service with specific architectural patterns (stateless, DI). |
| Description check | ✅ Passed | The description is comprehensive and directly related to the changeset, covering implementation details, architecture, gating mechanisms, and test coverage. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (2)
ghost/core/test/unit/frontend/services/llms/markdown.test.js (1)

144-167: ⚡ Quick win

Consider adding edge-case tests for better coverage.

The integration test provides good coverage of the happy path, but consider adding tests for:

  • formatIsoDate with invalid date strings (would catch the issue flagged in markdown.js)
  • renderEntryMarkdown with missing title, no metadata fields
  • renderEntryMarkdownBody fallback to htmlToPlaintext.excerpt() when both html and plaintext are falsy

These additions would improve test maintainability by isolating failure points and documenting expected behavior for edge cases.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ghost/core/test/unit/frontend/services/llms/markdown.test.js` around lines
144 - 167, Add unit tests to cover the edge cases: 1) for formatIsoDate, add
tests that pass invalid/empty date strings and assert it handles them (returns
empty string or expected fallback) to catch parsing errors; 2) for
renderEntryMarkdown, add a test with an entry missing title and all optional
metadata (no url, authors, tags, custom_excerpt, html, plaintext) and assert the
output still renders safely (no undefined values and expected minimal
placeholders or sections omitted); 3) for renderEntryMarkdownBody, add a test
where both html and plaintext are falsy and assert it falls back to calling
htmlToPlaintext.excerpt() (mock/stub htmlToPlaintext.excerpt to verify it was
invoked and its output used). Reference the functions formatIsoDate,
renderEntryMarkdown, renderEntryMarkdownBody, and htmlToPlaintext.excerpt when
adding these tests.
ghost/core/core/frontend/services/llms/service.js (1)

197-216: Consider memory implications of limit: 'all' for large sites.

The index generation fetches all public pages/posts at once with limit: 'all'. While the html column is excluded to reduce memory usage, sites with tens of thousands of entries may still experience memory pressure during index generation.

For future optimization, consider adding a pagination option for the index or documenting expected memory requirements for large sites.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ghost/core/core/frontend/services/llms/service.js` around lines 197 - 216,
fetchIndexEntries currently calls models.Post.findPage with limit: 'all', which
can cause high memory usage for very large sites; change it to use paginated
fetching instead of 'all' by calling models.Post.findPage in a loop (or using
offset/page parameters) and accumulating entries in batches, using the same
order/filter/columns and early filtering of entries without urls, so you only
hold one batch in memory at a time; update fetchIndexEntries to iterate pages
until no more results (or expose a pagination option) rather than requesting all
rows at once from models.Post.findPage.
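
One way to picture the paginated alternative is an async generator that yields one filtered batch at a time. This is a sketch under assumptions: `iterateIndexEntries` is a hypothetical name, and `findPage` is an injected stand-in assumed to accept `{page, limit}` and return `{data, meta: {pagination: {next}}}`, which may not match the real model method exactly.

```javascript
// Sketch: iterate index entries page by page instead of using limit: 'all'.
// `findPage` is an injected stand-in; its options and return shape are assumed.
async function* iterateIndexEntries(findPage, {limit = 100} = {}) {
    let page = 1;
    while (true) {
        const {data, meta} = await findPage({page, limit});
        // Filter early so entries without a URL never leave the batch.
        yield data.filter(entry => Boolean(entry.url));
        if (!meta.pagination.next) {
            return;
        }
        page = meta.pagination.next;
    }
}
```

A consumer can write each batch to the response as it arrives with `for await (const batch of iterateIndexEntries(findPage))`, so only one batch is held in memory at a time.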
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ghost/core/core/frontend/services/llms/markdown.js`:
- Around line 82-88: formatIsoDate currently calls new Date(value).toISOString()
without validating the date, which can throw RangeError for invalid values;
update formatIsoDate to validate the parsed Date (use new Date(value) then check
isNaN(date.getTime()) or !isFinite(date)) and return null (or a safe fallback)
when the value is invalid, otherwise return date.toISOString(); reference the
function name formatIsoDate and the calls to new Date(...) and toISOString()
when making the change.

In `@ghost/core/core/frontend/services/llms/service.js`:
- Around line 17-28: Add unit tests for fetchPublicEntry: mock
api.pagesPublic.read and api.postsPublic.read and assert the correct controller
is called depending on resourceType ('pages' uses api.pagesPublic, 'posts' uses
api.postsPublic), that the call receives id, formats: 'html,plaintext', include:
'authors,tags' and context with the provided member, that the function returns
the first item from response[responseKey] when present (exercise optional
chaining behavior), and that it returns null when response or
response[responseKey] is empty/undefined; use spies/mocks to validate the
controller selection and context forwarding for fetchPublicEntry.

In `@pnpm-workspace.yaml`:
- Line 54: The dependency entry for node-html-markdown is using a caret range
(^2.0.0) which breaks reproducible builds; update the catalog dependency
declaration for node-html-markdown to use the exact pinned version (change
^2.0.0 to 2.0.0) so it matches the other pinned catalog dependencies and
complies with strict catalogMode.
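
The date guard suggested for `formatIsoDate` can be sketched as follows. Returning `null` for missing or invalid input is one of the fallbacks the review mentions; the actual function in markdown.js may choose a different one.

```javascript
// Sketch of a defensive formatIsoDate: never let toISOString() throw.
function formatIsoDate(value) {
    // new Date(null) is a valid epoch date, so reject empty input explicitly.
    if (value === null || value === undefined || value === '') {
        return null;
    }
    const date = new Date(value);
    // An Invalid Date has a NaN timestamp; toISOString() would throw RangeError.
    if (Number.isNaN(date.getTime())) {
        return null;
    }
    return date.toISOString();
}
```

For example, `formatIsoDate('not-a-date')` returns `null` instead of throwing, while a valid ISO string round-trips unchanged.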

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9d3c2241-d955-43a8-9c5c-0ec254852617

📥 Commits

Reviewing files that changed from the base of the PR and between 4daaa5b and 406f910.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (13)
  • ghost/core/core/frontend/services/llms/handler.js
  • ghost/core/core/frontend/services/llms/markdown.js
  • ghost/core/core/frontend/services/llms/service.js
  • ghost/core/core/frontend/services/routing/controllers/entry.js
  • ghost/core/core/frontend/web/middleware/llms-discovery.js
  • ghost/core/core/frontend/web/middleware/static-theme.js
  • ghost/core/core/frontend/web/site.js
  • ghost/core/core/server/web/shared/middleware/pretty-urls.js
  • ghost/core/core/shared/config/defaults.json
  • ghost/core/package.json
  • ghost/core/test/unit/frontend/services/llms/markdown.test.js
  • ghost/core/test/unit/frontend/services/llms/service.test.js
  • pnpm-workspace.yaml

Comment thread ghost/core/core/frontend/services/llms/markdown.js
Comment thread ghost/core/core/frontend/services/llms/service.js
Comment thread pnpm-workspace.yaml Outdated
@codecov

codecov Bot commented May 21, 2026

Codecov Report

❌ Patch coverage is 34.04580% with 432 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.62%. Comparing base (8bdc9fe) to head (9cd67af).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ghost/core/core/frontend/services/llms/service.js | 16.46% | 203 Missing ⚠️ |
| ghost/core/core/frontend/services/llms/markdown.js | 25.55% | 134 Missing ⚠️ |
| ghost/core/core/frontend/services/llms/handler.js | 35.82% | 43 Missing ⚠️ |
| ...ore/frontend/services/routing/controllers/entry.js | 23.91% | 35 Missing ⚠️ |
| ...ore/core/frontend/web/middleware/llms-discovery.js | 84.44% | 6 Missing and 1 partial ⚠️ |
| ...e/core/server/web/shared/middleware/pretty-urls.js | 80.00% | 4 Missing ⚠️ |
| ...ore/frontend/services/routing/collection-router.js | 62.50% | 3 Missing ⚠️ |
| ...e/frontend/services/routing/static-pages-router.js | 57.14% | 3 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #28042      +/-   ##
==========================================
- Coverage   73.82%   73.62%   -0.21%     
==========================================
  Files        1532     1536       +4     
  Lines      130096   130745     +649     
  Branches    15605    15637      +32     
==========================================
+ Hits        96046    96259     +213     
- Misses      33061    33520     +459     
+ Partials      989      966      -23     
Flag Coverage Δ
admin-tests 54.20% <ø> (ø)
e2e-tests 73.62% <34.04%> (-0.21%) ⬇️


@ErisDS ErisDS force-pushed the llms-txt-stateless branch from 406f910 to 76dbf55 Compare May 21, 2026 16:07
@github-actions
Contributor

E2E Tests Failed

To view the Playwright test report locally, run:

REPORT_DIR=$(mktemp -d) && gh run download 26237926440 -n playwright-report -D "$REPORT_DIR" && npx playwright show-report "$REPORT_DIR"

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ghost/core/core/frontend/services/llms/markdown.js`:
- Around line 56-60: The Accept header check in getAcceptedMarkdownContentType
is case-sensitive and will miss values like "Text/Markdown"; normalize the
header value (e.g., call toLowerCase() on req.get('Accept') or use a
case-insensitive match) before testing for 'text/markdown' or 'text/plain', then
run the existing includes checks against the normalized string so both mixed-
and upper-case media types are accepted.

In `@ghost/core/core/frontend/services/llms/service.js`:
- Around line 233-235: The current conditional only sorts entries by URL on the
first page (if (type === 'page' && pageNum === 1) { entries.sort(...) }),
causing inconsistent ordering across later paginated page exports; move or apply
the entries.sort((left, right) => left.url.localeCompare(right.url)) so that
sorting happens for all page exports before pagination (or remove the pageNum
check) so every page uses the same DB ordering; update the code paths that build
page entries (referencing variables type, pageNum, and entries) to ensure
sorting is applied consistently regardless of pageNum.

In `@ghost/core/core/frontend/services/routing/controllers/entry.js`:
- Around line 96-99: The response for negotiated markdown (built with
renderEntryMarkdown and using markdownContentType and llmsIndexUrl) is cacheable
but varies by Accept; add a Vary header for Accept before sending the response
(use res.vary('Accept') or res.set('Vary', 'Accept')) so shared caches don’t
serve markdown to HTML clients, then continue to set Cache-Control and send as
before.
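
The first and third findings above (case-insensitive `Accept` matching and a `Vary: Accept` header) can be illustrated together. This is a sketch under assumptions: this variant of `getAcceptedMarkdownContentType` takes the raw header string rather than the request, `respondWithNegotiation` is a hypothetical helper, and the response object is reduced to Node's `setHeader`/`end`. In Express, `res.vary('Accept')` would serve the same purpose.

```javascript
// Sketch: normalize the Accept header before matching media types.
function getAcceptedMarkdownContentType(acceptHeader) {
    const accept = String(acceptHeader || '').toLowerCase();
    if (accept.includes('text/markdown')) {
        return 'text/markdown; charset=utf-8';
    }
    if (accept.includes('text/plain')) {
        return 'text/plain; charset=utf-8';
    }
    return null;
}

// Hypothetical helper showing where Vary is set relative to the body.
function respondWithNegotiation(req, res, {renderMarkdown, renderHtml}) {
    // Set Vary on both branches so shared caches key on the Accept header.
    res.setHeader('Vary', 'Accept');
    const contentType = getAcceptedMarkdownContentType(req.headers.accept);
    if (contentType) {
        res.setHeader('Content-Type', contentType);
        return res.end(renderMarkdown());
    }
    res.setHeader('Content-Type', 'text/html; charset=utf-8');
    return res.end(renderHtml());
}
```

The substring check is deliberately simple and ignores quality values such as `text/markdown;q=0`; a stricter implementation would parse the header properly.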

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3d913420-8017-43c8-8706-3404b7c6faef

📥 Commits

Reviewing files that changed from the base of the PR and between 406f910 and 68d7e11.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (14)
  • ghost/core/core/frontend/services/llms/handler.js
  • ghost/core/core/frontend/services/llms/markdown.js
  • ghost/core/core/frontend/services/llms/service.js
  • ghost/core/core/frontend/services/routing/controllers/entry.js
  • ghost/core/core/frontend/web/middleware/llms-discovery.js
  • ghost/core/core/frontend/web/middleware/static-theme.js
  • ghost/core/core/frontend/web/site.js
  • ghost/core/core/server/web/shared/middleware/pretty-urls.js
  • ghost/core/core/shared/config/defaults.json
  • ghost/core/core/shared/labs.js
  • ghost/core/package.json
  • ghost/core/test/unit/frontend/services/llms/markdown.test.js
  • ghost/core/test/unit/frontend/services/llms/service.test.js
  • pnpm-workspace.yaml
✅ Files skipped from review due to trivial changes (3)
  • ghost/core/core/frontend/web/middleware/static-theme.js
  • ghost/core/package.json
  • ghost/core/core/shared/config/defaults.json

Comment thread ghost/core/core/frontend/services/llms/markdown.js
Comment thread ghost/core/core/frontend/services/llms/service.js Outdated
Comment thread ghost/core/core/frontend/services/routing/controllers/entry.js Outdated
Contributor

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ghost/core/test/unit/frontend/services/llms/service.test.js (1)

210-210: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Truncation message assertion doesn't match the configured budget.

The test sets fullTxtBudget: 1024 (1 KiB) on line 199, but this assertion expects the truncation message to say "Truncated after 5 MiB". Either:

  1. The service is hardcoding "5 MiB" in the message (misleading users/LLMs about the actual limit), or
  2. The service correctly uses the actual budget and this assertion will fail.

Update the assertion to match the actual budget used, or verify the service implementation includes the correct value in the truncation message.

🛠️ Suggested fix

If the message should reflect the actual budget:

-        assert.match(llmsFullTxt, /Truncated after 5 MiB/);
+        assert.match(llmsFullTxt, /Truncated after 1024 bytes|Truncated at budget/);

Or if the message should be generic:

-        assert.match(llmsFullTxt, /Truncated after 5 MiB/);
+        assert.match(llmsFullTxt, /Truncated/);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ghost/core/test/unit/frontend/services/llms/service.test.js` at line 210, The
test assertion under llms/service.test.js is checking for "Truncated after 5
MiB" while the test config sets fullTxtBudget: 1024 (1 KiB); update the
assertion to reflect the configured budget by either formatting the expected
truncation message using the fullTxtBudget value (e.g., derive "Truncated after
1 KiB" from fullTxtBudget) or change the assertion to a generic pattern that
matches any truncation message (e.g., /Truncated after/), so the expectation in
the test aligns with the behavior of the code that produces the truncation
message.
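
If the truncation message is meant to reflect the configured budget, one option is to derive the text from the byte count. `formatBudget` and `truncationNotice` are hypothetical helpers, and the message wording beyond "Truncated after" is an assumption.

```javascript
// Hypothetical helper: express a byte budget in the largest exact binary unit.
function formatBudget(bytes) {
    const units = ['bytes', 'KiB', 'MiB', 'GiB'];
    let value = bytes;
    let unit = 0;
    // Only step up a unit while the value divides evenly, so nothing is rounded.
    while (value >= 1024 && value % 1024 === 0 && unit < units.length - 1) {
        value /= 1024;
        unit += 1;
    }
    return `${value} ${units[unit]}`;
}

function truncationNotice(budgetBytes) {
    return `Truncated after ${formatBudget(budgetBytes)}`;
}
```

With this approach a 1024-byte test budget produces "Truncated after 1 KiB" and the default budget produces "Truncated after 5 MiB", so a generic `/Truncated after/` assertion holds in both cases.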

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 432a75b1-3c7c-43b0-8e6d-9bc896ce879b

📥 Commits

Reviewing files that changed from the base of the PR and between 68d7e11 and 718e735.

⛔ Files ignored due to path filters (1)
  • ghost/core/test/e2e-api/admin/__snapshots__/config.test.js.snap is excluded by !**/*.snap
📒 Files selected for processing (4)
  • ghost/core/core/frontend/services/llms/markdown.js
  • ghost/core/core/frontend/services/llms/service.js
  • ghost/core/core/frontend/services/routing/controllers/entry.js
  • ghost/core/test/unit/frontend/services/llms/service.test.js

@ErisDS ErisDS added the preview Deploy a PR preview environment label May 24, 2026
@Ghost-Slimer Ghost-Slimer temporarily deployed to pr-preview-28042 May 24, 2026 13:38 Destroyed
@ErisDS ErisDS added preview Deploy a PR preview environment and removed preview Deploy a PR preview environment labels May 24, 2026
@Ghost-Slimer Ghost-Slimer temporarily deployed to pr-preview-28042 May 24, 2026 15:23 Destroyed
@ErisDS ErisDS force-pushed the llms-txt-stateless branch from 648e236 to 9029929 Compare May 27, 2026 08:42
@ErisDS ErisDS added preview Deploy a PR preview environment and removed preview Deploy a PR preview environment labels May 27, 2026
@Ghost-Slimer Ghost-Slimer temporarily deployed to pr-preview-28042 May 27, 2026 09:22 Destroyed
@ErisDS ErisDS force-pushed the llms-txt-stateless branch from 322e648 to 757b6b3 Compare May 27, 2026 10:04
@ErisDS ErisDS added preview Deploy a PR preview environment and removed preview Deploy a PR preview environment labels May 27, 2026
@ErisDS ErisDS force-pushed the llms-txt-stateless branch from c1c7a23 to 9d55b2a Compare May 27, 2026 17:01
Contributor

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (2)
ghost/core/test/unit/frontend/services/routing/collection-router.test.js (2)

68-70: ⚡ Quick win

Verify the mounted handler, not just the route pattern.

The assertion only checks the route pattern (mountRouteSpy.args[3][0]), but doesn't verify the second argument (the handler). Other route assertions in this test (lines 58, 62, 66) check both the pattern and the controller. Since the markdown route uses an anonymous middleware wrapper, consider verifying that mountRouteSpy.args[3][1] is a function, or add a note explaining why handler verification is deferred to integration tests.

🧪 Proposed enhancement to verify the handler
 // markdown variant
 assert.equal(mountRouteSpy.args[3][0], '/:slug.md');
+assert.equal(typeof mountRouteSpy.args[3][1], 'function');
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ghost/core/test/unit/frontend/services/routing/collection-router.test.js`
around lines 68 - 70, The test currently only asserts the markdown route pattern
via mountRouteSpy.args[3][0] but omits verifying the mounted handler; update the
test around the markdown variant to also assert that mountRouteSpy.args[3][1] is
a function (i.e., the mounted handler is present and callable) — locate the
assertions using mountRouteSpy in collection-router.test.js and add a check for
the second argument being a function, or alternatively add a short comment
explaining that handler verification is covered by integration tests if you
intentionally choose not to assert it here.

117-119: ⚡ Quick win

Verify the mounted handler, not just the route pattern.

Same issue as the "default" test case: only the route pattern is verified, not the handler. Consider verifying that mountRouteSpy.args[3][1] is a function for consistency with the other route assertions in this test.

🧪 Proposed enhancement to verify the handler
 // markdown variant
 assert.equal(mountRouteSpy.args[3][0], '/blog/:year/:slug.md');
+assert.equal(typeof mountRouteSpy.args[3][1], 'function');
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ghost/core/test/unit/frontend/services/routing/collection-router.test.js`
around lines 117 - 119, The test currently only asserts the route pattern
(mountRouteSpy.args[3][0] === '/blog/:year/:slug.md') but should also verify the
mounted handler; update the assertion to check that mountRouteSpy.args[3][1] is
a function (similar to other assertions in this test) so you confirm the handler
was mounted correctly — locate the assertion using mountRouteSpy.args[3][0] and
add a corresponding check that mountRouteSpy.args[3][1] is of type function.
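
The handler check proposed in both nitpicks could be factored into a small helper. `assertMountedRoute` is a hypothetical name, and the spy is only assumed to expose `args` as an array of per-call argument arrays, which is how the `mountRouteSpy.args[3][0]` expressions above read it.

```javascript
// Hypothetical test helper: check both the pattern and the handler of one
// recorded mountRoute call. `spy.args[i]` is the argument list of call i.
function assertMountedRoute(spy, callIndex, expectedPattern) {
    const [pattern, handler] = spy.args[callIndex];
    if (pattern !== expectedPattern) {
        throw new Error(`expected route ${expectedPattern}, got ${pattern}`);
    }
    // The markdown route uses an anonymous wrapper, so only its type is checked.
    if (typeof handler !== 'function') {
        throw new Error(`expected a function handler for ${expectedPattern}`);
    }
}
```

In the existing tests this would be called as `assertMountedRoute(mountRouteSpy, 3, '/:slug.md')`, replacing the pattern-only assertion.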

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 66a34aca-8096-4558-a670-20d1224543bf

📥 Commits

Reviewing files that changed from the base of the PR and between c1c7a23 and 9d55b2a.

⛔ Files ignored due to path filters (2)
  • ghost/core/test/e2e-api/admin/__snapshots__/config.test.js.snap is excluded by !**/*.snap
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (18)
  • apps/admin-x-settings/src/components/settings/advanced/labs/private-features.tsx
  • ghost/core/core/frontend/services/llms/handler.js
  • ghost/core/core/frontend/services/llms/markdown.js
  • ghost/core/core/frontend/services/llms/service.js
  • ghost/core/core/frontend/services/routing/collection-router.js
  • ghost/core/core/frontend/services/routing/controllers/entry.js
  • ghost/core/core/frontend/services/routing/static-pages-router.js
  • ghost/core/core/frontend/web/middleware/llms-discovery.js
  • ghost/core/core/frontend/web/middleware/static-theme.js
  • ghost/core/core/frontend/web/site.js
  • ghost/core/core/server/web/shared/middleware/pretty-urls.js
  • ghost/core/core/shared/config/defaults.json
  • ghost/core/core/shared/labs.js
  • ghost/core/package.json
  • ghost/core/test/unit/frontend/services/llms/markdown.test.js
  • ghost/core/test/unit/frontend/services/llms/service.test.js
  • ghost/core/test/unit/frontend/services/routing/collection-router.test.js
  • pnpm-workspace.yaml
🚧 Files skipped from review as they are similar to previous changes (16)
  • ghost/core/core/frontend/web/middleware/static-theme.js
  • ghost/core/core/frontend/services/routing/static-pages-router.js
  • pnpm-workspace.yaml
  • apps/admin-x-settings/src/components/settings/advanced/labs/private-features.tsx
  • ghost/core/package.json
  • ghost/core/core/frontend/web/middleware/llms-discovery.js
  • ghost/core/core/server/web/shared/middleware/pretty-urls.js
  • ghost/core/core/frontend/web/site.js
  • ghost/core/core/shared/config/defaults.json
  • ghost/core/core/frontend/services/routing/controllers/entry.js
  • ghost/core/test/unit/frontend/services/llms/service.test.js
  • ghost/core/core/frontend/services/routing/collection-router.js
  • ghost/core/core/frontend/services/llms/handler.js
  • ghost/core/core/frontend/services/llms/service.js
  • ghost/core/test/unit/frontend/services/llms/markdown.test.js
  • ghost/core/core/frontend/services/llms/markdown.js

@ErisDS ErisDS force-pushed the llms-txt-stateless branch from 9d55b2a to 16f7cad Compare May 28, 2026 06:11
ErisDS and others added 6 commits May 28, 2026 11:09
Ghost is moving toward stateless, compute-on-request architecture.
This implements llms.txt/llms-full.txt generation, per-entry .md
export, Accept: text/markdown content negotiation, and discovery
headers using that approach — no in-memory cache, no event listeners,
full dependency injection via factory functions.

Gated behind a config flag ("llms": false by default) so the feature
can be tested before broad rollout. The llms_enabled setting (already
in main) acts as the user-facing toggle within an enabled deployment.

Key design decisions:
- Factory functions (createLlmsService, createLlmsHandler,
  createLlmsDiscovery) receive all dependencies explicitly
- Uses urlServiceFacade for lazy routing compatibility
- Paginated DB queries for llms-full.txt (100/batch, 5 MiB budget)
- Index queries exclude html column to reduce memory
- Cache-Control headers on all responses for CDN/proxy caching
- pretty-urls extension bypass scoped to .md and .txt only
- markdown.js is fully pure (no Ghost singleton imports)

Ref #27984

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
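
The batching and byte budget described in the commit above could look roughly like the following. This is a sketch under stated assumptions, not the code in `service.js`: `createFullTxtBuilder`, `fetchPage`, and `renderEntry` are hypothetical names, and stopping at the first entry that would exceed the budget is an assumed policy.

```javascript
// Hypothetical factory: every dependency, including the budget, is injected.
function createFullTxtBuilder({fetchPage, renderEntry, pageSize = 100, budgetBytes = 5 * 1024 * 1024}) {
    return async function buildFullTxt() {
        const chunks = [];
        let usedBytes = 0;
        let truncated = false;

        for (let page = 1; !truncated; page += 1) {
            const entries = await fetchPage({page, limit: pageSize});

            for (const entry of entries) {
                const chunk = renderEntry(entry);
                const size = Buffer.byteLength(chunk, 'utf8');
                // Assumed policy: stop before the budget would be exceeded.
                if (usedBytes + size > budgetBytes) {
                    truncated = true;
                    break;
                }
                chunks.push(chunk);
                usedBytes += size;
            }

            // A short page means the data source is exhausted.
            if (entries.length < pageSize) {
                break;
            }
        }

        return {body: chunks.join(''), truncated};
    };
}
```

Because `budgetBytes` is a parameter, a test can pass a budget of a few bytes rather than building megabytes of fixture data.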
A private labs flag (llmsTxt) is more flexible than a config flag
because it can be toggled per-site via the API without a server
restart, and support can enable it for individual customers.

Routes are now always mounted; isEnabled() checks the labs flag at
request time alongside the llms_enabled setting and is_private.
This fits the stateless architecture — zero boot-time gating.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
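
A rough sketch of the request-time gate this commit describes. The factory name `createIsEnabled` and the shape of the injected `labs` and `settingsCache` objects are assumptions made for illustration; the real check lives in the llms service.

```javascript
// Hypothetical factory: nothing is read at boot, every call re-checks current state.
function createIsEnabled({labs, settingsCache}) {
    return function isEnabled() {
        return Boolean(labs.isSet('llmsTxt'))
            && Boolean(settingsCache.get('llms_enabled'))
            && !settingsCache.get('is_private');
    };
}

// In-memory stand-ins for the labs service and settings cache.
const state = {
    labs: {llmsTxt: false},
    settings: {llms_enabled: true, is_private: false}
};

const isEnabled = createIsEnabled({
    labs: {isSet: flag => state.labs[flag] === true},
    settingsCache: {get: key => state.settings[key]}
});

const beforeToggle = isEnabled(); // labs flag off
state.labs.llmsTxt = true;        // toggled at runtime, no restart
const afterToggle = isEnabled();
```

Since the routes are always mounted, a handler would call `isEnabled()` first and fall through to the next middleware when it returns `false`.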
The Config API acceptance test snapshots the full labs object.
Adding the llmsTxt private feature flag requires updating
the snapshot to include the new key.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Handle Accept header case-insensitively (RFC 7231)
- Add Vary: Accept to content-negotiated markdown responses
- Remove inconsistent page sorting in fetchFullEntries (DB order
  via 'id asc' is already stable; index uses fetchIndexEntries
  which sorts all pages by URL)
- Make fullTxtBudget injectable to avoid 5MB allocation in tests
  that was causing OOM on CI runners

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
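
The first two bullets of the commit above can be illustrated with two small pure helpers. Both names (`acceptsMarkdown`, `withVaryAccept`) are hypothetical, and the matcher is deliberately simplified: it ignores q-values and wildcards, which a full negotiation implementation would honour.

```javascript
// True when the Accept header lists text/markdown, compared case-insensitively.
// Simplification: parameters such as q-values are dropped, not evaluated.
function acceptsMarkdown(acceptHeader) {
    if (typeof acceptHeader !== 'string') {
        return false;
    }
    return acceptHeader
        .split(',')
        .map(part => part.split(';')[0].trim().toLowerCase())
        .includes('text/markdown');
}

// Adds Accept to an existing Vary value without duplicating or clobbering it.
function withVaryAccept(existingVary) {
    const fields = (existingVary || '')
        .split(',')
        .map(field => field.trim())
        .filter(Boolean);

    if (fields.includes('*')) {
        return '*';
    }
    if (!fields.some(field => field.toLowerCase() === 'accept')) {
        fields.push('Accept');
    }
    return fields.join(', ');
}
```

Without `Vary: Accept`, a shared cache could store the markdown variant and serve it to a browser that asked for HTML at the same URL.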
- Register /:slug.md routes on CollectionRouter and StaticPagesRouter
  instead of a catch-all /.+\.md$/ regex on the site app
- Entry controller handles .md requests using the already-loaded entry
  from entryLookup, eliminating the duplicate fetchPublicEntry call
- Content negotiation (Accept: text/markdown) uses the same shared
  serveMarkdown helper with DI via req.app.get('llmsService')
- Remove mountMarkdownRoutes and parallel URL resolution path from
  handler.js
- Expose llmsService on siteApp for consistent dependency injection

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ErisDS and others added 5 commits May 28, 2026 11:09
Errors from llms.txt/llms-full.txt routes now get tagged with [llms]
in logs and llms.serve_index/llms.serve_full in Sentry before being
forwarded to Express error handling. Centralizes try/catch in serveLlms
instead of duplicating across four route handlers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
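
One way to centralize the error handling described above is a wrapper that turns a render function into an Express-style handler. This is a sketch only: `createServeLlms`, the injected `logging` and `sentry` objects, and the exact log message are assumptions; only the `[llms]` prefix and the `llms.serve_index` / `llms.serve_full` operation names come from the commit message.

```javascript
// Hypothetical factory: logging and error reporting are injected, not imported.
function createServeLlms({logging, sentry}) {
    // `operation` is the tag reported with the error, e.g. 'llms.serve_index'.
    return function serveLlms(operation, render) {
        return async function handler(req, res, next) {
            try {
                const body = await render(req);
                res.send(body);
            } catch (err) {
                logging.error(`[llms] ${operation} failed: ${err.message}`);
                sentry.captureException(err, {tags: {operation}});
                // Forward to the framework's error handling after reporting.
                next(err);
            }
        };
    };
}
```

Each route would then be declared as `serveLlms('llms.serve_index', renderIndex)`, where `renderIndex` is a placeholder for whatever produces the response body, so no route repeats its own try/catch.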
These flags were promoted to GA on main while our branch still listed
them as private, causing duplicate flag detection to fail in CI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The URL service needs tags/authors to substitute :primary_tag and
:primary_author in permalink patterns. Without withRelated, those
placeholders resolve to 'all'/'undefined', producing wrong URLs.
Also added published_at to index query for date-based permalinks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
An AI agent requesting the .md URL of a members-only post now gets
a 403 with an explanation instead of a silent 404, so it can
distinguish restricted content from missing content.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
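
The distinction this commit draws between missing and restricted content could be expressed as a small decision function. The name `resolveMarkdownResponse`, the `hasAccess` flag, and the message text are all hypothetical; how Ghost actually determines access is not shown in this PR description.

```javascript
// Decides the status for a per-entry .md request.
// `entry` is the looked-up post or page (or null); `hasAccess` is an assumed
// boolean saying whether the requester may read the full content.
function resolveMarkdownResponse({entry, hasAccess}) {
    if (!entry) {
        // Nothing exists at this URL.
        return {status: 404, message: 'Not found'};
    }
    if (!hasAccess) {
        // The entry exists but is restricted: say so instead of hiding it.
        return {
            status: 403,
            message: 'This content is restricted to members.'
        };
    }
    return {status: 200, message: null};
}
```

An agent receiving the 403 can record the URL as valid but gated, whereas a 404 would lead it to drop the link entirely.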
Added all tags (not just primary) and entry type (post/page) to the
per-entry .md output so AI agents can classify content by topic and
distinguish evergreen pages from time-bound posts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ErisDS ErisDS force-pushed the llms-txt-stateless branch from 16f7cad to 9cd67af Compare May 28, 2026 10:09
@coderabbitai
Contributor

coderabbitai Bot commented May 28, 2026

Actionable comments posted: 0

@ErisDS ErisDS merged commit 1d7179f into main May 28, 2026
54 checks passed
@ErisDS ErisDS deleted the llms-txt-stateless branch May 28, 2026 13:09
Labels

preview Deploy a PR preview environment

2 participants