Add llms.txt exports and AI crawler toggle by ErisDS · Pull Request #27984 · TryGhost/Ghost

ErisDS · 2026-05-20T09:44:30Z

Summary

Add llms.txt and llms-full.txt generation plus per-entry Markdown export for public posts and pages
Add an llms_enabled site setting and expose it in Admin under Meta data > Search
Redirect llms-specific .txt and .md routes back to canonical HTML routes when the setting is disabled

Rebased from #27400 onto latest main (resolved conflicts in general-settings.tsx, entry.test.js, integrity.test.js).

Known CI fixes still needed

Must fix (CI blockers)

Nice to have (code quality)

collapseWhitespace in markdownFromHtml — markdown.js:68-76 wraps nhm.translate() with collapseWhitespace() which replaces all \s+ with a single space, flattening markdown block structure (lists, code blocks, paragraphs) into one line. Remove the wrapper and only normalize excessive blank lines
getResourcePathFromMarkdownPath('/index.md') returns /index/ — should return / for root-route reversibility since getMarkdownPath('/') produces /index.md
#buildLlmsFullTxt preloads all post bodies — fetches all pages and posts with limit: 'all' including html/plaintext columns in parallel before the 5 MiB bounding logic runs. On large sites this is a lot of memory. Consider sequential fetching with a budget (requires larger refactor)
Truncation footer can exceed 5 MiB cap — #appendBoundedSection stops at FIVE_MIB but the truncation note is appended afterward. Reserve space for the footer
Entry controller next() on null markdown — entry.js:96-98 calls next() when fetchPublicEntry returns null during content negotiation, which skips the resolved entry and can 404. Should fall through to normal HTML rendering
Link header dedup misses comma-delimited form — llms-discovery.js:4-15 appendHeaderValue checks values.includes(newValue) on the whole string, not individual comma-separated entries
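
As a rough illustration of the first item in this list, the sketch below contrasts a global `\s+` collapse with a newline-preserving cleanup. Both helper names are hypothetical stand-ins rather than the code in markdown.js; only the regular expressions follow what is described in this thread.

```javascript
// Hypothetical stand-in for the wrapper described above: every whitespace
// run, newlines included, becomes a single space.
function collapseAll(text) {
    return text.replace(/\s+/g, ' ').trim();
}

// Hypothetical alternative: keep newlines, drop trailing spaces and tabs,
// and squeeze three or more newlines down to a single blank line.
function normalizeBlankLines(text) {
    return text
        .replace(/\r\n/g, '\n')
        .replace(/[ \t]+\n/g, '\n')
        .replace(/\n{3,}/g, '\n\n')
        .trim();
}

const sample = '# Title\n\n\n\n- one  \n- two\n';

console.log(JSON.stringify(collapseAll(sample)));
// "# Title - one - two"
console.log(JSON.stringify(normalizeBlankLines(sample)));
// "# Title\n\n- one\n- two"
```

The first result is a single line, so the heading and list items are no longer recognizable as Markdown blocks; the second keeps one blank line after the heading and one item per line.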

github-actions · 2026-05-20T09:44:41Z

coderabbitai · 2026-05-20T09:44:48Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Walkthrough

Adds LLM export/discovery: a new site setting (llms_enabled), admin toggle and fixtures, LlmsService producing cached llms.txt and llms-full.txt with size bounding, markdown rendering utilities, HTTP handlers for /llms.txt, /llms-full.txt and .md entry routes, discovery middleware, frontend wiring (pretty-URLs skip, static-theme fallthrough, site middleware and entry controller negotiation), defaults/config, migration, dependency addition, and comprehensive unit/e2e/acceptance tests.

Suggested reviewers

kevinansfield
vershwal

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 7.69% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and concisely summarizes the main change: adding llms.txt exports and an AI crawler toggle feature for the Ghost platform.
Description check	✅ Passed	The description is well-detailed and directly related to the changeset, explaining the new llms.txt/llms-full.txt generation, the llms_enabled setting, and redirect behavior when disabled.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

coderabbitai

Actionable comments posted: 8

🧹 Nitpick comments (1)

ghost/core/core/frontend/services/llms/service.js (1)

88-91: 🏗️ Heavy lift

Avoid parallel full-body loading of all pages and posts.

Fetching both collections concurrently with limit: 'all' and body fields raises peak memory pressure on large sites.

Refactor direction

-        const [pages, posts] = await Promise.all([
-            this.#fetchPublicEntries('page'),
-            this.#fetchPublicEntries('post')
-        ]);
+        const pages = await this.#fetchPublicEntries('page');
+        const posts = await this.#fetchPublicEntries('post');

Then consider chunked iteration/pagination inside #fetchPublicEntries so entries are appended progressively within the byte budget.
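
One possible shape for that chunked iteration is sketched below. It assumes a paged fetch callback, `fetchPage(page, limit)`, which is a hypothetical name and not an existing Ghost API; `iterateEntries`, `buildBounded` and `MAX_BYTES` are likewise illustrative only.

```javascript
// Hypothetical cap; the thread describes a 5 MiB limit.
const MAX_BYTES = 5 * 1024 * 1024;

// Yields entries one page at a time. `fetchPage(page, limit)` is an assumed
// callback that resolves to an array and returns [] once nothing is left.
async function* iterateEntries(fetchPage, limit = 50) {
    for (let page = 1; ; page += 1) {
        const entries = await fetchPage(page, limit);
        if (entries.length === 0) {
            return;
        }
        for (const entry of entries) {
            yield entry;
        }
    }
}

// Appends rendered entries until the next one would exceed the byte budget.
async function buildBounded(fetchPage, render, maxBytes = MAX_BYTES) {
    let output = '';
    let usedBytes = 0;
    let wasTruncated = false;

    for await (const entry of iterateEntries(fetchPage)) {
        const chunk = render(entry);
        const chunkBytes = Buffer.byteLength(chunk, 'utf8');
        if (usedBytes + chunkBytes > maxBytes) {
            wasTruncated = true;
            break; // ends the generator, so later pages are never requested
        }
        output += chunk;
        usedBytes += chunkBytes;
    }

    return {output, wasTruncated};
}
```

With this shape, at most one page of bodies is held in memory at a time, and fetching stops as soon as the budget is exhausted.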

Also applies to: 111-114, 234-239

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ghost/core/core/frontend/services/llms/service.js` around lines 88 - 91, The
current code loads all pages and posts in parallel with full bodies, causing
peak memory pressure; update the two-call pattern that uses
this.#fetchPublicEntries('page') and this.#fetchPublicEntries('post') so you do
not fetch both collections fully in parallel and avoid using limit: 'all' with
body fields. Modify `#fetchPublicEntries` to support chunked pagination/iteration
(e.g., page/limit or cursor-based) and have the callers fetch one collection at
a time and append entries progressively within a configurable byte/item budget;
apply the same change to the other call sites that fetch pages and posts
concurrently (the other invocations of this.#fetchPublicEntries) so all
consumers stream or paginate entries instead of loading full bodies into memory
at once.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ghost/core/core/frontend/services/llms/markdown.js`:
- Around line 110-112: The function getLlmsIndexUrl has an unused parameter
entry causing a lint error; remove the unused parameter from its declaration
(change function getLlmsIndexUrl(entry) to function getLlmsIndexUrl()) and
update any call sites that pass an argument to call it without parameters; also
apply the same removal to the duplicate occurrence at line 141 so that neither
signature includes the unused entry parameter.
- Around line 43-50: The function getResourcePathFromMarkdownPath currently maps
'/index.md' to '/index/' instead of '/'—modify it so after stripping the '.md'
suffix (in getResourcePathFromMarkdownPath) you check if the resulting
resourcePath is '' or '/index' and return '/' (with trailing slash) in those
cases; otherwise keep the existing behavior of ensuring the returned path ends
with a '/' by returning resourcePath + '/' when needed.
- Around line 68-76: The function markdownFromHtml currently calls
collapseWhitespace on nhm.translate(html) which flattens line breaks and
destroys markdown structure; instead, call nhm.translate(html || '') directly,
normalize CRLF to \n and then collapse only horizontal whitespace within each
line (not across newlines) or remove collapseWhitespace entirely so
headings/lists/paragraph breaks are preserved; keep the existing
.replace(/\n{3,}/g, '\n\n').trim() step. Update markdownFromHtml to use
nhm.translate and an in-line normalization that preserves \n (or a helper that
collapses spaces per-line) rather than collapseWhitespace.

In `@ghost/core/core/frontend/services/llms/service.js`:
- Around line 135-137: The truncation footer is appended after the 5 MiB cap
check which lets the final output exceed the declared limit; update the assembly
logic in service.js (the code that sets output and checks wasTruncated) to
reserve space for the footer before enforcing the 5 MiB cap — i.e., compute the
footer length (the string starting with '\n_Truncated after 5 MiB..._\n'),
subtract that from the max-bytes limit when truncating/assembling output, and
only append the footer if it fits within the reserved space (or trim the main
output to make room) so the final payload never exceeds the declared 5 MiB cap.
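
A minimal sketch of reserving footer space, assuming the sections are already rendered strings. `assembleWithFooter` and the footer wording are hypothetical; the real logic lives in #appendBoundedSection and may be structured differently.

```javascript
const FIVE_MIB = 5 * 1024 * 1024;
// Hypothetical footer text; the actual wording is defined in service.js.
const TRUNCATION_FOOTER = '\n_Truncated after 5 MiB._\n';

// Joins sections so the result, footer included, never exceeds maxBytes.
function assembleWithFooter(sections, maxBytes = FIVE_MIB, footer = TRUNCATION_FOOTER) {
    const sizeOf = text => Buffer.byteLength(text, 'utf8');
    const totalBytes = sections.reduce((sum, section) => sum + sizeOf(section), 0);

    if (totalBytes <= maxBytes) {
        return sections.join(''); // nothing to truncate, no footer needed
    }

    // Reserve the footer's bytes up front instead of appending it afterwards.
    const footerBytes = sizeOf(footer);
    const budget = Math.max(0, maxBytes - footerBytes);

    let output = '';
    let usedBytes = 0;
    for (const section of sections) {
        const sectionBytes = sizeOf(section);
        if (usedBytes + sectionBytes > budget) {
            break;
        }
        output += section;
        usedBytes += sectionBytes;
    }

    // Only append the footer when it fits inside the cap at all.
    return footerBytes <= maxBytes ? output + footer : output;
}
```

The sketch measures UTF-8 bytes rather than string length, since multi-byte characters would otherwise let the payload drift past the cap.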

In `@ghost/core/core/frontend/services/routing/controllers/entry.js`:
- Around line 74-76: The current early return when markdownEntry is falsy (the
"if (!markdownEntry) { return next(); }" branch) causes a 404 during content
negotiation; remove the return-next short-circuit so the controller falls back
to the normal entry rendering path (i.e., stop calling next() here and allow the
subsequent standard entry rendering logic to run), or explicitly invoke the
standard entry renderer instead of next() so missing markdown simply uses the
existing entry rendering flow.

In `@ghost/core/core/frontend/web/middleware/llms-discovery.js`:
- Around line 4-16: The appendHeaderValue function treats a comma-delimited
header string as a single token, allowing duplicate relations to be appended;
update appendHeaderValue to split existingValue (and newValue if it may contain
commas) by ',' into trimmed tokens, deduplicate by checking tokens membership,
and then return a comma+space joined string of the merged tokens; keep the
function name appendHeaderValue and ensure it handles both string and array
existingValue inputs and preserves order while avoiding duplicates.
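
A sketch of what a comma-aware merge could look like. It keeps the appendHeaderValue name from the comment above, but the body is an assumption rather than the PR's implementation, and it deliberately splits on every comma, which is only safe when no URI reference or quoted parameter in the header contains a comma.

```javascript
// Sketch only: merges header values token by token, preserving order.
// Assumes commas never appear inside <...> targets or quoted parameters.
function appendHeaderValue(existingValue, newValue) {
    const toTokens = (value) => {
        if (!value) {
            return [];
        }
        const parts = Array.isArray(value) ? value : [value];
        return parts
            .flatMap(part => String(part).split(','))
            .map(token => token.trim())
            .filter(Boolean);
    };

    const merged = [];
    for (const token of [...toTokens(existingValue), ...toTokens(newValue)]) {
        if (!merged.includes(token)) {
            merged.push(token);
        }
    }

    return merged.join(', ');
}

const existing = '</a>; rel="alternate", </b>; rel="alternate"';
console.log(appendHeaderValue(existing, '</a>; rel="alternate"'));
// </a>; rel="alternate", </b>; rel="alternate"
```

If Link targets with embedded commas are possible on real sites, a proper Link header parser would be a safer choice than string splitting.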

In `@ghost/core/test/unit/frontend/services/routing/controllers/entry.test.js`:
- Around line 210-214: The test references a nonexistent symbol
routerManagerGetResourceByIdStub which causes a ReferenceError; remove that
stale call or replace it with a correctly declared stub for
routerManager.getResourceById (e.g., use the existing routerManager stub used
elsewhere in the test or create a sinon stub named
routerManagerGetResourceByIdStub that stubs routerManager.getResourceById) so
the test no longer references an undeclared identifier and can proceed to
assertions.

In `@ghost/core/test/unit/frontend/web/middleware/llms-discovery.test.js`:
- Line 15: The test currently stubs settingsCache.get to always return false
which forces llms_enabled to false and short-circuits discovery header checks;
update the stub in the public-site test so settingsCache.get returns values per
key (e.g. return true for 'llms_enabled' and appropriate values for other keys)
by using a callsFake/implementation that switches on the key, ensuring
llms_enabled is true so the discovery headers code path is exercised (refer to
settingsCache.get and the llms_enabled check in the llms-discovery test).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bb3038e0-8514-4fde-bfd7-30a0d0bb1b1f

📥 Commits

Reviewing files that changed from the base of the PR and between 255f180 and 0dc9307.

⛔ Files ignored due to path filters (1)

pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml

📒 Files selected for processing (31)

apps/admin-x-framework/src/test/msw-utils.ts
apps/admin-x-framework/src/test/responses/settings.json
apps/admin-x-settings/src/components/settings/general/general-settings.tsx
apps/admin-x-settings/src/components/settings/general/seo-meta.tsx
apps/admin-x-settings/test/acceptance/general/seometa.test.ts
ghost/core/core/frontend/services/llms/handler.js
ghost/core/core/frontend/services/llms/markdown.js
ghost/core/core/frontend/services/llms/service.js
ghost/core/core/frontend/services/routing/controllers/entry.js
ghost/core/core/frontend/web/middleware/index.js
ghost/core/core/frontend/web/middleware/llms-discovery.js
ghost/core/core/frontend/web/middleware/static-theme.js
ghost/core/core/frontend/web/site.js
ghost/core/core/server/api/endpoints/utils/serializers/input/settings.js
ghost/core/core/server/api/endpoints/utils/serializers/input/utils/settings-key-group-mapper.js
ghost/core/core/server/api/endpoints/utils/serializers/input/utils/settings-key-type-mapper.js
ghost/core/core/server/data/migrations/versions/6.31/2026-04-14-22-07-44-add-llms-setting.js
ghost/core/core/server/data/schema/default-settings/default-settings.json
ghost/core/core/server/web/shared/middleware/pretty-urls.js
ghost/core/core/shared/config/defaults.json
ghost/core/core/shared/settings-cache/cache-manager.js
ghost/core/package.json
ghost/core/test/e2e-frontend/llms.test.js
ghost/core/test/unit/frontend/services/llms/handler.test.js
ghost/core/test/unit/frontend/services/llms/service.test.js
ghost/core/test/unit/frontend/services/routing/controllers/entry.test.js
ghost/core/test/unit/frontend/web/middleware/llms-discovery.test.js
ghost/core/test/unit/frontend/web/middleware/static-theme.test.js
ghost/core/test/unit/server/data/schema/integrity.test.js
ghost/core/test/unit/server/web/shared/middleware/pretty-urls.test.js
ghost/core/test/utils/fixtures/default-settings.json

coderabbitai

♻️ Duplicate comments (1)

ghost/core/test/unit/frontend/web/middleware/llms-discovery.test.js (1)

15-21: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Stub llms_enabled explicitly in this test path.

This stub still leaves llms_enabled unset (null), which can short-circuit discovery header behavior and make the test miss the intended code path.

Suggested fix

 sinon.stub(settingsCache, 'get').callsFake((key) => {
     if (key === 'is_private') {
         return false;
     }
+    if (key === 'llms_enabled') {
+        return true;
+    }

     return null;
 });

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ghost/core/test/unit/frontend/web/middleware/llms-discovery.test.js` around
lines 15 - 21, The settingsCache.get stub used in the test leaves 'llms_enabled'
as null, which can short-circuit the discovery header path; update the
sinon.stub(settingsCache, 'get').callsFake callback to explicitly return the
intended test value for 'llms_enabled' (e.g., return true or false depending on
the test scenario) alongside the existing 'is_private' handling so the llms
discovery code path is exercised (refer to the stub in the test where
settingsCache.get is defined).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c098ceb3-3568-4d22-a917-5092839a6a6e

📥 Commits

Reviewing files that changed from the base of the PR and between 0dc9307 and d335381.

⛔ Files ignored due to path filters (1)

ghost/core/test/e2e-api/admin/__snapshots__/settings.test.js.snap is excluded by !**/*.snap

📒 Files selected for processing (6)

apps/admin-x-settings/test/acceptance/search.test.ts
ghost/core/core/frontend/services/llms/markdown.js
ghost/core/test/e2e-api/admin/settings.test.js
ghost/core/test/unit/frontend/services/routing/controllers/entry.test.js
ghost/core/test/unit/frontend/web/middleware/llms-discovery.test.js
ghost/core/test/unit/server/data/exporter/index.test.js

💤 Files with no reviewable changes (1)

ghost/core/test/unit/frontend/services/routing/controllers/entry.test.js

✅ Files skipped from review due to trivial changes (2)

ghost/core/test/e2e-api/admin/settings.test.js
ghost/core/test/unit/server/data/exporter/index.test.js

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (3)

ghost/core/core/frontend/services/llms/markdown.js (2)

48-49: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Map /index.md back to root route.

Lines 48-49 currently resolve /index.md to /index/, which breaks reversible routing for the root page.

Proposed fix

 function getResourcePathFromMarkdownPath(pathname) {
     if (!pathname || !pathname.endsWith('.md')) {
         return null;
     }
 
+    if (pathname === '/index.md') {
+        return '/';
+    }
+
     const resourcePath = pathname.slice(0, -3) || '/';
     return resourcePath.endsWith('/') ? resourcePath : `${resourcePath}/`;
 }
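
The round trip this fix is meant to restore can be checked with a small sketch. The reverse mapping follows the proposed fix above; the forward getMarkdownPath body is an assumption, since the thread only states that getMarkdownPath('/') produces /index.md.

```javascript
// Assumed forward mapping: '/' -> '/index.md', '/about/' -> '/about.md'.
function getMarkdownPath(resourcePath) {
    if (resourcePath === '/') {
        return '/index.md';
    }
    return `${resourcePath.replace(/\/$/, '')}.md`;
}

// Reverse mapping with the '/index.md' special case from the proposed fix.
function getResourcePathFromMarkdownPath(pathname) {
    if (!pathname || !pathname.endsWith('.md')) {
        return null;
    }

    if (pathname === '/index.md') {
        return '/';
    }

    const resourcePath = pathname.slice(0, -3) || '/';
    return resourcePath.endsWith('/') ? resourcePath : `${resourcePath}/`;
}

for (const path of ['/', '/about/', '/blog/hello-world/']) {
    const roundTripped = getResourcePathFromMarkdownPath(getMarkdownPath(path));
    console.log(path, '->', getMarkdownPath(path), '->', roundTripped);
}
```

Under this assumed forward mapping, a page whose slug is literally index would share /index.md with the root route, so the special case trades that edge case for a reversible homepage.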

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ghost/core/core/frontend/services/llms/markdown.js` around lines 48 - 49, The
current logic computes resourcePath from pathname.slice(0, -3) and then forces a
trailing slash, which turns '/index.md' into '/index/'; change the
post-processing to map the special case '/index' back to '/' (i.e. if
resourcePath === '/index' return '/'), otherwise keep the existing
trailing-slash behavior; update the code around the resourcePath computation and
the return expression that currently uses resourcePath.endsWith('/') to handle
this '/index' special case.

69-75: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve markdown block structure during HTML conversion.

Line 69 collapses all whitespace (including newlines), flattening headings/lists/paragraphs in generated markdown.

Proposed fix

 function markdownFromHtml(html) {
-    const markdown = collapseWhitespace(nhm.translate(html || ''));
+    const markdown = (nhm.translate(html || '') || '')
+        .replace(/\r\n/g, '\n')
+        .trim();
 
     if (!markdown) {
         return null;
     }
 
-    return markdown.replace(/\n{3,}/g, '\n\n').trim();
+    return markdown
+        .replace(/[ \t]+\n/g, '\n')
+        .replace(/\n{3,}/g, '\n\n')
+        .trim();
 }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ghost/core/core/frontend/services/llms/markdown.js` around lines 69 - 75, The
current code calls collapseWhitespace(nhm.translate(html || '')) which removes
all newlines and flattens block-level Markdown (headings, lists, paragraphs);
change the call so whitespace collapsing preserves newline boundaries (e.g., use
a variant/option of collapseWhitespace that does not remove \n or introduce a
helper that only collapses horizontal whitespace but keeps \n), keeping the rest
of the flow (variable markdown, null check, and the final replace(/\n{3,}/g,
'\n\n').trim()). Update the usage of collapseWhitespace in this function to a
newline-preserving variant so block structure produced by nhm.translate is
retained.

ghost/core/core/frontend/services/routing/controllers/entry.js (1)

74-76: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fallback to normal entry rendering when markdown fetch misses.

Returning next() at line 75 can turn a valid entry request into a 404 during content negotiation. Use the existing HTML renderer fallback instead.

Proposed fix

                 return llmsService.fetchPublicEntry(res.routerOptions.resourceType, entry.id)
                     .then((markdownEntry) => {
                         if (!markdownEntry) {
-                            return next();
+                            return renderer.renderEntry(req, res)(entry);
                         }
 
                         res.type(markdownContentType);
                         res.send(renderEntryMarkdown(markdownEntry));
                     });

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ghost/core/core/frontend/services/routing/controllers/entry.js` around lines
74 - 76, The current code returns next() when markdownEntry is falsy, which
causes valid requests to be treated as 404s; instead, remove the early return
and invoke the existing HTML renderer fallback for the same entry path (i.e.,
when markdownEntry is not found call the project's HTML renderer rather than
next()). Locate the check using markdownEntry and replace the "return next();"
with a call into the existing HTML rendering path used elsewhere in this
controller (the HTML renderer function/handler already used for non-markdown
responses), passing the same req, res, and next so content negotiation falls
back to HTML.

🧹 Nitpick comments (1)

ghost/core/core/frontend/services/llms/service.js (1)

111-114: 🏗️ Heavy lift

Avoid preloading posts when pages already consume the full export budget.

#buildLlmsFullTxt() loads pages and posts in parallel, but posts are unnecessary if the pages section already truncates. Deferring post fetch until after the pages budget check would cut heavy work on large sites.

Proposed direction

-        const [pages, posts] = await Promise.all([
-            this.#fetchPublicEntries('page'),
-            this.#fetchPublicEntries('post')
-        ]);
+        const pages = await this.#fetchPublicEntries('page');

         ...
         if (!wasTruncated) {
+            const posts = await this.#fetchPublicEntries('post');
             const postSection = this.#appendBoundedSection(output, 'Posts', posts, entry => this.#buildFullEntry(entry));
             output = postSection.output;
             wasTruncated = postSection.wasTruncated;
         }

Also applies to: 129-133

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ghost/core/core/frontend/services/llms/service.js` around lines 111 - 114,
The current parallel fetch (const [pages, posts] = await Promise.all([...])) in
`#buildLlmsFullTxt` triggers fetching posts even when pages already hit the
export/truncation budget; change the logic to fetch pages first via
this.#fetchPublicEntries('page'), inspect the pages result for truncation/budget
exhaustion (the same check used later in the function), and only call
this.#fetchPublicEntries('post') if pages did not already consume the full
export budget; apply the same sequential-fetch fix to the other occurrence noted
(the block around lines 129–133) so posts are only fetched when needed.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ghost/core/test/unit/frontend/services/llms/service.test.js`:
- Around line 125-130: The truncation test needs an explicit byte-size
assertion: after calling service.getLlmsFullTxt() and existing content checks
(llmsFullTxt, /Truncated after 5 MiB/), add an assertion that
Buffer.byteLength(llmsFullTxt, 'utf8') is less than or equal to 5 * 1024 * 1024
(5 MiB) to prevent regressions; update the test that uses llmsFullTxt in
ghost/core/test/unit/frontend/services/llms/service.test.js to include this
byte-length check so the test fails if the payload exceeds the hard cap.
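
The byte-length check could be factored into a small helper along these lines. `assertWithinCap` is a hypothetical name, and in the real test the string would come from service.getLlmsFullTxt() rather than a literal.

```javascript
const FIVE_MIB = 5 * 1024 * 1024;

// Throws when the UTF-8 size of `text` exceeds `maxBytes`; returns the size otherwise.
function assertWithinCap(text, maxBytes = FIVE_MIB) {
    const bytes = Buffer.byteLength(text, 'utf8');
    if (bytes > maxBytes) {
        throw new Error(`payload is ${bytes} bytes, cap is ${maxBytes}`);
    }
    return bytes;
}

// 'é' is two bytes in UTF-8, so string length alone would undercount.
const sampleText = 'é'.repeat(4);
console.log(sampleText.length); // 4
console.log(assertWithinCap(sampleText, 8)); // 8
```

Measuring bytes instead of characters matters here because the cap is expressed in MiB, not in string length.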

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f7de417e-e6d2-452e-880a-14a562ee5829

📥 Commits

Reviewing files that changed from the base of the PR and between 9ff9b66 and 35b8909.

⛔ Files ignored due to path filters (2)

ghost/core/test/e2e-api/admin/__snapshots__/settings.test.js.snap is excluded by !**/*.snap
pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml

📒 Files selected for processing (35)

apps/admin-x-framework/src/test/msw-utils.ts
apps/admin-x-framework/src/test/responses/settings.json
apps/admin-x-settings/src/components/settings/general/general-settings.tsx
apps/admin-x-settings/src/components/settings/general/seo-meta.tsx
apps/admin-x-settings/test/acceptance/general/seometa.test.ts
apps/admin-x-settings/test/acceptance/search.test.ts
ghost/core/core/frontend/services/llms/handler.js
ghost/core/core/frontend/services/llms/markdown.js
ghost/core/core/frontend/services/llms/service.js
ghost/core/core/frontend/services/routing/controllers/entry.js
ghost/core/core/frontend/web/middleware/index.js
ghost/core/core/frontend/web/middleware/llms-discovery.js
ghost/core/core/frontend/web/middleware/static-theme.js
ghost/core/core/frontend/web/site.js
ghost/core/core/server/api/endpoints/utils/serializers/input/settings.js
ghost/core/core/server/api/endpoints/utils/serializers/input/utils/settings-key-group-mapper.js
ghost/core/core/server/api/endpoints/utils/serializers/input/utils/settings-key-type-mapper.js
ghost/core/core/server/data/migrations/versions/6.40/2026-04-14-22-07-44-add-llms-setting.js
ghost/core/core/server/data/schema/default-settings/default-settings.json
ghost/core/core/server/web/shared/middleware/pretty-urls.js
ghost/core/core/shared/config/defaults.json
ghost/core/core/shared/settings-cache/cache-manager.js
ghost/core/package.json
ghost/core/test/e2e-api/admin/settings.test.js
ghost/core/test/e2e-frontend/llms.test.js
ghost/core/test/legacy/models/model-settings.test.js
ghost/core/test/unit/frontend/services/llms/handler.test.js
ghost/core/test/unit/frontend/services/llms/service.test.js
ghost/core/test/unit/frontend/services/routing/controllers/entry.test.js
ghost/core/test/unit/frontend/web/middleware/llms-discovery.test.js
ghost/core/test/unit/frontend/web/middleware/static-theme.test.js
ghost/core/test/unit/server/data/exporter/index.test.js
ghost/core/test/unit/server/data/schema/integrity.test.js
ghost/core/test/unit/server/web/shared/middleware/pretty-urls.test.js
ghost/core/test/utils/fixtures/default-settings.json

✅ Files skipped from review due to trivial changes (5)

ghost/core/test/legacy/models/model-settings.test.js
ghost/core/test/unit/server/data/exporter/index.test.js
ghost/core/test/unit/server/data/schema/integrity.test.js
ghost/core/core/shared/settings-cache/cache-manager.js
apps/admin-x-framework/src/test/responses/settings.json

codecov · 2026-05-20T12:38:29Z

Codecov Report

❌ Patch coverage is 78.45220% with 142 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.85%. Comparing base (7b87322) to head (8d87f2c).
⚠️ Report is 9 commits behind head on main.

Files with missing lines	Patch %	Lines
ghost/core/core/frontend/services/llms/markdown.js	71.18%	50 Missing and 1 partial ⚠️
ghost/core/core/frontend/services/llms/service.js	83.72%	42 Missing ⚠️
ghost/core/core/frontend/services/llms/handler.js	72.46%	37 Missing and 1 partial ⚠️
...ore/core/frontend/web/middleware/llms-discovery.js	81.57%	6 Missing and 1 partial ⚠️
...ore/frontend/services/routing/controllers/entry.js	83.33%	3 Missing ⚠️
...e/core/server/web/shared/middleware/pretty-urls.js	93.75%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #27984      +/-   ##
==========================================
+ Coverage   73.81%   73.85%   +0.03%     
==========================================
  Files        1523     1527       +4     
  Lines      128935   129593     +658     
  Branches    15479    15629     +150     
==========================================
+ Hits        95177    95708     +531     
- Misses      32799    32923     +124     
- Partials      959      962       +3

Flag	Coverage Δ
admin-tests	`53.54% <ø> (ø)`
e2e-tests	`73.85% <78.45%> (+0.03%)`	⬆️

coderabbitai

🧹 Nitpick comments (1)

ghost/core/core/frontend/services/llms/markdown.js (1)

83-89: 💤 Low value

Consider defensive handling for invalid date values.

new Date(value).toISOString() throws RangeError if value is a non-falsy but invalid date string. Ghost data is typically validated, but a defensive wrapper would prevent unexpected crashes during markdown generation.

Proposed fix

 function formatIsoDate(value) {
     if (!value) {
         return null;
     }
 
-    return new Date(value).toISOString();
+    const date = new Date(value);
+    return isNaN(date.getTime()) ? null : date.toISOString();
 }
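As a quick check of the behaviour the proposed fix aims for, the sketch below runs the guarded version against a valid, an invalid, and an empty value. It is a standalone illustration: the function name comes from the review comment, and `Number.isNaN` stands in for the `isNaN` call in the diff.

```javascript
// Standalone copy of the guarded helper from the proposed fix.
function formatIsoDate(value) {
    if (!value) {
        return null;
    }

    const date = new Date(value);
    // An unparsable input yields an Invalid Date whose getTime() is NaN.
    return Number.isNaN(date.getTime()) ? null : date.toISOString();
}

console.log(formatIsoDate('2026-05-20T09:44:30Z')); // 2026-05-20T09:44:30.000Z
console.log(formatIsoDate('not a date')); // null
console.log(formatIsoDate('')); // null
```

Without the guard, the second call would throw a RangeError instead of returning null.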

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ghost/core/core/frontend/services/llms/markdown.js` around lines 83 - 89, The
function formatIsoDate currently calls new Date(value).toISOString() which will
throw for non-falsy but invalid date values; update formatIsoDate to defensively
construct a Date object, verify it's valid (e.g., using isNaN(date.getTime()) or
Number.isFinite(date.getTime())), and only call toISOString() when valid,
returning null (or a safe fallback) when the date is invalid or unparsable;
reference the formatIsoDate function and its return path so the change is
localized and avoids throwing during markdown generation.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0ab175cb-86dc-43c1-9288-644354929ff3

📥 Commits

Reviewing files that changed from the base of the PR and between 35b8909 and 8d87f2c.

⛔ Files ignored due to path filters (2)

ghost/core/test/e2e-api/admin/__snapshots__/settings.test.js.snap is excluded by !**/*.snap
ghost/core/test/e2e-api/members/__snapshots__/well-known.test.js.snap is excluded by !**/*.snap

📒 Files selected for processing (4)

ghost/core/core/frontend/services/llms/markdown.js
ghost/core/core/frontend/services/llms/service.js
ghost/core/core/frontend/services/routing/controllers/entry.js
ghost/core/core/frontend/web/middleware/llms-discovery.js

Ghost now exposes llms.txt and per-entry markdown for public content, with a matching admin setting so sites can disable the feature without theme overrides. The setting defaults to on, redirects llms-specific URLs back to canonical HTML when disabled, and keeps the markdown export subdirectory-aware and strictly public-only.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The stub returned false for every settingsCache.get() call, which made llms_enabled === false and caused the middleware to bail out before adding discovery headers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

getByLabel('Search') resolved to 2 elements because both the sidebar search input and the SEO Meta 'Search' tab matched. Use getByPlaceholder('Search settings') for an unambiguous selector. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

routerManager was removed from the entry controller on main. The markdown content negotiation test no longer needs it — the controller reads resourceType from res.routerOptions directly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The llms-discovery middleware adds Link and X-Llms-Txt headers to all siteApp responses, including the /members/.well-known/jwks.json endpoint. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Snapshot template literals need double-backslash-quote for literal escaped quotes in serialized values. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

getMarkdownPath("/") produces "/index.md", but the reverse getResourcePathFromMarkdownPath("/index.md") returned "/index/" instead of "/". Now handles the /index.md special case. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
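To make the round-trip requirement concrete, here is a minimal sketch of the two path helpers. Only the names getMarkdownPath and getResourcePathFromMarkdownPath come from this PR; the bodies are assumptions written for illustration and are not Ghost's implementation.

```javascript
// Hypothetical bodies: only the function names appear in the PR.
function getMarkdownPath(resourcePath) {
    // The root route has no slug, so it maps to a fixed file name.
    if (resourcePath === '/') {
        return '/index.md';
    }
    // '/my-post/' -> '/my-post.md'
    return resourcePath.replace(/\/$/, '') + '.md';
}

function getResourcePathFromMarkdownPath(markdownPath) {
    // Special case: without it, '/index.md' would map back to '/index/'.
    if (markdownPath === '/index.md') {
        return '/';
    }
    // '/my-post.md' -> '/my-post/'
    return markdownPath.replace(/\.md$/, '') + '/';
}

for (const path of ['/', '/my-post/', '/blog/my-post/']) {
    const markdownPath = getMarkdownPath(path);
    console.log(path, '->', markdownPath, '->', getResourcePathFromMarkdownPath(markdownPath));
}
```

In this sketch a resource at /index/ would also map to /index.md and so would not round-trip; how Ghost treats such a slug is not covered here.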

collapseWhitespace replaces all whitespace sequences with a single space, which flattens markdown block structure (lists, code blocks, paragraphs) into one line. The node-html-markdown output already has proper structure; only excessive blank lines need normalizing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
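The difference between the two normalisation strategies is easy to see on a small input. The sketch below is an assumption-based illustration: collapseWhitespace mirrors the behaviour described in the commit message, while normalizeBlankLines is a hypothetical name for the gentler replacement.

```javascript
// Mirrors the described behaviour: every whitespace run becomes one space.
function collapseWhitespace(text) {
    return text.replace(/\s+/g, ' ').trim();
}

// Hypothetical replacement: only runs of 3+ newlines are reduced to one blank line.
function normalizeBlankLines(markdown) {
    return markdown.replace(/\n{3,}/g, '\n\n').trim();
}

const markdown = '# Title\n\n\n\n- one\n- two\n\nParagraph text.\n';

console.log(JSON.stringify(collapseWhitespace(markdown)));
// "# Title - one - two Paragraph text."
console.log(JSON.stringify(normalizeBlankLines(markdown)));
// "# Title\n\n- one\n- two\n\nParagraph text."
```

The first result is a single line in which the heading and list items are no longer valid Markdown blocks; the second keeps the block structure and only drops the surplus blank lines.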

The truncation note was appended after the size check, so the final output could exceed 5 MiB. Now the budget accounts for the footer size upfront. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
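One way to reserve the footer space is to subtract its byte length from the cap before adding any sections. The sketch below illustrates that idea under stated assumptions: FIVE_MIB is named in the PR, but joinWithinBudget and the footer text are hypothetical, and the real #appendBoundedSection logic is not reproduced here.

```javascript
const FIVE_MIB = 5 * 1024 * 1024;
// Hypothetical footer text; the PR does not quote the real note.
const TRUNCATION_NOTE = '\n\n[Truncated: size limit reached]\n';

// Hypothetical helper: joins sections until the byte budget is spent.
function joinWithinBudget(sections, maxBytes = FIVE_MIB) {
    // Reserve the footer's bytes up front so appending it cannot overflow the cap.
    const budget = maxBytes - Buffer.byteLength(TRUNCATION_NOTE, 'utf8');
    let output = '';
    let usedBytes = 0;

    for (const section of sections) {
        const sectionBytes = Buffer.byteLength(section, 'utf8');
        if (usedBytes + sectionBytes > budget) {
            return output + TRUNCATION_NOTE;
        }
        output += section;
        usedBytes += sectionBytes;
    }
    return output;
}

const sections = Array.from({length: 5}, () => 'a'.repeat(40));
const result = joinWithinBudget(sections, 100);
console.log(Buffer.byteLength(result, 'utf8') <= 100); // true
console.log(result.endsWith(TRUNCATION_NOTE)); // true
```

The trade-off in this sketch is that the footer bytes are reserved even when no truncation turns out to be needed, so output that would land within a footer's length of the cap is truncated anyway. It also assumes the cap is larger than the footer itself.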

When fetchPublicEntry returns null during content negotiation, the controller called next() which skipped the resolved entry and could 404. Now falls through to normal HTML rendering instead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

appendHeaderValue checked array elements with includes(), but a single header string like "a, b" is one element and would not match "b". Now splits comma-delimited values before checking for duplicates. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
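The dedup fix can be sketched as follows. The name appendHeaderValue comes from the PR, but the signature and body are assumptions. In particular, splitting on every comma is a simplification: a real Link header can carry commas inside angle-bracketed URLs or quoted parameters, which this sketch does not handle.

```javascript
// Hypothetical body: normalises a header into individual entries before deduplicating.
function appendHeaderValue(existing, newValue) {
    const values = [].concat(existing || [])
        // A single string such as "a, b" holds two entries, so split it.
        .flatMap(value => String(value).split(','))
        .map(value => value.trim())
        .filter(Boolean);

    if (!values.includes(newValue)) {
        values.push(newValue);
    }
    return values.join(', ');
}

console.log(appendHeaderValue('a, b', 'b')); // a, b
console.log(appendHeaderValue(['a', 'b, c'], 'd')); // a, b, c, d
console.log(appendHeaderValue(undefined, 'a')); // a
```

Before the fix described above, the first call would have compared "b" against the whole string "a, b" and appended a duplicate.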

… entry The llms_enabled migration now lives in 6.40 (from merged PR #27995). Removed the old 6.31 migration and a duplicate snapshot entry left by the rebase. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

… to async Reverted admin UI toggle changes (general-settings keywords, seo-meta toggle, seometa acceptance test) - these should ship in a follow-up PR. Converted entry controller markdown test from done() callback to async/await to match the vitest migration on main. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The admin UI belongs in this PR since this is where the setting starts controlling behavior (llms.txt generation). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

These were left over from the original feature commit - the setting is already in the correct position from the merged PR #27995. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ng in cache-manager JSDoc Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Ghost is moving toward stateless, compute-on-request architecture. This implements llms.txt/llms-full.txt generation, per-entry .md export, Accept: text/markdown content negotiation, and discovery headers using that approach: no in-memory cache, no event listeners, full dependency injection via factory functions.

Gated behind a config flag ("llms": false by default) so the feature can be tested before broad rollout. The llms_enabled setting (already in main) acts as the user-facing toggle within an enabled deployment.

Key design decisions:
- Factory functions (createLlmsService, createLlmsHandler, createLlmsDiscovery) receive all dependencies explicitly
- Uses urlServiceFacade for lazy routing compatibility
- Paginated DB queries for llms-full.txt (100/batch, 5 MiB budget)
- Index queries exclude html column to reduce memory
- Cache-Control headers on all responses for CDN/proxy caching
- pretty-urls extension bypass scoped to .md and .txt only
- markdown.js is fully pure (no Ghost singleton imports)

Ref #27984

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
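The factory-function approach described above can be sketched roughly as follows. Only the name createLlmsDiscovery, the X-Llms-Txt header name, and the llms_enabled setting key come from this PR; the dependency names (isFlagEnabled, getSetting) and the header value are assumptions made for illustration, not the merged implementation.

```javascript
// Hypothetical factory: every dependency is passed in, so nothing is imported
// from a singleton and the middleware can be tested with plain stubs.
function createLlmsDiscovery({isFlagEnabled, getSetting}) {
    return function llmsDiscovery(req, res, next) {
        // Both gates must pass: the deployment-level config flag and the
        // user-facing llms_enabled setting (treated as on unless exactly false).
        if (isFlagEnabled() && getSetting('llms_enabled') !== false) {
            // Header value is an assumption for this sketch.
            res.setHeader('X-Llms-Txt', '/llms.txt');
        }
        next();
    };
}

// Usage with stubbed dependencies and a minimal response object.
const demoHeaders = {};
const demoRes = {
    setHeader(name, value) {
        demoHeaders[name] = value;
    }
};
const middleware = createLlmsDiscovery({
    isFlagEnabled: () => true,
    getSetting: () => true
});
middleware({}, demoRes, () => console.log(demoHeaders));
```

Because the settings lookup is injected, a test stub that returns false for every key disables the headers, which is the failure mode one of the earlier commits in this PR had to fix.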

ErisDS · 2026-05-22T05:32:15Z

Closing in favour of #28042

ErisDS added the "migration" label ([pull request] Includes migration for review) May 20, 2026

ErisDS mentioned this pull request May 20, 2026

Add llms.txt exports and AI crawler toggle #27400

Closed

coderabbitai Bot reviewed May 20, 2026

ErisDS force-pushed the codex/llms-setting-pr branch from 9ff9b66 to 35b8909 Compare May 20, 2026 10:54

coderabbitai Bot reviewed May 20, 2026

Comment thread ghost/core/test/unit/frontend/services/llms/service.test.js

coderabbitai Bot reviewed May 20, 2026

ErisDS mentioned this pull request May 20, 2026

Added llms_enabled site setting #27995

Merged

4 tasks

JohnONolan and others added 14 commits May 20, 2026 18:30

Fixed unused entry parameter in getLlmsIndexUrl

fac9d84

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Fixed llms-discovery test stub returning false for all settings

da2c222

The stub returned false for every settingsCache.get() call, which made llms_enabled === false and caused the middleware to bail out before adding discovery headers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Updated settings API test count and snapshot for llms_enabled

80ab64b

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Updated JWKS header snapshot for llms-discovery headers

bccca68

The llms-discovery middleware adds Link and X-Llms-Txt headers to all siteApp responses, including the /members/.well-known/jwks.json endpoint. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Fixed quote escaping in JWKS header snapshot

1cc6a80

Snapshot template literals need double-backslash-quote for literal escaped quotes in serialized values. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Reserved space for truncation footer in 5 MiB budget

7d74588

The truncation note was appended after the size check, so the final output could exceed 5 MiB. Now the budget accounts for the footer size upfront. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

ErisDS force-pushed the codex/llms-setting-pr branch from 8d87f2c to 3e154ee Compare May 20, 2026 17:31

ErisDS and others added 4 commits May 20, 2026 19:01

Restored admin UI toggle for llms_enabled setting

5f2e472

The admin UI belongs in this PR since this is where the setting starts controlling behavior (llms.txt generation). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Removed duplicate llms_enabled entries from default-settings

9013c5f

These were left over from the original feature commit - the setting is already in the correct position from the merged PR #27995. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Moved llms_enabled to match schema ordering in serializer maps

af4c1f6

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

ErisDS and others added 2 commits May 21, 2026 13:45

Removed duplicate llms_enabled from settings fixture and fixed orderi…

4bee6de

…ng in cache-manager JSDoc Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Added byte-length assertion to llms-full.txt truncation test

94ed0fe

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

ErisDS mentioned this pull request May 21, 2026

Added stateless llms.txt service with dependency injection #28042

Merged

10 tasks

ErisDS closed this May 22, 2026
