Added lazy routing#28356
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (9)
🚧 Files skipped from review as they are similar to previous changes (9)
WalkthroughThis PR adds a lazy URL-resolution stack: NQL-based router filter utilities, a permalink matcher, a per-request DB lookup factory that returns eager-shaped records, and LazyUrlService which registers routers, generates forward URLs, determines ownership, and resolves incoming paths to resources. It includes unit tests for each utility/service and an integration suite verifying parity with the existing eager UrlService. Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
ghost/core/test/integration/url-service-parity.test.js (1)
237-238: ⚡ Quick winMake the zero-post gating case deterministic instead of fixture-dependent.
assert.ok(checked > 0, ...)makes this test fail when fixture composition changes, even if lazy/eager parity is still correct. Consider creating an explicit public tag/author with no posts in test setup for this case so coverage stays stable over fixture churn.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ghost/core/test/integration/url-service-parity.test.js` around lines 237 - 238, The test currently uses assert.ok(checked > 0, ...) which is fixture-dependent; modify the test setup to create an explicit public tag or author with zero posts (e.g. create a public Tag/Author entity in the test's before/arrange step) and then assert deterministically against that created entity instead of relying on checked > 0; update the assertion (the assert.ok call and any usage of checked) to verify that the newly created public tag/author is present and has zero posts, so the parity check is stable regardless of fixture composition.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@ghost/core/core/server/services/url/lazy-url-service.ts`:
- Line 130: The reverse-filter check in resolveUrl() skips normalization and
calls filterMatches(config.compiledFilter, resource) directly; change
resolveUrl() to pass the resolved resource through the existing normalization
helper _recordForFilter() (i.e. call _recordForFilter(resource) and use that
result for filterMatches) so page: filters are evaluated against the same
normalized shape used by getUrlForResource() and ownsResource(); update any
variable names accordingly and add/adjust tests to cover resolveUrl() behavior
for page:false routers.
In `@ghost/core/core/server/services/url/permalink-matcher.ts`:
- Around line 36-55: The params object should be a null-prototype map so
captured placeholder names like "__proto__" aren't swallowed by
Object.prototype; change the declaration of params in permalink-matcher.ts (the
variable currently declared as const params: Record<string, string> = {}) to use
a null-prototype object (e.g. Object.create(null)) or an ES Map and update the
local type annotations accordingly, keep the rest of the capture logic
(fieldName check, decodeURIComponent assignment) the same; then add a unit test
in ghost/core/test/unit/server/services/url/permalink-matcher.test.js that
matches a template with a placeholder like "/:__proto__/" and asserts the
returned params includes "__proto__" => the decoded value (and that
JSON.stringify-based cache keys or findResource lookups would receive that
param), to prevent regression.
---
Nitpick comments:
In `@ghost/core/test/integration/url-service-parity.test.js`:
- Around line 237-238: The test currently uses assert.ok(checked > 0, ...) which
is fixture-dependent; modify the test setup to create an explicit public tag or
author with zero posts (e.g. create a public Tag/Author entity in the test's
before/arrange step) and then assert deterministically against that created
entity instead of relying on checked > 0; update the assertion (the assert.ok
call and any usage of checked) to verify that the newly created public
tag/author is present and has zero posts, so the parity check is stable
regardless of fixture composition.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f3fdf7bf-af42-42d8-a240-96eb49e258e4
📒 Files selected for processing (9)
ghost/core/core/server/services/url/lazy-find-resource.tsghost/core/core/server/services/url/lazy-url-service.tsghost/core/core/server/services/url/permalink-matcher.tsghost/core/core/server/services/url/router-filter.tsghost/core/test/integration/url-service-parity.test.jsghost/core/test/unit/server/services/url/lazy-find-resource.test.jsghost/core/test/unit/server/services/url/lazy-url-service.test.jsghost/core/test/unit/server/services/url/permalink-matcher.test.jsghost/core/test/unit/server/services/url/router-filter.test.js
ref https://linear.app/ghost/issue/HKG-1817 - the lazy URL service needs to turn a request path back into a resource, which means matching it against each router's permalink template (/:slug/, /:year/:month/:slug/) and extracting the placeholders - kept this as a standalone pure function rather than folding it into the service so the matching rules can be exhaustively unit-tested in isolation, with no service or DB setup - matched segment-by-segment instead of compiling a regex per template: Ghost only uses whole-segment placeholders, so a literal walk both keeps the logic obvious and removes any ReDoS surface on attacker-controlled request paths - validated placeholder names with a character scan rather than a small regex so the module is regex-free end to end, matching its documented design; the check only ever ran on template field names and never on request paths, so this is a consistency change rather than a security fix - treated malformed %-escapes and mixed literal+placeholder segments as non-matches rather than throwing, since neither is a valid Ghost permalink and a reverse lookup should quietly fall through to 404
ref https://linear.app/ghost/issue/HKG-1817 - a router decides whether it owns a resource by running its routes.yaml filter (featured:true, tag:news, page:false) against the resource, so the lazy service needs the same NQL semantics the eager UrlGenerator uses - extracted the expansions, the page:true/false transformer, and the type mapping into a small module of pure helpers so the service file stays about routing decisions and these rules can be tested directly - copied the expansions and transformer from the eager generator on purpose instead of sharing one module: the eager service is our parity oracle while both run side by side, and a shared helper could hide a bug by producing the same wrong answer on both sides - made a missing relation (e.g. a thin record with no tags) a non-match rather than an error, matching the eager generator's try/catch, so a filtered route quietly falls through instead of throwing mid-request - routerTypeOf accepts both the singular DB `type` ('post') and the plural router key ('posts') because migrated callers pass either shape
ref https://linear.app/ghost/issue/HKG-1817 - the lazy service resolves a request path to a resource on demand, so it needs a way to load one record by the params a permalink captured; this hook is that single DB-touching seam, injected into resolveUrl so the service itself stays pure and testable without a database - mirrored the eager resourceConfig's visibility rules (posts/pages published-only, tags/authors through the public TagPublic/Author models that gate on having posts) so a guessed slug can never surface a draft, private tag, or staff user that the eager path kept hidden - restricted the lookup to real, unique column params (id/uuid/slug) and treated a paramset with none as a clean non-match, so a multi-segment permalink like /:year/:month/:slug/ or /:primary_tag/:slug/ can't pass a non-column key into findOne and turn a simple URL miss into a 500; validating the derived/relation segments against the record is the deferred column-aware follow-up - pruned every loaded record to the same shape the eager resourceConfig exposes, sourcing the exclude lists from that config so the two can't drift: dropped the post body/status, tag description/meta, author bio and contact fields, and the auto-loaded posts_meta relation, because the resolved record is what downstream consumers see and e.g. ContentStatsService.getResourceTitle branches on title being absent for posts/pages - trimmed tags/authors/primary_tag/primary_author down to {id, slug} to match the eager withRelatedFields, so reverse lookups can't leak author emails or full tag rows out of the URL service - mirrored eager's quirk of exposing primary_tag/primary_author as null on pages while never carrying the tags/authors arrays there, so the page record shape matches the eager service field for field - loaded tags and authors only for posts, because collection routers filter on tag:/author: and resolveUrl re-checks that filter against the record; pages always come from the filterless static-pages router, so the extra relations would be wasted work - derived primary_author from the first author to match the eager withRelatedPrimary, since Post.toJSON already computes primary_tag but not primary_author, and date/tag/author permalinks depend on it - built it as a factory over injected models rather than reaching for the model layer directly, keeping the module unit-testable and leaving the choice of real models to the later wiring step
ref https://linear.app/ghost/issue/HKG-1817 - this is the on-demand replacement for the eager UrlService: it answers the same forward (resource -> URL), ownership, and reverse (URL -> resource) questions, but computes each answer per call from the registered router configs instead of precomputing and holding a full resource map at boot - composed it from the already-tested permalink-matcher and router-filter helpers so this file is only about the routing decision (pick the first registered router of the resource's type whose filter matches) and stays small enough to reason about against the eager generator - made ownership exclusive like the eager service: eager reserves each resource for the first router that claims it, so ownsResource only returns true when no earlier router of the same type also matches, otherwise a catch-all router would wrongly claim resources already owned by a higher-priority collection - kept forward lookups pure and confined every DB hit to resolveUrl via the injected findResource hook, so the whole service is unit-testable without a database and there is one obvious seam where on-demand loading happens - memoized findResource within a single resolveUrl call, keyed by (resourceType, params), so routers that share a permalink shape or fall through across filters don't issue the same DB query twice, while router priority and the per-config filter re-check are preserved - re-validated the resolved record against its canonical URL: the slug-only DB lookup can't filter on derived/relation permalink segments (year/month, primary_tag), so without this a wrong-date or wrong-tag URL would resolve a post the eager service 404s; regenerating the record's URL with the same replacePermalink the eager generator uses and comparing the captured params makes the reverse path agree with eager by construction - matched the eager getUrlByResourceId 404 path exactly (no subdirectory prefix on the miss fallback) so the two services agree even on subdirectory installs - logged a thin-resource error when a filtered router can't see a relation the caller didn't load: that is precisely the case where lazy would 404 while eager (full data in memory) returns a URL, so surfacing it protects parity during the shadow-comparison window - restored the preserved contract spec from skipped to running now that the implementation it pins exists again
ref https://linear.app/ghost/issue/HKG-1817 - the whole point of this milestone is to swap the eager service for the lazy one without changing any URL, so the lazy service is only trustworthy if it provably returns the same answer as the eager service for the same inputs; this integration test is that proof, run against real fixtures and a real findResource over the model layer - drove both services from one identical routing set per scenario and fed the lazy service the exact records the eager service cached, so any difference the test reports is a routing-logic divergence and not a data-shape artefact - compared all three directions a caller depends on: forward (relative and absolute URL for every cached resource, including the resources that resolve to /404/), ownership (every router against every resource), and reverse (every eager-generated URL resolves back to the same resource and type) - asserted the reverse lookup returns the same record shape as the eager service, not just the same id and type: exact field-set parity, matching slug/name, no leaked title, and relations trimmed to {id, slug}, so a divergence like lazy returning the post body or untrimmed author rows can't slip through - added a date-based permalink scenario and asserted both services 404 a wrong-date variant of a generated URL, locking the canonical re-check so lazy can't resolve a non-canonical URL the eager service rejects - added a featured-collection scenario alongside the default set specifically to exercise priority fallthrough; this is what caught the ownership exclusivity bug now fixed in the service, where a catch-all router claimed resources already owned by a higher-priority collection - guarded against a vacuous pass by asserting the fixtures actually produce cached resources and generated URLs, so the parity assertions can never silently iterate over nothing
| return !!compiledFilter.queryJSON(record); | ||
| } catch (err) { | ||
| const message = err instanceof Error ? err.message : String(err); | ||
| debug('NQL match failed', message); |
There was a problem hiding this comment.
Why not an actual logging.warn or logging.error? Do we expect errors here?
| * the `:field` placeholders, or returns `null` when the path doesn't match. | ||
| */ | ||
| export function matchPermalink(template: string, urlPath: string): Record<string, string> | null { | ||
| if (typeof template !== 'string' || typeof urlPath !== 'string') { |
There was a problem hiding this comment.
Is this ever expected? Should we error here, or log?
| export function createFindResource(models: Models): FindResource { | ||
| const loadOne = async ( | ||
| Model: BookshelfModel, | ||
| query: Record<string, unknown>, |
There was a problem hiding this comment.
I think this can be a stronger type? it's either {slug: string} or {id: string} or {uuid: string| right?
There was a problem hiding this comment.
Hmm, maybe it can be a combination of those? I think really we only ever need one though.
| if (config.resourceType !== routerType) { | ||
| continue; | ||
| } | ||
| this._warnIfThin(config, resource, routerType); |
There was a problem hiding this comment.
In a production ready implementation this should throw completely - it should never happen. I also wonder if we can just check once at the top of getUrlForResource
| * Matches a permalink template (e.g. `/:slug/`) against a URL path and extracts | ||
| * the `:field` placeholders, or returns `null` when the path doesn't match. | ||
| */ | ||
| export function matchPermalink(template: string, urlPath: string): Record<string, string> | null { |
There was a problem hiding this comment.
I think we only care about the slug id and uuid params here? Maybe we can be specific to that, as well as doing some basic checking of the values?
e.g. a permalink of /post/:uuid should not match /post/blahblah because "blahblah" is not a valid uuid?
ref https://linear.app/ghost/issue/HKG-1817
Why
The eager
UrlServiceprecomputes a full resource → URL map at boot and holds every published resource in memory. We want to replace it with an on-demand service that computes each answer per call. The earlier experiment (#28336, behindconfig.lazyRouting) was reverted (HKG-1816); this rebuilds it as production-ready code with cleaner, smaller, individually-testable pieces.This PR only adds the new service and its tests — nothing is wired into the request path yet. The
UrlServiceFacadealready has the seams, but no instance selects the lazy backend. Wiring and the eager-removal cutover are separate milestone issues.The overriding goal is parity: the lazy service must return the same answer as the eager service for the same inputs, because downstream issues will shadow-compare the two before we switch over.
What
Built as four small modules so each piece is testable in isolation, plus a real-DB parity suite:
permalink-matcher.ts— pure, regex-free function that matches a request path against a permalink template (/:slug/,/:year/:month/:slug/) and extracts the params. No ReDoS surface; malformed/unsupported shapes are non-matches.router-filter.ts— the NQL routing semantics (expansions, thepage:transformer,routerTypeOf,buildFilter,filterMatches). A deliberate copy of the eager generator's rules, not a shared module, so a shared bug can't hide identical wrong output on both sides while eager is the comparison oracle.lazy-find-resource.ts— the single DB-touching seam, injected intoresolveUrl. Mirrors the eagerresourceConfigvisibility rules (published-only posts/pages, publicTagPublic/Authorthat gate on having posts) so a guessed slug can't surface anything the eager path hid. Queries only by real column params (id/uuid/slug) and prunes the loaded record to the exact shape the eager service exposes — dropping the post body and other excluded fields, and trimmingtags/authors/primary_tag/primary_authorto{id, slug}— so the resolved record never leaks fields eager strips.lazy-url-service.ts— composes the helpers. Forward lookups are pure; all DB access is confined toresolveUrl. Ownership is exclusive (first matching router of the type wins), matching eager's reservation. The/404/fallback matches eager's miss path exactly, including subdirectory handling. Logs a thin-resource error when a filtered router can't see a relation the caller didn't load — that's exactly the case where lazy would/404/while eager returns a URL.Notable design choice:
resolveUrlre-validates the resolved record against its canonical URL. The slug-only DB lookup can't filter on derived or relation permalink segments (:year/:month,:primary_tag), so it regenerates the record's URL with the samereplacePermalinkthe eager generator uses and confirms the captured params match. This makes the reverse path agree with eager by construction: the eager service only resolves a URL that equals a resource's generated (canonical) URL, so a wrong-date (/2026/05/<slug>/) or wrong-tag URL 404s in both backends rather than resolving the post by slug alone.Testing
lazy-url-servicecontract spec from skipped to running (HKG-1817's acceptance criterion)./404/), ownership (every router × every resource), and reverse (every eager-generated URL resolves back to the same resource, with full record-shape parity). A featured-collection scenario exercises priority fallthrough (this caught and fixed an ownership-exclusivity bug during development), and a date-based scenario asserts both services 404 a non-canonical (wrong-date) URL.tsc --noEmit,eslint, all unit suites, and the parity integration suite pass.Known follow-ups (deferred to the wiring/comparison issue)
resolveUrlqueriesfindResourceonly by real column params (id/uuid/slug) and validates derived/relation segments via the canonical re-check, so custom permalinks (/:year/:month/:slug/,/:primary_tag/:slug/) resolve and 404 in parity with eager. Permalinks that key off a field the pruned record doesn't carry remain a wiring-phase follow-up.