Added lazy url service#28336
Conversation
…iring ref https://linear.app/ghost/issue/HKG-1816 - the lazy URL service only ever shipped to run a production experiment behind config.lazyRouting; the experiment is concluded and we are rebuilding it as production-ready code, so the experimental engine and its boot/reload wiring should not linger in the tree - reverted by hand rather than git revert: a later sitemap follow-up rewrote overlapping lines so a clean revert is no longer possible, and we deliberately want to keep the facade and caller migrations that landed alongside the experiment - kept UrlServiceFacade and all caller migrations untouched because they are clean, independent of the engine, and the re-implementation will dispatch through them - preserved the lazy service test files as skipped specs instead of deleting them so the behaviour contract the re-implementation must satisfy stays visible in-tree; the repo forbids skipped tests by default, hence the scoped eslint-disable - dropped the lazyRouting flag so no instance can select a now-deleted backend; the re-implementation will introduce an explicit three-mode flag instead - with no lazy backend attached the facade falls through to the eager service, so self-hosters and production (rollout already paused) are unaffected
ref https://linear.app/ghost/issue/HKG-1816 - the engine revert deleted the lazy DB-driven sitemap path, so the unit tests covering _populateFromDatabase and lazyRouting-mode event wiring now assert against code that no longer exists; keeping them would give a false sense of coverage and fail on the next run - kept the eager event-driven sitemap tests because the url.added/ url.removed/URLResourceUpdatedEvent path is the only one production uses now, and it must stay protected through the observation window - deleted rather than skip-preserved (unlike the LazyUrlService specs): the lazy sitemap will be repopulated through the facade/event model, not the raw-knex populate approach, so these specs would not return as-is
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
WalkthroughThis PR introduces a complete lazy URL service architecture for Ghost. It adds four new core modules: Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
ref https://linear.app/ghost/issue/HKG-1817 - the lazy URL service needs to turn a request path back into route params without the eager service's precomputed map; isolating this as a standalone pure function keeps the matching logic unit-testable and free of any service or DB state - kept it a linear segment-by-segment scan rather than a regex: Ghost permalinks only use whole-segment :field placeholders, so a regex buys nothing and would add ReDoS/backtracking risk on attacker-controlled paths - treats malformed %-escapes and mixed literal+placeholder segments as non-matches instead of throwing, so a hostile or malformed URL degrades to a 404 rather than erroring the request Co-authored-by: Cursor <cursoragent@cursor.com>
ref https://linear.app/ghost/issue/HKG-1817 - the lazy URL service must evaluate routes.yaml filters the same way the eager UrlGenerator does, so its output matches the eager baseline during the shadow-comparison window; this captures those expansions, the page: transformer, and the type mapping in one module for the lazy side - it is a deliberate copy of those rules, not a module the eager service should import: eager is the comparison oracle (removed in HKG-1824), so sharing one filter module would let a bug produce identical wrong output on both sides and slip past the comparison - exposed buildFilter/filterMatches/routerTypeOf as small pure functions so the filter semantics are unit-testable in isolation - filterMatches swallows NQL evaluation errors as a non-match rather than throwing, so a thin record missing a referenced relation 404s instead of crashing the request Co-authored-by: Cursor <cursoragent@cursor.com>
af7ca79 to
dd57bbc
Compare
|
@coderabbitai review |
✅ Action performedReview finished.
|
There was a problem hiding this comment.
Pull request overview
This PR introduces the core building blocks for a lazy URL resolution service that resolves URLs and ownership on demand from registered router configs (rather than precomputing a full resource→URL map at boot). It adds the lazy backend plus supporting matcher/filter helpers and corresponding unit/integration test coverage, while explicitly not wiring the lazy backend into production yet (it will sit behind UrlServiceFacade in a follow-up PR).
Changes:
- Added
LazyUrlServicebackend with router registration, forward URL generation, and reverse lookup (resolveUrl) behavior. - Added pure helper modules for permalink matching (
permalink-matcher) and route filter semantics (router-filter) mirroring the eager generator. - Added unit tests for the helpers and lazy service, plus an integration spec exercising
createFindResourceagainst real models.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| ghost/core/core/server/services/url/router-filter.ts | New helper encapsulating NQL routing-filter semantics (expansions + page: transformer + type mapping). |
| ghost/core/core/server/services/url/permalink-matcher.ts | New non-regex permalink template matcher used for reverse URL lookups. |
| ghost/core/core/server/services/url/lazy-find-resource.ts | New factory for the injected DB lookup hook used by lazy reverse resolution. |
| ghost/core/core/server/services/url/lazy-url-service.ts | New lazy URL service backend orchestrating matching/filtering and delegating DB lookups via injected hook. |
| ghost/core/test/unit/server/services/url/router-filter.test.js | Unit tests covering router type mapping, filter compilation, and filter matching semantics. |
| ghost/core/test/unit/server/services/url/permalink-matcher.test.js | Unit tests for permalink matching and percent-decoding behavior. |
| ghost/core/test/unit/server/services/url/lazy-find-resource.test.js | Unit tests asserting the lookup hook’s query shape and scoping behavior. |
| ghost/core/test/unit/server/services/url/lazy-url-service.test.js | Re-enabled and extended unit coverage for lazy URL service behavior and hardening expectations. |
| ghost/core/test/integration/url-service.test.js | Integration coverage for createFindResource(models) against a real seeded DB plus an end-to-end LazyUrlService.resolveUrl check. |
Comments suppressed due to low confidence (2)
ghost/core/test/unit/server/services/url/lazy-url-service.test.js:303
- The
resolveUrlspecs for multi-segment permalinks only cover the “happy path” and stubfindResourceto return a record that always matches the captured params. That means the suite won’t catch regressions whereresolveUrlreturns a resource even when the captured:primary_tagsegment doesn’t match the fetched record’s actualprimary_tag(which can happen in real life if the DB lookup effectively keys offslugonly). Adding a negative assertion here would lock in canonical-path behaviour and prevent false-positive reverse resolutions.
it('matches multi-segment permalinks (e.g. /:primary_tag/:slug/)', async function () {
const findResource = sinon.stub();
// withArgs binds to the exact param shape, so dropping a capture fails here.
findResource
.withArgs('posts', {primary_tag: 'podcast', slug: 'hello'})
.resolves({id: 'p1', slug: 'hello'});
const service = new LazyUrlService({urlUtils, findResource});
service.onRouterAddedType('default', null, 'posts', '/:primary_tag/:slug/');
const result = await service.resolveUrl('/podcast/hello/');
assert.equal(result.id, 'p1');
assert.equal(result.type, 'posts');
});
ghost/core/test/unit/server/services/url/lazy-url-service.test.js:296
- The date-based permalink spec doesn’t currently lock in canonical behaviour for
:year/:monthplaceholders. If the backend lookup effectively becomes “slug only” (unknown keys get dropped during model filtering),resolveUrl('/2099/01/hello/')can incorrectly return thehellopost even when itspublished_atis in a different month/year. Adding a mismatch assertion here (and ensuring the stubbed record includespublished_atso canonical re-rendering is possible) would catch that class of bug.
// withArgs binds to the exact param shape, so dropping a capture fails here.
findResource
.withArgs('posts', {primary_tag: 'podcast', slug: 'hello'})
.resolves({id: 'p1', slug: 'hello'});
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
dd57bbc to
84a2ac8
Compare
84a2ac8 to
ca390d4
Compare
ca390d4 to
88abd1c
Compare
ref https://linear.app/ghost/issue/HKG-1817 - reverse lookups hit the database per request instead of reading a precomputed map, so the hook deliberately mirrors the eager service's visibility rules: published-only post/page scoping and the public scoped tag/author models, so a guessed slug can never surface something the eager path would have hidden (e.g. a staff user or a tag with no published posts) - post lookups load tags/authors because collection routers can filter on tag:/author: and resolveUrl re-checks that filter against the record; without the relations every relation-filtered route would silently 404. Pages skip the relations entirely since they come only from the filterless /:slug/ StaticPagesRouter, so loading them would be wasted query work - projected post lookups to the routing-relevant columns the eager resourceConfig keeps, dropping the heavy long-text bodies (mobiledoc/lexical/ html/plaintext): reverse lookups run on demand, so pulling whole post bodies per request would be a needless DB cost and a divergence from eager - derived primary_tag/primary_author in the hook because Post.toJSON only computes primary_tag when no column projection is requested (and never primary_author), so the /:primary_tag/ and /:primary_author/ permalinks keep resolving once we slim the columns - the FindResource type lives with the factory that produces it so the service depends on this module rather than the reverse, avoiding a circular import - pinned require:false last in the options spread so a caller can never flip it and turn a routine miss into a thrown error, keeping the null-on-miss contract - enabled the preserved spec that pins this behaviour and extended it to cover query forwarding and the require:false miss path Co-authored-by: Cursor <cursoragent@cursor.com>
88abd1c to
763cf37
Compare
ref https://linear.app/ghost/issue/HKG-1817 - the service is a thin orchestrator over the permalink-matcher and router-filter helpers rather than a monolith that inlines the matching rules, so each concern stays small and independently testable - kept forward lookups (getUrlForResource / ownsResource) pure and confined all database access to resolveUrl via the injected findResource hook, so the whole service is unit-testable without a DB and boot does zero URL work - guarded resolveUrl against non-canonical paths: Bookshelf drops non-column captures (year/month/primary_tag/...) before the query, so a slug match can come back whose other segments differ; it now re-renders the resource's canonical URL and accepts only an exact match, the invariant the eager service gets for free by storing canonical URLs alone - normalized the plural router type ('posts'/'pages') to the singular DB value for filter evaluation only, since migrated callers tag resources with the plural type while the page: transformer matches the singular column; without it a page: filter would never match and parity with the eager generator breaks - retained the thin-resource warning so operators can alert on any caller that hands the service a resource missing the tags/authors a filtered route needs, rather than silently 404ing - enabled the preserved spec and added hardening coverage for the router-update no-op, the warning branches, reset clearing reverse lookups, and the export shape Co-authored-by: Cursor <cursoragent@cursor.com>
ref https://linear.app/ghost/issue/HKG-1817 - the unit tests stub the database, so nothing proved the model queries the hook depends on; this exercises the hook against real models so a change to the published-only scoping, the post/page split, or the TagPublic/Author shouldHavePosts gate fails loudly - pins the security-relevant guards: a guessed slug must not surface a draft, a tag with no published posts, or a staff user, matching what the eager service would have hidden - asserts withRelated genuinely populates tags/authors, since that population is what lets the service re-check tag:/author: route filters rather than silently 404ing Co-authored-by: Cursor <cursoragent@cursor.com>
763cf37 to
ec43302
Compare
| // Only posts need relations: collection routers can filter on tag:/author:, | ||
| // so resolveUrl re-checks those against the record (HKG-1738). Pages come only | ||
| // from the filterless, /:slug/ StaticPagesRouter, so loading them is wasted work. | ||
| const POST_RELATIONS = ['tags', 'authors']; |
| case 'tags': | ||
| return loadOne(models.TagPublic, {...query, visibility: 'public'}); | ||
| case 'authors': | ||
| return loadOne(models.Author, {...query, visibility: 'public'}); |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@ghost/core/core/server/services/url/lazy-url-service.ts`:
- Around line 89-95: The current loop calls _warnIfThin for every candidate
before checking filterMatches, causing false-positive LAZY_URL_THIN_RESOURCE
logs; change the logic so you only call _warnIfThin when a thin resource
actually prevented resolution—either move the _warnIfThin call to run only after
a candidate matched the filter and failed due to thinness, or (preferably)
remove per-candidate warnings and call _warnIfThin once after the loop when no
candidate matched (i.e., resolution failed). Adjust the loop around candidates,
filterMatches, and this._formatPath so that _warnIfThin is only invoked in the
“no match” fallback path (or immediately for the single router whose filter
matched but could not format), referencing the methods _warnIfThin and
filterMatches and the candidates iteration to locate the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 95bbc483-1596-408b-9c7f-344878672aaf
📒 Files selected for processing (5)
ghost/core/core/server/services/url/lazy-find-resource.tsghost/core/core/server/services/url/lazy-url-service.tsghost/core/test/integration/url-service.test.jsghost/core/test/unit/server/services/url/lazy-find-resource.test.jsghost/core/test/unit/server/services/url/lazy-url-service.test.js
🚧 Files skipped from review as they are similar to previous changes (3)
- ghost/core/test/integration/url-service.test.js
- ghost/core/core/server/services/url/lazy-find-resource.ts
- ghost/core/test/unit/server/services/url/lazy-url-service.test.js
| for (const config of candidates) { | ||
| this._warnIfThin(config, resource, routerType); | ||
| if (filterMatches(config.compiledFilter, record)) { | ||
| const path = this.urlUtils.replacePermalink(config.permalink, resource); | ||
| return this._formatPath(path, options); | ||
| } | ||
| } |
There was a problem hiding this comment.
_warnIfThin fires for every candidate router, producing false-positive error logs.
_warnIfThin runs for each candidate before filterMatches, so a thin resource that is correctly routed by one router will still log a LAZY_URL_THIN_RESOURCE InternalServerError for any other same-type router whose filter references tags/authors/primary_*. Since the stated purpose is to let operators alert on uninflated callers, warning for routers the resource was never intended for defeats that goal and creates alert noise (e.g. a tag-less post that legitimately matches the default posts router but also passes a tag:[...] collection router).
Consider warning only when no candidate matched (i.e., the thinness actually prevented resolution), or only for the router whose filter failed to match.
♻️ One option: warn only when nothing matched
const candidates = this.routerConfigs.filter(c => c.resourceType === routerType);
const record = this._recordForFilter(resource);
for (const config of candidates) {
- this._warnIfThin(config, resource, routerType);
if (filterMatches(config.compiledFilter, record)) {
const path = this.urlUtils.replacePermalink(config.permalink, resource);
return this._formatPath(path, options);
}
}
+ for (const config of candidates) {
+ this._warnIfThin(config, resource, routerType);
+ }
return this._formatNotFound(options);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| for (const config of candidates) { | |
| this._warnIfThin(config, resource, routerType); | |
| if (filterMatches(config.compiledFilter, record)) { | |
| const path = this.urlUtils.replacePermalink(config.permalink, resource); | |
| return this._formatPath(path, options); | |
| } | |
| } | |
| const candidates = this.routerConfigs.filter(c => c.resourceType === routerType); | |
| const record = this._recordForFilter(resource); | |
| for (const config of candidates) { | |
| if (filterMatches(config.compiledFilter, record)) { | |
| const path = this.urlUtils.replacePermalink(config.permalink, resource); | |
| return this._formatPath(path, options); | |
| } | |
| } | |
| for (const config of candidates) { | |
| this._warnIfThin(config, resource, routerType); | |
| } | |
| return this._formatNotFound(options); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@ghost/core/core/server/services/url/lazy-url-service.ts` around lines 89 -
95, The current loop calls _warnIfThin for every candidate before checking
filterMatches, causing false-positive LAZY_URL_THIN_RESOURCE logs; change the
logic so you only call _warnIfThin when a thin resource actually prevented
resolution—either move the _warnIfThin call to run only after a candidate
matched the filter and failed due to thinness, or (preferably) remove
per-candidate warnings and call _warnIfThin once after the loop when no
candidate matched (i.e., resolution failed). Adjust the loop around candidates,
filterMatches, and this._formatPath so that _warnIfThin is only invoked in the
“no match” fallback path (or immediately for the single router whose filter
matched but could not format), referencing the methods _warnIfThin and
filterMatches and the candidates iteration to locate the change.
ref https://linear.app/ghost/issue/HKG-1817
Part of [Moving routes.yaml off of disk → Introduce lazy loading]. Adds the lazy
URL resolution service: it resolves URLs and ownership on demand from the
registered router configs instead of the eager service's precomputed
resource→URL map, so boot does no URL work and memory stays flat regardless of
content volume. It sits behind the existing
UrlServiceFacade.Not wired in
services/url/index.jsstill constructs the facade eager-only, so this PRchanges no production behaviour. Wiring + the three-mode (eager / lazy /
compare) facade and the shadow-comparison framework land in the next PR
(HKG-1818–1820).
Structure (5 commits, bottom-up)
/:slug/,/:year/:month/:slug/); segment-by-segment, no regex backtracking, gracefulon malformed %-escapes.
page:transformer, the type map, compile/match helpers). Mirrors the eager
generator's rules so lazy output matches the baseline during the
shadow-comparison window — a deliberate copy, not a shared module (eager is
the comparison oracle and is removed in HKG-1824, so coupling them would let a
shared bug hide identical wrong output on both sides).
service's visibility rules (published-only, public scoped tag/author models);
owns the
FindResourcetype so the dependency stays one-directional.pure, the only DB access is
resolveUrlvia the injected hook. Carries thethin-resource warning so operators can alert on uninflated callers.
Why decomposed
Splitting the matcher, filter semantics, DB lookup, and orchestration into
separate modules makes each concern independently unit-testable and gives the
lazy service one clear home for its routing rules.
Tests
the tag/author
shouldHavePostsgate, andwithRelatedrelation population.