🎨 Improved admin Comments page load time on large comment datasets#28734
Conversation
|
It looks like this PR contains a migration 👀 General requirements
Schema changes
Data changes
|
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (6)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (3)
WalkthroughThe comments table schema index list was expanded with indexes on 🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
I have also created a PR that contains all four new indexes combined with a query refactor under my fork: jwmarshall#1 I'm working out a test environment to show the difference in performance gains. |
69aa198 to
fdb14ea
Compare
|
| Command | Status | Duration | Result |
|---|---|---|---|
nx run ghost:test:ci:integration |
✅ Succeeded | 2m 30s | View ↗ |
nx run ghost:test:integration |
✅ Succeeded | 2m 27s | View ↗ |
nx run ghost:test:legacy |
✅ Succeeded | 2m 57s | View ↗ |
nx run ghost-admin:test |
✅ Succeeded | 2m 51s | View ↗ |
nx run ghost:test:e2e |
✅ Succeeded | 2m 15s | View ↗ |
nx run @tryghost/admin:build |
✅ Succeeded | 1m 56s | View ↗ |
nx run-many --target=build --projects=tag:publi... |
✅ Succeeded | 2s | View ↗ |
nx run-many -t lint -p ghost-admin,ghost |
✅ Succeeded | 39s | View ↗ |
Additional runs (3) |
✅ Succeeded | ... | View ↗ |
💡 Verify your cache is correct by running tasks in a sandbox. Read docs ↗
☁️ Nx Cloud last updated this comment at 2026-06-26 17:15:43 UTC
9e3c79f to
b28b3a9
Compare
no ref - The admin "all comments" moderation page (getAdminAllComments) orders by created_at and emits the count.replies / count.direct_replies relations as per-row correlated subqueries. With no supporting indexes the optimizer full-scans the comments table once per returned row and filesorts the whole table for the ordering, so the page took ~90s to load on sites with very large numbers of comments (especially ones skewed toward top-level comments). - Added four additive secondary indexes on comments — created_at, status, (in_reply_to_id, status) and (parent_id, in_reply_to_id, status) — covering the ORDER BY list, the COUNT(DISTINCT) pagination count, and the reply-count subqueries, cutting the page's DB time from ~90s to tens of milliseconds. - Chose indexes over a query refactor to keep the change minimal and backward-compatible; they are purely additive, so the migration's down drops them without disturbing the existing foreign-key indexes.
- InnoDB silently consolidates the auto-created single-column FK indexes on parent_id and in_reply_to_id into the new composite indexes once they exist (because the composites have those columns as leading prefix), so dropping the composites in down() failed with ERROR 1553 'needed in a foreign key constraint' - down() now re-adds the single-column FK indexes before dropping the composites, restoring the pre-migration shape cleanly - comment block corrected to describe what each index actually does (status is for status-filtered pagination COUNTs, not the unfiltered one which picks created_at) and to call out the InnoDB consolidation
- the replies hasMany relation ordered by created_at ASC only; replies inserted in a tight loop (admin Comments e2e tests) share a millisecond on created_at, so the tie-break was implementation-defined - the new created_at index in the comments-moderation migration exposed this by giving the planner a covering index for the sort, which flipped two reply orderings against the test's implicit assumption about insertion order - pinned ordering to id ASC as a deterministic tie-break, restoring the insertion-order behavior the tests and any client paginating by id rely on
b28b3a9 to
16661eb
Compare
|
Got tripped up on a couple minor version releases + flaky acceptance tests. |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #28734 +/- ##
=======================================
Coverage 74.37% 74.37%
=======================================
Files 1559 1559
Lines 135002 135006 +4
Branches 16406 16414 +8
=======================================
+ Hits 100402 100414 +12
- Misses 33573 33596 +23
+ Partials 1027 996 -31
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|

Hello Ghost team 👋 - This is my first PR and I've tried to follow all repository guidelines. For context, I'm working on a migration from Wordpress that has almost 400k comments. Once imported I noticed performance issues for some pages and queries for comments. This PR attempts to add some tables indexes to improve performance.
Thanks in advance!
--
On sites with a large number of comments, the admin Comments moderation page ("all comments") becomes effectively unusable. It took ~90s to load on a real-world dataset of ~390k comments. The public per-post comment widget is unaffected; this is purely the admin
getAdminAllCommentspath.The page query orders by
created_atand emits thecount.replies/count.direct_repliesrelations as per-row correlated subqueries. With no supporting indexes, the optimizer full-scans thecommentstable once per returned row and filesorts the whole table for the ordering. The problem is worst when the data is skewed toward top-level comments (parent_id IS NULL), where the column is too low-cardinality for the optimizer to trust the existing single-column FK index, but it slows down any large comment table.What does it do?
Adds four additive secondary indexes to the
commentstable, declared inschema.jsand applied via a non-transactional migration:comments(created_at)ORDER BY created_at DESClist (removes the filesort)comments(status)COUNT(DISTINCT id)pagination countcomments(in_reply_to_id, status)count.direct_repliessubquery onin_reply_to_idcomments(parent_id, in_reply_to_id, status)count.repliesplus theparent_id + in_reply_to_id IS NULLhalf ofcount.direct_repliesThe 3-column index covers both the
parent_id-only andparent_id + in_reply_to_id IS NULLsubqueries, so a separate(parent_id, status)is not needed. On the test dataset this cut the page's DB time from ~90s to tens of milliseconds.The indexes are purely additive —
parent_idandin_reply_to_idkeep their own foreign-key indexes — so the migration'sdowndrops them without any FK index re-add dance.Why is this something Ghost users or developers need?
Comment moderation on any high-volume Ghost site is currently slow to the point of timing out. This is a low-risk, backward-compatible fix (no schema/data. changes beyond indexes, no API or behaviour changes) that makes the moderation page usable at scale.
Notes / trade-offs
comments(status)is low-cardinality: it still beats a clustered full scan for theCOUNT(DISTINCT)because the secondary index is far narrower than the row (which carries thehtmllongtext). It's the most droppable of the four.Testing
integrity.test.js).upcreates all four indexes, is idempotent, anddownreverses cleanly.A note on that last checkbox: this is a pure index/migration change, covered by the schema integrity test and manual migration verification, but there's no automated test proving the performance win (that needs a large seeded dataset and EXPLAIN ANALYZE, which isn't practical as a unit test). I left it unchecked to be honest.