Skip to content

Fixed slow database migration on large tables#27666

Merged
sagzy merged 1 commit intomainfrom
intelligent-lamarr-47a06c
May 5, 2026
Merged

Fixed slow database migration on large tables#27666
sagzy merged 1 commit intomainfrom
intelligent-lamarr-47a06c

Conversation

@sagzy
Copy link
Copy Markdown
Contributor

@sagzy sagzy commented May 5, 2026

ref https://linear.app/ghost/issue/BER-3565
ref https://linear.app/ghost/issue/BER-3542
ref #27615
ref https://ghost.slack.com/archives/CJBJ3MZFD/p1777972226248739

What changed

The migration 2026-04-29-12-13-44-add-batch-id-to-members-status-events.js now passes {algorithm: 'auto'} to createAddColumnMigration, so the emitted DDL is ALTER TABLE members_status_events ADD COLUMN batch_id... omits the default ALGORITHM=COPY.

Why

createAddColumnMigration delegates to commands.addColumn, which on MySQL appends ALGORITHM=COPY by default unless {algorithm: 'auto'} is explicitly passed. ALGORITHM=COPY rebuilds the entire table: it creates a new copy with the new column, copies every row across, then swaps it in. The cost is O(rows), and on members_status_events — which gets one row per member status change — this scales with the size of the member base.

We hit this on a staging site with ~6M members_status_events rows on boot:

[MIGRATE] Migration error:  alter table `members_status_events` add `batch_id` varchar(24) null, algorithm=copy - Connection lost: The server closed the connection.

ALGORITHM=AUTO lets MySQL pick the best strategy. For a nullable trailing-column add on MySQL 8.0+ it picks INSTANT, which is a metadata-only change and effectively constant-time regardless of table size.

This isn't a new conclusion — #25018 (Added utm_ columns to events tables) introduced the algorithm option for exactly this reason after observing 60–250s migrations on a 2M-row members_created_events table vs. 2–3s when MySQL was allowed to choose. All the column adds in that PR (and the canonical add-utm-fields.js migration) use {algorithm: 'auto'}.

Why we're modifying an existing migration

We don't normally do this — .agents/skills/create-database-migration/rules.md states migrations are immutable once on main. We're making an exception here because:

  1. A new migration cannot fix the original. The bad ALTER TABLE runs first when production upgrades through 6.36/. Adding a fix in a later folder doesn't change that.
  2. The schema is unchanged. Only the MySQL execution hint differs. schema.js is untouched, so the integrity hash isn't bumped, and the end-state column definition is identical.
  3. The migration helper is idempotent. createColumnMigration checks hasColumn and skips if the column already exists — so sites that already ran the original version on 6.36.0 simply no-op when redeployed against this commit. They don't see the migration twice and there's no schema drift between sites that ran the old version and sites that will run the new one.

In short: existing sites that already migrated are unaffected; sites that haven't yet (including production) get the fast path.

Test plan

  • Verify migration runs successfully on a staging site with a large members_status_events table
  • Verify migration is a no-op on a site where batch_id already exists

🤖 Generated with Claude Code

ref https://linear.app/ghost/issue/BER-3565
ref https://linear.app/ghost/issue/BER-3542

Without {algorithm: 'auto'}, createAddColumnMigration emits
ALGORITHM=COPY by default, which rebuilds the entire table. On a
staging site with ~6M members_status_events rows this was slow enough
to drop the MySQL connection mid-migration and prevent boot:

  [MIGRATE] Migration error: alter table members_status_events
  add batch_id varchar(24) null, algorithm=copy
  - Connection lost: The server closed the connection.

Passing {algorithm: 'auto'} lets MySQL pick INSTANT for this nullable
trailing-column add, making it metadata-only and effectively
constant-time regardless of row count. Same conclusion reached in
2ea046c for the utm_ events column adds.

Modifying the existing migration (rather than adding a follow-up) is
required because the bad ALTER would otherwise run first on production
before any later fix could take effect. The change is safe: schema.js
is unchanged, the helper is idempotent, and sites that already ran the
original version no-op on next deploy.
@github-actions github-actions Bot added the migration [pull request] Includes migration for review label May 5, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 5, 2026

It looks like this PR contains a migration 👀
Here's the checklist for reviewing migrations:

General requirements

  • ⚠️ Tested performance on staging database servers, as performance on local machines is not comparable to a production environment
  • Satisfies idempotency requirement (both up() and down())
  • Does not reference models
  • Filename is in the correct format (and correctly ordered)
  • Targets the next minor version
  • All code paths have appropriate log messages
  • Uses the correct utils
  • Contains a minimal changeset
  • Does not mix DDL/DML operations
  • Tested in MySQL and SQLite

Schema changes

  • Both schema change and related migration have been implemented
  • For index changes: has been performance tested for large tables
  • For new tables/columns: fields use the appropriate predefined field lengths
  • For new tables/columns: field names follow the appropriate conventions
  • Does not drop a non-alpha table outside of a major version

Data changes

  • Mass updates/inserts are batched appropriately
  • Does not loop over large tables/datasets
  • Defends against missing or invalid data
  • For settings updates: follows the appropriate guidelines

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 5, 2026

Walkthrough

A database migration file for adding a batch_id column to the members_status_events table was updated. The migration's createAddColumnMigration function call now includes an additional options parameter {algorithm: 'auto'}. The column definition itself, which specifies a string type with a maximum length of 24 characters and nullable set to true, remains unchanged. This modification affects how the database executes the column addition operation.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Title check ✅ Passed The title accurately and concisely summarizes the main change: optimizing a database migration to prevent timeouts on large tables by adding the algorithm=auto option.
Description check ✅ Passed The pull request description is directly related to the changeset, explaining the specific migration modification, the rationale behind the MySQL algorithm change, and the impact on performance.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch intelligent-lamarr-47a06c

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@sagzy sagzy added the preview Deploy a PR preview environment label May 5, 2026
@Ghost-Slimer Ghost-Slimer temporarily deployed to pr-preview-27666 May 5, 2026 10:24 Destroyed
@sagzy sagzy changed the title Used algorithm=auto for adding batch_id to members_status_events Fixed algorithm for adding batch_id to members_status_events for large tables May 5, 2026
@sagzy sagzy changed the title Fixed algorithm for adding batch_id to members_status_events for large tables Fixed slow database migration on large tables May 5, 2026
@sagzy sagzy merged commit 29dd749 into main May 5, 2026
49 checks passed
@sagzy sagzy deleted the intelligent-lamarr-47a06c branch May 5, 2026 12:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

migration [pull request] Includes migration for review preview Deploy a PR preview environment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants