Skip to content

dm: ignore ghost/trash table DDL without gh-ost comment when online-ddl is enabled#12565

Open
Takashi-kun wants to merge 1 commit intopingcap:masterfrom
Takashi-kun:master
Open

dm: ignore ghost/trash table DDL without gh-ost comment when online-ddl is enabled#12565
Takashi-kun wants to merge 1 commit intopingcap:masterfrom
Takashi-kun:master

Conversation

@Takashi-kun
Copy link
Copy Markdown

@Takashi-kun Takashi-kun commented Mar 19, 2026

What problem does this PR solve?

Issue Number: close #12564

When a gh-ost migration leaves behind a ghost table (e.g. _t_gho), an operator may manually rename it for cleanup:

RENAME TABLE _t_gho TO will_be_deleted_t_gho;

Because DM identifies gh-ost tables purely by table-name regex, it treats any GhostTable → RealTable rename as a gh-ost cutover and tries to replay stored ALTER TABLE statements against the destination. Since that table doesn't exist in TiDB, the apply fails and blocks replication.

What is changed and how it works?

gh-ost embeds a /* gh-ost */ comment in all SQL statements it issues — CREATE TABLE, ALTER TABLE, RENAME TABLE, DROP TABLE, etc.:

create /* gh-ost */ table `schema`.`_t_gho` like `schema`.`t`
alter /* gh-ost */ table `schema`.`_t_gho` add column `n` int
rename /* gh-ost */ table `schema`.`t` to `schema`.`_t_del`, `schema`.`_t_gho` to `schema`.`t`

This comment is preserved verbatim in the MySQL binlog QUERY event.

Since ghost/trash tables (_gho, _ghc, _del) never exist in TiDB (DM does not replicate their creation downstream), any DDL on these tables that lacks the /* gh-ost */ comment is a manual operation and must be silently ignored. In processOneDDL, before calling Apply(), we now check the first source table type and the presence of the comment. Without the comment on a ghost/trash table, the event returns nil and replication continues unblocked.

Why not pass the statement through?
The initial approach returned the statement as-is for manual renames. But since ghost tables never exist in TiDB, executing RENAME TABLE _t_gho TO will_be_deleted downstream would also fail. Returning nil (ignore) is the correct behaviour.

Check List

Tests

  • Unit test (TestGhostTableDDLWithoutGhostComment)
    • Sub-case A: gh-ost ALTER (with /* gh-ost */ comment) is stored; cutover RENAME (with comment) replays the stored ALTER — existing behaviour preserved
    • Sub-case B: manual RENAME without comment — silently ignored, no DDL emitted, no error

Questions

Will it cause performance regression or break compatibility?

No. The added check only runs when onlineDDL is enabled (online-ddl: true). The regexp match is O(n) on the raw SQL string and is negligible compared to the rest of DDL processing.

Do you need to update user documentation, design documentation or monitoring documentation?

No.

Release note

Fixed DM incorrectly replaying stored gh-ost ALTER TABLE DDLs when a ghost/trash table is manually operated on without the gh-ost comment. Such events are now silently ignored since ghost tables never exist in the downstream.

@ti-chi-bot ti-chi-bot bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/needs-triage-completed area/dm Issues or PRs related to DM. labels Mar 19, 2026
@ti-chi-bot
Copy link
Copy Markdown
Contributor

ti-chi-bot bot commented Mar 19, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign benjamin2037 for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added contribution This PR is from a community contributor. first-time-contributor Indicates that the PR was contributed by an external member and is a first-time contributor. labels Mar 19, 2026
@ti-chi-bot
Copy link
Copy Markdown
Contributor

ti-chi-bot bot commented Mar 19, 2026

Hi @Takashi-kun. Thanks for your PR.

I'm waiting for a pingcap member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot ti-chi-bot bot added the needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. label Mar 19, 2026
@ti-chi-bot
Copy link
Copy Markdown
Contributor

ti-chi-bot bot commented Mar 19, 2026

Welcome @Takashi-kun!

It looks like this is your first PR to pingcap/tiflow 🎉.

I'm the bot to help you request reviewers, add labels and more, See available commands.

We want to make sure your contribution gets all the attention it needs!



Thank you, and welcome to pingcap/tiflow. 😃

@pingcap-cla-assistant
Copy link
Copy Markdown

pingcap-cla-assistant bot commented Mar 19, 2026

CLA assistant check
All committers have signed the CLA.

@ti-chi-bot ti-chi-bot bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 19, 2026
@gemini-code-assist
Copy link
Copy Markdown

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves an issue where DM would erroneously attempt to replay stored DDLs during manual RENAME TABLE operations involving gh-ost tables. By inspecting the original SQL for the presence of a specific gh-ost comment, the system can now correctly distinguish between automated gh-ost cutovers and operator-initiated cleanups, ensuring that only legitimate gh-ost cutovers trigger DDL replay and allowing manual renames to proceed without interference.

Highlights

  • Problem Addressed: DM was incorrectly replaying stored ALTER TABLE DDLs when a gh-ost table was manually renamed to a real table without the /* gh-ost */ comment, leading to replication failures.
  • Solution Implemented: Introduced a check within processOneDDL to identify RENAME TABLE statements involving ghost-to-real table renames. If the original SQL for such a rename does not contain the /* gh-ost */ comment, it is now treated as a manual operation and passed through as-is, preventing DM from attempting to replay stored ALTER TABLE statements.
  • Comment Detection: A new regular expression, ghostCommentRegexp, was added to detect the presence of the /* gh-ost */ marker in the originSQL of RENAME TABLE statements.
  • Test Coverage: A new unit test, TestGhostTableRenameWithoutGhostComment, was added to cover both scenarios: gh-ost cutover renames with the comment (preserving existing behavior) and manual renames without the comment (applying the new logic).

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly fixes an issue where manual renames of gh-ost ghost tables were incorrectly treated as gh-ost cutovers, leading to DDL replay errors. The approach of checking for the /* gh-ost */ comment in the RENAME TABLE statement is a solid and reliable way to distinguish between automated cutovers and manual operations. The implementation is clean, and the addition of TestGhostTableRenameWithoutGhostComment provides good test coverage for both the new and existing logic paths. I have one suggestion to improve code readability.

dm/syncer/ddl.go Outdated
Comment on lines +1283 to +1289
if _, ok := ddlInfo.stmtCache.(*ast.RenameTableStmt); ok &&
len(ddlInfo.sourceTables) == parserpkg.SingleRenameTableNameNum &&
ddl.onlineDDL.TableType(ddlInfo.sourceTables[0].Name) == onlineddl.GhostTable &&
ddl.onlineDDL.TableType(ddlInfo.sourceTables[1].Name) == onlineddl.RealTable &&
!ghostCommentRegexp.MatchString(qec.originSQL) {
return []string{ddlInfo.originDDL}, nil
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This if condition is a bit long and combines several checks, which can be hard to parse. To improve readability, consider breaking down the logic into a few named boolean variables that describe each part of the condition.

Suggested change
if _, ok := ddlInfo.stmtCache.(*ast.RenameTableStmt); ok &&
len(ddlInfo.sourceTables) == parserpkg.SingleRenameTableNameNum &&
ddl.onlineDDL.TableType(ddlInfo.sourceTables[0].Name) == onlineddl.GhostTable &&
ddl.onlineDDL.TableType(ddlInfo.sourceTables[1].Name) == onlineddl.RealTable &&
!ghostCommentRegexp.MatchString(qec.originSQL) {
return []string{ddlInfo.originDDL}, nil
}
if _, ok := ddlInfo.stmtCache.(*ast.RenameTableStmt); ok {
isGhostToRealRename := len(ddlInfo.sourceTables) == parserpkg.SingleRenameTableNameNum &&
ddl.onlineDDL.TableType(ddlInfo.sourceTables[0].Name) == onlineddl.GhostTable &&
ddl.onlineDDL.TableType(ddlInfo.sourceTables[1].Name) == onlineddl.RealTable
isManualRename := isGhostToRealRename && !ghostCommentRegexp.MatchString(qec.originSQL)
if isManualRename {
return []string{ddlInfo.originDDL}, nil
}
}

@Takashi-kun Takashi-kun marked this pull request as ready for review March 19, 2026 07:51
@ti-chi-bot ti-chi-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 19, 2026
@Takashi-kun Takashi-kun changed the title dm: fix spurious online-DDL replay for ghost→real RENAME without gh-ost comment dm: ignore ghost/trash table DDL without gh-ost comment when online-ddl is enabled Mar 20, 2026
…dl is enabled

When a gh-ost migration leaves behind ghost/trash tables (_gho, _ghc, _del),
operators may manually rename or drop them for cleanup. Because DM identifies
these tables purely by table-name regex, it was treating any GhostTable→RealTable
RENAME as a gh-ost cutover and replaying stored ALTER TABLE DDLs against the
destination, blocking replication.

gh-ost embeds a /* gh-ost */ comment in every SQL statement it issues (CREATE,
ALTER, RENAME, DROP). Since ghost/trash tables never exist in TiDB (DM never
replicates their creation), any DDL on these tables without the gh-ost comment
is a manual operation and must be silently ignored.

In processOneDDL, before calling Apply(), check whether the first source table
is a ghost or trash table and the raw binlog SQL lacks the /* gh-ost */ marker.
If so, return nil — no downstream DDL is emitted and replication continues
unblocked.

close pingcap#12564

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/dm Issues or PRs related to DM. contribution This PR is from a community contributor. do-not-merge/needs-triage-completed first-time-contributor Indicates that the PR was contributed by an external member and is a first-time contributor. needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

online-ddl: manual cleanup of gh-ost tables incorrectly triggers ALTER TABLE replay

1 participant