
Table Space Bloat Due to Gh-ost #1484

@xiehaopeng

Description:
We have observed significant table space bloat after using gh-ost to alter a table. The number of rows is essentially unchanged before and after the alter, but the table space grew by 60%. The amount of expansion depends on the size of a single row in the table: it is most pronounced when a single row is about 4500 bytes.

We believe the primary cause is that gh-ost interleaves "copy rows" and "apply events" while it runs. When INSERT operations happen concurrently, this produces a mix of sequential and out-of-order inserts. This mixed insertion pattern frequently triggers InnoDB's insert-point split strategy. After a split, the leaf nodes contain unused space (page-level fragmentation), which wastes space and inflates the table size.
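To make the suspected mechanism concrete, here is a toy simulation in Python. It is only a sketch under simplifying assumptions, not InnoDB's actual split logic: a leaf page holds 3 rows (roughly what fits when ~4500-byte rows are stored in a default 16 KiB page), a full page splits at the insert point only when the new key goes after the page's last key, and otherwise it splits at the midpoint. The names `insert`, `simulate`, and `event_every` are made up for this illustration.

```python
import bisect

PAGE_CAPACITY = 3  # assumption: ~4500-byte rows in a 16 KiB page


def insert(pages, key):
    """Insert key into a list of sorted leaf pages (toy model)."""
    if not pages:
        pages.append([key])
        return
    # Target page: first page whose largest key is >= key, else the last page.
    idx = len(pages) - 1
    for i, page in enumerate(pages):
        if page[-1] >= key:
            idx = i
            break
    page = pages[idx]
    if len(page) < PAGE_CAPACITY:
        bisect.insort(page, key)
    elif key > page[-1]:
        # Append at page end: split at the insert point, old page stays full.
        pages.insert(idx + 1, [key])
    else:
        # Insert inside a full page: split in the middle, both halves half-empty.
        merged = sorted(page + [key])
        half = len(merged) // 2
        pages[idx:idx + 1] = [merged[:half], merged[half:]]


def fill_factor(pages):
    """Fraction of row slots in use across all leaf pages."""
    return sum(len(p) for p in pages) / (len(pages) * PAGE_CAPACITY)


def simulate(n_rows, event_every):
    """Copy keys 1..n_rows in order; every `event_every` copied rows,
    apply one concurrent INSERT with a key above the copied range."""
    pages = []
    next_new_key = n_rows + 1
    for key in range(1, n_rows + 1):
        insert(pages, key)
        if event_every and key % event_every == 0:
            insert(pages, next_new_key)
            next_new_key += 1
    return pages


if __name__ == "__main__":
    print("copy only:     ", fill_factor(simulate(900, 0)))
    print("copy + inserts:", fill_factor(simulate(900, 10)))
```

In this model, a pure sequential copy leaves every page full, while even infrequent high-key inserts make the copied rows land in the middle of a page and force midpoint splits, so the fill factor drops noticeably. The real numbers depend on InnoDB internals that this sketch does not model.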

Steps to Reproduce:

  • First, create a large mock table of 50GB, with the size of a single row kept at around 4500 bytes.
  • Then rebuild the table with a native MySQL ALTER to defragment it.
  • Then write a script to simulate low-frequency INSERT traffic.
  • Then start the traffic simulation script, and run gh-ost while the script is running.
  • Finally, after gh-ost completes, compare the size of the original table and the temporary table.
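For the last step, one way to compare sizes is to read `data_length` and `index_length` from `information_schema.tables` for both tables and compute the relative growth. The Python sketch below only builds the query and does the arithmetic; running the query is left to whatever client you use. The `%s` placeholders assume a DB-API driver that uses that parameter style, and `expansion_percent` is a helper name made up for this example.

```python
# Query returning the on-disk size figures for two tables in one schema.
# Parameters: schema name, original table name, temporary table name.
SIZE_QUERY = """
SELECT table_name, data_length, index_length, data_free
FROM information_schema.tables
WHERE table_schema = %s AND table_name IN (%s, %s)
"""


def expansion_percent(before_bytes, after_bytes):
    """Relative growth of the table space, in percent."""
    if before_bytes <= 0:
        raise ValueError("before_bytes must be positive")
    return (after_bytes - before_bytes) * 100 / before_bytes


if __name__ == "__main__":
    # Illustrative numbers only: a 50 GiB table that ends up at 80 GiB.
    gib = 1024 ** 3
    print(expansion_percent(50 * gib, 80 * gib))
```

The figures in `information_schema.tables` can be cached estimates, so it may be worth refreshing statistics (for example with `ANALYZE TABLE`) before reading them.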

A Possible Improvement Idea:
Building on the binlog-event-ignoring feature from "feat binlog apply optimization" #1378, we could dynamically expand the maximum boundary value of the row copy and ignore INSERT events with large unique key values, which may avoid this problem.
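A minimal Python sketch of one possible reading of this idea follows. It is not gh-ost code and does not reflect how #1378 is implemented. It assumes a single integer unique key, and that an INSERT event skipped here is picked up later by the row copy because the copy's upper boundary is extended to cover it. All names (`CopyState`, `copied_up_to`, `copy_max`, `handle_insert_event`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CopyState:
    copied_up_to: int  # highest unique key the row copy has already copied
    copy_max: int      # upper boundary the row copy will eventually reach


def handle_insert_event(state, event_key):
    """Return True if the INSERT event should be applied now.

    Events inside the already-copied range are applied as usual.
    Events beyond it are skipped, and the copy boundary is extended
    so the row copy inserts those rows later, in key order.
    """
    if event_key <= state.copied_up_to:
        return True
    state.copy_max = max(state.copy_max, event_key)
    return False


if __name__ == "__main__":
    state = CopyState(copied_up_to=1000, copy_max=5000)
    print(handle_insert_event(state, 800))   # inside copied range
    print(handle_insert_event(state, 5001))  # beyond it: skipped, boundary grows
    print(state.copy_max)
```

With this approach all rows above the copied range would reach the new table through the sequential copy, which is the insertion pattern that keeps pages full.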
