Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Online DDL: report rows_copied for migration (initial support in gh-ost) #8255

Merged
merged 2 commits into from
Jun 6, 2021

Conversation

shlomi-noach
Copy link
Contributor

@shlomi-noach shlomi-noach commented Jun 3, 2021

Description

This PR introduces a new column rows_copied in _vt.schema_migrations table and in the output for SHOW VITESS_MIGRATIONS ... queries.

rows_copied is either an exact or close estimate of the number of rows copied from the original table into the ghost table during a migration.

  • If the migration is successfully complete then rows_copied is the exact number of rows.
  • If the migration is failed then rows_copied value can be slightly below the actual number of rows copied, depending on the type of failure.

The value gets populated in this PR by gh-ost migrations. Work on #8015 will make the value available in online (vreplication) migrations, too.

There's an endtoend test that verifies the behavior in a sharded cluster.

Related Issue(s)

Checklist

  • Tests were added or are not required
  • Documentation was added or is not required

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
@shlomi-noach shlomi-noach added Type: Enhancement Logical improvement (somewhere between a bug and feature) Component: Query Serving labels Jun 3, 2021
@shlomi-noach shlomi-noach requested review from deepthi and a team June 3, 2021 11:57
// we don't update the table value.
return nil
}
query, err := sqlparser.ParseAndBind(sqlUpdateMigrationRowsCopied,
Copy link
Collaborator

@vmg vmg Jun 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only concern is that this ParseAndBind could be optimized by only parsing the statement once and re-binding on each update, but I see the same pattern is used throughout the executor, so optimizing only for updateRowsCopied seems wrong. I'll revisit this down the road and refactor the whole file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This statement is expected to run once or twice a minute (when a migration is running). All over the Online DDL code, I don't try to optimize query performance. Do feel free to rewrite if you want to have uniform coding style through the code, but optimization-wise this is not an interesting place IMHO.

@@ -2703,8 +2725,12 @@ func (e *Executor) OnSchemaMigrationStatus(ctx context.Context, uuidParam, statu
if eta, err := strconv.ParseInt(etaParam, 10, 64); err == nil {
etaSeconds = eta
}
var rowsCopied int64 = 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird that the linter is not complaining about this. I thought we had a lint for not zero-initializing variables that are already zero-initialized.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, weird. I'll clean it nonetheless.

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
@shlomi-noach shlomi-noach merged commit c96003e into vitessio:main Jun 6, 2021
@shlomi-noach shlomi-noach deleted the online-ddl-rows-copied branch June 6, 2021 08:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Query Serving Type: Enhancement Logical improvement (somewhere between a bug and feature)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants