Feature Request: Online DDL cut-over backoff + forced completion #14530

shlomi-noach · 2023-11-16T08:30:06Z

Feature Description

An Online DDL ALTER TABLE completes by cutting over from the original table to the shadow table. This final step involves holding table locks, and has a timeout.

On very busy tables, the cut-over can time out. The Online DDL scheduler then reattempts after 1 minute. Under sustained load this could mean repeated attempts over hours, at 1 minute intervals. This is both wasteful and harmful. It is harmful because, in every minute, 15sec are spent attempting to acquire locks, which interferes with traffic even more.

We want to offer two opposed changes at the same time:

A backoff mechanism: first retry in 1min, then in, say, 5min, then 10min, 30min, 1hr, and keep at 1hr intervals (precise values subject to change).
A way to require a brute-force cut-over. This involves:

A pre-determined brute-force cut-over duration: counting from the moment of the first cut-over attempt, after a given duration the Online DDL attempts a brute-force cut-over (described below).
And/or a SQL command such as ALTER VITESS_MIGRATION ... DO THE THING AND BRUTE FORCE CUT OVER NOW PLEASE
The brute-force cut-over is implemented by identifying any queries and transactions holding locks on the migrated table. When in brute-force mode, the cut-over mechanism attempts to kill the related queries/connections.

Industry solutions typically attempt to kill any long-running non-replication queries. We want to be smarter and only affect relevant queries. We also want to identify transactions that hold locks on the table while not running any query on that table at the moment, or possibly not running any query at all.
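The selection rule described here can be sketched as a filter over a snapshot of connections. This is an illustration only: the `connInfo` type and the `killTargets` function are hypothetical, and the sketch assumes the lock and query information has already been collected from the database by some other means.

```go
package main

import "fmt"

// connInfo is a hypothetical snapshot of one database connection.
type connInfo struct {
	ID           int64
	LockedTables []string // tables on which the connection's open transaction holds locks
	QueryTables  []string // tables referenced by the currently running query; empty when idle
}

// contains reports whether s appears in list.
func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// killTargets returns the IDs of connections relevant to a brute-force
// cut-over of the given table: those running a query on the table, and those
// whose transaction holds locks on it even if they are idle or busy with
// another table. Connections that never touch the table are left alone,
// however long they have been running.
func killTargets(conns []connInfo, table string) []int64 {
	var ids []int64
	for _, c := range conns {
		if contains(c.QueryTables, table) || contains(c.LockedTables, table) {
			ids = append(ids, c.ID)
		}
	}
	return ids
}

func main() {
	conns := []connInfo{
		{ID: 1, QueryTables: []string{"orders"}},                                      // running a query on the table
		{ID: 2, LockedTables: []string{"orders"}},                                     // idle transaction holding a lock
		{ID: 3, QueryTables: []string{"customers"}},                                   // unrelated long-running query
		{ID: 4, LockedTables: []string{"orders"}, QueryTables: []string{"customers"}}, // holds a lock, busy elsewhere
	}
	fmt.Println(killTargets(conns, "orders"))
}
```

In this sketch, connection 3 is never selected because it does not touch the migrated table, which is the difference from the blanket approach of killing all long-running queries.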

Use Case(s)

Online DDL on busy systems

shlomi-noach added the Type: Feature Request and Component: Online DDL labels Nov 16, 2023

shlomi-noach self-assigned this Nov 16, 2023

shlomi-noach mentioned this issue Nov 19, 2023

Online DDL: support migration cut-over backoff and forced cut-over #14546

Merged

4 tasks

shlomi-noach closed this as completed in #14546 Dec 13, 2023
