Feature Description
An Online DDL ALTER TABLE completes by cutting over from the original table to the shadow table. This final step involves holding table locks, and has a timeout.
On very busy tables, the operation will time out. The Online DDL scheduler will reattempt after 1 minute. Under a sustained load this could mean repeated attempts over hours at 1 minute intervals. This is both wasteful and harmful. It is harmful because 15 seconds of every minute are spent attempting to acquire locks, which interferes with traffic even more.
We want to offer two opposed changes at the same time:
A backoff mechanism: first retry in 1min, then in, say, 5min, then 10min, 30min, 1hr, and keep at 1hr intervals (exact values subject to change).
A way to require a brute-force cut-over. This involves:
A pre-determined brute-force cut-over duration: counting from the moment of the first cut-over attempt, after the given duration the Online DDL attempts a brute-force cut-over (described below)
And/or a SQL command such as ALTER VITESS_MIGRATION ... DO THE THING AND BRUTE FORCE CUT OVER NOW PLEASE
Brute-force cut-over is implemented by identifying any queries and transactions holding locks on the migrated table. When in brute-force mode, the cut-over mechanism attempts to kill the related queries/connections.
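To make the two proposed changes concrete, here is a minimal sketch of how a retry schedule and a forced cut-over threshold could interact. It is not the scheduler's implementation: the names `BACKOFF_STEPS`, `next_retry_interval` and `should_force_cut_over` are hypothetical, and the intervals are just the example values listed above.

```python
from datetime import timedelta
from typing import Optional

# Example values from the proposal (1min, 5min, 10min, 30min, 1hr).
# These are assumptions for illustration, not final values.
BACKOFF_STEPS = [timedelta(minutes=m) for m in (1, 5, 10, 30, 60)]


def next_retry_interval(failed_attempts: int) -> timedelta:
    """Return how long to wait after `failed_attempts` failed cut-overs.

    The schedule walks through BACKOFF_STEPS and then stays at the
    last (largest) interval.
    """
    if failed_attempts < 1:
        raise ValueError("failed_attempts must be at least 1")
    index = min(failed_attempts, len(BACKOFF_STEPS)) - 1
    return BACKOFF_STEPS[index]


def should_force_cut_over(
    elapsed_since_first_attempt: timedelta,
    force_after: Optional[timedelta],
) -> bool:
    """Decide whether the next attempt should be a brute-force cut-over.

    `force_after` is the hypothetical pre-determined duration, counted
    from the first cut-over attempt; None means the feature is off.
    """
    return force_after is not None and elapsed_since_first_attempt >= force_after
```

With this shape, the backoff reduces how often locks are attempted, while the threshold bounds how long a migration can keep waiting before it escalates.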
Industry solutions typically attempt to kill any non-replication long-running queries. We want to be smart and only affect relevant queries. We also want to identify transactions that hold locks on the table without currently running any query on it, and possibly without running any query at all at the moment.
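The selection rule described above can be sketched as a filter over lock-holder records. This is only an illustration under assumptions: the `LockHolder` record and `connections_to_kill` function are hypothetical, and the records are assumed to come from somewhere like MySQL's `performance_schema.metadata_locks` joined to `performance_schema.threads`. The `system user` exclusion reflects how MySQL labels replication threads in the process list.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class LockHolder:
    """Hypothetical record: one connection holding a lock on one table."""
    connection_id: int
    user: str
    table: str
    current_query: Optional[str]  # None when the transaction is idle


def connections_to_kill(
    holders: Iterable[LockHolder],
    migrated_table: str,
    protected_users: FrozenSet[str] = frozenset({"system user"}),
) -> List[int]:
    """Pick only connections that hold locks on the migrated table.

    Idle transactions (current_query is None) are included on purpose:
    they block the cut-over even though they run no query right now.
    Connections locking other tables, and protected users such as
    replication threads, are left alone.
    """
    targets = set()
    for holder in holders:
        if holder.table != migrated_table:
            continue
        if holder.user in protected_users:
            continue
        targets.add(holder.connection_id)
    return sorted(targets)
```

For example, given a long-running query on an unrelated table, an idle transaction holding a lock on the migrated table, and a replication thread, only the idle transaction's connection is selected.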
Use Case(s)
Online DDL on busy systems