-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OnlineDDL: "cancel-all" command to cancel all pending migrations in keyspace #7099
OnlineDDL: "cancel-all" command to cancel all pending migrations in keyspace #7099
Conversation
The cancel-all command cancels all pending (queued, ready, running) migrations in a keyspace Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
…-all Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Documentation PR: vitessio/website#622 |
I think we might also want to cancel any ongoing migration as well. If it is realised that it cannot run at the moment may be because of wrong query or time of change is wrong or it can be grouped better with newer change, etc. I suggest to add a parameter to this command which can be force cancel. |
This already exists: https://vitess.io/docs/user-guides/schema-changes/managed-online-schema-changes/#cancelling-a-migration e.g. will cancel a speciic running migration, or, if it's not running yet, will ensure it doesn't run.
|
May be I miss read the description. |
So, we already have a per-migration |
time.Sleep(10 * time.Second) | ||
checkCancelAllMigrations(t, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this sleep time be reduced. It just delays the e2e test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately all the online-DDL flow works asynchronously, with check intervals. In this endtoend
tests intervals are reduced to just a few secs, and still you have to give time for the check, for gh-ost
, for callback...
This is why I extracted online-ddl to be on its own CI shard
; it's still shorter than our other endtoend
tests so right now I feel like it's not worth investing the optimization effort.
func (e *Executor) cancelPendingMigrations(ctx context.Context) (result *sqltypes.Result, err error) { | ||
parsed := sqlparser.BuildParsedQuery(sqlSelectPendingMigrations, "_vt") | ||
r, err := e.execQuery(ctx, parsed.Query) | ||
if err != nil { | ||
return result, err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there can be race condition right between running and completed Migration. For my understanding, what will happen when a running migration just got completed after it is selected in pending migrations Or when cancelPendingMigration is in progress?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're correct, and this race condition is unavoidable. If the user chooses to "cancel" or "cancel-all" just when the migration is about to complete, it's possible that the migration will win the race and complete; in which case the "cancel" is a no-op and nothing happens (the migration is just complete)
This PR adds a
vtctl OnlineDDL <keyspace> cancel-all
command.The command is pretty strong: it will cancel any pending migration for the given keyspace. A migration is
pending
if it's running or about to run, ie in one of these states:queued
ready
running
I have a specific use case for
cancel-all
, where there's need to re-evaluate the migrations for a given keyspace; I can think of another use case: some BigOperation needs to take place on the keyspace and the operations team wishes to abort pending migrations.General context: #6926