Experimental: automated, scheduled, dependency free online DDL via gh-ost/pt-online-schema-change #6547

shlomi-noach · 2020-08-09T13:13:59Z

This PR (work in progress) introduces zero dependency online schema changes with gh-ost/pt-online-schema-change.

UPDATE: this comment was edited to reflect support for pt-online-schema-change. Originally this PR only supported gh-ost. Wherever you see gh-ost below, consider pt-online-schema-change to apply as well.

TL;DR

User will issue:

alter with 'gh-ost' table example modify id bigint not null;

alter with 'pt-osc' table example modify id bigint not null

or

$ vtctl -topo_implementation etcd2 -topo_global_server_address localhost:2379 -topo_global_root /vitess/global \
    ApplySchema -sql "alter with 'gh-ost' table example modify id bigint unsigned not null" commerce

$ vtctl -topo_implementation etcd2 -topo_global_server_address localhost:2379 -topo_global_root /vitess/global \
    ApplySchema -sql "alter with 'pt-osc' table example modify id bigint unsigned not null" commerce

and vitess will schedule an online schema change operation to run on all relevant shards, then proceed to apply the change via gh-ost on all shards.

While this PR is WIP, this flow works. More breakdown to follow, indicating what's been done and what's still missing.

The ALTER TABLE problem

First, to reiterate the problem: schema changes have always been a problem with MySQL; a straight ALTER is a blocking operation; an ONLINE ALTER is only "online" on the master/primary, but is effectively blocking on replicas. Online schema change tools like pt-online-schema-change and gh-ost overcome these limitations by emulating an ALTER on a "ghost" table, which is populated from the original table, then swapped in its place.

For disclosure, I authored gh-ost's code as part of the database infrastructure team at GitHub.

Traditionally, online schema changes are considered to be "risky". Trigger based migrations add significant load onto the master server, and their cut-over phase is known to be a dangerous point. gh-ost was created at GitHub to address these concerns, and successfully eliminated concerns for operational risks: with gh-ost the load on the master is low, and well controlled, and the cut-over phase is known to cause no locking issues. gh-ost comes with different risks: it applies data changes programmatically, thus the issue of data integrity is of utmost importance. Another note of concern is data traffic: going out from MySQL into gh-ost and back into MySQL (as opposed to all-in MySQL in pt-online-schema-change).

Either way, running an online schema change is typically a manual operation. A human being will schedule the migration, kick it off, monitor it, possibly cut over. In a sharded environment, a developer's request to ALTER TABLE explodes into n different migrations, each of which needs to be scheduled, kicked off, monitored & tracked.

Sharded environments are obviously common for vitess users and so these users feel the pain more than others.

Schema migration cycle & steps

Schema management is a process that begins with the user designing a schema change, and ends with the schema being applied in production. This is a breakdown of schema management steps as I know them:

Design code
Publish changes (pull request)
Review
Formalize migration command (the specific ALTER TABLE or pt-online-schema-change or gh-ost command)
Locate: where in production should this migration run?
Schedule
Execute
Audit/monitor
Cut-over/complete
Cleanup
Notify user
Deploy & merge

What we propose to address

Vitess's architecture uniquely positions it to be able to automate away much of the process. Specifically:

Formalize migration command: turning an ALTER TABLE statement into a gh-ost invocation is super useful if done by vitess, since vitess can not only validate schema/params, but also can provide credentials, identify a throttle-control replica, can instruct gh-ost on how to communicate progress via hooks, etc.
Locate: given schema/table, vitess just knows where the table is located. It knows if the schema is sharded. It knows who the shards are, who the shards' masters are. It knows where to run gh-ost. Last, vitess can tell us which replicas we can use for throttling.
Schedule: vitess is again in a unique position to schedule migrations. The fact someone asks for a migration to run does not mean the migration should start right away. For example, a shard may already be running an earlier migration. Running two migrations at a time is less than ideal, and it's best to wait out the first migration before beginning the second. A scheduling mechanism is useful both for running the migrations in optimal order/sequence and for providing feedback to the user ("your migration is on hold because this and that", or "your migration is 2nd in queue to run").
Execute: vttablet is the ideal entity to run a migration; it can read instructions from topo server and can write progress to topo server. vitess is aware of possible master failovers and can request a re-execute if a migration is interrupted mid-process.
Audit/monitor: vtctld API can offer endpoints to track status of a migration (e.g. "in progress on -80, in queue on 80-"). It may offer progress pct and ETA.
Cut-over/complete: in my experience with gh-ost, the cut-over phase is safe to automate away.
Cleanup: the old table needs to be dropped; vttablet is in an excellent position to automate that away.

What this PR does, and what we expect to achieve

The guideline for this PR is: zero added dependencies; everything must be automatically and implicitly available via a normal vitess installation.

A breakdown:

User facing

This PR enables the user to run an online schema migration (aka online DDL) via:

vtgate: the user connects to vitess with their standard MySQL client, and issues an ALTER WITH 'gh-ost' TABLE ... statement. Notice this isn't valid MySQL syntax -- it's a hint for vitess that we want to run this migration online. vitess still supports synchronous, "normal" ALTER TABLE statements, which IMO should be discouraged.
vtctl: the user runs vtctl ApplySchema -sql "alter with 'gh-ost' table ...".

The response, in both cases, is a migration ID, or a job ID, if you will. Consider the following examples.

via vtgate:

mysql> create table example(id int auto_increment primary key, name tinytext);

mysql> show create table example \G

CREATE TABLE `example` (
  `id` int NOT NULL AUTO_INCREMENT,
  `name` tinytext,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

mysql> alter with 'gh-ost' table example modify id bigint not null, add column status int, add key status_dx(status);
+--------------------------------------+
| uuid                                 |
+--------------------------------------+
| 211febfa-da2d-11ea-b490-f875a4d24e90 |
+--------------------------------------+

-- <wait...>

mysql> show create table example \G

CREATE TABLE `example` (
  `id` bigint NOT NULL,
  `name` tinytext,
  `status` int DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `status_dx` (`status`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

via vtctl:

$ mysql -e "show create table example\G"

CREATE TABLE `example` (
  `id` bigint NOT NULL,
  `name` tinytext,
  `status` int DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `status_dx` (`status`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci


$ vtctl -topo_implementation etcd2 -topo_global_server_address localhost:2379 -topo_global_root /vitess/global \
    ApplySchema -sql "alter with 'gh-ost'  table example modify id bigint unsigned not null" commerce
8ec347e1-da2e-11ea-892d-f875a4d24e90


$ mysql -e "show create table example\G"

CREATE TABLE `example` (
  `id` bigint unsigned NOT NULL,
  `name` tinytext,
  `status` int DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `status_dx` (`status`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

In both cases, a UUID is returned, which can be used for tracking (WIP) the progress of the migration across shards.

Parser

Vitess' parser now accepts ALTER WITH 'gh-ost' TABLE and ALTER WITH 'pt-osc' TABLE syntax. We're still to determine if this is the exact syntax we want to go with.
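
To make the hint syntax concrete, here is a minimal sketch of what "accepting" it amounts to: pull out the requested strategy and recover the plain ALTER TABLE statement. It uses a regular expression purely for illustration; the PR changes Vitess' actual SQL parser, and `splitOnlineAlter` is a hypothetical helper, not something from the PR.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// onlineAlterRE matches the experimental hint syntax described in this PR:
//   ALTER WITH 'gh-ost' TABLE ...   or   ALTER WITH 'pt-osc' TABLE ...
// Illustration only: the real implementation lives in the Vitess SQL parser.
var onlineAlterRE = regexp.MustCompile(`(?is)^\s*alter\s+with\s+'(gh-ost|pt-osc)'\s+table\s+(.*)$`)

// splitOnlineAlter (hypothetical) returns the requested strategy and the
// plain ALTER TABLE statement with the hint removed. ok is false when the
// statement carries no online hint, in which case sql is returned unchanged.
func splitOnlineAlter(sql string) (strategy, plainSQL string, ok bool) {
	m := onlineAlterRE.FindStringSubmatch(sql)
	if m == nil {
		return "", sql, false
	}
	return strings.ToLower(m[1]), "alter table " + m[2], true
}

func main() {
	strategy, plain, ok := splitOnlineAlter("alter with 'gh-ost' table example modify id bigint not null")
	fmt.Println(strategy, ok) // gh-ost true
	fmt.Println(plain)        // alter table example modify id bigint not null
}
```

The recovered statement is the kind of plain `alter table ...` text that appears in the `sql` field of the topo request shown in the next section.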

Topo

Whether submitted by vtgate or vtctl, we don't immediately run the migration. As mentioned before, we may wish to postpone the migration. Perhaps the relevant servers are already running a migration.

Instead, we write the migration request into global topo, e.g.:

key: /vitess/global/schema-migration/requests/90c5afd4-da38-11ea-a3ff-f875a4d24e90
content:

{"keyspace":"commerce","table":"example","sql":"alter table example modify id bigint not null","uuid":"90c5afd4-da38-11ea-a3ff-f875a4d24e90","online":true,"time_created":1596701930662801294,"status":"requested"}

Once we create the request in topo, we immediately return the generated UUID/migration ID (90c5afd4-da38-11ea-a3ff-f875a4d24e90 in the above example) to the user.
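
For readers who want to consume these entries, a sketch of a Go type that round-trips the JSON above. The field names are taken from the example document; the type name `MigrationRequest` and the helpers `parseRequest` and `requestKey` are assumptions made for illustration, not the PR's actual definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MigrationRequest mirrors the JSON document shown above. Field names come
// from that example; the Go type itself is an assumption.
type MigrationRequest struct {
	Keyspace    string `json:"keyspace"`
	Table       string `json:"table"`
	SQL         string `json:"sql"`
	UUID        string `json:"uuid"`
	Online      bool   `json:"online"`
	TimeCreated int64  `json:"time_created"` // nanoseconds, judging by the example value
	Status      string `json:"status"`
}

// parseRequest (hypothetical) decodes one request entry.
func parseRequest(raw string) (MigrationRequest, error) {
	var req MigrationRequest
	err := json.Unmarshal([]byte(raw), &req)
	return req, err
}

// requestKey (hypothetical) builds the topo key for a request, following
// the key layout shown above.
func requestKey(globalRoot, uuid string) string {
	return globalRoot + "/schema-migration/requests/" + uuid
}

func main() {
	raw := `{"keyspace":"commerce","table":"example","sql":"alter table example modify id bigint not null","uuid":"90c5afd4-da38-11ea-a3ff-f875a4d24e90","online":true,"time_created":1596701930662801294,"status":"requested"}`
	req, err := parseRequest(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(requestKey("/vitess/global", req.UUID))
	fmt.Println(req.Keyspace, req.Table, req.Status)
}
```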

vtctld

vtctld gets a conceptual "upgrade" with this PR. It is no longer a reactive service. vtctld now actively monitors new schema-migration/requests in topo.

~~When it sees such a request, it evaluates what are the relevant n shards.~~

~~With current implementation, it writes n "job" entries, one per shard. e.g.~~

/vitess/global/schema-migration/jobs/commerce/-80/ce45b84a-da2d-11ea-b490-f875a4d24e90 and
/vitess/global/schema-migration/jobs/commerce/80-/ce45b84a-da2d-11ea-b490-f875a4d24e90 for a keyspace with two shards; or just
/vitess/global/schema-migration/jobs/commerce/0/1dd17132-da23-11ea-a3d2-f875a4d24e90 for a keyspace with one shard.

DONE: WIP: we will investigate use of new VExec to actually distribute the jobs to vttablet.

What vtctld does now is this: once it sees a migration request, it pushes a VExec request for that migration. If the VExec request succeeds, that means all shards have been notified, and vtctld can stow away the migration request (work is complete as far as vtctld is concerned). If VExec returns with an error, that means at least one shard did not get the request, and vtctld will keep retrying to push this request.
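
The push-and-retry behavior can be sketched as a small loop. This is an illustration under stated assumptions: `pushFunc` stands in for the VExec call, and the attempt limit and fixed wait are added here only so the example terminates; the description above only says vtctld keeps retrying.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// pushFunc stands in for the VExec call that fans a migration request out
// to all shards. It returns an error if at least one shard was not reached.
type pushFunc func(uuid string) error

// pushUntilDelivered (hypothetical) retries push until it succeeds or
// maxAttempts (assumed >= 1) is exhausted. It reports the attempts used.
func pushUntilDelivered(uuid string, push pushFunc, maxAttempts int, wait time.Duration) (int, error) {
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if lastErr = push(uuid); lastErr == nil {
			return attempt, nil // all shards notified; the request can be stowed away
		}
		if attempt < maxAttempts {
			time.Sleep(wait)
		}
	}
	return maxAttempts, fmt.Errorf("migration %s not delivered after %d attempts: %w", uuid, maxAttempts, lastErr)
}

// failNTimes builds a pushFunc that fails n times and then succeeds,
// to simulate a shard that is temporarily unreachable.
func failNTimes(n int) pushFunc {
	calls := 0
	return func(uuid string) error {
		calls++
		if calls <= n {
			return errors.New("at least one shard did not get the request")
		}
		return nil
	}
}

func main() {
	attempts, err := pushUntilDelivered("90c5afd4-da38-11ea-a3ff-f875a4d24e90", failNTimes(2), 5, 10*time.Millisecond)
	fmt.Println(attempts, err) // 3 <nil>
}
```

Retrying the whole request is safe only because delivery is idempotent on the tablet side; the INSERT IGNORE mentioned in the vttablet section is what makes a repeated push harmless.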

vttablet

This is where most of the action takes place.

vttablet runs a migration service which continuously probes for, schedules, and executes migrations.

DONE: ~~With current implementation, tablets which have tablet_type=MASTER continuously probe for new entries. We look to replace this with VExec.~~

Migration requests are pushed via VExec; the request includes the INSERT IGNORE query that persists the migration in _vt.schema_migrations. The tablet no longer reads from, nor writes to, Global Topo.

A new table is introduced: _vt.schema_migrations, which is how vttablet manages and tracks its own migrations.

vttablet will only run a single migration at a time.

vttablet will check for unhandled migration requests and queue them.

vttablet will make a migration ready if there's no running migration and no other migration is marked as ready.
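
The "one migration at a time" rule can be sketched as a single promotion step over the tablet's known migrations. The `Migration` type, the in-memory slice and the status strings are assumptions for illustration; in the PR this state lives in `_vt.schema_migrations`.

```go
package main

import "fmt"

// Migration is a hypothetical in-memory stand-in for a row in
// _vt.schema_migrations. Status is one of "queued", "ready", "running",
// "complete" or "failed" (the exact set of values is an assumption).
type Migration struct {
	UUID   string
	Status string
}

// promoteNext marks the oldest queued migration as ready, but only when no
// migration is currently ready or running. It returns the promoted UUID,
// or "" when nothing was promoted. The slice is assumed to be ordered by
// submission time.
func promoteNext(migrations []Migration) string {
	for _, m := range migrations {
		if m.Status == "ready" || m.Status == "running" {
			return ""
		}
	}
	for i := range migrations {
		if migrations[i].Status == "queued" {
			migrations[i].Status = "ready"
			return migrations[i].UUID
		}
	}
	return ""
}

func main() {
	queue := []Migration{
		{UUID: "a", Status: "running"},
		{UUID: "b", Status: "queued"},
	}
	fmt.Printf("%q\n", promoteNext(queue)) // "" because "a" is still running
	queue[0].Status = "complete"
	fmt.Printf("%q\n", promoteNext(queue)) // "b"
}
```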

vttablet will run a ready migration. This is really the interesting part, with lots of goodies:

vttablet will evaluate the gh-ost ... command to run. It will obviously populate --alter=... --database=....
vttablet creates a temp directory where it generates a script to run gh-ost.
vttablet creates a hooks path and auto-generates hook files. The hooks will interact with vttablet
vttablet has an API endpoint by which the hooks can communicate gh-ost's status (started/running/success/failure) with vttablet.
vttablet provides gh-ost with --hooks-hint which is the migration's UUID.
vttablet automatically generates a gh-ost user on the MySQL server, with a random password. The password is never persisted and does not appear on ps. It is written to, and loaded from, an environment variable.
vttablet grants the proper privileges on the newly created account
vttablet will destroy the account once migration completes.
vitess repo includes a gh-ost binary. We require gh-ost from openark/gh-ost as opposed to github/gh-ost because we've had to make some special adjustments to gh-ost so as to support this flow. I do not have direct ownership of github/gh-ost and cannot enforce those changes upstream, though I have made the contribution requests upstream.
make build automatically appends gh-ost binary, compressed, to vttablet binary, via Ricebox.
vttablet, upon startup, auto extracts gh-ost binary into /tmp/vt-gh-ost. Please note that the user does not need to install gh-ost.
WIP: vttablet to report back the job as complete/failed. We look to use VExec. TBD.
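
As an illustration of the credentials handling in the list above, here is a sketch that generates a random password and hands it to a child process through its environment, so that it never appears in the argument list shown by ps. The variable name `MIGRATION_PASSWORD` is hypothetical, and how the child process reads it back is outside this sketch; the command is only prepared, not executed. The `--database`, `--alter` and `--hooks-hint` flags are the ones named in the list.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"os"
	"os/exec"
)

// passwordEnvVar is a hypothetical variable name, not the one used by the PR.
const passwordEnvVar = "MIGRATION_PASSWORD"

// randomPassword returns 16 random bytes as a 32-character hex string.
func randomPassword() (string, error) {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// buildCommand prepares (but does not start) the child process. The
// password travels only in the environment, never in the argument list.
func buildCommand(binary, password string, args ...string) *exec.Cmd {
	cmd := exec.Command(binary, args...)
	cmd.Env = append(os.Environ(), passwordEnvVar+"="+password)
	return cmd
}

func main() {
	password, err := randomPassword()
	if err != nil {
		panic(err)
	}
	cmd := buildCommand("gh-ost", password,
		"--database=commerce",
		"--alter=modify id bigint not null",
		"--hooks-hint=211febfa-da2d-11ea-b490-f875a4d24e90",
	)
	fmt.Println(cmd.Args)      // the argument list contains no password
	fmt.Println(len(password)) // 32
}
```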

Tracking breakdown

Quite likely more entries to be added.

Call for feedback

We're looking for community's feedback on the above suggestions/flow. Thank you for taking the time to read and respond!

shlomi-noach · 2020-08-10T04:36:40Z

Zero dependencies doesn't mean zero configuration. What's the throttle replication lag? I'm used to as low as 1s, but some setups cannot meet that constraint. How do we let the user configure that?

derekperkins · 2020-08-10T06:04:42Z

I'm very excited to see this type of automation. Solving problems everyone deals with in a Vitess-native way goes a long way towards driving mass adoption. At the same time, it's somewhat disappointing that it is using gh-ost. I get why it is, but given that gh-ost doesn't support foreign keys, and I know your personal views about them, that alienates us and many/most MySQL users who do use them.

shlomi-noach · 2020-08-10T07:11:24Z

I get why it is

👋 it's always best to be explicit. I'm not sure if your impression is that I'm fighting a religious war or am just too obsessed with my own creation 😄 , this isn't the case. FWIW pt-online-schema-change is also based on my own creation, so my bias is towards both.

pt-online-schema-change can potentially also be supported. I began with gh-ost as this is the tool I'm most comfortable with and have fluency in its code. My Perl foo skills are very low, and hacking the pt-online-schema-change hooks will be more difficult for me.

your personal views about them, that alienates us and many/most MySQL users who do use them.

I'm sorry to hear that, and apologize if I've alienated you in any way. I'm not sure my view on MySQL foreign keys should be alienating people and I'm dumbfounded that this is the case.

derekperkins · 2020-08-10T13:15:41Z

Apologies, my comment wasn't meant to be a personal attack or to suggest that you have personally alienated me. Your tools are awesome, and my "I get why" comment was just me acknowledging that you wrote it and thus are able to move the quickest with it, not to mention that it probably has been the most requested integration.

As for FKs, I wasn't trying to say that you have alienated anyone personally with your views, just recognizing that I'm aware of them from prior posts. Given the history of gh-ost where you were working at specific companies that didn't use FKs, it makes total sense to not deal with the extra complexity that they bring to DB tooling. In Vitess by contrast, where one of the main areas of focus is full MySQL compatibility, we're trying to support the majority of workloads, many of which include FKs, so it'd be great to support them eventually, whether that is achieved via gh-ost, pt-osc, vreplication, or something else.

Again I'm sorry for coming across negatively. I've personally interacted with you building the Orchestrator integration with the Vitess helm charts and have always been impressed by your knowledge and willingness to help. I was super excited when I found out you were going to Planetscale. As I mentioned originally, I love that you are doing the work to add this level of automation, taking advantage of the control plane that doesn't exist in vanilla MySQL, and will really help to drive adoption of Vitess. I look forward to continued interaction with you and want you to know that I hold you in the highest regard.

mattlord · 2020-08-10T15:12:50Z

😍

shlomi-noach · 2020-08-10T15:59:12Z

@derekperkins Thank you for your kind message ❤️ and I also reflect that I may take some words to present differently than intended, as I'm not a native English speaker and I can mis-parse things. I also very much enjoyed working with you in our orchestrator/helm interactions.

Regarding foreign keys, there's two ways forward:

support pt-online-schema-change. This should be doable, though I expect not zero-dependency (the user will have to make sure to have Perl and dependent packages)
add support for foreign keys in gh-ost. This should also be doable. At the time I laid the plan for what a community contribution might look like.

ameetkotian · 2020-08-10T23:54:25Z

Thanks so much for this work! I am incredibly excited about the prospect of online schema change as first-class feature supported in Vitess.

We have been using gh-ost for Vitess schema changes for about 18 months now. The largest keyspace has more than a thousand shards. We had to build a schema change service outside of Vitess to handle distributed schema changes. The same service uses pt-osc to execute schema changes for non-Vitess clusters. Here are some of the problems that we needed to solve to make gh-ost work for Vitess. It is not intended to be a requirements list but something that will help you find more datapoints for this RFC.

Handle operations like tablet replacement, primary failovers, shard splits, mysql downtime due to backups, deployments, and upgrades during ongoing schema changes
Tracking owners for schema changes i.e., who triggered the ALTER statement
Handling retries in case of failures
Ability to trigger gh-ost's builtin throttling
Ability to send out notifications on job success/failures.

Here are some of the things we are working on adding -

Integrate gh-ost test on replica feature
Integrate table consistency checks before cutover.
Safe execution of DROP table - workflow should rename the tables and execute the drop at a later point for ease of rollback.
Version controlled schemas
Integrating pt-archiver

My only concern with the current proposal is the overhead with using the topo server for co-ordination of schema changes.

shlomi-noach · 2020-08-11T03:32:08Z

@ameetkotian addressing some of the bullet points:

Handle operations like tablet replacement, primary failovers, shard splits, mysql downtime due to backups, deployments, and upgrades during ongoing schema changes

Yes. As mentioned above, first iteration will not support concurrent migration+reshard operation, but that should be solved in future iterations. The current PR as it is still does not address the topic of resharding. With regard to failovers, again current PR does not address it, but the idea is that in the short term we will identify a failover and restart the migration. Possibly, and only where gh-ost is involved, we could finally work towards resurrection. I'd say that's in the long future.
I have no insight yet into backups/deployments/upgrades.

Tracking owners for schema changes i.e., who triggered the ALTER statement

I see that more as an external migration tracking/management system ownership. At least for now, the purpose of the PR is to provide the mechanics for online schema changes.

Handling retries in case of failures

Agreed

Ability to trigger gh-ost's builtin throttling

Agreed. I suspect VExec will make that a simple operation. VExec is a recent addition in vitess, at this time only used in vreplication, where the user can interact with internals of vitess using SQL statements; consider virtual tables like MySQL's INFORMATION_SCHEMA, only those are also updatable. More to come I hope.

Ability to send out notifications on job success/failures.

At this time I see this at a higher level than vitess.

Integrate table consistency checks before cutover.

I'd like to point you to this experimental PR, checksumming data on the fly. I haven't yet tested it in production.

Safe execution of DROP table - workflow should rename the tables and execute the drop at a later point for ease of rollback.

👍 This is on my agenda.

shlomi-noach · 2020-08-11T03:34:11Z

Possible syntax change:

ALTER WITH_GHOST TABLE... to use gh-ost
ALTER WITH_PT TABLE... to use pt-online-schema-change

shlomi-noach · 2020-08-11T05:40:28Z

Recent commit, c68d438, changes syntax to

ALTER WITH_GHOST TABLE..., which runs gh-ost
ALTER WITH_PT TABLE..., which does nothing at this time

and also breaks vtctl ApplySchema. The problem with ApplySchema is that if we say alter with_ghost table... then the pre-flight test fails, since with_ghost is not valid MySQL syntax. Need to look into that.

shlomi-noach · 2020-08-11T06:12:26Z

My only concern with the current proposal is the overhead with using the topo server for co-ordination of schema changes.

The WIP on VExec will mostly eliminate that. We will only write to global topo upon ALTER TABLE request. So one write per migration request, and then I believe another write once the migration is fully complete on all shards, and TBD what kind of write, if any, should one or more migrations fail.

shlomi-noach · 2020-08-11T10:09:25Z

I have a POC for pt-online-schema-change. Problems I have:

pt-osc doesn't report periodic status, ie. does not say "I'm healthy" throughout the migration. This is workable, but I feel blind.
user will have to make sure they have Perl::DBI and Perl::DBD::MySQL installed; so this isn't a zero dependency setup. Again, workable.
still perl-foo-ing the plugins

shlomi-noach · 2020-08-11T10:58:25Z

pt-online-schema-change now supported via ALTER WITH_PT TABLE ...

pt-online-schema-change seems to require credentials in cleartext: either on the command line, or in .cnf file; I can't seem to hide the password in an environment file.

ajm188 · 2020-08-11T11:56:12Z

Shlomi I am soooo excited about this!

I have no insight yet into backups/deployments/upgrades.

I can provide some details around this.

Backups: when running with the builtinbackupengine, vttablet will do a shutdown of mysqld to copy all the ibd files.
- If this happens on the tablet gh-ost is streaming, then (I'd guess, but we could test), gh-ost will get into a bad state not being able to connect to the replica for up to several hours.
- If it happens on a tablet that gh-ost is using for throttle-control-replicas (aside: is every replica in the shard going to be watched for replication lag?), then the alter will be throttled for the duration of the backup (again, several hours)
- Suggestion, just my opinion 😄 : I think what I want here is that the vitess/gh-ost integration won't pick the backup tablet (if a backup is ongoing), and, the builtinbackup code won't pick the gh-ost tablet (if a schema change is ongoing), and will tell gh-ost to stop checking replication lag on the backup tablet
Deployments (vttablet): After restarting all replicas, eventually you need to restart the primary, which is safest to do by PlannedReparent-ing to another tablet in the shard and then restarting the former primary. We've seen problems with the _ghc table heartbeats being written to a replica cause errant transactions, so if that reparent happens while a gh-ost process is running, we run into trouble.
- I don't have any good suggestions here, except maybe make PlannedReparent refuse to operate if a schema change is happening? That feels it could cause more problems than it solves, though.
Upgrades (I'm assuming Ameet meant mysql upgrades): same problem as deployments with respect to reparents, but also there's another concern I have about gh-ost losing connection to the replica it's streaming from for however long mysqld ends up being down during the upgrade.

shlomi-noach · 2020-08-11T12:54:30Z

The current implementation, by the way, is to run gh-ost directly on the master server via its tablet. I'm considering keeping it that way, and using the replicas only for throttling. Since vitess requires ROW binlog format in the first place, this should be a safe decision.

As for replicas taken down for backup or for other reasons, I wish to use freno as the all-knowing throttling service, and that’s in the mid-term run. In the short term, I still need to figure it out...

derekperkins · 2020-08-11T22:47:26Z

ALTER WITH_GHOST TABLE / ALTER WITH_PT TABLE

I like this syntax choice a lot, it's very readable. Are there any configuration options we might want to set in SQL? I'm not sure if it makes sense, but maybe these could be pseudo function calls, WITH_GHOST(a, b, c), though adding more parameters would be hard to keep backwards compatible. Maybe using comments?

Regarding foreign keys, there's two ways forward...

I'm glad that there's a viable path to support them down the road. For the reasons you've laid out in other comments, I'd prefer to see support in gh-ost eventually because it seems to be the superior tool, and maybe we'll be able to contribute towards it.

shlomi-noach · 2020-08-17T15:39:18Z

Are there any configuration options we might want to set in SQL?

Yeah, I suspect we'd need to support some config via SQL; in particular, I'm looking at what's an acceptable replication lag.
I'm still digging the parser, but I'd suspect the following format may work:

ALTER WITH_GHOST MIGRATION_LAG_SECONDS=1.0 TABLE ...

shlomi-noach · 2020-08-19T06:34:02Z

On the topic of handling failures:

A new VExec interface allows the user to retry a migration via SQL: something like vtctl VExec commerce.91b5c953-e1e2-11ea-a097-f875a4d24e90 "update _vt.schema_migrations set migration_status='retry'"
If vttablet itself fails during the migration, then it's possible that we never mark the migration as failed.
- in the case of gh-ost, this is identifiable, because gh-ost sends keepalive updates via on-status hook. Thus, we can heuristically claim that if no sign of life was seen from a migration in the past 10 minutes, the migration must have failed.
- with pt-online-schema-change this is complicated, and I don't have an immediate solution yet; probably need to track the unix process ID. I'm open to suggestions.
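
The keepalive heuristic for gh-ost can be sketched as a simple timestamp comparison. The 10-minute threshold is the one stated above; the function name and the use of Unix-second timestamps are assumptions for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// staleAfter is the threshold from the comment above: no sign of life in
// the past 10 minutes means the migration is presumed failed.
const staleAfter = 10 * time.Minute

// isStale (hypothetical) reports whether a running migration should be
// presumed failed, given the Unix time (in seconds) of its last keepalive
// and the current Unix time.
func isStale(lastKeepaliveUnix, nowUnix int64) bool {
	return time.Duration(nowUnix-lastKeepaliveUnix)*time.Second > staleAfter
}

func main() {
	now := time.Now().Unix()
	fmt.Println(isStale(now-30, now))    // false: heard from it 30 seconds ago
	fmt.Println(isStale(now-11*60, now)) // true: silent for 11 minutes
}
```

This only works for a tool that reports liveness periodically, which is why pt-online-schema-change needs a different approach.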

shlomi-noach · 2020-08-19T07:34:25Z

@rohit-nayak-ps I've now pushed my changes to VExec. I don't have a separate PR just for the VExec changes, because I worked the refactor based on the changes I needed in schema_migrations and what I found in common with vreplication.

Notable changes:

obviously vexec.go, vexec_plan.go.
- I introduced a vexecPlanner and inverted the query analysis logic (first we identify affected table, then we select the planner, which then analyzes the query)
- the planner uses vexecPlannerParams, a set of hints to instruct the planner on how to analyze/refactor the query: name of workflow column in backend table; set of mutable or immutable columns in UPDATE queries, etc.
- VExec is now much more stateful, and keeps query, parsed statement, refactored query, planner, etc.
Introducing new VExecRequest. It's a combination of workflow+keyspace+query. I think VReplicationExec can be made redundant.
Introducing new VExec() gRPC call, which takes a VExecRequest and returns a VExecResponse. Again, I think vreplication logic can use this.
on the tablet side, a VExec interceptor gets the query
- and constructs a TabletVExec entity, which analyzes the query. It parses some useful information, like the set of columns changed in an UPDATE query, or columns with literal values on the WHERE clause. This entity is given to the final code (the "engine" or however it is referenced) that implements the vexec logic on the tablet side. It will use TabletVExec to make sanity checks, validate the query, and finally return a result set.

shlomi-noach · 2020-08-19T09:26:24Z

re: pt-online-schema-change, I will add a replication-lag plugin that re-implements the original lag check and also reports to vttablet every minute.

shlomi-noach · 2020-08-20T08:55:36Z

Migration options now available:

alter with_ghost table my_table ... -- no options
alter with_ghost '--max-lag-millis=1500' table my_table ...
alter with_pt '--max-lag 1.5s --null-to-not-null' table my_table ...

shlomi-noach · 2020-08-20T09:03:48Z

It's now possible to retry or cancel a migration.

retry only possible if the migration is failed or cancelled
cancel only possible if the migration is queued, ready or running (in the latter case we interrupt the migration and cause it to fail)

syntax subject to change:

vtctl VExec commerce.5fe35da1-e2be-11ea-aa07-f875a4d24e90 "update _vt.schema_migrations set migration_status='retry' "
vtctl VExec commerce.5fe35da1-e2be-11ea-aa07-f875a4d24e90 "update _vt.schema_migrations set migration_status='cancel' "
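
The two rules above amount to a small transition check, sketched below. The status strings are assumptions based on the states named in this thread, taking the three cancellable states to be queued, ready and running; `canRequest` is a hypothetical helper, not the tablet's actual validation code.

```go
package main

import "fmt"

// canRequest (hypothetical) reports whether a user-requested action is
// allowed for a migration in the given current status:
//   - "retry" only for failed or cancelled migrations
//   - "cancel" only for queued, ready or running migrations
func canRequest(current, action string) bool {
	switch action {
	case "retry":
		return current == "failed" || current == "cancelled"
	case "cancel":
		return current == "queued" || current == "ready" || current == "running"
	}
	return false
}

func main() {
	fmt.Println(canRequest("failed", "retry"))    // true
	fmt.Println(canRequest("running", "retry"))   // false
	fmt.Println(canRequest("running", "cancel"))  // true
	fmt.Println(canRequest("complete", "cancel")) // false
}
```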

shlomi-noach · 2020-08-23T05:11:20Z

Suggestions from Andrew Mason in Vitess slack:

I'm thinking a simple solution would be to add two flags to vttablet that is like:
-gh-ost-path
-pt-osc-path
Then if one of those isn't set, vttablet can use the baked-in binary for that strategy, otherwise it can locate binary where I told it to look.

Another thought with respect to not passing -drop-old-table, is that gh-ost could use -force-table-names to include the migration UUID in the gh-ost table names
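
The flag suggestion can be sketched with Go's standard flag package. The flag names are the ones proposed above; `resolveBinary` is a hypothetical helper, and `/tmp/vt-gh-ost` is used as the baked-in location only because the PR description mentions it as the extraction path.

```go
package main

import (
	"flag"
	"fmt"
)

// resolveBinary (hypothetical) prefers an explicitly configured path and
// falls back to the binary that vttablet extracted for itself.
func resolveBinary(flagValue, embeddedPath string) string {
	if flagValue != "" {
		return flagValue
	}
	return embeddedPath
}

func main() {
	fs := flag.NewFlagSet("vttablet-sketch", flag.ContinueOnError)
	ghostPath := fs.String("gh-ost-path", "", "override path to the gh-ost binary")
	ptOSCPath := fs.String("pt-osc-path", "", "override path to the pt-online-schema-change binary")

	// Simulate a command line where only pt-osc is overridden.
	if err := fs.Parse([]string{"-pt-osc-path", "/usr/bin/pt-online-schema-change"}); err != nil {
		panic(err)
	}

	fmt.Println(resolveBinary(*ghostPath, "/tmp/vt-gh-ost")) // /tmp/vt-gh-ost
	fmt.Println(resolveBinary(*ptOSCPath, ""))               // /usr/bin/pt-online-schema-change
}
```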

shlomi-noach · 2020-10-04T08:07:35Z

Incorporated #6815, where the throttler is disabled, by default.

shlomi-noach · 2020-10-04T12:04:15Z

The test TestBackupMysqlctld/TestMasterReplicaSameBackup (endtoend shard 21) keeps failing; vtctlclient exits with error status 1. I'm able to reproduce locally. Testing locally, I see:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0xd4017d]

goroutine 183 [running]:
vitess.io/vitess/go/vt/topo/etcd2topo.(*Server).Watch.func1(0xc000da7680, 0xc0010ee870, 0xc0010dcc30, 0xc0011c8c80, 0x11, 0xc0010e00d8, 0xc0010e00d0, 0x2733420, 0xc001270140, 0xc00003a7a0)
	/home/shlomi/dev/github/planetscale/vitess/go/vt/topo/etcd2topo/watch.go:93 +0x77d
created by vitess.io/vitess/go/vt/topo/etcd2topo.(*Server).Watch
	/home/shlomi/dev/github/planetscale/vitess/go/vt/topo/etcd2topo/watch.go:68 +0x5c7

The watched path is /zone1/SrvVSchema. I'm still unsure why this error happens on this PR and not on other PRs. Looking for things that could cause lock/timeouts; can't see an issue with Open()/Close() on tablet server; any topo entry I Lock I then unlock. Still digging.

shlomi-noach · 2020-10-04T12:45:10Z

Found it! endtoend (21) is now fixed in df502e9. The problem was with an unqualified query, called by initSchema(), called by Open(). I'm not sure why this would only fail in TestBackupMain/TestMasterReplicaSameBackup and not in TestBackupMain/TestReplicaBackup or TestBackupMain/TestRdonlyBackup or TestBackupMain/TestMasterBackup.

shlomi-noach · 2020-10-05T05:51:28Z

Upon migration completion (whether successful or failed), online-ddl executor renames away the artifacts. This uses some logic from #6719 :

An artifact table, if found (e.g. _mytable_old or _c07abcac_06cb_11eb_ac94_f875a4d24e90_20201005082948_del) is RENAMEd to e.g. _vt_PURGE_0b19830706ca11ebaf6bf875a4d24e90_20201005051720
When Managed DROP TABLE #6719 is merged, this will take the table through garbage collection lifecycle
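
A sketch of how such a purge name and the corresponding RENAME could be produced. The name layout is inferred from the single example above (prefix, UUID without dashes, 14-digit timestamp); the helper names and the choice of UTC are assumptions, and the actual logic belongs to #6719.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// timestamp formats t as a 14-digit yyyymmddhhmmss string. UTC is an
// assumption; the example above does not say which time zone is used.
func timestamp(t time.Time) string {
	return t.UTC().Format("20060102150405")
}

// purgeTableName (hypothetical) builds a name shaped like the example:
// _vt_PURGE_<uuid without dashes>_<timestamp>.
func purgeTableName(uuid, ts string) string {
	return fmt.Sprintf("_vt_PURGE_%s_%s", strings.ReplaceAll(uuid, "-", ""), ts)
}

// renameStatement builds the RENAME that moves an artifact table out of the way.
func renameStatement(from, to string) string {
	return fmt.Sprintf("RENAME TABLE `%s` TO `%s`", from, to)
}

func main() {
	ts := timestamp(time.Date(2020, 10, 5, 5, 17, 20, 0, time.UTC))
	to := purgeTableName("0b198307-06ca-11eb-af6b-f875a4d24e90", ts)
	fmt.Println(renameStatement("_mytable_old", to))
}
```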

shlomi-noach · 2020-10-05T09:18:53Z

I'm ready to have this PR merged. It now supports throttling and table lifecycle.

I have not made changes to the ALTER syntax. See #6782 ; we can iterate in a followup PR.

shlomi-noach · 2020-10-06T06:05:20Z

OMG 🎉

shlomi-noach · 2020-10-06T07:23:33Z

Pointing out that the ALTER WITH... syntax is still subject to change.

This checks if a vtgate is currently filtering keyspaces before requesting the TopoServer. This is necessary because a TopoServer can't be accessed in those cases as the filtered Topo in those cases could make it unsafe to make writes since all reads would be returning a subset of the actual topo data. The only use of the requested topoServer that I found was in the DDL handling path and was introduced in vitessio#6547. This is deployed on dev but should get testing (endtoend or unit, unclear on best path atm) before going upstream. # Conflicts: # go/vt/vtgate/vcursor_impl.go Signed-off-by: Richard Bailey <rbailey@slack-corp.com>
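
The guard described in that commit message could look roughly like the following Go sketch. This is not the vtgate code: the `gateway` type, its `keyspacesToWatch` field, and `topoServerForWrites` are hypothetical names, and the sketch assumes keyspace filtering is expressed as a non-empty list of watched keyspaces.

```go
package main

import (
	"errors"
	"fmt"
)

// TopoServer is a stand-in for the real topology server handle.
type TopoServer struct{ name string }

// gateway is a hypothetical vtgate-like holder. keyspacesToWatch is an
// assumed field: non-empty means the gate only sees a subset of the topo.
type gateway struct {
	topo             *TopoServer
	keyspacesToWatch []string
}

var errFilteredTopo = errors.New("topo server unavailable: keyspace filtering is enabled")

// topoServerForWrites refuses to hand out the topo server when the gate is
// filtering keyspaces, since reads would reflect only part of the topo data.
func (g *gateway) topoServerForWrites() (*TopoServer, error) {
	if len(g.keyspacesToWatch) > 0 {
		return nil, errFilteredTopo
	}
	return g.topo, nil
}

func main() {
	filtered := &gateway{topo: &TopoServer{name: "global"}, keyspacesToWatch: []string{"commerce"}}
	if _, err := filtered.topoServerForWrites(); err != nil {
		fmt.Println("filtered gate:", err)
	}

	unfiltered := &gateway{topo: &TopoServer{name: "global"}}
	ts, _ := unfiltered.topoServerForWrites()
	fmt.Println("unfiltered gate uses topo:", ts.name)
}
```

Returning an error rather than a nil server lets the DDL path fail with a clear message instead of a nil dereference.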

shlomi-noach requested review from derekperkins, dkhenry and sougou as code owners August 9, 2020 13:14

This was referenced Aug 9, 2020

WIP: ApplySchema to run online schema changes via gh-ost planetscale/vitess#67

Closed

Support ALTER ONLINE TABLE syntax openark/gh-ost#9

Closed

shlomi-noach changed the title ~~WIP: automated, scheduled, dependency free online DDL via gh-ost~~ WIP: automated, scheduled, dependency free online DDL via gh-ost/pt-online-schema-chane Aug 11, 2020

shlomi-noach changed the title ~~WIP: automated, scheduled, dependency free online DDL via gh-ost/pt-online-schema-chane~~ WIP: automated, scheduled, dependency free online DDL via gh-ost/pt-online-schema-change Aug 11, 2020

Merge remote-tracking branch 'origin/vttablet-throttle-feature-flag' …

289279b

…into online-ddl Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

shlomi-noach added 3 commits October 4, 2020 11:20

moved endtoend online-ddl test out of vtgate and into its own path

a38856c

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

creating MigrationRequestsPath once, so that ListDir doesn't show error

fe9341e

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

do not report error if node exists

c305450

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

fixed missing schema qualifier in init query

df502e9

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

shlomi-noach mentioned this pull request Oct 4, 2020

vttablet throttler feature flag: -enable-lag-throttler #6815

Merged

shlomi-noach added 4 commits October 5, 2020 08:34

functionality from managed-drop-table

0064650

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

text functionality from managed-drop-table

fb3d39a

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

artifact cleanup: RENAME away tables for complete/failed migrations

075f894

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

invoking artifact GC at end of migration

3ad851f

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

migration_check_interval now configurable, in particular in favor of …

295b1a8

…endtoend tests Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

deepthi approved these changes Oct 5, 2020

View reviewed changes

deepthi merged commit dd6ecae into vitessio:master Oct 5, 2020

shlomi-noach deleted the online-ddl branch October 6, 2020 06:05

askdba added this to the v8.0 milestone Oct 12, 2020

shlomi-noach mentioned this pull request Oct 22, 2020

Online DDL: tracking issue #6926

Open

setassociative mentioned this pull request Mar 5, 2021

Vitess v8.0 Release branch tinyspeck/vitess#194

Merged

setassociative mentioned this pull request Mar 15, 2021

Fix bug in vtgate when running with keyspace filtering tinyspeck/vitess#199

Merged
