release-20.1: schemachange: speed up slow schema changes #48621

Merged: 1 commit, May 11, 2020

@spaskob spaskob commented May 9, 2020

Backport 1/1 commits from #48608.

/cc @cockroachdb/release


Touches #45150.
Fixes #47607.
Touches #47790.

Release note (performance improvement):
Previously, a simple schema change could take 30s or more.
If the schema change was not first in line in the table
mutation queue, it returned a retriable error, and the jobs
framework re-adopted it and ran it later. The problem is
that the job adoption loop runs only every 30s.

To reproduce, run this for some time:

cockroach sql --insecure --watch 1s -e 'drop table if exists users cascade; create table users (id uuid not null, name varchar(255) not null, email varchar(255) not null, password varchar(255) not null, remember_token varchar(100) null, created_at timestamp(0) without time zone null, updated_at timestamp(0) without time zone null, deleted_at timestamp(0) without time zone null); alter table users add primary key (id); alter table users add constraint users_email_unique unique (email);'

Instead of returning on retriable errors, we now retry with exponential
backoff in the schema change code. Handling retriable errors in client
job code is preferred over relying on the registry, because the latter
is slow and also leads to more complicated test fixtures that hack on
the internals of the job registry.

@spaskob spaskob requested review from ajwerner and thoszhang May 9, 2020 01:36
@cockroach-teamcity
This change is Reviewable

@ajwerner ajwerner left a comment

:lgtm: too

I'm going to push the button

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner)

@ajwerner ajwerner merged commit 864a8f3 into cockroachdb:release-20.1 May 11, 2020