Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

migrate: Refactor _lockMigrations to avoid forUpdate #3395

Merged
merged 2 commits into from Aug 28, 2019

Conversation

@bdarnell
Copy link
Contributor

commented Aug 13, 2019

Since _isLocked is only called just before _lockMigrations, it is redundant
and we can accomplish the same thing by checking the rowCount returned
by the update (verified by testing with postgres, mysql, sqlite, and
mssql).

The benefit of this change is improving compatbility with CockroachDB,
which does not support FOR UPDATE.

Updates #2002

migrate: Refactor _lockMigrations to avoid forUpdate
Since _isLocked is only called just before _lockMigrations, it is redundant
and we can accomplish the same  thing by checking the rowCount returned
by the update (verified by testing with postgres, mysql, sqlite, and
mssql).

The benefit of this change is improving compatbility with CockroachDB,
which does not support FOR UPDATE.

Updates #2002
_lockMigrations(trx) {
const tableName = getLockTableName(this.config.tableName);
return getTable(this.knex, tableName, this.config.schemaName)
.transacting(trx)
.update({ is_locked: 1 });
.where('is_locked', '=', 0)

This comment has been minimized.

Copy link
@elhigu

elhigu Aug 16, 2019

Collaborator

This does not work. There can be two processes at the same time arriving to this row and both will return one row and both then updates the lock value to 1.

This comment has been minimized.

Copy link
@elhigu

elhigu Aug 16, 2019

Collaborator

forUpdate exactly causes DB to prevent that from happening.

@elhigu

This comment has been minimized.

Copy link
Collaborator

commented Aug 16, 2019

This approach doesn't work.

@elhigu elhigu closed this Aug 16, 2019

@kibertoad

This comment has been minimized.

Copy link
Collaborator

commented Aug 16, 2019

@elhigu Couldn't this be an alternate mechanism that one can opt-in for? Considering that usually migrations are not being executed in same environment concurrently from multiple places, this particular approach is unlikely to cause many real issues in actual production environments for users that may rely on it.

@elhigu

This comment has been minimized.

Copy link
Collaborator

commented Aug 16, 2019

That would mean basically option for omitting is locked check completely. I'm not against it, but still I would prefer solution, which actually works. I'm pretty sure there must be a way to do something like this that actually works also on CockroachDB

@kibertoad

This comment has been minimized.

Copy link
Collaborator

commented Aug 16, 2019

@elhigu Option to disable lock check might actually be useful.

@elhigu

This comment has been minimized.

Copy link
Collaborator

commented Aug 16, 2019

In that case those changes that were made to _lockMigrations could be implemented anyways as a secondary safety measure.

@bdarnell

This comment has been minimized.

Copy link
Contributor Author

commented Aug 24, 2019

This does not work. There can be two processes at the same time arriving to this row and both will return one row and both then updates the lock value to 1.

Can you please elaborate on when exactly this fails? I still believe it works, although reasoning about sub-serializable isolation levels is always tricky and under-documented.

My informal argument is that in any database that distinguishes between regular SELECT and FOR UPDATE, the WHERE clause of an UPDATE statement should be evaluated in the FOR UPDATE mode (the mysql docs are explicit that UPDATE WHERE gets an exclusive lock on every record it encounters). If there are two concurrent invocations of UPDATE t SET is_locked=1 WHERE is_locked=0, the first one will grab a write lock on the row when it reads the zero value, then the second one will have to wait until the first transaction commits (so it will see the new value of 1). Therefore only the transaction that causes a transition from 0 to 1 will return "1 row affected".

There are databases that don't work this way, for example ones that are asynchronously replicated and eventually consistent. My proposal would indeed be incorrect for these systems. However, I've never seen a system like that that has a useful implementation of SELECT FOR UPDATE, so the original implementation would also fail in this case.

I'm pretty sure there must be a way to do something like this that actually works also on CockroachDB

One alternative is for the CockroachDB dialect to turn forUpdate() into a no-op (or for the database itself to accept and ignore the FOR UPDATE clause). CockroachDB always runs with SERIALIZABLE isolation and most uses of FOR UPDATE are unnecessary in this mode. However, I'd prefer to avoid unnecessary uses of FOR UPDATE rather than just ignoring them (and eventually implement it in CockroachDB), since there are (rare) cases where it's still useful in SERIALIZABLE transactions (mainly to control lock ordering and minimize deadlocks).

@elhigu

This comment has been minimized.

Copy link
Collaborator

commented Aug 24, 2019

Can you please elaborate on when exactly this fails? I still believe it works, although reasoning about sub-serializable isolation levels is always tricky and under-documented.

Both processes will get the lock and start running migrations parallel.

My informal argument is that in any database that distinguishes between regular SELECT and FOR UPDATE, the WHERE clause of an UPDATE statement should be evaluated in the FOR UPDATE mode

Hmm... that actually might be true +1 I have never tried that if it really works that way. Only with plain selects and then updating... I'll try it out how it behaves.

@elhigu

This comment has been minimized.

Copy link
Collaborator

commented Aug 24, 2019

It seems to work at least on postgres so that indeed is a lot better approach 👍 We should add integration test for this to also see in which databases that is supported.

update-locking-test

Dont worry about writing invalid sql in the fisrt query... it was actually correct and terminal was just messing my typings 😅

@elhigu elhigu reopened this Aug 24, 2019

@bdarnell

This comment has been minimized.

Copy link
Contributor Author

commented Aug 25, 2019

I've added a test, which passes on postgres, mysql, and mssql (it doesn't work on sqlite because it doesn't support concurrent transactions anyway). I followed the existing promises-and-callback style but it would be simpler if it's OK to introduce async/await into this file.

@kibertoad

This comment has been minimized.

Copy link
Collaborator

commented Aug 25, 2019

@bdarnell By all means feel free to use async-await! Promise chains are just legacy.

@kibertoad

This comment has been minimized.

Copy link
Collaborator

commented Aug 25, 2019

@bdarnell Can you take a look at plethora of failures in https://travis-ci.org/tgriesser/knex/jobs/576541045? Something fishy is going on there.

@bdarnell

This comment has been minimized.

Copy link
Contributor Author

commented Aug 25, 2019

  1) Integration Tests
       oracle | oracledb
         knex.migrate
           knex.migrate.latest
             should work with concurent calls to _lockMigrations:
     TypeError: knex.transaction(...).then(...).then(...).then(...).then(...).then(...).then(...).finally is not a function
      at Context.<anonymous> (test/integration/migrate/index.js:327:19)

This is the root cause, and then because we failed to clean up in the finally callback, everything afterwards fails with some variation of "migration table already locked" or migrations just silently fail to run. (There's a global before function that's supposed to reset this lock, but it doesn't seem to be doing its job and I don't understand why)

The failures are all on the Node 8 job; that version of node must be using a version of Promise that doesn't have the finally method. Hopefully this just goes away if I change everything to async/await.

@kibertoad

This comment has been minimized.

Copy link
Collaborator

commented Aug 25, 2019

@bdarnell Yup, that seems to be it.
Btw, I've started preparing 0.19.3 release, but I can delay it until this PR is merged :)

@bdarnell bdarnell force-pushed the bdarnell:master branch from 2498ddb to a42c051 Aug 28, 2019

@bdarnell

This comment has been minimized.

Copy link
Contributor Author

commented Aug 28, 2019

Alright, I've updated this to use async/await and we'll see if it passes CI.

Thanks for offering to delay the release. FYI, this won't get us complete cockroachdb compatibility, but it's the main thing I've seen so far that can't be worked around at the dialect level.

@kibertoad

This comment has been minimized.

Copy link
Collaborator

commented Aug 28, 2019

@bdarnell Appreciate the time spent on this!

@kibertoad kibertoad merged commit 8f40f8d into tgriesser:master Aug 28, 2019

1 of 2 checks passed

coverage/coveralls Coverage decreased (-0.03%) to 87.869%
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@kibertoad

This comment has been minimized.

Copy link
Collaborator

commented Aug 28, 2019

Will cut a new release today.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.