Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression: Concurrent subtransactions result in incorrect savepoint release order #2213

Open
joepie91 opened this issue Sep 6, 2017 · 2 comments

Comments

@joepie91
Copy link
Contributor

@joepie91 joepie91 commented Sep 6, 2017

Affected versions

I have not determined the earliest version where this occurs, but it occurs in the latest release (0.13.0), and the issue still exists in master. I've tested with PostgreSQL, but there is no PostgreSQL-specific transaction logic, so presumably this affects other SQL databases as well (as the implicit-release behaviour involved is a standard feature).

Reduced testcase

'use strict';

const Promise = require("bluebird");
const knex = require("knex")(require("./knexfile"));

Promise.try(() => {
	return knex.transaction((trx) => {
		return Promise.all([
			trx.transaction((trx2) => {
				return Promise.delay(500);
			}),
			trx.transaction((trx3) => {
				return Promise.delay(600);
			}),
		]);
	});
}).then((result) => {
	console.log("All succeeded");
}).catch((err) => {
	console.error(err);
});

Expected result

600 milliseconds delay, then an "All succeeded" message.

Actual result

error: RELEASE SAVEPOINT trx6; - no such savepoint
    at Connection.parseE (/home/sven/projects/knex-subtransaction-bugcase/node_modules/pg/lib/connection.js:546:11)
    at Connection.parseMessage (/home/sven/projects/knex-subtransaction-bugcase/node_modules/pg/lib/connection.js:371:19)
    at Socket.<anonymous> (/home/sven/projects/knex-subtransaction-bugcase/node_modules/pg/lib/connection.js:114:22)
    at emitOne (events.js:96:13)
    at Socket.emit (events.js:188:7)
    at readableAddChunk (_stream_readable.js:176:18)
    at Socket.Readable.push (_stream_readable.js:134:10)
    at TCP.onread (net.js:547:20)

Relevant debug logs:

  knex:tx trx1: Starting top level transaction +0ms
  knex:pool INFO pool postgresql:pg:client0 - dispense() clients=1 available=0 +4ms
  knex:client acquired connection from pool: __knexUid2 +30ms
  knex:query BEGIN; +1ms
  knex:bindings undefined +0ms
  knex:tx trx5: Starting nested transaction +5ms
  knex:tx trx6: Starting nested transaction +0ms
  knex:pool INFO pool postgresql:pg:client0 - dispense() clients=0 available=1 +1ms
  knex:pool INFO pool postgresql:pg:client0 - dispense() clients=0 available=2 +0ms
  knex:query SAVEPOINT trx5; +1ms
  knex:bindings undefined +0ms
  knex:query SAVEPOINT trx6; +0ms
  knex:bindings undefined +0ms
  knex:query RELEASE SAVEPOINT trx5; +504ms
  knex:bindings undefined +0ms
  knex:tx trx5: releasing connection +3ms
  knex:query RELEASE SAVEPOINT trx6; +97ms
  knex:bindings undefined +0ms
  knex:tx trx6 error running transaction query +5ms
  knex:tx trx6: releasing connection +0ms
  knex:query ROLLBACK; +1ms
  knex:bindings undefined +0ms
  knex:tx trx1: releasing connection +0ms
  knex:client releasing connection to pool: __knexUid2 +0ms

Problem description

When running multiple subtransactions concurrently, they don't wait for one another; and if they complete in any order other than "the exact reverse of the order in which they were initiated", the savepoints are released in the wrong order.

In the above example, note how trx5 and trx6 are created in that order, but also released in that order; meaning a timeline that's something like the following:

  1. trx5 created
  2. trx6 created
  3. trx5 released - implicitly releases trx6 as it was created after trx5
  4. trx6 released - error occurs, because trx6 doesn't exist anymore (since it was implicitly released in step 3)

Cause

While siblings are tracked using the _previousSibling property, and this used to be used for waiting on sibling subtransactions (see eg. this version of acquireConnection), this is no longer the case in the most recent version of acquireConnection. The _previousSibling property is also not used anywhere else in the code, meaning that no waiting occurs.

Presumably, this logic got lost during refactoring.

Suggested resolution

  1. Reintroduce the "wait on _previousSibling" logic, either in acquireConnection or in whichever other place it's more suitable.
  2. Add a test case to prevent further regressions. If it's of any use, feel free to consider the above reduced testcase a code contribution towards this end :)
@joepie91

This comment has been minimized.

Copy link
Contributor Author

@joepie91 joepie91 commented Sep 6, 2017

As an additional note: I don't think subtransactions are documented, and I had to dig through the issue tracker to discover that an abstraction for savepoints already existed. Should I create a separate (documentation) issue for that?

@elhigu

This comment has been minimized.

Copy link
Member

@elhigu elhigu commented Sep 6, 2017

Thanks for the detailed report! I think that only mention about nested transactions is in the first paragraph. Please open issue about missing documentation here https://github.com/knex/documentation

codeclown added a commit to codeclown/knex that referenced this issue Sep 13, 2019
codeclown added a commit to codeclown/knex that referenced this issue Sep 13, 2019
kibertoad added a commit that referenced this issue Sep 20, 2019
* Fix for concurrent child transactions failing (#2213)

* Skip new sibling transaction test for mssql

* Update test to ensure Promise resolve order
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.