New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More Transactions than max pool connections kills knex #1040

Closed
ArdentZeal opened this Issue Oct 30, 2015 · 11 comments

Comments

Projects
None yet
5 participants
@ArdentZeal

ArdentZeal commented Oct 30, 2015

Hello,

when starting more transactions then there are connections allowed in the pool, knexjs hangs indefinitly and does not throw an error. I would expect knexjs to queue all transactions, then start one after another as soon as a connection frees up.

To get a better understanding:

  • Are transactions always executed on 1 connection?
  • When I start a transaction and do some read calls without the transacting(trx) part, because I know the data was there before the transaction started anyway --> more than 1 connection is used? Or do these read calls also use the transaction connection?

Kind Regards,
ArdentZeal

@ArdentZeal ArdentZeal changed the title from Transactions --> Pool to More Transactions than max pool connections kills knex Oct 30, 2015

@vschoettke

This comment has been minimized.

Collaborator

vschoettke commented Oct 30, 2015

Every knex SQL command that uses .transacting(trx) or uses trx instead of knex (where trx is the callback parameter of .transaction()) is executed within the same connection.

If you don't specify/use the transaction (trx) object the command will run outside of the transaction.

@vschoettke vschoettke added the question label Oct 30, 2015

@ArdentZeal

This comment has been minimized.

ArdentZeal commented Oct 30, 2015

If you don't specify/use the transaction (trx) object the command will run outside of the transaction.

I know the theory that far, I am interested specifically in the connection part of the question. So if a query runs outside the transaction, it has to use a different connection and can not be executed on the transaction connection?

I am assuming following scenario might be happening

I have 5 transactions running simultaneously with a connection pool of 5 and there are some calls in my logic which do not use .transacting(trx). These calls would need at least one seperate connection to do their stuff, which is not possible because all of the available connections are blocked by the running transactions. ==> Deadlocked

is this possible to happen or can knex use the transaction connections to finish up the calls which are not .transacting(trx).

Kind Regards,
ArdentZeal

@vschoettke

This comment has been minimized.

Collaborator

vschoettke commented Oct 30, 2015

Knex will not reuse connection with a transaction for other SQL commands outside of that transaction (and it should not because doing so may change the query result).

Regarding your example, the deadlock will be resolved by the acquireTimeout from the pool2 so the pending queries will fail after the specified timeout.

I highly recommend using the trx object as knex replacement since .transacting(trx) is easily missed. See http://knexjs.org/#Transactions for an example.

@ArdentZeal

This comment has been minimized.

ArdentZeal commented Oct 30, 2015

Regarding your example, the deadlock will be resolved by the acquireTimeout from the pool2 so the pending queries will fail after the specified timeout.

Thanks for all the help so far, but there is no timeout happening. Is the default timeout no timeout?

Here is the example I am using maybe it helps you to find the lockup

Connectionpool is set to min: 0, max 5.

10 transactions properly coded work fine.
As soon as i do the scenario described above, knex hangs up completly.

describe.only('knex case >>', function() {
    it('should not lock up knex', function(next) {

        async.times(10, function(n, callback) {
            _strucd.knex.transaction(function(trx) {
                _strucd.knex('strucd_lists')
                    .transacting(trx)
                    .insert({ name: 'Test', description: 'Test', shortId: shortId.generate(), uuid: uuid.v4(), created_at: new Date(), updated_at: new Date() })
                    .returning('*')
                    .then(trx.commit)
                    .catch(trx.tollback);
            })
            .then(function(results) {
                return callback(null);
            })
            .catch(function(err) {
                return callback(err);
            });
        }, function(err) {
            if(err) return next(err);

            // check for knex still working
            _strucd.knex('strucd_lists')
                .select('*')
                .then(function(results) {
                    console.log(results.length);
                    return next(null);
                })
                .catch(function(err) {
                    return next(err);
                });
        });
    });

    it('does lock up knex', function(next) {
        async.times(5, function(n, callback) {
            _strucd.knex.transaction(function(trx) {
                return Promise.join(
                    _strucd.knex('strucd_lists')
                        .transacting(trx)
                        .insert({ name: 'Test', description: 'Test', shortId: shortId.generate(), uuid: uuid.v4(), created_at: new Date(), updated_at: new Date() })
                        .returning('*'),
                    // do it without transaction to break knex
                    _strucd.knex('strucd_lists')
                        .insert({ name: 'Test', description: 'Test', shortId: shortId.generate(), uuid: uuid.v4(), created_at: new Date(), updated_at: new Date() })
                        .returning('*')
                )
                .then(trx.commit)
                .catch(trx.tollback);
            })
            .then(function(results) {
                return callback(null);
            })
            .catch(function(err) {
                return callback(err);
            });
        }, function(err) {
            if(err) return next(err);

            // check for knex still working
            _strucd.knex('strucd_lists')
                .select('*')
                .then(function(results) {
                    console.log(results.length);
                    return next(null);
                })
                .catch(function(err) {
                    return next(err);
                });
        });
    });
});

###Result
10
    ✓ should not lock up knex (107ms)
    1) does lock up knex


  1 passing (17s)
  1 failing

  1) knex case >> does lock up knex:
     Error: timeout of 2000ms exceeded. Ensure the done() callback is being called in this test.

Kind Regards,
ArdentZeal

@elhigu

This comment has been minimized.

Collaborator

elhigu commented Nov 14, 2015

I think what you are doing there is that you start 5 transactions:

_strucd.knex('strucd_lists').transacting(trx).insert({ name: 'Test', description: 'Test', shortId: shortId.generate(), uuid: uuid.v4(), created_at: new Date(), updated_at: new Date() }).returning('*')

But before committing transaction you also send non transacting queries (which require separate connection):

 _strucd.knex('strucd_lists').insert({ name: 'Test', description: 'Test', shortId: shortId.generate(), uuid: uuid.v4(), created_at: new Date(), updated_at: new Date() }).returning('*')

Because you have already 5 transactions open, which are taking all the connections your non-transacting queries are waiting for connection.

However before committing transaction you have required with Promise.join that also non transacting query must have been completed, so your transactions are waiting for non-transacting queries, which are waiting for connection... so there is deadlock.

If you have configured your connection pool (https://github.com/myndzi/pool2) correctly there should be acquireTimeout which should finally resolve this situation by rolling back your transactions in .catch(trx.tollback);.

@ghost

This comment has been minimized.

ghost commented Nov 18, 2015

I also have problems with opening more transactions than pool size simultaneously. I do not make any query outside transaction. If i have a pool of 10 and make 11 simultaneously queries to the database all inside new transaction knex hangs without any error. I debugged the pool2 and found that pool is not released (pool2 Not growing pool: pending=10, to be available=0 +0ms). Why is this happening?

@ollisal

This comment has been minimized.

ollisal commented Nov 18, 2015

@elhigu , acquireTimeout is a timeout for acquiring resources (DB connections) to the pool, not for allocating a free connection in the pool for running queries in a transaction. See here myndzi/pool2#7 (comment) .

So, acquireTimeout will not roll back your transaction because all connections in your pool are in use in a transaction, and a new connection would be required to free any of them up, even if the pool has the maximum number of allowed connections open already. acquireTimeout would only be triggered when the pool maximum limit still allows new connections to be made, but a new connection cannot be established within acquireTimeout milliseconds. So it's like a timeout for connecting to the DB server, and only applies in the cases where pool2 decides it should open up a new connection.

Knex should have a configurable timeout for getting a connection to run a query to resolve these kinds of cases.

@elhigu

This comment has been minimized.

Collaborator

elhigu commented Nov 18, 2015

@ollisal sounds right. I had misunderstood meaning of pool2 acquireTimeout.

I agree that knex should have a feature to prevent these deadlocks when transactions are consuming all the connections.

By the comment you mentioned looks like they might not be willing to add that to pool2, so knex would be the place for it.

@rhys-vdw rhys-vdw added the PR please label Nov 19, 2015

@rhys-vdw

This comment has been minimized.

Collaborator

rhys-vdw commented Nov 19, 2015

Knex should have a configurable timeout for getting a connection to run a query to resolve these kinds of cases.

Sounds like a reasonable solution. I'll accept a PR implementing this. (Unless someone has a better idea?)

@ollisal

This comment has been minimized.

ollisal commented Nov 20, 2015

It would be very useful for the returned error to indicate which query caused it, that way one could find the root of the deadlock issue more easily. This can get very difficult in complex operations which make many different queries.

Currently even the verbose knex query logging to console only prints queries AFTER it has acquired a connection for the query, so it's quite hard to figure out which query is missing .transacting.

I'm quite busy at the moment but I could look into implemeting this, if nobody else beats me to it.

@elhigu

This comment has been minimized.

Collaborator

elhigu commented Feb 3, 2016

Fixed in #1177

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment