Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

.timeout(ms, {cancel: true}) for with queued queries in transaction #4415

Open
martinmacko47 opened this issue Apr 6, 2021 · 9 comments · May be fixed by #4416 or #4444
Open

.timeout(ms, {cancel: true}) for with queued queries in transaction #4415

martinmacko47 opened this issue Apr 6, 2021 · 9 comments · May be fixed by #4416 or #4444

Comments

@martinmacko47
Copy link
Contributor

Environment

Knex version: current master
Database + version: MySQL - any version
OS: any

Bug

If we queue multiple slow queries with timeouts in a transaction, not all of them will get cancelled after timeouting. E.g.:

knex.transaction(function(trx) {
  return Promise.all([
    firstSlowQuery(trx).timeout(5000, {cancel: true}),
    secondSlowQuery(trx).timeout(5000, {cancel: true}),
    ...
    lastSlowQuery(trx).timeout(5000, {cancel: true}),
  ])
})

As the queries are in the transaction, they get queued in the driver which sends them to DB one by one. As all queries will timeout at approximately the same time, all their KILL QUERY ? commands will be also sent to DB at approximately the same time. Therefore they will kill only the first query. After that the driver sends the second query to DB which will be never killed, as its timeout already elapsed and its KILL QUERY ? command was already sent.

The problem occurs only on MySQL. Postgres marks the transaction failed after killing the query, therefore any further queries in the same transaction will immediately fail.

I guess we should simulate the same behaviour on MySQL as well. Either by removing the timeouted queries from the driver queue, or by rolling back the transaction with the killed timeouted query.

I've written a test to replicate this bug. It passes for Postgres and fails for MySQL: https://github.com/martinmacko47/knex/pull/4/files#diff-ef380392f7c99984959538380cd3a01631d49da7eb768b02fd71695bfb2a3febR981-R1052

@martinmacko47
Copy link
Contributor Author

This bug was originaly mentioned in #2211 (comment), where @elhigu commented:

Actually also in postgresql it is configurable behaviour if transaction should fail when one of the queries sent to it fails. Is it possible to make mysql to rollback transaction implicitly in that case too?

I'm not sure what actually should be done in this case... Maybe it is better to open new issue for this.

@martinmacko47
Copy link
Contributor Author

Is it possible to make mysql to rollback transaction implicitly in that case too?

We could use KILL CONNECTION ? instead of KILL QUERY ? in cancelQuery method for MySQL. This would close the entire connection. Then the reamining queued queries will crash immediately:

EEE Error: Connection lost: The server closed the connection.
    at Protocol.end (/knex/node_modules/mysql/lib/protocol/Protocol.js:112:13)
    at Socket.<anonymous> (/knex/node_modules/mysql/lib/Connection.js:94:28)
    at Socket.<anonymous> (/knex/node_modules/mysql/lib/Connection.js:526:10)
    at Socket.emit (events.js:326:22)
    at endReadableNT (_stream_readable.js:1226:12)
    at processTicksAndRejections (internal/process/task_queues.js:80:21)
    --------------------
    at Protocol._enqueue (/knex/node_modules/mysql/lib/protocol/Protocol.js:144:48)
    at Connection.query (/knex/node_modules/mysql/lib/Connection.js:198:25)
    at /knex/lib/dialects/mysql/index.js:126:18
    at new Promise (<anonymous>)
    at Client_MySQL._query (/knex/lib/dialects/mysql/index.js:120:12)
    at executeQuery (/knex/lib/execution/internal/query-executioner.js:37:17)
    at Client_MySQL.query (/knex/lib/client.js:134:12)
    at /knex/lib/execution/transaction.js:363:24
    at new Promise (<anonymous>)
    at Client_MySQL.trxClient.query (/knex/lib/execution/transaction.js:358:12)
    at Runner.query (/knex/lib/execution/runner.js:130:36)
    at ensureConnectionCallback (/knex/lib/execution/internal/ensure-connection-callback.js:13:17)
    at Runner.ensureConnection (/knex/lib/execution/runner.js:272:20)
    at processTicksAndRejections (internal/process/task_queues.js:93:5)
    at Runner.run (/knex/lib/execution/runner.js:30:19)
    at /knex/test/integration/query/additional.js:1035:11
    at async Promise.all (index 0) {
  fatal: true,
  code: 'PROTOCOL_CONNECTION_LOST'
}

And also the final attempt to COMMIT will crash:

     Error: COMMIT; - Cannot enqueue Query after fatal error.
      at Protocol._validateEnqueue (node_modules/mysql/lib/protocol/Protocol.js:213:16)
      at Protocol._enqueue (node_modules/mysql/lib/protocol/Protocol.js:138:13)
      at Connection.query (node_modules/mysql/lib/Connection.js:198:25)
      at /knex/lib/dialects/mysql/index.js:126:18
      at new Promise (<anonymous>)
      at Client_MySQL._query (lib/dialects/mysql/index.js:120:12)
      at executeQuery (lib/execution/internal/query-executioner.js:37:17)
      at Client_MySQL.query (lib/client.js:134:12)
      at /knex/lib/execution/transaction.js:363:24
      at new Promise (<anonymous>)
      at Client_MySQL.trxClient.query (lib/execution/transaction.js:358:12)
      at Transaction_MySQL.query (lib/dialects/mysql/transaction.js:10:8)
      at Transaction_MySQL.commit (lib/execution/transaction.js:115:17)
      at Function.transactor.commit (lib/execution/transaction.js:314:40)
      at /knex/lib/execution/transaction.js:228:35
      at processTicksAndRejections (internal/process/task_queues.js:93:5)

However I didn't manage to handle those errors properly, to raise KnexTimeoutError instead, and to discard the connection from the pool.

@martinmacko47
Copy link
Contributor Author

I created a PR #4416 with an attempt to fix it by removing the timeouted queries from the driver queue. However the solution messes with the driver internal queue quite a bit, so I'm not sure it's safe. But I haven't come up with any other working solution yet.

@elhigu
Copy link
Member

elhigu commented Apr 6, 2021

I mean... is there a way to setup mysql to also automatically rollback transaction if any of the queries fail. Current functionality is consistent with postgresql, since any failing query (for example insert conflict) will rollback the postgresql transaction, but it won't rollback in mysql.

So wouldn't making timeout to be a special case rolling back in mysql would make knex behaviour more inconsistent?

If consistency between mysql and postgresql is what you are looking for, then you could configure postgresql transactions to not rollback in case of failing queries?

@elhigu
Copy link
Member

elhigu commented Apr 6, 2021

I'm still not convinced that this behaviour even should be changed.

@martinmacko47
Copy link
Contributor Author

martinmacko47 commented Apr 7, 2021

@elhigu

is there a way to setup mysql to also automatically rollback transaction if any of the queries fail.

Not that I'm aware of. But even if there is such option, the timeouted query cancelation should work reliably even without using this option.

If consistency between mysql and postgresql is what you are looking for, then you could configure postgresql transactions to not rollback in case of failing queries?

Not really. It would be nice for MySQL and Postgres to behave consistently, but it is probably not possible, as their query cancellation commands behave differently.

My goal it to make cancelation of timeouted queries working reliably also in transactions on both databases. That is, when a query times out, it is cancelled. But not necessarily the same way or with the same impact on the transaction or the connection on both databases.

Currently, after fixing #4324, cancelation of timeouted queries works on MySQL reliably in transactions only if the queries are run one by one from a single coroutine. But it fails in some cases when queries are started simultaneously from multiple parallel coroutines. In such case they are queued in the driver and wait for their turn. And if they timeout while they are queued, they will never get cancelled. Before fixing #4324 cancelation of timeouted queries didn't work at all in transactions, so I wasn't aware of this corner case (or perhaps a race condition).

@martinmacko47
Copy link
Contributor Author

So wouldn't making timeout to be a special case rolling back in mysql would make knex behaviour more inconsistent?

Sorry, we probably didn't understand each other. No, I don't want to rollback the transaction in MySQL, I just want to cancel timeouted queries. And yes, preferably without closing the transaction nor the connection if possible.

Btw, even in Postgress the user may still resurrect their transaction after it is marked failed. They can create a savepoint before the slow query, try run it, and roll back to the savepoint in case the query times out. So it is good to keep the connection (and the transaction) open even in MySQL. We just need to find a reliable way how to cancel a query, even in the case it times out while still queued.

@martinmacko47
Copy link
Contributor Author

martinmacko47 commented Apr 7, 2021

Another example to demonstrate the problem:

knex.transaction(function(trx) {
  return Promise.all([
    knex.raw('SELECT SLEEP(2)').transacting(trx), // query A - without timeout
    knex.raw('SELECT SLEEP(10)').timeout(1000, {calcel: true}).transacting(trx), // query B
  ])
})

Expected behaviour in MySQL

  1. Query A is sent to DB. Query B is queued.
  2. After one second: Query B raises timeout exception and is cancelled. Query A is left running in DB.
  3. After two seconds: Query A completes
  4. After three seconds: No query is running in DB

Current behaviour with MySQL

  1. Query A is sent to DB. Query B is queued.
  2. After one second: Query B raises timeout exception and runs KILL QUERY ? which kills query A. After that query B is sent to DB
  3. After two seconds: Query B is running in DB
  4. After three seconds: Query B is still running in DB. I will be running for full 10 seconds until it completes, as it will be never canceled.

The problem is that KILL QUERY ? does not kill the query that timed out, but the query that is currently running instead. The query that times out before sending to DB, should be removed from the queue instead of sending KILL QUERY ?.

@elhigu
Copy link
Member

elhigu commented Apr 7, 2021

Btw, even in Postgress the user may still resurrect their transaction after it is marked failed. They can create a savepoint before the slow query, try run it, and roll back to the savepoint in case the query times out. So it is good to keep the connection (and the transaction) open even in MySQL. We just need to find a reliable way how to cancel a query, even in the case it times out while still queued.

Ah right... looks like there wasn't any switch in DB for that, but only setting for psql which is indeed implemented with implicit savepoints. https://www.postgresql.org/docs/current/app-psql.html

ON_ERROR_ROLLBACK
When set to on, if a statement in a transaction block generates an error, the error is ignored and the transaction continues. When set to interactive, such errors are only ignored in interactive sessions, and not when reading script files. When set to off (the default), a statement in a transaction block that generates an error aborts the entire transaction. The error rollback mode works by issuing an implicit SAVEPOINT for you, just before each command that is in a transaction block, and then rolling back to the savepoint if the command fails.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants