Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Knex .timeout(ms, {cancel: boolean}) use cancel instead of terminate for postgresql #3518

Merged
merged 1 commit into from Nov 21, 2019

Conversation

@JbRezidu
Copy link
Contributor

JbRezidu commented Nov 5, 2019

This pull request is to fix a bug that I have with knex on postgresql (knex version : 0.20.1, node-pg version : 7.12.1, postgresql database version : 11

The bug is on the timeout feature of knex.
When I try to cancel a postgresql request (with the knex.timeout and the cancel option set to true) the request is correctly canceled but if I try to run a request just after the canceled one sometimes I have this error :

Error: Connection terminated unexpectedly
at Connection.con.once (xxx/node_modules/pg/lib/client.js:200:9)
at Object.onceWrapper (events.js:286:20)\n    at Connection.emit (events.js:198:13)
at Socket.<anonymous> (xxx/node_modules/pg/lib/connection.js:129:10)
at Socket.emit (events.js:203:15)\n    at endReadableNT (_stream_readable.js:1145:12)
at process._tickCallback (internal/process/next_tick.js:63:19)

The bug is kind of random but I can easily reproduce it when doing this :

const knex = require('knex')({
  client: 'postgres',
  connection: {
    host: 'pg-database',
    port: 5432,
    user: 'xxx',
    password: 'yyy',
    database: 'zzz'
  },
  pool: {min: 2, max: 10}
});
try {
  const query = knex.from('information_schema.sql_features as s').select('*')
      .join('information_schema.sql_features as s1', 's1.feature_id', 's.feature_id')
      .join('information_schema.sql_features as s2', 's2.feature_id', 's.feature_id')
      .join('information_schema.sql_features as s3', 's3.feature_id', 's.feature_id')
      .join('information_schema.sql_features as s4', 's4.feature_id', 's.feature_id')
      .join('information_schema.sql_features as s5', 's5.feature_id', 's.feature_id')
      .join('information_schema.sql_features as s6', 's6.feature_id', 's.feature_id')
      .timeout(500, {cancel: true});
} catch(err) {
  // we catch here the timeout error of knex (which is normal)
  const query2 = knex.from('information_schema.sql_features').select('*');
  try {
    const result2 = await query2;
    console.log('the bug is not here');
  } catch(err2) {
    // we should not come here cause the second request does not have any timeout
    console.log('the bug is here')
  }
}

I try to get some select * from a postgresql table (you should be abble to reproduce it), I join on the same table than the select to have a huge query that takes time (it seems that the problem is very rare on small query with small timeout or with a select pg_sleep(10)
I see the the bug is here console.log many times, not always but many times with the error I put just above...

I guess it is because of the use of pg_terminate_backend method to cancel the query which totaly destroyed the connection and for the second request I get something that is altered by this terminate thing.
If I use pg_cancel_backend instead I have no problem at all, everything works fine and the request is well canceled in postgresql database.

esvinson added a comment on 26 Sep 2018 on the pull request (related with timeout on postgresql : #2636 ) asking why we used pg_terminate_backend instead of pg_cancel_backend and he's probably right... there is no need here to use the terminate instead of the cancel and it is working fine with the cancel...

…nate
@JbRezidu JbRezidu changed the title Knex .timeout(ms, {cancel: boolean}) us cancel instead of terminate for postgresql Knex .timeout(ms, {cancel: boolean}) use cancel instead of terminate for postgresql Nov 5, 2019
@elhigu

This comment has been minimized.

Copy link
Member

elhigu commented Nov 6, 2019

Error: Connection terminated unexpectedly
at Connection.con.once (xxx/node_modules/pg/lib/client.js:200:9)
at Object.onceWrapper (events.js:286:20)\n    at Connection.emit (events.js:198:13)
at Socket.<anonymous> (xxx/node_modules/pg/lib/connection.js:129:10)
at Socket.emit (events.js:203:15)\n    at endReadableNT (_stream_readable.js:1145:12)
at process._tickCallback (internal/process/next_tick.js:63:19)

Good catch. Maybe this happen if you cancel query inside a transaction and connection is actually not returned to pool after query was cancelled (and terminated).

Otherwise there is some other bug causing that behavior and changing terminate -> cancel just hides it away. I mean that if query is terminated outside of transaction, then the canceled/broken connection is returned to the pool and connection closed event should have been recorded inside a connection telling that connection is broken (if query returns an error then it means that connection close event has also reached the pg driver).

Anyways I agree pg_cancel_backend is the correct function to call for canceling. I don't know how I missed that comment in the original PR.

I think that we also should have a separate reserved connection in the pool to run our maintenance queries like cancel in case where pool is full.

@kibertoad

This comment has been minimized.

Copy link
Collaborator

kibertoad commented Nov 14, 2019

@JbRezidu Do you still plan to expand upon this PR?

@JbRezidu

This comment has been minimized.

Copy link
Contributor Author

JbRezidu commented Nov 15, 2019

@JbRezidu Do you still plan to expand upon this PR?

I'm sorry but I won't have the time to do more, I'm on a big project and it takes me all my time... :(

@elhigu

This comment has been minimized.

Copy link
Member

elhigu commented Nov 18, 2019

I think that this PR could be merged if current tests are passing.

An additional integration test would be nice to have, where a query running inside transaction is canceled and then transaction can be still used (that cancellation might rollback the transaction which is fine).

@JbRezidu

This comment has been minimized.

Copy link
Contributor Author

JbRezidu commented Nov 18, 2019

I think all tests are cleared, don't know why it is written Some checks haven't completed yet because when I click on it I see everything green #4620 passed

@kibertoad kibertoad merged commit 2ac08c3 into knex:master Nov 21, 2019
1 check was pending
1 check was pending
continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
@kibertoad

This comment has been minimized.

Copy link
Collaborator

kibertoad commented Nov 27, 2019

Released in 0.20.3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.