New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kill queries after timeout for PostgreSQL #2636

Merged
merged 60 commits into from Sep 26, 2018

Conversation

Projects
None yet
@veikovx
Copy link
Contributor

veikovx commented May 30, 2018

This is for #707 .
Basically a copy of the mysql functionality implemented for PostgreSQL

@elhigu

This comment has been minimized.

Copy link
Collaborator

elhigu commented May 30, 2018

I will get back to this, when I have time to meditate on that implementation some time. Even that it is copied from mysql it doesn't automatically mean it is correct.

@kibertoad

This comment has been minimized.

Copy link
Collaborator

kibertoad commented Jul 1, 2018

Should be updated to remove references to mariadb :)

@veikovx

This comment has been minimized.

Copy link
Contributor

veikovx commented Jul 10, 2018

Thanks for the notification @kibertoad. I've merged the latest commits into my branch now.

Hoping @elhigu will have had some time to meditate over this :)

@elhigu

This comment has been minimized.

Copy link
Collaborator

elhigu commented Aug 16, 2018

I suppose this is good :) I couldn't spot anything wrong with it. I will still try this out locally to see how tests fail, when only test changes are applied, but cancel implementation is not there.

@elhigu
Copy link
Collaborator

elhigu left a comment

Thanks for this. Code is good, but there should be link to documentation PR which adds information to docs that also postgresql now supports query canceling.

Also I would like to test that this won't leak pool connections... I added line comment about that.

Show resolved Hide resolved src/dialects/postgres/index.js Outdated
@veikovx

This comment has been minimized.

Copy link
Contributor

veikovx commented Aug 23, 2018

Associated documentation update:
knex/documentation#131

sql: 'SELECT pg_terminate_backend(?);',
bindings: [connectionToKill.processID],
options: {},
}).then(() => {

This comment has been minimized.

@elhigu

elhigu Aug 29, 2018

Collaborator

Shouldn't this then be actually .finally() so that even if that release query rejects the connection will be released from pool?

This comment has been minimized.

@elhigu

elhigu Aug 29, 2018

Collaborator

btw. that error case is really hard to test... without wrapping

this.query(conn, {
        method: 'raw',
        sql: 'SELECT pg_terminate_backend(?);',
        bindings: [connectionToKill.processID],
        options: {},
      })

part to a separate method, which can then be temporarily overridden to throw an error during tests.

This comment has been minimized.

@veikovx

veikovx Aug 30, 2018

Contributor

Yes it should be .finally(). Will fix. Do you suggest that I try to write a test for this case as well? I can agree that it would be quite a task.

This comment has been minimized.

@elhigu

elhigu Aug 30, 2018

Collaborator

The test shouldn't be so hard to write if done in a way I mentioned. Yep test would be important to have, because someone might break this code afterwards otherwise.

This comment has been minimized.

@wpoch

wpoch Sep 14, 2018

I'm porting this to a custom client I created in my source code in order to start using it and I have to wait until the this.releaseConnection(conn); promise resolves, otherwise, the next issued query will try to use the same conn DB connection and everything hangs.

This comment has been minimized.

@elhigu

elhigu Sep 17, 2018

Collaborator

@wpoch that shouldn't be possible if releaseConnection is working correctly. But that sounds like good test case to have, in case if there is bug in knex involved.

This comment has been minimized.

@wpoch

wpoch Sep 17, 2018

I finally implemented it this way:

    canCancelQuery: true,
    async cancelQuery(connectionToKill) {
      logger.info('Killing connection', { processID: connectionToKill.processID, __knexUid: connectionToKill.__knexUid, activeQueryText: connectionToKill.activeQuery.text });
      const connection = await this.acquireConnection();
      try {
        await this.query(connection, {
          method: 'raw',
          sql: 'SELECT pg_terminate_backend(?);',
          bindings: [connectionToKill.processID],
          options: {}
        });
        this.releaseConnection(connectionToKill);
      } finally {
        this.releaseConnection(connection);
      }
    }

These are my test cases, I would try to port them to the knex test suite if I have time in order to repro.

'use strict';

const should = require('should');
const uuid = require('node-uuid');
const injectorFactory = require('../../injector');

describe('Timeout PG Client', () => {
  const queryTimeout = 50;
  let knex;
  before(() => {
    const injector = injectorFactory({ database: { defaultQueryTimeout: queryTimeout } });
    knex = injector.get('db');
  });

  async function assertQueryIsIdle(query) {
    const { rows } = await knex.raw(`
      SELECT pid, now() - pg_stat_activity.query_start AS duration, query, state
      FROM pg_stat_activity
      WHERE state = 'active'
      AND query = ?;`,
      [query]
    );

    rows.length.should.be.equal(0);
  }

  describe('given a query that takes less', () => {
    describe('than the default timeout', () => {
      it('should complete and retrieve results', async () => {
        const result = await knex.raw(`SELECT 1 as result, pg_sleep(${ 1 / 1000})`);
        result.rows[0].result.should.be.equal(1);
      });
    });
    describe('than the specified timeout', () => {
      it('should complete and retrieve results', async () => {
        const result = await knex.raw(`SELECT 1 as result, pg_sleep(${ 1 / 1000})`)
          .timeout(queryTimeout, { cancel: true });
        result.rows[0].result.should.be.equal(1);
      });
    });
  });
  describe('given a query with an error', () => {
    it(`should throw the original error`, async () => {
      await knex.raw(`SELECT 1 as result FROM not_existing_table`)
        .should.be.rejectedWith('relation "not_existing_table" does not exist');
    });
  });
  describe('given a query that takes longer', () => {
    describe('than the default timeout', () => {
      it('should timeout and kill the underlying query', async () => {
        const sql = `SELECT 1 as result, pg_sleep(${(queryTimeout + 2) }), '${uuid.v4()}'`;
        await knex.raw(sql).should.be.rejectedWith(`Defined query timeout of ${queryTimeout}ms exceeded when running query.`);
        await assertQueryIsIdle(sql);
      });
    });
    describe('than the specified timeout', () => {
      describe('when the timeout is bigger than the default', () => {
        it('should timeout and kill the underlying query', async () => {
          const timeoutOverride = queryTimeout * 20;
          // Adding a generic UUID in order to not conflict with parallel runs
          const sql = `SELECT 1 as result, pg_sleep(${(timeoutOverride + 2)}), '${uuid.v4()}'`;
          await knex.raw(sql).timeout(timeoutOverride, { cancel: true }).should.be.rejectedWith(`Defined query timeout of ${timeoutOverride}ms exceeded when running query.`);
          await assertQueryIsIdle(sql);
        });
      });
      describe('when the timeout is smaller than the default', () => {
        it('should timeout and kill the underlying query', async () => {
          const timeoutOverride = queryTimeout / 2;
          const sql = `SELECT 1 as result, pg_sleep(${(timeoutOverride + 2) / 1000}), '${uuid.v4()}'`;
          await knex.raw(sql).timeout(timeoutOverride, { cancel: true }).should.be.rejectedWith(`Defined query timeout of ${timeoutOverride}ms exceeded when running query.`);
          await assertQueryIsIdle(sql);
        });
      });
    });
  });
});

I copied the implementation from this PR and the one from mysql and both hang on the assertQueryIsIdle method since they try to use the same DB connection that was being killed.
With my implementation, the assertQueryIsIdle method will use a different connection which is the same that's being used for executing the kill query.

@KristjanTammekivi

This comment has been minimized.

Copy link

KristjanTammekivi commented Sep 6, 2018

What's the status on this? (aka bump)

@elhigu

This comment has been minimized.

Copy link
Collaborator

elhigu commented Sep 6, 2018

@KristjanTammekivi Last activity was when I hoped for having regression test for part which makes sure connection is released even if release fails #2636 (comment)

willsoto and others added some commits May 29, 2018

Allow overwriting log functions (#2625)
* Example build of custom log functions
* Handle logger object better for transactions
* Adjust test to ignore sqlite warning message
Remove babel-plugin-lodash (#2634)
While in theory, this may reduce the bundle size,
in practice it adds a ton of overhead during startup
due to the number of additional requires. Bundle
size also shouldn't matter for server side modules.
Add json support to the query builder for mysql (#2635)
* Add json support to the query builder for mysql

refs #1036

Based on #1902

* Clarify supported version
fix wrapIdentifier not being called in postgres alter column (#2612)
* fix wrapIdentifier not being called in postgres alter column

* add test for wrapIdentifier call in postgres alter column

* add comment regarding issue

* add issue & PR #'s in comment
Remove readable-stream and safe-buffer (#2640)
* Add json support to the query builder for mysql

refs #1036

Based on #1902

* Clarify supported version

* Use native Buffer and Stream implementations
chore: add Node.js 10 (#2594)
* chore: add Node.js 10

* chore: trigger new build

* chore: update lockfile

* chore: trigger new build

* fix: use npm i instead of npm ci
Fix mssql driver crashing (#2637)
* Run SQL Server tests locally running SQL server in docker

* WIP mssql test stuff

* Patched MSSQL driver to not crash knex anymore
Remove WebSQL dialect (#2647)
* Add json support to the query builder for mysql

refs #1036

Based on #1902

* Clarify supported version

* Use native Buffer and Stream implementations

* Remove WebSQL dialect

tgriesser and others added some commits Jul 13, 2018

Fix issue with select(0) (#2711)
* Fix issue with select(0). Fixes #2658
Refactor migrator (#2695)
* Refactor migrator

* Fix exports

* Fix ESLint

* Fix migrator

* Fix reference to config

* Split some more

* Fix table builder

* Fix argument order

* Merge branch 'master' of https://github.com/tgriesser/knex into feature/2690-support-multiple-migration-dirs

# Conflicts:
#	src/migrate/index.js
#	test/index.js
#	test/unit/migrate/migrator.js
Fix #2715 (#2721)
* Fix #2715, explicitly set precision in datetime & timestamp
* Allow for precision in knex.fn.now, mysql time
* Add test for datetime with precision
Add tests for multiple union arguments with callbacks and builders (#…
…2749)

* Add tests for multiple union arguments

* Add some callback and raw tests to union unit tests
Use Datetime2 for MSSQL datetime + timestamp types (#2757)
* Use Datetime2 for MSSQL datetime + timestamp types

Datetime2 is now the recommended column type for new date work: https://docs.microsoft.com/en-us/sql/t-sql/data-types/datetime-transact-sql?view=sql-server-2017

* Add tests for MSSQL datetime2 changes
General/document breaking change (#2774)
* Add changelog entry for a breaking change

* Improve entry
Allow timestamp with timezone on mssql databases (#2724)
* Allow timestamp with timezone on mssql databases

* Change timestamp parameter to use object instead of boolean
Feature/2690: Multiple migration directories (#2735)
* Implement support for multiple migration directories

* Add tests for new functionality

* Fix tape tests

* Pass migration directory upwards

* Fix multiple directory support

* Fix bugs

* Rollback correctly

* Fix remaining tests

* Address comments
#2758: Implement fail-fast logic for dialect resolution (#2776)
* Implement fail-fast logic for dialect resolution, clean-up code around.

* Remove method that was deprecated long time ago

* Address additional comments

* Try addressing comments

* Set client explicitly

* Fix compatibility with older Node versions
Use columnize instead of wrap in using(). (#2713)
* Use columnize instead of wrap in using().

This is an attempt to fix #2136. Also added an integration test, couldn't find any existing.

* Exclude MSSQL from test as it does not support .using

* Change test to not use subquery

* Change test
Introduced abstraction for getting migrations (#2775)
* Introduced abstraction for getting migrations

This would allow a webpack migration source which is compatible with bundling.

* Fixed migration validation with custom migration source

* Fixed issues after rebasing on muti directory PR

* Renamed parameter and fixed error message

* Addressed some PR comments

* Finished comment

* Moved filename extension filtering into fs-migrator

* Added test showing in memory custom migration source

* Cleaned up how to get config

* Fixed failing test

* Hopefully fix tests

* Fix Node.js 10 support in tests
@kibertoad

This comment has been minimized.

Copy link
Collaborator

kibertoad commented Sep 20, 2018

@veikovx There's a bunch of conflicts.

@veikovx

This comment has been minimized.

Copy link
Contributor

veikovx commented Sep 20, 2018

Haven't found time to deal with this. Sorry about that.
@elhigu I made the separation and a test for it.
@wpoch I could not reproduce your issue. Latest test I added fails at first cancel query and then tries another query plus it's cancelling, which both succeed. With pool.max set to 2 this would mean that both cancelled and cancelling query are released. If you can reproduce it with a test that'd be great.

_wrappedCancelQueryCall(conn, connectionToKill) {
return this.query(conn, {
method: 'raw',
sql: 'SELECT pg_terminate_backend(?);',

This comment has been minimized.

@esvinson

esvinson Sep 26, 2018

Should this be using pg_cancel_backend instead of pg_terminate_backend? pg_terminate_backend is more heavy handed than necessary most of the time.

@elhigu

elhigu approved these changes Sep 26, 2018

@elhigu

This comment has been minimized.

Copy link
Collaborator

elhigu commented Sep 26, 2018

Looking good, thanks!

@elhigu elhigu merged commit 932fa4b into tgriesser:master Sep 26, 2018

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage increased (+0.09%) to 85.801%
Details

@veikovx veikovx deleted the veikovx:postgresKillQuery branch Sep 27, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment