Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kill queries after timeout for PostgreSQL #2636

Merged
merged 60 commits into from
Sep 26, 2018
Merged

Kill queries after timeout for PostgreSQL #2636

merged 60 commits into from
Sep 26, 2018

Conversation

veikovx
Copy link
Contributor

@veikovx veikovx commented May 30, 2018

This is for #707 .
Basically a copy of the mysql functionality implemented for PostgreSQL

@elhigu
Copy link
Member

elhigu commented May 30, 2018

I will get back to this, when I have time to meditate on that implementation some time. Even that it is copied from mysql it doesn't automatically mean it is correct.

@kibertoad
Copy link
Collaborator

Should be updated to remove references to mariadb :)

@veikovx
Copy link
Contributor Author

veikovx commented Jul 10, 2018

Thanks for the notification @kibertoad. I've merged the latest commits into my branch now.

Hoping @elhigu will have had some time to meditate over this :)

@elhigu
Copy link
Member

elhigu commented Aug 16, 2018

I suppose this is good :) I couldn't spot anything wrong with it. I will still try this out locally to see how tests fail, when only test changes are applied, but cancel implementation is not there.

Copy link
Member

@elhigu elhigu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this. Code is good, but there should be link to documentation PR which adds information to docs that also postgresql now supports query canceling.

Also I would like to test that this won't leak pool connections... I added line comment about that.

@veikovx
Copy link
Contributor Author

veikovx commented Aug 23, 2018

Associated documentation update:
knex/documentation#131

sql: 'SELECT pg_terminate_backend(?);',
bindings: [connectionToKill.processID],
options: {},
}).then(() => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this then be actually .finally() so that even if that release query rejects the connection will be released from pool?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw. that error case is really hard to test... without wrapping

this.query(conn, {
        method: 'raw',
        sql: 'SELECT pg_terminate_backend(?);',
        bindings: [connectionToKill.processID],
        options: {},
      })

part to a separate method, which can then be temporarily overridden to throw an error during tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it should be .finally(). Will fix. Do you suggest that I try to write a test for this case as well? I can agree that it would be quite a task.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test shouldn't be so hard to write if done in a way I mentioned. Yep test would be important to have, because someone might break this code afterwards otherwise.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm porting this to a custom client I created in my source code in order to start using it and I have to wait until the this.releaseConnection(conn); promise resolves, otherwise, the next issued query will try to use the same conn DB connection and everything hangs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wpoch that shouldn't be possible if releaseConnection is working correctly. But that sounds like good test case to have, in case if there is bug in knex involved.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I finally implemented it this way:

    canCancelQuery: true,
    async cancelQuery(connectionToKill) {
      logger.info('Killing connection', { processID: connectionToKill.processID, __knexUid: connectionToKill.__knexUid, activeQueryText: connectionToKill.activeQuery.text });
      const connection = await this.acquireConnection();
      try {
        await this.query(connection, {
          method: 'raw',
          sql: 'SELECT pg_terminate_backend(?);',
          bindings: [connectionToKill.processID],
          options: {}
        });
        this.releaseConnection(connectionToKill);
      } finally {
        this.releaseConnection(connection);
      }
    }

These are my test cases, I would try to port them to the knex test suite if I have time in order to repro.

'use strict';

const should = require('should');
const uuid = require('node-uuid');
const injectorFactory = require('../../injector');

describe('Timeout PG Client', () => {
  const queryTimeout = 50;
  let knex;
  before(() => {
    const injector = injectorFactory({ database: { defaultQueryTimeout: queryTimeout } });
    knex = injector.get('db');
  });

  async function assertQueryIsIdle(query) {
    const { rows } = await knex.raw(`
      SELECT pid, now() - pg_stat_activity.query_start AS duration, query, state
      FROM pg_stat_activity
      WHERE state = 'active'
      AND query = ?;`,
      [query]
    );

    rows.length.should.be.equal(0);
  }

  describe('given a query that takes less', () => {
    describe('than the default timeout', () => {
      it('should complete and retrieve results', async () => {
        const result = await knex.raw(`SELECT 1 as result, pg_sleep(${ 1 / 1000})`);
        result.rows[0].result.should.be.equal(1);
      });
    });
    describe('than the specified timeout', () => {
      it('should complete and retrieve results', async () => {
        const result = await knex.raw(`SELECT 1 as result, pg_sleep(${ 1 / 1000})`)
          .timeout(queryTimeout, { cancel: true });
        result.rows[0].result.should.be.equal(1);
      });
    });
  });
  describe('given a query with an error', () => {
    it(`should throw the original error`, async () => {
      await knex.raw(`SELECT 1 as result FROM not_existing_table`)
        .should.be.rejectedWith('relation "not_existing_table" does not exist');
    });
  });
  describe('given a query that takes longer', () => {
    describe('than the default timeout', () => {
      it('should timeout and kill the underlying query', async () => {
        const sql = `SELECT 1 as result, pg_sleep(${(queryTimeout + 2) }), '${uuid.v4()}'`;
        await knex.raw(sql).should.be.rejectedWith(`Defined query timeout of ${queryTimeout}ms exceeded when running query.`);
        await assertQueryIsIdle(sql);
      });
    });
    describe('than the specified timeout', () => {
      describe('when the timeout is bigger than the default', () => {
        it('should timeout and kill the underlying query', async () => {
          const timeoutOverride = queryTimeout * 20;
          // Adding a generic UUID in order to not conflict with parallel runs
          const sql = `SELECT 1 as result, pg_sleep(${(timeoutOverride + 2)}), '${uuid.v4()}'`;
          await knex.raw(sql).timeout(timeoutOverride, { cancel: true }).should.be.rejectedWith(`Defined query timeout of ${timeoutOverride}ms exceeded when running query.`);
          await assertQueryIsIdle(sql);
        });
      });
      describe('when the timeout is smaller than the default', () => {
        it('should timeout and kill the underlying query', async () => {
          const timeoutOverride = queryTimeout / 2;
          const sql = `SELECT 1 as result, pg_sleep(${(timeoutOverride + 2) / 1000}), '${uuid.v4()}'`;
          await knex.raw(sql).timeout(timeoutOverride, { cancel: true }).should.be.rejectedWith(`Defined query timeout of ${timeoutOverride}ms exceeded when running query.`);
          await assertQueryIsIdle(sql);
        });
      });
    });
  });
});

I copied the implementation from this PR and the one from mysql and both hang on the assertQueryIsIdle method since they try to use the same DB connection that was being killed.
With my implementation, the assertQueryIsIdle method will use a different connection which is the same that's being used for executing the kill query.

@KristjanTammekivi
Copy link
Contributor

What's the status on this? (aka bump)

@elhigu
Copy link
Member

elhigu commented Sep 6, 2018

@KristjanTammekivi Last activity was when I hoped for having regression test for part which makes sure connection is released even if release fails #2636 (comment)

willsoto and others added 15 commits September 17, 2018 15:35
* Example build of custom log functions
* Handle logger object better for transactions
* Adjust test to ignore sqlite warning message
While in theory, this may reduce the bundle size,
in practice it adds a ton of overhead during startup
due to the number of additional requires. Bundle
size also shouldn't matter for server side modules.
* Add json support to the query builder for mysql

refs #1036

Based on #1902

* Clarify supported version
* fix wrapIdentifier not being called in postgres alter column

* add test for wrapIdentifier call in postgres alter column

* add comment regarding issue

* add issue & PR #'s in comment
* Add json support to the query builder for mysql

refs #1036

Based on #1902

* Clarify supported version

* Use native Buffer and Stream implementations
* chore: add Node.js 10

* chore: trigger new build

* chore: update lockfile

* chore: trigger new build

* fix: use npm i instead of npm ci
* Run SQL Server tests locally running SQL server in docker

* WIP mssql test stuff

* Patched MSSQL driver to not crash knex anymore
* Add json support to the query builder for mysql

refs #1036

Based on #1902

* Clarify supported version

* Use native Buffer and Stream implementations

* Remove WebSQL dialect
tgriesser and others added 21 commits September 17, 2018 15:35
* Fix issue with select(0). Fixes #2658
* Refactor migrator

* Fix exports

* Fix ESLint

* Fix migrator

* Fix reference to config

* Split some more

* Fix table builder

* Fix argument order

* Merge branch 'master' of https://github.com/tgriesser/knex into feature/2690-support-multiple-migration-dirs

# Conflicts:
#	src/migrate/index.js
#	test/index.js
#	test/unit/migrate/migrator.js
* Fix #2715, explicitly set precision in datetime & timestamp
* Allow for precision in knex.fn.now, mysql time
* Add test for datetime with precision
…2749)

* Add tests for multiple union arguments

* Add some callback and raw tests to union unit tests
* Use Datetime2 for MSSQL datetime + timestamp types

Datetime2 is now the recommended column type for new date work: https://docs.microsoft.com/en-us/sql/t-sql/data-types/datetime-transact-sql?view=sql-server-2017

* Add tests for MSSQL datetime2 changes
* Add changelog entry for a breaking change

* Improve entry
* Allow timestamp with timezone on mssql databases

* Change timestamp parameter to use object instead of boolean
* Implement support for multiple migration directories

* Add tests for new functionality

* Fix tape tests

* Pass migration directory upwards

* Fix multiple directory support

* Fix bugs

* Rollback correctly

* Fix remaining tests

* Address comments
* Implement fail-fast logic for dialect resolution, clean-up code around.

* Remove method that was deprecated long time ago

* Address additional comments

* Try addressing comments

* Set client explicitly

* Fix compatibility with older Node versions
* Use columnize instead of wrap in using().

This is an attempt to fix #2136. Also added an integration test, couldn't find any existing.

* Exclude MSSQL from test as it does not support .using

* Change test to not use subquery

* Change test
* Introduced abstraction for getting migrations

This would allow a webpack migration source which is compatible with bundling.

* Fixed migration validation with custom migration source

* Fixed issues after rebasing on muti directory PR

* Renamed parameter and fixed error message

* Addressed some PR comments

* Finished comment

* Moved filename extension filtering into fs-migrator

* Added test showing in memory custom migration source

* Cleaned up how to get config

* Fixed failing test

* Hopefully fix tests

* Fix Node.js 10 support in tests
@kibertoad
Copy link
Collaborator

@veikovx There's a bunch of conflicts.

@veikovx
Copy link
Contributor Author

veikovx commented Sep 20, 2018

Haven't found time to deal with this. Sorry about that.
@elhigu I made the separation and a test for it.
@wpoch I could not reproduce your issue. Latest test I added fails at first cancel query and then tries another query plus it's cancelling, which both succeed. With pool.max set to 2 this would mean that both cancelled and cancelling query are released. If you can reproduce it with a test that'd be great.

_wrappedCancelQueryCall(conn, connectionToKill) {
return this.query(conn, {
method: 'raw',
sql: 'SELECT pg_terminate_backend(?);',

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be using pg_cancel_backend instead of pg_terminate_backend? pg_terminate_backend is more heavy handed than necessary most of the time.

@elhigu
Copy link
Member

elhigu commented Sep 26, 2018

Looking good, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.