Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retry failing for implicit pool queries #176

Closed
ivank opened this issue May 13, 2020 · 10 comments · Fixed by #308
Closed

Retry failing for implicit pool queries #176

ivank opened this issue May 13, 2020 · 10 comments · Fixed by #308
Labels

Comments

@ivank
Copy link

ivank commented May 13, 2020

When using pool.query and Postgres throws an error, for example deadlock detected, code 40P01 slink attempts a retry. It is possible to have a deadlock even when executing a normal update query, not just a transaction.

Expected Behavior

We should retry the implicit query.

Current Behavior

The query mechanism seems to assume there are transactionQueries and we end up with transactionQueries is not iterable error. (In node_modules/slonik/dist/routines/executeQuery.js:32:38)

Steps to Reproduce

Execute several update queries that would deadlock, using implicit pool.query

Logs

    TypeError: transactionQueries is not iterable
        at retryTransaction (/Users/ivankerin/Projects/boost-statements-loss/node_modules/slonik/dist/routines/executeQuery.js:32:38)
        at process._tickCallback (internal/process/next_tick.js:68:7) }

If I set transactionRetryLimit: 0, I get:

   error:
    { error: deadlock detected
        at Connection.parseE (/Users/ivankerin/Projects/boost-statements-loss/node_modules/pg/lib/connection.js:614:13)
        at Connection.parseMessage (/Users/ivankerin/Projects/boost-statements-loss/node_modules/pg/lib/connection.js:413:19)
        at Socket.<anonymous> (/Users/ivankerin/Projects/boost-statements-loss/node_modules/pg/lib/connection.js:129:22)
        at Socket.emit (events.js:198:13)
        at Socket.EventEmitter.emit (domain.js:448:20)
        at addChunk (_stream_readable.js:287:12)
        at readableAddChunk (_stream_readable.js:268:11)
        at Socket.Readable.push (_stream_readable.js:223:10)
        at TCP.onStreamRead [as onread] (internal/stream_base_commons.js:94:17)
      name: 'error',
      length: 323,
      severity: 'ERROR',
      code: '40P01',
      detail:
       'Process 383 waits for ShareLock on transaction 5191; blocked by process 378.\nProcess 378 waits for ShareLock on transaction 5190; blocked by process 383.',
      hint: 'See server log for query details.',
      position: undefined,
      internalPosition: undefined,
      internalQuery: undefined,
      where: 'while updating tuple (59565,14) in relation "supply"',
      schema: undefined,
      table: undefined,
      column: undefined,
      dataType: undefined,
      constraint: undefined,
      file: 'deadlock.c',
      line: '1146',
      routine: 'DeadLockReport' } }
@ivank ivank added the bug label May 13, 2020
@gajus
Copy link
Owner

gajus commented May 13, 2020

Sounds like a simple fix. Are you available to raise a PR?

@ivank
Copy link
Author

ivank commented May 18, 2020

Sorry, not used to dealing with flow. Would gladly help if it was plain js / TypeScript.

@gajus
Copy link
Owner

gajus commented May 18, 2020

There is nothing about this fix that requires knowing Flow.

@oh-wo
Copy link

oh-wo commented Jul 12, 2020

Hi @gajus ,

I'm interested to have a go at this one but need some guidance as I am fresh in this codebase. Like ivank I am not used to some of the conventions/structure of the codebase so please "go easy on me" if I am a bit "off".

The problem
I've had a look around and think I understand the problem:

  • retry of failed transactions works now. It works by invoking retryTransaction in which it repeatedly retries the transaction until remainingRetries is 0, at which point it exits.
  • remainingRetries expects transactionQueries to be an iterable and fails if it is not, in the loop within retryTransaction that applies each transactionQuery
  • transactionQueries is an array for transactions (set in transaction), but is undefined for other queries

How to fix
I think the best fix is to modify lines 152/159 of executeQuery.js to always set transactionQueries.

if (!connection.connection.slonik.transactionQueries) {
   connection.connection.slonik.transactionQueries = []
}

connection.connection.slonik.transactionQueries.push({
  executionContext,
  executionRoutine,
  sql: actualQuery.sql,
  values: actualQuery.values,
})
  • are there other side effects elsewhere if I do this? I don't quite understand the role of transactionQueries vs actualQuery and why they're split:
  • Alternatively, connection.connection.slonik.transactionQueries = []; could be defined elsewhere, like in mapTaggedTemplateLiteralInvocation (and the other methods in bindPoolConnection) so it is always defined for each query.
  • would that work or is there more to it than that? Is there an alternative solution you prefer where we don't use transactionQueries?
  • Looking at transaction, connection.connection.slonik.transactionQueries = []; is set when the transactionId is defined. Therefore it would be nice to do the same within targetMethod (e.g. manyFirst etc). Is that best?

If you have a few minutes, could you please provide any feedback? Thanks for your time.

@gajus
Copy link
Owner

gajus commented Jul 13, 2020

@oh-wo if you can add a failing test case to integration, I will make the necessary changes to make it work.

@oh-wo
Copy link

oh-wo commented Jul 13, 2020

Thanks; I will aim to open a PR in the next 1-2 weeks.

@oh-wo
Copy link

oh-wo commented Aug 22, 2020

Hey Gajus,

Just to follow up and apologise for the delay; I am not sure when I will get to this, despite it being on my mind.

@nponiros
Copy link
Contributor

nponiros commented Nov 5, 2021

I'm trying to fix this before fixing #163 but I'm a bit unsure what the best way to fix it is.

My initial idea was to only retry the query that failed but I don't see a way to tell if the query is part of a transaction that was started by executing a 'begin' query without using the transaction method. If it is part of a transaction then retrying the query will not work.

I also don't think that using connection.connection.slonik.transactionQueries as suggested above will work because, as far as I can tell, there is no guarantee that only one query will be executed on any given connection. If a connection is used for multiple queries and we add the transactionQueries array to it in any method we will have the bug described in #163. Also if the user didn't explicitly start a transaction it is probably a bad idea to retry all queries.

The last possible solution I can think of is to fix the code so that it doesn't retry if we are not within a transaction started by the transaction method. This way the code will not throw the transactionQueries is not iterable error. With this users will see the deadlock error and would have to handle it themselves.

@gajus unless you have something against my first idea I will just implement that and trust that the users will not start transactions without using the transaction method.

@gajus
Copy link
Owner

gajus commented Nov 5, 2021

@gajus unless you have something against my first idea I will just implement that and trust that the users will not start transactions without using the transaction method.

That's a reasonable assumption.

nponiros added a commit to nponiros/slonik that referenced this issue Nov 6, 2021
Move transaction retry handling to the transaction and nestedTransaction
methods and change it so that the handler function is called again
instead of just replaying the queries that are part of the transaction.
Fixes gajus#163.

Add a second parameter to the transaction method to allow users to
define a retry limit per transaction. If a retry limit is given for a
transaction then the global defined retry limit for transactions is
ignored.

Change executeQuery so that it retries individual queries that failed
with a transaction rollback error. Only queries that are not part of a
transaction are retried. The number of times a query is retried is
specified by the global queryRetryLimit configuration. Fixes gajus#176.

The above change also fixes gajus#196 since the transactionQueries array no
longer exists.
@gajus gajus closed this as completed in #308 Nov 7, 2021
gajus pushed a commit that referenced this issue Nov 7, 2021
Move transaction retry handling to the transaction and nestedTransaction
methods and change it so that the handler function is called again
instead of just replaying the queries that are part of the transaction.
Fixes #163.

Add a second parameter to the transaction method to allow users to
define a retry limit per transaction. If a retry limit is given for a
transaction then the global defined retry limit for transactions is
ignored.

Change executeQuery so that it retries individual queries that failed
with a transaction rollback error. Only queries that are not part of a
transaction are retried. The number of times a query is retried is
specified by the global queryRetryLimit configuration. Fixes #176.

The above change also fixes #196 since the transactionQueries array no
longer exists.

BREAKING CHANGE: changes query / transaction retry strategy
@gajus
Copy link
Owner

gajus commented Nov 7, 2021

🎉 This issue has been resolved in version 25.0.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@gajus gajus added the released label Nov 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants