Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

idle_transaction_timeout affects transactions which are not actually idle in transaction given time #125

Closed
alexius2 opened this issue Mar 22, 2016 · 5 comments
Assignees

Comments

@alexius2
Copy link

Sometimes pgbouncer close transaction because of idle_transaction_timeout when it shouldn't. i.e. application run some slow queries (duration of query could be larger than timeout) one after another without significant delay between them (confirmed by logs).

steps to reproduce:

  1. set idle_transaction_timeout = 1 in config, reload.
  2. create file with simple transaction:
    cat ~/test-idle.sql
    begin;
    select pg_sleep(1);
    select pg_sleep(2);
    rollback;
  3. make sure you are able to connect to some db via pgbouncer without password (add password to ~/.pgpass for example) and run

for i in {1..1000}; do psql -p 6432 -h 127.0.0.1 -U postgres -f ~/test-idle.sql || break; done 1>/dev/null

  1. wait until you see error messages like this:
    psql:/var/lib/pgsql/test-idle.sql:2: ERROR: idle transaction timeout
    psql:/var/lib/pgsql/test-idle.sql:3: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
    psql:/var/lib/pgsql/test-idle.sql:3: connection to server was lost

it could take several minutes. script will stop after error.

reproduced with pgbouncer 1.7.2 at centos 7.2.1511. I guess problem is somewhere in these lines:

age = now - server->link->request_time;

@Nastradamus
Copy link

We have the same bug.
'Idle in transaction time' is calculating from the begining but should from the end of last statement.

idle_transaction_timeout crashes long active transactions.

@petere petere self-assigned this Feb 2, 2018
@amir-hadi
Copy link

We plan to use pgBouncer to prevent connection leaks by using this functionality, but it seems to be broken. Is there a reason why this is not touched yet? Or am I misunderstanding this feature?

Our application runs distributed at GCP using preemptible machines. Somehow when GCP is killing the preemptible machines it can happen that our applications postgres connection remains active in postgres and with that connection all possible rowlocks, blocking certain rows for an indefinite time. The idea was to use pgBouncer to kill transactions / connections that are idling to long and using this parameter, but it this functionality doesn't work, I don't think that pgBouncer will help us in our usecase.

Is there something where we can help getting this fixed?

@alefhsousa
Copy link

I have the same problem related by @amir-hadi but using aws lambda. Exists any workaround for this?

@petere
Copy link
Member

petere commented Feb 6, 2019

I can reproduce the issue. I think the problem is that

age = now - server->link->request_time;

uses the client's request time, but we should be using the server's request time because that's where the information about whether we are "idle in transaction" comes from. (The request time is basically when the last network packet was processed.)

If I change the line to

age = now - server->request_time;

then the test case runs successfully.

(The previous coding was correct for query_timeout, so we'd have to split this up and use different age calculations for the two setting.)

Does that make sense?

@tuukkamustonen
Copy link

Did 876d8a5 have any impact on this?

@petere petere closed this as completed in 33f8c61 Jun 24, 2019
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Aug 25, 2019
Changes since 1.9.0

2019-07-01 - PgBouncer 1.10.0 - "Afraid of the World"

    Features
        Add support for enabling and disabling TLS 1.3. (TLS 1.3 was
        already supported, depending on the OpenSSL library, but now the
        configuration settings to pick the TLS protocol versions also
        support it.)
    Fixes
        Fix TLS 1.3 support. This was broken with OpenSSL 1.1.1 and
        1.1.1a (but not before or after).
        Fix a rare crash in SHOW FDS
        (pgbouncer/pgbouncer#311).
        Fix an issue that could lead to prolonged downtime if many cancel
        requests arrive
        (pgbouncer/pgbouncer#329).
        Avoid "unexpected response from login query" after a postgres
        reload
        (pgbouncer/pgbouncer#220).
        Fix idle_transaction_timeout calculation
        (pgbouncer/pgbouncer#125). The
        bug would lead to premature timeouts in specific situations.
    Cleanups
        Make various log and error messages more precise.
        Fix issues found by Coverity (none had a significant impact in
        practice).
        Improve and document all test scripts.
        Add additional SHOW commands to the documentation.
        Convert the documentation from rst to Markdown.
        Python scripts in the source tree are all compatible with Python 3
        now.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants