Do not wait server_login_retry for next connect if cancellation succeeds #329

Merged
1 commit merged into pgbouncer:master from marcocitus:cancellations_fix on Nov 9, 2018


marcocitus commented Sep 25, 2018

If postgres restarts while N cancellations are queued, pgbouncer is currently unavailable for at least N*server_login_retry. It uses each new connection for one queued cancellation and then waits server_login_retry before opening the next connection, because the last_connect_failed flag is still set to 1. This can lead to prolonged downtime.

This change fixes the issue by introducing a last_login_failed flag. The last_connect_failed flag is now reset when a cancellation succeeds, so launch_new_connection no longer waits when pgbouncer manages to connect but still has queued cancellations. The last_login_failed flag has the same semantics that the last_connect_failed flag had previously, so check_fast_fail still rejects connections when no servers are available and the last login failed.

Fixes #328
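
To make the arithmetic concrete, here is a toy Python model of the retry behaviour described above. It is not pgbouncer's C code: the `Pool` class, the `seconds_until_first_login` function, and the `reset_on_cancel` switch are hypothetical names invented for this sketch, and it assumes that connecting takes no time and that postgres is already back up when the first retry fires.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """Hypothetical stand-in for the per-pool state discussed in this PR."""
    queued_cancels: int
    last_connect_failed: bool = True   # set by the failed connect during the restart
    last_login_failed: bool = True     # the flag introduced by this PR


def seconds_until_first_login(pool: Pool, server_login_retry: float,
                              reset_on_cancel: bool) -> float:
    """Simulated time until a new server connection is used for a login.

    reset_on_cancel=False models the old behaviour, True the patched one.
    """
    now = 0.0
    while True:
        # Models launch_new_connection: back off while the flag is set.
        if pool.last_connect_failed:
            now += server_login_retry
        # The connection succeeds; queued cancellations are served first.
        if pool.queued_cancels > 0:
            pool.queued_cancels -= 1
            if reset_on_cancel:
                # Patched behaviour: a successful cancellation clears the flag.
                pool.last_connect_failed = False
            continue
        # No cancellations left: the connection logs in and clears both flags.
        pool.last_connect_failed = False
        pool.last_login_failed = False
        return now


old = seconds_until_first_login(Pool(queued_cancels=5), 15.0, reset_on_cancel=False)
new = seconds_until_first_login(Pool(queued_cancels=5), 15.0, reset_on_cancel=True)
print(old, new)
```

With 5 queued cancellations and a 15-second server_login_retry, the old behaviour pays the retry delay once per cancellation plus once more for the login, while the patched behaviour pays it only once. In the model, last_login_failed stays set until a real login succeeds, which mirrors why check_fast_fail can keep rejecting clients in the meantime.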

@DimCitus DimCitus force-pushed the marcocitus:cancellations_fix branch from d757975 to 28de1c6 Nov 8, 2018
@PJMODOS PJMODOS merged commit d63a264 into pgbouncer:master Nov 9, 2018
1 check passed: continuous-integration/travis-ci/pr (The Travis CI build passed)
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Aug 25, 2019
Changes since 1.9.0

2019-07-01 - PgBouncer 1.10.0 - "Afraid of the World"

    Features
        Add support for enabling and disabling TLS 1.3. (TLS 1.3 was
        already supported, depending on the OpenSSL library, but now the
        configuration settings to pick the TLS protocol versions also
        support it.)
    Fixes
        Fix TLS 1.3 support. This was broken with OpenSSL 1.1.1 and
        1.1.1a (but not before or after).
        Fix a rare crash in SHOW FDS
        (pgbouncer/pgbouncer#311).
        Fix an issue that could lead to prolonged downtime if many cancel
        requests arrive
        (pgbouncer/pgbouncer#329).
        Avoid "unexpected response from login query" after a postgres
        reload
        (pgbouncer/pgbouncer#220).
        Fix idle_transaction_timeout calculation
        (pgbouncer/pgbouncer#125). The
        bug would lead to premature timeouts in specific situations.
    Cleanups
        Make various log and error messages more precise.
        Fix issues found by Coverity (none had a significant impact in
        practice).
        Improve and document all test scripts.
        Add additional SHOW commands to the documentation.
        Convert the documentation from rst to Markdown.
        Python scripts in the source tree are all compatible with Python 3
        now.
@marcocitus marcocitus deleted the marcocitus:cancellations_fix branch Oct 19, 2019