
Dead connection between pgbouncer and the server #138

Closed
pracucci opened this issue Jul 7, 2016 · 12 comments

Comments

@pracucci

pracucci commented Jul 7, 2016

We have pgbouncer configured with two databases: master and slave. Today we had a network issue between the pgbouncer instance and the slave server instance (all network packets were dropped), and as a result all slave connections got stuck.

We do have TCP keepalive configured, but since tcp_retries2 is left at its default (15), it takes a very long time before the stuck connections are dropped.

We're looking for a solution that doesn't involve setting query_timeout (we have some very long-running queries) or lowering tcp_retries2 system-wide (that would affect all applications running on the same server).

I'm wondering if patching pgbouncer to add support for TCP_USER_TIMEOUT could be a solution. Do you have any suggestions about it?

Thank you,
Marco

@PJMODOS
Contributor

PJMODOS commented Sep 20, 2016

Have you tried setting the keepalives_count connection parameter or tuning the connection keepalive behaviour in general? (see https://www.postgresql.org/docs/9.5/static/libpq-connect.html#LIBPQ-PARAMKEYWORDS )
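Purely as an illustration of those libpq parameters (the host and database names here are made up), a client connecting directly through libpq could use a connection string like:

host=slave.example.com dbname=app keepalives=1 keepalives_idle=30 keepalives_interval=5 keepalives_count=3

Note that this only affects clients using libpq; pgbouncer's own server connections are tuned separately (see my next comment).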

@PJMODOS
Contributor

PJMODOS commented Sep 20, 2016

Also, pgbouncer supports its own keepalive configuration in the ini file:

;; whether tcp keepalive should be turned on (0/1)
;tcp_keepalive = 1

;; following options are Linux-specific.
;; they also require tcp_keepalive=1

;; count of keepalive packets
;tcp_keepcnt = 0

;; how long the connection can be idle,
;; before sending keepalive packets
;tcp_keepidle = 0

;; The time between individual keepalive probes.
;tcp_keepintvl = 0

@pracucci
Author

Yes, we've tried setting both system-wide and pgbouncer-specific TCP keepalive settings, with no luck. Do you have any suggested setup? I'm just wondering if we made a mistake somewhere.

Thank you,
Marco

@markokr
Contributor

markokr commented Dec 27, 2016

I know that the Linux kernel rejects timeouts that are too short. But the successful setup I've run is:

; 4m idle + 1m check
tcp_keepidle = 240
tcp_keepcnt = 4
tcp_keepintvl = 15
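With those values a dead peer is detected after roughly tcp_keepidle + tcp_keepcnt × tcp_keepintvl = 240 + 4 × 15 = 300 seconds, i.e. about 5 minutes of silence.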

You can try lowering them, but not below the one-minute range. After the first failures, pgbouncer's fast-fail should kick in and later clients should get faster rejections.

@markokr markokr closed this as completed Dec 27, 2016
@sdemontfort

@pracucci did you manage to solve this issue? My team is hitting the same problem. We can get around it by changing Linux’s tcp_retries2 to a lower number but this isn’t ideal.
Is there a way to set TCP_USER_TIMEOUT ourselves?

@pracucci
Author

@sdemontfort Tuning the Linux TCP stack via sysctl on the pgbouncer host still looks like the best option to me. Our settings are:

  • net.ipv4.tcp_keepalive_time: 30
  • net.ipv4.tcp_keepalive_intvl: 5
  • net.ipv4.tcp_keepalive_probes: 6
  • net.ipv4.tcp_retries2: 3
  • net.ipv4.tcp_tw_reuse: 1
  • net.ipv4.tcp_slow_start_after_idle: 0
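If it helps anyone else, one way to apply these is a sysctl drop-in file (the file name is just an example) reloaded with sysctl --system:

# /etc/sysctl.d/90-pgbouncer-tcp.conf (example name)
net.ipv4.tcp_keepalive_time = 30
net.ipv4.tcp_keepalive_intvl = 5
net.ipv4.tcp_keepalive_probes = 6
net.ipv4.tcp_retries2 = 3
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_slow_start_after_idle = 0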

@sdemontfort

sdemontfort commented Sep 29, 2019

Thanks for the response @pracucci 😃
Our particular issue with the dead TCP connections is that we use AWS RDS, and when it does an AZ failover, TCP connections get stuck in the ESTABLISHED state. Setting net.ipv4.tcp_retries2 to a low number fixes this, but applying these OS-level settings in production isn't ideal for us.

I have read up on a socket-level TCP_USER_TIMEOUT option that can be set when opening a connection.
I've made a change to the PgBouncer source and am testing it now with our service. The idea is to add an ini option, tcp_user_timeout=X, which sets this at the socket level rather than the OS level.
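Roughly, the socket-level part boils down to a single setsockopt call, something like the sketch below (not the exact code from my branch; fd here is an already-created TCP socket, and TCP_USER_TIMEOUT needs a reasonably recent Linux kernel and headers):

/* Sketch: apply TCP_USER_TIMEOUT (in milliseconds) to a TCP socket. */
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

static int set_tcp_user_timeout(int fd, unsigned int timeout_ms)
{
        /* timeout_ms > 0: transmitted data may remain unacknowledged at most
         * this long before the kernel closes the connection with ETIMEDOUT;
         * 0 keeps the system default behaviour. */
        return setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                          &timeout_ms, sizeof(timeout_ms));
}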

@petere, is this a change that you'd be willing to include in source?

@pracucci
Author

@sdemontfort If I'm not missing anything, TCP_USER_TIMEOUT covers only the case where the connection drops while the client is transmitting data. Say you sent a query that was successfully received by Postgres, you're waiting for the query results, and the connection drops: is that case covered by TCP_USER_TIMEOUT?

@sdemontfort

sdemontfort commented Sep 29, 2019

I don't think it's limited to the case where the connection drops while the client is transmitting; I think it sets the maximum time that transmitted data can go unacknowledged before the connection is forcibly closed:

TCP_USER_TIMEOUT is a TCP level socket option that takes an unsigned int,
when > 0, to specify the maximum amount of time in ms that transmitted
data may remain unacknowledged before TCP will forcefully close the
corresponding connection and return ETIMEDOUT to the application. If
0 is given, TCP will continue to use the system default.

https://patchwork.ozlabs.org/patch/62889/

I believe the case would be more like:

  • You have connected to Postgres successfully (TCP connection in ESTABLISHED state)
  • Postgres fails over to another IP address or dies, but for some reason never sends a TCP packet telling the other end the connection can close, so it stays in the ESTABLISHED state
  • After the period set in TCP_USER_TIMEOUT is reached, the TCP connection is closed

However, happy to be proven wrong if I'm missing some understanding 😄

@sdemontfort

sdemontfort commented Oct 9, 2019

@pracucci you may be happy to know that I managed to solve the problem without changing Linux OS-level settings.

I've forked this repo and introduced the socket-level option TCP_USER_TIMEOUT, which works in combination with PgBouncer's keepalive settings (I know, my implementation is hacky at the moment): https://github.com/sdemontfort/pgbouncer/blob/7336a0dc26643aed4026b8ca6e4738e6501ded7a/src/util.c#L154-L161

Here's my relevant PgBouncer config using the forked repo:

# Serves a similar purpose to tcp_retries2, but at the socket level and expressed in ms.
tcp_user_timeout = 12500

# The check for user timeout is only done after the first keepalive probe is sent.
tcp_keepalive = 1
tcp_keepidle = 1
tcp_keepintvl = 11
tcp_keepcnt = 3

The numbers come from this amazingly helpful blog post: https://blog.cloudflare.com/when-tcp-sockets-refuse-to-die/
which says:

... We mentioned the TCP_USER_TIMEOUT option before. It sets the maximum amount of time that transmitted data may remain unacknowledged before the kernel forcefully closes the connection. On its own, it doesn't do much in the case of idle connections. The sockets will remain ESTABLISHED even if the connectivity is dropped. However, this socket option does change the semantics of TCP keepalives. The tcp(7) manpage is somewhat confusing:
...
For the user timeout to have any effect, the icsk_probes_out must not be zero. The check for user timeout is done only after the first probe went out. Let's check it out. Our connection settings:

TCP_USER_TIMEOUT = 5*1000 - 5 seconds
SO_KEEPALIVE = 1 - enable keepalives
TCP_KEEPIDLE = 1 - send first probe quickly - 1 second idle
TCP_KEEPINTVL = 11 - subsequent probes every 11 seconds
TCP_KEEPCNT = 3 - send three probes before timing out
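Back-of-envelope for my settings above, as I understand the post: the first probe goes out after 1 second of idle, later probes every 11 seconds, and the kernel checks the user timeout before sending each probe, so with tcp_user_timeout = 12500 a dead peer should be torn down within a couple of probe intervals (on the order of 20-25 seconds) instead of the roughly 15 minutes the default tcp_retries2 can take.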

I'll likely create a PR in the coming days to propose this change to the PgBouncer source, as I think it's important.
My team doesn't have the option of changing the OS-level settings at this time, and I think a per-socket approach is generally better anyhow.

@pracucci
Author

pracucci commented Oct 9, 2019

Excellent news @sdemontfort. It would be great to have the PR merged into pgbouncer 🤞

@sdemontfort

sdemontfort commented Oct 11, 2019

@pracucci here's my PR; please comment/change as you see fit. I'm active on this at the moment, as our team plans to use it in production soon. I've also included steps to test it.
#428
