[#4200] prevent reconnect floods to Postgres #4232

Merged
merged 4 commits into concourse:master on Aug 14, 2019

Conversation

@ebilling ebilling commented Aug 7, 2019

After many profiles, it became clear that at high transaction levels the system was spending substantial amounts of its time reaping and re-establishing TLS connections in order to grab locks or run queries against Postgres.

This change resulted in a 30% reduction in CPU for our web processes. The dominant calls now are around marshaling and unmarshaling JSON.

Signed-off-by: Eric Billingsley eric@calculi.com

…loods on TLS

Signed-off-by: Eric Billingsley <eric@calculi.com>

@vito vito left a comment


thanks! left some feedback/questions inline.

atc/atccmd/command.go (outdated)
@@ -1172,7 +1172,8 @@ func (cmd *RunCommand) constructDBConn(
}

// Prepare
-	dbConn.SetMaxOpenConns(maxConn)
+	dbConn.SetMaxOpenConns(2 * maxIdleConns)
vito (Member)

Not sure if we should bump this outright, since folks may have tuned their Postgres server-side connection limit accordingly. Doubling the effective pool size on all the web nodes could be surprising, especially with multiple web nodes. Could we just do SetMaxIdleConns(maxConn / 2) instead?

Also, FYI we might end up re-doing our connection pooling; recently we've discussed giving critical things like GC their own connection pool (#4025) and that may turn into having per-backend-component pools with sizes based on the component's priority/usage, instead of having one big pool for everything. For if/when we get to that, what's the main learning here so we don't repeat the original mistake? Is it just to have a bit more breathing room with idle connections?

Thanks!

ebilling (Author)

This doesn't double the pool size; it doubles the max size. If the system needs fewer connections, it never gets to the limit. The limit should be on the idle connections, not the open ones.

For #4025, what's the benefit of creating separate pools? Segmentation of connections creates more opportunities for bottlenecking rather than allowing for free flow.

My thought was that the "right" thing to do is to only set the max idle and let the max open go to a very high limit. That allows the system to scale up and flex, but keeps a minimum number of connections open so you don't get stuck in open, use, reap, repeat flows. That's what killed performance on TLS. If you aren't using encryption to the DB, then this isn't a big deal; the setup time is low. But with TLS, you spend many milliseconds going back and forth, checking certificate chains, etc. It's more expensive than the request itself, so you really need to err on the side of re-using that resource.
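For context, a minimal illustration of how the database/sql pool knobs interact with TLS setup cost. This is not Concourse's actual code; the DSN and numbers are placeholders.

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq" // Postgres driver
)

func main() {
	// Whenever the pool has no idle connection, database/sql dials a new one.
	// Over TLS (sslmode=verify-full) that means a full handshake and
	// certificate-chain check per lock/query burst, which is what showed up
	// in the profiles.
	db, err := sql.Open("postgres", "postgres://atc@db/atc?sslmode=verify-full")
	if err != nil {
		log.Fatal(err)
	}

	db.SetMaxOpenConns(32) // hard ceiling on simultaneous connections
	db.SetMaxIdleConns(16) // keep connections warm for reuse; the stdlib default
	// of 2 means most connections are torn down after use and re-handshaked
	// on the next burst.
}
```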

vito (Member)

The idea is to prevent work from taking up the entire shared connection pool and starving other mission-critical components like garbage collection because the pool has hit MaxOpenConns. This can cause GC to not run, which can cause containers to pile up on the workers and then everything starts failing.

Our thinking - and this is somewhat backed by what we've observed in our test environments - is that each internal component has some ballpark number of connections it needs in order to do its job.

GC, for example, only needs around 2 when run in serial, and around 10 when run in parallel (assuming #4152).

The pipeline scheduler, on the other hand, needs a lot more than that depending on how many jobs there are to schedule. It's fine for scheduling work to "steal" connections from other scheduling work - things will just be scheduled more slowly - but it's not OK for scheduling work to "steal" from GC.

Also keep in mind most Postgres servers have a default limit of 100 open connections. It's not enough to just set the max idle pool and let everyone grab as many connections as they want. Someone will be starved eventually, whether it's client-side or server-side. So the connection pool has to be rationed somewhat carefully.

It's a tricky problem, though; it's fundamentally difficult to coordinate these limits on both sides, and there are probably other deeper ways to improve this situation. We just want sane defaults for now so things can't go off the rails. So, our idea was to just figure out what these reasonable connection counts (or maybe just 'ratios') are for each component and tune the defaults accordingly.

Does that make sense? 🙂
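For illustration only, here is a sketch of the per-component-pool idea from #4025. None of these names or sizes are real Concourse code; they only show how GC could get a small pool it can never be starved out of while scheduling queues behind itself.

```go
package dbpools // hypothetical package; not part of Concourse

import (
	"database/sql"

	_ "github.com/lib/pq"
)

// openComponentPools sketches the #4025 idea: give GC a small, guaranteed
// pool and give scheduling a bigger one so it only ever steals from other
// scheduling work. Sizes here are placeholders.
func openComponentPools(dsn string) (gc, scheduler *sql.DB, err error) {
	if gc, err = sql.Open("postgres", dsn); err != nil {
		return nil, nil, err
	}
	gc.SetMaxOpenConns(5) // ~2 serial today, ~10 with parallel GC (#4152)

	if scheduler, err = sql.Open("postgres", dsn); err != nil {
		return nil, nil, err
	}
	scheduler.SetMaxOpenConns(32) // scheduling can steal from itself, not from GC

	return gc, scheduler, nil
}
```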

ebilling (Author)

But since you have multiple webs, the connection limits you impose on any one of them don't prevent you from hitting the 100 total. In any case, you'll need to increase the number of connections on the DB to support a large number of calls. I think we are bumping into an inherent shortcoming of the connection pooling system. The ability to get a connection to the next request and bypass the idle queue is the real underlying problem.

vito (Member)

You're definitely right about that, but it's still better to have some limit rather than no limit, isn't it? Without any limits you have no hope of preventing GC from being starved, since you'll hit the server-side limit at some point just because you received a lot of API requests or had a spike in pipeline scheduling. We have to do what we can to prevent that from turning into a cascading failure.

If I have N ATCs and each has a theoretical max pool usage of 64 connections (as it is today), I need to configure Postgres with a connection limit of at least N * 64. We just set ours to 200 and call it a day. That affords 3 web nodes, which is a pretty decent number for large-scale environments.
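(Concretely: 3 web nodes × 64 = 192 connections, which fits under 200, while a fourth node would need 256.)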

Also bear in mind that these pool limits were added because people kept hitting the server-side limit. We did a very coarse-grained bucketing of the limit by giving the API 32 and the "backend" 32. We just want to be a bit smarter about how that's divided up by default, since 'backend' isn't fine-grained enough to prevent GC from getting wedged by other backend work.

I wonder how other systems enforce this kind of thing. In my ideal world it would all be enforced by the server so we don't have multiple 'sources of truth' and it doesn't matter how many web nodes you have. In a cursory search it looks like you can have per-user connection limits, so maybe you could have a separate user for critical things, but that's a lot to configure. 🤔

ebilling (Author)

That's exactly why I took the position of maxOpen = 2 x maxIdle. The thing we really want to configure is maxIdle, as GC is a bursty problem. It comes in with a lot of requests all at once, and if there aren't enough connections sitting in the idle queue, then it just starts making new ones.

Here is my proposal:

  1. Create one connection pool with a MaxIdleConnections queue size that is configurable, defaulted to 20. We then set MaxOpenConnections to be a separately configurable parameter that defaults to 2x MaxIdleConnections (alternatively, we could set MaxIdleConnections to be 1/2 MaxOpenConnections), but the behavior we really want is to allow bursting and backoff, IMHO.

  2. Locking should rely on the mutex, not the connection, so it should use the same connection pool. After all, as soon as you have 2 webs, the connection is not your savior.

This gives us the flexibility of configuration for specific environments and puts reasonable defaults in place (a rough sketch follows below). Sound like a plan?
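A rough sketch of point 1, with hypothetical names and defaults; these are not the actual Concourse flags, only an illustration of "configurable idle floor, open ceiling defaulting to twice the floor":

```go
package atccmd // illustrative placement; not the real flag definitions

import "database/sql"

// poolConfig is a placeholder for the proposed knobs above.
type poolConfig struct {
	MaxIdleConnections int // proposed default: 20
	MaxOpenConnections int // proposed default: 2 * MaxIdleConnections
}

func (c poolConfig) apply(db *sql.DB) {
	open := c.MaxOpenConnections
	if open == 0 {
		open = 2 * c.MaxIdleConnections // allow bursting above the idle floor
	}
	db.SetMaxIdleConns(c.MaxIdleConnections) // warm connections absorb bursts
	db.SetMaxOpenConns(open)                 // hard ceiling for backoff
}
```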

vito (Member)

Minor clarification: GC isn't bursty yet; right now it uses at most one or two connections. It'll be more bursty with #4152 (~10 connections), but not nearly as bursty as pipeline scheduling (N connections; at least 1 per pipeline). This difference in expected connection usage is why we want to separate the scheduling pool from the GC pool and give them reasonable sizes so they're not competing for resources.

> 1. Create one connection pool with a MaxIdleConnections queue size that is configurable, defaulted to 20. We then set MaxOpenConnections to be a separately configurable parameter that defaults to 2x MaxIdleConnections (alternatively, we could set MaxIdleConnections to be 1/2 MaxOpenConnections), but the behavior we really want is to allow bursting and backoff, IMHO.

Makes sense. We're just trying to fix the churning connections, so setting MaxIdle seems like the simplest thing to do.

FWIW, I don't think we should change the effective default max open connections. I don't want to surprise any Concourse operators and cause server-side limits to suddenly be hit (due to an increase) or cause components to be starved of connections (due to a decrease) where they weren't before. We're only trying to fix the connection churn, so I don't think changing anything else is worth the risk, at least not right now. We can re-assess these magic numbers when we do per-component connection pools (#4025). I'll make a note in that issue to be sure to configure MaxIdle appropriately so this doesn't regress.

> 2. Locking should rely on the mutex, not the connection, so it should use the same connection pool. After all, as soon as you have 2 webs, the connection is not your savior.

As clarified in #4232 (comment), locking does use the connection, and I think that'll need to stay how it is.

With all that said, this sounds like a super simple change (just SetMaxIdleConns(maxConn / 2)), with a lot of thought/words behind it. 😅
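For reference, a minimal sketch of what that amounts to, inferred from the hunk above and the later commit message ("set max idle connections as 1/2 max open"); the real constructDBConn code may differ in detail:

```go
package atccmd // placement is illustrative, not the actual file layout

import "database/sql"

// tunePool keeps the max-open ceiling where operators expect it and lets half
// of those connections sit idle, so bursts reuse existing TLS sessions instead
// of re-dialing and re-handshaking.
func tunePool(dbConn *sql.DB, maxConn int) {
	dbConn.SetMaxOpenConns(maxConn)
	dbConn.SetMaxIdleConns(maxConn / 2)
}
```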

ebilling (Author)

Ok. Thanks for all of the help on this. There are already three connection pools (lockConn, apiConn and backendConn). If you are going to add another for gcConn and potentially others, then making this configurable could get a bit unwieldy. My personal vote would be to allow configuration of the total pool size and use the pool for all use cases. Would be nice if there was a prioritized pool, but...

How would you like me to proceed? I have this working for our needs right now, and we are happy with the changes so far. Would you like me to make any specific changes, like make the default MaxOpenConnections a configuration parameter?

vito (Member)

I think what you've just pushed looks good. 🙂

The simplest possible thing for #4025 is probably to just have a flag for each component and spend a lot of time making sure we get the defaults right so most operators don't think about it (...until they need to, and then they have all the knobs they want).

Some components only need a certain number of connections no matter what the scale is, so those are pretty easy. Others, like the API and pipeline scheduling, are based more on usage patterns, so it's hard to have a reasonable default. But hey, hardcoding 32 got us this far!

- roll back to previous lock behavior
- make max open connections a flag
- set max idle connections as 1/2 max open

Signed-off-by: Eric Billingsley <eric@calculi.com>
vito previously approved these changes Aug 8, 2019
@@ -98,6 +98,8 @@ type RunCommand struct {

Postgres flag.PostgresConfig `group:"PostgreSQL Configuration" namespace:"postgres"`

+	MaxOpenConnections int `long:"max-conns" description:"The maximum number of open connections for a conneciton pool." default:"32"`

vito (Member)

typo

ebilling (Author)

Thanks for catching. Fixed.

Signed-off-by: Eric Billingsley <eric@calculi.com>
@@ -1187,7 +1190,6 @@ func (cmd *RunCommand) constructLockConn(driverName string) (*sql.DB, error) {
return nil, err
}

-	dbConn.SetMaxOpenConns(1)
vito (Member)

I think this PR is good to go once this line is restored. 👍
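(Context: the lock pool is capped at a single connection because these locks are session-scoped Postgres advisory locks, so they have to stay on the connection that acquired them.)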

ebilling (Author)

done

Signed-off-by: Eric Billingsley <eric@calculi.com>
@ebilling ebilling requested a review from a team August 14, 2019 15:49

@vito vito left a comment


thanks a bunch!

@vito vito merged commit 63b59f0 into concourse:master Aug 14, 2019
jamieklassen pushed a commit to concourse/concourse-bosh-release that referenced this pull request Aug 21, 2019
* influxdb batch size/duration and metrics buffer size concourse/concourse#3937
* max db connection pool size concourse/concourse#4232

Signed-off-by: Jamie Klassen <cklassen@pivotal.io>
@jamieklassen jamieklassen added this to the v5.5.0 milestone Aug 26, 2019
jamieklassen pushed a commit to concourse/docs that referenced this pull request Aug 26, 2019
concourse/concourse#4232

Signed-off-by: Jamie Klassen <cklassen@pivotal.io>
jamieklassen pushed a commit that referenced this pull request Aug 26, 2019
#4232

Signed-off-by: James Thomson <jthomson@pivotal.io>
Co-authored-by: Jamie Klassen <cklassen@pivotal.io>
@jamieklassen jamieklassen added the release/documented label (Documentation and release notes have been updated.) Aug 29, 2019
matthewpereira pushed a commit that referenced this pull request Sep 5, 2019
#4232

Signed-off-by: James Thomson <jthomson@pivotal.io>
Co-authored-by: Jamie Klassen <cklassen@pivotal.io>
k8s-ci-robot pushed a commit to helm/charts that referenced this pull request Sep 6, 2019
This commit adds the new parameters that were added to Concourse 5.5.

Here's a breakdown of the new parameters:

- max-active-tasks-per-worker

  > used by the `limit-active-tasks` container placement strategy
  > concourse/concourse#4118

- support for influxdb batching and bigger buffer size for metrics emissions

  > concourse/concourse#3937

- limiting number of max connections in db conn pools

  > concourse/concourse#4232

Signed-off-by: Ciro S. Costa <cscosta@pivotal.io>
Co-authored-by: Zoe Tian <ztian@pivotal.io>