Redis connection limit on Heroku (includes benchmarks) #2493

Closed
mariusandra opened this issue Nov 24, 2020 · 3 comments

@mariusandra
Collaborator

mariusandra commented Nov 24, 2020

The free plan of Heroku's Redis add-on has a hard limit of 20 connections. PostHog with all its services currently exceeds that. This shouldn't happen. Let's dig in.

I started various services (with pm2) and measured the number of connections reported by the Redis client.

Gunicorn/Django under light load:

  • 1 worker - 3 connections
  • 2 workers - 6 connections
  • 3 workers - 9 connections

Celery:

  • 1 worker - 8 connections (went up to 11 with load)
  • 2 workers - 16 connections

Celery beat

  • 1 worker - 3 connections (1 active beat)
  • 2 workers - 4 connections (1 active + 1 inactive beat)
  • 3 workers - 5 connections (1 active + 2 inactive beats)

Plugin Server

  • 1 worker - 4 connections
  • 2 workers - 8 connections
  • 3 workers - 12 connections

I haven't yet determined what exactly all those connections are.

A Heroku setup with 2 celery workers, 1 celery beat and 2 gunicorn workers consumes a total of 16 + 3 + 6 = 25 connections. This is the default setup on all review apps.
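The arithmetic above can be reproduced with a small sketch. The per-process figures are the light-load measurements from this issue; the function and constant names are made up for illustration and are not part of PostHog.

```python
# Connections per process, taken from the light-load measurements in this issue.
CONNECTIONS_PER_PROCESS = {
    "gunicorn": 3,
    "celery": 8,         # went up to 11 under load
    "plugin_server": 4,
}

HEROKU_FREE_LIMIT = 20   # hard limit on the free Redis plan


def celery_beat_connections(beats: int) -> int:
    """3 for the active beat plus 1 per inactive beat (3, 4, 5 were measured)."""
    return 0 if beats == 0 else 3 + (beats - 1)


def estimate_connections(gunicorn: int = 0, celery: int = 0,
                         beat: int = 0, plugin_server: int = 0) -> int:
    """Rough light-load estimate; real usage was a few connections higher."""
    return (
        gunicorn * CONNECTIONS_PER_PROCESS["gunicorn"]
        + celery * CONNECTIONS_PER_PROCESS["celery"]
        + plugin_server * CONNECTIONS_PER_PROCESS["plugin_server"]
        + celery_beat_connections(beat)
    )


default_setup = estimate_connections(gunicorn=2, celery=2, beat=1)
print(default_setup, default_setup > HEROKU_FREE_LIMIT)  # 25 True
```

This is only an estimate from the table; it ignores the extra connections that show up under load.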

Mimicking this environment locally under light load, I got 27 connections. Under heavier load it went up to 31 at times.

(Screencast: 2020-11-24 15 00 26)

(Subtract 1 from the number in the screencast, since the redis-cli connection itself is also counted in the list.)

About 9 of these connections are long running (300+ seconds idle). Heroku's Redis sets the idle timeout to 300 seconds. I assume the only reason review apps work on Heroku at all is that these long-running connections get killed and we just squeeze in under the 20-connection limit.
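For anyone who wants to repeat the count, here is a minimal sketch that summarizes the text output of the Redis `CLIENT LIST` command: it subtracts the inspecting connection and counts the long-idle ones. The helper names and the sample lines are hypothetical (the sample is abridged and made up, not taken from the measurements in this issue).

```python
from typing import Dict, List


def parse_client_list(raw: str) -> List[Dict[str, str]]:
    """Turn CLIENT LIST output (one client per line, key=value pairs) into dicts."""
    clients = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = {}
        for pair in line.split(" "):
            key, sep, value = pair.partition("=")
            if sep:
                fields[key] = value
        clients.append(fields)
    return clients


def summarize(raw: str, idle_threshold: int = 300) -> Dict[str, int]:
    """Count connections, excluding the one that issued CLIENT LIST."""
    clients = parse_client_list(raw)
    long_idle = [c for c in clients if int(c.get("idle", "0")) >= idle_threshold]
    return {
        # The inspecting connection is part of the list, so subtract it.
        "total": max(len(clients) - 1, 0),
        "long_idle": len(long_idle),
    }


# Abridged, made-up sample in the CLIENT LIST format.
SAMPLE = """\
id=7 addr=127.0.0.1:50001 name= age=900 idle=450 cmd=get
id=8 addr=127.0.0.1:50002 name= age=900 idle=2 cmd=brpop
id=9 addr=127.0.0.1:50003 name= age=1 idle=0 cmd=client
"""

print(summarize(SAMPLE))  # {'total': 2, 'long_idle': 1}
```

The raw text can come from `redis-cli client list`. With redis-py, `client_list()` already returns a list of dicts, so only the counting step would be needed.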

Adding 2 plugin workers adds 8 active Redis connections, taking the total up to 37 at times.

(Screencast: 2020-11-24 15 15 19)

Examining the 11 long idle (400+ sec) connections:

  • 1 for each gunicorn worker (2 total)
  • 3 for each celery worker (6 total)
  • 1 for each plugin server (2 total)
  • 1 for the celery beat

I'm not yet sure what those are or whether they can be removed.

Next steps

In order to have plugins working on Heroku with the free plan, we have two options:

  • Figure out why Celery uses 8-11 connections per worker and try to take this down by about 4.
  • Split the concurrency on the default worker dyno. With WEB_CONCURRENCY=2, we would start just one celery and one plugin server. Users can scale up/down as needed with extra pluginworker and celeryworker dynos.

I'm going to push out a PR for the second option, since that should fix the problem immediately. We are starting 2x the WEB_CONCURRENCY amount of processes anyway, so this makes some sense as well.

Will this have real world implications and should this be merged? Not sure. Probably.

mariusandra added the bug label Nov 24, 2020
@mariusandra
Collaborator Author

mariusandra commented Nov 24, 2020

We still need a bit more than 20 connections with the patch in PR #2494 , so I need to see what else we can do.

Celery does seem to use 8 connections pretty consistently for other people (changing settings doesn't work), so it might be hard to get around this.

One solution is to switch to another queue such as rq or huey, but that's a lot of work, and neither works with Node.js. This will, but is not very popular. We could also use Celery only for Python -> Node.js communication (with just the Node.js worker) and use the other queue internally in Python.

Alternatively, we can drop the "must work with the free redis addon" requirement for heroku, but this means review apps just won't work.

To be continued.

(I also don't seem to be the only one who has problems with too many Redis connections from Celery.)

@mariusandra
Collaborator Author

Breaking celery down into services, this is how many redis connections each part requires:

  • Hub: 0
  • Pool: 0
  • Beat: 3 (beat worker for scheduling)
  • Consumer: 0
  • Connection: 3
  • Events: 1
  • Mingle: 0
  • Gossip: 2
  • Heart: 1 (per-worker heartbeat)
  • Tasks: 0
  • Control: 2

The low-hanging fruit is to remove gossip and possibly also mingle and heart.
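As a rough sketch of what that would look like, the snippet below builds Celery worker arguments with the `--without-gossip`, `--without-mingle` and `--without-heartbeat` options and sums the connections listed above for the disabled components. The helper names are hypothetical, and whether the real connection count drops by the same amount has not been verified here.

```python
from typing import Iterable, List

# Redis connections per Celery component, as listed in this comment.
COMPONENT_CONNECTIONS = {
    "Hub": 0, "Pool": 0, "Beat": 3, "Consumer": 0, "Connection": 3,
    "Events": 1, "Mingle": 0, "Gossip": 2, "Heart": 1, "Tasks": 0,
    "Control": 2,
}

# Celery worker command-line options that switch components off.
DISABLE_FLAGS = {
    "Gossip": "--without-gossip",
    "Mingle": "--without-mingle",
    "Heart": "--without-heartbeat",
}


def worker_argv(disabled: Iterable[str]) -> List[str]:
    """Arguments for `celery worker` with the given components disabled."""
    return ["worker"] + [DISABLE_FLAGS[name] for name in disabled]


def connections_saved(disabled: Iterable[str]) -> int:
    """Expected saving, going purely by the per-component table above."""
    return sum(COMPONENT_CONNECTIONS[name] for name in disabled)


to_disable = ["Gossip", "Mingle", "Heart"]
print(worker_argv(to_disable))
print(connections_saved(to_disable))  # 3
```

Going by the table, mingle contributes nothing, so the expected saving comes from gossip (2) and heart (1).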

To be continued.

Twixes added the tech-debt label and removed the bug label Nov 27, 2020
@Twixes
Collaborator

Twixes commented Apr 29, 2021

Sadly, there's not much more we can do here.

Twixes closed this as completed Apr 29, 2021