@davidfischer
Member

These are minor tweaks to the production webserver parameters. Since nobody else touches this, I'll probably launch it in a few days even if nobody reviews it.

Background

Since the update to Django 5.2 a few weeks ago (#191), the application has been crashing roughly every other day and sending me an email like the one below. We didn't change the application drastically, but I'm guessing memory usage is a bit higher in Django 5.2 than in 4.2. The free tier machines we're deployed on have 256MB of RAM, and while we only use ~200MB with 2 gunicorn workers, we are still getting killed sometimes. I did browse the Grafana graphs that Fly makes available (see below), but I didn't actually see a spike in memory usage around the crash time.

Also, to deploy a database migration we need to connect to a running machine, and if the machine is already dying with 2 gunicorn workers, we don't have the spare memory to connect and run a migration when the need arises. We need more spare memory. The red dashed lines in the graph below mark me SSHing into an instance and the instance getting OOM killed.

[Image: Grafana graph]

This change

So my main plan is to switch to a single worker instead of 2. I tested this for a little while and so far it's looking good. I'm going to let it sit a few days and see if we get a crash. Fly's free tier lets you run 3 machines, so I'm going to run 2 of these small/cheap machines instead of the 1 we were previously running, with each machine running 1 worker instead of 2.
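A rough back-of-the-envelope for the headroom this buys, using the 256MB and ~200MB figures from the background. The even per-worker split is an assumption made for illustration (it ignores the Gunicorn master process and the OS), and `headroom_mb` is a hypothetical helper, not something in the repo.

```python
# Figures taken from the PR description.
TOTAL_RAM_MB = 256
USED_WITH_TWO_WORKERS_MB = 200


def headroom_mb(workers: int, per_worker_mb: float) -> float:
    """Estimated free memory on one machine for a given worker count."""
    return TOTAL_RAM_MB - workers * per_worker_mb


# Assumption: memory divides evenly between the two workers.
per_worker_mb = USED_WITH_TWO_WORKERS_MB / 2

print(headroom_mb(2, per_worker_mb))  # 56.0
print(headroom_mb(1, per_worker_mb))  # 156.0
```

Under that assumption, a single worker leaves roughly 100MB more spare memory per machine, which is the room needed to SSH in and run a migration.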

I made another small change to Gunicorn's worker settings based on a recommendation from ChatGPT (don't tell Hobson). This app doesn't get a lot of non-bot traffic, so getting to 10k requests may take a long time. It's probably best to restart these workers more frequently in terms of wall-clock time. In general, this prevents a memory leak from building up and causing a crash. However, I don't actually believe this is what caused the crashes I'm seeing: the Grafana graph doesn't show memory slowly climbing until processes get OOM killed, which is what I'd expect if memory leaks were the problem.
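For readers unfamiliar with these knobs, here is a minimal sketch of what such settings could look like in a Gunicorn config file. It assumes the setting in question is Gunicorn's `max_requests` (with `max_requests_jitter`), which recycles a worker after it has served that many requests. The file name and all numeric values are illustrative assumptions, not the values merged in this PR.

```python
# gunicorn.conf.py (hypothetical file name; values are illustrative only)

# One worker per machine, as described in this PR.
workers = 1

# Restart a worker after it has handled this many requests. On a
# low-traffic app, a lower threshold means workers are recycled more
# often in wall-clock terms. 1000 is an assumed value.
max_requests = 1000

# Add a random 0..N offset to max_requests per worker so that multiple
# workers don't all restart at the same moment. Assumed value.
max_requests_jitter = 100
```

The same settings can also be passed on the command line as `--workers`, `--max-requests`, and `--max-requests-jitter`.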

@davidfischer davidfischer merged commit 6a82601 into main Nov 26, 2025
1 check passed
@davidfischer davidfischer deleted the davidfischer/gunicorn-launch-params branch November 26, 2025 23:27
@davidfischer
Member Author

I saw no crashes in 2 days so I merged it!
