Run gunicorn behind nginx for better buffering and logging

Heroku's default setup for Django uses the gunicorn application server. Each
Heroku dyno can only run a limited number of gunicorn workers, which means a
limited number of requests can be served in parallel (around 4 per dyno is a
good rule of thumb).

Where things get nasty is when you have devices on slow connections - like
mobile phones. Heroku's router buffers headers but it does not buffer response
bodies, so a slow device could hold up a gunicorn worker for several seconds.
Too many slow devices at once and the site will become unavailable to other
users.
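
A rough back-of-the-envelope sketch of why this matters. It assumes synchronous workers that each handle one request at a time. The 0.1 second and 3 second figures are illustrative assumptions, not measurements from this site:

```python
def max_requests_per_second(workers: int, seconds_per_request: float) -> float:
    """Upper bound on throughput when each worker serves one request at a time."""
    return workers / seconds_per_request


# Assumed figures: 4 workers per dyno (the rule of thumb above), 0.1s to serve
# a fast client, 3s for a worker stuck feeding a response to a slow client.
fast = max_requests_per_second(workers=4, seconds_per_request=0.1)
slow = max_requests_per_second(workers=4, seconds_per_request=3.0)

print(f"fast clients: {fast:.1f} req/s, slow clients: {slow:.1f} req/s")
```

With nginx buffering the response, the worker is freed as soon as it has handed the response over, so the per-request time stays close to the fast case.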

This issue is explained and discussed here:

    http://blog.etianen.com/blog/2014/01/19/gunicorn-heroku-django/

That article recommends replacing gunicorn with waitress, but in the comments
at the bottom of the article people suggest keeping gunicorn and putting it
behind nginx using the Heroku nginx-buildpack instead.

I'm actually using a fork of the Heroku buildpack which installs a more recent
version of nginx.

Here is a slightly out-of-date tutorial on getting this all set up:

    https://koed00.github.io/Heroku_setups/

I used the following commands to set up the buildpacks:

    heroku stack:set cedar-14
    heroku buildpacks:clear
    heroku buildpacks:add https://github.com/beanieboi/nginx-buildpack.git
    heroku buildpacks:add https://github.com/heroku/heroku-buildpack-python.git

Unfortunately the nginx buildpack is not yet compatible with the new heroku-16
stack, so until the nginx buildpack has been updated it's necessary to run the
application on the older cedar-14 stack. See this discussion for details:

    ryandotsmith/nginx-buildpack#68

Adding nginx in this way also gives us the opportunity to fix another
limitation of Heroku: the default logging. By default, log lines look like
this:

    Oct 01 18:01:06 simonwillisonblog heroku/router: at=info
        method=GET path="/2017/Oct/1/ship/" host=simonwillison.net
        request_id=bb22f67e-6924-4e81-b6ad-74d1f465cda7
        fwd="2001:8003:74c5:8b00:79e4:80ed:fa85:7b37,108.162.249.198"
        dyno=web.1 connect=0ms service=338ms status=200 bytes=4523 protocol=http

Notably missing here are both the user-agent string and the referrer header
sent by the browser! If you like tailing log files these omissions are pretty
disappointing.

The nginx buildpack I'm using loads a default configuration file at
config/nginx.conf.erb. By including my own copy of this file I can override
the original and define my own custom log format.

The new log lines look like this:

    2017-10-02T01:44:38.762845+00:00 app[web.1]:
        measure#nginx.service=0.133 request="GET / HTTP/1.1" status_code=200
        request_id=8b6402de-d072-42c4-9854-0f71697b30e5 remote_addr="10.16.227.159"
        forwarded_for="199.188.193.220" forwarded_proto="http" via="1.1 vegur"
        body_bytes_sent=12666 referer="-" user_agent="Mozilla/5.0 (Macintosh;
        Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko)
        Chrome/61.0.3163.100 Safari/537.36"
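
One benefit of the key=value format is that the lines are easy to pick apart. Here is a minimal sketch of a parser for them. The `parse_log_line` name and the regular expression are my own illustration, not part of the buildpack, and it assumes values are either bare tokens or wrapped in double quotes with no embedded double quotes:

```python
import re

# A key, then "=", then either a double-quoted value or a bare token.
PAIR = re.compile(r'([\w#.]+)=(?:"([^"]*)"|(\S*))')


def parse_log_line(line: str) -> dict:
    """Return the key=value pairs found in one log line as a dict of strings."""
    fields = {}
    for key, quoted, bare in PAIR.findall(line):
        # Exactly one of quoted/bare can be non-empty for any given match.
        fields[key] = quoted if quoted else bare
    return fields


sample = (
    '2017-10-02T01:44:38.762845+00:00 app[web.1]: '
    'measure#nginx.service=0.133 request="GET / HTTP/1.1" status_code=200 '
    'referer="-" user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6)"'
)
fields = parse_log_line(sample)
print(fields["status_code"], fields["user_agent"])
```

All values come back as strings, so anything numeric (such as status_code) needs converting by the caller.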
simonw committed Oct 2, 2017
1 parent 22b4c1e commit 23615a4822ab463c611a3e6a1f4d6cb4dcfc5e7b
Showing with 60 additions and 1 deletion.
  1. +1 −1 Procfile
  2. +50 −0 config/nginx.conf.erb
  3. +9 −0 gunicorn.conf
Procfile
@@ -1 +1 @@
-web: gunicorn config.wsgi --log-file -
+web: bin/start-nginx gunicorn -c gunicorn.conf config.wsgi --enable-stdio-inheritance --log-file -
config/nginx.conf.erb
@@ -0,0 +1,50 @@
daemon off;
#Heroku dynos have at least 4 cores.
worker_processes <%= ENV['NGINX_WORKERS'] || 4 %>;
events {
    use epoll;
    accept_mutex on;
    worker_connections 1024;
}
http {
    gzip on;
    gzip_comp_level 2;
    gzip_min_length 512;
    server_tokens off;
    log_format custom 'measure#nginx.service=$request_time request="$request" '
        'status_code=$status request_id=$http_x_request_id '
        'remote_addr="$remote_addr" forwarded_for="$http_x_forwarded_for" '
        'forwarded_proto="$http_x_forwarded_proto" via="$http_via" '
        'body_bytes_sent=$body_bytes_sent referer="$http_referer" '
        'user_agent="$http_user_agent"';
    access_log logs/nginx/access.log custom;
    error_log logs/nginx/error.log;
    include mime.types;
    default_type application/octet-stream;
    sendfile on;
    #Must read the body in 5 seconds.
    client_body_timeout 5;
    upstream app_server {
        server unix:/tmp/nginx.socket fail_timeout=0;
    }
    server {
        listen <%= ENV["PORT"] %>;
        server_name _;
        keepalive_timeout 5;
        location / {
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header Host $http_host;
            proxy_redirect off;
            proxy_pass http://app_server;
        }
    }
}
gunicorn.conf
@@ -0,0 +1,9 @@
# bin/start-nginx waits for /tmp/app-initialized to be created before binding
# to a port:
# https://github.com/beanieboi/nginx-buildpack/blob/0a188252b/bin/start-nginx#L42-L53
def when_ready(server):
    open('/tmp/app-initialized', 'w').close()

bind = 'unix:///tmp/nginx.socket'
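
The marker file is one half of a simple handshake: gunicorn creates it once it is ready, and the start script waits for it to appear before nginx binds to the port. The buildpack's actual script is not reproduced here. The following is only a Python sketch of the waiting side, with hypothetical names (`mark_ready`, `wait_for_file`) and assumed timeout values:

```python
import os
import tempfile
import time


def mark_ready(path: str) -> None:
    """Create an empty marker file, the same way the when_ready hook does."""
    open(path, "w").close()


def wait_for_file(path: str, timeout: float = 30.0, interval: float = 0.1) -> bool:
    """Poll until path exists; return False if it has not appeared by the timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(interval)
    return os.path.exists(path)


# Demo in a temporary directory instead of /tmp/app-initialized.
with tempfile.TemporaryDirectory() as tmp:
    marker = os.path.join(tmp, "app-initialized")
    print(wait_for_file(marker, timeout=0.3))  # marker not created yet
    mark_ready(marker)
    print(wait_for_file(marker, timeout=0.3))  # marker now exists
```

Polling for a file is crude, but it needs no coordination between the two processes beyond an agreed path.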
