@edmorley edmorley commented Jul 30, 2025

Heroku's Router 2.0 now supports Keep-Alive connections:
https://www.heroku.com/blog/tips-tricks-router-2dot0-migration/#keepalives-always-on
https://devcenter.heroku.com/articles/http-routing#legacy-router-and-router-2-0

Gunicorn supports Keep-Alive connections, so if an app is using Router 2.0, connections between the router instances and the app will now be kept alive.

However, gunicorn's default `keepalive` idle timeout setting is only 2 seconds:
https://docs.gunicorn.org/en/stable/settings.html#keepalive

This is shorter than the 90 second Router 2.0 keep-alive timeout:
https://devcenter.heroku.com/articles/http-routing#keepalives

As such:
(a) Connections are closed sooner than they need to be (as noted in the gunicorn docs, a low value such as the default of 2 seconds is more suited to cases where many clients connect directly to gunicorn, rather than gunicorn being behind a load balancer).
(b) There is a possibility of a race condition whereby gunicorn starts closing an idle Keep-Alive connection just at the moment that the Heroku Router sends a new request on it. If this occurs, that new request could fail.

This race condition isn't unique to Heroku or gunicorn, but applies to any load balancer/app server combination that supports Keep-Alive. For example, see AWS' explanation about this in their ELB docs:
https://docs.aws.amazon.com/elasticloadbalancing/latest/application/edit-load-balancer-attributes.html#connection-idle-timeout

> We also recommend that you configure the idle timeout of your application to be larger than the idle timeout configured for the load balancer. Otherwise, if the application closes the TCP connection to the load balancer ungracefully, the load balancer might send a request to the application before it receives the packet indicating that the connection is closed.
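
As an illustration of the ordering described above, here is a minimal sketch of which side closes an idle Keep-Alive connection first. The function name `idle_closer` and its parameters are hypothetical and exist only for this example; they are not part of gunicorn or the Heroku Router.

```python
def idle_closer(app_idle_timeout: float, router_idle_timeout: float) -> str:
    """Return which side closes an idle Keep-Alive connection first.

    Hypothetical helper for illustration only. If the app's timeout is
    not strictly greater than the router's, the app may close the
    connection while the router still considers it usable, which is
    the window in which the race condition can occur.
    """
    if app_idle_timeout <= router_idle_timeout:
        return "app"
    return "router"


# gunicorn's default keepalive vs. the Router 2.0 idle timeout.
print(idle_closer(app_idle_timeout=2, router_idle_timeout=90))
# A timeout above the router's, as chosen in this PR.
print(idle_closer(app_idle_timeout=100, router_idle_timeout=90))
```

The safe configuration is the one where this returns `"router"`, since the router will not send a new request on a connection it is closing itself.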

Therefore, the gunicorn idle timeout has been raised from 2 seconds to 100 seconds, so that it is greater than the Router's idle timeout. This ensures that the router is always the one initiating connection closing (and the router knows not to send new requests on a connection it is in the middle of closing).
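
For reference, a minimal sketch of what such a setting could look like, assuming the app is configured through a `gunicorn.conf.py` file (the actual diff is not shown in this conversation). `keepalive` is gunicorn's documented setting name; the constant `ROUTER_IDLE_TIMEOUT_SECONDS` is a hypothetical name used here only to make the relationship explicit.

```python
# gunicorn.conf.py (illustrative sketch, not the actual diff from this PR)

# Router 2.0 keep-alive idle timeout in seconds, per the Heroku docs linked
# above. Hypothetical constant: gunicorn does not read this name.
ROUTER_IDLE_TIMEOUT_SECONDS = 90

# Seconds gunicorn waits for another request on a Keep-Alive connection.
# Kept above the router's idle timeout so the router closes idle
# connections first.
keepalive = 100
```

The same setting can also be passed on the command line as `--keep-alive 100`.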

GUS-W-18319007.

@edmorley edmorley self-assigned this Jul 30, 2025
@edmorley edmorley requested a review from a team as a code owner July 30, 2025 10:52
@heroku heroku bot temporarily deployed to getting-star-edmorley-k-xsgewo July 30, 2025 10:53 Inactive
@edmorley edmorley removed the request for review from a team July 30, 2025 10:54
@edmorley edmorley merged commit 3ca0fad into main Jul 30, 2025
1 check passed
@edmorley edmorley deleted the edmorley/keepalive branch July 30, 2025 10:55
edmorley added a commit that referenced this pull request Aug 3, 2025
For consistency with the timeout chosen by other languages since #261 landed:
- heroku/heroku-buildpack-php#823
- heroku/ruby-getting-started#190

GUS-W-18319007.

Signed-off-by: Ed Morley <501702+edmorley@users.noreply.github.com>