[ingress/controllers/nginx] Nginx shutdown doesn't gracefully close "keep-alive" connections #1123
We are analyzing the behaviour of re-deploying the nginx ingress controller while it is being flooded with requests. Basically, we use gatling or ab (the Apache benchmarking command-line tool) to perform a large number of parallel requests against our kubernetes cluster for an extended period.
With the default nginx configuration we discovered that:
We tried several things, and the latest was to gracefully shut down nginx in the preStop hook with this command:
The expected behaviour would be that nginx maintains keep-alive connections until it receives the SIGTERM. Then, once it receives the -s quit, it stops accepting new keep-alive connections and answers requests on kept-alive connections with a "Connection: close" header, notifying clients that they should close those connections.
Finally, we also tried setting the "keepalive_timeout" parameter in the nginx configuration to "0". This way nginx never accepts keep-alive connections (it answers keep-alive requests with a "Connection: close" header) and we get a smooth result of 0 errors.
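For reference, the directive in question lives in the http (or server) block of nginx.conf; this is just a fragment to show the setting, not the controller's full configuration:

```nginx
http {
    # 0 disables keep-alive entirely: every response carries
    # "Connection: close", so clients open a fresh TCP connection
    # per request instead of reusing one.
    keepalive_timeout 0;
}
```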
Obviously this is not the best configuration, because we don't optimise the number of connections used, and we have a strong feeling that we are missing something.
Just tried, same behaviour!
Here is our preStop script. (We have an endpoint configured in nginx that just serves the readiness.html file, watched by F5 so that the physical node is excluded as soon as the preStop hook has been called.)
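The actual script isn't reproduced above, but the general shape of such a preStop hook can be sketched like this. The file path, sleep duration, and the `nginx -s quit` step are assumptions based on the discussion in this thread, not the author's exact values:

```yaml
# Sketch only -- paths and the sleep duration are assumptions,
# not the configuration used in this thread.
lifecycle:
  preStop:
    exec:
      command:
        - /bin/sh
        - -c
        # 1. remove the readiness page so the F5 stops routing here,
        # 2. wait for in-flight traffic to drain,
        # 3. ask nginx to shut down gracefully.
        - rm -f /usr/share/nginx/html/readiness.html; sleep 15; nginx -s quit
```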
.. and this is our kubernetes deployment configuration, for high availability
This is the output of our ab testing
Obviously, if you look at the percentage this doesn't seem like a big problem: "just" 100 failures out of 1000000 requests. (And the behaviour could be even better if we had more than 1 replica of the nginx ingress controller pod.)
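To put the figures quoted above in perspective (simple arithmetic on the numbers from the ab run, nothing measured here):

```python
# ~100 failed requests out of 1,000,000 total, as quoted above.
failed, total = 100, 1_000_000
rate = failed / total * 100
print(f"overall failure rate: {rate:.2f}%")  # prints "overall failure rate: 0.01%"
```

The overall rate looks negligible, which is exactly why the next point matters: the failures are concentrated in the shutdown window, not spread across the run.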
The bigger point, in my opinion, is that when nginx dies it drops all the connections that are in keep-alive status at that specific moment. So if you look only at that window of time, 100% of requests fail.
If you have a look at the logs this is the behaviour observed:
I am writing here only because I saw that nginx-slim has been modified quite a lot, and I observed this issue on kubernetes with the nginx ingress controller, but it could be an nginx-specific problem.
Next step on my side is to isolate the nginx behaviour outside kubernetes and the "go wrapper".
Just finished some tests, and nginx itself actually doesn't gracefully shut down connections in keep-alive status.
Here is my test:
.. then I noticed that when I launch ..
.. I continue to receive
.. until the very end, right before nginx quits.
That's (as explained before) a big problem, since clients that are re-using those connections are never notified via a "Connection: close" HTTP header and continue to re-use the TCP connection.
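The client-side logic at stake can be sketched as follows: an HTTP/1.1 client is entitled to keep reusing a connection unless a response explicitly carries "Connection: close". This is standard HTTP/1.1 behaviour, not code from this thread:

```python
def may_reuse_connection(headers: dict, http_version: str = "HTTP/1.1") -> bool:
    """Return True if an HTTP client is allowed to reuse the TCP connection
    after this response, based on its Connection header."""
    tokens = {t.strip().lower() for t in headers.get("Connection", "").split(",")}
    if http_version == "HTTP/1.0":
        # HTTP/1.0 closes by default; reuse only on explicit keep-alive.
        return "keep-alive" in tokens
    # HTTP/1.1 keeps the connection open by default; only an explicit
    # "close" token tells the client to stop reusing it.
    return "close" not in tokens

# A shutting-down server that never sends "Connection: close" leaves
# clients free to keep reusing the connection -- the problem described above.
print(may_reuse_connection({}))                       # True
print(may_reuse_connection({"Connection": "close"}))  # False
```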
Now, what do you think we can do @aledbf? Perhaps we should close this issue and report it to the nginx developers?
That's a good idea. Can you open a ticket with nginx with the content of your last comment?