
Gunicorn is not rejecting connections over backlog limit #2183

Closed
tuukkamustonen opened this issue Nov 19, 2019 · 13 comments

@tuukkamustonen

Basically this: https://stackoverflow.com/questions/56755512/gunicorn-is-not-rejecting-connections-over-backlog-limit (someone else's question; I just happened to find it).

How to reproduce (with gunicorn 20.0.0):

app.py:

from flask import Flask
import time

app = Flask(__name__)


@app.route('/')
def endpoint():
    print('in')
    time.sleep(2)  # simply long enough so that parallel connections get queued
    print('out')
    return 'hi'

Run it with the (default) sync worker:

gunicorn --backlog 1 --workers 1 --timeout 600 --graceful-timeout 600 --log-level debug app:app

Bombard:

ab -v 2 -n 10 -c 10 -s 600 http://127.0.0.1:8000/

What happens here is that I'm sending 10 parallel requests, of which only 1 at a time should get processed. My expectation was that anything above 1 or 2 parallel requests would fail (1 active on the worker, 1 queued in the backlog), but everything above that actually succeeds. It succeeds even with far larger counts (around 50, though maybe not 100).

The results are much the same with --worker-class gevent --worker-connections 1.

Why is this? Why don't the backlog and worker / worker_connections counts result in most requests getting rejected? Where are connections queued, if not in the backlog or at the workers?

@jamadden
Collaborator

The man page for listen(2) says this about backlog:

if the underlying protocol supports retransmission, the request may be ignored so that a later reattempt at connection succeeds.

Monitoring the TCP traffic on my system, that's exactly what happens when the backlog is full. The client that's trying to connect keeps trying with TCP retransmissions until it ultimately reaches the socket timeout.

Here, a server is running on port 64378 with a full backlog. I attempt to connect a new client and we can see the retries happening (at increasing backoffs):

Capturing on 'Loopback: lo0'
    3   5.827079    127.0.0.1 → 127.0.0.1    TCP 64763 → 64378 [SYN] Seq=0 Win=65535 [TCP CHECKSUM INCORRECT] Len=0 MSS=16344 WS=1024 TSval=1275570889 TSecr=0 SACK_PERM=1
    4   5.928514    127.0.0.1 → 127.0.0.1    TCP [TCP Retransmission] 64763 → 64378 [SYN] Seq=0 Win=65535 [TCP CHECKSUM INCORRECT] Len=0 MSS=16344 WS=1024 TSval=1275570990 TSecr=0 SACK_PERM=1
    5   6.030021    127.0.0.1 → 127.0.0.1    TCP [TCP Retransmission] 64763 → 64378 [SYN] Seq=0 Win=65535 [TCP CHECKSUM INCORRECT] Len=0 MSS=16344 WS=1024 TSval=1275571090 TSecr=0 SACK_PERM=1
    6   6.130239    127.0.0.1 → 127.0.0.1    TCP [TCP Retransmission] 64763 → 64378 [SYN] Seq=0 Win=65535 [TCP CHECKSUM INCORRECT] Len=0 MSS=16344 WS=1024 TSval=1275571190 TSecr=0 SACK_PERM=1
    7   6.231433    127.0.0.1 → 127.0.0.1    TCP [TCP Retransmission] 64763 → 64378 [SYN] Seq=0 Win=65535 [TCP CHECKSUM INCORRECT] Len=0 MSS=16344 WS=1024 TSval=1275571291 TSecr=0 SACK_PERM=1
    8   6.331518    127.0.0.1 → 127.0.0.1    TCP [TCP Retransmission] 64763 → 64378 [SYN] Seq=0 Win=65535 [TCP CHECKSUM INCORRECT] Len=0 MSS=16344 WS=1024 TSval=1275571391 TSecr=0 SACK_PERM=1
    9   6.532667    127.0.0.1 → 127.0.0.1    TCP [TCP Retransmission] 64763 → 64378 [SYN] Seq=0 Win=65535 [TCP CHECKSUM INCORRECT] Len=0 MSS=16344 WS=1024 TSval=1275571592 TSecr=0 SACK_PERM=1
   10   6.935262    127.0.0.1 → 127.0.0.1    TCP [TCP Retransmission] 64763 → 64378 [SYN] Seq=0 Win=65535 [TCP CHECKSUM INCORRECT] Len=0 MSS=16344 WS=1024 TSval=1275571992 TSecr=0 SACK_PERM=1
   11   7.736393    127.0.0.1 → 127.0.0.1    TCP [TCP Retransmission] 64763 → 64378 [SYN] Seq=0 Win=65535 [TCP CHECKSUM INCORRECT] Len=0 MSS=16344 WS=1024 TSval=1275572792 TSecr=0 SACK_PERM=1
   14   9.341732    127.0.0.1 → 127.0.0.1    TCP [TCP Retransmission] 64763 → 64378 [SYN] Seq=0 Win=65535 [TCP CHECKSUM INCORRECT] Len=0 MSS=16344 WS=1024 TSval=1275574392 TSecr=0 SACK_PERM=1
   15  12.557183    127.0.0.1 → 127.0.0.1    TCP [TCP Retransmission] 64763 → 64378 [SYN] Seq=0 Win=65535 [TCP CHECKSUM INCORRECT] Len=0 MSS=16344 SACK_PERM=1
   18  18.992804    127.0.0.1 → 127.0.0.1    TCP [TCP Retransmission] 64763 → 64378 [SYN] Seq=0 Win=65535 [TCP CHECKSUM INCORRECT] Len=0 MSS=16344 SACK_PERM=1
   21  25.430874    127.0.0.1 → 127.0.0.1    TCP [TCP Retransmission] 64763 → 64378 [SYN] Seq=0 Win=65535 [TCP CHECKSUM INCORRECT] Len=0 MSS=16344 SACK_PERM=1
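
You can reproduce the full-backlog behaviour without gunicorn at all; here's a minimal sketch with plain sockets (the timeout is arbitrary, and how many connects succeed before timing out varies by OS and kernel version):

import socket

# Server: listen with a tiny backlog and never call accept(), so the
# kernel's queue fills up after the first few handshakes.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('127.0.0.1', 0))   # port 0: let the OS pick a free port
server.listen(1)
addr = server.getsockname()

# Clients: each connect() either completes immediately (the kernel
# queues it) or keeps retransmitting SYNs until the timeout fires.
clients = []
for i in range(5):
    c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    c.settimeout(3)
    try:
        c.connect(addr)
        print(i, 'connected (queued by the kernel)')
    except socket.timeout:
        print(i, 'timed out (queue full, still retransmitting SYNs)')
    clients.append(c)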

@tuukkamustonen
Author

Interesting, I totally forgot about the TCP retry mechanism. I wonder if we can disable it temporarily, to test this without it(?)

@jamadden What tool / command did you take that capture with? I guess the second column is timing, but is it in seconds, milliseconds, ...?

@jamadden
Collaborator

Interesting, I totally forgot about the TCP retry mechanism. I wonder if we can disable it temporarily, to test this without it(?)

Maybe? If so, I expect it's a system-dependent tunable somewhere in /proc or /sys on Linux, or a sysctl on macOS.
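
On Linux there is indeed such a tunable: net.ipv4.tcp_syn_retries bounds how many times a SYN is retransmitted before connect() gives up (lowering it requires root, e.g. sysctl -w net.ipv4.tcp_syn_retries=1). A minimal sketch to inspect it:

# Read the current SYN retry limit (standard procfs path on Linux;
# macOS would need a different sysctl, if an equivalent exists).
with open('/proc/sys/net/ipv4/tcp_syn_retries') as f:
    print('net.ipv4.tcp_syn_retries =', f.read().strip())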

What tool / command did you take that capture with?

That's output from tshark, a tool that's part of Wireshark, which builds on libpcap.

I guess the second column is timing, but is it in seconds, milliseconds, ...?

I think they're elapsed (fractional) seconds since the capture began.

@tuukkamustonen
Author

OK, I tried it too.

Using a 2s delay and sending 5 concurrent requests, I get this:

The first 3 requests get accepted:

    1 0.000000000    127.0.0.1 → 127.0.0.1    TCP 74 37822 → 8000 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705125020 TSecr=0 WS=128
    2 0.000011341    127.0.0.1 → 127.0.0.1    TCP 74 8000 → 37822 [SYN, ACK] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705125020 TSecr=2705125020 WS=128
    3 0.000021269    127.0.0.1 → 127.0.0.1    TCP 66 37822 → 8000 [ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=2705125020 TSecr=2705125020
    4 0.000381569    127.0.0.1 → 127.0.0.1    TCP 74 37824 → 8000 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705125020 TSecr=0 WS=128
    5 0.000388990    127.0.0.1 → 127.0.0.1    TCP 74 8000 → 37824 [SYN, ACK] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705125020 TSecr=2705125020 WS=128
    6 0.000396028    127.0.0.1 → 127.0.0.1    TCP 66 37824 → 8000 [ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=2705125020 TSecr=2705125020
    7 0.000418732    127.0.0.1 → 127.0.0.1    TCP 74 37826 → 8000 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705125020 TSecr=0 WS=128
    8 0.000423281    127.0.0.1 → 127.0.0.1    TCP 74 8000 → 37826 [SYN, ACK] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705125020 TSecr=2705125020 WS=128
    9 0.000427919    127.0.0.1 → 127.0.0.1    TCP 66 37826 → 8000 [ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=2705125020 TSecr=2705125020

The next 2 don't:

   10 0.000446817    127.0.0.1 → 127.0.0.1    TCP 74 37828 → 8000 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705125020 TSecr=0 WS=128
   11 0.000464200    127.0.0.1 → 127.0.0.1    TCP 74 37830 → 8000 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705125020 TSecr=0 WS=128

The HTTP requests for the first 3 are sent:

   12 0.000482579    127.0.0.1 → 127.0.0.1    HTTP 148 GET / HTTP/1.0 
   13 0.000489299    127.0.0.1 → 127.0.0.1    TCP 66 8000 → 37822 [ACK] Seq=1 Ack=83 Win=43776 Len=0 TSval=2705125020 TSecr=2705125020
   14 0.001506206    127.0.0.1 → 127.0.0.1    HTTP 148 GET / HTTP/1.0 
   15 0.001514510    127.0.0.1 → 127.0.0.1    TCP 66 8000 → 37824 [ACK] Seq=1 Ack=83 Win=43776 Len=0 TSval=2705125021 TSecr=2705125021
   16 0.001528418    127.0.0.1 → 127.0.0.1    HTTP 148 GET / HTTP/1.0 
   17 0.001531944    127.0.0.1 → 127.0.0.1    TCP 66 8000 → 37826 [ACK] Seq=1 Ack=83 Win=43776 Len=0 TSval=2705125021 TSecr=2705125021

Retransmission of the 2 failed packets:

   18 1.026396604    127.0.0.1 → 127.0.0.1    TCP 74 [TCP Retransmission] 37830 → 8000 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705126046 TSecr=0 WS=128
   19 1.026407190    127.0.0.1 → 127.0.0.1    TCP 74 [TCP Retransmission] 37828 → 8000 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705126046 TSecr=0 WS=128

Response for 1st request:

   20 2.005637083    127.0.0.1 → 127.0.0.1    TCP 225 HTTP/1.0 200 OK  [TCP segment of a reassembled PDU]
   21 2.005734886    127.0.0.1 → 127.0.0.1    TCP 66 37822 → 8000 [ACK] Seq=83 Ack=160 Win=44800 Len=0 TSval=2705127026 TSecr=2705127026
   22 2.005819158    127.0.0.1 → 127.0.0.1    HTTP 68 HTTP/1.0 200 OK  (text/html)
   23 2.005834374    127.0.0.1 → 127.0.0.1    TCP 66 37822 → 8000 [ACK] Seq=83 Ack=162 Win=44800 Len=0 TSval=2705127026 TSecr=2705127026
   24 2.006035843    127.0.0.1 → 127.0.0.1    TCP 66 8000 → 37822 [FIN, ACK] Seq=162 Ack=83 Win=43776 Len=0 TSval=2705127026 TSecr=2705127026
   25 2.007534417    127.0.0.1 → 127.0.0.1    TCP 66 37822 → 8000 [FIN, ACK] Seq=83 Ack=163 Win=44800 Len=0 TSval=2705127027 TSecr=2705127026
   26 2.007554217    127.0.0.1 → 127.0.0.1    TCP 66 8000 → 37822 [ACK] Seq=163 Ack=84 Win=43776 Len=0 TSval=2705127027 TSecr=2705127027

The connection gets closed, so a new one is opened (actually, I wonder why, because I'm sending 5 requests from 5 threads, so each thread should only send one request):

   27 2.007634988    127.0.0.1 → 127.0.0.1    TCP 74 37832 → 8000 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705127028 TSecr=0 WS=128
   28 2.007651838    127.0.0.1 → 127.0.0.1    TCP 74 8000 → 37832 [SYN, ACK] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705127028 TSecr=2705127028 WS=128
   29 2.007667538    127.0.0.1 → 127.0.0.1    TCP 66 37832 → 8000 [ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=2705127028 TSecr=2705127028
   30 2.007824776    127.0.0.1 → 127.0.0.1    HTTP 148 GET / HTTP/1.0 
   31 2.007836922    127.0.0.1 → 127.0.0.1    TCP 66 8000 → 37832 [ACK] Seq=1 Ack=83 Win=43776 Len=0 TSval=2705127028 TSecr=2705127028

Retransmission of the 2 failed packets:

   32 3.038871654    127.0.0.1 → 127.0.0.1    TCP 74 [TCP Retransmission] 37828 → 8000 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705128059 TSecr=0 WS=128
   33 3.038886489    127.0.0.1 → 127.0.0.1    TCP 74 [TCP Retransmission] 37830 → 8000 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705128059 TSecr=0 WS=128

And so on, it continues pretty much like above:

   34 4.014308528    127.0.0.1 → 127.0.0.1    TCP 225 HTTP/1.0 200 OK  [TCP segment of a reassembled PDU]
   35 4.014346966    127.0.0.1 → 127.0.0.1    TCP 66 37824 → 8000 [ACK] Seq=83 Ack=160 Win=44800 Len=0 TSval=2705129034 TSecr=2705129034
   36 4.014423946    127.0.0.1 → 127.0.0.1    HTTP 68 HTTP/1.0 200 OK  (text/html)
   37 4.014440023    127.0.0.1 → 127.0.0.1    TCP 66 37824 → 8000 [ACK] Seq=83 Ack=162 Win=44800 Len=0 TSval=2705129034 TSecr=2705129034
   38 4.014630970    127.0.0.1 → 127.0.0.1    TCP 66 8000 → 37824 [FIN, ACK] Seq=162 Ack=83 Win=43776 Len=0 TSval=2705129035 TSecr=2705129034
   39 4.016415592    127.0.0.1 → 127.0.0.1    TCP 66 37824 → 8000 [FIN, ACK] Seq=83 Ack=163 Win=44800 Len=0 TSval=2705129036 TSecr=2705129035
   40 4.016446150    127.0.0.1 → 127.0.0.1    TCP 66 8000 → 37824 [ACK] Seq=163 Ack=84 Win=43776 Len=0 TSval=2705129036 TSecr=2705129036
   41 4.016555193    127.0.0.1 → 127.0.0.1    TCP 74 37834 → 8000 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705129036 TSecr=0 WS=128
   42 4.016577822    127.0.0.1 → 127.0.0.1    TCP 74 8000 → 37834 [SYN, ACK] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705129036 TSecr=2705129036 WS=128
   43 4.016600182    127.0.0.1 → 127.0.0.1    TCP 66 37834 → 8000 [ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=2705129037 TSecr=2705129036
   44 4.016661344    127.0.0.1 → 127.0.0.1    HTTP 148 GET / HTTP/1.0 
   45 4.016674977    127.0.0.1 → 127.0.0.1    TCP 66 8000 → 37834 [ACK] Seq=1 Ack=83 Win=43776 Len=0 TSval=2705129037 TSecr=2705129037
   46 6.021405410    127.0.0.1 → 127.0.0.1    TCP 225 HTTP/1.0 200 OK  [TCP segment of a reassembled PDU]
   47 6.021440479    127.0.0.1 → 127.0.0.1    TCP 66 37826 → 8000 [ACK] Seq=83 Ack=160 Win=44800 Len=0 TSval=2705131041 TSecr=2705131041
   48 6.021503617    127.0.0.1 → 127.0.0.1    HTTP 68 HTTP/1.0 200 OK  (text/html)
   49 6.021518815    127.0.0.1 → 127.0.0.1    TCP 66 37826 → 8000 [ACK] Seq=83 Ack=162 Win=44800 Len=0 TSval=2705131041 TSecr=2705131041
   50 6.021764897    127.0.0.1 → 127.0.0.1    TCP 66 8000 → 37826 [FIN, ACK] Seq=162 Ack=83 Win=43776 Len=0 TSval=2705131042 TSecr=2705131041
   51 6.023144947    127.0.0.1 → 127.0.0.1    TCP 66 37826 → 8000 [FIN, ACK] Seq=83 Ack=163 Win=44800 Len=0 TSval=2705131043 TSecr=2705131042
   52 6.023166083    127.0.0.1 → 127.0.0.1    TCP 66 8000 → 37826 [ACK] Seq=163 Ack=84 Win=43776 Len=0 TSval=2705131043 TSecr=2705131043
   53 7.138705088    127.0.0.1 → 127.0.0.1    TCP 74 [TCP Retransmission] 37830 → 8000 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705132159 TSecr=0 WS=128
   54 7.138716112    127.0.0.1 → 127.0.0.1    TCP 74 [TCP Retransmission] 37828 → 8000 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705132159 TSecr=0 WS=128
   55 7.138786036    127.0.0.1 → 127.0.0.1    TCP 74 8000 → 37830 [SYN, ACK] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705132159 TSecr=2705132159 WS=128
   56 7.138802054    127.0.0.1 → 127.0.0.1    TCP 74 8000 → 37828 [SYN, ACK] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705132129 TSecr=2705132159 WS=128
   57 7.138823800    127.0.0.1 → 127.0.0.1    TCP 66 37830 → 8000 [ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=2705132159 TSecr=2705132159
   58 7.138835300    127.0.0.1 → 127.0.0.1    TCP 66 37828 → 8000 [ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=2705132159 TSecr=2705132129
   59 8.029635001    127.0.0.1 → 127.0.0.1    TCP 225 HTTP/1.0 200 OK  [TCP segment of a reassembled PDU]
   60 8.029666079    127.0.0.1 → 127.0.0.1    TCP 66 37832 → 8000 [ACK] Seq=83 Ack=160 Win=44800 Len=0 TSval=2705133050 TSecr=2705133050
   61 8.029717662    127.0.0.1 → 127.0.0.1    HTTP 68 HTTP/1.0 200 OK  (text/html)
   62 8.029730290    127.0.0.1 → 127.0.0.1    TCP 66 37832 → 8000 [ACK] Seq=83 Ack=162 Win=44800 Len=0 TSval=2705133050 TSecr=2705133050
   63 8.029890058    127.0.0.1 → 127.0.0.1    TCP 66 8000 → 37832 [FIN, ACK] Seq=162 Ack=83 Win=43776 Len=0 TSval=2705133050 TSecr=2705133050
   64 8.030352906    127.0.0.1 → 127.0.0.1    TCP 66 37832 → 8000 [FIN, ACK] Seq=83 Ack=163 Win=44800 Len=0 TSval=2705133050 TSecr=2705133050
   65 8.030375946    127.0.0.1 → 127.0.0.1    TCP 66 8000 → 37832 [ACK] Seq=163 Ack=84 Win=43776 Len=0 TSval=2705133050 TSecr=2705133050
   66 10.034449993    127.0.0.1 → 127.0.0.1    TCP 225 HTTP/1.0 200 OK  [TCP segment of a reassembled PDU]
   67 10.034695032    127.0.0.1 → 127.0.0.1    TCP 66 37834 → 8000 [ACK] Seq=83 Ack=160 Win=44800 Len=0 TSval=2705135054 TSecr=2705135054
   68 10.034925168    127.0.0.1 → 127.0.0.1    HTTP 68 HTTP/1.0 200 OK  (text/html)
   69 10.034934651    127.0.0.1 → 127.0.0.1    TCP 66 37834 → 8000 [ACK] Seq=83 Ack=162 Win=44800 Len=0 TSval=2705135055 TSecr=2705135055
   70 10.035186979    127.0.0.1 → 127.0.0.1    TCP 66 8000 → 37834 [FIN, ACK] Seq=162 Ack=83 Win=43776 Len=0 TSval=2705135055 TSecr=2705135055
   71 10.035679077    127.0.0.1 → 127.0.0.1    TCP 66 37834 → 8000 [FIN, ACK] Seq=83 Ack=163 Win=44800 Len=0 TSval=2705135056 TSecr=2705135055
   72 10.035693280    127.0.0.1 → 127.0.0.1    TCP 66 8000 → 37834 [ACK] Seq=163 Ack=84 Win=43776 Len=0 TSval=2705135056 TSecr=2705135056
   73 10.036198693    127.0.0.1 → 127.0.0.1    TCP 66 37830 → 8000 [FIN, ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=2705135056 TSecr=2705132159
   74 10.036227058    127.0.0.1 → 127.0.0.1    TCP 66 37828 → 8000 [FIN, ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=2705135056 TSecr=2705132129
   75 10.038097445    127.0.0.1 → 127.0.0.1    TCP 66 8000 → 37830 [FIN, ACK] Seq=1 Ack=2 Win=43776 Len=0 TSval=2705135058 TSecr=2705135056
   76 10.038128896    127.0.0.1 → 127.0.0.1    TCP 66 37830 → 8000 [ACK] Seq=2 Ack=2 Win=43776 Len=0 TSval=2705135058 TSecr=2705135058
   77 10.038787060    127.0.0.1 → 127.0.0.1    TCP 66 8000 → 37828 [FIN, ACK] Seq=1 Ack=2 Win=29312 Len=0 TSval=2705135059 TSecr=2705135056
   78 10.038800644    127.0.0.1 → 127.0.0.1    TCP 66 37828 → 8000 [ACK] Seq=2 Ack=2 Win=43776 Len=0 TSval=2705135059 TSecr=2705135059

So, in this test 3 requests get handled concurrently, not 1 (or 2).

@tuukkamustonen
Author

tuukkamustonen commented Nov 20, 2019

Running with 5 workers (instead of 1), the result varies: sometimes 1-2 packets need to be retransmitted, sometimes not. I think this is due to --backlog 1 actually being able to queue 2-3 connections. Sometimes connections are handed to workers so fast that the backlog never overflows; sometimes it's slower and the backlog rejects the extra connections.

I also ran a few different tests, and it seems to work reasonably. And just to make sure:

  • backlog is only for queued connections.
  • workers and worker-connections determine, per worker, how many parallel requests are allowed. They have no relation to backlog.

So if the only weirdness here is that a backlog value of N is actually N+1 or N+2, it doesn't feel like a big problem. I guess I'll just close this soon (unless someone has more insight on it).
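
As a sketch of how these settings compose in a gunicorn config file (the numbers are arbitrary; setting semantics per the gunicorn docs):

# gunicorn.conf.py -- illustrative numbers only
backlog = 64              # hint for the kernel's queue of not-yet-accepted connections
workers = 2               # worker processes; a sync worker handles one request at a time
worker_connections = 100  # max concurrent connections per worker (eventlet/gevent workers only)

# Rough in-flight capacity: (workers x concurrency per worker) being
# processed, plus up to ~backlog (give or take 1-2) queued connections.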

@jamadden
Collaborator

You don't mention the operating system you're testing on. That turns out to be very important as the backlog behaviour differs substantially between operating systems (and even versions). On Linux, there are actually two queues, only one of which is limited by the backlog. The backlog limit only kicks in after the three-way handshake, IIUC; the handshake limit is system-wide and may be unlimited.

The behaviour I saw when testing on macOS I think demonstrates the classic BSD approach of a single queue.
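
For reference, the two Linux limits in question can be read straight from procfs; a quick sketch, per my understanding (the paths are the standard Linux locations):

# The SYN (handshake) queue is bounded by tcp_max_syn_backlog
# system-wide; the accept queue is bounded per listening socket by
# min(listen backlog, somaxconn).
for name in ('net/core/somaxconn', 'net/ipv4/tcp_max_syn_backlog'):
    with open('/proc/sys/' + name) as f:
        print(name.replace('/', '.'), '=', f.read().strip())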

@tuukkamustonen
Author

tuukkamustonen commented Nov 20, 2019

I'm using Linux (KDE Neon 5.17, based on Ubuntu 18.04, which is based on Debian 9, I think):

Linux musttu-work 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

I'm running a virtual machine inside VirtualBox, but that probably has no impact.

Thanks for the pointers! Looks like I need to do some studying.

@tuukkamustonen
Author

tuukkamustonen commented Nov 20, 2019

I went through the linked material and indeed it was a good read.

However, looking at my earlier trace, I still don't understand how this is possible:

First 3 connections get established:

    1 0.000000000    127.0.0.1 → 127.0.0.1    TCP 74 37822 → 8000 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705125020 TSecr=0 WS=128
    2 0.000011341    127.0.0.1 → 127.0.0.1    TCP 74 8000 → 37822 [SYN, ACK] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705125020 TSecr=2705125020 WS=128
    3 0.000021269    127.0.0.1 → 127.0.0.1    TCP 66 37822 → 8000 [ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=2705125020 TSecr=2705125020
    4 0.000381569    127.0.0.1 → 127.0.0.1    TCP 74 37824 → 8000 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705125020 TSecr=0 WS=128
    5 0.000388990    127.0.0.1 → 127.0.0.1    TCP 74 8000 → 37824 [SYN, ACK] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705125020 TSecr=2705125020 WS=128
    6 0.000396028    127.0.0.1 → 127.0.0.1    TCP 66 37824 → 8000 [ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=2705125020 TSecr=2705125020
    7 0.000418732    127.0.0.1 → 127.0.0.1    TCP 74 37826 → 8000 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705125020 TSecr=0 WS=128
    8 0.000423281    127.0.0.1 → 127.0.0.1    TCP 74 8000 → 37826 [SYN, ACK] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=2705125020 TSecr=2705125020 WS=128
    9 0.000427919    127.0.0.1 → 127.0.0.1    TCP 66 37826 → 8000 [ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=2705125020 TSecr=2705125020

OK, even if they were in the SYN queue at this point, they would be moved to the accept queue (and then to the worker).

How is it possible that 3 requests are actually allowed to send content:

   12 0.000482579    127.0.0.1 → 127.0.0.1    HTTP 148 GET / HTTP/1.0 
   13 0.000489299    127.0.0.1 → 127.0.0.1    TCP 66 8000 → 37822 [ACK] Seq=1 Ack=83 Win=43776 Len=0 TSval=2705125020 TSecr=2705125020
   14 0.001506206    127.0.0.1 → 127.0.0.1    HTTP 148 GET / HTTP/1.0 
   15 0.001514510    127.0.0.1 → 127.0.0.1    TCP 66 8000 → 37824 [ACK] Seq=1 Ack=83 Win=43776 Len=0 TSval=2705125021 TSecr=2705125021
   16 0.001528418    127.0.0.1 → 127.0.0.1    HTTP 148 GET / HTTP/1.0 
   17 0.001531944    127.0.0.1 → 127.0.0.1    TCP 66 8000 → 37826 [ACK] Seq=1 Ack=83 Win=43776 Len=0 TSval=2705125021 TSecr=2705125021

Wouldn't these requests already be in the accept queue when this happens? If so, the first one could get moved to the worker, and the second one would be queued in the accept queue. But shouldn't the third one fail, as the accept queue is full by the time it is moved there?

And even if the SYN queue allowed data to be received for a connection before it's moved to the accept queue, slight slowness can't explain it: there was a 2+ second delay here, and all 3 requests still got handled just fine.

@jamadden
Collaborator

We're getting into details I happily admit to being fuzzy on and which I haven't tested recently.

This could be where the interaction of worker type (whether the listener socket is blocking or not) and the reuse_port setting comes into play. When reuse_port is false (the default) there's essentially one queue for each open socket, and there are two open sockets: one in the master and one in the worker. That makes for a total of three in-flight connections (two backlogged, one in progress).

When it's true and the SO_REUSEPORT socket option is enabled, then there's a single combined queue shared among all the sockets; whether that actually changes the total backlog handling or not, I don't know, but it might. Also, using gevent here could change the calculus as well, because it seems that introducing epoll changes the way connections are moved between queues.
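
At the socket level, reuse_port amounts to something like the following sketch (not gunicorn's actual code; SO_REUSEPORT needs Linux >= 3.9 or a BSD that supports it):

import socket

def make_listener(port, backlog=1):
    # With SO_REUSEPORT set before bind(), several processes can each
    # bind their own socket to the same address, and the kernel
    # distributes incoming connections among them.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind(('127.0.0.1', port))
    sock.listen(backlog)
    return sock

a = make_listener(8000)
b = make_listener(8000)  # without SO_REUSEPORT this would raise EADDRINUSE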

Again, I haven't really looked closely into the details or experimented on Linux enough to claim to fully understand at this level. My deployed systems perform well with gevent+SO_REUSEPORT, and if the exact backlog isn't correct it's not a big deal for us. And hey, it's just an off-by-one error, and those have never resulted in any big problems, right? 😄

@jamadden
Collaborator

Is there something I should do here? AFAICS gunicorn is behaving correctly; it's just that backlog is inherently very system-dependent.

@benoitc
Owner

benoitc commented Nov 21, 2019 via email

@tuukkamustonen
Author

tuukkamustonen commented Nov 21, 2019

Sorry for the delayed response!

When reuse_port is false (the default) there's essentially one queue for each open socket, and there are two open sockets: one in the master and one in the worker. That makes for a total of three in-flight connections (two backlogged, one in progress).

Indeed, I didn't even know about the reuse_port option, and even after reading those articles I can't say I really understand where the "in-flight" connection you refer to lives... but I believe you. No need to go deeper here, as I need to study this topic more, and as you said yourself it's a bit of a fuzzy domain. I can see myself wondering about it for years to come. This is enough for now :)

Also, using gevent here could change the calculus as well, because it seems that introducing epoll changes the way connections are moved between queues.

I mostly ran my tests using sync workers, for simplicity, and gevent didn't seem to behave much differently. But yeah, there might have been a slight difference.

Again, I haven't really looked closely into the details or experimented on Linux enough to claim to fully understand at this level. My deployed systems perform well with gevent+SO_REUSEPORT, and if the exact backlog isn't correct it's not a big deal for us. And hey, it's just an off-by-one error, and those have never resulted in any big problems, right? 😄

Yeah, I get your point :D and I agree. It seems to work reasonably well in this surprising and magical world of networking. Humble thanks for the prompt responses and effort taken here.

However, maybe this should be clarified in the docs:

When it's true and the SO_REUSEPORT socket option is enabled, then there's a single combined queue shared among all the sockets; whether that actually changes the total backlog handling or not, I don't know, but it might.

I mean, if there's a relation to the total number of requests that can be queued (and in this regard it can be a major one), maybe it should be emphasized to the reader? The docs for reuse_port currently just say:

Set the SO_REUSEPORT flag on the listening socket.

For a novice reader, I believe it helps very little. It makes me wonder: what does the backlog option do if reuse_port is enabled? Does the backlog (on the master process) exist at all then? Is there a backlog for each worker (so the total backlog is actually backlog * workers)? How are requests shared between workers with reuse_port enabled/disabled? With reuse_port enabled and 1 master + 2 workers, if the master also opens an accept queue, is every 1/3 of requests accepted by the master and re-handed to workers, while the other 2/3 are accepted directly by the workers? What is the effective difference from the master taking it all in and then redistributing?

@Dobiasd

Dobiasd commented Nov 21, 2020

Gunicorn uses the socket.listen function to open the port. This maps to the listen function on Linux.

According to the listen docs, the backlog parameter for this function is just a hint. The implementation might use different values, which is very likely what happens here. (The same is true for uWSGI.)

So it seems Gunicorn cannot provide such fine-grained control over the request queue with the --backlog parameter. It might make sense to mention this in the docs.
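
The "hint" nature is easy to see even without gunicorn; a minimal sketch (observed behaviour on Linux, other systems may differ): even listen(0) lets at least one connection complete.

import socket

# With a backlog "hint" of 0 the kernel typically still completes at
# least one handshake before ignoring further SYNs.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('127.0.0.1', 0))
server.listen(0)

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.settimeout(2)
client.connect(server.getsockname())  # succeeds; accept() was never called
print('connected despite backlog=0')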
