Improve incoming request handling by accepting all requests #73

Conversation


@vbanos vbanos commented Mar 4, 2018

Currently, when the number of active requests is greater than
`max_threads`, we do `time.sleep(0.05)` and check again indefinitely until
we can accept the request.
Also, when this happens, the `unaccepted_requests` counter is
incremented.

This creates an issue: the number of concurrent incoming requests can be
no more than the number of warcprox threads(!), which limits the capacity of
warcprox to handle incoming requests.
If we choose a small `max_threads` and have a fair number of browsers
using warcprox, a lot of requests will wait.
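For reference, the current behavior described above looks roughly like this (an illustrative sketch only, not the actual warcprox source; the class name and the `process_request_thread` stub are placeholders):

```python
import time
from concurrent.futures import ThreadPoolExecutor

class PooledMixInSketch:
    """Illustrative sketch of the current behavior, not the warcprox code."""
    def __init__(self, max_threads=20):
        self.max_threads = max_threads
        self.pool = ThreadPoolExecutor(max_workers=max_threads)
        self.active_requests = set()
        self.unaccepted_requests = 0

    def process_request(self, request, client_address):
        # Poll indefinitely, 50 ms at a time, until a worker thread is free;
        # only then accept the request.
        while len(self.active_requests) > self.max_threads:
            self.unaccepted_requests += 1
            time.sleep(0.05)
        self.active_requests.add(request)
        self.pool.submit(self.process_request_thread, request, client_address)

    def process_request_thread(self, request, client_address):
        pass  # actual request handling omitted
```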

Since the `ThreadPoolExecutor` we use has a queue, we can accept all
requests regardless of `max_threads` and serve them
when we can (the queue is FIFO).
The clients will have their connections accepted and may wait a bit for
a response, but they won't be blocked.
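The key property relied on here is that `ThreadPoolExecutor.submit()` never blocks: excess work simply waits in the executor's internal FIFO queue. A quick standalone demonstration (not warcprox code):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle(i):
    time.sleep(0.1)     # simulate serving a request
    return i

pool = ThreadPoolExecutor(max_workers=2)

# All ten submissions are accepted immediately even though only two
# workers exist; the remaining eight wait in the executor's FIFO queue.
futures = [pool.submit(handle, i) for i in range(10)]
print([f.result() for f in futures])   # -> [0, 1, 2, ..., 9]
```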

Another VERY important point is that we'll be able to keep
`max_threads` low.
In my performance tests on a VM with 4 CPU cores, I set `max_threads=20`
and saw `active_requests` reach ~50 without `seconds_behind` exceeding 1 sec.

In addition, another improvement in this PR is that we convert
`PooledMixIn.active_requests` from a `set()` to a simple counter.
We use this variable only to add/remove requests, and we call
`len(self.active_requests)` to report their number.
Since we never use the set contents, there is no point keeping them
in memory.
We'll just do `self.active_requests += 1` / `-= 1` when necessary.
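A minimal sketch of that change (illustrative names; the lock is an assumption about guarding updates from multiple worker threads, not something specified in this PR):

```python
import threading

class RequestCounterSketch:
    """Counter in place of the old active_requests set (illustrative)."""
    def __init__(self):
        self.active_requests = 0
        self._lock = threading.Lock()  # assumed guard for cross-thread updates

    def request_started(self):
        with self._lock:
            self.active_requests += 1

    def request_finished(self):
        with self._lock:
            self.active_requests -= 1

    def status(self):
        # Previously len(self.active_requests); now the counter itself.
        return {'active_requests': self.active_requests}
```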

TODO: If this PR is OK up to this point, I also need to remove
`unaccepted_requests` from everywhere.


vbanos commented Mar 6, 2018

Incoming connections that cannot be accepted wait in the accept queue.
If we accept incoming connections ASAP, connection timeouts will not happen, although read timeouts may happen later.

In addition, to avoid dropping connections in the accept queue, we tune the sysctl variables `net.core.somaxconn=4096` and `net.ipv4.tcp_max_syn_backlog=4096`.
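For reference, one way to persist those settings (the file name is just an example; the values are exactly the ones mentioned above):

```
# /etc/sysctl.d/99-warcprox.conf (example location)
net.core.somaxconn = 4096
net.ipv4.tcp_max_syn_backlog = 4096
# apply without rebooting: sysctl --system
```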

@vbanos vbanos closed this Mar 6, 2018
@vbanos vbanos deleted the improve-incoming-request-handling branch April 15, 2019 19:17