Improve incoming request handling by accepting all requests #73

Conversation


@vbanos vbanos commented Mar 4, 2018

Currently, when the number of active requests is greater than
`max_threads`, we do `time.sleep(0.05)` and check again indefinitely until
we can accept the request.
Also, when this happens, the `unaccepted_requests` counter is
incremented.

This creates an issue: the number of concurrent incoming requests can be
no more than the number of warcprox threads(!), which limits the capacity of
warcprox to handle incoming requests.
If we choose a small `max_threads` and have a fair number of browsers
using warcprox, a lot of requests will wait.
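For reference, the current behavior described above looks roughly like this (an illustrative sketch only, not the actual warcprox source; the class name and the `process_request_thread` stub are placeholders):

```python
import time
from concurrent.futures import ThreadPoolExecutor

class PooledMixInSketch:
    """Illustrative sketch of the current behavior, not the warcprox code."""
    def __init__(self, max_threads=20):
        self.max_threads = max_threads
        self.pool = ThreadPoolExecutor(max_workers=max_threads)
        self.active_requests = set()
        self.unaccepted_requests = 0

    def process_request(self, request, client_address):
        # Poll indefinitely, 50 ms at a time, until a worker thread is free;
        # only then accept the request.
        while len(self.active_requests) > self.max_threads:
            self.unaccepted_requests += 1
            time.sleep(0.05)
        self.active_requests.add(request)
        self.pool.submit(self.process_request_thread, request, client_address)

    def process_request_thread(self, request, client_address):
        pass  # actual request handling omitted
```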

Since the `ThreadPoolExecutor` we use has a queue, we can accept all
requests regardless of `max_threads` and serve them
when we can (the queue is FIFO).
The clients will have their connections accepted and may wait a bit for
a response, but they won't be blocked.
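The key property relied on here is that `ThreadPoolExecutor.submit()` never blocks: excess work simply waits in the executor's internal FIFO queue. A quick standalone demonstration (not warcprox code):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle(i):
    time.sleep(0.1)     # simulate serving a request
    return i

pool = ThreadPoolExecutor(max_workers=2)

# All ten submissions are accepted immediately even though only two
# workers exist; the remaining eight wait in the executor's FIFO queue.
futures = [pool.submit(handle, i) for i in range(10)]
print([f.result() for f in futures])   # -> [0, 1, 2, ..., 9]
```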

Another VERY important point is that we'll be able to keep
`max_threads` low.
In my performance tests on a VM with 4 CPU cores, I set `max_threads=20`
and saw `active_requests` reach ~50 without `seconds_behind` exceeding 1 sec.

In addition, another improvement in this PR is that we convert
`PooledMixIn.active_requests` from a `set()` to a simple counter.
We use this variable only to add/remove requests, and we call
`len(self.active_requests)` to report their number.
Since we never use the set contents, there is no point keeping them
in memory.
We'll just do `self.active_requests += 1` / `-= 1` when necessary.
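A minimal sketch of that change (illustrative names; the lock is an assumption about guarding updates from multiple worker threads, not something specified in this PR):

```python
import threading

class RequestCounterSketch:
    """Counter in place of the old active_requests set (illustrative)."""
    def __init__(self):
        self.active_requests = 0
        self._lock = threading.Lock()  # assumed guard for cross-thread updates

    def request_started(self):
        with self._lock:
            self.active_requests += 1

    def request_finished(self):
        with self._lock:
            self.active_requests -= 1

    def status(self):
        # Previously len(self.active_requests); now the counter itself.
        return {'active_requests': self.active_requests}
```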

TODO: If this PR is OK up to this point, I also need to remove
`unaccepted_requests` from everywhere.


vbanos commented Mar 6, 2018

Incoming connections that cannot be accepted wait in the accept queue.
If we accept incoming connections ASAP, connection timeouts will not happen, although read timeouts may happen later.

In addition, to avoid dropping connections in the accept queue, we tune the sysctl variables `net.core.somaxconn=4096` and `net.ipv4.tcp_max_syn_backlog=4096`.
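For reference, one way to persist those settings (the file name is just an example; the values are exactly the ones mentioned above):

```
# /etc/sysctl.d/99-warcprox.conf (example location)
net.core.somaxconn = 4096
net.ipv4.tcp_max_syn_backlog = 4096
# apply without rebooting: sysctl --system
```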

@vbanos vbanos closed this Mar 6, 2018
@vbanos vbanos deleted the improve-incoming-request-handling branch April 15, 2019 19:17