
Add max_connections limit to HTTP.listen() #647

Merged
merged 1 commit into JuliaWeb:master on Dec 30, 2020

Conversation

staticfloat
Contributor

It is possible for the Task scheduler to become overwhelmed by too many
small tasks; it provides a better experience to simply refuse to create
more tasks and apply back-pressure through an `accept()` backlog.
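
To illustrate the mechanism, here is a minimal sketch (not HTTP.jl's actual internals; `handle` is a hypothetical per-connection handler): once the task limit is reached, the loop simply stops calling `accept()`, so new clients queue up in the kernel's listen backlog.

```julia
using Sockets

const MAX_CONNECTIONS = 500
const slots = Base.Semaphore(MAX_CONNECTIONS)

server = Sockets.listen(IPv4(0), 8080)
while true
    # Blocks once MAX_CONNECTIONS tasks are live; pending clients
    # wait in the accept() backlog, giving us back-pressure for free.
    Base.acquire(slots)
    conn = Sockets.accept(server)
    @async try
        handle(conn)   # hypothetical per-connection handler
    finally
        close(conn)
        Base.release(slots)   # free a slot; the accept loop resumes
    end
end
```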
@codecov-io

codecov-io commented Dec 17, 2020

Codecov Report

Merging #647 (97ff836) into master (edc5e28) will decrease coverage by 0.24%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #647      +/-   ##
==========================================
- Coverage   77.63%   77.39%   -0.25%     
==========================================
  Files          36       36              
  Lines        2325     2353      +28     
==========================================
+ Hits         1805     1821      +16     
- Misses        520      532      +12     
Impacted Files | Coverage Δ
--- | ---
src/Servers.jl | 66.87% <100.00%> (+0.64%) ⬆️
src/RetryRequest.jl | 55.00% <0.00%> (-5.00%) ⬇️
src/Messages.jl | 92.48% <0.00%> (-0.76%) ⬇️
src/ConnectionRequest.jl | 51.76% <0.00%> (+3.43%) ⬆️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update edc5e28...97ff836.

@@ -229,6 +229,7 @@ function listen(f,
tcpisvalid::Function=tcp->true,
server::Union{Base.IOServer, Nothing}=nothing,
reuseaddr::Bool=false,
max_connections::Int=nolimit,
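
A usage sketch of the new keyword (the handler body, host, and port here are placeholders, not from the PR):

```julia
using HTTP

# Spawn at most 500 concurrent connection tasks; excess clients
# wait in the accept() backlog rather than overwhelming the scheduler.
HTTP.listen("0.0.0.0", 8080; max_connections=500) do http
    HTTP.setstatus(http, 200)
    HTTP.startwrite(http)
    write(http, "ok")
end
```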
Member
Do you think there's a reasonable default we could use here? 10K? 100K? Enough that "benchmarkers" don't see a hit, but one that helps others avoid whatever you ran into by default?

@staticfloat (Contributor, Author) commented Dec 19, 2020

I think it's very dependent on relative load; the issue is old tasks getting starved for work, so it all depends on how many requests per second your server can service. If you can serve 10k requests per second, then 10k tasks is probably fine, but if you can only serve 100 requests per second, then 100 is what you want.
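
One rough way to size it (a back-of-the-envelope sketch, assuming roughly a one-second budget for each request to queue and run; the numbers are hypothetical):

```julia
# limit ≈ sustainable throughput × time each request may spend queued + executing
sustainable_rps  = 100    # measured requests/second for this server
latency_budget_s = 1.0    # assumed budget per request, in seconds
max_connections  = round(Int, sustainable_rps * latency_budget_s)  # => 100
```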

I've tried creating more adaptive methods (like yield()'ing immediately to push new tasks onto the back of the scheduler's work queue), but nothing has performed quite as well as just limiting the number of tasks a priori.
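
For comparison, a rough sketch of that adaptive approach (same hypothetical `handle` as above):

```julia
using Sockets

server = Sockets.listen(IPv4(0), 8080)
while true
    conn = Sockets.accept(server)
    @async try
        handle(conn)   # hypothetical per-connection handler
    finally
        close(conn)
    end
    # Yield immediately so already-scheduled tasks get a chance to run
    # before we accept (and spawn a task for) the next connection.
    yield()
end
```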

I have a nice little benchmarking setup for PkgServer.jl that is based on k6; once I have it published, you can take a look, and maybe we can build some nice synthetic benchmarks for HTTP.jl that others can use to find realistic limits for their own applications.

It should be stated that the main use for `max_connections` is graceful degradation of service. Here's the k6 output for `max_connections=nolimit`:

[screenshot: 1thread_vanilla_http091]

And here's `max_connections=500` (an experimentally determined "good choice" for this hardware and workload):

[screenshot: 1thread_500conn_http091]

The first thing to notice is that the average requests per second (as eyeballed from the second-from-left graph in the top row) are about the same, which makes sense; we're not changing the amount of throughput available. The next thing to notice is that the overall duration of the connections is bounded at about 80ms in the second run, whereas it grows quite rapidly in the first. This is shown by the lower mean http_req_duration (leftmost text box), as well as the much "tighter" spread between the min/95%/max lines in the center-left graph. There are still some requests that get hung for some reason (haven't tracked those down yet), but they're a very small percentage of requests.

The tradeoff is that errors start to happen a little earlier in the process now, as you can see in the "Errors per second" graph. That being said, those errors fail much more quickly, and the rest of the clients maintain a much more responsive experience. But of course, the right limit will be completely application-dependent.

@staticfloat staticfloat reopened this Dec 27, 2020
@quinnj quinnj merged commit 31fcf05 into JuliaWeb:master Dec 30, 2020