Use Net::HTTP#start(&block) to ensure closed TCP connections #1117
Despite the API indicating otherwise, using Net::HTTP#get (or #request) opens a TCP connection that remains open long after the request is finished, even though subsequently calling Net::HTTP#finish raises an IOError and #to_s shows an incorrect "open=false".

Using the Net::HTTP#start block to explicitly open the connection causes Net::HTTP to fully close the TCP connection once the block completes evaluation.

This was verified by doing the following:

1. Running a webserver on a *nix OS (selected default nginx on Ubuntu docker)
2. Running `watch 'netstat -tunapl'` on that OS
3. Run `curl` to GET that webserver many times. Note how there are no "TIME_WAIT" sockets
4. In IRB run `Net::HTTP.get(...)` many times. Note how there are no "TIME_WAIT" sockets
5. In IRB with `rest-client` run `RestClient.get(...)` many times. Note how there are no "TIME_WAIT" sockets
6. In IRB with `faraday` run `Faraday.get(...)` many times. Note how there are n "TIME_WAIT" sockets on the webserver OS. They time out and clear after 60s:

```
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      3511/nginx: master
tcp        0      0 172.17.0.2:80           172.17.0.1:60470        TIME_WAIT   -
tcp        0      0 172.17.0.2:80           172.17.0.1:60472        TIME_WAIT   -
tcp        0      0 172.17.0.2:80           172.17.0.1:60478        TIME_WAIT   -
tcp        0      0 172.17.0.2:80           172.17.0.1:60476        TIME_WAIT   -
tcp        0      0 172.17.0.2:80           172.17.0.1:60468        TIME_WAIT   -
tcp        0      0 172.17.0.2:80           172.17.0.1:60474        TIME_WAIT   -
...
```

This matters because a server receiving many HTTP requests from clients using Faraday will begin to oversaturate and time out past a particular scale. We ran into this limit.

In short, #start must be used *first* before any requests are made on a Net::HTTP instance, otherwise you'll get a dangling TCP connection server-side.
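As a rough illustration of the block form described above, here is a minimal sketch. It swaps the nginx container for a throwaway in-process `TCPServer` so it can run anywhere; that server, the `127.0.0.1` address, and the variable names are assumptions made for this example and are not part of the PR.

```ruby
require 'net/http'
require 'socket'

# Throwaway local server (an assumption for this sketch, standing in for nginx).
# It answers every connection with a tiny 200 response and then closes it.
server = TCPServer.new('127.0.0.1', 0) # port 0: let the OS pick a free port
port = server.addr[1]
server_thread = Thread.new do
  loop do
    client = server.accept
    # Read the request head up to the blank line that ends it.
    loop do
      line = client.gets
      break if line.nil? || line == "\r\n"
    end
    client.write("HTTP/1.1 200 OK\r\nContent-Length: 2\r\nConnection: close\r\n\r\nok")
    client.close
  end
end

# The third argument (nil) skips any proxy configured in the environment.
http = Net::HTTP.new('127.0.0.1', port, nil)
started_inside = nil

# The connection is opened when the block begins and finished when it returns.
response = http.start do |conn|
  started_inside = conn.started?
  conn.get('/')
end

started_after = http.started?
puts "inside block: #{started_inside}, after block: #{started_after}, body: #{response.body}"

server_thread.kill
server.close
```

The block form returns whatever the block returns, so the response is still available after the connection has been finished.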
Build failures match my local rspec failures that occur when running suite against
Hi @f3ndot, thanks for the contribution, it makes sense after having a quick look.
@f3ndot for some reason the "Update branch" button is not available 🤔...
I was mistaken and it is a different beast than raw net/http. The issue does not apply on the persistent adapter.
@iMacTia all set! I had to remove it for net-persistent-http since they have a bit of a different interface. As far as I can tell the user invokes #shutdown when they're done with a connection anyhow.
Looks good, tests are green, seems to solve an important issue 😃
LGTM
Hey friends, any chance we could get a release cut on this?
Hi @iMacTia 👋 , do you guys know when there might be a release cut for this? thanks!
Sorry about the delay! Waiting for #1125 to get merged as well and then I guess we can get a release done 😃
Faraday v1.0.1 is out 🎉!
@iMacTia thank you for doing this. I dug deeper into the "why" and figured I'd post my findings here. I was originally mistaken in describing the issue. The […]

In HTTP/1.0 the default server behaviour was to close each and every HTTP request it received (aka behave as if HTTP header `Connection: close` was sent).

With HTTP/1.1 the spec changed the default server behaviour to keeping the connection alive (akin to `Connection: keep-alive`).

The smoking gun is code in Ruby that's over 20 years old. It assumes all HTTP servers are 1.0 and explicitly adds a `Connection: close` header when a request is made on an instance that has not been started:

```ruby
require 'net/http'

net = Net::HTTP.new('localhost', 8080)
50.times { net.get('/') } # is bad: connection close header appended

50.times { Net::HTTP.get(URI('http://localhost:8080/')) } # is OK: no header appended

net = Net::HTTP.new('localhost', 8080)
net.start
50.times { net.get('/') } # is OK: no header appended
net.finish

net = Net::HTTP.new('localhost', 8080)
50.times { net.start { net.get('/') } } # is OK: no header appended
```

The fact the client is making assumptions about what the server wants and explicitly adding an […]

In any event, I ended up filing a bug report to Ruby about it: https://bugs.ruby-lang.org/issues/16559
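One way to observe the header difference described in this comment without a packet capture is to let a small local server record what each request carries. The sketch below is only illustrative: the in-process `TCPServer`, the `seen` queue, and the variable names are assumptions for this example, and the exact headers sent may vary between Ruby versions.

```ruby
require 'net/http'
require 'socket'

# Connection header value of each request, as seen by the server (nil if absent).
seen = Queue.new

# Throwaway local server (an assumption for this sketch).
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]
server_thread = Thread.new do
  loop do
    client = server.accept
    connection_header = nil
    # Scan the request head for a Connection header, case-insensitively.
    loop do
      line = client.gets
      break if line.nil? || line == "\r\n"
      name, value = line.split(':', 2)
      connection_header = value.strip if value && name.casecmp?('connection')
    end
    seen << connection_header
    client.write("HTTP/1.1 200 OK\r\nContent-Length: 2\r\nConnection: close\r\n\r\nok")
    client.close
  end
end

# Case 1: request on an instance that was never started.
Net::HTTP.new('127.0.0.1', port, nil).get('/')
unstarted_header = seen.pop

# Case 2: request inside an explicit #start block.
Net::HTTP.new('127.0.0.1', port, nil).start { |conn| conn.get('/') }
started_header = seen.pop

puts "unstarted: #{unstarted_header.inspect}"
puts "started:   #{started_header.inspect}"

server_thread.kill
server.close
```

Recording the header on the server side keeps the check independent of how the client library reports its own state.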
Thanks @f3ndot for the extensive explanation, I'm sure future readers will find this extremely helpful 😃
I don't necessarily agree that it does... as mentioned by @jeremyevans on the Ruby bug report filed by @f3ndot, this is just making a different trade-off; it also fundamentally changed the behaviour of the Faraday adapter w.r.t. HTTP connections, so it was far more than a patch release, really ( […]

Furthermore, this change now explicitly prevents the use of persistent […]

I'm fully aware of […]

And even the […]

The changes in this PR explicitly make the native […]

At the very least it should be configurable, and arguably closing the connection after each request shouldn't even be the default behaviour of the adapter, but rather something you can tell it to do; that way you keep parity with previous Faraday behaviour, and you can still achieve what this PR is doing, just not by hard coding it.

I see the adapter has since moved to a separate gem so I'll cobble together a quick PR for y'all's consideration. 🙂
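To make the "it should be configurable" suggestion concrete, here is a hypothetical sketch of a client that either wraps every request in a `#start` block or keeps one started connection open for reuse. `SketchClient`, its `persistent:` option, and the local keep-alive server are invented for illustration; none of this is Faraday's adapter API or the follow-up PR mentioned above.

```ruby
require 'net/http'
require 'socket'

# Hypothetical wrapper for illustration only; this is not Faraday's adapter API.
class SketchClient
  def initialize(host, port, persistent:)
    @http = Net::HTTP.new(host, port, nil) # nil: skip any proxy from the environment
    @persistent = persistent
  end

  def get(path)
    if @persistent
      @http.start unless @http.started? # open once, reuse afterwards
      @http.get(path)
    else
      @http.start { |conn| conn.get(path) } # open and close around each request
    end
  end

  def close
    @http.finish if @http.started?
  end
end

# Throwaway keep-alive server (an assumption for this sketch) that counts
# how many TCP connections it accepts.
accepted = 0
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]
server_thread = Thread.new do
  loop do
    client = server.accept
    accepted += 1
    # Serve requests on this connection until the client closes it.
    while (line = client.gets)
      next unless line == "\r\n" # keep reading until the end of the request head
      client.write("HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
    end
    client.close
  end
end

per_request = SketchClient.new('127.0.0.1', port, persistent: false)
3.times { per_request.get('/') }
per_request_connections = accepted

persistent = SketchClient.new('127.0.0.1', port, persistent: true)
3.times { persistent.get('/') }
persistent.close
persistent_connections = accepted - per_request_connections

puts "per-request mode: #{per_request_connections} connections"
puts "persistent mode:  #{persistent_connections} connections"

server_thread.kill
server.close
```

With an option like this the caller decides between closing after every request and reusing a connection, instead of the behaviour being hard-coded either way.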
Thanks a lot @pvdb, the main reason for us moving adapters out is to make it easier for others to contribute on these kinds of issues, since the core team can't possibly know all the ins and outs of each adapter. I'm sure a PR on the new