
Abort closed clients to save capacity #1227

Merged
merged 1 commit into from Feb 27, 2017

Conversation

sirupsen
Contributor

@sirupsen sirupsen commented Feb 25, 2017

Have you ever wondered what happens when a client refreshes their browser really fast? According to Wireshark, Chrome (I haven't tested other browsers) will properly send an RST packet. At Shopify, we see this pattern a lot during flash sales where people are refreshing their browsers aggressively at the top of the hour pending a release.

This will propagate through your reverse proxies and finally update the state in the TCP backlog to a closing state. Your Ruby webserver, however, is less fortunate: it'll pick up the client, go through the Rack middleware stack, render the response, and write it to the client. Only then will it find out that it's been doing all of this on a one-directional socket, and it'll raise an error like Errno::EPIPE. Below is a picture of this scenario (it uses Unicorn as the webserver, since that's where I fixed this the first time around; replace it with Puma and it's the same):

When Req 2, which has been closed by the client, is accept(2)ed by Puma, it'll go through Puma's client handling and finally run @app.call. This means we can easily spend tens to hundreds of milliseconds rendering a response to a client whose connection is closed. This steals capacity from actual, legitimate users.

In 2012 Tom Burns submitted a patch to Unicorn addressing this problem. All credit to Tom Burns and Eric Wong for figuring this out 5 years ago. Interest was renewed internally recently because that patch doesn't work well for clients that are not on the loopback device. The link explains this in detail, so I won't duplicate it here, but feel free to ask questions! Eric and I collaborated on a new patch upstream that's pretty much identical to this one. It simply inspects the TCP state of the client socket it's accepting and aborts if the socket is in a closed or closing state. While we don't use Puma for Shopify Core, we happily use it in many other apps. I'd love to hear what you think about putting this patch in Puma as well. It saved us from many DDoSes in the early days and has saved us tremendous capacity during sneaker drops (and other sales). 😄

The production symptom for us is that Nginx emits a lot of 499s (client-aborted connections), while Rails emits 200s for those same requests. With a connection-abort patch, we can subtract the throughput of those 499s from the legitimate status codes.

I tested this patch with a simple script that'll start a request and close it immediately. If you run this script against a Puma server without this patch, you'll see 10 requests going through. If you run it with this patch, no requests will go through.
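For illustration, a script along those lines might look like the following (a sketch only; the method name and the details are mine, not the actual script linked in the PR). It opens a number of connections, writes the start of a request, then closes each socket before reading a response, mimicking a client that refreshes and bails:

```ruby
require "socket"

# Open `count` connections to host:port, write a request, and close each
# socket immediately without waiting for the response. The server will
# later discover it is writing to a half-closed connection.
def fire_and_abort(host, port, count: 10)
  count.times do
    sock = TCPSocket.new(host, port)
    sock.write("GET / HTTP/1.1\r\nHost: #{host}\r\n\r\n")
    sock.close # client is gone; server-side socket moves toward CLOSE_WAIT
  end
  count
end
```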


def closed_socket?(socket)
  tcp_info = socket.getsockopt(Socket::SOL_TCP, Socket::TCP_INFO)
  state = tcp_info.unpack("C".freeze)[0]
  # TCP_TIME_WAIT (6) through TCP_LAST_ACK (9)
  IGNORED_CHECK_CLIENT_SOCKET_STATES.cover?(state)
end
Contributor Author

You can see the definition of the tcp info struct here.


I guess you should ensure that socket.kind_of? TCPSocket.

Contributor Author

Great point, thanks @brotbert. I'll hold off on updating this PR so as not to lose this comment for others who may want to review.

Member

Please pull that "C" out into a constant, as Puma doesn't use the frozen-string pragma.

@@ -28,6 +28,9 @@ class Server
include Puma::Const
extend Puma::Delegation

# TCP_TIME_WAIT: 6, TCP_CLOSE: 7, TCP_CLOSE_WAIT: 8, TCP_LAST_ACK: 9
IGNORED_CHECK_CLIENT_SOCKET_STATES = (6..9)
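For reference, those numeric states come from the Linux kernel's TCP state enum in include/net/tcp_states.h (1-indexed, with TCP_ESTABLISHED == 1). A small lookup helper, purely illustrative and not part of the patch:

```ruby
# TCP states in kernel enum order (include/net/tcp_states.h, 1-indexed).
TCP_STATE_NAMES = %w[
  ESTABLISHED SYN_SENT SYN_RECV FIN_WAIT1 FIN_WAIT2 TIME_WAIT
  CLOSE CLOSE_WAIT LAST_ACK LISTEN CLOSING
].freeze

# Map a tcpi_state byte to its name; hypothetical helper for debugging.
def tcp_state_name(state)
  return "UNKNOWN" unless state.between?(1, TCP_STATE_NAMES.size)
  TCP_STATE_NAMES[state - 1]
end
```

So the ignored range 6..9 covers TIME_WAIT, CLOSE, CLOSE_WAIT, and LAST_ACK, matching the comment in the diff.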
Contributor Author

The enum for TCP states is defined here.

Member

TIME_WAIT isn't really applicable to the described use-case, correct?

Contributor Author

I haven't done prolonged testing of this, but if your TCP backlog is, say, 1000, it seems possible that something at the end of the backlog could go through a full close of the socket while it's still in the backlog. To be honest, I'm not entirely sure of the mechanics here in Linux's TCP/IP implementation: whether you "have" to accept(2) and close(2) a socket from the listening socket (somewhat like wait(2) on a zombie process), or whether you could ever accept a socket that's already in this state. I mostly included it for completeness across the socket states we definitely don't care about.

I found that we're also missing TCP_CLOSING (the server has buffered data it hasn't sent yet, but the other end has closed). I don't see how this would ever happen in Puma, but it's also not a socket state we're interested in.

For the Unicorn patch we'll be running in production in the coming weeks, I may patch it to print the socket state that we rejected. It'd be interesting to see over a month whether it's always CLOSE_WAIT (which it has been in my testing) or ever one of the others. I haven't nerded out with tcpdump and netstat enough to be completely confident it can end up in the other states. If you think it's key, we can certainly go with only CLOSE_WAIT, since it's the only one I've observed in testing.


For TIME_WAIT to appear, the application must already have called close() on the socket, which should be impossible here. I'd also bet it's not legal to call close() on such a socket, since that would imply closing it twice.

@nateberkopec
Member

Hullo Simon!

Thanks for such a detailed PR. Will take a look!

@sirupsen
Contributor Author

@nateberkopec 👋 long time no see

@pchaganti

👍

@SamSaffron

This is an improvement but still not ideal. I guess in a super ideal world you would patch the C binding for Postgres as well, to always double-check that the socket is still good prior to running queries, which would allow you to abort mid-request.

Also, since a middleware crawl can easily be a 10ms affair, does it not make sense to do one more check in the last middleware before handing the request to Rails?

I wonder if the Rack protocol should allow exposing this as well.

@nateberkopec
Member

I wonder if the Rack protocol should allow exposing this as well.

What's stopping Puma from just adding it to the environment?

@sirupsen
Contributor Author

I wonder if the Rack protocol should allow exposing this as well.

You'll be able to assert on puma.socket which is already in the environment.
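As a sketch of what asserting on puma.socket could look like, here is a hypothetical Rack middleware. Only the env["puma.socket"] key comes from the discussion; the class name, the 499 response, and the fallback behavior are my own choices, not anything Puma or Rack specifies:

```ruby
require "socket"

# Hypothetical middleware: skip the app entirely when the client socket
# that Puma exposes in the env is already in a closing/closed TCP state.
class AbortDeadClients
  # TIME_WAIT (6) through LAST_ACK (9), as in the patch under discussion.
  DEAD_STATES = (6..9)

  def initialize(app)
    @app = app
  end

  def call(env)
    sock = env["puma.socket"]
    if sock.kind_of?(TCPSocket) && dead?(sock)
      return [499, {}, []] # nginx's non-standard "client closed request"
    end
    @app.call(env)
  end

  private

  # Linux-only: tcpi_state is the first byte of the TCP_INFO struct.
  def dead?(sock)
    info = sock.getsockopt(Socket::SOL_TCP, Socket::TCP_INFO)
    DEAD_STATES.cover?(info.unpack("C")[0])
  rescue StandardError
    false # if the platform lacks TCP_INFO, assume the client is alive
  end
end
```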

does it not make sense to do one more check in the last middleware before handing the request to Rails?

This is certainly not a perfect heuristic, but it's been good enough for our use-case in production for almost 5 years. It's never going to be perfect; there's always going to be a race. I think the webserver should do the best job it can, and once the request is in Rack-land, the user shouldn't have to worry about this anymore. I don't really want to leak this into the application level. In other words, I don't think the upside of possibly catching a couple more requests outweighs the complexity of also maintaining a middleware.

I guess in a super ideal world you would patch the c binding for postgres as well to always double check if the socket is still good prior to running queries which would allow you to abort mid request.

I'm not entirely sure what you mean by this. Is your point that the idea behind this patch could be extended into pretty much every library that talks over a socket to fail early?

@SamSaffron

You'll be able to assert on puma.socket which is already in the environment.

The issue with that is that it's not part of the Rack spec. Instead, something like socket.alive? being a proc that can be called by the app would keep the socket safe and could be available across the ecosystem.

and once it's in Rack-land, the user shouldn't have to worry about this anymore

Sort of, in the natural course of events a big pile of SQL statements get executed making a monkey patch to "exec_async" which double checks if the connection to client is even there prior to running sql a pretty appealing patch. That would be the only patch I would really consider. downside is that a proc gets called a bunch of extra times.

@sirupsen
Contributor Author

Sort of, in the natural course of events a big pile of SQL statements get executed making a monkey patch to "exec_async" which double checks if the connection to client is even there prior to running sql a pretty appealing patch. That would be the only patch I would really consider. downside is that a proc gets called a bunch of extra times.

I have a hard time parsing this, sorry. Can you elaborate further on what you mean by this? Are you saying this patch will not even be considered, or are you suggesting an alternative patch? What does SQL have to do with this?

@SamSaffron

I have a hard time parsing this, sorry. Can you elaborate further on what you mean by this? Are you saying this patch will not even be considered, or are you suggesting an alternative patch? What does SQL have to do with this?

It's @evanphx's call what to do with this patch; I am not on this project.

All I am doing is suggesting that

  1. A feature request be made to Rack to add a socket.alive? proc, which enables the application to handle this if it so desires

  2. A proposed patch that applications could make to save some work, via a minimal monkey patch to either Rails or whatever SQL adapter they are using, which would allow you to catch this condition mid-request

@sirupsen
Contributor Author

A feature request be made to Rack to add a socket.alive? proc, which enables the application to handle this if it so desires

Who from the Rack team would be appropriate to tag on this?

I'll wait for @evanphx to weigh in on the overall direction of this patch.

@evanphx
Member

evanphx commented Feb 27, 2017

@sirupsen I left a comment about a frozen string usage above; otherwise this is fine.

Per @SamSaffron's discussion though, I think exploring the ability for puma to advertise that the client has disappeared is totally fine and should be done separate from this PR.

I'd prefer something more generic than checking the socket directly, like env['client_status'].call == :alive, which would be implemented by checking the socket or whatever the server thinks is the relevant signal (not everything will have directly connected sockets, for instance, and some servers might use a separate signal to implement this check).
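A sketch of that shape, with the env key and the :alive value taken from the comment above and everything else (the helper name, the :gone value) hypothetical:

```ruby
# The server stores an opaque callable in the Rack env; the app polls it
# instead of inspecting a socket directly. What "alive" means is entirely
# up to the server: a TCP_INFO check, a pipe heartbeat, anything.
def attach_client_status(env, liveness_check)
  env["client_status"] = -> { liveness_check.call ? :alive : :gone }
  env
end

# Usage: a socket-backed server would pass its own check; an app could then
# short-circuit expensive work with `env["client_status"].call == :alive`.
env = attach_client_status({}, -> { true })
env["client_status"].call
```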

Despite all that, it's still a separate feature than this. Make the frozen string change and I'll merge this.

@sirupsen sirupsen force-pushed the abort-clients branch 3 times, most recently from 373aafb to 65261d1 on February 27, 2017 18:22
@sirupsen
Contributor Author

sirupsen commented Feb 27, 2017

@evanphx updated with the frozen string change, a check for TCPSocket, and a change to stop depending on cover?, since a straight comparison is faster than Range#cover? for some reason (and because I added TCP_CLOSING). Thanks! I verified this patch again with the script linked in the description, and it still works perfectly.

Edit: The CI failure is happening on other branches too, so it doesn't seem related to this PR.
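The cover?-versus-comparison claim is easy to sanity-check with a throwaway benchmark (a sketch; absolute numbers vary by Ruby version and machine, and the constant names here are illustrative):

```ruby
require "benchmark"

RANGE = (6..9)
N = 500_000

# The two predicates must agree for every plausible state before timing.
(0..11).each do |s|
  raise "mismatch at #{s}" unless RANGE.cover?(s) == (s >= 6 && s <= 9)
end

Benchmark.bm(10) do |x|
  x.report("cover?")  { N.times { RANGE.cover?(8) } }
  x.report("compare") { N.times { s = 8; s >= 6 && s <= 9 } }
end
```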

@evanphx evanphx merged commit 61c9377 into puma:master Feb 27, 2017
hubot pushed a commit to defunkt/unicorn that referenced this pull request Mar 8, 2017
* Use a frozen empty array and a class variable for TCP_Info to avoid
  garbage. As far as I can tell, this shouldn't result in any garbage on
  any requests (other than on the first request).
* Pass listener socket to #read to only check the client connection on
  a TCP server.
* Short-circuit CLOSE_WAIT after ESTABLISHED, since in my testing it's the
  most common state after ESTABLISHED. This makes the numbers un-ordered,
  but the comment should make that OK.
* Define `check_client_connection` based on whether Raindrops::TCP_Info is
  defined, instead of the class-variable approach.
* Changed the unit tests to pass a `nil` listener.

Tested on our staging environment, and still works like a dream.

I should note that I got this patch into Puma as well!

puma/puma#1227

[ew: squashed in temporary change for oob_gc.rb, but we'll come
 up with a different change to avoid breaking gctools
 <https://github.com/tmm1/gctools>]

Acked-by: Eric Wong <e@80x24.org>
@schneems
Contributor

schneems commented Mar 9, 2017

Great work here. This is going to go out soon in 3.8.0.
