Fix rack.after_reply exceptions breaking connections #2861

BlakeWilliams · 2022-04-16T20:06:10Z

Description

Currently, when a rack.after_reply callable raises an exception we
attempt to handle it like other client errors by responding with an HTTP
500 response. This however doesn't work because rack.after_reply
callbacks are called after the response body has already been written to
the client.

This can cause issues with re-used connections. This is because 2 HTTP
responses are being returned for a single request. If a second HTTP
request is made before the error handling logic completes the timing can
line up causing the second HTTP response to be served a 500 from the
first HTTP requests rack.after_reply callbacks raising.

That may look roughly like:

Request 1 starts, opening a reusable TCP connection
Request 1 is written to and "completed"
Request 1 rack.after_reply callables are called
Request 2 starts, reusing the same TCP connection as request 1
rack.after_reply raises, calls client_error and serves a 500
response
Request 2 receives the 500 response.

This is somewhat difficult to reproduce using HTTP clients since it's a
race condition whether or not the 500 is written at the "correct" time
or not.

To prevent this issue the rack.after_reply callables are now wrapped
in a begin/rescue/end block that rescues from StandardError and logs
instead of attempting to serve a 500 response.

Closes #2856

Your checklist for this pull request

I have reviewed the guidelines for contributing to this repository.
I have added (or updated) appropriate tests if this PR fixes a bug or adds a feature.
My pull request is 100 lines added/removed or less so that it can be easily reviewed.
If this PR doesn't need tests (docs change), I added [ci skip] to the title of the PR.
If this closes any issues, I have added "Closes #issue" to the PR description or my commit messages.
I have updated the documentation accordingly.
All new and existing tests passed, including Rubocop.

Currently, when a `rack.after_reply` callable raises an exception we attempt to handle it like other client errors by responding with an HTTP 500 response. This however doesn't work because `rack.after_reply` callbacks are called after the response body has already been written to the client. This can cause issues with re-used connections. This is because 2 HTTP responses are being returned for a single request. If a second HTTP request is made before the error handling logic completes the timing can line up causing the second HTTP response to be served a 500 from the first HTTP requests `rack.after_reply` callbacks raising. That may look roughly like: 1. Request 1 starts, opening a reusable TCP connection 2. Request 1 is written to and "completed" 3. Request 1 `rack.after_reply` callables are called 4. Request 2 starts, reusing the same TCP connection as request 1 5. `rack.after_reply` raises, calls `client_error` and serves a 500 response 6. Request 2 receives the 500 response. This is somewhat difficult to reproduce using HTTP clients since it's a race condition whether or not the 500 is written at the "correct" time or not. To prevent this issue the `rack.after_reply` callables are now wrapped in a begin/rescue/end block that rescues from `StandardError` and logs instead of attempting to serve a 500 response.

BlakeWilliams · 2022-04-16T20:08:34Z

test/test_rack_server.rb

+    # * If we would block trying to read from the socket, we can assume that
+    #   the erroneous 500 response wasn't/won't be written.
+    sleep 0.1
+    assert_raises IO::EAGAINWaitReadable do


I'm not the biggest fan of this test, but using Net::HTTP was flaky and the problem is pretty low-level so this is what I ended up with. It is technically testable using Net::HTTP but you have to instance_variable_get(:@socket) on the connection and I wasn't a huge fan of diving into stdlib internals.

Happy to get feedback/suggestions on alternatives here if anyone has any.

nateberkopec · 2022-04-16T22:13:09Z

Now this is a high-quality first time contribution to a project! 🥇

nateberkopec · 2022-04-16T22:15:43Z

Looks good to me but I'll leave it open for a bit to see if anyone else has anything to say

MSP-Greg · 2022-04-16T22:18:26Z

This passes for me locally. The one thing not sure about is whether the begin/rescue block should be inside or outside the after_reply.each loop?

BlakeWilliams · 2022-04-17T00:44:31Z

This passes for me locally. The one thing not sure about is whether the begin/rescue block should be inside or outside the after_reply.each loop?

Same, I'm wondering if it's platform differences since most of the failures are on Windows. I'll push up what I think may fix it in a bit.

I chose to put the rescue outside of the each block to keep the existing behavior where a callback raising halts the rest of the chain. Happy to change it if that's the direction y'all prefer, but it may be somewhat of a breaking change. There's a broader discussion about this behavior too in the rack spec over in the rack repo if you're interested.

BlakeWilliams · 2022-04-17T01:16:43Z

It looks like d044ed2 fixed the failures for the windows builds but all Ruby HEAD builds are still failing. I ran the suite against master in my fork too and it looks like Ruby HEAD is failing there too (my fork is 1 commit behind though, if that matters).

MSP-Greg · 2022-04-17T01:56:50Z

@BlakeWilliams

Ignore the TestIntegrationPumactl#test_prune_bundler_with_multiple_workers failures, as this PR doesn't affect those tests.

Puma may need some changes for issues related to default gems...

EDIT: Most of the std-lib has been 'gemified', and testing against Ruby master in the actual gem repos has also caused issues.

MSP-Greg · 2022-04-17T01:59:18Z

Also, OS socket errors may differ between Ubuntu, macOS, and Windows. Kind of a pita when one is wondering why test failures vary...

* Fix rack.after_reply exceptions breaking connections Currently, when a `rack.after_reply` callable raises an exception we attempt to handle it like other client errors by responding with an HTTP 500 response. This however doesn't work because `rack.after_reply` callbacks are called after the response body has already been written to the client. This can cause issues with re-used connections. This is because 2 HTTP responses are being returned for a single request. If a second HTTP request is made before the error handling logic completes the timing can line up causing the second HTTP response to be served a 500 from the first HTTP requests `rack.after_reply` callbacks raising. That may look roughly like: 1. Request 1 starts, opening a reusable TCP connection 2. Request 1 is written to and "completed" 3. Request 1 `rack.after_reply` callables are called 4. Request 2 starts, reusing the same TCP connection as request 1 5. `rack.after_reply` raises, calls `client_error` and serves a 500 response 6. Request 2 receives the 500 response. This is somewhat difficult to reproduce using HTTP clients since it's a race condition whether or not the 500 is written at the "correct" time or not. To prevent this issue the `rack.after_reply` callables are now wrapped in a begin/rescue/end block that rescues from `StandardError` and logs instead of attempting to serve a 500 response. * Assert against less specific exception

BlakeWilliams commented Apr 16, 2022

View reviewed changes

nateberkopec added the bug label Apr 16, 2022

Assert against less specific exception

d044ed2

nateberkopec merged commit b2283d8 into puma:master Apr 17, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix rack.after_reply exceptions breaking connections #2861

Fix rack.after_reply exceptions breaking connections #2861

BlakeWilliams commented Apr 16, 2022

BlakeWilliams Apr 16, 2022

nateberkopec commented Apr 16, 2022 •

edited

nateberkopec commented Apr 16, 2022

MSP-Greg commented Apr 16, 2022

BlakeWilliams commented Apr 17, 2022 •

edited

BlakeWilliams commented Apr 17, 2022

MSP-Greg commented Apr 17, 2022 •

edited

MSP-Greg commented Apr 17, 2022 •

edited

Fix rack.after_reply exceptions breaking connections #2861

Fix rack.after_reply exceptions breaking connections #2861

Conversation

BlakeWilliams commented Apr 16, 2022

Description

Your checklist for this pull request

BlakeWilliams Apr 16, 2022

Choose a reason for hiding this comment

nateberkopec commented Apr 16, 2022 • edited

nateberkopec commented Apr 16, 2022

MSP-Greg commented Apr 16, 2022

BlakeWilliams commented Apr 17, 2022 • edited

BlakeWilliams commented Apr 17, 2022

MSP-Greg commented Apr 17, 2022 • edited

MSP-Greg commented Apr 17, 2022 • edited

nateberkopec commented Apr 16, 2022 •

edited

BlakeWilliams commented Apr 17, 2022 •

edited

MSP-Greg commented Apr 17, 2022 •

edited

MSP-Greg commented Apr 17, 2022 •

edited