Puma v5.0.3+ times out slow connections after 30 seconds #2512

bmclean · 2020-12-11T23:16:49Z

Background
Our application is a Rails API for mobile clients running on Heroku. The product is agriculture-based and our users often have poor cellular coverage.

This API receives a Base64 encoded file (125K max size). Even though the file is small, on a poor cellular connection it can be slow to upload. The Heroku router's 30 second request timeout window doesn't start until after the client has sent its full payload, so there's nothing that should stop the payload from being received.

With Puma versions 5.0.2 and older this is exactly how it worked. Requests with Heroku service times of even 90 seconds were completed successfully.

Problem
However, starting with Puma 5.0.3 we began seeing H18 errors from Heroku (socket closed on the server) after 30 seconds. Puma is closing the socket. If we increase the Puma config property first_data_timeout (say, to 60 seconds), the errors occur less often.

Is this expected behaviour from Puma version 5.0.3 and newer? Should we be setting a large first_data_timeout to get 5.0.2-like functionality?

To Reproduce
We can reproduce the problem with a very simple Rails controller and a script. The script posts a large string payload while running a network link conditioner on a 3G profile. If this needs to be investigated further we can certainly provide them.
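
For readers who want to try this locally, here is a minimal sketch of a slow client in plain Ruby. It is not the script mentioned above (that one was not posted); it only approximates a poor connection by sending the request body in small slices with a pause between them. The method names, the port, and the /uploads route are all assumptions made up for illustration.

```ruby
require "socket"

# Build the request head for a POST whose body is sent separately.
def request_head(host, path, body_bytesize)
  "POST #{path} HTTP/1.1\r\n" \
  "Host: #{host}\r\n" \
  "Content-Type: text/plain\r\n" \
  "Content-Length: #{body_bytesize}\r\n" \
  "Connection: close\r\n\r\n"
end

# Split the body into fixed-size slices so they can be trickled out.
def body_chunks(body, chunk_size)
  (0...body.bytesize).step(chunk_size).map { |i| body.byteslice(i, chunk_size) }
end

# Send the head at once, then one slice every `delay` seconds, and return
# whatever the server answers (an empty string if it closes without replying).
def slow_post(host:, port:, path:, body:, chunk_size:, delay:)
  socket = TCPSocket.new(host, port)
  socket.write(request_head(host, path, body.bytesize))
  body_chunks(body, chunk_size).each do |chunk|
    socket.write(chunk)
    sleep delay
  end
  socket.read
rescue Errno::EPIPE, Errno::ECONNRESET => e
  "connection closed by server: #{e.class}"
ensure
  socket&.close
end

# Example (hypothetical local server and route, not from the thread):
#   payload = ["x" * 90_000].pack("m0")  # 120,000 bytes of Base64 text
#   puts slow_post(host: "localhost", port: 3000, path: "/uploads",
#                  body: payload, chunk_size: 2_000, delay: 1)
```

With the values in the commented example, the body takes about 60 seconds to send, which is past the 30 second mark where the errors were observed.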


sbilharz · 2020-12-12T19:55:20Z

We have the same issue. Since we updated from 5.0.2 to 5.0.4, our clients have been complaining about 408 responses, which never occurred before. We tracked them down to Puma: neither our load balancer nor the app code is producing them. According to our LB logs, the 408 response can occur anywhere from 0.2 to 15 seconds after forwarding, even though we do not set first_data_timeout and therefore assume it to be the default 30 seconds. I am not sure yet how this can be explained.

We will try downgrading to 5.0.2 for now. I hope we can find a way to get the behavior of that version and still receive the latest puma updates.

nateberkopec · 2020-12-15T17:26:19Z

So, like a couple of other bugs, what you're seeing here is something that was (in our view) fixed by the Reactor refactor in 5.0.3. first_data_timeout essentially did not work correctly. The behavior in 5.0.2 and earlier was a bug.

@bmclean Yes, you should set first_data_timeout to a higher value if you need Puma to buffer clients for more than 30 seconds.

@sbilharz That would depend on how your LB is configured. If it's opening a connection to Puma before the request is fully buffered, that would be consistent with the behavior you're seeing. I suggest also increasing first_data_timeout.

I believe 30 seconds is a sensible default. Higher values open you up more to slow client attacks.
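
The suggestion above amounts to a one-line change in the Puma config file. A minimal sketch, assuming the config lives at config/puma.rb and using the 60 second example value mentioned earlier in the thread (not a recommended value; pick one that fits your slowest legitimate clients):

```ruby
# config/puma.rb (assumed location)

# Give slow clients up to 60 seconds to deliver their request data.
# The default discussed in this thread is 30 seconds. 60 is only the
# example value from the original report, not a recommendation.
first_data_timeout 60
```

As noted above, a higher value trades tolerance for slow uploads against more exposure to slow client attacks.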

bmclean · 2020-12-15T17:56:45Z

Thanks @nateberkopec

bmclean · 2020-12-16T23:19:14Z

Hi @sbilharz. While figuring out how to get Puma 5.0.3+ to play nice with rack-timeout I was able to recreate the 408 response. The repo is here if you are curious.

schneems · 2020-12-17T03:34:51Z

> I believe 30 seconds is a sensible default.

Totally agree. Does the puma timeout reset with new bytes? The Heroku router will allow an indefinite connection as long as new bytes arrive in sub 30 second intervals.

> Higher values open you up more to slow client attacks.

I think the reactor inherently protects against slow client attacks even without a timeout. It's one of the main reasons I started recommending puma over unicorn (see https://devcenter.heroku.com/articles/deploying-rails-applications-with-the-puma-web-server#slow-clients). Though I'm not sure if there are more resource limitations from moving to NIO4r from native IO.select.
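
The question above, whether the timeout resets with new bytes, comes down to two different timeout models. The sketch below only illustrates the difference. It makes no claim about which model Puma implements, and the method names are made up for illustration. Arrival times are seconds since the connection was accepted, and the list is assumed to be non-empty.

```ruby
# Idle timeout: the clock restarts whenever data arrives, so only a silent
# gap longer than `timeout` closes the connection. This is the model the
# Heroku router is described as using above.
def idle_timeout_fires?(arrivals, timeout)
  ([0] + arrivals).each_cons(2).any? { |prev, cur| cur - prev > timeout }
end

# Absolute deadline: the clock starts once and never restarts, so the last
# byte of the request must arrive within `timeout` seconds.
def absolute_deadline_fires?(arrivals, timeout)
  arrivals.last > timeout
end

# A client that delivers a chunk every 10 seconds and finishes at 90 seconds.
steady_trickle = (1..9).map { |i| i * 10 }

idle_timeout_fires?(steady_trickle, 30)      # => false
absolute_deadline_fires?(steady_trickle, 30) # => true
```

A 90 second upload that never pauses for long passes under the first model and fails under the second, which is why the distinction matters for the slow mobile uploads described at the top of the thread.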

sbilharz · 2020-12-17T09:25:15Z

> Hi @sbilharz. While figuring out how to get Puma 5.0.3+ to play nice with rack-timeout I was able to recreate the 408 response. The repo is here if you are curious.

Very nice, thank you! That might save me a few hours of figuring out sensible values for us.

nateberkopec closed this as completed Dec 15, 2020

bmclean mentioned this issue Mar 29, 2021: Timeout during long file upload #2574 (closed)
