
Http connections aborted after 5s / keepAliveTimeout #13391

Closed
pbininda opened this issue Jun 2, 2017 · 37 comments
Assignees
Labels
http Issues or PRs related to the http subsystem.

Comments

@pbininda
Contributor

pbininda commented Jun 2, 2017

  • Version: v8.0.0
  • Platform: Windows 10, 64bit
  • Subsystem: http

Short Description

Node 8 introduced a change in the handling of HTTP keep-alive connections. IMHO, this is (at least) a breaking change. When an HTTP server serves long-running requests (>5s) and the client requests a Connection: keep-alive connection, the server closes the connection after 5s. This can cause browsers to re-send the request, even if it is a POST request.

To Reproduce

clone https://github.com/pbininda/node8keepAliveTimeout and npm install. Then

    npm test

This starts a small Express server (server.js) and a client.

  • The server is a standard Express server with one long-running POST route (/longpost takes 10s).
  • The client calls POST /longpost, preceded by a preflight OPTIONS /longpost.

The test runs through fine on node 6 and node 7:

> node test.js

got request OPTIONS /longpost
got options response 200
sending post request
got request POST /longpost
got post response 200 { status: 'OK' }

but fails on node 8 with

> node test.js

got request OPTIONS /longpost
got options response 200
sending post request
got request POST /longpost
C:\Users\pbininda\projects\ATRON\node8keepAliveTimeout\client.js:39
            throw err;
            ^

Error: socket hang up
    at createHangUpError (_http_client.js:343:15)
    at Socket.socketOnEnd (_http_client.js:435:23)
    at emitNone (events.js:110:20)
    at Socket.emit (events.js:207:7)
    at endReadableNT (_stream_readable.js:1045:12)
    at _combinedTickCallback (internal/process/next_tick.js:102:11)
    at process._tickCallback (internal/process/next_tick.js:161:9)

Browser Retries

It seems most of the major browsers (Chrome, Firefox, Edge) implement https://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8.2.4. Since the server closes the connection on which it received the POST request before sending a response, the browsers re-send the POST. Note that you don't see the re-send in Chrome DevTools, but Wireshark shows the retransmission. To have a look at this, run

    npm start

which launches the server (server.js) and then load browsertest.html in chrome. This runs browsertest.js in the browser which does a simple $.ajax request against the server. On the server side you will see:

> node server.js

got request OPTIONS /longpost
got request POST /longpost
got request POST /longpost
format called 5003ms after previous

This shows that the server received two POST requests, the second one 5s after the first, even though the browser client code only makes one request.

Bug or Breaking Change?

I'm not sure if this is a bug or a breaking change. It was probably introduced by #2534. It only seems to happen when two connections are used (that's why the preflight OPTIONS is forced to happen in my code), so it may be that the wrong connection is being closed here.

Workaround

Setting the keepAliveTimeout (see https://nodejs.org/dist/latest-v8.x/docs/api/http.html#http_server_keepalivetimeout) of the http server to a value greater than the maximum duration of a request solves the problem. You can try this with

    npm start -- --keepAliveTimeout 20000

and then in another terminal

    node client.js
@ChALkeR
Member

ChALkeR commented Jun 2, 2017

/cc @indutny @tshemsedinov @aqrln
Could be related to #2534.

ChALkeR added the http label Jun 2, 2017
@aqrln
Contributor

aqrln commented Jun 2, 2017

Thanks for the ping, I'll be able to look into this in a few hours.

aqrln self-assigned this Jun 2, 2017
@lvpro

lvpro commented Jun 2, 2017

Thanks for bringing this up @pbininda. We hit this issue as well when running requests longer than a few seconds. We also see multiple POSTs affecting all browsers on 8 ... reverted back to 7.10 and all is well. Seems like a pretty pernicious bug. We're using Koa 1.x middleware on Linux.

@juanecabellob

juanecabellob commented Jun 2, 2017

Same behaviour here: a request took a bit longer than usual and the browser kept resending it. A general remark is that Firefox and Chrome did resend the request, but Safari didn't. We're using Express 4.15.2. We finally reverted to 7.10 and it works normally.

@pbininda
Contributor Author

pbininda commented Jun 3, 2017

Thanks for the note regarding Safari, I'll change the wording regarding "all major browsers" 😓

@aqrln
Contributor

aqrln commented Jun 5, 2017

Ugh... sorry for the delay, I was busier than I hoped. Let's fix it today. Thanks a lot for the detailed report and reproduction!

@lvpro

lvpro commented Jun 8, 2017

Confirmed the issue is still present in release 8.1 as well.

@aqrln
Contributor

aqrln commented Jun 8, 2017

@pbininda @lvpro @juanecabellob I'm very sorry for not making it in time for the 8.1 release. I took a look at the reproduction back then, but didn't have an opportunity to debug it until now. #13549 should fix it.

@lvpro

lvpro commented Jun 8, 2017

Don't be sorry @aqrln! Thank you very much for getting this resolved! :)

@pbininda
Contributor Author

pbininda commented Jun 8, 2017

@aqrln No problem, I can keep the server.keepAliveTimeout workaround in place until the fix is released. Thanks for your effort.

aqrln added a commit to aqrln/node that referenced this issue Jun 12, 2017
Fix the logic of resetting the socket timeout of keep-alive HTTP
connections and add two tests:

* `test-http-server-keep-alive-timeout-slow-server` is a regression test
  for nodejsGH-13391.  It ensures that the server-side keep-alive timeout will
  not fire during processing of a request.

* `test-http-server-keep-alive-timeout-slow-headers` ensures that the
  regular socket timeout is restored as soon as a client starts sending
  a new request, not as soon as the whole message is received, so that
  the keep-alive timeout will not fire while, e.g., the client is
  sending large cookies.

Refs: nodejs#2534
Fixes: nodejs#13391
aqrln closed this as completed in d71718d Jun 13, 2017
MylesBorins pushed a commit that referenced this issue Jun 13, 2017
Fix the logic of resetting the socket timeout of keep-alive HTTP
connections and add two tests:

* `test-http-server-keep-alive-timeout-slow-server` is a regression test
  for GH-13391.  It ensures that the server-side keep-alive timeout will
  not fire during processing of a request.

* `test-http-server-keep-alive-timeout-slow-client-headers` ensures that
  the regular socket timeout is restored as soon as a client starts
  sending a new request, not as soon as the whole message is received,
  so that the keep-alive timeout will not fire while, e.g., the client
  is sending large cookies.

Refs: #2534
Fixes: #13391
PR-URL: #13549
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Brian White <mscdex@mscdex.net>
MylesBorins pushed a commit that referenced this issue Jun 13, 2017
@awb99

awb99 commented Oct 15, 2017

I have this issue too with webpack-dev-server; it somehow does not work.

How would I pass the keepAliveTimeout parameter so that the default timeout is 20 seconds?

The example above is

npm start -- --keepAliveTimeout 20000

I tried this:

 npm run-script start -- --"keepAliveTimeout 20000"

It gets expanded to:

webpack-dev-server --inline --hot --history-api-fallback  --public --host 104.222.96.51 --port 9090 "--keepAliveTimeout 20000"

Unfortunately, it does not have any effect.

@apapirovski
Member

@MylesBorins You'll need the other commit: aaf2a1c — they probably should've landed as one or I should've indicated that the 2nd one depends on the first working properly. My bad.

@MylesBorins
Member

@apapirovski done

landed in eeef06a

jasnell pushed a commit that referenced this issue Dec 28, 2017
Documenting the best way to imitate the old behavior saves time for people
migrating from older versions. (E.g. for unexpected ECONNRESET)

It isn't immediately obvious if earlier nodejs versions behaved the same
way as nodejs 8 does with keepAliveTimeout = 0.
From 0aa7ef5, it seems like they behave
the same way.

Related to issues such as #13391 that show up when migrating to node 8

PR-URL: #17660
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
MylesBorins pushed a commit that referenced this issue Jan 8, 2018
MylesBorins pushed a commit that referenced this issue Jan 9, 2018
MylesBorins pushed a commit that referenced this issue Jan 9, 2018
ribizli referenced this issue in apollographql/apollo-server Jan 10, 2018
gibfahn pushed a commit that referenced this issue Jan 24, 2018
rdallman added a commit to fnproject/fdk-node that referenced this issue Apr 16, 2019
details of bug are here: nodejs/node#13391
docs here: https://nodejs.org/dist/latest-v8.x/docs/api/http.html#http_server_keepalivetimeout

tl;dr: Node sets a 5s idle timeout by default. This change does what Go and Java both seem to
do (I have not checked Python or Ruby, though from what I know about Python,
we are not doing this there either) and disables the idle timeout. Since in this
case we do somewhat trust that the client (fn) is using an idle timeout,
this seems like the right policy anyway. (What happens if fn dies is something to consider,
but FDK connections are the least of our worries in that case, and we kill fn-spawned
containers on startup too.) It should also be noted that the client (fn)
only uses 1 conn per container.
rdallman added a commit to fnproject/fdk-node that referenced this issue Apr 16, 2019
@tony-gutierrez

tony-gutierrez commented May 10, 2019

Would this have been present in Node 10.6.0? The test code above seems to pass.

@janswist

3 years and the bug still exists? How's that possible?

@jasnell
Member

jasnell commented Mar 25, 2020

How's that possible?

As with all things in Node.js, it requires someone free to work on it. Pull requests are always welcome. One thing that may be helpful is a reproduction of the issue in the form of a known_issue test that can be used to guide someone in making a fix.

@janswist

janswist commented Mar 25, 2020

How's that possible?

As with all things in Node.js, it requires someone free to work on it. Pull requests are always welcome. One thing that may be helpful is a reproduction of the issue in the form of a known_issue test that can be used to guide someone in making a fix.

Not intended to sound rude. I'm trying to solve that riddle for hours now - maybe someone else has it as well:

  • trying to send an 8 KB array of 50 items as the response, which takes 6.7 seconds to produce [ERROR]
  • when I slice(10) it, it suddenly takes 2 seconds and everything works like a charm.

My question would be: how is that possible? It seems like if the response data is too big (?) it just freezes and then times out.

I'm using Node 10.19.0. Thanks for your help.

@uri-chandler

uri-chandler commented Nov 25, 2020

In case this helps: I think there's an additional case which isn't related to the transmission size, but rather a race condition between the end of one request and the start of the next on the same connection.

  1. First request, new connection opened, keep-alive timer starts.
  2. First request ends
  3. Keep-alive timer kicks in, "timeout" event is about to be emitted on the socket
    3.1 First bytes of second request come in on the soon-to-be-destroyed socket
  4. "timeout" event is emitted on the socket, resulting in the destruction of the socket
    4.1 Second request gets dropped
  5. Second request's parsing (parseOnIncoming..) starts, trying to reset the timeout on the already-destroyed socket

Important:
Note that in this test (see "Reproduce" below) we're not transmitting large amounts of data on the connection, which is (I think) why f6a725e doesn't fix this issue. Put differently, I think it's a timing issue: one request ends (and the keep-alive timer is about to fire), and at (almost) the same time, a new request comes in.

*I'm not 100% sure this is the correct flow; it's just my best educated guess, based on some debugging.

Reproduce:

Tested Versions:

  • v8.16.1
  • v12.18.0
  • v12.19.1
  • v12.20.0
  • v14.7.0

Naive Approach to a Fix
Assuming my debug analysis is correct, there are two main ways to go about a fix:

  • either prevent the timeout event from firing on the socket if the socket is still in use,
  • or, whenever the timeout event has fired, check whether the socket is still in use, in which case do nothing.

Here's a naive code listing of the second approach:

// file: _http_server.js
function socketOnTimeout() {
  // "this" is the socket
  if (this.isInUse) {
    // a request is still in flight on this socket; do not destroy it
    return;
  }
  // ... existing timeout handling ...
}
