Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AssertionError crash at ServerResponse.resOnFinish #2639

Closed
ghost opened this issue Sep 2, 2015 · 83 comments
Closed

AssertionError crash at ServerResponse.resOnFinish #2639

ghost opened this issue Sep 2, 2015 · 83 comments
Labels
http Issues or PRs related to the http subsystem.

Comments

@ghost
Copy link

ghost commented Sep 2, 2015

Running io.js 3.2 installed from source on a centOS 7 machine, here are the logs:
http://pastebin.com/ix0Ke5zS

@Fishrock123 Fishrock123 added the http Issues or PRs related to the http subsystem. label Sep 2, 2015
@ghost
Copy link
Author

ghost commented Sep 2, 2015

I was fiddling around with my server on digital ocean and got this screen when checking out their embedded terminal.
http://i.imgur.com/d0jc9zG.png
Could it be insufficient RAM that caused the crash? I am running a mongodb instance on the server and using libraries that are a little bulky on RAM.
I checked the RAM usage once and I had around 70 or 50 MB idle. I wasn`t looking how much RAM I had free at the time of the crashes.

Edit: total RAM of the server is 512.

@mscdex
Copy link
Contributor

mscdex commented Sep 2, 2015

Did you try with other versions, like io.js 2.5.0?

@ghost
Copy link
Author

ghost commented Sep 2, 2015

No, this just happened on a live instance. I have been running 3.2 on a different instance until yesterday, but the other machine had 1gb RAM. I will try upgrading the server and see if it happens again.

@ghost
Copy link
Author

ghost commented Sep 2, 2015

Yeah, people have been posting and so far, so good. "top" indicates about 400MB free RAM, so I really guess the issue was with RAM causing trouble down the road.
I don't know if its possible for io.js to handle that and give a more intuitive error. If it is, it could be useful in a scenario like this.

@ghost ghost closed this as completed Sep 2, 2015
@ghost
Copy link
Author

ghost commented Sep 2, 2015

Actually, it still crashes, just isn't so often so the application will stop.
Edit: it just caused it to stop again.

@ghost ghost reopened this Sep 2, 2015
@ghost
Copy link
Author

ghost commented Sep 2, 2015

Crashed with 3.3.0, trying with 2.5.0.

Edit: so far, it seems 2.5.0 is going through the same scenarios that caused 3.x to crash and there was absolutely zero failures so far.

@ghost
Copy link
Author

ghost commented Sep 4, 2015

Update: someone pointed digital ocean doesn`t give you swap memory. So it might be possible the crashes were caused by a RAM peak?

@arthurschreiber
Copy link

We see the same error with Node 4.0 on our RHEL 6 machines. We haven't seen this issue with io.js 2.5.0.

@arthurschreiber
Copy link

This seems to be related to #1411. @indutny Any idea what's happening here?

Stacktrace:

AssertionError: false == true
    at ServerResponse.detachSocket (_http_server.js:144:3)
    at ServerResponse.resOnFinish (_http_server.js:483:11)
    at emitNone (events.js:67:13)
    at ServerResponse.emit (events.js:166:7)
    at finish (_http_outgoing.js:526:10)
    at doNTCallback0 (node.js:407:9)
    at process._tickDomainCallback (node.js:377:13)

@fasiha
Copy link

fasiha commented Sep 20, 2015

I'm running into the same issue with node 4.1.0 using just Express. Looks like socket.io folks are seeing this too: socketio/socket.io#2236 😭 Definitely not RAM: 16 GB here.

@fasiha
Copy link

fasiha commented Sep 20, 2015

I may have fixed it with an npm update to pull in updated dependencies (instead of just using whatever version was saved when I did npm install X --save who-knows-when).

@simplier
Copy link

I'm running into the same issue with iojs 3.0.0 :( Any update on this?

@fasiha how are you going with npm update?

@fasiha
Copy link

fasiha commented Sep 21, 2015

@simplier Node 4.1.0 produces the assertion failure less often after npm update but it still does it every few minutes. I just restart nodemon and usually it resumes working (i.e., properly serving requests without crashing) right away. 😖

@tpfeifle
Copy link

I'm running into the same issue everytime I send a server-request with node 4.1.0 and also after running npm update.
Did anyone already get behind this issue?

@arthurschreiber
Copy link

I tried to debug this a bit, and I can see that when ServerResponse#detachSocket is called, socket._httpMessage is already set to null, and that's the reason this assertion fails.

@simplier
Copy link

no luck with iojs 3.x, come back to iojs 2.5 and works fine

@arthurschreiber
Copy link

Did anyone who faces the issue try to create a minimal test case? I tried to build one, but I can't get this error reproduced, as I don't yet understand what is actually causing this.

@ghost
Copy link
Author

ghost commented Sep 22, 2015

@arthurschreiber No idea, I just figured the error happens with a more significant load.

@fasiha
Copy link

fasiha commented Sep 22, 2015

@mrseth no, it sometimes happens on the very first request I make to express, immediately after starting node 😩

@tpfeifle
Copy link

@fasiha in my case it always happens on the first request ... had to downgrade to keep developing

@fasiha
Copy link

fasiha commented Sep 23, 2015

Here's dependencies in my package.json (all currently set to "latest" versions):

  • babel-core
  • bluebird
  • body-parser
  • child-process-promise
  • compression
  • consolidate
  • cookie-parser
  • copy-paste
  • debug
  • ejs
  • emailjs
  • express
  • express-session
  • lodash
  • morgan
  • mustache
  • passwordless
  • passwordless-memorystore
  • passwordless-rethinkdbstore
  • request
  • rethinkdb
  • sendgrid
  • serve-favicon
  • session-rethinkdb

Perhaps if we get enough of these dependency lists, taking the intersection of them all will identify the problem? I'm on Mac.

@arthurschreiber
Copy link

@fasiha Are you running a http or https server? Can you try disabling different modules and see what combination still triggers that error on your side? In our application, this error only happens during high load and so far I've not found a way to reproduce. 😞

@vandernorth
Copy link

Using node 4.1.1 with express and socket.io.

I get the following stack trace

assert.js:89
  throw new assert.AssertionError({
  ^
AssertionError: false == true
    at ServerResponse.resOnFinish (_http_server.js:474:7)
    at emitNone (events.js:67:13)
    at ServerResponse.emit (events.js:166:7)
    at finish (_http_outgoing.js:529:10)
    at doNTCallback0 (node.js:407:9)
    at process._tickCallback (node.js:336:13)

Its hard to reproduce but with some debugging I found the following:
screen shot 09-23-15 at 12 40 pm

In _http_server.js the incoming array's length = 2. And the current request is at index 1 instead of index 0. That's why the assertion fails.

The first request is for the favicon and the second is for socket.io in this case, not sure if that's relevant.

I'm not sure where to go from here, any idea's?

@arthurschreiber
Copy link

Trying to debug this further. I added an express route that basically patches the detachSocket like this:

  var detachSocket = res.detachSocket;
  res.detachSocket = function(socket) {
    if (socket._httpMessage !== this) {
      req.logger.error("Wrong httpMessage set on socket");
      console.log(this, socket._httpMessage);
    }

    // Revert
    res.detachSocket = detachSocket;
    return res.detachSocket.apply(this, arguments);
  };

The output is quite interesting:

  • socket._httpMessage is set to null. That's why the assertion in the original detachSocket function fails.
  • this._last is true.
  • this.socket and this.connection are null.
  • socket is a TLSSocket instance:
  • socket.destroyed is true.

I don't really understand how this can happen. It looks like the socket never got attached, or that it tries to detach the socket twice.

@ghost
Copy link
Author

ghost commented Sep 23, 2015

@arthurschreiber maybe a stack corruption on C++ code?

intermittent
happens more frequently on higher usage scenarios

@arthurschreiber
Copy link

@indutny @bnoordhuis Any idea what could be wrong? Any idea how we could better debug this?

@indutny
Copy link
Member

indutny commented Sep 25, 2015

Relevant PR: #3059

@indutny
Copy link
Member

indutny commented Sep 25, 2015

Just in case, it will be merged as soon as two things will happen:

  1. It will be confirmed to fix the problem by one (several) of the reporters of this issue
  2. It will be reviewed by team member.

So if you'll get a chance - please test this patch and let us know if it fixes the bug for you!

@mareksrom
Copy link

now it does not crash, but it does not load at all... It was another type of behaviour - if node didn't crash some requests took exactly 30 seconds (on ios network monitor), on server few ms... Now after the patch it keeps loading for few minutes, I can watch network monitor which request, but it is not quite ok...

@indutny
Copy link
Member

indutny commented Sep 25, 2015

@mareksrom does it hang if you'll replace finish with prefinish in _http_server.js?

@mareksrom
Copy link

before this patch it didn't... I added some logs, will try and let you know...

@indutny
Copy link
Member

indutny commented Sep 25, 2015

@mareksrom argh... stupid me. Try this one:

diff --git a/lib/_http_outgoing.js b/lib/_http_outgoing.js
index 0522ecb..da54247 100644
--- a/lib/_http_outgoing.js
+++ b/lib/_http_outgoing.js
@@ -71,6 +71,7 @@ function OutgoingMessage() {
   this._header = null;
   this._headers = null;
   this._headerNames = {};
+  this._prefinish = false;
 }
 util.inherits(OutgoingMessage, Stream);

@@ -526,7 +527,14 @@ OutgoingMessage.prototype.end = function(data, encoding, callback) {

   var self = this;
   function finish() {
-    self.emit('finish');
+    // NOTE: It is very important to wait for `prefinish` event.
+    // It is emitted only when the response becomes active in the pipeline.
+    // The ordering of requests/responses in `_http_server.js` is dependent
+    // on this, doing it in different order will result in crash.
+    if (self._prefinish)
+      self.emit('finish');
+    else
+      self.on('prefinish', finish);
   }

   if (typeof callback === 'function')
@@ -586,6 +594,7 @@ OutgoingMessage.prototype.end = function(data, encoding, callback) {

 OutgoingMessage.prototype._finish = function() {
   assert(this.connection);
+  this._prefinish = true;
   this.emit('prefinish');
 };

@mareksrom
Copy link

I put away second patch and here what's going on:

incoming push 0 /
finish 1 /
 - I(f) 0 /
incoming push 0 /myWAC/System/css.asp?module=styles.css&v=34
finish 1 /myWAC/System/css.asp?module=styles.css&v=34
 - I(f) 0 /myWAC/System/css.asp?module=styles.css&v=34
incoming push 0 /myWAC/System/css.asp?module=reset.css&v=1
incoming push 1 /myWAC/System/jscript.asp?module=websocket.js&v=227
 - I(p) 0 /myWAC/System/css.asp?module=reset.css&v=1
incoming push 2 /myWAC/System/jscript.asp?module=topwindow.js&v=2689
 - I(p) 0 /myWAC/System/css.asp?module=reset.css&v=1
 - I(p) 1 /myWAC/System/jscript.asp?module=websocket.js&v=227
incoming push 0 /myWAC/System/css.asp?module=global.css&v=238
incoming push 1 /myWAC/System/jscript.asp?module=selection.js&v=1856
 - I(p) 0 /myWAC/System/css.asp?module=global.css&v=238
incoming push 2 /myWAC/System/jscript.asp?module=resources.js&v=55
 - I(p) 0 /myWAC/System/css.asp?module=global.css&v=238
 - I(p) 1 /myWAC/System/jscript.asp?module=selection.js&v=1856
incoming push 3 /myWAC/System/jscript.asp?module=utils.js&v=2851
 - I(p) 0 /myWAC/System/css.asp?module=global.css&v=238
 - I(p) 1 /myWAC/System/jscript.asp?module=selection.js&v=1856
 - I(p) 2 /myWAC/System/jscript.asp?module=resources.js&v=55
incoming push 3 /myWAC/System/jscript.asp?module=clipboard.js&v=1487
 - I(p) 0 /myWAC/System/css.asp?module=reset.css&v=1
 - I(p) 1 /myWAC/System/jscript.asp?module=websocket.js&v=227
 - I(p) 2 /myWAC/System/jscript.asp?module=topwindow.js&v=2689
incoming push 0 /myWAC/System/jscript.asp?module=system.js&v=1074
incoming push 1 /myWAC/System/jscript.asp?module=controls.js&v=15053
 - I(p) 0 /myWAC/System/jscript.asp?module=system.js&v=1074
incoming push 2 /myWAC/System/jscript.asp?module=messages.js&v=413
 - I(p) 0 /myWAC/System/jscript.asp?module=system.js&v=1074
 - I(p) 1 /myWAC/System/jscript.asp?module=controls.js&v=15053
incoming push 3 /myWAC/System/jscript.asp?module=mouse.js&v=1740
 - I(p) 0 /myWAC/System/jscript.asp?module=system.js&v=1074
 - I(p) 1 /myWAC/System/jscript.asp?module=controls.js&v=15053
 - I(p) 2 /myWAC/System/jscript.asp?module=messages.js&v=413
incoming push 0 /myWAC/System/jscript.asp?module=interface_chrome.js&v=173
incoming push 1 /myWAC/System/jscript.asp?module=notification.js&v=509
 - I(p) 0 /myWAC/System/jscript.asp?module=interface_chrome.js&v=173
incoming push 2 /myWAC/System/jscript.asp?module=widgets.js&v=2280
 - I(p) 0 /myWAC/System/jscript.asp?module=interface_chrome.js&v=173
 - I(p) 1 /myWAC/System/jscript.asp?module=notification.js&v=509
incoming push 3 /myWAC/System/jscript.asp?module=scroller.js&v=168
 - I(p) 0 /myWAC/System/jscript.asp?module=interface_chrome.js&v=173
 - I(p) 1 /myWAC/System/jscript.asp?module=notification.js&v=509
 - I(p) 2 /myWAC/System/jscript.asp?module=widgets.js&v=2280
incoming push 0 /myWAC/System/jscript.asp?module=domain.js&v=22
incoming push 1 /myWAC/System/jscript.asp?module=performance.js&v=47
 - I(p) 0 /myWAC/System/jscript.asp?module=domain.js&v=22
incoming push 2 /myWAC/System/jscript.asp?module=components.js&v=49
 - I(p) 0 /myWAC/System/jscript.asp?module=domain.js&v=22
 - I(p) 1 /myWAC/System/jscript.asp?module=performance.js&v=47
incoming push 3 /myWAC/System/jscript.asp?module=security.js&v=19
 - I(p) 0 /myWAC/System/jscript.asp?module=domain.js&v=22
 - I(p) 1 /myWAC/System/jscript.asp?module=performance.js&v=47
 - I(p) 2 /myWAC/System/jscript.asp?module=components.js&v=49
finish 4 /myWAC/System/css.asp?module=reset.css&v=1
 - I(f) 0 /myWAC/System/css.asp?module=reset.css&v=1
 - I(f) 1 /myWAC/System/jscript.asp?module=websocket.js&v=227
 - I(f) 2 /myWAC/System/jscript.asp?module=topwindow.js&v=2689
 - I(f) 3 /myWAC/System/jscript.asp?module=clipboard.js&v=1487
finish 4 /myWAC/System/jscript.asp?module=resources.js&v=55
 - I(f) 0 /myWAC/System/css.asp?module=global.css&v=238
 - I(f) 1 /myWAC/System/jscript.asp?module=selection.js&v=1856
 - I(f) 2 /myWAC/System/jscript.asp?module=resources.js&v=55
 - I(f) 3 /myWAC/System/jscript.asp?module=utils.js&v=2851

log is:

  • in parserOnIncoming (before push), follwed by I(p) dump of incoming
  • in resOnFinish (before push), follwed by I(f) dump of incoming (before assert)

it is of course followed by crash like in previous tries....

I'm pretty new for this (I see node's http server code for the first time), but why we do expect that response finishes in the same order like the request was put into the incoming queue? If it is necessary than I think we need to put resOnFinish into some queue and call it after all previous requests in incoming queue passed resOnFinish... @indutny is this idea right? I can try to prepare some patch, but I'm not sure if the order is really necessary. If it is, than according to me is really really badly written and change of prefinish vs. finish is only some side effect... If not than assert is with bad idea...

@arthurschreiber
Copy link

@indutny Can this be related to #3047 ? In the sense that because events don't fire or fire in the wrong order, memory from TLSWrap does not get freed correctly?

@arthurschreiber
Copy link

@indutny I just tried the following test case, based on the test in your PR:

'use strict';

const net = require('net');
const http = require("http");

let count = 0;

const server = http.createServer(function(req, res) {
  const id = count++;
  setTimeout(function() {
    res.end('hello');
  }, (2 - id) * 10);

  if (id === 1)
    server.close();
});

server.listen(8080, function() {
  const host = server.address().address;
  const port = server.address().port;


  console.log('Example app (' + process.pid + ') listening at http://%s:%s', host, port);

  const s = net.connect(port);

  const req = 'GET / HTTP/1.1\r\nConnection: keep-alive\r\n\r\n';
  const big = req + req;
  s.end(big);
  s.resume();
});

I see the following stack trace:

$ node test.js
Example app (6792) listening at http://:::8080

assert.js:89
  throw new assert.AssertionError({
  ^
AssertionError: false == true
    at ServerResponse.detachSocket (_http_server.js:144:3)
    at ServerResponse.resOnFinish (_http_server.js:483:11)
    at emitNone (events.js:67:13)
    at ServerResponse.emit (events.js:166:7)
    at finish (_http_outgoing.js:526:10)
    at doNTCallback0 (node.js:407:9)
    at process._tickCallback (node.js:336:13)

Which is different than what others are reporting, but the same as we see on our test environments.

@mareksrom
Copy link

My last comment was not wise, of course it has to be in order:) I think I know where is problem... just trying...

@ghost
Copy link
Author

ghost commented Sep 25, 2015

I can't replicate it on my local machine. It just fails in QA server which is exposed.
And it always fails when receiving SYN_RECV requests only.

@mareksrom
Copy link

I already addressed the problem and I am working on patch, it already works, but some requests still keep loading, so some testing need to be done, but I am almost finished, expect to publish it today evening or tommorow. In some cases finish event is beeing fired even before socket was assigned...

@mareksrom
Copy link

@indutny I proposed _http_outgoing.js change with the bug fix (this is first time here so I don't know if I should do it this way:). Problem was that when response was ended and came to hack _send('',...) before the connection was assigned to it, callback to finish was fired immediately before anything was send, but it should wait for activating this response and flusing it... Hope it is correct my test case it solved and after spending few hours today in http code I hope that I understand it quite well. Code of ending response I think is worth refactoring... I left there one assert that sould not happen - in _writeraw - if data would be empty, connection assigned and something was in output buffer, than behaviour would not be correct, I don't know if this can happen...

@indutny
Copy link
Member

indutny commented Sep 25, 2015

@mareksrom thanks for the insight! I don't exactly agree with the fix, though, and I think we can do it better. I'll comment on that PR.

@mareksrom
Copy link

yeah of course, we can, as I wrote this is only fix, I didn't want to make bigger changes that I think should be done here... I don't know anything about your guidelines and so on, I was just unhappy of node falling down every few minutes and it was a good reason to look a bit under the cover...:)

@fasiha
Copy link

fasiha commented Sep 29, 2015

@indutny @mareksrom sorry I'm a little unclear on the status of your patches—should we be testing them, or are they both in flux? Apologies for being dense…

I ask because @arthurschreiber seems to have experienced assertion failure with @indutny's patch.

@rvagg
Copy link
Member

rvagg commented Sep 29, 2015

I believe @mareksrom's patch is in the closest state to being merged; either way we'll have a release out this week with a fix in it as it seems like the cause has been fully diagnosed now.

@fasiha
Copy link

fasiha commented Sep 29, 2015

🎉 so happy! Thanks everyone 🍻

@rvagg
Copy link
Member

rvagg commented Oct 5, 2015

Please upgrade to v4.1.2 and confirm that this problem has been resolved.

@ghost
Copy link
Author

ghost commented Oct 5, 2015

I had it in a very intermittent fashion, so you will need someone that could reproduce it at will to confirm. I had already upgraded my server, but it doesn't get enough traffic.

@ahwayakchih
Copy link

Hi, i started noticing problem, searched for info, found this issue, updated to 4.1.2 and now the problem is gone. Thanks!

@gesposito
Copy link

It used to crash often while developing on Windows (http-server with a lot of static requests), looks stable now, thanks.

@ghost
Copy link
Author

ghost commented Oct 9, 2015

looks stable!!!! 👍

@micky2be
Copy link

Thanks

rvagg pushed a commit to rvagg/io.js that referenced this issue Oct 21, 2015
Changes to `stream_base.cc` are required to support empty writes.
Fixes CVE-2015-7384, nodejs#3138

Fix: nodejs#2639
PR-URL: nodejs#3128
@hi54yt
Copy link

hi54yt commented Feb 3, 2016

thx!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
http Issues or PRs related to the http subsystem.
Projects
None yet
Development

Successfully merging a pull request may close this issue.