Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http/net: fix catching immediate connect errors #1823

Closed

Conversation

m0ppers
Copy link

@m0ppers m0ppers commented May 28, 2015

When creating a standard http client request the agent is calling
net.createConnection. The agent will then defer installing error
handling to nextTick(). If there is any immediate error in
createConnection iojs will bail with an uncaught error. There is already
one error situation which is handled in a special way (i.e. throwing
error is deferred so the standard error handler may catch it). However
there are more situations where this may happen: Namely in the connect()
function.

Due to a locally changed ip address I was getting the following error and the iojs bailed out (which should not happen as error management was in place):

events.js:141
      throw er; // Unhandled 'error' event
            ^
Error: connect EHOSTDOWN 192.168.1.11:80 - Local (192.168.1.10:54125)
    at Object.exports._errnoException (util.js:846:11)
    at exports._exceptionWithHostPort (util.js:869:20)
    at connect (net.js:840:14)
    at lookupAndConnect (net.js:933:5)
    at Socket.connect (net.js:902:5)
    at Agent.exports.connect.exports.createConnection (net.js:61:35)
    at Agent.createSocket (_http_agent.js:177:16)
    at Agent.addRequest (_http_agent.js:147:23)
    at new ClientRequest (_http_client.js:132:16)
    at Object.exports.request (http.js:30:10)

I am still unsure why I am the only one facing this problem and maybe the pull request is entirely bogus. The error situation seems pretty common to me (and that's why I am surprised): My dhcp assigned my dev machine a new address and connecting to the old, hardcoded ip address is now failing with an ETIMEOUT (as expected). However another, slightly (few seconds) deferred request (while the first one is still working) will trigger an immediate EHOSTDOWN/EHOSTUNREACH error for me which cannot be caught using request.on("error", fn).

I can 100% reproduce this with iojs 2.1 and master using the following gist:

https://gist.github.com/m0ppers/3709f7d87b73be9aca7b

I am on osx yosemite.

The wrapped nextTick, nextTick thing is of course totally evil. However I cannot seem to fix it otherwise. The root cause IMHO is the deferred error management in the http client. I don't know the reason for it and left it untouched but IMHO this should be the real fix.

It has been introduced with this commit (which makes me even more unsure as this is years ago):

nodejs/node-v0.x-archive@7c87e09

Can somebody reproduce the problem? Am missing something? Does somebody have a better fix (nextTick(nextTick()))? :S

When creating a standard http client request the agent is calling
net.createConnection. The agent will then defer installing error
handling to nextTick(). If there is any immediate error in
createConnection iojs will bail with an uncaught error. There is already
one error situation which is handled in a special way (i.e. throwing
error is deferred so the standard error handler may catch it). However
there are more situations where this may happen: Namely in the connect()
function.
@Fishrock123 Fishrock123 added http Issues or PRs related to the http subsystem. net Issues and PRs related to the net subsystem. labels May 28, 2015
@Fishrock123
Copy link
Member

@m0ppers does this happen on node 0.12 too? (I'm assuming it does given the commit in question)

@m0ppers
Copy link
Author

m0ppers commented May 28, 2015

Interesting :O works as expected with 0.12:

hans-guenther:io.js mop$ node -v; node broken.js 
v0.12.4
Server running at http://<ip-address>:8001/
GOT REQUEST
GOT REQUEST
ERR { [Error: connect EHOSTUNREACH]
  code: 'EHOSTUNREACH',
  errno: 'EHOSTUNREACH',
  syscall: 'connect' }

@m0ppers
Copy link
Author

m0ppers commented May 28, 2015

Just reverified with io.js 2.1.0:

hans-guenther:io.js mop$ node -v; node broken.js 
v2.1.0
Server running at http://<ip-address>:8001/
GOT REQUEST
GOT REQUEST
events.js:141
      throw er; // Unhandled 'error' event
            ^
Error: connect EHOSTDOWN 192.168.1.11:80 - Local (192.168.1.10:55714)
    at Object.exports._errnoException (util.js:846:11)
    at exports._exceptionWithHostPort (util.js:869:20)
    at connect (net.js:840:14)
    at lookupAndConnect (net.js:933:5)
    at Socket.connect (net.js:902:5)
    at Agent.exports.connect.exports.createConnection (net.js:61:35)
    at Agent.createSocket (_http_agent.js:177:16)
    at Agent.addRequest (_http_agent.js:147:23)
    at new ClientRequest (_http_client.js:132:16)
    at Object.exports.request (http.js:30:10)

So definately some regression. Interesting :O Didn't expect that. However that explains why so far nobody got that error

@evanlucas
Copy link
Contributor

I'm thinking this may a regression introduced in #1505. If the host is an IP address, it skips calling the lookup function which could be causing the error to be on the same tick. Looking into it.

Ref: https://github.com/nodejs/io.js/blob/master/lib/net.js#L927

@evanlucas
Copy link
Contributor

Can you try to reproduce after apply this patch?

commit f31295c0925464621073d2d635d9060284d4f62d
Author: Evan Lucas <evanlucas@me.com>
Date:   Thu May 28 18:08:06 2015 -0500

    defer

diff --git a/lib/net.js b/lib/net.js
index 5309521..d74f001 100644
--- a/lib/net.js
+++ b/lib/net.js
@@ -924,7 +924,9 @@ function lookupAndConnect(self, options) {
   // TODO(evanlucas) should we hot path this for localhost?
   var addressType = exports.isIP(host);
   if (addressType) {
-    connect(self, host, port, addressType, localAddress, localPort);
+    process.nextTick(function() {
+      connect(self, host, port, addressType, localAddress, localPort);
+    })
     return;
   }

@m0ppers
Copy link
Author

m0ppers commented May 29, 2015

Doesn't help :S it DOES however help when wrapped in another (evil) nextTick(). Would you agree that the rootcause for all the nextTick hacks in connect() is the http client stuff? Maybe it is time to fix the rootcause? The sole reason the error management has to be wrapped into nextTick(s) is that the http client stuff installs error management only on nextTick() isn't it?

@evanlucas
Copy link
Contributor

I am not able to reproduce with the above patch. I am thinking that removing the hot path of skipping lookup should be added back though. Mind trying this patch?

commit 9daac3964dd0db81564759375b0f18a201d88306
Author: Evan Lucas <evanlucas@me.com>
Date:   Tue Jun 2 14:55:39 2015 -0500

    Don't skip lookup

    If lookup is skipped, an error can be emitted on the current tick
    (before an error listener has been added)

diff --git a/lib/net.js b/lib/net.js
index 5309521..06f38fa 100644
--- a/lib/net.js
+++ b/lib/net.js
@@ -920,14 +920,6 @@ function lookupAndConnect(self, options) {
   }
   port |= 0;

-  // If host is an IP, skip performing a lookup
-  // TODO(evanlucas) should we hot path this for localhost?
-  var addressType = exports.isIP(host);
-  if (addressType) {
-    connect(self, host, port, addressType, localAddress, localPort);
-    return;
-  }
-
   if (options.lookup && typeof options.lookup !== 'function')
     throw new TypeError('options.lookup should be a function.');

@m0ppers
Copy link
Author

m0ppers commented Jun 8, 2015

that works...

@m0ppers
Copy link
Author

m0ppers commented Jun 8, 2015

i just retried the nextTick around connect patch and it now (?) works for me too. Maybe i made a mistake while applying the patch. You can reproduce my original problem, can't you? If so i would assume that i made an error testing that patch.

@m0ppers
Copy link
Author

m0ppers commented Jun 24, 2015

any update?

evanlucas added a commit to evanlucas/node that referenced this pull request Jul 1, 2015
Fixes an edge case regression introduced in
1bef717.

With the lookup being skipped, an error could be emitted before an
error listener has been added.

An example of this was presented by changing the server’s IP address
and then immediately making a request to the old address.

Related: nodejs#1823
PR-URL: nodejs#2054
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-by: Trevor Norris <trev.norris@gmail.com>
@evanlucas
Copy link
Contributor

Fixed in af249fa.

@evanlucas evanlucas closed this Jul 1, 2015
mscdex pushed a commit to mscdex/io.js that referenced this pull request Jul 9, 2015
Fixes an edge case regression introduced in
1bef717.

With the lookup being skipped, an error could be emitted before an
error listener has been added.

An example of this was presented by changing the server’s IP address
and then immediately making a request to the old address.

Related: nodejs#1823
PR-URL: nodejs#2054
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-by: Trevor Norris <trev.norris@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
http Issues or PRs related to the http subsystem. net Issues and PRs related to the net subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants