NPM hangs on system, under FreeBSD 11.0 and OSX 10.11 with Node 6.5+ #8635

Closed
dmilith opened this issue Sep 17, 2016 · 30 comments
Labels
npm Issues and PRs related to the npm client dependency or the npm registry.

Comments

@dmilith

dmilith commented Sep 17, 2016

I built Node.js 6.4.0+ from source under OS X 10.11 (Darwin 15.6.0, x86_64). All 6.x versions behave the same way under OS X and under HardenedBSD/FreeBSD 11.0.
The build went fine, but:

  1. I wiped everything Node-related before the builds (rm -rf ~/.npm ~/.node* and so on).
  2. When I passed --without-npm to configure and then tried the standard method of installing npm:
    http://ół.pl/d968120bfbc83a579218a8619d3784b6.png
    (runs forever)
  3. When I built with the npm bundled with 6.6.0:
    http://ół.pl/47adb612b09232caa49d0c5906d19337.png
    (also runs forever, with a nice progress bar that never makes any progress)
  4. I confirmed that the problem remains whether IPv6 is enabled or disabled.
@targos targos added the npm Issues and PRs related to the npm client dependency or the npm registry. label Sep 17, 2016
@dmilith dmilith changed the title NPM hangs on OSX 10.11 NPM hangs on OSX 10.11 with Node 6.x Sep 17, 2016
@dmilith
Author

dmilith commented Sep 17, 2016

Some more details from --verbose (case when used npm bundled with Node 6.6.0):

Installing: coffee-script
npm info it worked if it ends with ok
npm verb cli [ '/Software/Node/bin/node',
npm verb cli   '/Software/Node/bin/npm',
npm verb cli   'install',
npm verb cli   'coffee-script',
npm verb cli   '--global',
npm verb cli   '--verbose' ]
npm info using npm@3.10.3
npm info using node@v6.6.0
npm verb request uri https://registry.npmjs.org/coffee-script
npm verb request no auth needed
npm info attempt registry request try #1 at 7:34:11 PM
npm verb request id 59279808efacf700
npm http request GET https://registry.npmjs.org/coffee-script
npm info retry will retry, error on last attempt: TypeError: Invalid argument: family must be 4 or 6
npm info attempt registry request try #2 at 7:34:21 PM
npm http request GET https://registry.npmjs.org/coffee-script

So it looks like an IPv6 issue again, but I tried:

networksetup -setv6off Ethernet

but the result looks the same:

Installing: coffee-script
npm info it worked if it ends with ok
npm verb cli [ '/Software/Node/bin/node',
npm verb cli   '/Software/Node/bin/npm',
npm verb cli   'install',
npm verb cli   'coffee-script',
npm verb cli   '--global',
npm verb cli   '--verbose' ]
npm info using npm@3.10.3
npm info using node@v6.6.0
npm verb request uri https://registry.npmjs.org/coffee-script
npm verb request no auth needed
npm info attempt registry request try #1 at 7:37:45 PM
npm verb request id 643056bc3479eba2
npm http request GET https://registry.npmjs.org/coffee-script
npm info retry will retry, error on last attempt: TypeError: Invalid argument: family must be 4 or 6
npm info attempt registry request try #2 at 7:37:55 PM
npm http request GET https://registry.npmjs.org/coffee-script

This also confirms that exactly the same issue occurs under FreeBSD / HardenedBSD x86_64 with the whole Node 6.4+ line (at least).

@dmilith dmilith changed the title NPM hangs on OSX 10.11 with Node 6.x NPM hangs on system with both IPv4 and IPv6 enabled, under OSX 10.11 with Node 6.5+ Sep 17, 2016
@addaleax
Member

> dmilith changed the title from NPM hangs on OSX 10.11 with Node 6.x to NPM hangs on system with both IPv4 and IPv6 enabled, under OSX 10.11 with Node 6.5+

Does that mean that this only occurs for you with Node v6.5+? Does (sudo) npm install -g npm@next help?

@addaleax
Member

Also, what does “hanging” mean? 100 % CPU? Does it abort properly when using Ctrl+C?

@dmilith
Author

dmilith commented Sep 17, 2016

> Does (sudo) npm install -g npm@next help?

Updating: npm
npm info it worked if it ends with ok
npm verb cli [ '/Software/Node/bin/node',
npm verb cli   '/Software/Node/bin/npm',
npm verb cli   'install',
npm verb cli   '--global',
npm verb cli   '--verbose',
npm verb cli   'npm@next' ]
npm info using npm@3.10.3
npm info using node@v6.6.0
npm verb request uri https://registry.npmjs.org/npm
npm verb request no auth needed
npm info attempt registry request try #1 at 8:19:18 PM
npm verb request id 43f35875a9351c45
npm http request GET https://registry.npmjs.org/npm
npm info retry will retry, error on last attempt: TypeError: Invalid argument: family must be 4 or 6
npm info attempt registry request try #2 at 8:19:28 PM
npm http request GET https://registry.npmjs.org/npm

"Hang" means it never ends. I also tried updating to several versions of npm: same issue.

> Also, what does “hanging” mean? 100 % CPU? Does it abort properly when using Ctrl+C?

No, the hang looks like a deadlock: there is no CPU usage. I haven't run dtrace on it yet. Yes, it interrupts properly on both SIGINT and SIGTERM.

@dmilith
Author

dmilith commented Sep 17, 2016

> Does that mean that this only occurs for you with Node v6.5+?

It means it also affected the previous 6.x versions I tried, on both systems where I enabled IPv6 via a broker.

@dmilith
Author

dmilith commented Sep 17, 2016

@addaleax How to set family to 4 via ~/.npmrc?

family = 4 

changes nothing, and the documentation contains no information about it.

@addaleax
Member

> How to set family to 4 in npmrc?

I don’t know, but from the looks of it, you should probably report this over at https://github.com/npm/npm anyway. The people who watch that issue tracker might have more information about that kind of question, too.

@dmilith
Author

dmilith commented Sep 17, 2016

Even more rainbow here - http://ół.pl/79c215ccf3543e92855ac8a90a7bc691.png

@dmilith
Author

dmilith commented Sep 17, 2016

@addaleax Forget about npm. It's Node causing the problem, not npm. Look at the issues filed against other Node projects, e.g. node_redis. It will fail on any DNS lookup via IPv6, or whatever it does there.

TypeError: Invalid argument: family must be 4 or 6

This is the cause. A buggy Node API IMHO. Untested IPv4 + IPv6 environments. Fix it on the Node.js side and it will fix several issues reported separately by lots of people (google: "TypeError: Invalid argument: family must be 4 or 6")

@addaleax
Member

> A buggy Node API IMHO.

A reason for that opinion might be helpful, other than “this occurs in multiple projects which use Node.js”.

@dmilith
Author

dmilith commented Sep 17, 2016

It's not about opinion but understanding how stuff works. I'm not here to troll with you :)...
Yet to prove you that:

Try building Node 6.6.0 from source with --without-npm - and try to install it (npm) in same configuration (both ipv4 and ipv6) - version You pick is irrelevant...

So issue is only invoked by npm, but caused by Node
have a nice day..

@bnoordhuis
Member

'family must be 4 or 6' means something is passing an invalid argument to dns.lookup(), presumably npm. So far I see no reason to believe this is a node.js issue.
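The validation mentioned above can be observed in isolation. The following is a minimal sketch, not npm's actual code path: `lookupValidationError` is a hypothetical helper that calls dns.lookup() with a given family and reports whether the call is rejected synchronously. The value 5 is just an arbitrary invalid family chosen for illustration; on Node 6 the message reads "Invalid argument: family must be 4 or 6", while newer releases word it differently.

```javascript
const dns = require('dns');

// Hypothetical helper: returns the error thrown synchronously by
// dns.lookup() when the family argument fails validation, or null
// when the argument is accepted and the lookup is started.
function lookupValidationError(hostname, family) {
  try {
    dns.lookup(hostname, { family }, () => {});
    return null;
  } catch (err) {
    return err;
  }
}

// 5 is neither 0, 4 nor 6, so the call is rejected before any
// resolver work happens.
const err = lookupValidationError('localhost', 5);
console.log(err instanceof TypeError, err.message);
```

Because the exception is thrown synchronously inside the caller's stack, a caller that does not catch it never gets a callback, which would be consistent with a request that appears to stall.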

@dmilith
Author

dmilith commented Sep 17, 2016

Then I'd suggest checking - not believing :)

@dmilith dmilith changed the title NPM hangs on system with both IPv4 and IPv6 enabled, under OSX 10.11 with Node 6.5+ NPM hangs on system, under FreeBSD 11.0 and OSX 10.11 with Node 6.5+ Sep 17, 2016
@dmilith
Author

dmilith commented Sep 17, 2016

It also turned out that the issue affects a workstation with no IPv6 (IPv4 only) that uses exactly the same build of Node 6.6.0.

@bnoordhuis
Member

I don't have a problem, it works for me, and your passive aggressive tone doesn't make me inclined to help you out.

@imyller
Member

imyller commented Sep 17, 2016

Can't reproduce. Did my best.

I suspect @dmilith has either severe connectivity issues to registry.npmjs.org or some persistent environment setting affecting npm.

15.6.0 Darwin Kernel Version 15.6.0: Mon Aug 29 20:21:34 PDT 2016; root:xnu-3248.60.11~1/RELEASE_X86_64 x86_64
Apple LLVM version 8.0.0 (clang-800.0.38)
Target: x86_64-apple-darwin15.6.0

@imyller
Member

imyller commented Sep 17, 2016

Additionally:

@dmilith I noticed in your screenshot that you're getting EHOSTDOWN status for connections to the registry.npmjs.org CDN IP address.

I mangled my network to selectively blackhole all packets to registry.npmjs.org to test this, and Node.js/npm/Darwin prefers to return ENETUNREACH or ETIMEDOUT when a single host does not respond but the non-local network is otherwise reachable.

EHOSTDOWN is likely coming from your local gateway or (less likely) from your ISP's network infrastructure. EHOSTDOWN is almost always generated natively by the TCP stack when it considers the IP to be on the local subnet and enough (5) ARP retries have failed for that IP. As I highly doubt that Fastly's CDN is within your local subnet, the culprit might be your routing firewall actively intervening in routed connections with EHOSTDOWN.

Check your connection rate limits. npm tends to open quite a few concurrent connections and you might just have tripped your own local network security.
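One way to see which error code the OS reports for a plain TCP connection, independent of npm, is a small probe. This is a sketch under assumptions: `probeTcp` is a hypothetical helper name, and the 5 second default timeout is an arbitrary choice. It uses only the standard `net` module.

```javascript
const net = require('net');

// Hypothetical helper: attempts one TCP connection and resolves with
// the outcome instead of throwing, so the OS-level error code
// (EHOSTDOWN, ENETUNREACH, ECONNREFUSED, ...) can be inspected.
function probeTcp(host, port, timeoutMs = 5000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const done = (result) => {
      socket.destroy();
      resolve(result);
    };
    // Fires if the connection stays idle for timeoutMs, including
    // while the connect is still pending.
    socket.setTimeout(timeoutMs, () => done({ ok: false, code: 'ETIMEDOUT' }));
    socket.once('connect', () => done({ ok: true, code: null }));
    socket.once('error', (err) => done({ ok: false, code: err.code }));
  });
}

// Usage (opens a real connection, so it is left commented out):
// probeTcp('registry.npmjs.org', 443).then(console.log);
```

Running the probe several times in quick succession, and then in parallel, may help tell a rate limit apart from a host that is unreachable all the time.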

@dmilith
Author

dmilith commented Sep 18, 2016

Interesting info. Will take a closer look later today

@dmilith
Author

dmilith commented Sep 18, 2016

@imyller The thing is:

⇢ ping registry.npmjs.org
PING prod.a.sni.global.fastlylb.net (151.101.60.162): 56 data bytes
64 bytes from 151.101.60.162: icmp_seq=0 ttl=51 time=39.821 ms
64 bytes from 151.101.60.162: icmp_seq=1 ttl=51 time=40.496 ms
64 bytes from 151.101.60.162: icmp_seq=2 ttl=52 time=43.979 ms
^C

There is nothing special in my network configuration, except router-side IPv6 management from my broker. Standard stuff.

And here's the definition of Node with all the specific arguments I pass to the build process:
https://github.com/VerKnowSys/sofin-definitions/blob/stable/definitions/node.def (it's sh script)

Anyway... I thought this place was meant for solving problems, not for discussing my "aggressive state" LOL.

@dmilith
Author

dmilith commented Sep 18, 2016

It's just an unhandled exception from a Node function: one function crashes, and the whole JS virtual machine is useless and does nothing. That's the whole deal. Nothing special. It looks like a hang but seems to be just a dead V8. I'm not passing any arguments to the inner DNS functions. I'm using the plain version bundled with 6.6. My environment is vanilla. I assumed it's a KNOWN environment for you.

have a nice day

@imyller
Member

imyller commented Sep 18, 2016

> The thing is:

The thing is that ICMP pings are usually not subject to traffic management and can be unreliable or misleading when analysing actual TCP connectivity issues.

But, I think you've made up your mind about the cause of the issue.

I personally can't reproduce the problem you are seeing, even in a vanilla OS X 10.11.6 virtual machine with a fresh clone of the Node.js source. I'm also quite confident that I'm not the only one.

And oh, have a nice day.

@bnoordhuis
Member

> But, I think you've made up your mind about the cause of the issue.

I think that's on the mark. Let's close.

@dmilith
Author

dmilith commented Sep 18, 2016

WOW. No wonder people are getting as far from using this project as possible.. :) Thank you for this precious lesson ;)

@imyller
Member

imyller commented Sep 18, 2016

You're welcome.

@jbergstroem
Member

Just tried reproducing on my FreeBSD environment (as well as one of our test runners). Sorry, but I can't reproduce this. Perhaps I can help you dig into your network setup and identify possible issues?

$ uname -a
FreeBSD $foo 10.3-RELEASE FreeBSD 10.3-RELEASE #0 r297264: Fri Mar 25 02:10:02 UTC 2016     root@releng1.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64
$ cd node-v6.6.0
$ CC=clang CXX=clang++ ./configure
# <snip>
$ gmake -j3
# <snip>
$ export PATH=$(pwd):$PATH
$ ./node deps/npm/bin/npm-cli.js install coffee-script
/usr/home/jbergstroem/node-v6.6.0
└── coffee-script@1.10.0 

npm WARN enoent ENOENT: no such file or directory, open '/usr/home/jbergstroem/node-v6.6.0/package.json'
npm WARN node-v6.6.0 No description
npm WARN node-v6.6.0 No repository field.
npm WARN node-v6.6.0 No README data
npm WARN node-v6.6.0 No license field.

@dmilith
Author

dmilith commented Sep 18, 2016

@jbergstroem Yes, exactly. 4.5 tested under FreeBSD worked correctly. The issue started with the 6.x line.

It also depends on the dependencies you installed to build your node. The FreeBSD 10+ base system ships libc++, which I have used since around Node 0.8. No issues until 6.x, and I'm 100% sure I used libc++ only.
If you used the ports version, it makes a mess by installing the whole (unnecessary) GNU world, hence after:

ldd `which node`

You should see libstdc++ resolved to a specific location somewhere under /usr/local/lib, which makes the build "version broken after moving across systems". My build has all Node dependencies included, hence it's relocatable. That's why I use the /Software/Node prefix. Example prebuilt (affected) version: http://software.verknowsys.com/binary/Darwin-10.11-x86_64/Node-6.6.0-Darwin-10.11-x86_64.txz

I also tried to dig through the code and applied two patches, but these didn't help much. It looks like DNS resolution is broken because of the libc++ in use, and that causes the DNS issue in Node.

@dmilith
Author

dmilith commented Sep 18, 2016

--- lib/dns.js.orig 2016-09-18 20:43:45.000000000 +0200
+++ lib/dns.js  2016-09-18 20:44:56.000000000 +0200
@@ -130,9 +130,9 @@
   } else {
     family = options >>> 0;
   }
-
+  
   if (family !== 0 && family !== 4 && family !== 6)
-    throw new TypeError('Invalid argument: family must be 4 or 6');
+      family = 4;

   callback = makeAsync(callback);
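A less invasive way to force IPv4 than patching lib/dns.js is to pass `family: 4` in the options of an individual request, which http/https requests hand down to the lookup. The following is a sketch only: `ipv4RequestOptions` is a hypothetical helper, it relies on the global `URL` class (not available on Node 6, where `url.parse()` would be the substitute), and it does not change how npm itself builds its requests.

```javascript
const https = require('https');

// Hypothetical helper: turns a URL string into request options that
// are pinned to IPv4 name resolution.
function ipv4RequestOptions(urlString) {
  const { hostname, port, pathname, search } = new URL(urlString);
  return {
    hostname,
    port: port ? Number(port) : 443, // URL.port is '' for the default port
    path: pathname + search,
    family: 4, // must be 4 or 6 (or omitted) to pass dns.lookup() validation
  };
}

// Usage (performs a real network request, so it is left commented out):
// https.get(ipv4RequestOptions('https://registry.npmjs.org/coffee-script'),
//   (res) => { console.log(res.statusCode); res.resume(); });
```

Unlike the patch above, this leaves the validation in dns.lookup() intact, so a caller that passes a genuinely invalid family still gets an error instead of a silent fallback.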

@dmilith
Author

dmilith commented Sep 18, 2016

I tried this patch to force family 4. It silenced the TypeError, but the new error makes no sense at all:

Installing: less
npm info it worked if it ends with ok
npm verb cli [ '/Software/Node/bin/node',
npm verb cli   '/Software/Node/bin/npm',
npm verb cli   'install',
npm verb cli   'less',
npm verb cli   '--global',
npm verb cli   '--verbose' ]
npm info using npm@3.10.3
npm info using node@v6.6.0
npm verb request uri https://registry.npmjs.org/less
npm verb request no auth needed
npm info attempt registry request try #1 at 9:16:52 PM
npm verb request id bb89f02718784c0b
npm http request GET https://registry.npmjs.org/less
npm info retry will retry, error on last attempt: Error: connect EHOSTDOWN 151.101.36.162:443 - Local (0.0.0.0:0)
npm info attempt registry request try #2 at 9:17:02 PM
npm http request GET https://registry.npmjs.org/less

I assure you I checked every request made by npm, one after another, and all these sites and resource locations are working fine.

I have no idea what address that is, but it does not work, just as the error says. It looks like npm tries only a single repository. At least it no longer hangs; instead it fails after a timeout, as here:

Installing: coffee-script
npm info it worked if it ends with ok
npm verb cli [ '/Software/Node/bin/node',
npm verb cli   '/Software/Node/bin/npm',
npm verb cli   'install',
npm verb cli   'coffee-script',
npm verb cli   '--global',
npm verb cli   '--verbose' ]
npm info using npm@3.10.3
npm info using node@v6.6.0
npm verb request uri https://registry.npmjs.org/coffee-script
npm verb request no auth needed
npm info attempt registry request try #1 at 9:15:40 PM
npm verb request id 288d080784b76333
npm http request GET https://registry.npmjs.org/coffee-script
npm info retry will retry, error on last attempt: Error: connect EHOSTDOWN 151.101.60.162:443 - Local (0.0.0.0:0)
npm info attempt registry request try #2 at 9:15:50 PM
npm http request GET https://registry.npmjs.org/coffee-script
npm info retry will retry, error on last attempt: Error: connect EHOSTDOWN 151.101.60.162:443 - Local (0.0.0.0:0)
npm info attempt registry request try #3 at 9:16:51 PM
npm http request GET https://registry.npmjs.org/coffee-script
npm verb stack Error: connect EHOSTDOWN 151.101.36.162:443 - Local (0.0.0.0:0)
npm verb stack     at Object.exports._errnoException (util.js:1036:11)
npm verb stack     at exports._exceptionWithHostPort (util.js:1059:20)
npm verb stack     at connect (net.js:874:16)
npm verb stack     at net.js:1003:7
npm verb stack     at GetAddrInfoReqWrap.asyncCallback [as callback] (dns.js:62:16)
npm verb stack     at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:81:10)
npm verb cwd /Software/Node/.src_413ce5db3432c3f8aca163f2fc8a3d421c5155c6/node-v6.6.0
npm ERR! Darwin 15.6.0
npm ERR! argv "/Software/Node/bin/node" "/Software/Node/bin/npm" "install" "coffee-script" "--global" "--verbose"
npm ERR! node v6.6.0
npm ERR! npm  v3.10.3
npm ERR! code EHOSTDOWN
npm ERR! errno EHOSTDOWN
npm ERR! syscall connect

npm ERR! connect EHOSTDOWN 151.101.36.162:443 - Local (0.0.0.0:0)
npm ERR!
npm ERR! If you need help, you may report this error at:
npm ERR!     <https://github.com/npm/npm/issues>
npm verb exit [ 1, true ]

I also checked hosts and similar locations that might cause the address to be set somewhere, but there is no such thing anywhere on my system.
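The gaps between the attempts in the log above (about 10 seconds, then about 60 seconds) are consistent with exponential backoff. The sketch below reproduces that schedule under stated assumptions: it takes npm's documented defaults for fetch-retries (2), fetch-retry-factor (10), fetch-retry-mintimeout (10000 ms) and fetch-retry-maxtimeout (60000 ms), and assumes the delay is computed as min(minTimeout * factor^attempt, maxTimeout) without randomization. `retryDelays` is a hypothetical name, not an npm function.

```javascript
// Hypothetical helper: computes the wait before each retry for an
// exponential backoff capped at maxTimeout. The defaults mirror the
// npm settings assumed in the text above.
function retryDelays({
  retries = 2,
  factor = 10,
  minTimeout = 10000,
  maxTimeout = 60000,
} = {}) {
  const delays = [];
  for (let attempt = 0; attempt < retries; attempt++) {
    delays.push(Math.min(minTimeout * Math.pow(factor, attempt), maxTimeout));
  }
  return delays;
}

console.log(retryDelays()); // [ 10000, 60000 ]
```

Under these assumptions npm makes three attempts in total and then exits with the last error, which matches the EHOSTDOWN run. It would not by itself explain a run that never proceeds past the second attempt.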

@imyller
Member

imyller commented Sep 18, 2016

By looking at your log we can determine that:

  1. DNS resolution succeeds in net.lookupAndConnect()
  2. net.connect() is called with the resolved IP (located in the Fastly CDN IPv4 subnet 151.101.0.0/16)
  3. uv_tcp_connect() gets called (libuv)
  4. which eventually calls the POSIX connect() provided by your platform library

What happens next is that your TCP/IP stack returns EHOSTDOWN status (-64 as the numeric errno value) from the POSIX connect call, which suggests:

a) Your local host considers the 151.101.0.0/16 subnet to be a local subnet and fails at ARP resolution (a highly unlikely scenario unless you've got a serious netmask misconfiguration)

or

b) A router somewhere on the route between your local host and the remote IP 151.101.60.162 responds with an ICMP (Type 3) packet, which causes the TCP stack to return EHOSTDOWN. The reason might be a routing issue, active application firewalling or connection limiting, or fragmentation issues in the case of functional ICMP ping connectivity (usually small 64-byte unfragmented packets) but unreliable TCP connectivity.
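Scenario a) can be checked by comparing the remote address with the locally configured subnet. A minimal IPv4-only sketch follows; `ipv4ToInt` and `inCidr` are hypothetical helper names, and the 192.168.1.0/24 LAN is an assumed example, not taken from the reporter's setup.

```javascript
// Hypothetical helper: converts dotted-quad IPv4 text to an unsigned
// 32-bit integer.
function ipv4ToInt(ip) {
  return ip
    .split('.')
    .reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// Hypothetical helper: true when ip falls inside the CIDR block.
function inCidr(ip, cidr) {
  const [base, bitsText] = cidr.split('/');
  const bits = Number(bitsText);
  // A shift by 32 is a no-op in JavaScript, so /0 is handled separately.
  const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
  return ((ipv4ToInt(ip) & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
}

// Both addresses from the logs belong to the Fastly block named above.
console.log(inCidr('151.101.60.162', '151.101.0.0/16')); // true
// They do not belong to a typical (assumed) home LAN.
console.log(inCidr('151.101.60.162', '192.168.1.0/24')); // false
```

If the interface netmask on the affected host were wide enough to cover 151.101.0.0/16, the first scenario would apply; otherwise the address is routed and scenario b) is the more plausible one.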

The Node.js runtime respects the status codes returned by the underlying OS and has very few recovery options when the OS returns a very specific error code related to network connectivity.

Unless you provide more detailed logs or additional information, these are the only pointers towards resolving the issue that I'm able to give you.
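The fields referred to in this analysis can be pulled out of the error line in the npm log mechanically. The sketch below is a hypothetical parser (`parseConnectError` is not a Node.js or npm API) for messages of the shape seen in the log, "connect EHOSTDOWN 151.101.36.162:443 - Local (0.0.0.0:0)". The pattern is inferred from that one line and may not cover every message Node.js produces.

```javascript
// Hypothetical helper: splits a Node.js connect error message into
// its parts. Returns null when the message has a different shape.
function parseConnectError(message) {
  const pattern =
    /^(\w+) (E[A-Z0-9_]+) (\S+):(\d+)(?: - Local \((\S+):(\d+)\))?$/;
  const m = pattern.exec(message);
  if (!m) return null;
  return {
    syscall: m[1],          // the failing system call, e.g. "connect"
    code: m[2],             // symbolic errno name, e.g. "EHOSTDOWN"
    address: m[3],          // remote address the socket tried to reach
    port: Number(m[4]),     // remote port
    localAddress: m[5] || null,
    localPort: m[6] !== undefined ? Number(m[6]) : null,
  };
}

console.log(
  parseConnectError('connect EHOSTDOWN 151.101.36.162:443 - Local (0.0.0.0:0)')
);
```

A syscall of "connect" together with a numeric remote address indicates that name resolution had already finished when the error was raised, which is the point made in steps 1 to 4 above.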

@lynx-r

lynx-r commented Aug 29, 2017

Hi! I had the same issue and I solved it by deleting the record repository=http://registry.npmjs.org/npm from ~/.npmrc
