Cluster workers are confused? #4091

Closed
hemanth opened this Issue · 17 comments

6 participants

@hemanth

I wrote some code with cluster that looks like this. On a quad-core system, only 2 workers respond to requests from the browser, and for CLI curl calls only one of them responds. Am I missing something? :dizzy:
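The code linked from the report is not reproduced in this thread. For readers following along, a setup of this general kind might look like the sketch below. It is an illustration only, not the reporter's code: `startCluster` and `responseBody` are hypothetical names, and port 8000 is assumed from the ab command used later in the thread.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Body each worker sends back, so the client can see which process answered.
function responseBody(pid) {
  return 'handled by worker ' + pid + '\n';
}

// Hypothetical helper: fork `workerCount` workers that share one listening port.
function startCluster(port, workerCount = os.cpus().length) {
  // Newer Node versions expose isPrimary; older ones only have isMaster.
  const isPrimary =
    cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;

  if (isPrimary) {
    // The primary process only forks; it does not serve requests itself.
    for (let i = 0; i < workerCount; i++) {
      cluster.fork();
    }
    return;
  }

  // Each worker runs its own HTTP server on the shared port.
  http.createServer(function (req, res) {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end(responseBody(process.pid));
  }).listen(port);
}

// To run it, call this at the bottom of the file:
// startCluster(8000);
```

The call has to sit at the top level of the file, because each forked worker re-executes the same script and only reaches the server branch if `startCluster` runs there too.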

@bnoordhuis

Unless you can type really fast, curl is not a great benchmarking tool. What do the numbers look like when you use ab or siege?

@hemanth

Well, I tried ab -n 1000 -c 5 http://127.0.0.1:8000/ and it served 1002 requests. It should have been 1004, shouldn't it?

@langpavel

@hemanth Look at the gist https://gist.github.com/3850507 and run ab (ApacheBench).
Oh, sorry, you already did that, but try increasing the concurrency.

@hemanth

@langpavel The question is: on a quad-core machine, when 4 workers are running, only 2 are serving the requests. Why so?

@langpavel

@hemanth OK, I modified the code you referenced in the issue report to spawn 4 workers on my 2-core CPU, and I see every worker dispatch something. https://gist.github.com/3850827
It works for me on both versions, master and v0.8.11.

What is your hardware: bare metal or virtual? Check the process affinity as well. Beyond that, I really don't know.

@bnoordhuis

"on a quad core machine, when 4 workers are running only 2 are serving the requests"

Could be one of several reasons:

  • other processes (like ab) are tying up the other two cores
  • ab is not stressing the server enough, try ab -c 200 -n 50000
  • the accept() back-off mechanism in v0.8 is not very effective, you may get more balanced numbers with master

If you're running your benchmarks on OS X, try a real operating system.
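To check the balance directly rather than inferring it from ab totals, each worker can report its pid in the response and the client can tally what it sees. A minimal sketch of the tallying step, assuming the pids have already been collected from the response bodies (the function names are hypothetical and not from this thread):

```javascript
// Count how many responses each worker pid produced.
function tallyByWorker(pids) {
  const counts = new Map();
  for (const pid of pids) {
    counts.set(pid, (counts.get(pid) || 0) + 1);
  }
  return counts;
}

// Turn the tally into readable lines, busiest worker first.
function formatTally(counts) {
  let total = 0;
  for (const n of counts.values()) total += n;
  return Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([pid, n]) =>
      'worker ' + pid + ': ' + n + ' (' + (100 * n / total).toFixed(1) + '%)');
}

// Example with made-up pids:
// formatTally(tallyByWorker([101, 101, 102, 101]))
// → ['worker 101: 3 (75.0%)', 'worker 102: 1 (25.0%)']
```

With only a handful of sequential requests (curl, a browser) a lopsided tally says little. The numbers become meaningful at the higher concurrency levels suggested above.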

@hemanth

@bnoordhuis heh heh liked your last point!

  • With curl/browser it's still the same
  • ab is not eating much of the cores.
  • Hmm, what's better than accept()?

@langpavel ^ :smile:

@bnoordhuis

@hemanth Try the master branch, it uses a revised algorithm. Run it with export UV_TCP_SINGLE_ACCEPT=1 and export UV_TCP_SINGLE_ACCEPT=0.
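The two runs can also be scripted instead of exported by hand. The sketch below is an assumption-laden illustration: `server.js` is a hypothetical file name for the cluster script, the helper names are invented here, and whether a given Node build actually honors `UV_TCP_SINGLE_ACCEPT` depends on the libuv version it bundles.

```javascript
const { spawn } = require('child_process');

// Copy the current environment and set UV_TCP_SINGLE_ACCEPT to the given value.
// The variable name comes from the comment above; its effect is build-dependent.
function envWithSingleAccept(value) {
  return Object.assign({}, process.env, {
    UV_TCP_SINGLE_ACCEPT: String(value),
  });
}

// Start a server script (hypothetical path) under the current node binary
// with the variable set, passing its output through to this terminal.
function runServer(script, value) {
  return spawn(process.execPath, [script], {
    env: envWithSingleAccept(value),
    stdio: 'inherit',
  });
}

// Example: start with the variable set to 1, benchmark, stop the child,
// then repeat with 0 and compare the per-worker numbers.
// const child = runServer('server.js', 1);
// ...run ab against it...
// child.kill();
```

Building a fresh environment object per run keeps the parent process's own environment untouched, so the two runs cannot leak settings into each other.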

@hemanth

ok :+1:

@christophsturm

@bnoordhuis:

"the accept() back-off mechanism in v0.8 is not very effective, you may get more balanced numbers with master"

is there a way to get this into 0.8?

@bnoordhuis

@christophsturm In an official v0.8 release? No, because it changes the C/C++ ABI. You're free to back-port it though, it's pretty straightforward. I heard your company employs one or two people who know node's internals somewhat, maybe they can help you. ;-)

@christophsturm

lol, what company are you talking about? I always thought I'm the resident node guru here :)

@bnoordhuis

Ah, sorry - I thought you were our (that is, Cloud9's) Christoph. :-)

@brettkiefer

@christophsturm Did that do the trick for you? We seem to be seeing really serious imbalances with cluster on node 0.9.3 with the 3.2.0 Linux kernel (and not with the 2.6.32 kernel), but our experiment may be flawed (#3241 (comment)).

@brettkiefer

We tested the environment variables here and a build from master, just to make sure we weren't missing anything, and with the Linux kernel 3.2.0 we're still seeing one process hogging most of the connections. #3241 (comment)

@brettkiefer

@christophsturm If it happens that you got this fixed, and you happen to be feeling charitable, could you run the test I posted here with your setup and tell me if it balances any better? #3241 (comment)

@jasnell
Owner

@joyent/node-coreteam ... if this is still an issue, please reopen with additional information.

@jasnell closed this