Workers Keep Crashing Randomly #46

Open
patrickml opened this issue Apr 16, 2015 · 47 comments

@patrickml patrickml commented Apr 16, 2015

Randomly, all of the workers will kill themselves with errors similar to these:

[104.236.27.101] Error: write after end
    at writeAfterEnd (_stream_writable.js:133:12)[104.236.27.101] 
    at Socket.Writable.write (_stream_writable.js:181:5)
    at Socket.write (net.js:616:40)
    at Socket.Writable.end (_stream_writable.js:341:10)
    at Socket.end (net.js:397:31)
    at App.exports.GenericApp.GenericApp.handle_error (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:148:13)
    at execute_request (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:33:30)
    at Object.req.next_filter (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:105:18)
    at Listener.webjs_handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:107:13)
    at Listener.handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:147:12)
Cluster: Exiting worker 7 with exitCode=7 signalCode=null
[104.236.27.101] TypeError: Cannot read property 'writeHead' of undefined
    at Listener.webjs_handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:78:21)
    at Listener.handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:147:12)
    at Listener.handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:6:61)
    at Server.<anonymous> (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:154:24)
    at Server.new_handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/utils.js:86:19)
    at packages/ddp/stream_server.js:133:1
    at Array.forEach (native)
    at Function._.each._.forEach (packages/underscore/underscore.js:105:1)
    at Server.newListener (packages/ddp/stream_server.js:132:1)
    at packages/meteorhacks:cluster/lib/server/utils.js:11:1

This error may be unrelated:

[104.236.27.101]     at Object.Meteor._nodeCodeMustBeInFiber (packages/meteor/dynamics_nodejs.js:9:1)
    at [object Object]._.extend.get (packages/meteor/dynamics_nodejs.js:21:1)
    at [object Object].RouteController.lookupOption (packages/iron:router/lib/route_controller.js:66:1)
    at new Controller.extend.constructor (packages/iron:router/lib/route_controller.js:26:1)
    at [object Object].ctor (packages/iron:core/lib/iron_core.js:88:1)
    at Function.Router.createController (packages/iron:router/lib/router.js:201:1)
    at Function.Router.dispatch (packages/iron:router/lib/router_server.js:39:1)
    at Object.router (packages/iron:router/lib/router.js:15:1)
    at next (/opt/meteor/app/programs/server/npm/webapp/node_modules/connect/lib/proto.js:190:15)
    at Object.Package [as handle] (packages/cfs:http-methods/http.methods.server.api.js:420:1)

It is causing instability on our website. Is there a known fix for this?

Member

@arunoda arunoda commented Apr 17, 2015

I haven't seen this error before.
Is there any way I can reproduce this locally?

Author

@patrickml patrickml commented Apr 17, 2015

If I knew what was causing the issue, I'd be able to tell you what to do. This error seems to happen once our client gets around 500+ concurrent connections on our site, but it doesn't always happen. https://www.dropbox.com/s/xg8d355qzecwlp0/Screenshot%202015-04-17%2010.58.08.png?dl=0

In order to stop the endless loop of errors I have to restart our entire server. mup restart doesn't stop the issue.

@Latnok Latnok commented May 28, 2015

I have some errors

Exiting worker ## with exitCode=7 signalCode=null
TypeError: Cannot read property 'writeHead' of undefined
at Listener.webjs_handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:78:21)
at Listener.handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:147:12)
at Listener.handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:6:61)
at Server.<anonymous> (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:154:24)
at Server.new_handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/utils.js:86:19)
at packages/ddp/stream_server.js:133:1
at Array.forEach (native)
at Function._.each._.forEach (packages/underscore/underscore.js:105:1)
at Server.newListener (packages/ddp/stream_server.js:132:1)
at packages/meteorhacks:cluster/lib/server/utils.js:11:1

6 node processes, each at 100% CPU load.

@jfols jfols commented Jun 30, 2015

Also seeing "Error: write after end" under very high utilization.

@jfols jfols commented Jun 30, 2015

Perhaps this will help:

[mostexclusivewebsite.com]     at Server.new_handler (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/utils.js:86:19)
    at packages/ddp/stream_server.js:133:1
    at Array.forEach (native)
    at Function._.each._.forEach (packages/underscore/underscore.js:105:1)
    at Server.newListener (packages/ddp/stream_server.js:132:1)
    at packages/meteorhacks:cluster/lib/server/utils.js:11:1
Cluster: Exiting worker 20 with exitCode=7 signalCode=null
Cluster: Initializing worker 24 on port 15374
Kadira: successfully authenticated
Kadira: completed instrumenting the app
[mostexclusivewebsite.com] Error: write after end
    at writeAfterEnd (_stream_writable.js:133:12)
    at Socket.Writable.write (_stream_writable.js:181:5)
    at Socket.write (net.js:616:40)
    at Socket.Writable.end (_stream_writable.js:341:10)
    at Socket.end (net.js:397:31)
    at App.exports.GenericApp.GenericApp.handle_error (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:148:13)
    at execute_request (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:33:30)
    at Object.req.next_filter (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:105:18)
    at Listener.webjs_handler (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:107:13)
    at Listener.handler (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:147:12)
...

Member

@arunoda arunoda commented Jun 30, 2015

I recently made a possible fix for this. Are you using the latest version?

@jfols jfols commented Jun 30, 2015

I think so, 0.11.0? The site is currently on the front page of reddit, and I got an email from Digital Ocean about a DDoS attack on the server. The server was running hard for several hours with lots of "write after end" errors.

Member

@arunoda arunoda commented Jun 30, 2015

oh!

Member

@arunoda arunoda commented Jun 30, 2015

Currently it's 1.6.8. This may be a very old version.
But I'll debug more.

@jfols jfols commented Jun 30, 2015

Just ran npm update -g, trying again :)

npm view mup version
0.11.0
npm -v mup
2.12.0

Member

@arunoda arunoda commented Jun 30, 2015

Nope, I don't mean mup.
I mean meteorhacks:cluster.

Go to your project and do: meteor update meteorhacks:cluster

@jfols jfols commented Jun 30, 2015

Oh duh... yeah, the package is up to date. The versions file has meteorhacks:cluster@1.6.8

Member

@arunoda arunoda commented Jun 30, 2015

Okay. Let me debug a bit more.

@jfols jfols commented Jun 30, 2015

I did find this http://stackoverflow.com/questions/27769842/write-after-end-error-in-node-js-webserver that seems to be what's going on. I think the question is: is it in cluster or sockjs?
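For reference, the pattern in the traces above (Socket.end writing to a socket that has already ended) can be reproduced and guarded against in isolation. The following is only a minimal sketch, assuming a modern Node.js where `writableEnded` exists (the Node versions in use in 2015 did not have it). `safeEnd` is a hypothetical helper, not the actual sockjs or meteorhacks:cluster fix, and a PassThrough stream stands in for the real socket.

```javascript
const { PassThrough } = require('stream');

// Hypothetical helper: end a stream only if it has not already been
// ended or destroyed. Calling end(chunk) on an ended stream is what
// produces "write after end".
function safeEnd(stream, chunk) {
  if (stream.destroyed || stream.writableEnded) {
    return false; // already finished, do nothing
  }
  stream.end(chunk);
  return true;
}

// A PassThrough stands in for a net.Socket in this sketch.
const socketLike = new PassThrough();

// Without an 'error' listener, a stream error is thrown as an uncaught
// exception, which is how a single bad request can take a worker down.
socketLike.on('error', (err) => {
  console.error('stream error:', err.message);
});
socketLike.resume(); // discard whatever is written

const first = safeEnd(socketLike, 'bye');    // ends the stream
const second = safeEnd(socketLike, 'again'); // skipped, stream already ended
console.log(first, second);
```

Whether such a guard belongs in cluster or in sockjs is exactly the open question here; the sketch only shows the mechanism.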

Member

@arunoda arunoda commented Jun 30, 2015

Do you have this message in the logs? "Cluster: web proxy error:"

@jfols jfols commented Jun 30, 2015

Recent log snapshot:

Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
WS proxying failed! to:  http://mostexclusivewebsite.com:80 err: getaddrinfo ENOTFOUND
Cluster: web proxy error:  Connection droped
WS proxying failed! to:  http://mostexclusivewebsite.com:80 err: getaddrinfo ENOTFOUND
Cluster: web proxy error:  Connection droped
WS proxying failed! to:  http://mostexclusivewebsite.com:80 err: getaddrinfo ENOTFOUND
[balance5.mostexclusivewebsite.com] Cluster: web proxy error:  Connection droped
[mostexclusivewebsite.com] Cluster: web proxy error:  Connection droped
[mostexclusivewebsite.com] Cluster: web proxy error:  Connection droped
[mostexclusivewebsite.com] Cluster: web proxy error:  Connection droped
[mostexclusivewebsite.com] Cluster: web proxy error:  Connection droped
[mostexclusivewebsite.com] Cluster: web proxy error:  Connection droped

Member

@arunoda arunoda commented Jun 30, 2015

Okay. Found something noicy.
Let's try to do a fix.

Member

@arunoda arunoda commented Jun 30, 2015

Published a new version. Check that.

@jfols jfols commented Jun 30, 2015

So fyi I was seeing the error while maxing out a single server running cluster in multi-core mode. Now I've got a real cluster up and things are running smoothly. It may just be something to do with CPU limits and IO in node.js?

Redeploying now with the new version, we'll see what happens 😄

@jfols jfols commented Jun 30, 2015

Seeing this on all servers in cluster

[mostexclusivewebsite.com] Cluster: web proxy error:  Connection droped
[mostexclusivewebsite.com] Cluster: web proxy error:  Connection droped

I think this is normal if a user drops the connection?

@jfols jfols commented Jun 30, 2015

Hmmm...

[balance7.mostexclusivewebsite.com] Error: write after end[balance7.mostexclusivewebsite.com]
[balance7.mostexclusivewebsite.com]     at writeAfterEnd (_stream_writable.js:133:12)[balance7.mostexclusivewebsite.com]
[balance7.mostexclusivewebsite.com]     at Socket.Writable.write (_stream_writable.js:181:5)[balance7.mostexclusivewebsite.com]
[balance7.mostexclusivewebsite.com]     at Socket.write (net.js:616:40)[balance7.mostexclusivewebsite.com]
[balance7.mostexclusivewebsite.com]     at Socket.Writable.end (_stream_writable.js:341:10)[balance7.mostexclusivewebsite.com]
[balance7.mostexclusivewebsite.com]     at Socket.end (net.js:397:31)[balance7.mostexclusivewebsite.com]
[balance7.mostexclusivewebsite.com]     at App.exports.GenericApp.GenericApp.handle_error (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:148:13)[balance7.mostexclusivewebsite.com]
[balance7.mostexclusivewebsite.com]     at execute_request (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:33:30)[balance7.mostexclusivewebsite.com]
[balance7.mostexclusivewebsite.com]     at Object.req.next_filter (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:105:18)[balance7.mostexclusivewebsite.com]
[balance7.mostexclusivewebsite.com]     at Listener.webjs_handler (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:107:13)[balance7.mostexclusivewebsite.com]
[balance7.mostexclusivewebsite.com]     at Listener.handler (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:147:12)[balance7.mostexclusivewebsite.com]
[balance7.mostexclusivewebsite.com] Cluster: Exiting worker 4 with exitCode=7 signalCode=null[balance7.mostexclusivewebsite.com]
[balance7.mostexclusivewebsite.com] Cluster: Initializing worker 6 on port 9181[balance7.mostexclusivewebsite.com]

Member

@arunoda arunoda commented Jun 30, 2015

Hm..
I am pretty sure I need to fix this.
But I couldn't reproduce this locally.

Can any of you tell me how to reproduce this?

  • What is your cluster configuration?
  • How much load are you getting?

Author

@patrickml patrickml commented Jun 30, 2015

Personally, I had the cluster configured to run on a single server that has 15 GB of RAM and 8 cores. With the cluster count set to auto, it ran almost 100 clusters, so I limited it to 40 in order to keep resources available. On average there are at least 500 concurrent connections, which leads to about 150 to 200 page views per minute.

Member

@arunoda arunoda commented Jun 30, 2015

@patrickml Oh! Does auto select 100 workers for your 8-core box? That's weird. Theoretically it should be 8 workers.
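To make the expectation concrete, here is a rough sketch of how a worker-count setting could be resolved so that "auto" maps to the number of CPU cores. This is not the package's actual code: `resolveWorkerCount` is a hypothetical helper, and capping explicit values at the core count is an assumed policy for illustration only. The CLUSTER_WORKERS_COUNT variable is the one used in the mup config in this thread.

```javascript
const os = require('os');

// Hypothetical helper (not taken from meteorhacks:cluster): turn a
// CLUSTER_WORKERS_COUNT-style setting into a concrete worker count.
function resolveWorkerCount(setting, cpuCount = os.cpus().length) {
  if (setting === undefined || setting === '') {
    return 0; // no setting: run without extra workers
  }
  if (String(setting).toLowerCase() === 'auto') {
    return cpuCount; // "auto" is expected to mean one worker per core
  }
  const requested = Number.parseInt(setting, 10);
  if (!Number.isInteger(requested) || requested < 0) {
    throw new RangeError(`Invalid worker count: ${setting}`);
  }
  // Assumed policy: never start more workers than there are cores.
  return Math.min(requested, cpuCount);
}

console.log(resolveWorkerCount(process.env.CLUSTER_WORKERS_COUNT));
```

Under these assumptions, an 8-core box would get 8 workers for both "auto" and "40", which is well below the counts reported here.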

Author

@patrickml patrickml commented Jul 1, 2015

@arunoda here is a pic of 54 workers running on the server when the worker count is set to 40

https://www.dropbox.com/s/qgfjgtjuqx7o8sh/Screenshot%202015-07-01%2011.14.18.png?dl=0

...
"env": {
...
    "CLUSTER_WORKERS_COUNT" : "40"
...
  },
...

@yonilerner yonilerner commented Jul 27, 2015

I am also having this issue:

2015-07-27T21:53:52.161730988Z TypeError: Cannot read property 'writeHead' of undefined
2015-07-27T21:53:52.163626778Z     at Listener.webjs_handler (/built_app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:78:21)
2015-07-27T21:53:52.163715834Z     at Listener.handler (/built_app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:147:12)
2015-07-27T21:53:52.163715834Z     at Listener.handler (/built_app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:6:61)
2015-07-27T21:53:52.164623455Z     at Server.<anonymous> (/built_app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:154:24)
2015-07-27T21:53:52.164704768Z     at Server.new_handler (/built_app/programs/server/npm/ddp/node_modules/sockjs/lib/utils.js:86:19)
2015-07-27T21:53:52.164704768Z     at packages/ddp/stream_server.js:133:1
2015-07-27T21:53:52.164704768Z     at Array.forEach (native)
2015-07-27T21:53:52.164704768Z     at Function._.each._.forEach (packages/underscore/underscore.js:105:1)
2015-07-27T21:53:52.164704768Z     at Server.newListener (packages/ddp/stream_server.js:132:1)
2015-07-27T21:53:52.164704768Z     at packages/meteorhacks:cluster/lib/server/utils.js:11:1

@elie222 elie222 commented Jul 29, 2015

Exactly the same error here. Production site just went down for 30 minutes...

TypeError: Cannot read property 'writeHead' of undefined
    at Listener.webjs_handler (/opt/draftapp/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:78:21)
    at Listener.handler (/opt/draftapp/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:147:12)
    at Listener.handler (/opt/draftapp/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:6:61)
    at Server.<anonymous> (/opt/draftapp/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:154:24)
    at Server.new_handler (/opt/draftapp/app/programs/server/npm/ddp/node_modules/sockjs/lib/utils.js:86:19)
    at packages/ddp/stream_server.js:133:1
    at Array.forEach (native)
    at Function._.each._.forEach (packages/underscore/underscore.js:105:1)
    at Server.newListener (packages/ddp/stream_server.js:132:1)
    at packages/meteorhacks:cluster/lib/server/utils.js:11:1
TypeError: Cannot read property 'writeHead' of undefined
    at Listener.webjs_handler (/opt/draftapp/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:78:21)
    at Listener.handler (/opt/draftapp/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:147:12)
    at Listener.handler (/opt/draftapp/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:6:61)
    at Server.<anonymous> (/opt/draftapp/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:154:24)
    at Server.new_handler (/opt/draftapp/app/programs/server/npm/ddp/node_modules/sockjs/lib/utils.js:86:19)
    at packages/ddp/stream_server.js:133:1
    at Array.forEach (native)
    at Function._.each._.forEach (packages/underscore/underscore.js:105:1)
    at Server.newListener (packages/ddp/stream_server.js:132:1)
    at packages/meteorhacks:cluster/lib/server/utils.js:11:1
spiderable: phantomjs failed: Error: Command failed:
    at ChildProcess.exithandler (child_process.js:658:15)
    at ChildProcess.emit (events.js:98:17)
    at maybeClose (child_process.js:766:16)
    at Process.ChildProcess._handle.onexit (child_process.js:833:5)

@elie222 elie222 commented Jul 29, 2015

Did anyone here find a fix? Or make any progress with this issue?
@patrickml @arunoda @Latnok @jfols

@jfols jfols commented Jul 29, 2015

@elie222 not a fix, but I simply stopped using workers, deployed cluster on several small instances instead.

@elie222 elie222 commented Jul 29, 2015

@jfols and you stopped seeing this specific error completely after that?

@elie222 elie222 commented Jul 29, 2015

also, @arunoda why didn't forever restart the app automatically after the crash?

@jfols jfols commented Jul 29, 2015

@elie222 that's correct, the writeHead error hasn't occurred with workers disabled.

@maxpain maxpain commented Sep 17, 2015

+1

@XAOPT XAOPT commented Oct 6, 2015

Same error (version 1.6.9).

@rkstar rkstar commented Oct 20, 2015

i'm having this problem also! i'm only running 2 servers in my cluster. it worked fine on HTTP, but now with HTTPS it is throwing connection dropped errors. *i have a wildcard ssl cert working on both servers.

browser errors:

WebSocket connection to 'wss://production-1.getcrate.co/cluster-ddp/3c4452cd058d5e4b2a9e5e49cfcb6fc7b2fe0738/web/036/wpkh0t_b/websocket' failed: WebSocket opening handshake was canceled
15c205c902cb3143090c5445e94086181e618ae7.js?meteor_js_resource=true:39 POST https://production-1.getcrate.co/cluster-ddp/3c4452cd058d5e4b2a9e5e49cfcb6fc7b2fe0738/web/036/8st9w5e0/xhr net::ERR_INSECURE_RESPONSE
y._start @ 15c205c902cb3143090c5445e94086181e618ae7.js?meteor_js_resource=true:39
(anonymous function) @ 15c205c902cb3143090c5445e94086181e618ae7.js?meteor_js_resource=true:39
4

and server logs:

 => Starting meteor app on port:80
Cluster: connecting to 'mongodb' discovery backend
Cluster: with options:  {}
Kadira: successfully authenticated
Cluster: registering this node as service 'web'
Cluster:    endpoint url = http://<my-ip>:80
Cluster:    balancer url = https://<my-domain>
Kadira: completed instrumenting the app
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped

i've updated mupx and meteorhacks:cluster but no luck here.

@dnish dnish commented Oct 24, 2015

@arunoda Getting the same in my production environment, 3 clusters configured:

Cluster: web proxy error:  Connection droped
Exception in setInterval callback: Error: connection closed
    at Object.Future.wait (/opt/twyce/app/programs/server/node_modules/fibers/future.js:398:15)
    at Collection.update (packages/meteor/helpers.js:119:1)
    at Object._ping (packages/meteorhacks_cluster/packages/meteorhacks_cluster.js:535:1)
    at [object Object]._.extend.withValue (packages/meteor/dynamics_nodejs.js:56:1)
    at packages/meteor/timers.js:6:1
    at runWithEnvironment (packages/meteor/dynamics_nodejs.js:108:1)
    - - - - -
    at [object Object].<anonymous> (/opt/twyce/app/programs/server/npm/meteorhacks_cluster/node_modules/mongodb/lib/mongodb/connection/server.js:609:98)
    at [object Object].emit (events.js:92:17)
    at [object Object].<anonymous> (/opt/twyce/app/programs/server/npm/meteorhacks_cluster/node_modules/mongodb/lib/mongodb/connection/connection_pool.js:171:15)
    at [object Object].emit (events.js:98:17)
    at Socket.<anonymous> (/opt/twyce/app/programs/server/npm/meteorhacks_cluster/node_modules/mongodb/lib/mongodb/connection/connection.js:550:12)
    at Socket.emit (events.js:95:17)
    at TCP.close (net.js:466:12)
Exception in setInterval callback: Error: connection closed
    at Object.Future.wait (/opt/twyce/app/programs/server/node_modules/fibers/future.js:398:15)
    at Collection.update (packages/meteor/helpers.js:119:1)
    at Object._ping (packages/meteorhacks_cluster/packages/meteorhacks_cluster.js:535:1)
    at [object Object]._.extend.withValue (packages/meteor/dynamics_nodejs.js:56:1)
    at packages/meteor/timers.js:6:1
    at runWithEnvironment (packages/meteor/dynamics_nodejs.js:108:1)
    - - - - -
    at [object Object].<anonymous> (/opt/twyce/app/programs/server/npm/meteorhacks_cluster/node_modules/mongodb/lib/mongodb/connection/server.js:609:98)
    at [object Object].emit (events.js:92:17)
    at [object Object].<anonymous> (/opt/twyce/app/programs/server/npm/meteorhacks_cluster/node_modules/mongodb/lib/mongodb/connection/connection_pool.js:171:15)
    at [object Object].emit (events.js:98:17)
    at Socket.<anonymous> (/opt/twyce/app/programs/server/npm/meteorhacks_cluster/node_modules/mongodb/lib/mongodb/connection/connection.js:550:12)
    at Socket.emit (events.js:95:17)
    at TCP.close (net.js:466:12)
Exception in setInterval callback: Error: connection closed
    at Object.Future.wait (/opt/twyce/app/programs/server/node_modules/fibers/future.js:398:15)
    at Collection.update (packages/meteor/helpers.js:119:1)
    at Object._ping (packages/meteorhacks_cluster/packages/meteorhacks_cluster.js:535:1)
    at [object Object]._.extend.withValue (packages/meteor/dynamics_nodejs.js:56:1)
    at packages/meteor/timers.js:6:1
    at runWithEnvironment (packages/meteor/dynamics_nodejs.js:108:1)
    - - - - -
    at [object Object].<anonymous> (/opt/twyce/app/programs/server/npm/meteorhacks_cluster/node_modules/mongodb/lib/mongodb/connection/server.js:609:98)
    at [object Object].emit (events.js:92:17)
    at [object Object].<anonymous> (/opt/twyce/app/programs/server/npm/meteorhacks_cluster/node_modules/mongodb/lib/mongodb/connection/connection_pool.js:171:15)
    at [object Object].emit (events.js:98:17)
    at Socket.<anonymous> (/opt/twyce/app/programs/server/npm/meteorhacks_cluster/node_modules/mongodb/lib/mongodb/connection/connection.js:550:12)
    at Socket.emit (events.js:95:17)
    at TCP.close (net.js:466:12)
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped

Whole app crashed :(. We are also using SSL.

mup restart makes the app work again, but I still get some error messages:

Cluster: connecting to 'mongodb' discovery backend
Cluster: with options:  {}
Cluster: registering this node as service 'web'
Cluster:    endpoint url = http://data3-rs0.myapp.io:80
Cluster:    balancer url = https://data3-rs0.myapp.io
 >> stepping down to gid: meteoruser
 >> stepping down to uid: meteoruser
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped

Author

@patrickml patrickml commented Oct 30, 2015

@elie222 I reduced the number of clusters. Auto made so many that the server crashed, so I limited mine to 40 on a 16 GB server.

@bitomule bitomule commented Nov 24, 2015

I also have this issue. It took down the whole app :( Any ideas?

@btoueg btoueg commented Dec 3, 2015

Same here, SSL on and Cluster set to auto

Author

@patrickml patrickml commented Dec 3, 2015

@btoueg reduce the number of clusters from auto to something you think the server can handle. I counted the number running at crash time and cut it in half

@sahanDissanayake sahanDissanayake commented Feb 2, 2016

Is this fixed, guys? I'm just learning my way through using Meteor clusters.

@evolross evolross commented Jul 28, 2016

@patrickml wrote: "@btoueg reduce the number of clusters from auto to something you think the server can handle. I counted the number running at crash time and cut it in half"

Remind us why you would want to set the workers to a value greater than the number of cores?

@eportico eportico commented Mar 1, 2017

Any progress on this topic? I have the same problem in production with 2 workers.
I have 600 clients working against my server and I need to scale the server with a cluster.
Thanks,

@evolross evolross commented Mar 1, 2017

Well, the guy who wrote cluster has left the Meteor community, and there are articles now on how cluster is actually a bad solution for scaling because it handles scaling at the application level and not above it (can't find the link right now). It's also not supported on Galaxy. So I'd say this issue probably won't ever be fixed.

@dnish dnish commented Mar 1, 2017

I would recommend PM2 + NGINX. You need to start every instance in fork mode (not PM2 cluster mode, because it doesn't support sticky sessions) and then do the load balancing via an NGINX upstream.
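As a rough illustration of the fork-mode part, a PM2 ecosystem file could look something like the sketch below. Everything specific in it is an assumption made for the example: the instance count, the port range, the app names, and the placeholder ROOT_URL and MONGO_URL values. It assumes a built Meteor bundle whose entry point is main.js.

```javascript
// ecosystem.config.js (sketch; all concrete values are placeholders)
const INSTANCES = 4;    // assumed number of app processes
const BASE_PORT = 3001; // assumed first port; each process gets its own

// Build one fork-mode app entry per port, so that NGINX can address
// every process individually.
function buildApps(count, basePort) {
  const apps = [];
  for (let i = 0; i < count; i++) {
    apps.push({
      name: `meteor-${i}`,   // hypothetical process name
      script: 'main.js',     // entry point of the built Meteor bundle
      exec_mode: 'fork',     // fork mode, not PM2 cluster mode
      instances: 1,
      env: {
        PORT: String(basePort + i),
        ROOT_URL: 'https://example.com',                // placeholder
        MONGO_URL: 'mongodb://localhost:27017/meteor',  // placeholder
      },
    });
  }
  return apps;
}

const apps = buildApps(INSTANCES, BASE_PORT);
module.exports = { apps };
```

On the NGINX side, one common way to get sticky sessions is an upstream block that lists these ports and uses ip_hash, together with the usual WebSocket upgrade headers in the proxied location.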

@lmachens lmachens commented May 7, 2017

I have the same issue. Does anyone have an idea?

@kakadais kakadais commented Apr 20, 2021

I've come here too late. I agree this would be better handled at the web server level, but it could still be a useful and convenient approach.

So sad there's no other approach and no one driving this ;(
