Workers Keep Crashing Randomly #46

Open
patrickml opened this issue Apr 16, 2015 · 46 comments

Comments

@patrickml
commented Apr 16, 2015

Randomly, all of the workers will kill themselves with errors similar to these:

[104.236.27.101] Error: write after end
    at writeAfterEnd (_stream_writable.js:133:12)
    at Socket.Writable.write (_stream_writable.js:181:5)
    at Socket.write (net.js:616:40)
    at Socket.Writable.end (_stream_writable.js:341:10)
    at Socket.end (net.js:397:31)
    at App.exports.GenericApp.GenericApp.handle_error (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:148:13)
    at execute_request (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:33:30)
    at Object.req.next_filter (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:105:18)
    at Listener.webjs_handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:107:13)
    at Listener.handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:147:12)
Cluster: Exiting worker 7 with exitCode=7 signalCode=null
[104.236.27.101] TypeError: Cannot read property 'writeHead' of undefined
    at Listener.webjs_handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:78:21)
    at Listener.handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:147:12)
    at Listener.handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:6:61)
    at Server.<anonymous> (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:154:24)
    at Server.new_handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/utils.js:86:19)
    at packages/ddp/stream_server.js:133:1
    at Array.forEach (native)
    at Function._.each._.forEach (packages/underscore/underscore.js:105:1)
    at Server.newListener (packages/ddp/stream_server.js:132:1)
    at packages/meteorhacks:cluster/lib/server/utils.js:11:1

This error may be unrelated

[104.236.27.101]     at Object.Meteor._nodeCodeMustBeInFiber (packages/meteor/dynamics_nodejs.js:9:1)
    at [object Object]._.extend.get (packages/meteor/dynamics_nodejs.js:21:1)
    at [object Object].RouteController.lookupOption (packages/iron:router/lib/route_controller.js:66:1)
    at new Controller.extend.constructor (packages/iron:router/lib/route_controller.js:26:1)
    at [object Object].ctor (packages/iron:core/lib/iron_core.js:88:1)
    at Function.Router.createController (packages/iron:router/lib/router.js:201:1)
    at Function.Router.dispatch (packages/iron:router/lib/router_server.js:39:1)
    at Object.router (packages/iron:router/lib/router.js:15:1)
    at next (/opt/meteor/app/programs/server/npm/webapp/node_modules/connect/lib/proto.js:190:15)
    at Object.Package [as handle] (packages/cfs:http-methods/http.methods.server.api.js:420:1)

It is causing instability on our website. Is there a known fix for this?

@arunoda

Member

commented Apr 17, 2015

I haven't seen this error before.
Is there any way I can reproduce this locally?

@patrickml

Author

commented Apr 17, 2015

If I knew what was causing the issue, I'd be able to tell you how to reproduce it. The error seems to happen once our client gets to around 500+ concurrent connections on the site, but it doesn't always happen. https://www.dropbox.com/s/xg8d355qzecwlp0/Screenshot%202015-04-17%2010.58.08.png?dl=0

In order to stop the endless loop of errors I have to restart our entire server. mup restart doesn't stop the issue.

@Latnok

commented May 28, 2015

I have some errors

Exiting worker ## with exitCode=7 signalCode=null
TypeError: Cannot read property 'writeHead' of undefined
    at Listener.webjs_handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:78:21)
    at Listener.handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:147:12)
    at Listener.handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:6:61)
    at Server.<anonymous> (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:154:24)
    at Server.new_handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/utils.js:86:19)
    at packages/ddp/stream_server.js:133:1
    at Array.forEach (native)
    at Function._.each._.forEach (packages/underscore/underscore.js:105:1)
    at Server.newListener (packages/ddp/stream_server.js:132:1)
    at packages/meteorhacks:cluster/lib/server/utils.js:11:1

6 node processes, each at 100% CPU load.

@jfols

commented Jun 30, 2015

Also seeing Error: write after end under very high utilization.

@jfols

commented Jun 30, 2015

Perhaps this will help:

[mostexclusivewebsite.com]     at Server.new_handler (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/utils.js:86:19)
    at packages/ddp/stream_server.js:133:1
    at Array.forEach (native)
    at Function._.each._.forEach (packages/underscore/underscore.js:105:1)
    at Server.newListener (packages/ddp/stream_server.js:132:1)
    at packages/meteorhacks:cluster/lib/server/utils.js:11:1
Cluster: Exiting worker 20 with exitCode=7 signalCode=null
Cluster: Initializing worker 24 on port 15374
Kadira: successfully authenticated
Kadira: completed instrumenting the app
[mostexclusivewebsite.com] Error: write after end
    at writeAfterEnd (_stream_writable.js:133:12)
    at Socket.Writable.write (_stream_writable.js:181:5)
    at Socket.write (net.js:616:40)
    at Socket.Writable.end (_stream_writable.js:341:10)
    at Socket.end (net.js:397:31)
    at App.exports.GenericApp.GenericApp.handle_error (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:148:13)
    at execute_request (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:33:30)
    at Object.req.next_filter (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:105:18)
    at Listener.webjs_handler (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:107:13)
    at Listener.handler (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:147:12)
...
@arunoda

Member

commented Jun 30, 2015

I pushed a possible fix for this recently. Are you using the latest version?


@jfols

commented Jun 30, 2015

I think so, 0.11.0? The site is currently on the front page of reddit, and I got an email from Digital Ocean about a DDoS attack on the server. The server was running hard for several hours with lots of write after end errors.

@arunoda

Member

commented Jun 30, 2015

oh!


@arunoda

Member

commented Jun 30, 2015

The current version is 1.6.8, so yours may be a very old version.
But I'll debug more.

@jfols

commented Jun 30, 2015

Just ran npm update -g, trying again :)

npm view mup version
0.11.0
npm -v mup
2.12.0
@arunoda

Member

commented Jun 30, 2015

Nope, I don't mean mup. I mean meteorhacks:cluster.

Go to your project and run: meteor update meteorhacks:cluster

@jfols

commented Jun 30, 2015

Oh duh... yeah, the package is up to date. The versions file shows meteorhacks:cluster@1.6.8

@arunoda

Member

commented Jun 30, 2015

Okay. Let me debug a bit more.

@jfols

commented Jun 30, 2015

I did find this http://stackoverflow.com/questions/27769842/write-after-end-error-in-node-js-webserver that seems to be what's going on. I think the question is: is it in cluster or sockjs?
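
For anyone following along, the failure mode described in that question can be sketched in a few lines of Node. This is only an illustration, not the sockjs or cluster code: `safeEnd` is a hypothetical helper, and `writableEnded` is an assumption about the runtime, since it only exists on newer Node versions (12.9+) and not on the Node 0.10 seen in the traces above.

```javascript
const { PassThrough } = require('stream');

// Hypothetical guard: only end the stream if it can still accept data.
// Returns true when end() was actually called, false when it was skipped.
function safeEnd(stream, data) {
  // A missing stream mirrors the "of undefined" TypeError in this thread.
  if (!stream) return false;
  // An ended or destroyed stream would raise "write after end" on another write.
  if (stream.destroyed || stream.writableEnded || !stream.writable) return false;
  stream.end(data);
  return true;
}

// PassThrough stands in for a client socket here.
const socket = new PassThrough();
console.log(safeEnd(socket, 'first'));   // true: stream was open
console.log(safeEnd(socket, 'second'));  // false: already ended, write skipped
console.log(safeEnd(undefined, 'x'));    // false: nothing to write to
```

A guard like this only hides the symptom. It doesn't tell us whether the double end happens in cluster or in sockjs.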

@arunoda

Member

commented Jun 30, 2015

Do you have this message in the logs? "Cluster: web proxy error:"

@jfols

commented Jun 30, 2015

Recent log snapshot:

Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
WS proxying failed! to:  http://mostexclusivewebsite.com:80 err: getaddrinfo ENOTFOUND
Cluster: web proxy error:  Connection droped
WS proxying failed! to:  http://mostexclusivewebsite.com:80 err: getaddrinfo ENOTFOUND
Cluster: web proxy error:  Connection droped
WS proxying failed! to:  http://mostexclusivewebsite.com:80 err: getaddrinfo ENOTFOUND
[balance5.mostexclusivewebsite.com] Cluster: web proxy error:  Connection droped
[mostexclusivewebsite.com] Cluster: web proxy error:  Connection droped
[mostexclusivewebsite.com] Cluster: web proxy error:  Connection droped
[mostexclusivewebsite.com] Cluster: web proxy error:  Connection droped
[mostexclusivewebsite.com] Cluster: web proxy error:  Connection droped
[mostexclusivewebsite.com] Cluster: web proxy error:  Connection droped
@arunoda

Member

commented Jun 30, 2015

Okay. Found something noicy.
Let's try to do a fix.

@arunoda

Member

commented Jun 30, 2015

Published a new version. Check that.

@jfols

commented Jun 30, 2015

So fyi I was seeing the error while maxing out a single server running cluster in multi-core mode. Now I've got a real cluster up and things are running smoothly. It may just be something to do with CPU limits and IO in node.js?

Redeploying now with the new version, we'll see what happens 😄

@jfols

commented Jun 30, 2015

Seeing this on all servers in cluster

[mostexclusivewebsite.com] Cluster: web proxy error:  Connection droped
[mostexclusivewebsite.com] Cluster: web proxy error:  Connection droped

I think this is normal if a user drops the connection?

@jfols

commented Jun 30, 2015

Hmmm...

[balance7.mostexclusivewebsite.com] Error: write after end
[balance7.mostexclusivewebsite.com]     at writeAfterEnd (_stream_writable.js:133:12)
[balance7.mostexclusivewebsite.com]     at Socket.Writable.write (_stream_writable.js:181:5)
[balance7.mostexclusivewebsite.com]     at Socket.write (net.js:616:40)
[balance7.mostexclusivewebsite.com]     at Socket.Writable.end (_stream_writable.js:341:10)
[balance7.mostexclusivewebsite.com]     at Socket.end (net.js:397:31)
[balance7.mostexclusivewebsite.com]     at App.exports.GenericApp.GenericApp.handle_error (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:148:13)
[balance7.mostexclusivewebsite.com]     at execute_request (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:33:30)
[balance7.mostexclusivewebsite.com]     at Object.req.next_filter (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:105:18)
[balance7.mostexclusivewebsite.com]     at Listener.webjs_handler (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:107:13)
[balance7.mostexclusivewebsite.com]     at Listener.handler (/opt/mostexclusivewebsite/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:147:12)
[balance7.mostexclusivewebsite.com] Cluster: Exiting worker 4 with exitCode=7 signalCode=null
[balance7.mostexclusivewebsite.com] Cluster: Initializing worker 6 on port 9181
@arunoda

Member

commented Jun 30, 2015

Hm..
I am pretty sure I need to fix this.
But I couldn't reproduce this locally.

Can any of you tell me how to reproduce this?

  • What's your cluster configuration?
  • How much load are you getting?
@patrickml

Author

commented Jun 30, 2015

For me personally, I had the cluster configured to run on a single server
with 15 GB of RAM and 8 cores. When I set the cluster count to auto, it
ran almost 100 workers, so I limited it to 40 in order to keep resources
available. On average there are at least 500 concurrent connections,
which leads to about 150 to 200 page views per minute.

@arunoda

Member

commented Jun 30, 2015

@patrickml Oh! Does auto select 100 workers for your 8-core box? That's weird. Theoretically it should be 8 workers.
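
To make the expectation concrete, here is a rough sketch of how a worker count setting could be resolved against the core count. It is not the actual meteorhacks:cluster code: `resolveWorkerCount` is a hypothetical helper, and capping an explicit number at the core count is an assumption made for illustration, not documented behavior.

```javascript
const os = require('os');

// Hypothetical helper: turn a CLUSTER_WORKERS_COUNT-style value into a number.
// cpuCount defaults to the number of logical cores reported by the OS.
function resolveWorkerCount(value, cpuCount = os.cpus().length) {
  // Unset or empty: treat workers as disabled.
  if (value === undefined || value === '') return 0;
  // "auto": one worker per core, which is 8 on an 8-core box.
  if (String(value).toLowerCase() === 'auto') return cpuCount;
  const requested = parseInt(value, 10);
  // Anything that is not a non-negative integer is ignored.
  if (!Number.isInteger(requested) || requested < 0) return 0;
  // Assumption: never start more workers than there are cores.
  return Math.min(requested, cpuCount);
}

console.log(resolveWorkerCount('auto', 8)); // 8
console.log(resolveWorkerCount('40', 8));   // 8 (capped)
console.log(resolveWorkerCount('4', 8));    // 4
```

With a cap like this, a value of 40 on an 8-core server would still start only 8 workers.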

@patrickml

Author

commented Jul 1, 2015

@arunoda here is a pic of 54 workers running on the server when the worker count is set to 40

https://www.dropbox.com/s/qgfjgtjuqx7o8sh/Screenshot%202015-07-01%2011.14.18.png?dl=0

...
"env": {
...
    "CLUSTER_WORKERS_COUNT" : "40"
...
  },
...
@yonilerner

commented Jul 27, 2015

I am also having this issue:

2015-07-27T21:53:52.161730988Z TypeError: Cannot read property 'writeHead' of undefined
2015-07-27T21:53:52.163626778Z     at Listener.webjs_handler (/built_app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:78:21)
2015-07-27T21:53:52.163715834Z     at Listener.handler (/built_app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:147:12)
2015-07-27T21:53:52.163715834Z     at Listener.handler (/built_app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:6:61)
2015-07-27T21:53:52.164623455Z     at Server.<anonymous> (/built_app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:154:24)
2015-07-27T21:53:52.164704768Z     at Server.new_handler (/built_app/programs/server/npm/ddp/node_modules/sockjs/lib/utils.js:86:19)
2015-07-27T21:53:52.164704768Z     at packages/ddp/stream_server.js:133:1
2015-07-27T21:53:52.164704768Z     at Array.forEach (native)
2015-07-27T21:53:52.164704768Z     at Function._.each._.forEach (packages/underscore/underscore.js:105:1)
2015-07-27T21:53:52.164704768Z     at Server.newListener (packages/ddp/stream_server.js:132:1)
2015-07-27T21:53:52.164704768Z     at packages/meteorhacks:cluster/lib/server/utils.js:11:1
@elie222

commented Jul 29, 2015

Exactly the same error here. Production site just went down for 30 minutes...

TypeError: Cannot read property 'writeHead' of undefined
    at Listener.webjs_handler (/opt/draftapp/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:78:21)
    at Listener.handler (/opt/draftapp/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:147:12)
    at Listener.handler (/opt/draftapp/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:6:61)
    at Server.<anonymous> (/opt/draftapp/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:154:24)
    at Server.new_handler (/opt/draftapp/app/programs/server/npm/ddp/node_modules/sockjs/lib/utils.js:86:19)
    at packages/ddp/stream_server.js:133:1
    at Array.forEach (native)
    at Function._.each._.forEach (packages/underscore/underscore.js:105:1)
    at Server.newListener (packages/ddp/stream_server.js:132:1)
    at packages/meteorhacks:cluster/lib/server/utils.js:11:1
TypeError: Cannot read property 'writeHead' of undefined
    at Listener.webjs_handler (/opt/draftapp/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:78:21)
    at Listener.handler (/opt/draftapp/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:147:12)
    at Listener.handler (/opt/draftapp/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:6:61)
    at Server.<anonymous> (/opt/draftapp/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:154:24)
    at Server.new_handler (/opt/draftapp/app/programs/server/npm/ddp/node_modules/sockjs/lib/utils.js:86:19)
    at packages/ddp/stream_server.js:133:1
    at Array.forEach (native)
    at Function._.each._.forEach (packages/underscore/underscore.js:105:1)
    at Server.newListener (packages/ddp/stream_server.js:132:1)
    at packages/meteorhacks:cluster/lib/server/utils.js:11:1
spiderable: phantomjs failed: Error: Command failed:
    at ChildProcess.exithandler (child_process.js:658:15)
    at ChildProcess.emit (events.js:98:17)
    at maybeClose (child_process.js:766:16)
    at Process.ChildProcess._handle.onexit (child_process.js:833:5)
@elie222

commented Jul 29, 2015

Did anyone here find a fix? Or make any progress with this issue?
@patrickml @arunoda @Latnok @jfols

@jfols

commented Jul 29, 2015

@elie222 not a fix, but I simply stopped using workers and deployed cluster on several small instances instead.

@elie222

commented Jul 29, 2015

@jfols and you stopped seeing this specific error completely after that?

@elie222

commented Jul 29, 2015

also, @arunoda why didn't forever restart the app automatically after the crash?

@jfols

commented Jul 29, 2015

@elie222 that's correct, the writeHead error hasn't occurred with workers disabled.

@Maxpain177

commented Sep 17, 2015

+1

@XAOPT

commented Oct 6, 2015

Same error (version 1.6.9).

@rkstar

commented Oct 20, 2015

I'm having this problem too! I'm only running 2 servers in my cluster. It worked fine over HTTP, but now with HTTPS it is throwing connection dropped errors. (I have a wildcard SSL cert working on both servers.)

browser errors:

WebSocket connection to 'wss://production-1.getcrate.co/cluster-ddp/3c4452cd058d5e4b2a9e5e49cfcb6fc7b2fe0738/web/036/wpkh0t_b/websocket' failed: WebSocket opening handshake was canceled
15c205c902cb3143090c5445e94086181e618ae7.js?meteor_js_resource=true:39 POST https://production-1.getcrate.co/cluster-ddp/3c4452cd058d5e4b2a9e5e49cfcb6fc7b2fe0738/web/036/8st9w5e0/xhr net::ERR_INSECURE_RESPONSE
    y._start @ 15c205c902cb3143090c5445e94086181e618ae7.js?meteor_js_resource=true:39
    (anonymous function) @ 15c205c902cb3143090c5445e94086181e618ae7.js?meteor_js_resource=true:39

and server logs:

 => Starting meteor app on port:80
Cluster: connecting to 'mongodb' discovery backend
Cluster: with options:  {}
Kadira: successfully authenticated
Cluster: registering this node as service 'web'
Cluster:    endpoint url = http://<my-ip>:80
Cluster:    balancer url = https://<my-domain>
Kadira: completed instrumenting the app
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped

i've updated mupx and meteorhacks:cluster but no luck here.

@dnish

commented Oct 24, 2015

@arunoda Getting the same in my production environment, 3 clusters configured:

Cluster: web proxy error:  Connection droped
Exception in setInterval callback: Error: connection closed
    at Object.Future.wait (/opt/twyce/app/programs/server/node_modules/fibers/future.js:398:15)
    at Collection.update (packages/meteor/helpers.js:119:1)
    at Object._ping (packages/meteorhacks_cluster/packages/meteorhacks_cluster.js:535:1)
    at [object Object]._.extend.withValue (packages/meteor/dynamics_nodejs.js:56:1)
    at packages/meteor/timers.js:6:1
    at runWithEnvironment (packages/meteor/dynamics_nodejs.js:108:1)
    - - - - -
    at [object Object].<anonymous> (/opt/twyce/app/programs/server/npm/meteorhacks_cluster/node_modules/mongodb/lib/mongodb/connection/server.js:609:98)
    at [object Object].emit (events.js:92:17)
    at [object Object].<anonymous> (/opt/twyce/app/programs/server/npm/meteorhacks_cluster/node_modules/mongodb/lib/mongodb/connection/connection_pool.js:171:15)
    at [object Object].emit (events.js:98:17)
    at Socket.<anonymous> (/opt/twyce/app/programs/server/npm/meteorhacks_cluster/node_modules/mongodb/lib/mongodb/connection/connection.js:550:12)
    at Socket.emit (events.js:95:17)
    at TCP.close (net.js:466:12)
Exception in setInterval callback: Error: connection closed
    at Object.Future.wait (/opt/twyce/app/programs/server/node_modules/fibers/future.js:398:15)
    at Collection.update (packages/meteor/helpers.js:119:1)
    at Object._ping (packages/meteorhacks_cluster/packages/meteorhacks_cluster.js:535:1)
    at [object Object]._.extend.withValue (packages/meteor/dynamics_nodejs.js:56:1)
    at packages/meteor/timers.js:6:1
    at runWithEnvironment (packages/meteor/dynamics_nodejs.js:108:1)
    - - - - -
    at [object Object].<anonymous> (/opt/twyce/app/programs/server/npm/meteorhacks_cluster/node_modules/mongodb/lib/mongodb/connection/server.js:609:98)
    at [object Object].emit (events.js:92:17)
    at [object Object].<anonymous> (/opt/twyce/app/programs/server/npm/meteorhacks_cluster/node_modules/mongodb/lib/mongodb/connection/connection_pool.js:171:15)
    at [object Object].emit (events.js:98:17)
    at Socket.<anonymous> (/opt/twyce/app/programs/server/npm/meteorhacks_cluster/node_modules/mongodb/lib/mongodb/connection/connection.js:550:12)
    at Socket.emit (events.js:95:17)
    at TCP.close (net.js:466:12)
Exception in setInterval callback: Error: connection closed
    at Object.Future.wait (/opt/twyce/app/programs/server/node_modules/fibers/future.js:398:15)
    at Collection.update (packages/meteor/helpers.js:119:1)
    at Object._ping (packages/meteorhacks_cluster/packages/meteorhacks_cluster.js:535:1)
    at [object Object]._.extend.withValue (packages/meteor/dynamics_nodejs.js:56:1)
    at packages/meteor/timers.js:6:1
    at runWithEnvironment (packages/meteor/dynamics_nodejs.js:108:1)
    - - - - -
    at [object Object].<anonymous> (/opt/twyce/app/programs/server/npm/meteorhacks_cluster/node_modules/mongodb/lib/mongodb/connection/server.js:609:98)
    at [object Object].emit (events.js:92:17)
    at [object Object].<anonymous> (/opt/twyce/app/programs/server/npm/meteorhacks_cluster/node_modules/mongodb/lib/mongodb/connection/connection_pool.js:171:15)
    at [object Object].emit (events.js:98:17)
    at Socket.<anonymous> (/opt/twyce/app/programs/server/npm/meteorhacks_cluster/node_modules/mongodb/lib/mongodb/connection/connection.js:550:12)
    at Socket.emit (events.js:95:17)
    at TCP.close (net.js:466:12)
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped

Whole app crashed :(. We are also using SSL.

mup restart makes the app work again, but I still get some error messages:

Cluster: connecting to 'mongodb' discovery backend
Cluster: with options:  {}
Cluster: registering this node as service 'web'
Cluster:    endpoint url = http://data3-rs0.myapp.io:80
Cluster:    balancer url = https://data3-rs0.myapp.io
 >> stepping down to gid: meteoruser
 >> stepping down to uid: meteoruser
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
Cluster: web proxy error:  Connection droped
@patrickml

Author

commented Oct 30, 2015

@elie222 I reduced the number of workers. Auto created so many that the server crashed, so I limited mine to 40 on a 16 GB server.

@bitomule

commented Nov 24, 2015

I also have this issue. It took down the whole app :( Any ideas?

@btoueg

commented Dec 3, 2015

Same here, SSL on and Cluster set to auto

@patrickml

Author

commented Dec 3, 2015

@btoueg reduce the number of clusters from auto to something you think the server can handle. I counted the number running at crash time and cut it in half

@sahanDissanayake

commented Feb 2, 2016

Is this fixed, guys? I'm just learning my way through using Meteor clusters.

@evolross

commented Jul 28, 2016

> @btoueg reduce the number of clusters from auto to something you think the server can handle. I counted the number running at crash time and cut it in half

Remind us why you would want to set the workers to a value greater than the number of cores?

@eportico

commented Mar 1, 2017

Any progress on this topic? I have the same problem in production with 2 workers.
I have 600 clients working against my server and I need to scale the server with a cluster.
Thanks,

@evolross

commented Mar 1, 2017

Well, the guy who wrote cluster has left the Meteor community, and there are articles now on how cluster is actually a bad solution for scaling because it handles scaling at the application level and not above it (can't find the link right now). It's also not supported on Galaxy. So I'd say this issue probably won't ever be fixed.

@dnish

commented Mar 1, 2017

I would recommend PM2 + NGINX. You need to start every instance in fork mode (not PM2 cluster mode, because it doesn't support sticky sessions) and then do the load balancing via an NGINX upstream.

@lmachens

commented May 7, 2017

I have the same issue. Does anyone have an idea?
