
Meteor will Self-Terminate Under High CPU Use and/or Heavy Method Call Stress #1302

Closed
meawoppl opened this issue Aug 13, 2013 · 4 comments

@meawoppl
Contributor

TL;DR: The current proxy and sub-process management can kill otherwise well-behaved Meteor projects when they come under high CPU load. The characteristic behavior looks like this:

Failed to receive keepalive! Exiting.
=> Exited with code: 1

events.js:71
...
less useful stack barf

The outer proxy that Meteor sets up pings the inner running process every two seconds with the letter "k" to signify that it should not commit ritual seppuku to please its master:

// Keepalive so server can detect when we die
var timer = setInterval(function () {
  try {
    if (proc && proc.pid && proc.stdin && proc.stdin.write)
      proc.stdin.write('k');
  } catch (e) {
    // do nothing. this fails when the process dies.
  }
}, 2000);

On the inner server side, this is handled by the following function:

var init_keepalive = function () {
  var keepalive_count = 0;

  process.stdin.on('data', function (data) {
    keepalive_count = 0;
  });

  process.stdin.resume();

  setInterval(function () {
    keepalive_count ++;
    if (keepalive_count >= 3) {
      console.log("Failed to receive keepalive! Exiting.");
      process.exit(1);
    }
  }, 3000);
};

which runs only when a hard-wired sub-process environment argument is set, found on this line.

This wiring introduces three fragile dependencies:
- the outer shell must forward the keepalive bytes promptly,
- the outer proxy itself must be performing correctly, and
- a number of these callbacks can stack up and be executed at roughly the same time (libev manual for the interested); the high-level Node.js disclaimer that "Node.js makes no guarantees about the exact timing of when the callback will fire, nor of the ordering things will fire in. The callback will be called as close as possible to the time specified." pretty much wraps that up. A small sketch of that last point follows this list.
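
For illustration only (this is not Meteor code, and the 5-second busy loop and 1-second interval are arbitrary numbers for the demo): a CPU-bound stretch of work stalls every timer on the event loop at once.

var start = Date.now();

// A 1-second interval that should tick almost immediately.
setInterval(function () {
  console.log("tick at", Date.now() - start, "ms");
}, 1000);

// Simulate roughly 5 seconds of heavy CPU work on the main thread.
var stop = Date.now() + 5000;
while (Date.now() < stop) {
  Math.sqrt(Math.random());
}

// The first tick prints at about 5000 ms rather than 1000 ms; the inner
// server's keepalive interval and stdin handler are stalled the same way.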

Anyway, my suggestion (I will write a pull request if desired) would be to rewrite the inner server function with a setTimeout-based keepalive poll that re-registers itself each time it runs, roughly as sketched below.
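
A minimal sketch of what I have in mind, carrying over the 3-second poll and 3-miss threshold from the current code (names and thresholds are placeholders, not a final design):

var init_keepalive = function () {
  var keepalive_count = 0;

  process.stdin.on('data', function (data) {
    keepalive_count = 0;
  });

  process.stdin.resume();

  var poll = function () {
    keepalive_count++;
    if (keepalive_count >= 3) {
      console.log("Failed to receive keepalive! Exiting.");
      process.exit(1);
    }
    // Re-arm only after this check has actually run, so a stalled event
    // loop pushes the whole schedule back instead of letting several
    // overdue checks run in quick succession.
    setTimeout(poll, 3000);
  };

  setTimeout(poll, 3000);
};

The point is that the next check is always scheduled relative to when the previous one actually ran, rather than on a fixed wall-clock grid that a busy event loop can fall far behind.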

The fact that this is passed as an environment variable suggests that it is optional behavior and might not be used in the bundled version of apps?

@andreioprisan

I ran into this as well and would love to see it done. If you write it, at least I'll use it :)

@meawoppl
Contributor Author

I am not 100% sure that it will fix everything without introducing some other oddity. In the setTimeout version there is a possibility that the outer server takes a rather long time to terminate the inner one when there is a real problem plus high CPU. I have been told that the bundled version of an app does not have this wrapping layer, but I have not dug into the machinery enough to be certain this fixes my problems.

@meawoppl
Contributor Author

NB: this can also happen when you accidentally subscribe to too many records, as in this SO question.

At the very least, the error message should be a ton more informative, since it will most likely be hit in a dev environment. Writing a pull request.

EDIT: Will post the pull request after GitHub finishes getting DDoS'ed...

@glasser
Contributor

glasser commented Aug 19, 2013

Note that this is just a behavior of the development-mode meteor run (and of any hosting environment that chooses to turn on the keepalive option, which probably isn't most of them), not a production issue. And in any case, if your Node process is churning CPU for seconds, it isn't going to be able to respond to any network traffic either.
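
A quick way to see this (illustration only, not Meteor code; the port and durations are arbitrary): a plain Node HTTP server stops answering requests for as long as the main thread is busy.

var http = require('http');

// A trivial server on an arbitrary port for the demo.
http.createServer(function (req, res) {
  res.end('ok\n');
}).listen(3000);

// While this busy loop runs, requests to the server hang until the loop
// finishes, exactly like the keepalive reads on stdin.
var stop = Date.now() + 5000;
while (Date.now() < stop) {
  Math.sqrt(Math.random());
}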

@glasser closed this as completed Apr 22, 2014