New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parent process killed when only child process should be terminated #7

Open
lgreenspan opened this Issue Jun 3, 2010 · 6 comments

Comments

Projects
None yet
3 participants
@lgreenspan

lgreenspan commented Jun 3, 2010

when removing the line "process.exit(0);" from examples/while_true.js and run the script, we get:

this will wait up to 5s before killing the script ...
DEBUG: request queue: 1
DEBUG: spawned pid 6511
DEBUG: --> pid 6511: {"cmd":"compile","scriptName":"while forever loop should not run forever","script":"while (true) {}"}
[JEFE] WARNING! memory monitoring problem on darwin: Error: unsupported
DEBUG: --> pid 6511: {"cmd":"run","scriptName":"while forever loop should not run forever","sandbox":{}}
DEBUG: killing 6511; handled 1 requests.
error = killed: too much time taken
sys.p will be removed in future versions of Node. Use sys.puts(sys.inspect()) instead.

{ runs: 0, totalRunTime: 0, kills: { time: 1 } }
/Users/lgreenspan/workspaces/container/lib/jefe/lib/jefe.js:205
      throw new Error("Unexpected Error: " + response.reason);
            ^
Error: Unexpected Error: child died while running script
    at ChildHandler.callback (/Users/lgreenspan/workspaces/container/lib/jefe/lib/jefe.js:205:13)
    at ChildProcess.<anonymous> (/Users/lgreenspan/workspaces/container/lib/jefe/lib/handler.js:50:12)
    at ChildProcess.emit (events:42:20)
    at Stream.<anonymous> (child_process:108:12)
    at Stream.emit (events:25:26)
    at net:1001:12
    at EventEmitter._tickCallback (node.js:48:25)
    at node.js:204:9
@lgreenspan

This comment has been minimized.

Show comment
Hide comment
@lgreenspan

lgreenspan Jun 3, 2010

Using jefe (ef26fd8) and node.js (55d73521891c5dbc9e7bd697b386e00620eac1df)

lgreenspan commented Jun 3, 2010

Using jefe (ef26fd8) and node.js (55d73521891c5dbc9e7bd697b386e00620eac1df)

@lgreenspan

This comment has been minimized.

Show comment
Hide comment
@lgreenspan

lgreenspan Jun 8, 2010

Any news on this one?

lgreenspan commented Jun 8, 2010

Any news on this one?

@fictorial

This comment has been minimized.

Show comment
Hide comment
@fictorial

fictorial Jun 8, 2010

Owner

Sorry, I'm swamped at this time. Can you dive in?

Owner

fictorial commented Jun 8, 2010

Sorry, I'm swamped at this time. Can you dive in?

@lgreenspan

This comment has been minimized.

Show comment
Hide comment
@lgreenspan

lgreenspan Jun 8, 2010

OK. Here are my findings:

Jefe sets a timeout in where it kills the child process. Once this happens the exit callback in process.js is invoked. However the this reference in there is not going to ChildHandler, but to node.js's ChildProcess. This makes it impossible in exit handler to differentiate between a SIGTERM which came from ChildHandler.kill vs. an external SIGTERM. (NB: this.available will also always be false as this property doesn't exist in ChildProcess). To disambiguate we may set a property in ChildProcess when ChildHandler.kill is invoked. The we know in the exit handler if the exit is caused by us or not:

diff --git a/lib/handler.js b/lib/handler.js
index 7529410..fd141b0 100644
--- a/lib/handler.js
+++ b/lib/handler.js
@@ -46,6 +46,11 @@ function ChildHandler(proc) {
   // the operation.

   proc.addListener("exit", function (code) {
+    if (this.suicide) {
+      if (common.debugMode)
+        sys.debug("committed suicide.");
+      return;
+    }
     if (!this.available && self.callback) 
       self.callback({ ok: false
                     , reason: common.ERR_UNEXPECTED_DEATH 
@@ -210,6 +215,7 @@ ChildHandler.prototype.kill = function () {
   this.available = false;
   this.cancelKillSwitches();

+  this.process.suicide = true;
   this.process.stdin.end();
   this.process.kill();
 };

lgreenspan commented Jun 8, 2010

OK. Here are my findings:

Jefe sets a timeout in where it kills the child process. Once this happens the exit callback in process.js is invoked. However the this reference in there is not going to ChildHandler, but to node.js's ChildProcess. This makes it impossible in exit handler to differentiate between a SIGTERM which came from ChildHandler.kill vs. an external SIGTERM. (NB: this.available will also always be false as this property doesn't exist in ChildProcess). To disambiguate we may set a property in ChildProcess when ChildHandler.kill is invoked. The we know in the exit handler if the exit is caused by us or not:

diff --git a/lib/handler.js b/lib/handler.js
index 7529410..fd141b0 100644
--- a/lib/handler.js
+++ b/lib/handler.js
@@ -46,6 +46,11 @@ function ChildHandler(proc) {
   // the operation.

   proc.addListener("exit", function (code) {
+    if (this.suicide) {
+      if (common.debugMode)
+        sys.debug("committed suicide.");
+      return;
+    }
     if (!this.available && self.callback) 
       self.callback({ ok: false
                     , reason: common.ERR_UNEXPECTED_DEATH 
@@ -210,6 +215,7 @@ ChildHandler.prototype.kill = function () {
   this.available = false;
   this.cancelKillSwitches();

+  this.process.suicide = true;
   this.process.stdin.end();
   this.process.kill();
 };
@jamesmacaulay

This comment has been minimized.

Show comment
Hide comment
@jamesmacaulay

jamesmacaulay Aug 27, 2010

Just chiming in to say that I ran into the same problem, tried this patch out, and it seems to work fine.

jamesmacaulay commented Aug 27, 2010

Just chiming in to say that I ran into the same problem, tried this patch out, and it seems to work fine.

@fictorial

This comment has been minimized.

Show comment
Hide comment
@fictorial

fictorial Aug 31, 2010

Owner

Ok good to know. I have been on daddy duty plus trying to launch a startup for a few months. My apologies for not integrating the fix yet.

Owner

fictorial commented Aug 31, 2010

Ok good to know. I have been on daddy duty plus trying to launch a startup for a few months. My apologies for not integrating the fix yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment