This repository has been archived by the owner. It is now read-only.

passing unix signals to child rather than dying? #4603

Closed
evantahler opened this Issue Feb 4, 2014 · 36 comments

Comments

Projects
None yet
@evantahler

evantahler commented Feb 4, 2014

Here's some interesting behavior I've seen when running an application which does some slower shutdown operations when receiving a termination signal.

When running npm start in a *nix terminal (tested on ubuntu and OSX), the SIGINT (ctrl+c) command is passed from the parent npm process to the child (my running application). npm instantly exits, and then my application handles the passed-down signal and eventually exits. However, because the stdout/err of the session was managed by npm, strange things can happen if my application still writes (or console.outs).

At the very least, the terminal which started the application will interpret the npm process' death as a return and draw the next line. However, the child's stdout is still bound to that session. This means you can start another application in that same terminal and still get the first app's messages intermingled. Or, more strangely, if you then close that terminal session, the underlying app which is still running and possibly trying to write to stdout is likely to crash on the next console.log.

You can emulate this pretty simply:

package.json

{
  "name": "test",
  "scripts": {
    "start": "node ./app.js"
  }
}

app.js

process.on('SIGINT',  function(){ console.log("SIGINT");  shutDown() });
process.on('SIGTERM', function(){ console.log("SIGTERM"); shutDown() });

var string = ".";

var shutDown = function(){
  console.log("off-ing...");
  string = "x";

  setTimeout(function(){
    console.log("bye!");
    process.exit();
  }, 1000 * 60);
}

setInterval(function(){
  console.log(string);
}, 500)
  1. start app.js with npm start
  2. ctrl+c in the window you started npm with. You will see npm exit, but still receive messages from app.js.
  3. have fun opening vi or some other application in the same window.

I think a possible solution to this would to have npm catch (process.on('x')) and not actually exit until the child process exits and pass those commands along to the child... but I'm sure that will cause all sorts of other problems.

Thoughts?

@markhuge

This comment has been minimized.

markhuge commented Nov 7, 2014

+1

This becomes an issue when you use npm start in your system init scripts.

It's pretty common to handle logging with npm start >> /var/log/myapp.log 2>&1. This breaks on SIGINT with npm, because the stdout was attached to npm and not the node process.

This also breaks things like upstart, which watch the PID of the commands they invoke. It'll see the process as down, when really it's just npm that's exited.

@othiym23 othiym23 removed the bug label Nov 7, 2014

@othiym23

This comment has been minimized.

Contributor

othiym23 commented Nov 10, 2014

It's arguable that this isn't something that npm itself should try to handle – or at least I'm not sure that I, as an ops engineer, would want npm trying to act as a proxy for OS signals to the child application. Upstart and systemd can be configured to watch not just npm but its child processes, although that isn't the way either behaves by default. In general, I'd want the process monitoring system running the node services directly (to minimize overhead, if nothing else).

I don't think the intent with npm start was that it would keep the process running as a child process of npm. Instead, it would boot the process, which would daemonize itself and then run in the background, returning control to the shell. I'm not saying that the current situation is ideal, just that the assumptions behind the implementation of the feature may be out of whack with user expectations. The question then is how to reconcile this conflict, and my inclination is to fix it with documentation tweaks: I don't think it makes sense to make npm a fully-fledged process monitor when things like pm2 and forever exist. Is there a way to get what you want without increasing npm's scope?

@evantahler

This comment has been minimized.

evantahler commented Nov 10, 2014

Yeah... I agree with @othiym23's thoughts for production apps.

This is more of a development issue for me with actionhero. It's super weird to signal your server to shutdown, get the new-line, and then a few seconds later, see more output.

screen_shot_2014-11-10_at_11_49_46_am

I use monit to run my node apps in production, and never use NPM to start them. I always call the executables directly, ie cd /www/app && ./node_modules/.bin/actionhero startCluster --daemon --workers=4. This way, I can have fine-grain control over the pidfile, etc to avoid problems like @markhuge mentions. I actually don't see this as a failure of NPM. It just seems redundant to call NPM if the application can boot itself.

How common is the NODE_ENV environment variable? Common enough for NPM to change behavior based on it?

@markhuge

This comment has been minimized.

markhuge commented Nov 10, 2014

@othiym23 That makes sense and I agree with you 100%. This is a best practices/docs issue. 👍 For upstart in particular it makes sense to just expect fork and not make this npm's concern.

In practice, as long as the thing that catches your signal eventually exits, everything still works as expected with npm start (which is probably why I never noticed it before).

I think the disconnect is in what the intended use case of npm start is in the docs. Right now the docs on npm start are:

This runs a package's "start" script, if one was provided.

Is there an appropriate place to submit a PR to update the docs? I saw npm/docs.npmjs.com but that seems to be something else.

@othiym23

This comment has been minimized.

Contributor

othiym23 commented Nov 14, 2014

Is there an appropriate place to submit a PR to update the docs? I saw npm/docs.npmjs.com but that seems to be something else.

The docs are up in the air as we try to sort out what goes where. For now, the best place to land doc PRs for stuff that's currently documented in npm's command-line help, manpages, or the webpages on npmjs.org/doc/ is a PR on this very repo. Some of the more concept-orient docs will probably migrate to docs.npmjs.com over time. In general, don't let analysis paralysis get you down – submit patches, and we as a team can figure out where they go. ;)

@vvo

This comment has been minimized.

vvo commented Nov 25, 2014

I also find it disturbing that sending SIGTERM to an npm start(ed) program does not kill the program as it would if it was started with node index.js.

It makes using things like watch_program jsfiles -- npm start impossible. You have to watch_program jsfiles -- node startscript.js.

In this situation, you better not use npm start at all because you will have to maintain two start script commands.

@stuartpb

This comment has been minimized.

Contributor

stuartpb commented Apr 21, 2015

Okay, this is a fairly significant problem for me, as it means that any apps I'm developing to use npm start don't get stopped when they're supposed to (in other words, every deploy I do keeps me waiting for 10 more seconds than it should as it waits for the app to respond to a SIGTERM it never gets, then force-kills the entire process group).

Does Node not provide any mechanism for accessing the process-replacing exec() call used by bash -c? (In other words, https://github.com/jprichardson/node-kexec)

@kimmobrunfeldt

This comment has been minimized.

kimmobrunfeldt commented May 25, 2015

Please fix this issue. Otherwise npm scripts is a good way to organize tasks, but this issue makes the usage a bit unexpected.

Concurrently would need this NPM feature. Explained more here: https://github.com/kimmobrunfeldt/concurrently#npm-issue It's still on-going issue.

@stuartpb

This comment has been minimized.

Contributor

stuartpb commented May 27, 2015

I'd appreciate if everybody affected by this issue on Heroku (or similar) would back my pull request to fix it by copying run from package.json instead of using npm start: heroku/heroku-buildpack-nodejs#217

@ihsw

This comment has been minimized.

ihsw commented Sep 16, 2015

npm start is pretty much worthless at this point, particularly when it comes to using it with Docker as the CMD instruction.

Took me a while to track this down as npm start doesn't propagate (at the very least) SIGTERM and my containers always exited with 137 and I had no idea why.

Welp, looks like it's time to advocate against using npm start for anything.

@JustinBeckwith

This comment has been minimized.

JustinBeckwith commented Nov 27, 2015

+1 to this being a problem. Google App Engine defaults to using npm start to running node processes via docker, as it's been a pretty convenient standard (instead of a procfile):
https://github.com/GoogleCloudPlatform/nodejs-docker/blob/master/base/Dockerfile

We'd like to use SIGTERM to propagate a signal down to the node processes running in the container - right now, I can't get it to work and this seems like the culprit.

@justjake

This comment has been minimized.

justjake commented Dec 10, 2015

I was just bit by this -- I use some npm run scripts to start some mock services as part of my integration tests, but in my CI system, killing the npm processes does not terminate the mock services, so they keep running after the end of my tests! Very frustrating.

@dtryon

This comment has been minimized.

dtryon commented Dec 10, 2015

+1 using node docker images with npm start would be nice.

@mattfysh

This comment has been minimized.

mattfysh commented Dec 23, 2015

+1 - this took a whole day to track down hehe.
Goodbye Tuesday December 22nd 2015, I guess I'll never get those hours back again ;)

@justjake

This comment has been minimized.

justjake commented Dec 23, 2015

An easy work-around is to always spawn npm run as a process group leader. Then when you want to kill the npm run task, you can kill the PGID instead to send your signal to all the processes in the group.

This is Ruby from memory:

pid = Process.spawn('npm run build', :pgroup => true)
# pgid is usually the same as pid but better safe than sorry 
pgid = Process.getpgid(pid)

# kill with negative value to send signal to group
Process.kill(-1 * pgid)
@andrenarchy

This comment has been minimized.

andrenarchy commented Jan 9, 2016

@othiym23 would it be feasible to introduce an option for passing (or not passing) signals to child processes?

Oh, and I just burned 3 hours while dockerizing a node app before finding this.

@othiym23

This comment has been minimized.

Contributor

othiym23 commented Jan 9, 2016

@andrenarchy Why are you using npm as a task manager within Docker? That's an extra node process per container doing almost nothing. I'm sorry about the lost 3 hours – that would frustrate me as well.

would it be feasible to introduce an option for passing (or not passing) signals to child processes?

This isn't behavior that it makes sense to make configurable – if it's going to do this, it should do it all the time. However, there are a couple other considerations I didn't include above, of which the most significant is

An easy work-around is to always spawn npm run as a process group leader.

This is an easy workaround on operating systems that have a concept of a process group leader. Notably, Windows doesn't. It's not a requirement that npm have total parity across the OSes it supports, but it does point to the fact that this isn't as simple to do as it might seem at first.

That said, and taking into account everybody's feedback, I'm going to close this feature request now, because this isn't something that the team is going to have time to get to in the next 6-12 months. npm start / npm stop / npm restart are handy and convenient, but they're not substitutes for a dedicated process manager. If somebody wants to submit a patch that handles this behavior for both UNIXy systems and Windows, we'll land it, but the team isn't going to do it themselves. Thanks to everybody for their feedback!

@aoberoi

This comment has been minimized.

aoberoi commented Nov 22, 2016

i'm in the same boat regarding running my app in docker and wishing the SIGTERM would be passed down. i really would like my package.json to be the source of truth about how to start my app, and I don't want to maintain any changes I make in both that file and my Dockerfile.

@twang2218

This comment has been minimized.

twang2218 commented Dec 12, 2016

I think this issue should be reopened, as it's not working in docker now.

@scriptype

This comment has been minimized.

scriptype commented Feb 16, 2017

Not working in Cygwin either. This causes child processes to stay open after the main process killed by Ctrl-C.

Running

  • a script directly with node script.js is able catch SIGINT in every case (iTerm2, cmd.exe, Cygwin).
  • a script through npm scripts like npm run my-script is same, except when run in Cygwin.

I have this issue in my project: scriptype/salinger-basic-boilerplate#5 (turned into a monologue). Quoting from there:

Cygwin seems to have no problem propagating SIGINT to the process. It's npm scripts that doesn't pass through the SIGINT emitted by Cygwin. Running the same script directly like: node my-script in Cygwin is running normal. Running it through npm scripts like: npm run my-script (after adding my-script key to scripts of package.json) couldn't capture the SIGINT, only in Cygwin.

Versions:

uname -a # Cygwin version
MINGW64_NT-6.1 ITDEV03 2.6.1(0.306/5/3) 2017-01-20 15:23 x86_64 Msys

npm -v
4.1.2

node -v
v7.5.0

My test script:

process.stdin.resume()

process.on('SIGINT', _ => {
  console.log('will be closed in 5s')
  setTimeout(_ => {
    process.exit()
  }, 5000)
})
@sandys

This comment has been minimized.

sandys commented Feb 18, 2017

@twang2218 are you still having this issue ? im currently seeing this in docker myself. ubuntu 16.04 vm.
node v7.5.0
npm v4.1.2

guys is there any workaround for this ?

P.S. dumb-init works though.

becdot added a commit to becdot/simple_docker_app that referenced this issue May 26, 2017

Modify start script to call `node` directly instead of `npm start`
Because `npm` [does not pass signals to its children](npm/npm#4603), the node process that it spawns never receives SIGINT/HUP/TERM signals, despite the additional signal handling that's been added.
@pfdgithub

This comment has been minimized.

pfdgithub commented Aug 28, 2017

i have a similar problem

windows 7 x64  
node v6.11.2  
npm 3.10.10  

when I use npm run xxx, ctrl+c can't kill the child process.

@slavik57

This comment has been minimized.

slavik57 commented Aug 31, 2017

Same here on:
windows 10 x64
node v8.4.0
npm v5.3.0

@Jokero

This comment has been minimized.

Contributor

Jokero commented Jan 24, 2018

It's a bug 😄

@OmgImAlexis

This comment has been minimized.

OmgImAlexis commented Jan 24, 2018

@othiym23 can this be reopened?

@cjoulain

This comment has been minimized.

cjoulain commented Jan 24, 2018

The CLI team encourages you to visit #npm via https://package.community for further discussion.

@willxy

This comment has been minimized.

willxy commented Feb 11, 2018

I'm experiencing the same problem here. I'm needing my npm script to receive the INT, not be caught by npm.

@i0natan

This comment has been minimized.

i0natan commented Feb 12, 2018

For most of Node.JS developer, running "npm start" is an hassle-free way to initiate Node.JS apps. In the era of Docker & K8S, it doesn't make sense that a container/pod won't receive signals and can act before closing. Please fix this

orangejulius added a commit to pelias/pip-service that referenced this issue Mar 12, 2018

Add script to start service
This script uses `exec` to ensure the PID of the script is the PID of
the PIP service later.

The Dockerfile is updated to use the script, so that containers can be
shut down immediately by avoiding any intermediate processes that won't
forward signals.

See npm/npm#4603
Fixes #61
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.