Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding optional switch to enable graceful shutdown on Windows #4474

Merged
merged 3 commits into from
Oct 30, 2019
Merged

Adding optional switch to enable graceful shutdown on Windows #4474

merged 3 commits into from
Oct 30, 2019

Conversation

aleksk
Copy link
Contributor

@aleksk aleksk commented Oct 24, 2019

Q A
Bug fix? yes
New feature? yes
BC breaks? no
Deprecations? no
Tests pass? yes
Fixed tickets #3651, #3691, #3555
License MIT

Problem:
There is no official and easy way to implement the graceful shutdown on Windows.

For graceful shutdown PM2 pretty much relies on the process.kill() and applications being able to ignore SIGINT to delay shutdown, but it doesn't work on Windows.
https://pm2.keymetrics.io/docs/usage/signals-clean-restart/

There are variables you can set to alter the PM2 kill behavior: PM2_KILL_SIGNAL (SIGINT by default), PM2_KILL_TIMEOUT (1600 by default), the --treekill parameter (true by default), and kill retry time.


There are a couple of issues with PM2's graceful shutdown on Windows:

  • The shutdown message is only sent if the softReload method is used:
    old_worker.send && old_worker.send('shutdown');
    I.e. it works only if you call pm2 startOrGracefulReload <json>, and if the application is automatically restarted by the PM2. But it does not work for kill/stop/restart/reload and for scale down operations.
  • People are often confused about process.kill() behavior on Windows due to unclear documentation (https://nodejs.org/api/process.html#process_process_kill_pid_signal) - Windows doesn't support signals; instead, nodejs emulates process.kill() behavior for 3 signals and ignores everything else: SIGINT/SIGTERM - terminates app; 0 - checks whether the process exists. What is often confusing is that if you run nodejs app in cmd and use Ctrl-C, it does receive SIGINT, - but it's emulated.

Attempted solutions:

  1. Existing bugreports. There a bunch of posts on stackoverflow/github about emulating SIGINT/SIGBREAK by handling STDIN closure: https://stackoverflow.com/questions/10021373/what-is-the-windows-equivalent-of-process-onsigint-in-node-js
    Doesn't work, and it I'm not sure how well that idea is going to work with different CLI modes PM2 can launch the app.
  2. https://www.npmjs.com/package/windows-kill - native nodejs plugin (up to API 67?).
    Can actually invoke the signal handler in the nodejs, but it does so for all child processes too, which makes it unusable in Cluster mode - signal will be received by all instances.
    But that module replaces global process.kill, so I thought about doing it manually.
    NOTE; If you play with the process.kill(), you have to disable treekill, because it uses Windows utility by default instead.
  3. process.kill() substitution. You can wrap PM2 startup in another script and replace process.kill there. This will still allow running PM2 as a service, and not only in the standalone mode.
    The idea there is to handle the SIGINT signal to notify an app that is about to be closed, and rely on the PM2 kill timeout to send the SIGTERM and explicitly close the app.
    It's actually a 'reasonable' workaround, but you can't use any messaging methods (like sendDataToProcessId or trigger custom action), because the process is marked as stopping internally and all communication to such processes is blocked:
    if (proc.pm2_env.status != cst.ONLINE_STATUS && proc.pm2_env.status != cst.LAUNCHING_STATUS)

    So you eventually will have to get access to the God.clusters_db[] object to be able to use process.send - this is not 'trivial'.
  4. Since the process.kill() replacement is quirky and is tightly coupled with PM2 version (relies on veriable names and etc.), you can notify the specific app instance with the sendDataToProcessId or another way, if you can get the instance that is going to be shutdown.
    It's easy to guess, however, also PM2 version-specific, but somewhat less than previous item.
    Let's say you want to stop/scale down an app. You need to call the describe or list first to get ID of the instances and their order. PM2 internally represents this list in the same order it kills apps - it uses filtered arrays and a simple for loop, so it's easy to guess and pre-notify apps. And you can rely on the 'shutdown' message for the auto-restarts.

I thought about fixing the higher-level abstractions, but it feels like that the change in that business logic is too risky and won't be accepted (considering the previous PRs I've seen):
God.killProcess is called only from the God.stopProcessId, so it could be refactored to accept switch to soft/hard stop the app. But if you do that, there is no need in the softReload function, and entire startOrGracefulReload feature relies on it.
And it's hard to say what kind of regressions could be introduced by affecting the timing of the shutdown message.


Proposed solution:

I think the easiest way to provide a Windows-specific workaround is to introduce a new global switch (PM2_KILL_USE_SEND environment variable, name TBD) to change how God.killProcess works - it's intent actually not to kill the process, but to notify it, and then the God.processIsDead method explicitly kills the process on timeout, if it's not dead.
So, by allowing God.killProcess to use process.send() instead of process.kill(), apps can be notified and then killed by timeout, if needed.
It's also possible to change the shutdown message with the existing PM2_KILL_SIGNAL variable.

Sample app:

setInterval(() => {}, 1000);

process.on('message', msg => {
  if (msg === 'SIGINT') {
    // Perform cleanup, app will be automatically terminated on kill timeout.
  }
});

@CLAassistant
Copy link

CLAassistant commented Oct 24, 2019

CLA assistant check
All committers have signed the CLA.

@aleksk aleksk marked this pull request as ready for review October 24, 2019 23:01
…able 'PM2_KILL_USE_SEND' (false by default) to allow PM2 to use process.send() instead of the process.kill() from the God.killProcess() method.

This allows applications not to depend on the platform-specific signal behavior and eventually have a graceful shutdown capability on Windows.
@Unitech Unitech merged commit f4833a2 into Unitech:development Oct 30, 2019
@Unitech
Copy link
Owner

Unitech commented Oct 30, 2019

thanks for this PR

inerc pushed a commit to inerc/pm2 that referenced this pull request Feb 11, 2020
Adding optional switch to enable graceful shutdown on Windows
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants