Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nodemon doesn't wait for SIGUSR2 graceful shutdown? #1889

Closed
rocketeerbkw opened this issue Jul 8, 2021 · 34 comments
Closed

Nodemon doesn't wait for SIGUSR2 graceful shutdown? #1889

rocketeerbkw opened this issue Jul 8, 2021 · 34 comments
Labels

Comments

@rocketeerbkw
Copy link

  • nodemon -v: 2.0.9
  • node -v: v16.1.0
  • Operating system/terminal environment: macOS Catalina, iTerm 2
  • Using Docker? What image: Yes, but I have a non-docker breaking example
  • Command you ran: yarn run nodemon watch.js

I'm trying to write a shutdown handler to gracefully close my mariadb sql client pool, but it never works because the script is killed too soon. I copied the example from the README and changed the signal to SIGTERM but it never lets the shutdown function finish.

If I manually send a kill -s SIGUSR2 <pid> then it will wait for the shutdown script, but nodemon doesn't wait.

My shutdown function uses async/await, but I was able to reproduce with out it. All the below info applies to async/await too (meaning it works with manual kill but not nodemon).

Expected behaviour

The script doesn't get killed by nodemon before it can gracefully shut itself down

Actual behaviour

Nodemon kills the script before my SIGUSR2 script can finish

Steps to reproduce

create a file watch.js:

process.stdin.read();

process.once('SIGUSR2', (signal) => {
  for (let i = 1; i <= 10000; i++) {
    console.log(i);
  }

  process.kill(process.pid, signal);
});

Save watch.js to trigger a restart and notice that the logs never reach 10,000 (for me it never goes past 4k). Send a manual kill -s SIGUSR2 <pid> and notice that the logs go to 10k, the script ends and then nodemon restarts it.


If applicable, please append the --dump flag on your command and include the output here ensuring to remove any sensitive/personal details or tokens.

@rocketeerbkw
Copy link
Author

I can reproduce on a linux system as well

  • nodemon -v: 2.0.10
  • node -v: v12.21.0
  • Operating system/terminal environment: Ubuntu 21.04 on a raspberry pi 4, bash
  • Using Docker? What image: no
  • Command you ran: ./node_modules/.bin/nodemon watch.js

@remy
Copy link
Owner

remy commented Jul 8, 2021

That's correct. nodemon sends the SIGUSR2 signal to the subprocess and if it doesn't exit within a predefined time frame, it'll force quit the sub process. Otherwise your sub process could be stuck hanging and nodemon would never be able to restart it.

@remy remy closed this as completed Jul 8, 2021
@rocketeerbkw
Copy link
Author

Can the "predefined time frame" be changed? I can't even close a database pool to a local db before nodemon kills it and crashes the database. This is a "guess I can't use nodemon" issue for me.

@remy
Copy link
Owner

remy commented Jul 8, 2021

🤔 thinking back, I originally wrote it so that if you trapped the signal from nodemon then your script would be responsible for stopping.

Logic has changed over the 11 years I've been developing nodemon so maybe it's been lost along the way. Though quickly looking this evening, it looks as I'd expect:

	  subprocess.kill(signal);

      pids.sort().forEach(pid => exec(`kill -${sig} ${pid}`, noop));

      waitForSubProcesses(child.pid, () => {
        // finally kill the main user process
        exec(`kill -${sig} ${subprocess.pid}`, callback);
      });

@rocketeerbkw
Copy link
Author

if you trapped the signal from nodemon then your script would be responsible for stopping

That's what I assumed would happen based on the README:

If you need to clean up on shutdown inside your script you can capture the kill signal and handle it yourself.

Note that the process.kill is only called once your shutdown jobs are complete.

I did some timings locally and nodemon kills it after ~50ms for me. That seems really short.

process.stdin.read();

process.once('SIGUSR2', async(signal) => {
  while(true) {
    console.log(Date.now());
  }
});

@rocketeerbkw
Copy link
Author

I'm looking into this more, I think it's a bug. The code for waitForSubProcesses has a debug message "still waiting for x sub-process to finish..." and combined with the README comments I think that nodemon should wait for the process to exit itself.

I added some debug statements and it looks like the psTree call in waitForSubprocesses returns an empty array for pids which means the function immediately calls the callback which kills the process.

rs
  nodemon bus emit: restart +3s
  nodemon:run psTree of 53461: [] +3s
  nodemon:run sending kill signal to [] +0ms
  nodemon:run checking ps tree for pids of 53461 +0ms
watch.js: SIGUSR2 start
  nodemon:run psTree of 53461: [] +41ms
  nodemon:run bus.emit(exit) via SIGUSR2 +8ms
  nodemon bus emit: exit +92ms
[nodemon] starting `node watch.js`
  nodemon:run fork sh -c node watch.js +3ms
  nodemon bus new listener: exit (1) +3ms
  nodemon:run start watch on: [ '*.*', re: /.*\..*/ ] +0ms
  nodemon:watch early exit on watch, still watching (1) +4s

I'm assuming that the "subprocess" here means the script that nodemon is running.

@rocketeerbkw
Copy link
Author

I used git bisect and determined that this is the breaking commit: 47dfb8b.

The biggest change I see in that commit is the addition of child.kill(signal); before it calls waitForSubProcesses. If I commend out that line on master then nodemon waits for my SIGUSR2 function to complete as I expect.

@remy
Copy link
Owner

remy commented Jul 11, 2021

And with it comments out and without the sub process handling the exit itself does the spawned sub process still restart?

In theory, the kill signal is still SIGUSR2 which doesn't force a shutdown if it's being handled by the sub process...

@rocketeerbkw
Copy link
Author

rocketeerbkw commented Jul 11, 2021

without the sub process handling the exit itself

meaning don't call process.exit? When using this example (and commenting out child.kill) the loop completes and nodemon restarts:

process.once('SIGUSR2', (signal) => {
  for (let i = 1; i <= 10000; i++) {
    console.log(i);
  }
});
  nodemon watch is complete +131ms
rs
  nodemon bus emit: restart +2s
1
2
3
...
9998
9999
10000
  nodemon bus.emit(exit) via SIGUSR2 +3s
  nodemon bus emit: exit +292ms
[nodemon] starting `node watch.js`
  nodemon fork sh -c node watch.js +3ms
  nodemon bus new listener: exit (1) +4ms
  nodemon start watch on: [ '*.*', re: /.*\..*/ ] +1ms
  nodemon:watch early exit on watch, still watching (1) +2s

I also tried with a async/await based delay wait 1 second and it also works without child.kill:

const delay = ms => new Promise(resolve => setTimeout(resolve, ms))

process.once('SIGUSR2', async (signal) => {
  console.log(`watch.js: ${signal} start`);
  await delay(1000);
  console.log(`watch.js: ${signal} stop`);
  process.kill(process.pid, signal);
});
  nodemon watch is complete +125ms
rs
  nodemon bus emit: restart +1s
watch.js: SIGUSR2 start
watch.js: SIGUSR2 stop
  nodemon bus.emit(exit) via SIGTERM +3s
  nodemon bus emit: exit +1s
[nodemon] starting `node watch.js`
  nodemon fork sh -c node watch.js +4ms
  nodemon bus new listener: exit (1) +4ms
  nodemon start watch on: [ '*.*', re: /.*\..*/ ] +0ms
  nodemon:watch early exit on watch, still watching (1) +2s

If I use an infinite loop, it looks like nodemon does not kill it, it just waits forever:

process.once('SIGUSR2', (signal) => {
  while(true) {
    console.log(Date.now());
  }
});

@remy
Copy link
Owner

remy commented Jul 11, 2021

Sorry, I wasn't clear at all (!). If you remove the hook in your subprocess that listens for SIGUSR2 does the process still restart?

@rocketeerbkw
Copy link
Author

Without a SIGUSR2 listener and without child.kill it does get killed:

console.log('watch.js: process started');
process.stdin.read();
  nodemon watch is complete +39ms
watch.js: process started
rs
  nodemon bus emit: restart +7s
  nodemon:run sending kill signal to  +7s
  nodemon:run checking ps tree for pids of 12767 +1ms
  nodemon:run bus.emit(exit) via SIGUSR2 +3ms
  nodemon bus emit: exit +47ms
[nodemon] starting `node watch.js`
  nodemon:run fork sh -c node watch.js +3ms
  nodemon bus new listener: exit (1) +4ms
  nodemon:run start watch on: [ '*.*', re: /.*\..*/ ] +1ms
  nodemon:watch early exit on watch, still watching (1) +7s
watch.js: process started

@remy
Copy link
Owner

remy commented Jul 11, 2021

Right, well first of all: excellent job in following up and digging.

That's definitely a regression and I'll see if I can find time this week to sort it (or happy to take a pr).

@remy remy reopened this Jul 11, 2021
@remy remy added bug confirmed bug not-stale Tell stalebot to ignore this issue labels Jul 11, 2021
@rocketeerbkw
Copy link
Author

This is my understanding so far, but I don't know enough to fix it and be confident that it doesn't break something else.
run_js_—_nodemon

levirs565 added a commit to levirs565/quizx that referenced this issue Jul 20, 2021
Remove underscore and express validator

Use nodemon version 2.0.7

Latest version of nodemon have issue in windows (remy/nodemon#1889).
@lukepolo
Copy link

lukepolo commented Feb 2, 2022

Any workarounds for the time being? It's not super critical but would be nice!

@rocketeerbkw
Copy link
Author

I switched to tsc-watch.

If you're fine running a patched version, you can make the change in my previous comment. There's a chance you'll get stuck in an infinite loop but if you're function is simple enough maybe that's not a big enough deterrent?

mikob added a commit to mikob/nodemon that referenced this issue Mar 8, 2022
workaround for remy#1889
@leppaott
Copy link

leppaott commented Jun 18, 2022

process.once('SIGUSR2',  () => {
      while (true) {
        console.log(Date.now());
      }
    });

This does indeed log for a long time.

Then asynchronously waiting for cleanup to be finished:

process.once('SIGUSR2', async () => {
     console.log('start');
     await asyncCloseCallback();
     console.log('end');
   });

This indeed logs only the "start" and restarts in middle of cleanup.

@leppaott
Copy link

leppaott commented Jun 18, 2022

@lukepolo downgrade nodemon to 2.0.1 same for nodemon-webpack-plugin:

    "nodemon": "2.0.1",
    "nodemon-webpack-plugin": "4.2.1",

After this my above example correctly waits for the cleanup:
Change process.once into process.on

start
[nodemon] still waiting for 1 subprocess(es) to finish...
[nodemon] still waiting for 1 subprocess(es) to finish...
end

@remy do you have time for this?

@ulion
Copy link

ulion commented May 31, 2023

This is definitely what I encountered, the SIGUSR2 was sent twice in a short time, so no wait cleanup at all.
The FAQ/Readme still teach user to capture SIGUSR2 once, but it does not work according to this. you have to capture it twice for the no-subprocess case. it's a terrible bug.

@ulion
Copy link

ulion commented May 31, 2023

turns out, maybe it's not only caused by nodemon.
even after I changed nodemon logic to only send the signal once for solo child process, it still does not work well with the process.once signal trick.

until this trick:

function onNodemonAskedRestart(signal) {
  gracefulExit(signal, () => {
    console.log('restart by SIGUSR2')
    process.kill(process.pid, 'SIGUSR2')
  })

  setTimeout(() => { // this timeout seems be critical, else node will exit the process without running any pending scheduled tasks.
    process.off('SIGUSR2', onNodemonAskedRestart)
  })
}
process.on('SIGUSR2', onNodemonAskedRestart)

@basicdays
Copy link

Running into this as well. With the setTimeout workaround that @ulion used, it seems to almost work. Except nodemon seems to just start the next process before the previous process has finished exiting. I tested by just adding/removing a JS comment twice to cause the error.

Here's the code and output:

function handleNodemonRestart() {
	console.log("Nodemon requested cleanup");
	cleanupServer()
		.then(() => {
			console.info("Finished nodemon cleanup");
			process.kill(process.pid, "SIGUSR2");
		})
		.catch((err) => {
			console.error("PANIC: failed to cleanup");
			console.error(err);
			process.exit(10);
		});
	setTimeout(() => {
		process.off("SIGUSR2", handleNodemonRestart);
	});
}
process.on("SIGUSR2", handleNodemonRestart);
[nodemon] 2.0.22
[nodemon] to restart at any time, enter `rs`
[nodemon] watching path(s): *.*
[nodemon] watching extensions: js,mjs,json
[nodemon] starting `node ./lib/bin/appjs`
2023-06-22T20:32:12.508Z [prisma] info: Starting a mysql pool with 17 connections.
2023-06-22T20:32:12.616Z [app] info: 🚀 Server ready at http://localhost:8080/graphql
[nodemon] restarting due to changes...
Nodemon requested cleanup
[nodemon] starting `node ./lib/bin/appjs`
Finished nodemon cleanup
2023-06-22T20:32:27.307Z [prisma] info: Starting a mysql pool with 17 connections.
2023-06-22T20:32:27.413Z [app] info: 🚀 Server ready at http://localhost:8080/graphql
[nodemon] restarting due to changes...
[nodemon] starting `dotenv -e ../../.env -c "${APP_ENV-shared}" -- node ./lib/bin/reverb.js`
2023-06-22T20:32:34.834Z [prisma] info: Starting a mysql pool with 17 connections.
2023-06-22T20:32:34.950Z [app] error: PANIC: failed to start the server
2023-06-22T20:32:34.952Z [app] error: listen EADDRINUSE: address already in use :::8080
[nodemon] app crashed - waiting for file changes before starting...

@ulion
Copy link

ulion commented Jun 23, 2023

@basicdays indeed, I suppose it should only off the handler when you are ready to exit:

function onNodemonAskedRestart(signal) {
  gracefulExit(signal, () => {
    process.off('SIGUSR2', onNodemonAskedRestart)
    console.log('restart by SIGUSR2')
    process.kill(process.pid, 'SIGUSR2')
  })
}
process.on('SIGUSR2', onNodemonAskedRestart)

@basicdays
Copy link

basicdays commented Oct 5, 2023

Fwiw I do actually have the following code. Unfortunately it still causes the same issues on the latest version (3.0.1). Still stuck on version 2.0.1:

// set nodemon restart handles
function handleNodemonRestart() {
	logger.info("Nodemon requested cleanup");
	stopServer()
		.then(() => {
			logger.info("Finished nodemon cleanup");
			process.off("SIGUSR2", handleNodemonRestart);
			process.kill(process.pid, "SIGUSR2");
		})
		.catch((err) => {
			logger.error("PANIC: failed to cleanup");
			logger.error(err);
			process.exit(10);
		});
}
process.on("SIGUSR2", handleNodemonRestart);

@basicdays
Copy link

@remy Looking through some of the other issues here, I saw your comment here.

I just tested directly called nodemon instead of going through npm run or npx. It seems it's properly waiting for the graceful exit to finish before restarting. What's wild is that for version 2.0.1, it properly works through npm run or npx. Testing version 2.0.2 causes the bug to pop up. My guess is there must have been some code that was removed in 2.0.2 to make sure the process properly exited before starting the new process.

@LiJell
Copy link

LiJell commented Nov 17, 2023

I had a same problem, so I have been stick on version 2.0.1 too

@remy
Copy link
Owner

remy commented Feb 22, 2024

For those of you sticking on v2.0.1 to avoid the issue, what version of node are you using?

@davemooreuws
Copy link

davemooreuws commented Feb 22, 2024

For those of you sticking on v2.0.1 to avoid the issue, what version of node are you using?

For me, currently on v20.8.0. Windows 11

@remy
Copy link
Owner

remy commented Feb 23, 2024

@davemooreuws and 3.x behaves with the same bug as originally described?

@davemooreuws
Copy link

Yes, the node processes don't all get killed in version 3. At least on windows, not sure about other platforms.

@Awoogamuffin
Copy link

I'm writing to confirm that on version 3.1.0 this remains an issue. A hacky workaround is to call process.once('SIGUSR2') a second time, within the first handler, which feels very hacky.

process.once('SIGUSR2', async () => {

    // we expect nodemon to send another signal very soon after the first, so we listen for it again.
    // I've set this timeout to repeat the signal after 500 ms in case nodemon doesn't
    const secondSignalTimeout = setTimeout(() => {
        console.warn('second signal not received. exiting anyway');
        process.kill(process.pid, 'SIGUSR2')
    }, 500);

    process.once('SIGUSR2', async () => {
        clearTimeout(secondSignalTimeout);
        try {
            await doTheStuffINeedToDoBeforeShutdown();
        } finally {
            process.kill(process.pid, 'SIGUSR2')
        }
    });
});

@LiJell
Copy link

LiJell commented Apr 2, 2024

For those of you sticking on v2.0.1 to avoid the issue, what version of node are you using?

currently using node 18 version

@Otard95
Copy link

Otard95 commented Apr 24, 2024

For those of you sticking on v2.0.1 to avoid the issue, what version of node are you using?

Using v20.10.0 here. Just found this issue and downgraded from 3.1.0 to 2.0.1, that fixed my issues.

@donato
Copy link

donato commented Apr 27, 2024

I'm experiencing the same thing using latest node and nodemon. I also tried downgrading versions but was unable to get the workarounds working in my docker environment.

# Dockerfile
FROM node:20.10.0
RUN npm install nodemon@2.0.1 -g

COPY "start.sh" /app
ENTRYPOINT ["/usr/bin/env"]
CMD ["bash", "/app/start.sh"]
# start.sh
npm install
nodemon -L index.ts
// index.ts - shutdown code
process.on('SIGUSR2', async () => {
  console.log('SIGUSR2 signal received');
  await shutdown();
  process.kill(process.pid, 'SIGUSR2');
});

@remy
Copy link
Owner

remy commented Jun 19, 2024

I've finally dug into this and found that this line is the issue (introduced a while back) - but someone else recently opened an issue which was (effectively) the same issue (but I didn't put them together until today): 47dfb8b

I the theoretical fix is to remove that single line, because a few lines later, the process sends an exec to send the same kill signal. (the problem is the impact on why it was there in the first place, I'm fairly should it was in there before it starts cleaning the subprocesses because it has to happen first… but I'm not 100% sure).

The reason why nodemon appears not to wait, is because the subprocess is recieving two SIGUSR2 signals - the first is the linked code, the second is after a very short, but long enough, delay in the exec. So the subprocess bails on the 2nd call (because samples/tests used process.once('SIGUSR2'….

It could be mitigated by using process.on('SIGUSR2' and when the process is done tidying up, it sends itself process.kill(process.pid, 'SIGTERM') which wouldn't be traped, ie. based on the sample from @donato:

process.on('SIGUSR2', async () => {
  console.log('SIGUSR2 signal received');
  await shutdown();
  process.kill(process.pid, 'SIGTERM');
});

This would trap the double SIGNUSR2, likely not execute twice (because it's busy running shutdown) then when the subprocess is done, it terminates the process itself cleanly.

So I've got two options:

  1. Remove the early child.kill - the risk being a knock on effect leaving zombie processes running and causing some other whack-a-mole issue.
  2. Document the trapping process properly, so that you're guided to use .on and not .once and once the trap is complete, to send a different kill signal (preferably SIGTERM).

My gut is to be wary of core code changes, and use documentation. But I'm happy to hear input.

Copy link

🎉 This issue has been resolved in version 3.1.4 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests