Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process groups #82

Merged
merged 1 commit into from
Jun 9, 2023
Merged

Process groups #82

merged 1 commit into from
Jun 9, 2023

Conversation

airhorns
Copy link
Contributor

@airhorns airhorns commented Jun 8, 2023

We noticed an issue where wds can leave around zombie processes when reloading if the process running under wds creates its own subprocesses. This happens in gadget when we use wds to run the api process, which then runs its own sandbox subprocesses. When wds SIGTERMs or SIGKILLs the api process, the signals don't get sent down to the child process. I am not exactly sure why this is in POSIX land, but the internet has some good evidence this is the case.

When wds wants to kill the orchestrated process, we can ensure child processes are terminated too by killing the whole process group. See https://azimi.me/2014/12/31/kill-child_process-node-js.html for details

Copy link
Contributor

@jasong689 jasong689 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me. I can run it a bit in Gadget to see if it fixes my issue

src/Supervisor.ts Outdated Show resolved Hide resolved
…one child process

This fixes a bug where when wds is orchestrating a process that starts
its own child processes, those child processes can be come zombies if we
just SIGTERM/SIGKILL the main orchestrated process. What we need to do
is kill the whole process group, which you can do by sending the
negative pid of the parent to the kernel.

See https://azimi.me/2014/12/31/kill-child_process-node-js.html
@jasong689
Copy link
Contributor

I did unfortunately run into the EADDRINUSE error again, but it was in a temporal worker this time and not the api

@airhorns airhorns merged commit d9833a5 into main Jun 9, 2023
2 checks passed
@airhorns airhorns deleted the process-groups branch June 9, 2023 16:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants