Fix process group termination: always send SIGKILL after SIGTERM #151
os.killpg(os.getpgid(self.process.pid), signal.SIGKILL)

with contextlib.suppress(OSError, ProcessLookupError):
    os.killpg(pgid, signal.SIGKILL)
What's the rationale for this change? I don't think it's good style to kill a process that we know has terminated. Is poll() not reliable? If the process has terminated, it will be in a zombie state at this point. If it has exited within the one-second grace period, I would rather let it finish normally than kill it.
The problem is that we currently use poll() on the leader process as a proxy for whether the whole process group has terminated.
Previously, if the leader had spawned child processes, the leader could exit after SIGTERM while some children were still running. In that case, poll() would return the leader's exit code rather than None, so SIGKILL was skipped and the surviving children kept running.
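To illustrate the point, here is a small standalone sketch (not code from this PR) in which the leader exits while a child in the same process group keeps running. The function name `leader_exits_but_group_survives` is hypothetical, and the sketch assumes a POSIX system and that the leader is started with `start_new_session=True`, so that its PID equals the process group ID.

```python
import contextlib
import os
import signal
import subprocess
import sys
from typing import Optional, Tuple


def leader_exits_but_group_survives() -> Tuple[Optional[int], bool]:
    """Return (leader poll() result, whether the process group still exists)."""
    # The leader spawns a long-sleeping child and then exits immediately.
    leader_code = (
        "import subprocess, sys; "
        "subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(30)'])"
    )
    proc = subprocess.Popen(
        [sys.executable, "-c", leader_code],
        start_new_session=True,  # assumption: leader pid == pgid
    )
    pgid = proc.pid
    proc.wait(timeout=10)         # the leader is gone and reaped
    leader_status = proc.poll()   # exit code, not None
    try:
        os.killpg(pgid, 0)        # signal 0 only checks for existence
        group_alive = True
    except ProcessLookupError:
        group_alive = False
    finally:
        # Clean up the orphaned child so the demo leaves nothing behind.
        with contextlib.suppress(ProcessLookupError):
            os.killpg(pgid, signal.SIGKILL)
    return leader_status, group_alive
```

On Linux this should return the leader's exit code together with `True` for the group check, which is exactly the situation where checking only `poll()` would wrongly conclude that nothing is left to kill.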
Thanks for looking into this! Claude and I added some code to avoid killing processes that are already dead, as Malte suggested.
Fixes an issue in Call._terminate_process_group where SIGKILL was only sent if the parent process was still alive.
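As a rough sketch of the behavior discussed in this thread (not the actual `Call._terminate_process_group` implementation), the following sends SIGTERM to the group, waits for a grace period, and sends SIGKILL only if the group still has members. The names `group_alive` and `terminate_process_group`, the `grace` parameter, and the polling interval are assumptions made for illustration, and the process is assumed to have been started with `start_new_session=True`.

```python
import contextlib
import os
import signal
import subprocess
import sys
import time


def group_alive(pgid: int) -> bool:
    """Check whether any process in the group still exists (signal 0)."""
    try:
        os.killpg(pgid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the group exists but we may not signal it
    return True


def terminate_process_group(proc: subprocess.Popen, grace: float = 1.0) -> None:
    """SIGTERM the group, then SIGKILL whatever survives the grace period."""
    pgid = proc.pid  # assumption: started with start_new_session=True
    with contextlib.suppress(ProcessLookupError):
        os.killpg(pgid, signal.SIGTERM)
    deadline = time.monotonic() + grace
    while time.monotonic() < deadline:
        proc.poll()  # reap the leader so it does not linger as a zombie
        if not group_alive(pgid):
            return   # everything exited on its own; nothing to kill
        time.sleep(0.05)
    # Decide based on the whole group, not on the leader's poll() result.
    with contextlib.suppress(ProcessLookupError):
        os.killpg(pgid, signal.SIGKILL)
    proc.wait()


if __name__ == "__main__":
    p = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"],
        start_new_session=True,
    )
    terminate_process_group(p)
    print(p.returncode)
```

Checking the group rather than the leader addresses both concerns raised above: a process that exits within the grace period is left to finish normally, while children that outlive the leader are still killed. Note that zombie members which have not yet been reaped still count as existing for the signal-0 check, so in that case the sketch sends a harmless SIGKILL.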