Commit
Fix child processes not being reaped with PID 1
Starting with Puma v6.4.1, we observed that killed Puma cluster workers were never restarted when the parent ran as PID 1. For example, I issued a `kill 44` and PID 44 remained in the `defunct` state:

```
git@gitlab-webservice-default-78664bb757-2nxvh:/var/log/gitlab$ ps -ef
UID          PID    PPID  C STIME TTY          TIME CMD
git            1       0  0 Jan09 ?        00:01:39 puma 6.4.1 (tcp://0.0.0.0:8080) [gitlab-puma-worker]
git           23       1  0 Jan09 ?        00:05:46 /usr/local/bin/gitlab-logger /var/log/gitlab
git           41       1  0 Jan09 ?        00:01:55 ruby /srv/gitlab/bin/metrics-server
git           44       1  0 Jan09 ?        00:02:41 [ruby] <defunct>
git           46       1  0 Jan09 ?        00:02:38 puma: cluster worker 1: 1 [gitlab-puma-worker]
git           48       1  0 Jan09 ?        00:02:42 puma: cluster worker 2: 1 [gitlab-puma-worker]
git           49       1  0 Jan09 ?        00:02:41 puma: cluster worker 3: 1 [gitlab-puma-worker]
git         5205       0  0 21:57 pts/0    00:00:00 bash
git         5331    5205  0 22:00 pts/0    00:00:00 ps -ef
```

Further investigation showed that the `Process.wait2(-1, Process::WNOHANG)` call introduced in puma#3255 never appears to return anything when:

1. The parent PID is 1.
2. `Process.detach(some PID != 1)` is run after a `Process.spawn`.

This bug appears to be present in Ruby 3.1 and 3.2, but it seems to have been fixed in Ruby 3.3.

Previously, `Process.wait(w.pid, Process::WNOHANG)` was called on each known worker PID. puma#3255 changed this behavior so that it happens only if the `fork_worker` config parameter is enabled, but it seems that we should always do this.

Closes puma#3313
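The difference between waiting on any child (`-1`) and polling each known worker PID can be illustrated with a minimal sketch. This is not Puma's actual code: `reap_known_children` is a hypothetical helper, and the forked children simply sleep in place of real workers. It only shows the per-PID, non-blocking wait pattern that the message describes restoring.

```ruby
# Hypothetical helper, not Puma's implementation: poll each known child PID
# individually with a non-blocking wait instead of waiting on any child (-1).
def reap_known_children(pids)
  reaped = {}
  pids.each do |pid|
    begin
      # With WNOHANG, wait2 returns nil while the child is still running,
      # or [pid, Process::Status] once it has exited and been reaped.
      result = Process.wait2(pid, Process::WNOHANG)
      reaped[pid] = result.last if result
    rescue Errno::ECHILD
      # The child was already reaped elsewhere (for example by a
      # Process.detach thread), so there is no status left to collect.
      reaped[pid] = nil
    end
  end
  reaped
end

# Usage sketch: fork two idle children, terminate the first, then poll.
pids = Array.new(2) { fork { sleep } }
Process.kill(:TERM, pids.first)

reaped = {}
deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + 5
while reaped.empty? && Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
  reaped = reap_known_children(pids)
  sleep 0.05
end

# Clean up the remaining child so the example leaves no processes behind.
Process.kill(:TERM, pids.last)
Process.wait(pids.last)
```

Because each wait names a specific PID, this approach does not rely on a wait for any child returning, which is the call that appeared to misbehave under PID 1 on Ruby 3.1 and 3.2.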