Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unicorn.service: Can't open PID file unicorn.pid (yet?) after reload: Operation not permitted #16636

Closed
mnin opened this issue Jul 31, 2020 · 11 comments

Comments

@mnin
Copy link

mnin commented Jul 31, 2020

systemd version the issue has been seen with

systemd 245 (245.4-4ubuntu3.1)

Used distribution

Distributor ID: Ubuntu
Description: Ubuntu 20.04 LTS
Release: 20.04
Codename: focal

Linux hostname 5.4.0-1015-aws #15-Ubuntu SMP Thu Jun 4 22:47:00 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Expected behaviour you didn't see

I expect that the Unicorn service gets the signal USR2 sent to the PID of the PID file.

Unexpected behaviour you saw

I see confusing error messages:

unicorn.service file

[Unit]
Description=Unicorn service
After=network.target

[Service]
User=ubuntu
Group=www-data
Type=forking
GuessMainPID=no
Environment=RAILS_ENV=production
WorkingDirectory=/srv/project/current
PIDFile=/srv/project/shared/system/pids/unicorn.pid
ExecStart=/home/ubuntu/.rbenv/shims/bundle exec unicorn -c config/unicorn.rb -D
ExecReload=/bin/kill -USR2 $MAINPID
KillMode=process
KillSignal=SIGQUIT
Restart=on-failure
SyslogIdentifier=hopla-unicorn

[Install]
WantedBy=default.target

logs

Jul 16 00:10:32 hostname systemd[1]: Reloading Unicorn service.
Jul 16 00:10:32 hostname systemd[1]: unicorn.service: Can't open PID file /srv/project/shared/system/pids/unicorn.pid (yet?) after reload: Operation not permitted
Jul 16 00:10:32 hostname systemd[1]: Reloaded Unicorn service.

unicorn service status

● unicorn.service - Unicorn service
Loaded: loaded (/etc/systemd/system/unicorn.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2020-07-16 18:12:40 CEST; 1 weeks 6 days ago
Process: 786416 ExecReload=/bin/kill -USR2 $MAINPID (code=exited, status=0/SUCCESS)
Main PID: 775820 (ruby)
Tasks: 55 (limit: 9430)
Memory: 2.9G

What we see here is: 775820 is the PID from the new master process from the unicorn service.

But 786416 is almost died and doesn't exists anymore. Because it was the old unicorn master process which is replaced by the new one.

strace

...
waitid(P_PID, 786416, {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=786416, si_uid=1000, si_status=0, si_utime=0, si_stime=0}, WEXITED, NULL) = 0
...
openat(AT_FDCWD, "/proc/786416/comm", O_RDONLY|O_CLOEXEC) = 36
fstat(36, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
read(36, "kill\n", 1024)                = 5
ioctl(36, TCGETS, 0x7fffadebeef0)       = -1 ENOTTY (Inappropriate ioctl for device)
read(36, "", 1024)                      = 0
close(36)                               = 0
openat(AT_FDCWD, "/proc/786416/cgroup", O_RDONLY|O_CLOEXEC) = 36
fstat(36, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
read(36, "12:cpuset:/\n11:blkio:/\n10:pids:/"..., 1024) = 191
...
newfstatat(AT_FDCWD, "/sys/fs/cgroup/systemd", {st_mode=S_IFDIR|0555, st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0
openat(AT_FDCWD, "/sys/fs/cgroup/systemd/system.slice/unicorn.service/cgroup.procs", O_WRONLY|O_NOCTTY|O_CLOEXEC) = 34
fcntl(34, F_GETFL)                      = 0x8001 (flags O_WRONLY|O_LARGEFILE)
write(34, "786416\n", 7)                = 7
...
openat(AT_FDCWD, "/", O_RDONLY|O_CLOEXEC|O_PATH|O_DIRECTORY) = 36
fstat(36, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
openat(36, "srv", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|O_PATH) = 37
fstat(37, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
close(36)                               = 0
openat(37, "project", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|O_PATH) = 36
fstat(36, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
close(37)                               = 0
openat(36, "shared", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|O_PATH) = 37
fstat(37, {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
close(36)                               = 0
openat(37, "system", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|O_PATH) = 36
fstat(36, {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
close(37)                               = 0
openat(36, "pids", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|O_PATH) = 37
fstat(37, {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
close(36)                               = 0
openat(37, "unicorn.pid", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|O_PATH) = -1 ENOENT (No such file or directory)
close(37)                               = 0
sendmsg(47, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="PRIORITY=6\nSYSLOG_FACILITY=3\nCOD"..., iov_len=198}, {iov_base="MESSAGE=", iov_len=8}, {iov_base="unicorn.service: Can't open PID "..., iov_len=123}, {iov_base="\n", iov_len=1}], msg_iovlen=4, msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 330
sendmsg(47, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="PRIORITY=6\nSYSLOG_FACILITY=3\nCOD"..., iov_len=132}, {iov_base="MESSAGE=Reloaded Unicorn service"..., iov_len=33}, {iov_base="\n", iov_len=1}, {iov_base="JOB_ID=38921", iov_len=12}, {iov_base="\n", iov_len=1}, {iov_base="JOB_TYPE=reload", iov_len=15}, {iov_base="\n", iov_len=1}, {iov_base="JOB_RESULT=done", iov_len=15}, {iov_base="\n", iov_len=1}, {iov_base="UNIT=unicorn.service", iov_len=20}, {iov_base="\n", iov_len=1}, {iov_base="INVOCATION_ID=4a356cb5cfc24d15b9"..., iov_len=46}, {iov_base="\n", iov_len=1}, {iov_base="MESSAGE_ID=7b05ebc668384222baa88"..., iov_len=43}, {iov_base="\n", iov_len=1}], msg_iovlen=15, msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 323
...
  • Why is systemd just trying to do something with the old PID?
  • The file is there and can be read by everyone (other+r for files and other+rx for directories).

Steps to reproduce the problem

systemctl restart unicorn
systemctl reload unicorn

@poettering
Copy link
Member

You configured a PIDFile= which is good, but the PID file doesn't actually exist, i.e. we get ENOENT opening it. Fix the PIDFile= path. Or maybe your app is broken and doesn't write the PID file before returning in the process we invoke?

Either way, this is not a systemd issue, closing hence.

@mnin
Copy link
Author

mnin commented Jul 31, 2020

@poettering the file is there and contains the current PID.

As I have already written above.

This looks like a systemd issue here.

@mnin
Copy link
Author

mnin commented Jul 31, 2020

ubuntu@hostname:~$ cat /srv/project/shared/system/pids/unicorn.pid
824613
ubuntu@hostname:~$ ps aux | grep 824613
ubuntu    824613  0.0  1.8 336044 147020 ?       Sl   17:21   0:04 unicorn master -c config/unicorn.rb -D

@webdev
Copy link

webdev commented Aug 27, 2021

@mnin Have you solved this problem? I am having the same issue

@mnin
Copy link
Author

mnin commented Aug 29, 2021

@webdev thanks for asking, I've been debugging here for a while. It looked like the pid file still contained the old PID, I used strace here.

In the end I switched to Puma (instead of Unicorn), here there were no such problems anymore.

@Gooberpatrol66
Copy link

Or maybe your app is broken and doesn't write the PID file before returning in the process we invoke?

What does this sentence mean? It reads like broken english.

@vcaputo
Copy link
Member

vcaputo commented Oct 15, 2021

Or maybe your app is broken and doesn't write the PID file before returning in the process we invoke?

What does this sentence mean? It reads like broken english.

Which part don't you comprehend? It reads perfectly fine to me...

@Gooberpatrol66
Copy link

Gooberpatrol66 commented Oct 15, 2021

Did he mean to say "before turning in the process"? or "before returning into the process"? Or "returning (a noun is usually required here) inside the process"?

@vcaputo
Copy link
Member

vcaputo commented Oct 15, 2021

Did he mean to say "before turning in the process"? or "before returning into the process"? Or "returning (a noun is usually required here) inside the process"?

The context here is a self-daemonizing process, which implies a double fork is happening somewhere outside of systemd's control. For the pidfile to be usable the piece doing the double fork must write out the pid to the pidfile before it returns/exits - at which point systemd will reap the child it had created and attempt to make use of the pidfile.

Lennart's statement makes sense in the context of what's being discussed here. I think there must just be some confusion surrounding how pidfiles and daemonizing processes work, which is something Lennart's statement assumes is understood. It doesn't read as broken english.

@Gooberpatrol66
Copy link

Gooberpatrol66 commented Oct 15, 2021

So "returning" as in "terminating execution". I'm not sure what terminating "in the process" means though.

@mbiebl
Copy link
Contributor

mbiebl commented Oct 15, 2021

See https://www.freedesktop.org/software/systemd/man/daemon.html#SysV%20Daemons if you want to understand it better

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

6 participants