Systems tested
- Fedora Kinoite, systemd 258.3-2.fc43, bindfs 1.18.3
- Debian 13.2, systemd 257.9-1~deb13u1, bindfs 1.14.7
Issue description
When a bindfs mount has been mounted via fstab and its mountpoint has not yet been visited, an interaction between systemd and bindfs makes the system temporarily unresponsive (for 90 seconds) when systemctl daemon-reload is run.
Prerequisites
- /etc/fstab entry for the mount
- Filesystem must be mounted already
- The user cache within bindfs must not yet be populated (i.e. the mount hasn't been visited since boot)
Steps to reproduce (as root)
mkdir /root/src
mkdir /root/dst1
echo '/root/src /root/dst1 fuse.bindfs mirror=@root 0 0' | tee -a /etc/fstab
# Optionally, run an early `systemctl daemon-reload` here.
# It will be quick, unlike the next one after we've mounted this below.
mount /root/dst1
time systemctl daemon-reload # hangs for 90 seconds
# Further checks for demonstration purposes
time systemctl daemon-reload # ok
pkill -USR1 bindfs
time systemctl daemon-reload # hangs for 90 seconds
umount /root/dst1
mount /root/dst1
time systemctl daemon-reload # hangs for 90 seconds
pkill -USR1 bindfs
stat /root/dst1
time systemctl daemon-reload # ok
Or you can reboot after you add the fstab line, and then run time systemctl daemon-reload after reboot, and there will be a hang. (time is only used to show how long it takes; its presence is not involved in the hang.)
Technical details
When systemctl daemon-reload is called, the fstab generator hangs and eventually times out when it stats an affected bindfs mountpoint. This seems to be because bindfs tries to obtain group information, but systemd apparently cannot provide this information when it's in the process of reloading its daemon.
It looks like the relevant call chain in bindfs.c is the getattr callbacks -> getattr_common -> is_mirrored_user -> user_belongs_to_group.
Possible solutions
- Since user_belongs_to_group appears to use a cache, maybe bindfs could populate that cache at process start, and perhaps also (somehow) ensure a cache refresh is not attempted when systemd is being reloaded. As far as this particular issue is concerned, it might be safer for the signal handler to actually refresh the cache rather than just invalidate it.
- See if this can be addressed in systemd somehow instead, rather than here in bindfs.
Workarounds
- Use a systemd mount unit file for the affected bindfs mounts instead of fstab, since it is systemd-fstab-generator that attempts to stat the mountpoint.
- Visit the mountpoint with ls or stat before running systemctl daemon-reload, so that the group lookup is already cached and bindfs does not need to query group information again while systemd is reloading.
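The second workaround can be scripted so that every bindfs mountpoint listed in fstab is visited before a reload. The following is a minimal sketch, not a tested fix: the function names bindfs_mountpoints and warm_bindfs_mounts are made up for illustration, and it assumes entries use the fuse.bindfs type as in the reproduction steps and that mountpoint paths contain no escaped whitespace.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of bindfs or systemd).

# Print the mountpoint (2nd field) of every fuse.bindfs entry in an
# fstab-format file. Comment lines and blank lines are skipped.
bindfs_mountpoints() {
    awk '$1 !~ /^#/ && $3 == "fuse.bindfs" { print $2 }' "${1:-/etc/fstab}"
}

# stat each listed mountpoint once, so that bindfs performs its group
# lookup now rather than during a later daemon-reload.
warm_bindfs_mounts() {
    bindfs_mountpoints "$@" | while IFS= read -r mp; do
        stat "$mp" >/dev/null || echo "warning: could not stat $mp" >&2
    done
}
```

With these defined, running warm_bindfs_mounts followed by systemctl daemon-reload corresponds to the manual stat in the reproduction steps. The cache is invalidated again by SIGUSR1 or a remount, so the warm-up would need to be repeated after either.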
Troubleshooting tips
systemctl log-level debug shows extra info in the log, including when running systemctl daemon-reload. In the journal, you can see that systemd-fstab-generator is one of the processes spawned during this time, along with its PID. Unlike the other generators, that PID does not complete quickly. Moving /usr/lib/systemd/system-generators/systemd-fstab-generator out of the way and replacing it with a symlink to /bin/true confirmed that it was implicated.
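The debug-logging steps could be wrapped up roughly as follows. This is a sketch only: the function name trace_daemon_reload is made up for illustration, the grep pattern assumes the generator's name appears in the logged messages, and resetting the level to info assumes that was the level in effect beforehand.

```shell
#!/bin/sh
# Hypothetical helper: reload with debug logging enabled, then show the
# journal lines from that window that mention the fstab generator.
trace_daemon_reload() {
    start=$(date '+%Y-%m-%d %H:%M:%S')   # timestamp format accepted by journalctl --since
    systemctl log-level debug
    systemctl daemon-reload              # expected to hang here if the issue is triggered
    systemctl log-level info             # assumption: info was the previous level
    journalctl --since "$start" --no-pager | grep -F 'systemd-fstab-generator'
}
```

Comparing the timestamps of the matching lines with those of the other generators should show which one is slow to finish.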
strace -T -t was also helpful, including having a wrapper script for systemd-fstab-generator that calls it with strace -t -T and saves a log; this made it clear that the process hangs on the fstat step for the mountpoint.
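A wrapper of the kind described above might look like the sketch below. Everything path-related here is an assumption for illustration: it supposes the original generator was renamed with a .real suffix, that the wrapper was installed under the original name, and that the log directory is writable from wherever systemd runs its generators (this may not hold for /run on every systemd version, so the location may need adjusting). The function name and the REAL_GENERATOR and STRACE_LOG_DIR variables are likewise made up.

```shell
#!/bin/sh
# Hypothetical wrapper installed in place of systemd-fstab-generator.

run_traced_generator() {
    # Assumed location of the renamed original binary.
    real="${REAL_GENERATOR:-/usr/lib/systemd/system-generators/systemd-fstab-generator.real}"
    # Assumed log directory; must be writable when the generator runs.
    log_dir="${STRACE_LOG_DIR:-/run}"
    # -t prefixes each line with the time of day, -T appends the time
    # spent in each syscall, -o writes the trace to a file (one per PID).
    strace -t -T -o "$log_dir/fstab-generator.$$.strace" "$real" "$@"
}

# In the installed wrapper, the final line would pass through the
# output-directory arguments that systemd gives to generators:
#   run_traced_generator "$@"
```

In the resulting log, a syscall on the mountpoint with a very large elapsed time in the -T column is the one to look for.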
Running bindfs with -d, together with strace -T -t, was also helpful. Without strace, it would not have been as clear that a group query was happening around the time of the hang.