Fix race between explicit mount and handling automount request from kernel #5916

Merged
merged 1 commit into from May 10, 2017

Contributor

AnchorCat commented May 9, 2017

If a process accesses an autofs filesystem while systemd is in the middle of starting the mount unit on top of it, it is possible for the autofs_ptype_missing_direct request from the kernel to be received after the mount unit has been fully started:

          P1                              P2
systemd forks and execs mount             ...
          ...                     access autofs, blocks
mount exits                               ...
systemd receives SIGCHLD                  ...
          ...                     kernel sends request
systemd receives request                  ...

systemd needs to respond to this request, otherwise the kernel will continue to block access to the mount point.


While I have only encountered this bug during boot on systems with automounted network filesystems, it can be triggered reasonably reliably with an artificial test case as follows:

  1. Prepare an automount and mount:
# mkdir /tmp/source /tmp/target
# echo contents >/tmp/source/file
# echo '/tmp/source /tmp/target none bind,noauto,x-systemd.automount 0 0' >>/etc/fstab
# systemctl daemon-reload
# systemctl start tmp-target.automount
  2. Raise systemd's rate limits:
# echo 'DefaultStartLimitBurst=10000' >>/etc/systemd/system.conf
# systemctl daemon-reexec
  3. Start a loop that repeatedly races systemd mounting the unit against the automount request from the kernel:
# while sleep 0.2; do systemctl stop tmp-target.mount; systemctl start --no-block tmp-target.mount & sleep 0.006; cat /tmp/target/file; wait; done

The delay in the sleep 0.006 command needs to be tweaked so that the automount request is sometimes, but not always, the trigger for systemd to mount the unit, and such that "Automount point already active?" only occasionally appears in the journal. Eventually the "Got automount request" message will appear after "Mounted /tmp/target", showing that the autofs request was received too late:

May 09 11:38:28 hostname systemd[1]: Unmounting /tmp/target...
May 09 11:38:28 hostname systemd[1]: Unmounted /tmp/target.
May 09 11:38:28 hostname systemd[1]: Mounting /tmp/target...
May 09 11:38:28 hostname systemd[1]: tmp-target.automount: Got automount request for /tmp/target, triggered by 22427 (cat)
May 09 11:38:28 hostname systemd[1]: Mounted /tmp/target.
May 09 11:38:29 hostname systemd[1]: Unmounting /tmp/target...
May 09 11:38:29 hostname systemd[1]: Unmounted /tmp/target.
May 09 11:38:29 hostname systemd[1]: Mounting /tmp/target...
May 09 11:38:29 hostname systemd[1]: Mounted /tmp/target.
May 09 11:38:29 hostname systemd[1]: tmp-target.automount: Got automount request for /tmp/target, triggered by 22438 (cat)
May 09 11:38:29 hostname systemd[1]: tmp-target.automount: Automount point already active?
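
Spotting the bad ordering by eye gets tedious while tuning the delay, so a small filter can help. The following is only a sketch: `find_late_requests` is a hypothetical helper name, and it assumes the journal messages have exactly the wording shown in the excerpt above. It reads journal lines on stdin and prints every "Got automount request" line that arrives after "Mounted" for the same mount point.

```shell
#!/bin/sh
# Hypothetical helper (not part of systemd): scan journal text on stdin and
# print each automount request logged after the mount point was already
# reported as mounted. Assumes the message wording shown in the log excerpt.
find_late_requests() {
    mountpoint=$1
    awk -v mp="$mountpoint" '
        # Any unmount activity means the mount point is no longer mounted.
        index($0, "Unmounting " mp "...") || index($0, "Unmounted " mp ".") {
            mounted = 0
            next
        }
        # The mount unit has been fully started.
        index($0, "Mounted " mp ".") {
            mounted = 1
            next
        }
        # A request that shows up while already mounted is the late case.
        index($0, "Got automount request for " mp ",") && mounted {
            print
            late++
        }
        # Exit 0 if at least one late request was seen, 1 otherwise.
        END { exit (late > 0 ? 0 : 1) }
    '
}
```

It could be fed with something like `journalctl -b | find_late_requests /tmp/target`; run against the excerpt above, only the request triggered by PID 22438 is reported.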

When this happens the cat command will hang, even though the /tmp/target filesystem was successfully mounted. Anything else that tries to access /tmp/target will also hang until it is unmounted (via umount or systemctl stop).
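
To keep a test loop from getting stuck on a hung `cat`, the read can be wrapped in a deadline. This is a minimal sketch, not something from the patch: `probe_read` is a hypothetical helper, it relies on the coreutils `timeout` command (which exits with status 124 when the deadline expires), and it assumes the blocked reader can still be terminated by the signal `timeout` sends.

```shell
#!/bin/sh
# Hypothetical helper: try to read a file, giving up after a deadline
# (in seconds, default 5). Prints "ok", "hung", or "error" and returns
# the exit status reported by timeout.
probe_read() {
    path=$1
    deadline=${2:-5}
    timeout "$deadline" cat "$path" >/dev/null 2>&1
    status=$?
    case $status in
        0)   echo "ok" ;;     # read completed
        124) echo "hung" ;;   # deadline expired, reader was terminated
        *)   echo "error" ;;  # cat itself failed (e.g. missing file)
    esac
    return "$status"
}
```

In the reproduction loop, `probe_read /tmp/target/file 5` could stand in for the plain `cat /tmp/target/file`, so a "hung" result marks the iteration that hit the race.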

Owner

poettering commented May 9, 2017

Patch looks good, but could you add a short comment about this issue to the if block you are changing? Otherwise the next one looking at these sources or changing them might break this again. Something brief suffices, possibly containing a link to this issue containing the longer explanation...

Otherwise looks excellent! thanks for spending the time to track this down, much appreciated.

automount: ack automount requests even when already mounted
If a process accesses an autofs filesystem while systemd is in the
middle of starting the mount unit on top of it, it is possible for the
autofs_ptype_missing_direct request from the kernel to be received after
the mount unit has been fully started:

  systemd forks and execs mount             ...
            ...                     access autofs, blocks
  mount exits                               ...
  systemd receives SIGCHLD                  ...
            ...                     kernel sends request
  systemd receives request                  ...

systemd needs to respond to this request, otherwise the kernel will
continue to block access to the mount point.
Contributor

AnchorCat commented May 10, 2017

No problem. I've added a brief comment.

@poettering poettering merged commit e7d54bf into systemd:master May 10, 2017

5 checks passed

default Build finished.
semaphoreci The build passed on Semaphore.
xenial-amd64 autopkgtest finished (success)
xenial-i386 autopkgtest finished (success)
xenial-s390x autopkgtest finished (success)
Owner

poettering commented May 10, 2017

thanks!

@euank euank referenced this pull request in coreos/bugs Jun 26, 2017

Closed

Hang with df -h #1630

euank added a commit to euank/systemd that referenced this pull request Jun 29, 2017

automount: ack automount requests even when already mounted (#5916)