systemd: mount units fail with "Mount process finished, but there is no mount." #10872
Comments
If you replace the mount binary with a shell script that invokes the original mount binary and then checks /proc/self/mountinfo to see whether the mount is actually established, what do you see?
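For illustration only, the check suggested above (does the mount point show up in /proc/self/mountinfo?) could be sketched as follows. This is not the wrapper used in this thread; the function names and the sample data are assumptions made up for the example. It relies only on the documented mountinfo layout, where the fifth space-separated field is the mount point and special characters are octal-escaped.

```python
import re

# Made-up sample in /proc/self/mountinfo format; the fifth field is the mount point.
SAMPLE = (
    "22 28 0:21 / /sys rw,nosuid shared:7 - sysfs sysfs rw\n"
    "36 28 0:31 / /mnt/my\\040disk rw,relatime shared:18 - tmpfs tmpfs rw\n"
)


def unescape(field):
    """Decode the octal escapes mountinfo uses, e.g. a backslash plus 040 for a space."""
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def mount_points(mountinfo_text):
    """Return the set of mount points listed in mountinfo-formatted text."""
    points = set()
    for line in mountinfo_text.splitlines():
        fields = line.split(" ")
        if len(fields) >= 5:
            points.add(unescape(fields[4]))
    return points


def is_mounted(where, mountinfo_text):
    """True if `where` (hypothetical helper argument) is listed as a mount point."""
    target = where.rstrip("/") or "/"
    return target in mount_points(mountinfo_text)
```

On a live system the text would come from reading /proc/self/mountinfo right after the real mount binary returns, for example `is_mounted("/mnt/data", open("/proc/self/mountinfo").read())`, where `/mnt/data` is a placeholder path.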
Sorry for the late feedback. This is what I get:
For the record, this is the wrapper I used:
I am pretty sure #10980 will fix this one too. Any chance you can give it a whirl?
Sure, I'll try with the PR and report back.
Running with revision ca7d7db from #10980:
FWIW I noticed that the script was hitting another sanity check, which is run in the same process, after
The log says:
This works around the systemd bug systemd/systemd#10872 by ensuring that there is only a single operation that manipulates mount units at the same time.
We were checking some potential workarounds in snapd. We've simplified the reproducer script to do just one mount but many reloads in parallel, and we were still able to reproduce the problem, though it took much longer. Now that #10980 is merged, would you like me to try with the latest master?
This is an RFC PR to see if the "mount protocol error" reported in systemd/systemd#10872 can be worked around by serializing the addition and removal of mount units. Proposing to get full spread runs. This is similar to snapcore#6243 but goes further by ensuring a single daemon reload at the level of the systemd Go package. Note that there is still a chance that the protocol error happens if something else (like dpkg or the user) runs "systemctl daemon-reload" while we write a mount unit, but the risk should be much smaller.
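The serialization idea described above can be sketched roughly as follows. This is a Python illustration, not snapd's Go code: `MountUnitManager`, `run_demo`, and the three callables are hypothetical names. In a real setup the callables would write the unit file, run `systemctl daemon-reload`, and run `systemctl start`; here they are fakes that only record how many operations overlap.

```python
import threading
import time


class MountUnitManager:
    """Hypothetical helper: one lock guards every operation that touches mount units."""

    def __init__(self, write_unit, daemon_reload, start_unit):
        self._lock = threading.Lock()
        self._write_unit = write_unit
        self._daemon_reload = daemon_reload
        self._start_unit = start_unit

    def add_mount_unit(self, name, content):
        # Write, reload, and start happen as one critical section, so no
        # reload issued through this manager can interleave with another start.
        with self._lock:
            self._write_unit(name, content)
            self._daemon_reload()
            self._start_unit(name)


def run_demo(workers=8):
    """Run several add_mount_unit calls in parallel against fake callables."""
    state = {"active": 0, "max_active": 0, "reloads": 0}
    guard = threading.Lock()

    def write_unit(name, content):
        with guard:
            state["active"] += 1
            state["max_active"] = max(state["max_active"], state["active"])

    def daemon_reload():
        time.sleep(0.001)  # widen the window in which overlap could occur
        with guard:
            state["reloads"] += 1

    def start_unit(name):
        with guard:
            state["active"] -= 1

    manager = MountUnitManager(write_unit, daemon_reload, start_unit)
    threads = [
        threading.Thread(target=manager.add_mount_unit, args=(f"demo-{i}.mount", ""))
        for i in range(workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state
```

As the PR description notes, a lock like this only covers callers that go through it; a reload triggered by dpkg or by the user is outside its reach.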
I have still seen this problem with parallel execution of
in version v241. We work around this problem by trusting the successful execution of the mount command:
Unfortunately, I have not yet been able to isolate a reproducer for this problem. It just fails from time to time when creating new systemd units (and reloading) and performing mounts as a dependency for starting other service units.
I do see this issue with NFS mounts on openSUSE Tumbleweed, which carries 241.
Forgot to mention: the mounts are working fine nevertheless.
Can confirm this same issue on recent openSUSE Tumbleweed releases with version: In this case it is an encrypted XFS partition. Mount fails on boot, yet is fully usable, exactly like @frispete reports. Any updates or possibly any testcases to run perhaps?
If we get a SIGCHLD we enable and eventually dispatch sigchld_event_source where we actually reap the process. We received SIGCHLD for the specific PID so wait for that process first.
Motivation to do this is to prevent problem due to our state machine for mount units relying on the fact that we always dispatch mountinfo notifications before dispatching sigchld handler for the mount.
Previously, this was racy because we might have called manager_dispatch_sigchld() for completely unrelated process but we would actually reap the mount process which completed in the meantime. sigchld handler for the mount unit would then fail the mount unit because we haven't dispatched mountinfo notification yet.
event | mount   | kernel          | PID 1
------+---------+-----------------+---------------------------------
1     |         |                 | forks off mount as PID x
2     |         |                 | receives SIGCHLD for PID y
3     |         |                 | enables sigchld_event_source
4     |         |                 | dispatches sigchld_event_source
5     | mount() | mountinfo_notif |
6     | exit()  |                 |
7     |         |                 | calls waitid() with P_ALL
8     |         |                 | calls sigchld_handler for mount
9     |         |                 | fails the mount unit since
      |         |                 | mountinfo_notif wasn't
      |         |                 | processed yet
Fixes systemd#10872
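The difference between reaping "any child" and reaping the specific PID named by the signal can be tried out with a small Linux-only sketch. This is not systemd code; `spawn`, `reap_specific`, and `demo` are hypothetical names, and the two children merely stand in for "PID y" and the mount helper "PID x" from the commit message. It uses Python's `os.waitid` with `os.P_PID`, which corresponds to calling waitid() with P_PID instead of P_ALL.

```python
import os


def spawn(exit_code):
    """Fork a child that exits immediately with the given code."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)  # child: exit without running parent cleanup
    return pid


def reap_specific(pid):
    """Wait for exactly this child; other exited children stay unreaped."""
    info = os.waitid(os.P_PID, pid, os.WEXITED)
    return info.si_pid, info.si_status


def demo():
    unrelated = spawn(3)   # stands in for "PID y"
    mount_like = spawn(0)  # stands in for the mount helper, "PID x"
    # SIGCHLD named `unrelated`, so reap that PID only. With P_ALL the call
    # could just as well have returned the mount-like child instead.
    first = reap_specific(unrelated)
    # The mount-like child is reaped only when it is explicitly asked for.
    second = reap_specific(mount_like)
    return unrelated, mount_like, first, second
```

Because the mount-like child is never reaped as a side effect of handling another PID, its exit handler cannot run ahead of whatever other event (here, the mountinfo notification) is supposed to be processed first.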
(The interesting bits about the what and why are in a comment in the patch, please have a look there instead of looking here in the commit msg). Fixes: systemd#10872
I prepped a proposal to fix this in #13097, ptal!
4.19.
So maybe the "disk-nn missing" phenomenon is caused by the reproducer rather than a problem?
It's my understanding this is caused by a locking problem in the kernel when reading the proc mounts table. The kernel fix went into v5.7 or v5.8; I'm not sure which version of util-linux has the workaround. So if your libmount doesn't have the workaround, or your distribution kernel (or the kernel you are building) doesn't have the fix, you will see the problem.
util-linux v2.35 and v2.36 have the workaround (current upstream, not released yet, is without it).
I tried the workaround patch from util-linux, util-linux/util-linux@e4925f5.
@syyhao1994 there are more patches related to this topic. You also need ee551c909f95437fd9fcd162f398c069d0ce9720.
You mean the problem of "disk-xx missing"?
@syyhao1994 Have you tried variant 5 (
@karelzak As nomuranec said, the problem still exists with variant 0.
I tried many times; the problem still exists with variant 5, but it is very hard to reproduce. I have tried maybe thousands of times.
Because variant 0 has high-frequency loop device mapping changes in the mix, it could trigger problems other than the /proc/self/mountinfo race. If you are chasing a mount-related problem, variant 5 has a tighter focus on that. If variant 0 matches your use case, IMHO forking a new issue with the specific systemd/kernel/util-linux versions avoids confusion...
OK, maybe it can't be solved in systemd. Thanks a lot!
Frankly, I do not see a reason to use loopdev to test mountinfo issues. It only increases complexity, and the kernel loopdev driver is pretty problematic when used in parallel. It seems better to minimize complexity and use, for example, "tmpfs" to test mountinfo. See for example my version of the script, which I used to implement the workaround: http://people.redhat.com/kzak/rep-tmpfs.sh. Anyway, the mountinfo read() issue should be fixed by the kernel (since 5.8).
…"just_mounted"
When starting a mount unit, systemd invokes mount command and moves the unit's internal state to "mounting". Then it watches for updates of /proc/self/mountinfo. When the expected mount entry newly appears in mountinfo, the unit internal state is changed to "mounting-done". Finally, when systemd finds the mount command has finished, it checks whether the unit internal state is "mounting-done" and changes the state to "mounted".
If the state was not "mounting-done" in the last step though mount command was successfully finished, the unit is marked as "failed" with following log messages:
  Mount process finished, but there is no mount.
  Failed with result 'protocol'.
If daemon-reload is done in parallel with starting mount unit, it is possible that things happen in following order and result in above failure.
  1. the mount unit state changes to "mounting"
  2. daemon-reload saves the unit state
  3. kernel completes the mount and /proc/self/mountinfo is updated
  4. daemon-reload restores the saved unit state, that is "mounting"
  5. systemd notices the mount command has finished but the unit state is still "mounting" though it should be "mounting-done"
mount_setup_existing_unit() should take into account that MOUNT_MOUNTING is transitional state and set MOUNT_PROC_JUST_MOUNTED flag if the unit comes from /proc/self/mountinfo so that mount_process_proc_self_mountinfo() later can make state transition from "mounting" to "mounting-done".
Fixes: systemd#10872
(cherry picked from commit 1d086a6)
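The sequence in this commit message can be replayed with a toy model. It is a heavily simplified sketch, not systemd's implementation: the `State` enum, the `MountUnit` class, and the `run` function with its two flags are all hypothetical, and the "fix" is modelled simply as re-evaluating the mountinfo entry after the saved state has been restored.

```python
from enum import Enum


class State(Enum):
    DEAD = "dead"
    MOUNTING = "mounting"
    MOUNTING_DONE = "mounting-done"
    MOUNTED = "mounted"
    FAILED = "failed"


class MountUnit:
    """Toy model of the mount unit states named in the commit message."""

    def __init__(self):
        self.state = State.DEAD

    def start(self):
        self.state = State.MOUNTING

    def on_mountinfo_entry(self):
        # The expected entry is present in /proc/self/mountinfo.
        if self.state is State.MOUNTING:
            self.state = State.MOUNTING_DONE

    def on_mount_exit(self):
        # The mount command finished successfully.
        if self.state is State.MOUNTING_DONE:
            self.state = State.MOUNTED
        else:
            # "Mount process finished, but there is no mount."
            self.state = State.FAILED


def run(reload_in_between, recheck_after_restore):
    unit = MountUnit()
    unit.start()                                       # 1. state is "mounting"
    saved = unit.state if reload_in_between else None  # 2. reload saves the state
    unit.on_mountinfo_entry()                          # 3. mountinfo is updated
    if reload_in_between:
        unit.state = saved                             # 4. reload restores "mounting"
        if recheck_after_restore:
            unit.on_mountinfo_entry()                  # modelled fix: entry re-evaluated
    unit.on_mount_exit()                               # 5. mount command has finished
    return unit.state
```

In this model, `run(True, False)` ends in the failed state, while `run(False, False)` and `run(True, True)` both end in the mounted state, which mirrors the ordering argument made in the commit message.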
Hopefully fixed by #23893.
systemd version the issue has been seen with
Used distribution
Expected behaviour you didn't see
Unexpected behaviour you saw
Steps to reproduce the problem
Grab the reproducer script: https://gist.github.com/bboozzoo/d4b142229b1915ef7cc0cf8593599ad9/828d716e484a39da11987b2dc38da86434d1f89f
We have been tracking the problem in snapd for a while in https://forum.snapcraft.io/t/unexplained-mount-failure-protocol-error-what-we-know-so-far/5682, and it started appearing around April/May 2018. It randomly reproduces in the CI runs on distros with recent(-ish) systemd while installing snaps. What snapd does is, in short: generate a mount unit for the snap, drop it under /etc/systemd/system, call systemctl daemon-reload, and later systemctl start <...>.mount. The last step randomly fails. The journal message is what is provided above.
The reproducer script was used to explore some possible ideas on how it fails. So far, the only variant in which the failure happens is daemon-reload interleaved with start/stop of mount units. On my machine it fails reliably in 1-2 loop iterations. Other variants that were explored and failed to reproduce: loading mount units beforehand and doing start/stop, using mount directly, and calling systemd-mount (this one failed in a peculiar, but unrelated way).
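As a rough illustration of the steps listed above (generate a unit, reload, start), the following sketch renders a mount unit and lists the follow-up commands without running them. It is not snapd's code. `escape_path` is a simplified approximation of what `systemd-escape --path` produces, `render_mount_unit` and `install_commands` are hypothetical helpers, and the snap path and revision used in the usage note are made-up examples.

```python
def escape_path(path):
    """Simplified approximation of `systemd-escape --path` (assumption, not exhaustive)."""
    trimmed = "/".join(part for part in path.split("/") if part)
    if not trimmed:
        return "-"  # the root directory maps to "-"
    out = []
    for index, ch in enumerate(trimmed):
        if ch == "/":
            out.append("-")
        elif (ch.isascii() and (ch.isalnum() or ch in ":_")) or (ch == "." and index > 0):
            out.append(ch)
        else:
            # Everything else, including "-", becomes a \xXX escape per byte.
            out.extend(f"\\x{byte:02x}" for byte in ch.encode("utf-8"))
    return "".join(out)


def render_mount_unit(what, where, fstype, options="defaults"):
    """Return (unit file name, unit file content) for a mount point."""
    name = escape_path(where) + ".mount"
    content = "\n".join([
        "[Unit]",
        f"Description=Mount unit for {where}",
        "",
        "[Mount]",
        f"What={what}",
        f"Where={where}",
        f"Type={fstype}",
        f"Options={options}",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
        "",
    ])
    return name, content


def install_commands(unit_name):
    """Commands that would follow once the file is under /etc/systemd/system."""
    return [["systemctl", "daemon-reload"], ["systemctl", "start", unit_name]]
```

For example, `render_mount_unit("/var/lib/snapd/snaps/core_5897.snap", "/snap/core/5897", "squashfs", "nodev,ro")` yields the name `snap-core-5897.mount`. The unit name has to match the escaped `Where=` path, which is why the escaping step is part of the sketch.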
I also tried Fedora 28/29 cloud images and Ubuntu 18.10/18.04 cloud image with similar results.
Edit: added Ubuntu 18.04