Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFE: umount: recognizable exit status if passed path is not a mount point #2049

Open
poettering opened this issue Jan 31, 2023 · 6 comments
Open

Comments

@poettering
Copy link
Contributor

poettering commented Jan 31, 2023

In systemd, during system shutdown we often end up unmounting some mounts we see that other subsystems consider themselves responsible for and end up unmounting at the same time.

This sucks, and result in log spew (because one side will succeed and the other fail), and we should address this via some explicit deps that ensure that either the subsystem or pid 1 ends up destroying the mount first, and they aren't suprised by that.

But still, I think it would be tremendously useful if systemd could recognize from the exit status of /bin/umount if the unmounting failed due to the fact the specified path wasn't actually a mount point. Right now though we just get a generic exit code of 32, which we cannot make much sense of.

Hence my feature request: return a documented, recognizable exit code in this case. So that systemd could just handle this case gracefully and proceed, knowing that once the command exited like this there definitely is no mount point anymore.

I am aware that the underlying syscall only returns EINVAL in that case, so it's not just a matter of propagating the kernel error code in some way. A relatively simple mechanism might be to just issue statx() with AT_NO_AUTOMOUNT|AT_STATX_DONT_SYNC and then look for the STATX_ATTR_MOUNT_ROOT flag. If it is not set, then return a new, recognizable error code, otherwise just return EINVAL as always.

(And I am fine if this only works on current kernels, I guess it would be fine to return 32 as before on those kernels as an error indicating "can't make out the precise cause")

I thought about just adding this same logic to PID 1 but I think doing this inside umount would help everywhere (also it's kinda nice if we can do fds accesses like that to arbitrary dirs in out-of-process helpers).

What do you think?

@adilger
Copy link

adilger commented Apr 13, 2023

This seems pretty attractive for another reason. Ever since umount started calling stat(2) on all mountpoints before unmount (via mnt_stat_mountpoint() today) this can lead to a hang for network filesystems like NFS and others. In almost all cases, the caller just checks if (not) S_ISDIR(st_mode), or just that the pathname exists. This would benefit from using statx(STATX_TYPE) and avoid revalidating the inode size or calling into the filesystem at all.

@karelzak
Copy link
Collaborator

@adilger, if you see a way how to use statx() to avoid calling into the filesystem, then I want a patch, ASAP :-) For example, we can improve mnt_stat_mountpoint() or introduce mnt_isdir_mountpoint() or so.

These improvements are really wanted, classic stat() is evil.

@brauner
Copy link
Contributor

brauner commented Apr 14, 2023

Yeah, we do have AT_STATX_DONT_SYNC. This will call into the filesystem but instruct the filesystem to not sync with the server.

@adilger
Copy link

adilger commented Apr 14, 2023

@brauner definitely DONT_SYNC will help. Since inode is pinned by mount it will always be in cache, and the st_ino, st_mode and st_dev are in struct vfs_inode so in theory this could be filled by the VFS with no call into the filesystem.

Alternately, doing getdents() on the parent dir would also return the d_type of the mountpoint, but not sure if that is enough.

karelzak added a commit that referenced this issue Apr 17, 2023
* prefer statx() with AT_STATX_DONT_SYNC if available
* keep fstatat() and stat() as fallback
* add test to mnt_stat_mountpoint()

The goal is to minimize situations when we need classic stat() because
it triggers automount, and stat() syscall may hang on unreachable
network filesystems. The automount issue was resolved years ago by
AT_NO_AUTOMOUNT; now we can use statx() to fix also hangs on NFS.

Addresses: #2049
Signed-off-by: Karel Zak <kzak@redhat.com>
@karelzak
Copy link
Collaborator

OK, I have tied to use statx() if possible, and it seems usable:

# strace -e statx /umount /mnt/test
statx(-1, "/mnt/test", AT_STATX_DONT_SYNC|AT_NO_AUTOMOUNT, STATX_TYPE|STATX_MODE|STATX_INO, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=STATX_ATTR_MOUNT_ROOT, stx_mode=S_IFDIR|0755, stx_size=1024, ...}) = 0

Committed.

@karelzak
Copy link
Collaborator

@adilger, thanks for your review!

I have pushed new changes related to the stat(), and now it should be almost stat()-free and without STATX_MODE. Please, review: f2663ba

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants
@adilger @karelzak @poettering @brauner and others