Skip to content

Commit

Permalink
ovl: port to new mount api
Browse files Browse the repository at this point in the history
We recently ported util-linux to the new mount api. Now the mount(8)
tool will by default use the new mount api. While trying hard to fall
back to the old mount api gracefully there are still cases where we run
into issues that are difficult to handle nicely.

Now with mount(8) and libmount supporting the new mount api I expect an
increase in the number of bug reports and issues we're going to see with
filesystems that don't yet support the new mount api. So it's time we
rectify this.

For overlayfs specifically we ran into issues where mount(8) passed
multiple lower layers as one big string through fsconfig(). But the
fsconfig() FSCONFIG_SET_STRING option is limited to 256 bytes in
strndup_user(). While this would be fixable by extending the fsconfig()
buffer I'd rather encourage users to append layers via multiple
fsconfig() calls as the interface allows nicely for this. This has also
been requested as a feature before.

With this port to the new mount api the following will be possible:

        fsconfig(fs_fd, FSCONFIG_SET_STRING, "lowerdir", "/lower1", 0);

	/* set upper layer */
	fsconfig(fs_fd, FSCONFIG_SET_STRING, "upperdir", "/upper", 0);

	/* append "/lower2", "/lower3", and "/lower4" */
        fsconfig(fs_fd, FSCONFIG_SET_STRING, "lowerdir", ":/lower2:/lower3:/lower4", 0);

	/* turn index feature on */
	fsconfig(fs_fd, FSCONFIG_SET_STRING, "index", "on", 0);

	/* append "/lower5" */
        fsconfig(fs_fd, FSCONFIG_SET_STRING, "lowerdir", ":/lower5", 0);

Specifying ':' would have been rejected so this isn't a regression. And
we can't simply use "lowerdir=/lower" to append on top of existing
layers as "lowerdir=/lower,lowerdir=/other-lower" would make
"/other-lower" the only lower layer so we'd break uapi if we changed
this. So the ':' prefix seems a good compromise.

Users can choose to specify multiple layers at once or individual
layers. A layer is appended if it starts with ":". This requires that
the user has already added at least one layer before. If lowerdir is
specified again without a leading ":" then all previous layers are
dropped and replaced with the new layers. If lowerdir is specified and
empty than all layers are simply dropped.

An additional change is that overlayfs will now parse and resolve layers
right when they are specified in fsconfig() instead of deferring until
super block creation. This allows users to receive early errors.

It also allows users to actually use up to 500 layers something which
was theoretically possible but ended up not working due to the mount
option string passed via mount(2) being too large.

This also allows a more privileged process to set config options for a
lesser privileged process as the creds for fsconfig() and the creds for
fsopen() can differ. We could restrict that they match by enforcing that
the creds of fsopen() and fsconfig() match but I don't see why that
needs to be the case and allows for a good delegation mechanism.

Plus, in the future it means we're able to extend overlayfs mount
options and allow users to specify layers via file descriptors instead
of paths:

	fsconfig(FSCONFIG_SET_PATH{_EMPTY}, "lowerdir", "lower1", dirfd);

	/* append */
	fsconfig(FSCONFIG_SET_PATH{_EMPTY}, "lowerdir", "lower2", dirfd);

	/* append */
	fsconfig(FSCONFIG_SET_PATH{_EMPTY}, "lowerdir", "lower3", dirfd);

	/* clear all layers specified until now */
	fsconfig(FSCONFIG_SET_STRING, "lowerdir", NULL, 0);

This would be especially nice if users create an overlayfs mount on top
of idmapped layers or just in general private mounts created via
open_tree(OPEN_TREE_CLONE). Those mounts would then never have to appear
anywhere in the filesystem. But for now just do the minimal thing.

We should probably aim to move more validation into ovl_fs_parse_param()
so users get errors before fsconfig(FSCONFIG_CMD_CREATE). But that can
be done in additional patches later.

Link: util-linux/util-linux#2287 [1]
Link: util-linux/util-linux#1992 [2]
Link: https://bugs.archlinux.org/task/78702 [3]
Link: https://lore.kernel.org/linux-unionfs/20230530-klagen-zudem-32c0908c2108@brauner [4]
Signed-off-by: Christian Brauner <brauner@kernel.org>
---

I'm starting to get the feeling that I stared enough at this and I would
need a fresh set of eyes to review it for any bugs. Plus, Amir seems to
have conflicting series and I would have to rebase anyway so no point in
delaying this any further.
  • Loading branch information
brauner committed Jun 8, 2023
1 parent ac70cec commit 213f92f
Showing 1 changed file with 568 additions and 328 deletions.

0 comments on commit 213f92f

Please sign in to comment.