layers: xattr handling is not well-specified #503
Comments
Oh, in addition, we need to specify whether or not all xattrs should be included in a layer changeset. IMO we shouldn't be including |
On Fri, Dec 16, 2016 at 10:52:04PM -0800, Aleksa Sarai wrote:
> 1. Should we always ensure that the xattrs of a file match _exactly_
> the same xattrs of the layer (meaning that `user.test` will be
> removed).
This is what the current spec calls for [1]. Here's GNU tar doing
just that:
  $ tar --version
  tar (GNU tar) 1.28
  …
  $ mkdir a
  $ echo foo >a/test
  $ setfattr -n user.foo -v foo a/test
  $ mkdir b
  $ echo bar >b/test
  $ setfattr -n user.bar -v bar b/test
  $ tar -cf test.tar -C b --xattrs .
  $ tar -tf test.tar --xattrs -vv
  drwxr-xr-x wking/wking 0 2016-12-17 07:25 ./
  -rw-r--r--* wking/wking 4 2016-12-17 07:25 ./test
  x: 3 user.bar
  $ strace -o /tmp/trace tar -xf test.tar -C a --xattrs
  $ grep ./test /tmp/trace
  mknodat(4, "./test", 0644) = -1 EEXIST (File exists)
  unlinkat(4, "./test", 0) = 0
  mknodat(4, "./test", 0644) = 0
  setxattr("/proc/self/fd/4/./test", "user.bar", "bar", 3, 0) = 0
  openat(4, "./test", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK|O_CLOEXEC, 0644) = 5
  $ getfattr -d a/test
  # file: a/test
  user.bar="bar"
So the layer author has no way to pass through pre-existing xattrs,
but if you know what xattrs you want, you can always add a trailing
layer to restore anything that you don't want clobbered.
> 1. This will cause issues with stuff like `security.selinux` which
> is automatically set on all created files based on your current
> context, and you cannot remove them.
For what it's worth (probably not much with such a fuzzy connection to
our spec), GNU tar distinguishes between selinux, ACLs, and other
xattrs [2]. And there seem to be plans to eventually split out
security.capability as well [3].
[1]: https://github.com/opencontainers/image-spec/blame/v1.0.0-rc3/layer.md#L222
[2]: https://www.gnu.org/software/tar/manual/html_node/Extended-File-Attributes.html
[3]: http://git.savannah.gnu.org/cgit/tar.git/tree/src/xattrs.c?h=release_1_29#n665
|
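No it isn't. Because doing an `unlink` and then adding xattrs will still mean you have a `security.selinux` xattr set on the file -- that's how SELinux works. But "match exactly" would require **removing** `security.selinux` if it's not set in the `tar archive`.
And it still doesn't help with layer generation -- how do we know if we should include certain xattrs.
You're right, it doesn't help to point to the implementation of GNU tar in order to answer a simple question of "what should we do with xattrs". |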
On Sat, Dec 17, 2016 at 07:52:48AM -0800, Aleksa Sarai wrote:
> > > 1. Should we always ensure that the xattrs of a file match _exactly_
> > > the same xattrs of the layer (meaning that `user.test` will be
> > > removed).
> > This is what the current spec calls for [1]. Here's GNU tar doing
> > just that:
> No it isn't. Because doing an `unlink` and then adding xattrs will
> still mean you have a `security.selinux` xattr set on the file --
> that's how SELinux works. But "match exactly" would require
> **removing** `security.selinux` if it's not set in the `tar
> archive`.
Fair. The current spec calls for unlinking any pre-existing file,
that's what GNU tar does, and that's what we should do. If you get
SELinux settings by doing that, you don't have to clear them. I don't
think we need additional wording to cover the xattrs implications of
the already-specified unlink-and-replace wording.
> And it still doesn't help with layer generation -- how do we know if
> we should include certain xattrs.
Add a flag to your layer generator (like GNU tar does with --xattrs
and friends)? I don't think there's a good way to make this call in
general.
> > For what it's worth (probably not much with such a fuzzy
> > connection to our spec), GNU tar distinguishes between selinux,
> > ACLs, and other xattrs [2]. And there seem to be plans to
> > eventually split out security.capability as well [3].
> You're right, it doesn't help to point to the implementation of GNU
> tar in order to answer a simple question of "what should we do with
> xattrs".
It doesn't help clarify the current spec's stance, but it can inform
future spec changes. They presumably had reasons for breaking down
xattrs into subcategories, and maybe those reasons apply to us too.
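
One way to read the "add a flag to your layer generator" suggestion in the reply above is as a name filter applied before xattrs are written into a layer. The sketch below is illustrative only: `select_xattrs` and its `include`/`exclude` parameters are hypothetical names, loosely modelled on GNU tar's `--xattrs-include` and `--xattrs-exclude` options, and the matching uses Python's `fnmatch`, which may not agree with GNU tar's pattern rules in every case.

```python
import fnmatch
from typing import Dict, Iterable


def select_xattrs(
    xattrs: Dict[str, bytes],
    include: Iterable[str] = ("*",),
    exclude: Iterable[str] = (),
) -> Dict[str, bytes]:
    """Return only the xattrs a (hypothetical) layer generator should record.

    A name is kept when it matches at least one include pattern and no
    exclude pattern. Patterns are shell-style globs such as "user.*".
    """
    include = tuple(include)
    exclude = tuple(exclude)
    return {
        name: value
        for name, value in xattrs.items()
        if any(fnmatch.fnmatchcase(name, pat) for pat in include)
        and not any(fnmatch.fnmatchcase(name, pat) for pat in exclude)
    }
```

For instance, calling `select_xattrs(found, exclude=("security.*",))` would drop `security.selinux` while keeping `user.*` attributes. Whether that is the right default is exactly the open question in this thread.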
|
The spec should at least provide some guidance. For example:
Those are the three that come to mind immediately. 2 is because otherwise |
agreed
I don't understand why they need to track this so specifically - can't they just naively apply the xattrs at every layer? (e.g. using the remove/recreate method). Tying into the next point - are there "implicit" sets that would happen on some layers but not on others?
Who else set it? Isn't this a file tree that we (the implementer) are rendering from scratch? Do you have any examples of cases other than the selinux one? |
So do the removal and re-creation for every file in an image, even if the content hasn't changed? I mean ... that could work fine (bit inefficient, but what can you do). The problem is not with extraction but with generation -- how do I know which xattrs I should include in the next layer? Which ones did users set and which ones were implicitly set by the system? The way
Some of the |
well, isn't the only way to represent an xattr change between layers by having a new copy of the file in the upper layer anyway?
Ah, now I get you. Yeah, I don't have a great answer (or at least one better than umoci's). |
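
To illustrate the point that an xattr change travels with a full copy of the file's tar entry, here is a minimal sketch using Python's standard `tarfile` module. It assumes the `SCHILY.xattr.<name>` pax record convention that GNU tar uses for xattrs; `build_layer_entry` and `read_xattrs` are hypothetical helper names, and values are limited to text because `tarfile` exposes pax records as strings.

```python
import io
import tarfile
from typing import Dict

XATTR_PREFIX = "SCHILY.xattr."  # pax record prefix used by GNU tar for xattrs


def build_layer_entry(name: str, data: bytes, xattrs: Dict[str, str]) -> bytes:
    """Build an in-memory tar holding one file whose xattrs are pax records."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tf:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        # The xattrs are metadata on this entry; the entry still carries
        # the complete file content alongside them.
        info.pax_headers = {XATTR_PREFIX + k: v for k, v in xattrs.items()}
        tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_xattrs(layer: bytes, name: str) -> Dict[str, str]:
    """Return the xattrs recorded for `name`, with the pax prefix stripped."""
    with tarfile.open(fileobj=io.BytesIO(layer), mode="r") as tf:
        headers = tf.getmember(name).pax_headers
    return {
        key[len(XATTR_PREFIX):]: value
        for key, value in headers.items()
        if key.startswith(XATTR_PREFIX)
    }
```

Note that an entry with no xattr records is indistinguishable from "no xattrs recorded", which is why there is no obvious way to express an xattr whiteout in this format.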
Sure, but implementations could optimise disk access by storing the checksums and then computing the checksum of |
omg... I think by having this discussion we're already wasting more than this optimisation would ever save :-) |
@jonboulle Micro optimisations are where it's at. 😸 In any case, ultimately I think this should be up to the runtime since the unpacking document already has the |
So, while implementing `xattr` support in `umoci` (cyphar/umoci#54) I discovered that it's not actually clear how to handle unpacking of attributes. In particular, it's not very clear what we should be doing when an extended attribute is defined for a file that has had its metadata modified.

Imagine you have a file `foo` which had the `user.test` xattr set. In the next layer, this xattr is missing but there are some other xattrs set. What should an extractor do:

1. Should we always ensure that the xattrs of a file match exactly the same xattrs of the layer (meaning that `user.test` will be removed).
2. Should we leave existing xattrs alone, so `user.test` will not be removed?
3. Should we only remove xattrs that were added explicitly by the extractor from a previous layer (so `user.test` will be removed but something like `security.selinux` that is not included in layers will not)?

Each of these solutions has a problem that we need to address:

1. This will cause issues with stuff like `security.selinux` which is automatically set on all created files based on your current context, and you cannot remove them.
2. This will mean that you cannot effectively delete an xattr, because there's no way of specifying an xattr whiteout.
3. This sounds like the most sane method. The problem is that it's a bit of a burden on implementors to have to keep track of what xattrs they've set on every file in a filesystem. In addition, how do you handle the case where a file already had some xattr set (which you didn't set) and then you're told to modify it. Should you remove it if it disappears in a future layer or not?
/cc @vbatts
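
The three extraction options above can be compared side by side with a small planning function. This is a sketch under stated assumptions, not what umoci or the spec does: `plan_xattr_changes`, the policy names `"exact"`, `"keep"` and `"tracked"`, and the `set_by_extractor` bookkeeping are all hypothetical, and the function only computes what would change without touching a filesystem.

```python
from typing import Dict, Set, Tuple


def plan_xattr_changes(
    existing: Dict[str, bytes],
    layer: Dict[str, bytes],
    set_by_extractor: Set[str],
    policy: str,
) -> Tuple[Set[str], Dict[str, bytes]]:
    """Return (names_to_remove, values_to_set) for one file.

    existing:         xattrs currently on the file on disk
    layer:            xattrs recorded for the file in the new layer
    set_by_extractor: names the extractor itself set from earlier layers
    policy:           "exact" (option 1), "keep" (option 2), "tracked" (option 3)
    """
    # Under every policy, xattrs named in the layer are (re)applied.
    to_set = {k: v for k, v in layer.items() if existing.get(k) != v}

    if policy == "exact":
        removable = set(existing)  # anything not in the layer goes
    elif policy == "keep":
        removable = set()  # never delete, so no xattr whiteout
    elif policy == "tracked":
        removable = set_by_extractor & set(existing)  # only our own
    else:
        raise ValueError("unknown policy: " + repr(policy))

    return removable - set(layer), to_set
```

With `foo` carrying `user.test` and `security.selinux`, and a new layer that only lists some other xattr, `"exact"` plans to remove both, `"keep"` removes neither, and `"tracked"` removes only `user.test` provided the extractor recorded that it set it. The unanswered case from option 3, an xattr that was already present and is then modified by a layer, depends entirely on whether the caller adds that name to `set_by_extractor`.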