Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

layers: xattr handling is not well-specified #503

Closed
cyphar opened this issue Dec 17, 2016 · 11 comments
Closed

layers: xattr handling is not well-specified #503

cyphar opened this issue Dec 17, 2016 · 11 comments

Comments

@cyphar
Copy link
Member

cyphar commented Dec 17, 2016

So, while implementing xattr support in umoci (cyphar/umoci#54) I discovered that it's not actually clear how to handle unpacking of attributes. In particular, it's not very clear what we should be doing when an extended attribute is defined for a file that has had its metadata modified.

Imagine you have a file foo which had the user.test xattr set. In the next layer, this xattr is missing but there is some other xattrs set. What should an extractor do:

  1. Should we always ensure that the xattrs of a file match exactly the same xattrs of the layer (meaning that user.test will be removed).

  2. Should we leave existing xattrs alone, so user.test will not be removed?

  3. Should we only remove xattrs that were added explicitly by the extractor from a previous layer (so user.test will be removed but something like security.selinux that is not included in layers will not)?

Each of these solutions has a problem that we need to address:

  1. This will cause issues with stuff like security.selinux which is automatically set on all created files based on your current context, and you cannot remove them.

  2. This will mean that you cannot effectively delete an xattr, because there's no way of specifying an xattr whiteout.

  3. This sounds like the most sane method. The problem is that it's a bit of a burden on implementors to have to keep track of what xattrs they've set on every file in a filesystem. In addition, how do you handle the case where a file already had some xattr set (which you didn't set) and then you're told to modify it. Should you remove it if it disappears in a future layer or not?

/cc @vbatts

@cyphar
Copy link
Member Author

cyphar commented Dec 17, 2016

Oh, in addition, we need to specify whether or not all xattrs should be included in a layer changeset. IMO we shouldn't be including security.selinux -- but some other xattrs might exist that cause us even more issues. My current solution (cyphar/umoci#54) is to just blacklist security.selinux pending further issues.

@wking
Copy link
Contributor

wking commented Dec 17, 2016 via email

@cyphar
Copy link
Member Author

cyphar commented Dec 17, 2016

@wking

  1. Should we always ensure that the xattrs of a file match exactly
    the same xattrs of the layer (meaning that user.test will be
    removed).
    This is what the current spec calls for [1]. Here's GNU tar doing
    just that:

No it isn't. Because doing an unlink and then adding xattrs will still mean you have a security.selinux xattr set on the file -- that's how SELinux works. But "match exactly" would require removing security.selinux if it's not set in the tar archive.

And it still doesn't help with layer generation -- how do we know if we should include certain xattrs.

For what it's worth (probably not much with such a fuzzy connection to
our spec), GNU tar distinguishes between selinux, ACLs, and other
xattrs [2]. And there seem to be plans to eventually split out
security.capability as well [3].

You're right, it doesn't help to point to the implementation of GNU tar in order to answer a simple question of "what should we do with xattrs".

@wking
Copy link
Contributor

wking commented Dec 17, 2016 via email

@cyphar
Copy link
Member Author

cyphar commented Dec 17, 2016

Add a flag to your layer generator (like GNU tar does with --xattrs
and friends)? I don't think there's a good way to make this call in
general.

The spec should at least provide some guidance. For example:

  1. user.* MUST be included.
  2. security.capability SHOULD be included.
  3. security.selinux SHOULD NOT be included.

Those are the three that come to mind immediately. 2 is because otherwise ping will be broken everywhere except in Ubuntu (where ping is setuid).

@jonboulle
Copy link
Contributor

  1. This sounds like the most sane method.

agreed

The problem is that it's a bit of a burden on implementors to have to keep track of what xattrs they've set on every file in a filesystem.

I don't understand why they need to track this so specifically - can't they just naively apply the xattrs at every layer? (e.g using the remove/recreate method). Tying into the next point - are there "implicit" sets that would happen on some layers but not on others?

In addition, how do you handle the case where a file already had some xattr set (which you didn't set)

Who else set it? Isn't this a file tree that we (the implementer) are rendering from scratch? Do you have any examples of cases other than the selinux one?

@cyphar
Copy link
Member Author

cyphar commented Dec 21, 2016

The problem is that it's a bit of a burden on implementors to have to keep track of what xattrs they've set on every file in a filesystem.

I don't understand why they need to track this so specifically - can't they just naively apply the xattrs at every layer? (e.g using the remove/recreate method). Tying into the next point - are there "implicit" sets that would happen on some layers but not on others?

So do the removal and re-creation for every file in an image, even if the content hasn't changed? I mean ... that could work fine (bit inefficient, but what can you do).0 The problem is not extraction but with generation -- how do I know which xattrs should I include in the next layer? Which ones did users set and which ones were implicitly set by the system? The way umoci solves this problem is by generating a manifest of the rootfs after extraction -- and then you can be sure that any changes were made by the user.

Who else set it? Isn't this a file tree that we (the implementer) are rendering from scratch? Do you have any examples of cases other than the selinux one?

Some of the trusted.* extended attributes in principle could be set by some super-root user, and if you run the tooling as root then you might see xattrs that were not part of the image. I'm not really sure.

@jonboulle
Copy link
Contributor

So do the removal and re-creation for every file in an image, even if the content hasn't changed?

well, isn't the only way to represent an xattr change between layers by having a new copy of the file in the upper layer anyway?

The problem is not extraction but with generation -- how do I know which xattrs should I include in the next layer? Which ones did users set and which ones were implicitly set by the system? The way umoci solves this problem is by generating a manifest of the rootfs after extraction -- and then you can be sure that any changes were made by the user.

Ah, now I get you. Yeah, I don't have a great answer (or at least one better than umoci's).

@cyphar
Copy link
Member Author

cyphar commented Dec 21, 2016

well, isn't the only way to represent an xattr change between layers by having a new copy of the file in the upper layer anyway?

Sure, but implementations could optimise disk access by storing the checksums and then computing the checksum of tar as they read the data. If the checksum is the same, there's no need to replace the file. It's a small optimisation, but I'm just wondering if there's a way to define changes without cutting out the possibility for such optimisations.

@jonboulle
Copy link
Contributor

omg... I think by having this discussion we're already wasting more than this optimisation would ever save :-)

@cyphar
Copy link
Member Author

cyphar commented Dec 21, 2016

@jonboulle Micro optimisations are where it's at. 😸 In any case, ultimately I think this should up to the runtime since the unpacking document already has the unlink+create wording. So I'll close this.

@cyphar cyphar closed this as completed Dec 21, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants