New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Any chance of changing the whiteout file approach? #24

Open
timthelion opened this Issue Apr 14, 2016 · 46 comments

Comments

Projects
None yet
9 participants
@timthelion
Copy link
Contributor

timthelion commented Apr 14, 2016

It seems unfortunate that a new standard should use a method which unecesarilly limits the standard. With the .wh file approach, base images can no longer contain arbitrary data. This means, for example, that you cannot have an image with example OCI image-spec data stored in it. Is there any possibility of changing this to use, for example, a white out list instead. So that the files that made a layer would be:

VERSION
layer.tar
whiteouts
json

That would mean that truely arbitrary data could be stored in the images, which would be really nice :)

@timthelion timthelion changed the title Any chance of changing the whiteout file approach Any chance of changing the whiteout file approach? Apr 14, 2016

@vbatts

This comment has been minimized.

Copy link
Member

vbatts commented Apr 14, 2016

To be clear, you are saying that the .wh approach is limited? And the
whiteout file is preferred?

If so, I agree.

On Thu, Apr 14, 2016, 17:16 Timothy Hobbs notifications@github.com wrote:

It seems unfortunate that a new standard should use a method which
unecesarilly limits the standard. With the .wh file approach, base images
can no longer contain arbitrary data. This means, for example, that you
cannot have an image with example OCI image-spec data stored in it. Is
there any possibility of changing this to use, for example, a white out
list instead. So that the files that made a layer would be:

VERSION
layer.tar
whiteouts
json

That would mean that truely arbitrary data could be stored in the images,
which would be really nice :)


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub
#24

@timthelion

This comment has been minimized.

Copy link
Contributor Author

timthelion commented Apr 14, 2016

Creating a list of files to be whited out, and keeping that outside of layer.tar is better than putting .wh files in layer.tar. We wouldn't want a situation like this: http://git-annex.branchable.com/forum/Storing_git_repos_in_git-annex/

@vbatts

This comment has been minimized.

Copy link
Member

vbatts commented Apr 14, 2016

I wholly agree and intend to see it done as a whiteout file list.

On Thu, Apr 14, 2016, 17:26 Timothy Hobbs notifications@github.com wrote:

Creating a list of files to be whited out, and keeping that outside of
layer.tar is better than putting .wh files in layer.tar. We wouldn't want
a situation like this:
http://git-annex.branchable.com/forum/Storing_git_repos_in_git-annex/


You are receiving this because you commented.
Reply to this email directly or view it on GitHub
#24 (comment)

@philips

This comment has been minimized.

Copy link
Contributor

philips commented Apr 14, 2016

I think this is something we should think about for post v1.0.0. I whole heartedly agree but in the ideal case the initial v1 serialization spec is binary compatible with the existing docker serialization to save the sanity of all of the existing registry folks like acr, gcr.io, Quay, Hub, etc.

@timthelion

This comment has been minimized.

Copy link
Contributor Author

timthelion commented Apr 14, 2016

Would it be possible to use the whiteout list file IFF a whiteout list file exists, and otherwise use the .wh approach?

@philips

This comment has been minimized.

Copy link
Contributor

philips commented Apr 15, 2016

@timthelion Yes, that would be the right way of approaching it. It would be a schema bump which would be a version break. Happy to consider adding this feature to fix the issue but I do want to hold off until we get post v1.0.0 (in a couple of months).

@timthelion

This comment has been minimized.

Copy link
Contributor Author

timthelion commented Apr 15, 2016

@philips so basically, you want to have the 1.0 release be supported by Docker/CoreOS/whatever, without actually haven't to change Docker/CoreOS/whatever's code? Basically, there will be no actual technical changes to the spec before 1.0?

@cyphar

This comment has been minimized.

Copy link
Member

cyphar commented Jul 20, 2016

@timthelion No breaking technical changes.

@wking

This comment has been minimized.

Copy link
Contributor

wking commented Sep 29, 2016

On Thu, Apr 14, 2016 at 02:26:16PM -0700, Timothy Hobbs wrote:

Creating a list of files to be whited out, and keeping that outside
of layer.tar is better than putting .wh files in layer.tar.

If we don't mind picking up a non-standard tar entry, star
(Schilling's tar) uses pax extension headers and defines
SCHILY.filetype with (among other things) a "whiteout" value
representing a BSD whiteout directory entry 1. And as far as I can
tell, that's the same sort of whiteout we're interested in. So if we
want a way to represent whiteouts without leaving tar or restricting
the legal filename space, that's probably a good choice.

@aecolley

This comment has been minimized.

Copy link

aecolley commented Oct 16, 2016

Now that v1.0.0-rc1 is out, it's the last chance to consider this before v1.0.0 is final. After that, this will be such an incompatible change that it will have to wait for v2.0.0 (assuming SemVer).

@wking

This comment has been minimized.

Copy link
Contributor

wking commented Oct 16, 2016

You could add a new whiteout approach in 1.1. You'd only need to go to 2.0 if you remove or make backward-incompatible changes to an existing approach.

@aecolley

This comment has been minimized.

Copy link

aecolley commented Oct 17, 2016

A backwards-compatible change means either that a 1.1 image must be processed correctly by a 1.0 extractor, or that a 1.0 image must be processed correctly by a 1.1 extractor; depending on your point of view. In the case of 1.1-image-on-1.0-extractor, the extractor will not know about any way to produce a file named .wh.foo, regardless of how 1.1 decides to represent it. In the case of 1.0-image-on-1.1-extractor, the image cannot contain a layer with any .wh. file, because the 1.0 spec states unambiguously that there is no representation which can produce such a file.

Either way, it seems to me that it's impossible to construct an image which extracts a .wh.foo file into the unpacked bundle, unless either (a) both image and extractor are version 1.1, which is not backwards-compatible by definition; or (b) the image produces different bundle contents in 1.0 extractors and 1.1 extractors; which is an incompatibility all on its own.

Perhaps there's something I'm missing. Perhaps these limitations are acceptable to the project members. Otherwise, it's something that should be addressed before v1.0.0, IMHO.

@vbatts

This comment has been minimized.

Copy link
Member

vbatts commented Oct 17, 2016

Interesting. There is an option of approaching whiteouts like overlayfs
does, by setting device to 0 on non-directories and and xattr for directory.

On Mon, Oct 17, 2016, 18:37 Adrian Colley notifications@github.com wrote:

A backwards-compatible change means either that a 1.1 image must be
processed correctly by a 1.0 extractor, or that a 1.0 image must be
processed correctly by a 1.1 extractor; depending on your point of view. In
the case of 1.1-image-on-1.0-extractor, the extractor will not know about
any way to produce a file named .wh.foo, regardless of how 1.1 decides to
represent it. In the case of 1.0-image-on-1.1-extractor, the image cannot
contain a layer with any .wh. file, because the 1.0 spec states
unambiguously that there is no representation which can produce such a file.

Either way, it seems to me that it's impossible to construct an image
which extracts a .wh.foo file into the unpacked bundle, unless either (a)
both image and extractor are version 1.1, which is not backwards-compatible
by definition; or (b) the image produces different bundle contents in 1.0
extractors and 1.1 extractors; which is an incompatibility all on its own.

Perhaps there's something I'm missing. Perhaps these limitations are
acceptable to the project members. Otherwise, it's something that should be
addressed before v1.0.0, IMHO.


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#24 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAEF6fCsS60XgtwhoZjJ5FVy8Go98VvXks5q0_cagaJpZM4IHyeg
.

@aecolley

This comment has been minimized.

Copy link

aecolley commented Oct 17, 2016

Unfortunately, support for extended attributes varies widely and incompatibly among tar implementations. And that gets us into the tarpit of the #342 discussion (pun accidental, I swear). But it works if the chosen tar format supports it.

@wking

This comment has been minimized.

Copy link
Contributor

wking commented Oct 17, 2016

On Mon, Oct 17, 2016 at 03:30:51PM -0700, Adrian Colley wrote:

A backwards-compatible change means either that a 1.1 image must be
processed correctly by a 1.0 extractor, or that a 1.0 image must be
processed correctly by a 1.1 extractor; depending on your point of
view.

This should be made very clear in the spec, leaving it open to
interpretation is going to make make SemVer largely useless. In
runtime-spec, opencontainers/runtime-spec#523 made it clear that you
may need a new runtime after bumping your config's minor version.

In the case of 1.1-image-on-1.0-extractor, the extractor will not
know about any way to produce a file named .wh.foo, regardless of
how 1.1 decides to represent it.

Agreed, which is why I want a 1.0 extractor to error out if you give
it a 1.1 image.

In the case of 1.0-image-on-1.1-extractor, the image cannot contain
a layer with any .wh. file, because the 1.0 spec states
unambiguously that there is no representation which can produce such
a file.

Well, you can have ‘.wh.foo’ entries. They're just interpreted as
“remove foo” and not “please create a file (or directory, or …) at
.wh.foo”. If 1.1 declares a new media type
(e.g. application/vnd.oci.image.layer.v1.1.tar) supporting
SCHILY.filetype whiteout 1 or device 0 2 or some other way to
avoid the current path overloading, then 1.0 extractors will correctly
die (with “I've never heard of
application/vnd.oci.image.layer.v1.1.tar”) and 1.1 and later 1.x
extractors will correctly unpack the layer (including any .wh.foo
files it contains).

Either way, it seems to me that it's impossible to construct an
image which extracts a .wh.foo file into the unpacked bundle,
unless either (a) both image and extractor are version 1.1, which is
not backwards-compatible by definition; or (b) the image produces
different bundle contents in 1.0 extractors and 1.1 extractors;
which is an incompatibility all on its own.

Backwards-compat means “the old stuff still works with the new tools”.
So 1.0 images would still work with 1.1 tools. But if .wh.foo files
are impossible in 1.0 (which seems like the current path), then yeah,
no 1.0 images are going to have them. But if you want a .wh.foo file,
and are willing to create a 1.1 image and use a 1.1+ extractor, that
will work. You don't need to bump to 2.0 (and throw out all of your
other 1.0 images).

@aecolley

This comment has been minimized.

Copy link

aecolley commented Oct 17, 2016

@wking OK, I see your point. 1.0 images can work on 1.x extractors so long as they don't attempt to create non-whiteout .wh. files, which is fine. I withdraw my position.

@stevvooe

This comment has been minimized.

Copy link
Contributor

stevvooe commented Oct 19, 2016

After spending some more intimate time with whiteout files (.wh.), the approach currently employed in this specification is really as good as any.

What good does changing this provide? I don't see how this allows layers to have "arbitrary data".

@wking

This comment has been minimized.

Copy link
Contributor

wking commented Oct 19, 2016

On Wed, Oct 19, 2016 at 04:47:27PM -0700, Stephen Day wrote:

After spending some more intimate time with whiteout files (.wh.),
the approach currently employed in this specification is really as
good as any.

What good does changing this provide? I don't see how this allows
layers to have "arbitrary data".

As @timthelion describes in the topic post 1, the current approach
makes it impossible to distribute .wh.* files because the entry path
is overloading “unpack me to here” and “delete the stuff there”. The
alternatives discussed here:

a. An external ‘whiteouts’ 1.
b. SCHILY.filetype set to whiteout 2.
c. Device set to 0 3.

all either give us an non-overloaded place to store the “delete the
stuff there” bit (a and b) or pick a location where the overloading is
less restrictive (c).

@stevvooe

This comment has been minimized.

Copy link
Contributor

stevvooe commented Oct 20, 2016

@wking Is there a realistic use case for distributing .wh. files, other than packing up a container runtime into an image?

Please, for the love of god, stop doing this.

@wking

This comment has been minimized.

Copy link
Contributor

wking commented Oct 20, 2016

On Wed, Oct 19, 2016 at 05:53:25PM -0700, Stephen Day wrote:

@wking Is there a realistic use case for distributing .wh. files,
other than packing up a container runtime into an image?

I don't have a personal use case for it, but I would like to avoid
complication when explaining what folks can put into layers. And from
an implementation perspective both SCHILY.filetype set to whiteout and
device set to 0 would be very easy to implement in code that already
uses the .wh.* approach.

@stevvooe

This comment has been minimized.

Copy link
Contributor

stevvooe commented Oct 21, 2016

@wking I don't really get this: the proposed layout is not how images are laid out. Such a provision requires either having in-band whiteout or a container format, which complicates unpacking. A tar of a tar, while we do it in image layout, should be avoided.

If we use overlay style device, now do you implement devices with windows on NTFS? Sure, you can put them in the tar file, but what happens when they are unpacked? How do you encode these in other archive formats that don't have device support, like zip?

@wking

This comment has been minimized.

Copy link
Contributor

wking commented Oct 21, 2016

On Fri, Oct 21, 2016 at 11:40:17AM -0700, Stephen Day wrote:

A tar of a tar, while we do it in image layout, should be avoided.

I agree, which is why I prefer the SCHILY.filetype whiteout approach or the device 0 approach. Both of those are in-band (like the current .wh.* approach), but SCHILY.filetype is not overloaded at all and device 0 is a less-restrictive overload.

If we use overlay style device…

I think you're talking about @VBatt's device 0 approach here.

… now do you implement devices with windows on NTFS? Sure, you can put them in the tar file, but what happens when they are unpacked?

If we land “device set to zero means whiteout” docs (for application/vnd.oci.image.layer.v1.tar or application/vnd.oci.image.layer.v1.1.tar), then an unpacker handling such a tarball will invoke the whiteout operation whenever it hits a device 0 entry. I don't see how the OS comes into it, since whiteouts are a cross-platform idea. Windows unpackers can still fail if they encounter a device 1 entry, etc.

The SCHILY.filetype-set-to-whiteout approach would also be cross-platform.

How do you encode these in other archive formats that don't have device support, like zip?

You don't, but that's not a big deal. We don't have an application/vnd.oci.image.layer.v1.zip format now, and I don't hear anyone calling for one. If there is a future need for zip-based layers, the authors of the zip-layer spec will need to figure out a scheme for marking whiteouts. Maybe they'll use .wh.*, and maybe they'll use something else, but I don't think the potential presence of a future zip-based format is a good reason to overload the path as a whiteout marker in tar.

@jonboulle

This comment has been minimized.

Copy link
Contributor

jonboulle commented Oct 24, 2016

I think we should definitely pursue this, but as previously discussed it's more likely for a 1.1+ horizon.

@timthelion

This comment has been minimized.

Copy link
Contributor Author

timthelion commented Oct 25, 2016

I don't know what the best way to change the spec is, but I personally
think that it would include adding a directory to the tarbal with a
whiteout list and any other information that we might want to add in the
future, so as to make the spec extendable.

That is that the / directory of the layer should include a
/opencontainer-data directory.

I have several use-cases in mind.

The first use-case is that of profesor Dr. Janette Lang in the year
2033. Dr. Lang is one of the smartest ecologists in the world, and is
currently working on a simulation of wetland health to be run on the
worlds fastest supercomputer. Dr. Lang is not even aware that the
simulation will be deployed using Docker, because dev-ops does the
deployment and dev-ops doesn't think anything of the idea, becasue
Docker is completely transparent for Dr. Lang's usecase, not needing
devices or even network access.

As the simulation is extremely large and runs for a long time, the
simulation is occassionally serialized to files with names that look like:

.wh.<wetland-name>.yyyy.MM.dd.hh.mm.ss

These serializations are associated with user editable meta-data files
named <wetland-name>.yml>.

It takes several weeks for the simlulation to complete. Sometime during
this time, dev-ops is running an upgrade. They have to migrate the
containers, and so they shut everything down, run a docker commit,
and then launch everything agian. No problem, Dr. Lang's team included
restarting the simulation right in their automated test suit. But this
time it doesn't work. The entire simulation starts over again at 0%. All
of a sudden, you have a night filled with long distance phone calls and
dev-ops seating their beards out trying to figure out why the hell
things aren't working. No one has a clue what is going on or why
everything didn't go as planned, and everyone is blaming everyone else,
no one even thinks to blame Docker, just as they don't think to blame
BTRFs or the kernal, or any of the other fundamental system utilities
that they are using.

Their final solution is to let the simulation run its full course all
over again, with stern order's from Dr. Lang to the dev-ops team that
they are "Not to fuck with anything." And a tired and anxoious dev-ops
team cursing over coffee about how these damned scientists can't learn
to program properly and how they assume that the serialization file was
never saved in the first place because it's "Just not there".


The second use-case is that I want to build an image within an image.


The third use-case is that of Enrico Xavier who is a spanish student of
computer science and a great fan of Esparanto. The language which was
specifically designed to not have any exceptions with regard to verb
conjugation. He hears his lecterur say that you can put any file into an
image except one which starts with .wh.. And Enrico Xavier correctly
assumes that it's stupid to waste your time trying to memorize random
exceptions and therefor decides never to use containers or images.

On 10/21/2016 08:40 PM, Stephen Day wrote:

@wking https://github.com/wking I don't really get this: the
proposed layout is not how images are laid out. Such a provision
requires either having in-band whiteout or a container format, which
complicates unpacking. A tar of a tar, while we do it in image layout,
should be avoided.

If we use overlay style device, now do you implement devices with
windows on NTFS? Sure, you can put them in the tar file, but what
happens when they are unpacked? How do you encode these in other
archive formats that don't have device support, like zip?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#24 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABU7-JWRbm15jVCDWDh7awRc2sNj76nqks5q2QcTgaJpZM4IHyeg.

@wking

This comment has been minimized.

Copy link
Contributor

wking commented Oct 25, 2016

On Tue, Oct 25, 2016 at 11:12:15AM -0700, Timothy Hobbs wrote:

I don't know what the best way to change the spec is, but I personally think that it would include adding a directory to the tarbal with a whiteout list and any other information that we might want to add in the future, so as to make the spec extendable.

You can extend at any time by minting new media types. You don't have to add support for something like this now on the off-chance that we'll use it later.

That is that the / directory of the layer should include a /opencontainer-data directory.

This has the same path-restriction as the current .wh.* approach, although it limits the restriction to a single path. POSIX pax extended headers provide the same functionality without overloading the entry path. And SCHILY.filetype is one example of what you can do with those extended headers.

@timthelion

This comment has been minimized.

Copy link
Contributor Author

timthelion commented Oct 25, 2016

Excelent. I of course don't want to restrict the possible paths at all,
and did not know that a better method was possible. Somehow, I did not
understand from your discussion that this SCHILY thing is a method of
hiding files inside the TAR that remain outside of the file tree.

On 10/25/2016 08:20 PM, W. Trevor King wrote:

On Tue, Oct 25, 2016 at 11:12:15AM -0700, Timothy Hobbs wrote:

I don't know what the best way to change the spec is, but I
personally think that it would include adding a directory to the
tarbal with a whiteout list and any other information that we
might want to add in the future, so as to make the spec extendable.

You can extend at any time by minting new media types. You don't have
to add support for something like this now on the off-chance that
we'll use it later.

That is that the |/| directory of the layer should include a
|/opencontainer-data| directory.

This has the same path-restriction as the current |.wh.*| approach,
although it limits the restriction to a single path. POSIX pax
extended
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html#tag_20_92_13_02
headers
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html#tag_20_92_13_03
provide the same functionality without overloading the entry path. And
|SCHILY.filetype|
http://cdrtools.sourceforge.net/private/man/star/star.4.html is one
example of what you can do with those extended headers.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#24 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABU7-MQwXRKptr18oOlfACQltXiCI2vrks5q3khzgaJpZM4IHyeg.

@stevvooe

This comment has been minimized.

Copy link
Contributor

stevvooe commented Oct 25, 2016

@timthelion Besides the contrived examples, do you have an extant proof that someone has actually run into a naming collision with whiteout files? I think there are problems in nested container scenarios but that can be solved with filesystem passthrough.

Look, I am not saying that AUFS whiteouts are ideal, but I thought the goal here was to define a container standard based on working systems. Especially, one that people actually use.

If we always focus on the limitations, we'll never realize the benefits.

@aecolley

This comment has been minimized.

Copy link

aecolley commented Oct 26, 2016

"It's already implemented" is literally the only good thing to say about the .wh. scheme, but it's still a powerful argument. Bumping the imageLayoutVersion can be used to signal a better-thought out scheme in the future. Anyone who really needs .wh. files in a 1.0.0 image can rename them into place as the first action in the container's runtime 😧.

I'm really only an outsider throwing peanuts from the gallery, but if I were dictator, I'd make the .wh. prefix overrideable by a new field in the image config, because that's a quick-to-implement change. But being stuck with .wh. for 1.0 is OK as long as we're not going to be stuck with it for the long term.

@wking

This comment has been minimized.

Copy link
Contributor

wking commented Oct 26, 2016

On Tue, Oct 25, 2016 at 05:37:37PM -0700, Adrian Colley wrote:

Bumping the imageLayoutVersion can be used to signal a
better-thought out scheme in the future…

imageLayoutVersion has nothing to do with it. The thing we'd be
bumping is the layer media type (e.g. by minting a new
application/vnd.oci.image.layer.v1.1.tar 1). But your main point
(that we aren't stuck with this long term unless we plan to support
the v1.0 media types forever) stands.

@vbatts

This comment has been minimized.

Copy link
Member

vbatts commented Oct 28, 2016

a couple points:

  1. I agree that pinning this standard to the rejected AUFS' approach is silly. The only redeeming thing is that "that's the way it's done in docker". The .wh. is a silly approach. The lack of ordering still bugs me.
  2. Having a whiteout list provided would be a decent alternative. Best to have it in the ordered list of "layers" or "references" to be applied. Because if it were just a file at the / root inside the tarball, than it would best be either the first entry or directly after an entry for /.
  3. There is an approach taken by overlayfs, which is upstreamed. Detailed https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt. For directories the set an xattr, of trusted.overlay.opaque=y which is somewhat tailored. But for all other file types they use a trick of setting the entry as a character device to 0/0 major/minor device, which hides it. Somewhat clever. Similar enough to the AUFS whiteout approach, with none of the silly .wh. name prefix-ing.
  4. While this talk of a SCHILY. approach is possible does not mean that it should be pursued. I personally am against the SCHILY. approach. It is a PAX header, but would require changes to virtually all tar implementations. Event the golang library would require some re-work to allow for setting arbitrary pax headers like a new SCHILY. type. Which would likely mean carrying/maintaining a fork of upstream stdlib. This is a none-starter.
@wking

This comment has been minimized.

Copy link
Contributor

wking commented Oct 28, 2016

On Fri, Oct 28, 2016 at 06:05:57AM -0700, Vincent Batts wrote:

  1. There is an approach taken by overlayfs, which is
    upstreamed. Detailed
    https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt. For
    directories the set an xattr, of trusted.overlay.opaque=y which
    is somewhat tailored. But for all other file types they use a
    trick of setting the entry as a character device to 0/0
    major/minor device, which hides it. Somewhat clever. Similar
    enough to the AUFS whiteout approach, with none of the silly
    .wh. name prefix-ing.
  2. While this talk of a SCHILY. approach is possible does not mean
    that it should be pursued. I personally am against the
    SCHILY. approach. It is a PAX header, but would require changes
    to virtually all tar implementations. Event the golang library
    would require some re-work to allow for setting arbitrary pax
    headers like a new SCHILY. type. Which would likely mean
    carrying/maintaining a fork of upstream stdlib. This is a
    none-starter.

I don't think we'd need to fork archive/tar to support SCHILY.filetype
whiteout. GNU's tar is already using pax Extended Headers for xattr
support:

$ zgrep TMPFS_XATTR /proc/config.gz
CONFIG_TMPFS_XATTR=y
$ mkdir a
$ sudo mount -t tmpfs none a
$ touch a/b
$ md5sum a/b
d41d8cd98f00b204e9800998ecf8427e a/b
$ sudo setfattr -n trusted.md5sum -v d41d8cd98f00b204e9800998ecf8427e a/b
$ tar --version
tar (GNU tar) 1.28

$ sudo tar -cf a.tar --xattrs a
$ strings a.tar | grep SCHILY
64 SCHILY.xattr.trusted.md5sum=d41d8cd98f00b204e9800998ecf8427e

With SCHILY.filetype we'd just be putting a different payload in that
typeflag x entry. I can mock up an example if that would help
demonstrate the point (or turn up any implementation issues I'm
missing ;).

@wking

This comment has been minimized.

Copy link
Contributor

wking commented Oct 28, 2016

On Fri, Oct 28, 2016 at 08:20:06AM -0700, W. Trevor King wrote:

$ strings a.tar | grep SCHILY
64 SCHILY.xattr.trusted.md5sum=d41d8cd98f00b204e9800998ecf8427e

With SCHILY.filetype we'd just be putting a different payload in that
typeflag x entry. I can mock up an example if that would help
demonstrate the point (or turn up any implementation issues I'm
missing ;).

Looking into this more today, it turns out that there is an issue with
Go's stock archive/tar: Go currently consumes x typeflag entries
internally and throws out any extended headers it doesn't recognize
(golang/go#14472). I've suggested Go provide a way around that
limitation, but until that happens Go will ignore any extended headers
besides SCHILY.xattr.*. Overlayfs's trusted.overlay.opaque (which in
the pax extended header would be SCHILY.xattr.trusted.overlay.opaque)
avoids the current Go limitation.

I don't understand why overlayfs uses one approach (c0/0) for whiteout
files and another (trusted.overlay.opaque) for opaque directories. It
seems like they'd use trusted.overlay.whiteout or some such to follow
the xattr pattern. Maybe there is a performance benefit to avoiding
an xattr lookup for whiteouts, but opaque directories need to be
directories so they can hold upper-level content, and you can't have a
c0/0 directory ;). Unpacking a layer tarball is not something that
happens as frequently as overlayfs file access, so if the reason for
using c0/0 in overlayfs is performance, that reasoning may not apply
here. On the other hand, c0/0 works for overlayfs, and major 0 is
reserved (at least on Linux 1). POSIX doesn't have much to say
about major/minor 2, with the most detail coming from the ustar
spec. From ustar's typeflag docs 3:

3,4
Represent character special files and block special files
respectively. In this case the devmajor and devminor fields shall
contain information defining the device, the format of which is
unspecified by this volume of POSIX.1-2008. Implementations may
map the device specifications to their own local specification or
may ignore the entry.

So I'm ok with the following ways for specifying “this is a whiteout”:

a. Follow overlayfs into overloading c0/0, which likely works (and
matches an existing implementation) but has unclear namespacing and
portability.

b. Follow overlayfs into coining a new SCHILY.xattr.trusted
(SCHILY.xattr.trusted.oci.whiteout?). This has better namespacing
than (a) but means we're coining a new pax extended header string
(even though we're following an established “use a pax extended
header” approach).

c. Wait until Go fixes their pax extended header processing and use
the existing SCHILY.filetype whiteout.

@cyphar

This comment has been minimized.

Copy link
Member

cyphar commented Nov 5, 2016

I was reading about the Tar archive format the other day, and I noticed that we could just use Typeflag for this. Just specify that 'W' indicates a whiteout path (A-Z is defined as fair-game for tar archives from what I understand). This also helps with ensuring that we don't have to make it clear that you have to add a directory as a "file" when whiting it out.

One thing I ran into with umoci is that I'm not entirely sure what I'm meant to do with parent directories and whiteouts. Is it required that I include all of the parent components of a path in a diff layer if I'm whiting out a file? Or can I also just include the entries which are specifically paths that have been removed / explicitly changed?

@wking

This comment has been minimized.

Copy link
Contributor

wking commented Nov 5, 2016

On Sat, Nov 05, 2016 at 04:42:46AM -0700, Aleksa Sarai wrote:

I was reading about the Tar archive format the other day, and I
noticed that we could just use Typeflag for this. Just specify
that 'W' indicates a whiteout path…

This is the same approach as SCHILY.filetype whiteout, but without
using a pax extended header. I'd rather wait for golang/go#14472 and
use the pax extended header, to avoid accidentally misinterpreting
some other tar extension that has decided to use a ‘W’ type for
something else. Especially with #342 still in flight, leaving which
tar unpackers MUST recognize very unclear.

One thing I ran into with umoci is that I'm not entirely sure what
I'm meant to do with parent directories and whiteouts. Is it
required that I include all of the parent components of a path in a
diff layer if I'm whiting out a file? Or can I also just include the
entries which are specifically paths that have been removed /
explicitly changed?

Layers only need to contain the paths they're touching. If you want
to whiteout ./a/b/c but don't need to touch ./a or ./a/b, your tar can
have a single entry touching:

./a/b/c

This is covered by the “extracted like a regular tar archive” wording
1 and underlined (for the ./ case) in the text that landed via #408.

@cyphar

This comment has been minimized.

Copy link
Member

cyphar commented Nov 11, 2016

I just ran into this issue with this commit: cyphar/umoci@2a42128. Basically if you call lstat on AUFS with a whiteout path you get an EPERM. Which means that minor ordering issues like that commit fixes become permission issues and bugs.

cyphar added a commit to openSUSE/umoci that referenced this issue Nov 11, 2016

layer unpack: fix whiteout errors on AUFS
lstat(2) returns EPERM if you try to get information about a whiteout
path. This is one of the reasons I think that matching the AUFS
semantics is insane.

Ref: opencontainers/image-spec#24
Signed-off-by: Aleksa Sarai <asarai@suse.com>

cyphar added a commit to openSUSE/umoci that referenced this issue Nov 11, 2016

layer unpack: fix whiteout errors on AUFS
lstat(2) returns EPERM if you try to get information about a whiteout
path. This is one of the reasons I think that matching the AUFS
semantics is insane.

Ref: opencontainers/image-spec#24
Signed-off-by: Aleksa Sarai <asarai@suse.com>
@wking

This comment has been minimized.

Copy link
Contributor

wking commented Aug 25, 2017

@cyphar

This comment has been minimized.

Copy link
Member

cyphar commented Aug 26, 2017

@wking I think that (b) or (e) are the two best options. (e) is the best from a "let's create a standard that you have to explicitly implement" perspective, but (b) is the best from a "let's do what other people do already" (I consider the BSDs to be a better source of inspiration in this area than Linux filesystem internals). As for (c) and (d), they feel like they'd either be inconsistent with overlayfs anyway (since overlayfs does both) and would likely be inconsistent with how a normal user of a tar parsing library would act. (a) feels clunky, and (f) feels like a misuse of Typeflag which is a very small namespace.

@wking

This comment has been minimized.

Copy link
Contributor

wking commented Aug 27, 2017

@stevvooe

This comment has been minimized.

Copy link
Contributor

stevvooe commented Aug 31, 2017

What problem are we trying to solve here? There is already millions of container images using this approach. Tweaking this won't actually make the format better; it just means that complaint implementations now need to handle two variations, instead of one.

If we are going to make an incompatible format, it would be better to move away from a layered diff approach to something tree oriented (like https://github.com/containerd/continuity).

@timthelion

This comment has been minimized.

Copy link
Contributor Author

timthelion commented Aug 31, 2017

@wking

This comment has been minimized.

Copy link
Contributor

wking commented Aug 31, 2017

@caniszczyk

This comment has been minimized.

Copy link
Contributor

caniszczyk commented Aug 31, 2017

@timthelion please keep things civil, we have a code of conduct that we expect our community to abide by: https://www.opencontainers.org/about/code-of-conduct

@vbatts

This comment has been minimized.

Copy link
Member

vbatts commented Oct 11, 2018

This came up again recently at Container Camp UK in London. @cyphar and I were discussing what other options would be a migration path from the AUFS inherited approach. And as we also want to enable ourselves for a non-TAR future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment