Image import #315

Merged
merged 13 commits into from
Apr 1, 2020

Conversation

squaremo
Member

See also #301

@squaremo
Member Author

squaremo commented Feb 14, 2020

TODO (integration tests for the following):

  • Deal with symlinks correctly when downloading and in the VFS (can use `jk run --lib alpine:3.9 ...` to check, since it symlinks everything in `/bin/` to `/bin/busybox`)
  • Deal with whiteout files: both individual masks and sibling masks (will it need to check each directory as it descends?)

@squaremo squaremo force-pushed the image-import branch 5 times, most recently from cf102be to 3cf55ca Compare February 17, 2020 15:45
@squaremo squaremo marked this pull request as ready for review February 28, 2020 16:08
@squaremo squaremo added this to the 0.4.0 milestone Mar 10, 2020
@squaremo squaremo requested a review from dlespiau March 12, 2020 08:34
@squaremo
Member Author

@dlespiau This is a wodge of code, much of it tests, and not much of it changes to existing code. But a significant change, nonetheless.

@squaremo
Member Author

squaremo commented Mar 12, 2020

Some more user-interface-oriented TODOs:

  • I've assumed that module code is in the root directory of an image. In general, images will be constructed for purpose, so that is easy to arrange. But maybe it's more friendly to expect people to put things in say, /jk/? (In the event, I made it /jk/modules, to make some room for other things e.g., the binary; I need to fix up the image build for jkcfg/kubernetes)

  • At present this will happily download :latest if you don't put a tag at the end of an image name. This is pretty convenient, but doesn't help with repeatability. Maybe tags should be required to discourage this (though you can always have foo:latest, and you can mutate tags anyway, so ..).

EDIT: turn these into things to fix
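
If tags do end up being required, the check could look something like the sketch below. This is only an illustration: `hasTagOrDigest` is a hypothetical helper, not something in this PR, and it uses a simplified heuristic rather than the full image reference grammar. It treats a `:` in the last path component as a tag, so a registry port such as `localhost:5000` is not mistaken for one.

```go
package main

import "strings"

// hasTagOrDigest reports whether an image reference pins a tag or a digest.
// Hypothetical helper; a simplified heuristic, not a full reference parser.
func hasTagOrDigest(ref string) bool {
	// A digest reference looks like name@sha256:...
	if strings.Contains(ref, "@") {
		return true
	}
	// Only look at the last path component, so that the port in
	// "localhost:5000/foo" does not count as a tag.
	last := ref[strings.LastIndex(ref, "/")+1:]
	return strings.Contains(last, ":")
}
```

With this, `alpine` and `localhost:5000/foo` would be rejected, while `alpine:3.9` and a digest reference would pass. It would not stop someone from writing `foo:latest` explicitly.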

@squaremo squaremo force-pushed the image-import branch 3 times, most recently from b07e0cf to 5f998d3 Compare March 27, 2020 17:44
This implements one half of an image cache -- creating a filesystem
from the cache, given an image name. (The other half is to download
and put the files into the cache in the first place.)

In the process, I moved the overlay filesystem to its own package.

Rough-and-ready code for downloading images into the cache. Manifests
referred to by digest get written out; otherwise, they get tagged.
 - factor out the symlink of tag manifests
 - factor out writing a layer
 - write each layer to a tmp dir, then move it, so failures don't
   leave partial layers
 - test that the cached image can be accessed using its digest,
   without using the image registry
This gives the VM (and therefore any command that runs the VM) a flag
`--lib`, which caches the named image and adds it to the module
search path.

At present the module search path only uses a FileResolver; that is,
it expects modules in a flat structure.

This involves a bit of extra test scaffolding, so that invocations of
`jk` when testing can use `--lib` (and `--cache` to set the cache to a
fresh, temporary directory). In particular, an image registry is
started, and images from tarballs are uploaded to it. This is described in
`tests/testfiles/README-images.md`.

In the OCI image format, the layer filesystems can contain "whiteout"
files (think Liquid Paper or Tipp-Ex). A whiteout suppresses any like-named
file in the layers below.

    https://github.com/opencontainers/image-spec/blob/master/layer.md#whiteouts

This commit accounts for whiteouts in the overlay VFS when opening a
file or enumerating files in a directory.

It's a fairly naive implementation, which checks for each possible
whiteout file when traversing layers (since whiteouts hide things in
the _next_ layer).
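
As a rough illustration of that traversal, here is a sketch over in-memory layers. It is not the overlay package from this PR: the `layer` map type and the `openOverlay` and `hiddenBy` names are made up for the example, and layers are assumed to be ordered topmost first. Only the `.wh.` prefix and the `.wh..wh..opq` opaque marker come from the OCI layer spec.

```go
package main

import (
	"path"
	"strings"
)

const (
	whiteoutPrefix = ".wh."         // .wh.<name> hides <name> in lower layers
	opaqueMarker   = ".wh..wh..opq" // hides everything in that directory below
)

// layer is a stand-in for one unpacked image layer: relative path -> contents.
type layer map[string]string

// hiddenBy reports whether l holds a whiteout masking name in lower layers.
// It checks the file itself and then each parent directory in turn.
func hiddenBy(l layer, name string) bool {
	for p := name; p != "" && p != "." && p != "/"; p = path.Dir(p) {
		dir, base := path.Split(p)
		if _, ok := l[dir+whiteoutPrefix+base]; ok {
			return true
		}
		if _, ok := l[dir+opaqueMarker]; ok {
			return true
		}
	}
	return false
}

// openOverlay looks name up in layers ordered topmost first.
func openOverlay(layers []layer, name string) (string, bool) {
	name = strings.TrimPrefix(path.Clean("/"+name), "/")
	for _, l := range layers {
		if data, ok := l[name]; ok {
			return data, true
		}
		// A whiteout in this layer only affects the layers beneath it,
		// so it is checked after looking for the file in the layer itself.
		if hiddenBy(l, name) {
			return "", false
		}
	}
	return "", false
}
```

Checking every ancestor on every lookup is the naive part: it costs one or two map (or filesystem) probes per path component per layer.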
Dealing with symlinks correctly is important because images may
contain symlinks to save repeating files. A notable example is alpine,
in which everything in /bin/ is a symlink to the busybox binary.
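
To make the alpine case concrete, here is a small sketch of following symlinks inside an image filesystem. It is an illustration under assumptions, not the PR's implementation: the `entry` type, `resolve` function and hop limit are hypothetical, and only a symlink in the final path component is followed. The point it shows is that an absolute link target is interpreted relative to the image root, never the host filesystem.

```go
package main

import (
	"errors"
	"path"
)

// entry is a stand-in for a file in an image: a regular file has data,
// a symlink has a non-empty target.
type entry struct {
	data   string
	target string
}

// maxHops bounds link chains so that cycles terminate (value is arbitrary).
const maxHops = 16

// resolve returns the contents of name, following symlinks in the final
// path component. All paths are rooted at the image root.
func resolve(files map[string]entry, name string) (string, error) {
	name = path.Clean("/" + name)
	for i := 0; i < maxHops; i++ {
		e, ok := files[name]
		if !ok {
			return "", errors.New("not found: " + name)
		}
		if e.target == "" {
			return e.data, nil
		}
		if path.IsAbs(e.target) {
			// Absolute targets stay inside the image.
			name = path.Clean(e.target)
		} else {
			// Relative targets are resolved against the link's directory.
			name = path.Clean(path.Join(path.Dir(name), e.target))
		}
	}
	return "", errors.New("too many levels of symbolic links")
}
```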
This adds tests for present and missing files in a library image that
contains whiteout files.

The test surfaced a mistake: the opaque whiteout file name is
`.wh..wh..opq`, not `.wh..wh.opq`. Owps.

Instead of expecting modules to be dumped in the root directory, it's
friendlier to give jk a "namespace" by putting things in a
subdirectory. For modules, I have chosen `/jk/modules`.

But to make sure we look for files in the right location (and not
outside it!), we have to do the same thing as we do for modules in the
"real" filesystem, and treat the chosen directory as the root of the
image filesystem. This requires an extra layer, to "chroot" in the
image filesystem.
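
The "chroot" layer amounts to rewriting every requested path so that it lands under the chosen directory. A minimal sketch, assuming a function-shaped filesystem interface: the `opener` type and `chroot` function are hypothetical names for this example, not the types used in the PR.

```go
package main

import "path"

// opener is a stand-in for "open this path in the image filesystem".
type opener func(name string) (string, error)

// chroot wraps inner so that every lookup is confined to root.
// Cleaning the name as an absolute path first collapses any leading
// "..", so a request cannot climb out of root.
func chroot(root string, inner opener) opener {
	return func(name string) (string, error) {
		return inner(path.Join(root, path.Clean("/"+name)))
	}
}
```

For example, with `root` set to `/jk/modules`, a request for `foo/index.js` is passed on as `/jk/modules/foo/index.js`, and a request for `../../etc/passwd` is passed on as `/jk/modules/etc/passwd` instead of escaping.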
It's much easier to debug if you get the full path (including the
image name) for an import candidate, when an import fails.

Starting with foolib, which is the easy one.

Phew! This took some figuring out. Details in README-images.md.

This needs a "plugin" to pflag, which I've put in `pkg/cli`.
@squaremo squaremo merged commit 7fd4fd7 into master Apr 1, 2020
@squaremo squaremo deleted the image-import branch April 1, 2020 08:16