image: Refactor to use cas/ref engines instead of walkers #5

wking · 2016-09-16T08:53:40Z

Abstract out the details of the image-layout format, since the validators and unpackers shouldn't care about that sort of backend stuff. This should make it easy to build backends based on zip files, HTTP, the Docker registry format, etc.

Carried from opencontainers/image-spec#159. The outstanding issue there was what to do about tar handling. I'm fine with this PR as it stands, but I've floated a few alternatives in case the maintainers want to pick an alternative path.

runcom · 2016-09-16T09:14:12Z

/cc @s-urbaniak

wking · 2016-09-17T06:15:34Z

I've pushed 9687484 → 90c4fea with:

An ‘=’ → ‘:=’ fix in cmd/oci-refs/list.go's printName.
O_RDWR → O_WRONLY in image/layout/tar.go's CreateTarFile.
A new commit adding directory-backed engines (to compliment the
previous tar-backed engines). The directory-backed engines are much
nicer ;).

On Fri, Sep 16, 2016 at 01:53:40AM -0700, W. Trevor King wrote:

Abstract out the details of the image-layout format, since the validators and unpackers shouldn't care about that sort of backend stuff. This should make it easy to build backends based on zip files, HTTP, the Docker registry format, etc.

Carried from opencontainers/image-spec#159. The outstanding issue there was what to do about tar handling. I'm fine with this PR as it stands, but I've floated a few alternatives in case the maintainers want to pick an alternative path.

You can view, comment on, or merge this pull request online at:

#5

-- Commit Summary --

Makefile: Add a pattern rule for oci-* commands

image/cas: Add a generic CAS interface

image/refs: Add a generic name-based reference interface

specs-go: Add ImageLayoutVersion and check oci-layout in tar engines

image/layout/tar.go: Add TarEntryByName

image/cas/put: Add a PutJSON helper

image/cas: Implement Engine.Put

vendor: Bundle golang.org/x/net/context

image/*/interface: Add unstable warnings to Engines

image/layout/tar: Add a CreateTarFile helper

image/refs: Implement Engine.Put

image: Refactor to use cas/ref engines instead of walkers

cmd: Document the cas, refs, and init commands

-- File Changes --
M .gitignore (5)
M Makefile (20)
A cmd/oci-cas/get.go (93)
A cmd/oci-cas/main.go (38)
A cmd/oci-cas/oci-cas-get.1.md (27)
A cmd/oci-cas/oci-cas-put.1.md (27)
A cmd/oci-cas/oci-cas.1.md (47)
A cmd/oci-cas/put.go (83)
M cmd/oci-create-runtime-bundle/main.go (9)
M cmd/oci-create-runtime-bundle/oci-create-runtime-bundle.1.md (4)
A cmd/oci-image-init/image_layout.go (57)
A cmd/oci-image-init/main.go (37)
A cmd/oci-image-init/oci-image-init-image-layout.1.md (27)
A cmd/oci-image-init/oci-image-init.1.md (33)
A cmd/oci-image-tools.7.md (40)
M cmd/oci-image-validate/main.go (14)
M cmd/oci-image-validate/oci-image-validate.1.md (8)
A cmd/oci-refs/get.go (78)
A cmd/oci-refs/list.go (75)
A cmd/oci-refs/main.go (39)
A cmd/oci-refs/oci-refs-get.1.md (27)
A cmd/oci-refs/oci-refs-list.1.md (27)
A cmd/oci-refs/oci-refs-put.1.md (27)
A cmd/oci-refs/oci-refs.1.md (55)
A cmd/oci-refs/put.go (81)
M cmd/oci-unpack/main.go (13)
M cmd/oci-unpack/oci-unpack.1.md (4)
M glide.lock (10)
M image/autodetect.go (3)
A image/cas/interface.go (47)
A image/cas/layout/interface.go (25)
A image/cas/layout/main.go (37)
A image/cas/layout/tar.go (108)
A image/cas/put.go (47)
M image/config.go (69)
M image/descriptor.go (115)
M image/image.go (202)
A image/layout/layout.go (22)
A image/layout/tar.go (267)
M image/manifest.go (119)
M image/manifest_test.go (45)
A image/refs/interface.go (82)
A image/refs/layout/main.go (37)
A image/refs/layout/tar.go (135)
D image/walker.go (118)
A vendor/golang.org/x/net/LICENSE (27)
A vendor/golang.org/x/net/PATENTS (22)
A vendor/golang.org/x/net/context/context.go (156)
A vendor/golang.org/x/net/context/go17.go (72)
A vendor/golang.org/x/net/context/pre_go17.go (300)
-- Patch Links --

https://github.com/opencontainers/image-tools/pull/5.patch
https://github.com/opencontainers/image-tools/pull/5.diff

You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
#5

This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy

wking · 2016-09-17T06:52:26Z

Pushed 90c4fea → 7c9efa7 adding:

oci-cas delete …
oci-refs delete …
DirDelete to avoid 1:

$ make lint
checking lint
image/cas/layout/dir.go:108::warning: duplicate of image/refs/layout/dir.go:156-167 (dupl)
image/refs/layout/dir.go:156::warning: duplicate of image/cas/layout/dir.go:108-119 (dupl)

xiekeyang · 2016-09-22T00:27:46Z

@wking

This patch seems to be too great to be merged sooner. Could you please split it and submit one by one commit? Now we need the method instead of walking, and go forward to some other functionality based on it. I'm working on my local side https://github.com/xiekeyang/image-tools/commit/ff0401330fdca591f67b82924c74fc84ce9d052f referring to your implementation, which is temporary solution and should wait for your patch merged. So I hope your commit can be merged. Thanks so much.

wking · 2016-09-22T05:54:19Z

On Wed, Sep 21, 2016 at 05:27:46PM -0700, xiekeyang wrote:

Could you please split it and submit one by one commit?

First commit submitted separately as #28. If/when that lands, I'll
file a PR for the next commit, etc. (unless this PR has already landed
in its entirety ;).

wking · 2016-10-11T19:45:42Z

On Wed, Sep 21, 2016 at 10:54:19PM -0700, W. Trevor King wrote:

On Wed, Sep 21, 2016 at 05:27:46PM -0700, xiekeyang wrote:

Could you please split it and submit one by one commit?

First commit submitted separately as #28. If/when that lands, I'll
file a PR for the next commit, etc. (unless this PR has already
landed in its entirety ;).

Rebased onto master with 7c9efa7 → 1926150 now that #28 has landed.
Bottom of that stack submitted separately as #40 in case that
continues to help review.

wking · 2016-10-15T05:39:44Z

I've pushed 1926150 → ee44a09 which:

Checks the result of the DirEngine.Put cleanup Remove, and returns a wrapped error if both the Put and the cleanup Remove fail.
Adds a comment describing the public DirDelete function.

wking · 2016-10-15T06:01:25Z

I've pushed ee44a09 → 4dfe06f fixing a few more lint issues with the
tip commit:

Cleanup Remove error wrapping in the cas DirEngine.Put (previously
I'd just fixed the refs DirEngine.Put).
Check the io.Copy error status in cas DirEngine.Put.
Don't mask err in cas TarEngine.Put now that it's using blobPath.

Making the CAS/refs Get implementations more DRY. Signed-off-by: W. Trevor King <wking@tremily.us>

A fair amount of image setup involves pushing JSON objects to CAS, so provide a convenient wrapper around that. This implementation could be improved, with at least: * Consistent key sorts, etc. to increase the chances of matching an existing CAS blob. * Streaming the marshaled JSON into the engine to avoid serializing it in memory before passing it into Engine.Put. But the API is fine, and we can improve the implementation as we go. Signed-off-by: W. Trevor King <wking@tremily.us>

Generated with: $ make install.tools $ make update-deps $ git checkout HEAD -- vendor/github.com/spf13/pflag $ git checkout HEAD -- vendor/github.com/runtime-spec I rolled back the other changes because I haven't checked for compatibility issues due to the upgrades. I also rolled back the hash bumps for those packages in glide.lock. Signed-off-by: W. Trevor King <wking@tremily.us>

As called for during today's meeting [1]. The wording is from Brandon [2]. [1]: http://ircbot.wl.linuxfoundation.org/meetings/opencontainers/2016/opencontainers.2016-07-27-21.00.log.html#l-54 [2]: https://github.com/opencontainers/image-spec/pull/159/files/9f8a18eb1cb2a1cbccd6fe137d743dccc2142901#r72525379 Signed-off-by: W. Trevor King <wking@tremily.us>

This is a bit awkward. For writing a tar entry, we need to know both the name and size of the file ahead of time. The implementation in this commit accomplishes that by reading the Put content into a buffer, hashing and sizing the buffer, and then calling WriteTarEntryByName to create the entry. With a filesystem-backed CAS engine, we could avoid the buffer by writing the file to a temporary location with rolling hash and size tracking and then renaming the temporary file to the appropriate path. WriteTarEntryByName itself has awkward buffering to avoid dropping anything onto disk. It reads through its current file and writes the new tar into a buffer, and then writes that buffer back back over its current file. There are a few issues with this: * It's a lot more work than you need if you're just appending a new entry to the end of the tarball. But writing the whole file into a buffer means we don't have to worry about the trailing blocks that mark the end of the tarball; that's all handled transparently for us by the Go implementation. And this implementation doesn't have to be performant (folks should not be using tarballs to back write-heavy engines). * It could leave you with a corrupted tarball if the caller dies mid-overwrite. Again, I expect folks will only ever write to a tarball when building a tarball for publishing. If the caller dies, you can just start over. Folks looking for a more reliable implementation should use a filesystem-backed engine. * It could leave you with dangling bytes at the end of the tarball. I couldn't find a Go invocation to truncate the file. Go does have an ftruncate(2) wrapper [1], but it doesn't seem to be exposed at the io.Reader/io.Writer/... level. So if you write a shorter file with the same name as the original, you may end up with some dangling bytes. cas.Engine.Put protects against excessive writes with a Get guard; after hashing the new data, Put trys to Get it from the tarball and only writes a new entry if it can't find an existing entry. This also protects the CAS engine from the dangling-bytes issue. The 0666 file modes and 0777 directory modes rely on the caller's umask to appropriately limit user/group/other permissions for the tarball itself and any content extracted to the filesystem from the tarball. The trailing slash manipulation (stripping before comparison and injecting before creation) is based on part of libarchive's description of old-style archives [2]: name Pathname, stored as a null-terminated string. Early tar implementations only stored regular files (including hardlinks to those files). One common early convention used a trailing "/" character to indicate a directory name, allowing directory permissions and owner information to be archived and restored. and POSIX ustar archives [3]: name, prefix ... The standard does not require a trailing / character on directory names, though most implementations still include this for compatibility reasons. [1]: https://golang.org/pkg/syscall/#Ftruncate [2]: https://github.com/libarchive/libarchive/wiki/ManPageTar5#old-style-archive-format [3]: https://github.com/libarchive/libarchive/wiki/ManPageTar5#posix-ustar-archives Signed-off-by: W. Trevor King <wking@tremily.us>

This is pretty straightforward with the new WriteTarEntryByName helper. I considered pulling the ref name -> path conversion (%s -> ./refs/%s) out into a helper function to stay DRY, but the logic is simple enough that it seemed more intuitive to leave it inline. Signed-off-by: W. Trevor King <wking@tremily.us>

The NewEngine commands for the tar-backed image-layout engines (both the CAS and refs engines) open files O_RDWR and expect image-layout compatible content in the tarball. That makes sense, but for folks who *don't* have such a tarball, a helper like CreateTarFile makes it easy to explicitly create an empty one. The 0666 file modes and 0777 directory modes rely on the caller's umask to appropriately limit user/group/other permissions for the tarball itself and any content extracted to the filesystem from the tarball. The trailing slashes are based on part of libarchive's description of old-style archives [1]: name Pathname, stored as a null-terminated string. Early tar implementations only stored regular files (including hardlinks to those files). One common early convention used a trailing "/" character to indicate a directory name, allowing directory permissions and owner information to be archived and restored. and POSIX ustar archives [2]: name, prefix ... The standard does not require a trailing / character on directory names, though most implementations still include this for compatibility reasons. Expose this new functionality on the command line as: $ oci-image-init image-layout PATH where 'image-layout' is a separate level in case we support initializing additional types of repositories in the future. [1]: https://github.com/libarchive/libarchive/wiki/ManPageTar5#old-style-archive-format [2]: https://github.com/libarchive/libarchive/wiki/ManPageTar5#posix-ustar-archives Signed-off-by: W. Trevor King <wking@tremily.us>

The validation/unpacking code doesn't really care what the reference and CAS implemenations are. And the new generic interfaces in image/refs and image/cas will scale better as we add new backends than the walker interface. This replaces the simpler interface from image/reader.go with something more robust. The old tar/directory distinction between image and imageLayout is gone. The new CAS/refs engines don't support directory backends yet (I plan on adding them once the engine framework lands), but the new framework will handle tar/directory/... detection inside layout.NewEngine (and possibly inside a new (cas|refs).NewEngine when we grow engine types that aren't based on image-layout). Also replace the old methods like: func (d *descriptor) validateContent(r io.Reader) error with functions like: validateContent(ctx context.Context, descriptor *specs.Descriptor, r io.Reader) error to avoid local types that duplicate the image-spec types. This saves an extra instantiation for folks who want to validate (or whatever) a specs.Descriptor they have obtained elsewhere. I'd prefer casLayout and refsLayout for the imported packages, but Stephen doesn't want camelCase for package names [1]. [1]: opencontainers/image-spec#159 (comment) Signed-off-by: W. Trevor King <wking@tremily.us>

Most of this is new boilerplate, but oci-image-tools.7.md is based on the old oci-image-tool.1.md removed by fe363aa (*: move to opencontainers/image-tools, 2016-09-15). There's a lot going on in this repo, and it's nice to have a page that outlines everything provided by the project even though we're no longer providing a single command. Signed-off-by: W. Trevor King <wking@tremily.us>

Don't worry about: $ make lint checking lint image/cas/layout/dir.go:37::warning: duplicate of image/refs/layout/dir.go:37-54 (dupl) image/refs/layout/dir.go:37::warning: duplicate of image/cas/layout/dir.go:37-54 (dupl) cmd/oci-cas/delete.go:41::warning: duplicate of cmd/oci-cas/get.go:43-62 (dupl) cmd/oci-refs/delete.go:41::warning: duplicate of cmd/oci-refs/get.go:42-61 (dupl) cmd/oci-cas/delete.go:15::warning: duplicate of cmd/oci-refs/delete.go:15-72 (dupl) cmd/oci-refs/delete.go:15::warning: duplicate of cmd/oci-cas/delete.go:15-72 (dupl) cmd/oci-cas/get.go:43::warning: duplicate of cmd/oci-refs/get.go:42-61 (dupl) cmd/oci-refs/get.go:42::warning: duplicate of cmd/oci-cas/get.go:43-62 (dupl) make: *** [lint] Error 1 The commands are all similar (open an engine, perform some method, print the result), but are short enough that extracting out helpers would be more trouble and indirection than it's worth. Oddly, dupl seems happy to print: "duplicate of oci-cas/get.go:..." and "duplicate of get.go:..." if I exclude: "duplicate of cmd/oci-cas/get.go:..." or "duplicate of .*oci-cas/get.go:..." I want to get "oci-cas" and "oci-refs" in the exclusion regular expression somewhere to avoid accidentally skipping dupl checks for other get.go and similar if they show up somewhere else in the repository, so I'm matching on the initial filename. Signed-off-by: W. Trevor King <wking@tremily.us>

These are much nicer than the tar engines (hooray atomic renames :), so switch the manifest tests tests back to using the directory-backed engines. I also switched the man-page examples over to directory-backed layouts, now that they're what oci-image-init generates by default. And I added command-line wrappers for the delete methods, now that we have a backend that implements it. I do with there was a paginated, callback-based directory lister we could use instead of ioutils.ReadDir. On the other hand, by the time directories get big enough for that to matter we may be sharding them anyway. Signed-off-by: W. Trevor King <wking@tremily.us>

stevvooe · 2016-11-18T20:11:24Z

@cyphar Sorry for the slow cycle, but it looks like that link is broken.

Shall we take this design into an issue and close this PR?

wking · 2016-11-18T21:49:20Z

On Fri, Nov 18, 2016 at 12:11:25PM -0800, Stephen Day wrote:

@cyphar Sorry for the slow cycle, but it looks like that link is broken.

He removed it in cyphar/umoci@8d19f36. The last version before that
removal is in 1.

Shall we take this design into an issue and close this PR?

I'm happy to discuss any design differences between what I have here
and the choices @cyphar has made in umoci. I don't think ditching all
of this is the best way to have that discussion ;). Last night saw
some progress on #40, maybe you want to chime in there on any issues
you see with the CAS engine interface or the tar-get implementation?

cyphar · 2016-11-19T02:43:44Z

@stevvooe I've removed the file because all of the stuff in the design document has since been implemented in umoci and the design document no longer accurately described how everything is implemented.

zhouhao3 · 2017-07-27T06:44:41Z

SGTM, What is the current state of this patch?

wking · 2017-07-27T14:33:12Z

SGTM, What is the current state of this patch?

Wait, is there really renewed interest in this over a year since I filed it as opencontainers/image-spec#159? If so, you should probably start with #40, which has just the CAS part of this PR. And getting a green light or concrete improvement suggestions from @stevvooe would be nice, since he's been pushing back against my proposed CAS interface, most recently here.

stevvooe · 2017-07-28T00:15:34Z

@q384566678 We worked hard on defining a useful content api in https://godoc.org/github.com/containerd/containerd/content. It handles all the cases of resumption, keeps memory usage low and handles pre and post known digest entries, as well as concurrency issues. We've also been able to use context to wrap it in transaction managers.

There may be a few tweaks we still need to make, but it is the culmination of work around the old docker/distribution, docker and containerd, as well as some research on distributed content addressable storage systems.

I'd advocate for closing this PR.

Let me know if you have questions.

wking · 2017-07-28T05:39:00Z

We worked hard on defining a useful content api in https://godoc.org/github.com/containerd/containerd/content.

Comparing that API's Provider.Reader:

type Provider interface {
  Reader(ctx context.Context, dgst digest.Digest) (io.ReadCloser, error)
  ReaderAt(ctx context.Context, dgst digest.Digest) (io.ReaderAt, error)
}

with my Engine.Get:

type Engine interface {
  Get(ctx context.Context, digest digest.Digest) (reader io.ReadCloser, err error)
}

You can see tht we have exactly the same API. And it is also shared by umoci, where it is GetBlob. So I don't see any problem with the part of this PR's API that I've covered in #40.

zhouhao3 · 2017-10-25T06:50:21Z

@wking @stevvooe @xiekeyang @cyphar
Sorry I forgot to reply. So the main question right now is the API question?
I've seen similar functionality in umoci. And these features are important for image-tools propulsion, and I wonder if we can discuss and solve this?

wking · 2017-10-25T15:07:58Z

In case folks are interested, I have more recent work on this front in the stand-alone casengine repo, and there's a good ref-engine reader in oci-discovery (although there is currently no ref-engine writer there). @xiekeyang and I would be happy to get feedback on those and, eventually if feedback is positive, transfer those to the OCI.

wking force-pushed the cas-api branch from d6f9842 to 2bea980 Compare September 16, 2016 09:00

wking mentioned this pull request Sep 16, 2016

add unit test for ValidateLayout function #4

Merged

wking mentioned this pull request Sep 16, 2016

image: Refactor to use cas/ref engines instead of walkers opencontainers/image-spec#159

Closed

wking force-pushed the cas-api branch 5 times, most recently from 9687484 to 90c4fea Compare September 17, 2016 06:09

wking force-pushed the cas-api branch from 90c4fea to 7c9efa7 Compare September 17, 2016 06:47

wking mentioned this pull request Sep 22, 2016

Makefile: Add a pattern rule for oci-* commands #28

Merged

xiekeyang mentioned this pull request Sep 22, 2016

descriptor validation #29

Merged

wking mentioned this pull request Sep 30, 2016

validate media type of manifest descendants opencontainers/image-spec#341

Merged

wking force-pushed the cas-api branch from 7c9efa7 to 1dbc99c Compare October 11, 2016 19:23

wking mentioned this pull request Oct 11, 2016

image/cas: Add a generic CAS interface #40

Open

wking force-pushed the cas-api branch from 1dbc99c to 1926150 Compare October 11, 2016 19:43

wking force-pushed the cas-api branch from 1926150 to 180dda0 Compare October 11, 2016 20:01

wking mentioned this pull request Oct 13, 2016

Add create-layer sub command to create filesystem changeset #8

Open

wking force-pushed the cas-api branch from 180dda0 to ee44a09 Compare October 15, 2016 05:36

wking force-pushed the cas-api branch 2 times, most recently from cb0b850 to 4dfe06f Compare October 15, 2016 05:51

This was referenced Oct 19, 2016

Add config validation #39

Closed

Build story: producing an OCI image opencontainers/image-spec#396

Closed

wking added 11 commits November 18, 2016 07:43

image/layout/tar.go: Add TarEntryByName

9263d6e

Making the CAS/refs Get implementations more DRY. Signed-off-by: W. Trevor King <wking@tremily.us>

wking force-pushed the cas-api branch from 512176d to ca90284 Compare November 18, 2016 16:02

wking mentioned this pull request Dec 1, 2016

ReadMe should include description of how to install #91

Closed

This was referenced Dec 22, 2016

Increase the generate function #101

Open

combinational image opencontainers/image-spec#508

Closed

wking mentioned this pull request Jan 20, 2017

image: Port to go-digest #104

Merged

This was referenced Feb 16, 2017

Bump version for v0.1.0 #116

Merged

independently unpacking layer functions #11

Closed

wking mentioned this pull request Feb 16, 2017

image: Consistent reader usage #117

Closed

wking mentioned this pull request Aug 30, 2017

doc.go: Perfect description #177

Merged

wking mentioned this pull request Nov 3, 2017

oci/cas/dir: Load blob URI from oci-layout opencontainers/umoci#214

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

image: Refactor to use cas/ref engines instead of walkers #5

image: Refactor to use cas/ref engines instead of walkers #5

wking commented Sep 16, 2016

runcom commented Sep 16, 2016

wking commented Sep 17, 2016

wking commented Sep 17, 2016

xiekeyang commented Sep 22, 2016 •

edited

Loading

wking commented Sep 22, 2016

wking commented Oct 11, 2016

wking commented Oct 15, 2016

wking commented Oct 15, 2016

stevvooe commented Nov 18, 2016

wking commented Nov 18, 2016

cyphar commented Nov 19, 2016

zhouhao3 commented Jul 27, 2017

wking commented Jul 27, 2017

stevvooe commented Jul 28, 2017

wking commented Jul 28, 2017

zhouhao3 commented Oct 25, 2017

wking commented Oct 25, 2017

image: Refactor to use cas/ref engines instead of walkers #5

Are you sure you want to change the base?

image: Refactor to use cas/ref engines instead of walkers #5

Conversation

wking commented Sep 16, 2016

runcom commented Sep 16, 2016

wking commented Sep 17, 2016

wking commented Sep 17, 2016

xiekeyang commented Sep 22, 2016 • edited Loading

wking commented Sep 22, 2016

wking commented Oct 11, 2016

wking commented Oct 15, 2016

wking commented Oct 15, 2016

stevvooe commented Nov 18, 2016

wking commented Nov 18, 2016

cyphar commented Nov 19, 2016

zhouhao3 commented Jul 27, 2017

wking commented Jul 27, 2017

stevvooe commented Jul 28, 2017

wking commented Jul 28, 2017

zhouhao3 commented Oct 25, 2017

wking commented Oct 25, 2017

xiekeyang commented Sep 22, 2016 •

edited

Loading