-
Notifications
You must be signed in to change notification settings - Fork 367
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
docker: add docker-archive transport #148
docker: add docker-archive transport #148
Conversation
Note: Currently
To be honest, I'm not really sure where This does appear to work with
|
Scratch that, it's because |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
docker/archive/archive_dest.go
Outdated
// Close removes resources associated with an initialized ImageDestination, if any. | ||
func (d *archiveImageDestination) Close() { | ||
d.writer.Close() | ||
// FIXME: We don't clean up here if we errored out... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
? It should be the responsibility of the ImageDestination
user to always call Close
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking something about cleaning up temporary files we created (we create the tar file even if we fail to actually make a valid image). So if we return an error at some point we have a half-built archive.
docker/archive/archive_transport.go
Outdated
} | ||
|
||
// archiveReference is an ImageReference for Docker images stored on the local filesystem. | ||
type archiveReference string // FIXME FIXME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, the docker-daemon:
ImageReference
implementation is unfinished, and not a good model ATM. We will be fixing that soonish ( #143 ).
If the semantics of the docker-archive:
references is that they are simply paths to tarballs, probably see what the dir:
transport does for ValidatePolicyConfigurationScope
, StringWithinTransport
, PolicyConfiguration{Identity,Namespaces
} etc.
Note that because these things interact with signature validation, the _transport.go
files really should have tests covering every single line if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add some tests once I've ported docker-daemon
over to this refactored interface.
docker/archive/archive_transport.go
Outdated
// NewImage returns a types.Image for this reference. | ||
// The caller must call .Close() on the returned Image. | ||
func (ref archiveReference) NewImage(ctx *types.SystemContext) (types.Image, error) { | ||
panic("FIXME FIXME") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably
src := newImageSource(ctx, ref)
return image.FromSource(src)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe. I just copied this from docker-daemon
.
@mtrmac I've done a lot of cleanups and implemented quite a few things. Here's the list, maybe you can take another look (specifically to tell me what I need to implement for
PTAL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
I’m afraid I have only made one pass through the code, so this is really not comprehensive. The way this PR mixes moving code around and various renames and refactorings makes it very non-obvious to determine how the original and the changed code differs, and after the ~2 hours I have spent on this today I need to stop. Do you by any chance have a working branch where the original effort was done in smaller commits?
I guess the two big items I have noticed are
-
The seek+jump code is IMHO too fragile to be worth it (@runcom ?)
-
AFAIK the silently symlink support is necessary
docker/archive/archive_dest.go
Outdated
func (d *archiveImageDestination) SupportedManifestMIMETypes() []string { | ||
// XXX: Should this be exposed in docker/archive? | ||
return []string{ | ||
manifest.DockerV2Schema2MediaType, // FIXME: Handle others. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These comments are obsolete now, see https://github.com/mtrmac/image/tree/docker-daemon .
But overall it might be cleanest to keep a single implementation of these methods for both docker-archive:
and docker-daemon:
, perhaps by embedding the shared tar handler as an anonymous struct member.
docker/archive/archive_src.go
Outdated
return nil, "", err | ||
} | ||
|
||
// FIXME: Do we want to suppport manifest lists? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How? AFAICS the tar format can not do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it does support manifest lists. By construction the format is a manifest list, but it only contains one entry (there's some support in the daemon IIRC because people have created multi-arch images).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the other comment on manifest lists; a set of manifests != application/vnd.docker.distribution.manifest.list.v2+json
.
docker/archive/archive_src.go
Outdated
// XXX: We aren't caching this because GetManifest() specifically states | ||
// that it can use a slow service. Caching makes this uglier (and also | ||
// makes implementing read-write images harder). | ||
m, err := s.img.GenerateManifest() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caching is useful not just for its performance; it by construction guarantees that the manifest won’t change just because some variable somewhere was mistakenly updated, or because serializing JSON may use a random order depending on heap state, or for any other reason.
(Read-write images? The linear tar format seems especially unsuitable for such functionality.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, I'll cache the output of this.
docker/archive/archive_transport.go
Outdated
if cleaned != scope { | ||
return fmt.Errorf("Invalid scope %s: Uses non-canonical format, perhaps try %s", scope, cleaned) | ||
} | ||
// XXX: Is this enough? Should we also make sure scope isn't a directory? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A scope which is a directory is very much intended: policy can be either defined for an individual image path (/home/mitr/my-precious.tar
), or for a whole directory subtree (/mnt/nfs/our-company/trusted-images
).
docker/archive/archive_transport.go
Outdated
// WARNING: Do not use the return value in the UI to describe an image, it does not contain the Transport().Name() prefix; | ||
// instead, see transports.ImageName(). | ||
func (ref archiveReference) StringWithinTransport() string { | ||
// FIXME: Maybe we should be using explicitfilepath.ResolvePathToFullyExplicit? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That transformation should really happen already in NewReference
; types.ImageReference
are supposed to be immutable/have immutable semantics, and ResolvePathToFullyExplicit
depends on filesystem contents.
docker/archive/image.go
Outdated
// FIXME: This caching is ugly. The only reason we have it here is because | ||
// the Docker image format isn't a CAS (you need to store a lookup | ||
// table for diffID -> layerPath). | ||
cache *tarCache |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
… and then tarCache
would not be a separate object but just some members of the TarSource
.
docker/archive/tar.go
Outdated
|
||
// Reset everything. | ||
tc.offsets = map[string]int64{} | ||
if _, err := tc.file.Seek(0, os.SEEK_SET); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering that prime()
is called exactly once, immediately after creating the object, it would be simpler to call prime()
first, and only then return a new object with the results of prime()
; then the questions of “resetting” and “prior state” disappear.
docker/archive/tar.go
Outdated
} | ||
|
||
// prime regenerates all of the offset information in the tarCache. | ||
func (tc *tarCache) prime() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, we can afford the electrons for descriptive names. collectFileOffsets
, createFileOffsetMap
or anything else would work just as well and be more explicit at the call site.
docker/archive/tar.go
Outdated
if (cur+hdr.Size)%512 > 0 { | ||
lastEntry += tarBlockSize - (lastEntry % tarBlockSize) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, no.
Overall, it is a nice idea to do one scan and then only seek, and of course it is asymptotically correct and maximally efficient, but when the archive/tar.Reader
does not provide a suitable interface, reaching behind its back to see the offset of the underlying file and then adjusting it due to a knowledge of its internal implementation is way too fragile.
Meanwhile, the OS is perfectly capable of keeping a few kilobytes of data in memory, so the real cost of rescanning is primarily the syscall overhead, and we expect, well, half a dozen files for manifests or so, and one file per layer, about a dozen in total.
Without a stable maintainable interface to rely on, and with the risk of having to tinker with this every time archive/tar
changes something about its buffering, this is just not worth the headache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's disappointing. I'll switch it instead to doing a seek to the start then. 😢
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could actually implement this without "reaching behind tar.Reader
's back". Rather than re-parsing the metadata you can just store the offset when the data portion starts (which is cur
in the code you've quoted) and then also store the header information in the map (so you can just return the header and tar.Reader
won't be reparsing it).
I mean, you'd have to implement tar.Reader
's ReadCloser
functionality but it shouldn't be that bad... I'll float this in a separate patch on top of your PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That’s still relying on undocumented implementation properties: There is no API promise that after tar.Reader.Next()
returns, the underlying stream is positioned at the start of the content of the file; e.g. nothing prohibits tar.Reader
from buffering input (e.g. reading it one st_blksize
at a time instead of one tar block at a time).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sigh, yeah you're right. Okay, I'll just stick with using Seek(0, SEEK_SET)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(It does seem to me that there could be other users who would benefit from archive/tar.Reader
adding support for this, either an explicit one like SaveStreamPosition/SeekToSavedPosition
or a documented API promise that the underlying file position matches at specific points in time. But that would need to happen in golang upstream.)
docker/archive/tar.go
Outdated
} | ||
return nil, nil, err | ||
} | ||
return hdr, tr, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(At least validate that hdr.Name
matches path
? Anyway, moot.)
@cyphar How about https://github.com/mtrmac/image/tree/split-tarfile for a starting point? Yes, it is also a big diff, but it obviously makes only trivial changes to the
are also very readable to show that this is only moving code around without semantic changes. Any refactorings and improvements can then be added as separate commits, with confidence. And hooking up the (Disclaimer: confidence or not, completely untested and for all I know may be completely broken. Only to show a possible path forward.) |
Yeah, that could work. The only issue with that patchset is that it mixes some policy decisions that are specific to I can also switch this back to being a purely "implementation" patchset and then we can do all of my fixing / improvements later. |
I’m sorry, I haven’t noticed that
It hadn’t existed; I created that as a proof-of-concept after writing the review, “only to show a possible path forward”.
Both a single PR and separate implementation/improvements PRs are perfectly fine with me; but, please, as much as possible, on idea per commit. A 10-commit PR is perfectly fine, one-liner commits are perfectly fine, a sequence of 3 commits doing 3 improvements on the same 10 lines is also fine. f4529ce , as a single idea, also works (though separating the interface changes and the implementation improvement would be even better). But comparing the removed code in 9404cfa with the fairly different implementation in `docker/archive/tar.go is difficult, and as you see I have missed some of the intentional changes you have made in that. |
@mtrmac Sorry for not working on this recently (lots of fun exams). But I'm going to work on this next week (and over the weekend), basing it off your |
FWIW I have updated https://github.com/mtrmac/image/tree/split-tarfile to apply on top of #165 ; it is still completely untested. |
@mtrmac I've rebased on top of your changes. It does work but there's probably more stuff you want me to implement (especially with regards to the scoping code that I'm not sure I understand the interface for). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty close.
The policy scoping based on otherwise completely ignored Docker reference, and making that mandatory, is probably the biggest issue. One possible way to solve this would be to make the reference optional: Then, tarfileImageSource
could just be used by specifying a path; and tarfileImageDestination
would either fail when a Docker reference is missing, or it could possibly create a tarfile which specifies no RepoTags
. (This would be a bit difficult to use with Docker, but it would be perfectly fine as an input to docker-archive:
again.)
WRT policy scoping, I’d suggest basing it on paths and subdirectories, exactly the way dirTransport
does it (in ValidatePCS
, accept absolute no-symlink paths; in NewReference
, resolve to an absolute no-symlink path, etc.)
Or, if you really don’t want to care, I it would be OK to punt and do what daemonTransport
does: the identity is ""
, no other scopes are permitted. This would still make it implement a more expressive scope design in the future, without limiting it in any way, so I guess there’s little point in being a pain about it.
Ultimately, due to the impact on signatures, the new archive+transport.go
should have tests with as close to 100% coverage as possible; that can of course wait after the design has been fleshed out, and can probably be cobbled together by copy&pasting tests from the most similar existing transport.
docker/tarfile/archive_transport.go
Outdated
"github.com/containers/image/types" | ||
) | ||
|
||
// Transport is an ImageTransport for Docker registry-hosted images. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is no longer precise,
docker/tarfile/archive_transport.go
Outdated
type tarfileReference struct { | ||
ref reference.NamedTagged | ||
path string | ||
tag string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If tag
is always .ref.String()
, it would probably be simpler to drop the duplicate tag
member.
docker/tarfile/archive_transport.go
Outdated
return nil, err | ||
} | ||
ref = reference.WithDefaultTag(ref) | ||
refTagged := ref.(reference.NamedTagged) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ref can also be reference.Canonical
, in which case this panics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't .WithDefaultTag
force the type to be .NamedTagged
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at https://github.com/containers/image/blob/master/docker/reference/reference.go#L141 , no: a reference.Canonical
is IsNameOnly
and is returned unchanged.
docker/daemon/daemon_dest.go
Outdated
// a hostname-qualified reference. | ||
// See https://github.com/containers/image/issues/72 for a more detailed | ||
// analysis and explanation. | ||
refString := fmt.Sprintf("%s:%s", namedTaggedRef.FullName(), namedTaggedRef.Tag()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this moved into the docker-daemon:
-specific part? AFAICT exactly the same compatibility concerns apply to docker-archive:
as well.
The commit message says “make tagging work properly”, so presumably the original is broken in some way, but I am not sure how.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is wrong for docker-archive:
transports, because the filename of a .tar
archive is not related to what RepoTags
you're going to export. There might be a way around it, but for now this code doesn't really belong in tarfile.Transport
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but refString
is not the file name. That is set by giving a tarfile.Destination
an io.Writer
, in our case from fh, err := os.OpenFile(ref.path, …
.
tarfile.Destination.repoTag
is only used to set the RepoTags
field inside the manifest in the exported tar file (which Docker uses to decide what tags, if any, to use for the docker load
-ed tarball). This has, indeed, no relationship with the archive filename, so the filename:
_referenceformat makes sense for specifying
RepoTag; but, _independently of that_, the formatting using
FullNameis useful to make the projectatomic/docker (i.e. Fedora, CentOS, RHEL) versions update tag consistently with
docker pull`.
Now, in the case of docker-archive:
it is not as obvious that the tag behavior should be consistent with docker pull
, the way it is fairly clear for docker-daemon:
, but, on balance, I still think it might be more useful to create tarballs which use hostname-qualified names (using the projectatomic/docker terminology). OTOH there is also a fairly strong argument that when the qualified/unqualified tags have different semantics, docker-archive:
should just let the user use either of them without interfering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now in docker/tarfile
.
docker/tarfile/archive_transport.go
Outdated
return tarfileReference{ | ||
ref: refTagged, | ||
path: path, | ||
tag: fmt.Sprintf("%s:%s", refTagged.Name(), refTagged.Tag()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to require a tag/Docker reference even when specifying a source, which completely ignores it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
… and even more, does it make sense to base a PolicyConfigurationIdentity
on this ignored value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently we only support a single image inside a docker-archive
, but there's no reason this can't be expanded upon in later versions. The tag
references what image from that .tar
you want to stream from.
If you still think it doesn't make sense we can drop it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, would it make sense to enforce that the tag must match? (I realize that it would be an extra code which is not adding new functionality right now, but it would also make sure that nobody writes incorrect shell scripts which would later be broken by adding enforcement.)
[Returning to this too late, I apologize if this no longer makes sense given other context.]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made it so that only destinations have tags. We can expand this later.
docker/tarfile/archive_transport.go
Outdated
} | ||
|
||
// tagOrDigest returns a tag or digest from the reference. | ||
func (ref tarfileReference) tagOrDigest() (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is unused.
@@ -0,0 +1,3 @@ | |||
// Package tarfile is an internal implementation detail of some transports. | |||
// Do not use outside of the github.com/containers/image repo! | |||
package tarfile |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
… so, I'd suggest moving the docker-archive:
transport into a separate containers/image/docker/archive
package.
(docker:
, i.e. docker/docker_image_*
, should also have been a separate subpackage, and eventually we’ll fix that).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough.
transports/transports.go
Outdated
@@ -22,6 +23,7 @@ func init() { | |||
for _, t := range []types.ImageTransport{ | |||
directory.Transport, | |||
docker.Transport, | |||
tarfile.Transport, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a smoke test for the naming to transports_test.go
. Not strictly necessary.
@cyphar how is this working with Docker archives that contains multiple images? I can't find anything, could you point me to that? Cause Docker archives can contain multiple images specified in the needs a rebase though |
It has inherited https://github.com/containers/image/pull/148/files#diff-e5badaf0bc3c9d10e866102bfd533457L236 / https://github.com/containers/image/pull/148/files#diff-7af67cd96906421aeee11932f166d1f2R180 from the (Because the reference format supports a ref:tag for destinations, sources could use this to choose a single image from a multi-image tarball. But that can definitely be a separate PR, if anyone needs that functionality at all; just insisting on a single-image tarball works perfectly fine.) |
fair enoiugh, worth creating an issue for multi images tarballs though as I think people will come to us asking for that |
/me is rebasing. Lots of changes to |
Alright. Tests aren't failing anymore so I can now go about implementing @mtrmac's requests. |
@mtrmac I've covered all of your comments. The tagging has been fixed, so now PTAL. I will squash all of the "squashme" commits as soon as you're happy with the state. |
Issues with OCI -> |
Verify blobs
This is currently blocked on #193. |
Actually this isn't necessarily blocked on #193. This can be merged now and then the #193 fixes can be applied later. @mtrmac Can we please merge this. I'm getting very tired of having to rebase this on top of minor changes in |
ACK, #193 can be done separately. And yeah, I’ve been maintaining the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, one more rebase, please, due to #221 . You might be able to use the rebased https://github.com/mtrmac/image/tree/split-tarfile .
I will write the necessary _transport.go
tests now…
docker/archive/transport.go
Outdated
|
||
// A :tag was specified, which is only necessary for destinations. | ||
if len(parts) == 2 { | ||
ref, err := reference.ParseNamed(parts[1]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will as of #221 need to be reference.ParseNormalizedNamed
.
docker/archive/transport.go
Outdated
if !isTagged { | ||
// Really shouldn't be hit... | ||
return nil, errors.Errorf("docker-archive reference %s has ':' but reference isn't tagged", refString) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this really shouldn’t be hit, because of the WithDefaultTag
and because we now know !isDigest
. The :
in refString
does not matter because we were parsing only parts[1]
.
It’s fine to say internal error: reference is not tagged even after reference.WithDefaultTag
or something like that.
docker/archive/transport.go
Outdated
// It is STRONGLY recommended for the first element, if any, to be a prefix of PolicyConfigurationIdentity(), | ||
// and each following element to be a prefix of the element preceding it. | ||
func (ref archiveReference) PolicyConfigurationNamespaces() []string { | ||
return policyconfiguration.DockerReferenceNamespaces(ref.destinationRef) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also needs a check for ref.destinationRef == nil
.
(Alternatively, perhaps ignore destinationRef
in here as well, and return ""
in …Identity
and []string{}
in …Namespaces
, at least for now; the policy is, at least currently, only used on sources, not destinations.)
docker/archive/transport.go
Outdated
func (ref archiveReference) NewImageDestination(ctx *types.SystemContext) (types.ImageDestination, error) { | ||
if ref.destinationRef == nil { | ||
return nil, errors.Errorf("docker-archive: destination reference not supplied (must be of form <path>:<reference:tag>)") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check could be moved into newImageDestination
, so that the check and the code which relies on that check is in the same file.
transports/transports_test.go
Outdated
@@ -34,6 +34,8 @@ func TestImageNameHandling(t *testing.T) { | |||
{"docker", "//busybox:notlatest", "//busybox:notlatest"}, // This also tests handling of multiple ":" characters | |||
{"docker-daemon", "sha256:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef", "sha256:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"}, | |||
{"docker-daemon", "busybox:latest", "busybox:latest"}, | |||
{"docker-archive", "/var/lib/oci/busybox.tar:busybox:latest", "/var/lib/oci/busybox.tar:busybox:latest"}, | |||
{"docker-archive", "busybox.tar:busybox:latest", "busybox.tar:busybox:latest"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I expect these test cases to fail now; either update them, or perhaps use FamiliarString
in the StringWithinTransport
method.)
docker/archive/dest.go
Outdated
if err := d.writer.Close(); err != nil { | ||
return err | ||
} | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(The above 4 lines can be simplified to return d.writer.Close()
but this is perfectly fine as well.)
docker/archive/dest.go
Outdated
func newImageDestination(ctx *types.SystemContext, ref archiveReference) (types.ImageDestination, error) { | ||
fh, err := os.OpenFile(ref.path, os.O_WRONLY|os.O_EXCL|os.O_CREATE, 0644) | ||
if err != nil { | ||
// FIXME: It should be possible to modify archiges, but the only really |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/archiges/archives/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As promised, tests for transport.go
are available as (the last commit in) https://github.com/mtrmac/image/tree/docker-archive-tests . Do note the FIXMEs.
Below pointing out a few things the tests have uncovered.
docker/archive/transport.go
Outdated
src, err := newImageSource(ctx, ref) | ||
if err != nil { | ||
return nil, err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error
value returned by newImageSource
is always nil
; please drop that return value, and then we can drop these three untestable lines. (We can always re-add them if it became necessary in the future.)
docker/archive/transport.go
Outdated
|
||
// DeleteImage deletes the named image from the registry, if supported. | ||
func (ref archiveReference) DeleteImage(ctx *types.SystemContext) error { | ||
return os.Remove(ref.path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this supposed to interact with the destinationRef
? It seems that it would make sense to remove only the image indicated by destinationRef
, not the whole multi-image tarball. I suppose we can’t easily do that now, which is fine—but then it might be better to just return an error in here instead of doing something which will become dangerous in the future.
docker/archive/transport.go
Outdated
_, isTagged := ref.(reference.NamedTagged) | ||
_, isDigest := ref.(reference.Canonical) | ||
|
||
if isDigest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(These tests can be simplified to if _, ok := ref.(reference.Canonical); ok {
… and the like, eliminating the two variables. But this works as well.)
docker/archive/transport.go
Outdated
// ParseReference converts a string, which should not start with the ImageTransport.Name prefix, into an Docker ImageReference. | ||
func ParseReference(refString string) (types.ImageReference, error) { | ||
parts := strings.SplitN(refString, ":", 2) | ||
if len(parts) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is never true: parts
will always contain at least an empty string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only minor comments now (most importantly it should be buildable :) )
Updated https://github.com/mtrmac/image/tree/docker-archive-tests , please add the last commit to this PR.
docker/tarfile/dest.go
Outdated
// SupportsSignatures returns an error (to be displayed to the user) if the destination certainly can't store signatures. | ||
// Note: It is still possible for PutSignatures to fail if SupportsSignatures returns nil. | ||
func (d *Destination) SupportsSignatures() error { | ||
return errors.Errorf("Storing signatures for docker tarfile: destinations is not supported") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The transport:
form in error messages usually refers to the transports.ParseImageName
syntax. Please s/tarfile: destinations/tar files/
, and similarly update the other tarfile:
error m essages.
docker/archive/transport.go
Outdated
// DeleteImage deletes the named image from the registry, if supported. | ||
func (ref archiveReference) DeleteImage(ctx *types.SystemContext) error { | ||
// Not really supported, for safety reasons. | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then this should fail instead of pretending to succeed. Something like return errors.New("Deleting images not implemented for docker-archive: images")
docker/archive/dest.go
Outdated
if err := d.Destination.Commit(); err != nil { | ||
return err | ||
} | ||
return d.writer.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling io.Closer.Close()
twice (once in Commit
and once in Close
) is undefined in general, and for an os.File
it seems the second call would reliably fail. That’s fine for now, but if archiveImageDestination
starts returning an error
, this will need more logic to avoid a double close (~ the daemon_dest.go
committed
flag).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want, I can remove .Close
from .Commit
. IMO it's a bit silly for us to protect against a user of this library from doing both .Close
and .Commit
-- if they aren't meant to be mutually exclusive operations then they shouldn't both try to do the same thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(The .Commit()
and .Close()
aren’t meant to be mutually exclusive. Close()
destroys temporary state like file handles and temporary files; Commit()
tells the destination “now you have everything, you can mark the image as finished / start transmitting the result as a single unit”.
See how daemonImageDestination
terminates its helper goroutine in Close()
(to ensure that it never lingers around), but correctly closes the generated tar file (making it acceptable to dockerd
so that the input isn’t all thrown away) in Commit()
.
The expected pattern for a writer is to do
dest, … := someRef.NewImageDestination()
defer dest.Close() // Or perhaps a more thorough variant with error propagation
// steps which may fail, e.g. reading input
… dest.PutBlob(…)
// other steps which may fail, including other writes to dest
// we get here only if everything has succeeded so far
err = dest.Commit()
i.e. the caller can do a blind defer dest.Close()
. If the two were expected to be exclusive, the calller would have to do something like
succeeded = false
dest, … : someRef.NewImageDestination()
defer func() {
if succeeded { dest.Commit() } else { dest.Close() }
}
// steps which may fail, e.g. reading input
… dest.PutBlob(…)
// other steps which may fail, including other writes to dest
// we get here only if everything has succeeded so far
succeeded = true
which could make the ImageDestination
implementations simpler, but it would be notably more cumbersome for the caller. Especially because so many ImageDestination
have no state to speak of and their .Close()
is empty, it seems overall simpler to force the ImageDestination
implementations—only those who care—to track success/failure, like daemonImageDestination.committed
does, than to force all writers to track this.
Perhaps I should make the comments in types.go
documenting ImageDestination
and ImageDestination.Close
more explicit about this?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed the d.writer.Close
from Commit
in docker/archive/dest
because d.Destination.Commit
will "save" the contents of the tar-file
. But the actual closing of the file will happen in d.writer.Close
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we need to change this later, I can make a PR for it. But I don't think the way it's done right now is going to output truncated tar
files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICT the current way (tar.Close()
in Commit()
, writer.Close()
in Close()
) is fine.
docker/archive/transport.go
Outdated
// FIXME? We could be verifying the various character set and length restrictions | ||
// from docker/distribution/reference.regexp.go, but other than that there | ||
// are few semantically invalid strings. | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there are no namespaces defined, this should fail the way daemonTransport.ValidatePolicyConfigurationScope
does.
docker/archive/transport.go
Outdated
// ParseReference converts a string, which should not start with the ImageTransport.Name prefix, into an Docker ImageReference. | ||
func ParseReference(refString string) (types.ImageReference, error) { | ||
parts := strings.SplitN(refString, ":", 2) | ||
if refString == "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(It’s not obvious to me that we need this check, an empty path will fail sooner or later, but if it exists, it seems cleaner to place it before the computation of parts
. This does work though.)
docker/archive/transport.go
Outdated
return nil, errors.Errorf("internal error: reference is not tagged even after reference.TagNameOnly: %s", refString) | ||
} | ||
|
||
destinationRef = ref.(reference.NamedTagged) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for not noticing ealier, please do
refTagged, isTagged := ref.(reference.NamedTagged)
if !isTagged { /* fail */ }
destinationRef = refTagged
to make it absolutely that the conversion is valid; this way has the cast done two times and it may get out of sync.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand how a cast can get out of sync (variable types don't change randomly). But sure, I'll change this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(This could get out of sync if the type check were modified for some reason without noticing that the destinationRef = ref.(reference.NamedTagged)
cast depends on it. I agree the function is small enough that it was not terribly likely that the two pieces of code would get that far apart.
But when we can have the compiler do a 100%-reliable verification cheaply enough, it seems a waste not to take advantage of it.)
docker/archive/transport.go
Outdated
|
||
import ( | ||
"fmt" | ||
"os" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docker/archive/transport.go:5: imported and not used: "os"
Ping @cyphar |
Rebased and fixed most of @mtrmac's comments. I'll squash it all once it gets LGTM'd. |
@mtrmac could you take another look at this? |
tarfile.Source partially implements ImageSource reading from any regular file, tarfile.Destination partially implements ImageDestination writing to any stream. The interactions with the Docker engine API remain in the docker-daemon: transport. Does not change behavior, apart from a few error messages now generically talking about “docker tar files” instead of docker-daemon: . [@cyphar: Rebased and re-did the splitting over several rebases.] Signed-off-by: Miloslav Trmač <mitr@redhat.com> Signed-off-by: Aleksa Sarai <asarai@suse.de>
Rebased (again). |
} | ||
|
||
if inputInfo.Size == -1 { // Ouch, we need to stream the blob into a temporary file just to determine the size. | ||
logrus.Debugf("docker tarfile: input with unknown size, streaming to disk first ...") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/tarfile:/tar file/
, but in a debug message is absolutely NOT blocking.
docker/archive/transport.go
Outdated
return nil, errors.Errorf("internal error: reference is not tagged even after reference.TagNameOnly: %s", refString) | ||
} | ||
|
||
destinationRef = ref.(reference.NamedTagged) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(This could get out of sync if the type check were modified for some reason without noticing that the destinationRef = ref.(reference.NamedTagged)
cast depends on it. I agree the function is small enough that it was not terribly likely that the two pieces of code would get that far apart.
But when we can have the compiler do a 100%-reliable verification cheaply enough, it seems a waste not to take advantage of it.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK, thanks!
Please cherry-pick the last commit from (now updated) https://github.com/mtrmac/image/tree/docker-archive-tests , squash as needed (but please keep the “split tarfile code from docker-daemon code” part a separate commit) and I’ll merge.
// not required/guaranteed that it will be a valid input to Transport().ParseReference(). | ||
// Returns "" if configuration identities for these references are not supported. | ||
func (ref archiveReference) PolicyConfigurationIdentity() string { | ||
// Punt, the justification is similar to dockerReference.PolicyConfigurationIdentity. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/dockerReference/daemonReference/
I think.
This archive allows you to export a docker-load-friendly archive without needing a Docker daemon to facilitate the exporting. Effectively this is just an on-disk version of the docker-daemon transport (they both use docker/tarfile). Signed-off-by: Aleksa Sarai <asarai@suse.de>
Signed-off-by: Miloslav Trmač <mitr@redhat.com> Signed-off-by: Aleksa Sarai <asarai@suse.de>
Squashed + rebased. I can make a follow-up PR to clean up the non-blocking concerns you had. |
🎉 Now to add the patches to |
This transport is effectively just a copy of the docker-daemon://
transport but replacing all of the Docker client code with simple
filesystem code. It currently works as well as docker-daemon:// (from my
testing).
Further refactoring is necessary, this was just a first pass.
Implements #140.
Signed-off-by: Aleksa Sarai asarai@suse.com