NG: access image via immutable identifier #804
Comments
Thanks for the writeup, @ncdc. What you ask for (immutable "references" to tags) equals "complete history of every tag version" in the new lingo. Indeed, this is a casualty of the new design: by making clear the difference between an image (a list of layers, name and signature) and a layer (a binary blob), tags are no longer aliases of immutable layer ids. I'm not sure I have a good solution for you yet, so let's keep this open and give it some thought. |
@ncdc With V2, it's very likely we'll have changes to the way things must work. While we should be avoiding serious changes, I hope your team can be somewhat open to this in the hopes that we'll get some benefits. That said, it seems like we need some sort of content-addressable digest of V2 image manifests to make this work. There are two levels that might be interesting to applications:
Number 1 would provide addressability of "identical" images, whereas number 2 would provide addressability over specific builds. The main issue with this is that it makes registry garbage collection nearly impossible, because all layers will technically be referenced by all manifests. Please note that this is just a thought exercise. |
On Tue, Dec 02, 2014 at 10:56:43AM -0800, Stephen Day wrote:
I think you mean v1Compatibility, but other than that this is just the
You don't want to add ‘name’ and ‘tag’ back here? And can you explain
Can you elaborate on this? Manifests should only reference the layers |
If pruning is pluggable, you could have a default implementation that prunes layers and manifests as soon as they become stale (i.e. no longer actively referenced by a tag). This would require thin tags I would imagine. Other pluggable implementations could defer to an external system (e.g. OpenShift) for quota enforcement and pruning decisions, or support simple caps such as keeping at most n revisions of a manifest for namespace/repo:tag. |
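A minimal sketch of what pluggable pruning could look like, assuming each tag keeps an ordered list of manifest revisions (oldest first). The names `PrunePolicy`, `PruneImmediately` and `KeepLastN` are hypothetical and do not come from any registry code; they only illustrate the two policies mentioned above.

```python
from typing import Protocol


class PrunePolicy(Protocol):
    """Hypothetical plug-in interface: decide which revisions may be pruned."""

    def stale(self, revisions: list[str], tagged: str) -> list[str]:
        ...


class PruneImmediately:
    """Default policy: anything the tag no longer points at is stale."""

    def stale(self, revisions: list[str], tagged: str) -> list[str]:
        return [r for r in revisions if r != tagged]


class KeepLastN:
    """Simple cap: keep at most n revisions per namespace/repo:tag."""

    def __init__(self, n: int) -> None:
        if n < 1:
            raise ValueError("n must be at least 1")
        self.n = n

    def stale(self, revisions: list[str], tagged: str) -> list[str]:
        # revisions is assumed to be ordered oldest to newest.
        keep = set(revisions[-self.n:]) | {tagged}
        return [r for r in revisions if r not in keep]


revisions = ["rev1", "rev2", "rev3"]  # rev3 is what the tag points at
print(PruneImmediately().stale(revisions, "rev3"))
print(KeepLastN(2).stale(revisions, "rev3"))
```

An external system such as OpenShift would, under this sketch, simply be another object with a `stale` method that makes its own quota decisions.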
On Tue, Dec 02, 2014 at 11:52:05AM -0800, Andy Goldstein wrote:
How would you access these older manifests? Are you imagining new API |
I was thinking about accessing via the content-addressable hash, yes. Although if we can have thin tags back, I think that would likely be sufficient for our needs. We'd either migrate our v1 extension to v2, where the registry extension automatically adds a thin tag on every push, or we'd have our deployment code create the thin tag on the fly and then use that as the image name when deploying. |
@ncdc It sounds like the real problem is creating a consistent identifier to a The issue with permanently exporting references becomes apparent when one
The above table would represent two pushes. One relying on layer(0) and We can use some diagrams to understand how these references develop with each
After the push, we can see that all of the layers are still referenced, and
In the approach where no references are ever removed, such as this content- Under the current V2 registry scheme, a0 and a1 don't exist, so layer(1)
The proposed content-addressable scheme can be mimicked by pushing manifests
I'm not saying that we should never implement content-addressable manifest For @ncdc's use case, referring to the most recent diagram, a0 would be version |
On Tue, Dec 02, 2014 at 06:44:54PM -0800, Stephen Day wrote:
Ah, I'd delete a0 and layer(1) once a1 had been pushed, since I don't
I don't see the additional bookkeeping cost to my proposal, except for
In my scheme, the lightweight tag body would embed the
This is where refcount arrays come in 4, and I think that's handled
Just add a lightweight tag pointing to any content-addressable |
A similar issue is moby/moby#4106, and I added a comment here: moby/moby#9015 (comment) |
Another problem with content-addressable manifest ids is that they muddy the role of a name/tag reference. If name and tag are omitted from the calculation of such an id and multiple manifests with different names and tags have identical ids, which manifest should be returned? It would stand to reason that the id reference would have to be namespaced by the repository, rather than by pure id. And, at the same time, if an image could be referenced by both name+id and name+tag, it would break the guarantee that the url |
On Thu, Dec 18, 2014 at 02:48:06PM -0800, Stephen Day wrote:
Can't we just return the content-addressable manifest without a
One reason I like thin tags and detached signatures, which you can
|
@stevvooe @wking @titanous @dmp42 how about something like this: Normal push command as we know it today:
Do another push. This time, because we asked to keep the previous version of the image around, it's not deleted:
Retrieve latest version of image:
Retrieve a specific version:
I'm trying to think of something that doesn't require huge changes to the v2 proposals, but that still can give us "pull by immutable identifier." This gives control to the user to decide which images to hold on to, doesn't require the manual use of mutable tags, and still allows the registry to do garbage collection. What do you all think? EDIT: instead of mark/unmark described above, a better UX would be to automatically preserve n revisions/digests per manifest. The value of n could be controlled globally by the registry operator, and optionally set by operators and/or users per repository as well. A value of 1 would keep the current behavior of only allowing 1 revision per tag. |
@ncdc As I mentioned earlier, we can't assume write access on the part of the user (or robot) doing the deploy. So unless the For instance, this would preclude using any public images on Docker Hub, as well as images created by other teams in larger organizations. |
@titanous if Bob wants to deploy Alice's alice/apache:latest (digest=D1) and Alice then pushes a new version of the image (digest=D2), it should be up to Alice and/or the registry operator to decide if D1 is preserved, wouldn't you think? Bob shouldn't be able to impact that decision, since it's Alice (and/or the registry operator) who is paying for storage of her images. |
In other words, Bob could refer to alice/apache:latest:D1, but there's no guarantee that image will always be around, since it's not Bob's decision to make. |
@ncdc Correct, I'm not proposing any policies with regards to garbage collection. |
@titanous so... is my suggestion still an issue for you? The marking is strictly a means to inform the registry not to GC a particular manifest. |
Marking for preservation (or whatever you call it) is more composable than having registry collection be under the docker registry. The idea of the registry defining rules for collection and quota, but the user being able to tag and mark as well as to pull by id, seems like it would allow the registry to function as a general purpose store for images for both humans and systems. Invoking mark on an image is a request by a user to preserve that image. I guess one downside is that the next question is whether you need ref counting. I'm assuming that mark is the image owner's choice, and that higher level systems that are managing images are really the image owners anyway, so they can implement their own ref counting. |
@ncdc As long as we can fetch manifests by digest without marking them, then it's totally fine. |
@titanous yes, marking shouldn't be required to fetch by digest |
@smarterclayton I don't think there's a need for ref counting, since layers can be deleted when they're referenced by 0 manifests. |
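As a rough illustration of "layers can be deleted when they're referenced by 0 manifests", here is a sketch that assumes the registry can enumerate its manifests and the layer ids each one lists. The function name and data shapes are hypothetical.

```python
def unreferenced_layers(
    manifests: dict[str, list[str]], stored_layers: set[str]
) -> set[str]:
    """Return stored layers that no remaining manifest references."""
    referenced: set[str] = set()
    for layers in manifests.values():
        referenced.update(layers)
    return stored_layers - referenced


# Two manifests share layer0; each has one layer of its own.
manifests = {
    "digest-1": ["layer0", "layer1"],
    "digest-2": ["layer0", "layer2"],
}
stored = {"layer0", "layer1", "layer2"}

print(unreferenced_layers(manifests, stored))  # nothing is collectable yet

# Once digest-1 is deleted, only layer1 loses its last reference.
del manifests["digest-1"]
print(unreferenced_layers(manifests, stored))
```

The shared layer0 survives because digest-2 still lists it, which is why no separate per-user reference count is needed in this model.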
Some additional questions I just thought of: what do you do if you have a few different revisions of foo/bar:latest, and then you delete foo/bar:latest? Does "latest" move back to the previous revision? Do we delete all traces of that image? Do we just delete the ability to pull foo/bar:latest but keep the ability to pull foo/bar:latest:$digest? |
FYI, I have "pull by tag + digest" working locally, and there's support for "docker run" as well. Here's the quickly hacked together prototype: ncdc/docker-registry@docker:next-generation...ng-pull-by-digest |
And in the prototype above, the TagStore json file now looks like this: https://gist.github.com/ncdc/c6fb6cba18dfe679a3b6 |
@ncdc said:
I kind of like this style of specifying a version a la a git commit. @stevvooe I don't think it would be too difficult to add this feature. The registry can still be dumb about the content, just hash the manifest/jws payload getting some content-addressable ID for the manifest like
The registry will store a manifest that is addressable either by that name (until the name is deleted/updated) or by the hash So I could pull it using:
or:
Personally, I would prefer that people not rely on "alias" tags but instead treat tags like version numbers and not allow users to push the same version again. The registry could see that there already exists a manifest with this name and refuse to overwrite it. This would force the user/admin to explicitly delete that |
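The idea above (hash the manifest bytes, keep it addressable by name or by hash, and refuse to overwrite an existing name) could be sketched roughly as follows. This is an in-memory illustration only: `ManifestStore`, its method names, and the `sha256:<hex>` digest format are assumptions, not the registry's actual API. Lookups are keyed by repository, following the namespacing concern raised earlier in the thread.

```python
import hashlib


class ManifestStore:
    """Hypothetical in-memory store; not the registry's implementation."""

    def __init__(self) -> None:
        self._blobs: dict[tuple[str, str], bytes] = {}
        self._tags: dict[tuple[str, str], str] = {}

    def put(self, repo: str, tag: str, manifest: bytes) -> str:
        # The store stays dumb about the content: it only hashes the bytes.
        digest = "sha256:" + hashlib.sha256(manifest).hexdigest()
        current = self._tags.get((repo, tag))
        if current is not None and current != digest:
            raise ValueError(f"{repo}:{tag} already exists; delete it first")
        self._blobs[(repo, digest)] = manifest
        self._tags[(repo, tag)] = digest
        return digest

    def delete_tag(self, repo: str, tag: str) -> None:
        # Removes the name only; the manifest stays addressable by digest.
        del self._tags[(repo, tag)]

    def get_by_tag(self, repo: str, tag: str) -> bytes:
        return self._blobs[(repo, self._tags[(repo, tag)])]

    def get_by_digest(self, repo: str, digest: str) -> bytes:
        return self._blobs[(repo, digest)]


store = ManifestStore()
digest = store.put("foo/bar", "1.0", b'{"layers": ["a"]}')
assert store.get_by_tag("foo/bar", "1.0") == store.get_by_digest("foo/bar", digest)
```

Whether deleting a name should also delete the manifest behind it is left open here, as it is in the discussion.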
@jlhawn I think I am on board with this approach, although id references will be explicitly namespaced by the tag. From your example, Ultimately, I feel a lot of the contention comes from the term "tag". @mmdriley is correct in that the git model is implied by the nomenclature. I also agree that the git model of tags is appropriate. However, the field known as "tag" in the V2 manifest is not the right way to implement that style of "alias tags". Arguably, we should change this field to "version". I think we can avoid some premature decisions by doing the following:
I'll repeat the proposed supporting endpoints from moby/moby#9015 comments for reference:
If we can agree on this as an interim compromise, I think we can move forward and meet the requirements of this request. Please let me know if any clarification is required. cc @ncdc |
@stevvooe I definitely like your suggestion of having the registry compute the digest and return it to the client. I know you had previously expressed concerns about the ability to perform GC if the registry retains copies of every revision of every manifest (unless they're somehow deleted, either manually by a user or automatically via some sort of policy). Are these concerns still an issue for you, or are they mitigated by the delete mechanics listed in bullet 4 above? cc @smarterclayton - any additional thoughts on the previous comment? |
They are mitigated by bullet 4 above. This approach saves everything unless specifically asked to delete it. An external webhook service can then be used to control manifest lifecycle. This keeps GC simple (ref counting) and separates it from lifecycle management. It also reduces the possibility of data loss upon manifest updates. |
On Tue, Dec 30, 2014 at 11:50:48AM -0800, Stephen Day wrote:
The PUT operation will still be destructive if the unsigned portion of
@wking We may lean towards just taking the entire hash of the content to address this. There seem to be problems with specialized hashes no matter what way we try to cut this up. We may want to discuss storing the signatures separately from the manifest. |
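To illustrate the difference between a specialized hash and hashing the entire content, here is a small sketch. The document layout (a `payload` object plus a `signatures` list) is a simplified assumption for illustration and is not the actual V2 manifest or JWS format.

```python
import hashlib
import json


def _sha256(obj: object) -> str:
    # Canonicalize with sorted keys so equal objects hash equally.
    data = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def payload_digest(doc: dict) -> str:
    """Specialized hash: covers only the signed payload."""
    return _sha256(doc["payload"])


def full_digest(doc: dict) -> str:
    """Whole-content hash: covers payload and signatures together."""
    return _sha256(doc)


payload = {"name": "foo/bar", "layers": ["layer0", "layer1"]}
signed_by_alice = {"payload": payload, "signatures": ["alice-sig"]}
signed_by_bob = {"payload": payload, "signatures": ["bob-sig"]}

# Same payload, so the specialized hash cannot tell the two apart...
assert payload_digest(signed_by_alice) == payload_digest(signed_by_bob)
# ...while the whole-content hash gives each document its own id.
assert full_digest(signed_by_alice) != full_digest(signed_by_bob)
```

Under the whole-content approach, the same image signed twice yields two ids, which is one reason storing or merging signatures separately comes up in the thread.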
@wking We may actually be able to merge the signatures on the registry side. |
On Mon, Jan 05, 2015 at 05:10:31PM -0800, Stephen Day wrote:
And the way to avoid this is to just hash the whole thing ;). If “the
Sounds good to me (this is where I started out with moby/moby#6070 |
I've spec'ed out a proposal for implementation in distribution/distribution#46. |
I'm going to close this issue, for now, since it has been superseded by distribution/distribution#46. If there is further discussion to be had, please take it there. |
In OpenShift, we would like to be able to access an image (i.e. docker pull/create/run) via an immutable identifier that uniquely identifies an image for all points in time. Here are some use cases:
foo/bar:latest (let's call this Rev1) and someone pushes an updated foo/bar:latest manifest in the middle of the deployment (Rev2), some containers might be created using Rev1 and others with Rev2. The correct behavior is for all containers to be running based on Rev1.
foo/bar:latest - Rev1). As development continues, foo/bar:latest is updated several times, but none of these newer image manifests has been deployed yet. The user decides to scale up the existing deployment, so OpenShift needs to create and start new containers using the same image that is currently deployed (Rev1). foo/bar:latest can't be used because that's no longer Rev1 - it's now something else (e.g. Rev17).
With the v1 registry, we created a custom extension that responds to the tag_created signal, creates a new tag whose name is the image's id (since a v1 image has an id), and then http posts a payload to OpenShift with
OpenShift users can create deployment triggers that watch for changes to foo/bar:latest and then perform new deployments. When deploying, OpenShift inspects the image to find its id and then translates foo/bar:latest to foo/bar:$image_id. This, combined with our custom v1 registry extension, allows us to pull an image by its id. It also lets us deploy a specific image by id, as our deployments don't refer to :latest but instead to the id.
@dmp42 suggested that OpenShift could pull foo/bar:latest, generate a new tag (based on commit id, date/time, etc) that is unique and consistent for all time, push the new tag, and then use that when deploying. This creates a couple of problems:
It would really be nice to have immutable identifiers for image manifests that are consistent all the time.
@dmp42 @stevvooe @smarterclayton @wking thoughts?
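The deployment flow described in this issue (resolve foo/bar:latest to an id once, then create every container from that id) can be sketched as below. The `tags` mapping stands in for a registry lookup and `pin` is a hypothetical helper; neither is an OpenShift or Docker API. The sketch assumes the image reference always carries an explicit tag.

```python
def pin(tags: dict[str, str], image: str) -> str:
    """Translate repo:tag into repo:$image_id at a single point in time."""
    repo, _, _tag = image.rpartition(":")
    return f"{repo}:{tags[image]}"


# Stand-in for the registry: mutable tag -> current image id.
tags = {"foo/bar:latest": "rev1"}

# Resolve once, at the start of the deployment.
pinned = pin(tags, "foo/bar:latest")

# Someone pushes a new foo/bar:latest in the middle of the deployment.
tags["foo/bar:latest"] = "rev17"

# Containers created before and after the push use the same reference.
containers = [pinned for _ in range(3)]
assert all(ref == "foo/bar:rev1" for ref in containers)
```

Scaling up later reuses the stored `pinned` value instead of resolving the tag again, which only works if the registry keeps the image addressable under that id.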