Immutable image manifest references #46

Closed
4 tasks
stevvooe opened this issue Jan 7, 2015 · 28 comments · Fixed by #211

Comments

@stevvooe
Collaborator

stevvooe commented Jan 7, 2015

After discussion in moby/moby#9015 and docker-archive/docker-registry#804, it's clear that we need support for immutable references to v2 image manifests.

The conditions for support, from #804, are as follows.

  1. For the initial version, the manifest id is controlled by the registry. The manifest id should be returned as part of the response to a manifest PUT, in addition to a Location header with the canonical URL for the manifest (i.e. /v2/<name>/manifests/<tag>/<digest>).
  2. The "digest" of the manifest is the sha256 of the "unsigned" portion of the manifest, with sorted object keys. The id is only calculated by the registry. This depends on #25 (Store manifest signatures separately from content), which allows us to merge signatures from separate pushes of identical content.
  3. PUT operations on the manifest are no longer destructive. If the content is different, the "tag" is updated to point at the new content. All revisions remain addressable by digest. Conflicting signatures are stored separately.
  4. The DELETE method on /v2/<name>/manifests/<tag> should be clarified to delete all revisions of a given tag, whereas DELETE on /v2/<name>/manifests/<tag>/<digest> should only delete the revision with the requested digest.
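
To make the four conditions concrete, here is a rough in-memory sketch in Python. It is not the registry implementation: `manifest_digest`, `ManifestStore` and the compact sorted-key JSON encoding are all assumptions made for illustration, and only the URL layout comes from the proposal.

```python
import hashlib
import json


def manifest_digest(manifest: dict) -> str:
    """Hash everything except "signatures", serialized with sorted keys."""
    payload = {k: v for k, v in manifest.items() if k != "signatures"}
    # Assumption: compact JSON with sorted keys stands in for the canonical form.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ManifestStore:
    """Hypothetical in-memory model of conditions 1, 3 and 4."""

    def __init__(self):
        self.revisions = {}  # (name, tag) -> {digest: manifest}
        self.current = {}    # (name, tag) -> digest the tag points at

    def put(self, name, tag, manifest):
        digest = manifest_digest(manifest)
        # Non-destructive: earlier revisions stay addressable by digest.
        self.revisions.setdefault((name, tag), {})[digest] = manifest
        self.current[(name, tag)] = digest
        location = f"/v2/{name}/manifests/{tag}/{digest}"
        return digest, location

    def get(self, name, tag, digest=None):
        digest = digest or self.current[(name, tag)]
        return self.revisions[(name, tag)][digest]

    def delete(self, name, tag, digest=None):
        if digest is None:
            # DELETE on the tag removes every revision of that tag.
            self.revisions.pop((name, tag), None)
            self.current.pop((name, tag), None)
            return
        # DELETE on tag/digest removes only that revision.
        self.revisions[(name, tag)].pop(digest)
        # What the tag should point at afterwards is not specified in the
        # proposal; this sketch simply clears the pointer.
        if self.current.get((name, tag)) == digest:
            self.current.pop((name, tag))
```

A second PUT with different content moves the tag while the first revision remains reachable through `get(name, tag, digest)`.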

The following are the tasks required to accomplish this:

  • API Specification must be updated with the following endpoints:

    Method  Path                                 Entity    Description
    GET     /v2/<name>/manifests/<tag>/<digest>  Manifest  Fetch the manifest identified by name, tag and digest.
    DELETE  /v2/<name>/manifests/<tag>/<digest>  Manifest  Delete the manifest identified by name, tag and digest.

  • docker pull <image>:<tag>@<id>

@mmdriley

mmdriley commented Jan 7, 2015

This seems a bit odd -- it's like we're implementing tags of tags.

What if two manifest versions with the same "signed" components are uploaded? This would seem to recommend abandoning the idea of any "unsigned" part.

Can these digests be enumerable through the API?

Can the API provide an endpoint to get the digest of the current version of a manifest?

@stevvooe
Collaborator Author

stevvooe commented Jan 7, 2015

@mmdriley I understand this may seem slightly odd, but the V2 manifest format is in flux and we are just getting the distribution project up and running. It's a step in the right direction to start supporting a number of requested features without adding too much complexity that must be supported in the long term.

This seems a bit odd -- it's like we're implementing tags of tags.

The "tag" in the V2 manifest is really like a version field. This is not ideal but I cannot change this right away. The "digest" is an immutable reference to a revision of that "tag" (or version). The goal of this is to find a compromise that can support moving off this broken model of tags and move to a more familiar, git-like tag model in the future. Eventually, we can actually externalize tags from the manifest, allowing one to tag a specific revision and change the revision to which a tag points.

What if two manifest versions with the same "signed" components are uploaded? This would seem to recommend abandoning the idea of any "unsigned" part.

The signatures are going to be separated from the content on the backend, allowing us to merge manifest signatures and not destroy content when a new signature is added. This work has been specified in #25.

Can these digests be enumerable through the API?

Yes, they can, but that endpoint is not part of this particular proposal. This is not ideal, but we need to be slightly conservative with API surface area.

Can the API provide an endpoint to get the digest of the current version of a manifest?

This is covered under item 3 above. The manifest's digest will be available in an HTTP header in the response to GET /v2/<name>/manifests/<tag>, along with a canonical Location header. This will also be available via notification webhooks (#42), which we haven't fully specified yet, to support deterministic builds and deployment.

If you still have questions, please come by #docker-distribution in IRC and ping me (stevvooe). A lot of the proposed changes here are based directly on your feedback. I'll be happy to go over the details with you and make sure we're on the same page.

@ncdc

ncdc commented Jan 8, 2015

This proposal LGTM

@mmdriley

mmdriley commented Jan 9, 2015

What if two manifest versions with the same "signed" components are uploaded? This would seem to recommend abandoning the idea of any "unsigned" part.

The signatures are going to be separated from the content on the backend, allowing us to merge manifest signatures and not destroy content when a new signature is added. This work has been specified in #25.

It seems like the most important aspect of this is as yet unspecified. What does it mean to "merge" two manifests? This goes back to the question of why there are unsigned parts of the manifest at all.

@wking
Contributor

wking commented Jan 9, 2015

On Fri, Jan 09, 2015 at 10:20:46AM -0800, Matthew Riley wrote:

What if two manifest versions with the same "signed" components
are uploaded? This would seem to recommend abandoning the idea
of any "unsigned" part.

The signatures are going to be separated from the content on the
backend, allowing us to merge manifest signatures and not destroy
content when a new signature is added. This work has been
specified in #25.

It seems like the most important aspect of this is as yet
unspecified. What does it mean to "merge" two manifests? This goes
back to the question of why there are unsigned parts of the manifest
at all.

It's not merging the manifests, it's merging the signatures. That's
just “create a list of all signatures associated with the signed
portion of this manifest”. We have a signed portion to break the:

manifest → make signature → update manifest

loop, since you can't have the signature signing itself. An easier
approach (which it seems like we're taking on the backend) is to just
use detached signatures.

@stevvooe
Collaborator Author

stevvooe commented Jan 9, 2015

@mmdriley The use of the term "unsigned" was inaccurate. This is the "payload" portion of the JWS signed manifest. This can best be explained with an example. A very simple payload might be like this:

{
   "schemaVersion": 1,
   "name": "foo/bar",
   "tag": "0.1",
   "architecture": "amd64",
   "fsLayers": [
      {
         "blobSum": "demo"
      }
   ],
   "history": null
}

Party A signs the manifest with their private key, getting the following manifest:

{
   "schemaVersion": 1,
   "name": "foo/bar",
   "tag": "0.1",
   "architecture": "amd64",
   "fsLayers": [
      {
         "blobSum": "demo"
      }
   ],
   "history": null,
   "signatures": [
      {
         "header": {
            "jwk": {
               "crv": "P-256",
               "kid": "YGDA:Y3ZD:USKI:4L4E:4HZU:KGPU:YC7E:3SJH:VH7R:7VUW:STJK:O24Q",
               "kty": "EC",
               "x": "MsExpH8eIv60vHvz1p1HJ03ctSy1ZRh9pzsG32I-vaI",
               "y": "VJPadjcQK8AnqlOTyONJUFtizN5xAygEA6VCbwPXZ9U"
            },
            "alg": "ES256"
         },
         "signature": "dUWS_SpB6TuKGdfmv4uyizaz8kiXWZcv5sHuAi84GH5mLdYlslHVwIzBotdg4Sbyh0hy1B9hTQvbQ2BuhbZA4w",
         "protected": "eyJmb3JtYXRMZW5ndGgiOjE3NiwiZm9ybWF0VGFpbCI6IkNuMCIsInRpbWUiOiIyMDE1LTAxLTA5VDIxOjM1OjQwWiJ9"
      }
   ]
}

The above can be verified with public key "YGDA:Y3ZD:USKI:4L4E:4HZU:KGPU:YC7E:3SJH:VH7R:7VUW:STJK:O24Q" using libtrust.

Now, let's say party B, with a different private key, signs the same payload and gets the following:

{
   "schemaVersion": 1,
   "name": "foo/bar",
   "tag": "0.1",
   "architecture": "amd64",
   "fsLayers": [
      {
         "blobSum": "demo"
      }
   ],
   "history": null,
   "signatures": [
      {
         "header": {
            "jwk": {
               "crv": "P-256",
               "kid": "U4P3:BQAB:L75W:Q7WX:NN3C:ND3V:XBSS:2MM6:XSXP:ZB5Q:XL6Q:JDFD",
               "kty": "EC",
               "x": "W1ul5T2qa_xM8ATqIhu80_5Z0Mhff9TLMKQofBruA3Q",
               "y": "KElpSRkBOM0Y7TNspJh0jlReLKlS7-EqHvqHut9c7Gk"
            },
            "alg": "ES256"
         },
         "signature": "ZaF-2gEMN4yzxQwr86WEkbC4uEaRW6zgMlpW3iL-4-4jpgpklua8IonJm75QCzCp6rsfbLcyWAIddYHqwcZZ1A",
         "protected": "eyJmb3JtYXRMZW5ndGgiOjE3NiwiZm9ybWF0VGFpbCI6IkNuMCIsInRpbWUiOiIyMDE1LTAxLTA5VDIxOjM1OjQwWiJ9"
      }
   ]
}

The above can be verified to have public key "U4P3:BQAB:L75W:Q7WX:NN3C:ND3V:XBSS:2MM6:XSXP:ZB5Q:XL6Q:JDFD". These two signed manifests have identical payloads with different signatures. With modifications to libtrust, we can trivially merge the signatures, getting the following:

{
   "schemaVersion": 1,
   "name": "foo/bar",
   "tag": "0.1",
   "architecture": "amd64",
   "fsLayers": [
      {
         "blobSum": "demo"
      }
   ],
   "history": null,
   "signatures": [
      {
         "header": {
            "jwk": {
               "crv": "P-256",
               "kid": "YGDA:Y3ZD:USKI:4L4E:4HZU:KGPU:YC7E:3SJH:VH7R:7VUW:STJK:O24Q",
               "kty": "EC",
               "x": "MsExpH8eIv60vHvz1p1HJ03ctSy1ZRh9pzsG32I-vaI",
               "y": "VJPadjcQK8AnqlOTyONJUFtizN5xAygEA6VCbwPXZ9U"
            },
            "alg": "ES256"
         },
         "signature": "dUWS_SpB6TuKGdfmv4uyizaz8kiXWZcv5sHuAi84GH5mLdYlslHVwIzBotdg4Sbyh0hy1B9hTQvbQ2BuhbZA4w",
         "protected": "eyJmb3JtYXRMZW5ndGgiOjE3NiwiZm9ybWF0VGFpbCI6IkNuMCIsInRpbWUiOiIyMDE1LTAxLTA5VDIxOjM1OjQwWiJ9"
      },
      {
         "header": {
            "jwk": {
               "crv": "P-256",
               "kid": "U4P3:BQAB:L75W:Q7WX:NN3C:ND3V:XBSS:2MM6:XSXP:ZB5Q:XL6Q:JDFD",
               "kty": "EC",
               "x": "W1ul5T2qa_xM8ATqIhu80_5Z0Mhff9TLMKQofBruA3Q",
               "y": "KElpSRkBOM0Y7TNspJh0jlReLKlS7-EqHvqHut9c7Gk"
            },
            "alg": "ES256"
         },
         "signature": "ZaF-2gEMN4yzxQwr86WEkbC4uEaRW6zgMlpW3iL-4-4jpgpklua8IonJm75QCzCp6rsfbLcyWAIddYHqwcZZ1A",
         "protected": "eyJmb3JtYXRMZW5ndGgiOjE3NiwiZm9ybWF0VGFpbCI6IkNuMCIsInRpbWUiOiIyMDE1LTAxLTA5VDIxOjM1OjQwWiJ9"
      }
   ]
}

The above can then be verified to have been signed by both "YGDA:Y3ZD:USKI:4L4E:4HZU:KGPU:YC7E:3SJH:VH7R:7VUW:STJK:O24Q" and "U4P3:BQAB:L75W:Q7WX:NN3C:ND3V:XBSS:2MM6:XSXP:ZB5Q:XL6Q:JDFD". The registry just needs to store the signature components to be merged when serving a manifest.
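
The merge step itself can be sketched roughly as follows in Python. This is only an illustration of the bookkeeping: `payload_of` and `merge_signatures` are made-up names, no signature is cryptographically verified here (that remains libtrust's job), and treating "everything except signatures" as the payload is an assumption.

```python
import json


def payload_of(signed: dict) -> dict:
    """Return the manifest without its "signatures" section."""
    return {k: v for k, v in signed.items() if k != "signatures"}


def merge_signatures(a: dict, b: dict) -> dict:
    """Combine the signature lists of two signed manifests with equal payloads."""
    if payload_of(a) != payload_of(b):
        raise ValueError("payloads differ; signatures cannot be merged")
    merged = payload_of(a)
    seen = set()
    signatures = []
    for sig in a.get("signatures", []) + b.get("signatures", []):
        # Drop exact duplicates, e.g. the same party pushing twice.
        key = json.dumps(sig, sort_keys=True)
        if key not in seen:
            seen.add(key)
            signatures.append(sig)
    merged["signatures"] = signatures
    return merged
```

Merging the manifests signed by party A and party B would give one manifest carrying both entries, and merging a manifest with itself leaves the signature list unchanged.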

@ncdc

ncdc commented Jan 13, 2015

Here's my updated WIP for the Core: ncdc/docker@dmcgowan:v2-registry...v2-registry-manifest-digest

Note: it makes some assumptions about headers (e.g. Digest contains the digest of the pushed or pulled manifest) that can change if needed.

Note 2: I'm not super crazy about the changes I made to TagStore, but they seem to work.

@stevvooe stevvooe added ready and removed ready labels Jan 14, 2015
stevvooe added a commit that referenced this issue Jan 15, 2015
Refactor backend storage layout to meet new requirements (addresses #25, #46)
@mmdriley

Looks like I should be more careful to distinguish "digest" from "signature".

If two manifests have identical digests, are they guaranteed to be semantically identical modulo the "signatures" section?

@stevvooe
Collaborator Author

@mmdriley We are all working to find the correct nomenclature here ;).

Yes, to the degree that sha256 hashing can guarantee. Right now, the implementation takes the hash of the (*SignedManifest).Raw field, which is the signed payload of the JWS.

Check out the ongoing implementation for details.

@stevvooe stevvooe modified the milestones: Registry/RC, Registry/Beta Jan 19, 2015
@stevvooe stevvooe added In Progress and removed Ready labels Feb 9, 2015
@stevvooe
Collaborator Author

Per recent developments, we are changing this proposal to modify the tagged manifest routes to accept digests:

Method  Path                               Entity    Description
GET     /v2/<name>/manifests/<tag|digest>  Manifest
PUT     /v2/<name>/manifests/<tag|digest>  Manifest
DELETE  /v2/<name>/manifests/<tag|digest>  Manifest

@mattdr

mattdr commented Feb 19, 2015

The descriptions don't mention "tag" despite it being in path.

How will you disambiguate tags and digests?

@stevvooe
Collaborator Author

@mattdr Thanks for taking a peek.

I'll fix the descriptions when I update the proposal description. A valid digest is never a valid tag: the ":" separator falls outside the tag character set, so, for example, "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855" is not matched by [\w][\w.-]{0,127}.

Clients should always verify a manifest after fetching by digest to avoid "hiding" content with a bad tag if this logic is incorrectly implemented.
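
As a rough illustration of both points (routing a path segment to tag or digest, then verifying what came back), a Python sketch might look like this. The tag pattern is the one quoted above; `DIGEST_RE`, `classify_reference` and `verify_digest` are assumptions for the example, and hashing the raw bytes stands in for whatever payload the registry actually hashes.

```python
import hashlib
import re

# Tag pattern quoted in the thread; re.ASCII keeps \w to [A-Za-z0-9_].
TAG_RE = re.compile(r"[\w][\w.-]{0,127}", re.ASCII)
# Assumption: a digest looks like "<algorithm>:<hex>", e.g. "sha256:<64 hex chars>".
DIGEST_RE = re.compile(r"[a-z0-9]+:[0-9a-f]{32,}")


def classify_reference(ref: str) -> str:
    """Decide whether a manifest reference is a digest or a tag."""
    if DIGEST_RE.fullmatch(ref):
        return "digest"
    if TAG_RE.fullmatch(ref):
        return "tag"
    raise ValueError(f"not a valid tag or digest: {ref!r}")


def verify_digest(content: bytes, digest: str) -> bool:
    """Client-side check that fetched content matches the requested digest."""
    algorithm, _, expected = digest.partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported digest algorithm: {algorithm!r}")
    return hashlib.sha256(content).hexdigest() == expected
```

Because the colon can never appear in a tag, the two patterns cannot both match the same string, which is what lets a single route accept either form.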

@miminar
Contributor

miminar commented Feb 20, 2015

@stevvooe Is there a way to list manifest digests similar to GET /v2/<name>/tags/list?

@ncdc

ncdc commented Feb 20, 2015

I'm not sure this is necessary? What kind of use cases were you envisioning?


@ncdc

ncdc commented Feb 20, 2015

Actually I can see this potentially being useful to allow an external
system to assist the registry in garbage collection, where the external
system knows which digests to preserve. It would get the list of all
digests and issue deletes for the ones no longer needed.

@miminar
Contributor

miminar commented Feb 20, 2015

@ncdc: Suppose the user wants to pull by ID and provides a shortened one. Docker client needs to resolve it so it can GET /v2/<name>/manifests/<digest>. Unless registry handles shortened digests by itself, Docker client will need to get a list of manifest digests to filter out the one matching the given prefix.

@ncdc

ncdc commented Feb 20, 2015

@miminar the only way the user would know about a digest should be if the image and digest already existed on the local system, meaning they already pulled an image by tag (which includes the digest as part of the pull response). In this instance, Docker has a record of the full digest, and it should be able to resolve the shortened digest to the full one before transmitting the request to the registry.
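
A minimal sketch of that client-side resolution, in Python, assuming the client keeps a list of the full digests it has already seen. `resolve_short_digest` and the idea of matching with or without the "sha256:" prefix are illustrative choices, not anything Docker is known to implement this way.

```python
def resolve_short_digest(prefix: str, known_digests) -> str:
    """Expand a shortened digest using digests already recorded locally."""
    matches = sorted(
        {
            d
            for d in known_digests
            # Accept a prefix of the full form or of the hex part alone.
            if d.startswith(prefix) or d.split(":", 1)[-1].startswith(prefix)
        }
    )
    if not matches:
        raise LookupError(f"no local digest starts with {prefix!r}")
    if len(matches) > 1:
        raise ValueError(f"ambiguous digest prefix {prefix!r}: {matches}")
    return matches[0]
```

The full digest returned here is what would then be sent to the registry, so the registry never has to handle shortened digests.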

@miminar
Contributor

miminar commented Feb 20, 2015

@ncdc: Is that really the only way? Digest IDs are not to be shared among people? What if Bob wants to point Alice to a specific version of a container he was playing with? Or I could imagine a Dockerfile with FROM ubuntu@sha256:xxxxxxxxxxxx referring to some older version which just worked. Maybe this is wrong to do and I'm just missing something. It changes the way I need to think about the digests and tags.

@ncdc

ncdc commented Feb 20, 2015

@miminar there's certainly no reason 2 people couldn't share a digest. And we definitely expect and need the FROM statement in a Dockerfile to support digests. And if having an endpoint to retrieve a list of all digests helps somehow and @stevvooe is ok with it, it's fine with me. I'm just having a hard time seeing if that's really needed.

@ncdc

ncdc commented Feb 20, 2015

@miminar if Bob wants to share a manifest-by-digest with Alice, then he would have to do something like this:

  1. docker push bob/foo:latest
  2. Registry calculates digest and returns it to Docker in response to Bob's push
  3. Docker records digest in something similar to the TagStore
  4. Bob runs something like docker images to look for bob/foo:latest to see what its digest is
  5. Bob tells Alice to use bob/foo@<digest>
  6. Alice pulls/runs bob/foo@<digest>

@miminar
Contributor

miminar commented Feb 20, 2015

@ncdc: I see your point. The ability to list manifest digests isn't that important after all. I couldn't come up with other use cases and this one is not crucial at all.

@powellquiring

Consider the following use case: A docker deployment environment reads the registry to come up with inventory. A "version object" in the inventory system is created for each tag and it is made unique with the digest. The user persists a collection of version objects that make up their application as a snapshot (think: ui:03, database:latest, ...). The snapshot is tested, staged, and moved to production.

v1 allowed the creation of the version objects but offered no way to pull by digest.

v2 won't support the creation of the version objects, but would allow them to be pulled.

@stevvooe
Collaborator Author

stevvooe commented Mar 5, 2015

@powellquiring This is a pretty important use and this proposal is one of the steps to getting to that scenario. Please check out #211 for more details on the implementation of this proposal.

The current state, which you are describing accurately, is partly due to the current V2 manifest format, which includes the tag as part of the manifest. If the manifest changes, the digest changes, which is unfortunate. We are migrating away from having the tags as part of the manifest exactly because it makes this use case impossible.

The most relevant proposal covering this would be #63, which is the fully fledged manifest format. #173 proposes external tags.

Please let me know if this adequately addresses your concern.

@powellquiring

I don't believe the use case can be supported without an API to retrieve the immutable image id. Above it was suggested this be part of:

GET /v2/<name>/tags/list

@ncdc

ncdc commented Mar 5, 2015

@powellquiring with the changes in #211, when you retrieve an image manifest, the registry returns the manifest's digest as a response header - that's the immutable reference to that particular manifest.
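
For an inventory system, turning that header into a pinned reference could look roughly like the Python sketch below. The header name is not given in this thread, so the default `Docker-Content-Digest` is an assumption passed in as a parameter, and `immutable_reference` is a made-up helper.

```python
def immutable_reference(name: str, headers: dict,
                        header_name: str = "Docker-Content-Digest") -> str:
    """Build "<name>@<digest>" from the headers of a manifest GET response."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    digest = lowered.get(header_name.lower())
    if not digest:
        raise LookupError(f"response has no {header_name} header")
    return f"{name}@{digest.strip()}"
```

Recording the returned string per tag would give each "version object" an identifier that can later be pulled as-is.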

@stevvooe
Collaborator Author

stevvooe commented Mar 6, 2015

@powellquiring What is wrong with using the result of /v2/<name>/tags/list to look up the manifest values? Are you asking for an API to list all the manifests, tagged or not, by digest, in a repository?

It would also be helpful if you could explain your use case in terms of what is required, rather than what you believe the model supports. That will help to avoid misunderstanding.

@ncdc

ncdc commented Mar 6, 2015

We talked on IRC and I think I cleared up any confusion.

@stevvooe
Collaborator Author

stevvooe commented Mar 6, 2015

@ncdc Thank you!

@powellquiring Let me know if you need any further clarifications. We are definitely interested in supporting a flexible deployment model.
