New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs/spec: Proposal for new manifest format #1068

Merged
merged 1 commit into from Dec 19, 2015

Conversation

Projects
None yet
@aaronlehmann
Copy link
Contributor

aaronlehmann commented Oct 5, 2015

This is a follow-on to PR #62, and it borrows much of the format
from #993, but uses specific formats for the image manifest and manifest
list (fat manifest) instead of a combined generic format.

The intent of this proposed manifest format is to allow multi-arch, and
allow for full content-addressability of images in the Docker engine.

Signed-off-by: Aaron Lehmann aaron.lehmann@docker.com

@aaronlehmann

This comment has been minimized.

Copy link
Contributor Author

aaronlehmann commented Oct 5, 2015

- **`layers`** *array*

The layer list is ordered starting from the base image (opposite order of schema1).
The `baseLayer` label is intended to improve human comprehensibility of the layer

This comment has been minimized.

@mattmoor

mattmoor Oct 5, 2015

Contributor

Perhaps it would be better if the specification left room for this kind of unstructured annotation without calling them out? This way, different implementations can add/remove information intended for humans / UI without a schema change?

For examples, in the future we might want to know the "top" layer, or per-layer what it's Dockerfile directive was, or...

Maybe this should be "comment": "base layer"?

This comment has been minimized.

@aaronlehmann

aaronlehmann Oct 6, 2015

Author Contributor

I'm fine with that, or we can avoid calling it out in the spec. The intention was to clarify that the label in the example doesn't have any semantic meaning to the engine.

*Example showing a simple manifest list pointing to image manifests for two platforms:*
```json
{
"schemaVersion": 3,

This comment has been minimized.

@mattmoor

mattmoor Oct 5, 2015

Contributor

Why 3?

This comment has been minimized.

@dmcgowan

dmcgowan Oct 5, 2015

Member

The new manifest format is already 2, this is really a 3rd format specifically for listing multiple manifests. The schema version must be set such that old clients can fail early and predictability when this manifest list is not supported. Setting this to 2 as well would have been incorrect and confusing.

{
"schemaVersion": 2,
"config": {
"mediaType": "application/vnd.docker.container.image.v1+json",

This comment has been minimized.

@mattmoor

mattmoor Oct 5, 2015

Contributor

At this point, isn't the media type here implied? What else could it be?

This comment has been minimized.

@RichardScothern

RichardScothern Oct 5, 2015

Contributor

If the format of an image container evolves, it would have a new schema and corresponding mediatype.

This comment has been minimized.

@waghpooja

waghpooja Oct 7, 2015

The media type and schema version here go hand in hand. So having both opens up the possibility of these going out of sync as is the case in this example where schemaVersion is 2 but mediaType is application/vnd.docker.container.image.v1+json instead of pplication/vnd.docker.container.image.v2+json.

This comment has been minimized.

@stevvooe

stevvooe Oct 14, 2015

Contributor

I think we should phase out schemaVersion and rely on the mediatype. The schemaVersion field forces a two step deserialization process. We should just have schemaVersion: 2 as an open keyspace. If it contains a "manifests" key, it has manifests, if it has a "config" key, it has a config. Mediatypes are ultimately more flexible and there is no reason to really carry on with schemaVersion, unless we modify the keyspace after 2.

},
"layers": [
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",

This comment has been minimized.

@mattmoor

mattmoor Oct 5, 2015

Contributor

Same question as above.

"schemaVersion": 3,
"manifests": [
{
"mediaType": "application/vnd.docker.image.manifest.v2+json",

This comment has been minimized.

@mattmoor

mattmoor Oct 5, 2015

Contributor

Can this be any other media type?

This comment has been minimized.

@aaronlehmann

aaronlehmann Oct 6, 2015

Author Contributor

I was thinking we should support application/vnd.docker.image.manifest.v1+json as well, in case someone really wants to reference a legacy manifest here, but I don't have any strong feelings about allowing this, and can see some reasons why we might not want to allow it.

- `application/vnd.docker.image.rootfs.diff.tar.gzip`: "Layer", as a gzipped tar
- `application/vnd.docker.container.image.v1+json`: Container config JSON

## Manifest List

This comment has been minimized.

@mattmoor

mattmoor Oct 5, 2015

Contributor

This kind of confused me a little bit because today we have endpoints:

/v2/<name>/manifests/<reference>
/v2/<name>/tags/list

I initially read this as:

/v2/<name>/manifests/list

Is it worth splitting the resource noun for this from manifest to something like group?

/v2/<name>/groups/<reference>

(label seemed like the wrong noun because it is used in a different sense within the JSON)


- **`labels`** *object*

Labels may be keyed to *any* JSON object, allowing content creators to

This comment has been minimized.

@RichardScothern

RichardScothern Oct 5, 2015

Contributor

We should forbid them being keyed to themselves.

@RichardScothern RichardScothern referenced this pull request Oct 6, 2015

Closed

Distribution does not conform to canonical format #1066

1 of 3 tasks complete
@dmcgowan

This comment has been minimized.

Copy link
Member

dmcgowan commented Oct 7, 2015

We should add a compatibility section at the end which will suggest how we will support clients.

The original suggestion is that clients which support this would provide an "Accept" header and the registry will now set the "Content-Type". When either of these headers are missing the assumption made by the other endpoint would be that schema version 1 is used.


- **`layers`** *array*

The layer list is ordered starting from the base image (opposite order of schema1).

This comment has been minimized.

@waghpooja

waghpooja Oct 7, 2015

What is the motivation behind reversing this order from the one in schema1?

This comment has been minimized.

@aaronlehmann

aaronlehmann Oct 7, 2015

Author Contributor

The ordering in schema1 was awkward because engines ended up having to process it in reverse. The base layer needs to be added to the graph first, but this was at the end of the list in schema1.

@estesp

This comment has been minimized.

Copy link
Member

estesp commented Oct 13, 2015

I like this as a cleaner alternate to #993. I assume we still have some questions to answer based on the questions in-line above from @mattmoor and others, but given we can find answers to those, I'm LGTM on this spec design for supporting multi-arch.

@bowlofeggs

This comment has been minimized.

Copy link
Contributor

bowlofeggs commented Oct 13, 2015

Will this format allow there to be a canonical format for manifests so that there is exactly one way to calculate the digest of a manifest? @TomasTomecek has been working on writing a Python library to calculate the digest of a manifest which has been tricky due to manifests not having a "correct" way to be formatted (things like spacing, ordering of the keys, how to remove signatures, etc.)

The signature is particularly troublesome, since it is included inside the manifest. Most crypto signing libraries transmit the signature outside of the data that is being signed instead of inside it (think of PGP). I think that design would really make digest validation much easier. Another option is to consider a signed manifest and the same manifest with different or no signatures not to be the "same" so they get different digests.

@mattmoor

This comment has been minimized.

Copy link
Contributor

mattmoor commented Oct 13, 2015

@rbarlow Very big +1, this drove us nuts because there is also this, which would make things very straightforward.

- **`schemaVersion`** *int*

This field specifies the image manifest schema version as an integer. A
manifest list uses version `3` to distinguish it from an image manifest.

This comment has been minimized.

@ncdc

ncdc Oct 14, 2015

Contributor

This seems a bit strange to me. Doesn't this imply that if you want to rev the regular manifest version, the next available value would be 4?

This comment has been minimized.

@mattmoor

mattmoor Oct 14, 2015

Contributor

I brought this up in the DWG meeting last week as well.

I think that if these objects are going to have perpetually distinct schema versions, we should split them into distinct API objects (with their own schema number to boot).

I've been trying to articulate this for a couple meetings now, probably badly. I will see if I can put together a proposal that better articulates it. I think where we left things last week was to wait for Steve to weigh in this week.


The manifests field contains a list of manifests for specific platforms.

Fields of a onject in the manifests list are:

This comment has been minimized.

@ncdc

ncdc Oct 14, 2015

Contributor

"a onject" -> "an object"

@stevvooe

This comment has been minimized.

Copy link
Contributor

stevvooe commented Oct 14, 2015

@rbarlow The content used to calculate the digest of a manifest is set by the original generator of the raw bytes. This avoids the need for specifications like Canonical JSON that purport to bring stable generation when they really don't. There are two many cross-language variations JSON generation to always get this right. This is handled simply by saving the raw bytes of the serialized manifest. This is also flexible in that it doesn't prevent implementations from generating deterministic manifests but doesn't require them where they are not needed.

@bowlofeggs

This comment has been minimized.

Copy link
Contributor

bowlofeggs commented Oct 14, 2015

@stevvooe The problem is that signatures get inserted into the manifest, changing it. If a client needs to calculate the digest of a retrieved manifest that has signatures, that client must carefully remove the signatures and try to guess how the manifest was formatted when its digest was calculated. Since there is no canonical format for the manifest, there isn't a guarantee that doing this is possible.

If the signatures were instead stored separately of the manifest, there would be no need for any client to perform any manipulation at all on the manifest to calculate the digest. This is how PGP emails work, for example. The data being signed and the signature are separate, rather than having the signature inside the data that is being signed. This makes signature verification straightforward. The Docker manifest's inclusion of the signatures within the manifest itself make it difficult to calculate the digest of the manifest and also make signature verification difficult.

I am proposing that the digest and the signatures be separated from the manifest so that a client can calculate a plain old checksum on exactly the digest bits that were retrieved without having to do any manipulation at all. For example, something like this would be viable:

{'key1': 'value1', 'key2': 'value2', etc...}
------- BEGIN DOCKER SIGNATURES -------
[SIGNATURE_1, SIGNATURE_2, etc.]
------- END DOCKER SIGNATURES -------

In the example above, the client would strip the signature block and run the digest algorithm on the exact bits that are left over (the JSON). This way the client doesn't have to manipulate the JSON at all. If the original manifest used spaces and newlines or did it all in one line, the client calculates the checksum the same way - on whatever bits were received. This makes digest validation so much simpler, and it also aids in signature verification which suffers from the same problem.

Does that make sense?

@aaronlehmann

This comment has been minimized.

Copy link
Contributor Author

aaronlehmann commented Oct 14, 2015

@rbarlow: This revision of the manifest won't include in-line signatures. Signatures will be handled outside the manifest by signing the manifest's digest, for example with Docker Content Trust.

This will be much less confusing, and I think it resolves the concern about canonical vs. non-canonical JSON.

@bowlofeggs

This comment has been minimized.

Copy link
Contributor

bowlofeggs commented Oct 14, 2015

@aaronlehmann That is excellent news! Any hints on what sort of time frame we can expect this change to be included in Docker? I and some others are working on integrating Pulp with Docker's v2 API, and this digest calculation is currently an issue for us. Knowing a scale of time might help inform our decision about what we should do in the meantime.

@aaronlehmann

This comment has been minimized.

Copy link
Contributor Author

aaronlehmann commented Oct 14, 2015

@rbarlow: Support for this new manifest format is planned for Docker 1.10, which will ship in a few months. I'm not sure what your compatibility plans are, but one thing I'd keep in mind is that you would need to handle the old manifest format if you want to interoperate with engines older than Docker 1.10, or registries that haven't been updated to support the new format.

I'm going to add some details on backward compatibility when I update this proposal.

@aaronlehmann aaronlehmann force-pushed the aaronlehmann:manifest-proposal branch from dae2a31 to b15e4e1 Oct 14, 2015

@aaronlehmann

This comment has been minimized.

Copy link
Contributor Author

aaronlehmann commented Oct 14, 2015

I've updated the proposal with the following changes:

  • s/mediaType/mediatype/
  • schemaVersion 2 used throughout, instead of a combination of 2 and 3
  • The label field has been replaced by a more specific way of specifying a platform. The platform key maps to an object containing fields for the architecture, OS, CPU variant, and required CPU features.
  • Added a backward compatibility section, explaining how the registry will support versions of the engine which don't support the new manifest format.

PTAL

@aaronlehmann aaronlehmann force-pushed the aaronlehmann:manifest-proposal branch 5 times, most recently from f2bc72f to c68abe3 Oct 14, 2015

@mattmoor

This comment has been minimized.

Copy link
Contributor

mattmoor commented Dec 14, 2015

@endophage opaque doesn't mean unverifiable digests. I don't need to understand a mime type to compute it's sha256. However, today that is exactly what's required to compute the sha256 of manifests. I'd much rather have the content-addressing be derived in a content-opaque fashion, regardless of what deeper understanding the registry has of the content.

@endophage

This comment has been minimized.

Copy link
Contributor

endophage commented Dec 14, 2015

@mattmoor I have no issue with that. Verification is different from addressing. I just want to be clear though that the registry needs to verify manifests during upload and provide synchronous responses when they're invalid.

@aaronlehmann

This comment has been minimized.

Copy link
Contributor Author

aaronlehmann commented Dec 14, 2015

I don't think the previous conversation mentioned verification, which is why I was confused.

Despite the philosophical discussion about how it might be better if blobs were opaque, the registry parses the manifest JSON, and would return an error to the client if it's not valid.

For the new manifest format, the only other sensible verification step I can think of would be to make sure that the other digests referenced by the manifest exist in storage. But I'm not 100% sure we should do this - there may be use cases where someone would want to upload a manifest first, and then its dependencies, or situations where eventual consistency leads to the dependencies not being immediately visible. Any thoughts?

@endophage

This comment has been minimized.

Copy link
Contributor

endophage commented Dec 14, 2015

@aaronlehmann I wish that was true but beyond side effect verification (errors attempting to deserialize JSON and digests) there is very little done at the moment. It would be beneficial to also validate that the schema version is an acceptable/known version, if all required fields for that version are present, if integer fields, like size, actually contain integers. The new format appears relatively simple so those are the only things that spring to mind at the moment.

@stevvooe

This comment has been minimized.

Copy link
Contributor

stevvooe commented Dec 14, 2015

@endophage This approach to content verification and validation on the registry side has resulted in a massively broken security situation and an inflexible distribution story. The dumber we make the registry, the more protected the daemon must become.

@harche

This comment has been minimized.

Copy link
Contributor

harche commented Dec 16, 2015

@aaronlehmann May be I am missing something, but how can someone just upload this new manifest format (Fat manifest) and then later the dependencies (I assume you meant the other manifests it references)? Won't that change the Fat manifest's digest?

IMO, this should behave like a pure content addressable storage system (something like IPFS). Users can create these Fat manifests based on existing manifest digests. If someone needs to add another reference manifest then they just have to create new Fat manifest with all the older references plus the additional one.

If one of the manifest reference in Fat manifest gets deleted, the error should occur only when user is trying to access the reference manifest through Fat manifest. This will let user know that this Fat manifest should not be used anymore because the reference manifest they are interested in is no longer available. This way no logic has to be present in registry at all and it can behave like a dumb blobstore.

@aaronlehmann

This comment has been minimized.

Copy link
Contributor Author

aaronlehmann commented Dec 16, 2015

I was only thinking that the uploads might be ordered in a way where the dependencies can't be checked immediately. This probably won't be a problem in practice. BTW, the code that's in progress for the Docker registry validates dependencies referenced by the manifest, with an exception for the proxy mode.

@harche

This comment has been minimized.

Copy link
Contributor

harche commented Dec 17, 2015

@aaronlehmann @stevvooe @estesp Should I create a separate issue to discuss how the CLI going to look like for fat manifests?

@stevvooe

This comment has been minimized.

Copy link
Contributor

stevvooe commented Dec 17, 2015

@harche Yes. This would probably make the most sense. We can have the proposal stage in the distribution project for now and dispatch to docker/docker depending on where the actual feature will land.

@aaronlehmann aaronlehmann force-pushed the aaronlehmann:manifest-proposal branch from c68abe3 to e6e2e2e Dec 18, 2015

@aaronlehmann

This comment has been minimized.

Copy link
Contributor Author

aaronlehmann commented Dec 18, 2015

I've updated this proposal to make a few minor changes:

  • Include the content type in-band in both image manifests and manifest lists. This makes it easy to distinguish between image manifests and manifests lists in case there's no content type available out-of-band.
  • Changed mediatype to mediaType, because this is the way the distribution codebase already serializes Descriptor objects, and I don't want to deviate from that. If people prefer to change this serialization (and it's safe to do so), I can change it back.

I feel like this is ready to be merged. PTAL.

I have working code that implements the formats in this proposal, which I'll open as a PR against docker/distribution once #1268 is merged.

@estesp

This comment has been minimized.

Copy link
Member

estesp commented Dec 18, 2015

Thanks @aaronlehmann ; looks good and glad to see the other PRs showing up with implementation. I'm "non-maintainer LGTM" on this one! :)

@RichardScothern

This comment has been minimized.

Copy link
Contributor

RichardScothern commented Dec 18, 2015

LGTM

On Thu, Dec 17, 2015 at 8:17 PM, Phil Estes notifications@github.com
wrote:

Thanks @aaronlehmann https://github.com/aaronlehmann ; looks good and
glad to see the other PRs showing up with implementation. I'm
"non-maintainer LGTM" on this one! :)


Reply to this email directly or view it on GitHub
#1068 (comment).

@stevvooe

This comment has been minimized.

Copy link
Contributor

stevvooe commented Dec 18, 2015

Include the content type in-band in both image manifests and manifest lists. This makes it easy to distinguish between image manifests and manifests lists in case there's no content type available out-of-band.

Content-type should always be in-band, from the trusted source. Content-type returned by registries (out-of-band) should not be trusted. We might want to say this explicitly, as this is important for security.

Changed mediatype to mediaType, because this is the way the distribution codebase already serializes Descriptor objects, and I don't want to deviate from that. If people prefer to change this serialization (and it's safe to do so), I can change it back.

Please don't. Leave as "mediatype". The fact that descriptor has camelCase is unintentional. In general, keys should be case-insensitive.

@aaronlehmann

This comment has been minimized.

Copy link
Contributor Author

aaronlehmann commented Dec 18, 2015

Please don't. Leave as "mediatype". The fact that descriptor has camelCase is unintentional. In general, keys should be case-insensitive.

Would you like me to change it to mediatype in descriptor as well? My implementation of the schema2 manifest uses distribution.Descriptor directly, and I would like to avoid forking it.

@stevvooe

This comment has been minimized.

Copy link
Contributor

stevvooe commented Dec 18, 2015

Would you like me to change it to mediatype in descriptor as well? My implementation of the schema2 manifest uses distribution.Descriptor directly, and I would like to avoid forking it

@aaronlehmann I don't think we use that anywhere, so go ahead.

@aaronlehmann aaronlehmann force-pushed the aaronlehmann:manifest-proposal branch from e6e2e2e to cf8135a Dec 18, 2015

@aaronlehmann

This comment has been minimized.

Copy link
Contributor Author

aaronlehmann commented Dec 18, 2015

Reverted the mediatype change; will change the Descriptor serialization to match in #1281.

@aaronlehmann

This comment has been minimized.

Copy link
Contributor Author

aaronlehmann commented Dec 18, 2015

@stevvooe: Changing the Descriptor serialization would change affect the notification serialization format. Are we confident enough that implementation will be case-insensitive to change the casing there?

@stevvooe

This comment has been minimized.

Copy link
Contributor

stevvooe commented Dec 18, 2015

@aaronlehmann I guess we can leave it alone and change to mediaType in this specification. We have no guarantee that consumers wrote appropriately agile code to handle case-insensitivity. Sorry for the confusion.

@aaronlehmann aaronlehmann force-pushed the aaronlehmann:manifest-proposal branch from cf8135a to e8d6dbd Dec 18, 2015

@aaronlehmann

This comment has been minimized.

Copy link
Contributor Author

aaronlehmann commented Dec 18, 2015

Changed back. No worries.

docs/spec: Proposal for new manifest format
This is a follow-on to PR #62, and it borrows much of the format
from #993, but uses specific formats for the image manifest and manifest
list (fat manifest) instead of a combined generic format.

The intent of this proposed manifest format is to allow multi-arch, and
allow for full content-addressability of images in the Docker engine.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>

@aaronlehmann aaronlehmann force-pushed the aaronlehmann:manifest-proposal branch from e8d6dbd to 2e3f493 Dec 18, 2015

@dmp42

This comment has been minimized.

Copy link
Member

dmp42 commented Dec 19, 2015

LGTM

1 similar comment
@stevvooe

This comment has been minimized.

Copy link
Contributor

stevvooe commented Dec 19, 2015

LGTM

stevvooe added a commit that referenced this pull request Dec 19, 2015

Merge pull request #1068 from aaronlehmann/manifest-proposal
docs/spec: Proposal for new manifest format

@stevvooe stevvooe merged commit ed3346e into docker:master Dec 19, 2015

2 of 3 checks passed

documentation fail 1 tests run, 0 skipped, 1 failed.
Details
ci/circleci Your tests passed on CircleCI!
Details
docker/dco-signed All commits signed
Details

@faiq faiq referenced this pull request Apr 11, 2016

Merged

fix scratch push 1.10 #137

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment