Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop allowing invalid manifest lists which include blobs in their list in version 3.0.0 #3452

Open
brackendawson opened this issue Jul 6, 2021 · 17 comments

Comments

@brackendawson
Copy link
Contributor

brackendawson commented Jul 6, 2021

This issue exists to document the maintainers' vote or objections on making this change, please reply with your 👍 .

Since v2.3.0 (underlying bug existed since v2.1.1), distribution incorrectly allows OCI manifest lists and fat manifests to be uploaded which reference digests uploaded as blobs rather than only those uploaded as manifests. The buildx client has made use of this oversight to upload binary build caches as blobs and then reference them in an OCI manifest list. This works in distribution registries because of a workaround set to be removed in #3365, but some of those registries will still GC the blob.

We want to merge #3365 to fix a separate 500 error on a bad request where users try to get a manifest using the digest of a blob. It was discussed what to about this knowing about this buildx behaviour on the 6th July 2021 weekly call and we think that this should be included in v3.0.0. A distribution registry upgrading to this version will break the build pipelines of its users who:

  • are using a version of buildx which makes these invalid lists.
  • have chosen to keep their build cache.
  • have chosen to keep that build cache in a registry.
  • use a registry that allows it (is distribution and dosn't already GC the blobs).

We think targeting v3.0.0 is appropriate because:

  • It is a breaking change, so should not go in v2.7.2.
  • v3.0.0 willshould come with OCI artefact support, which would be the correct way for buildx to store the cache blobs.
  • v2 already has an alternative, but still not correct, method that buildx can use in the meantime to store these caches; link the blob in an OCI manifest.
  • Leaving the bug in distribution for longer than necessary invites other clients to follow the same invalid pattern. This is of particular concern given that buildx is a docker tool and could be seen as exemplary code.
@brackendawson brackendawson changed the title Stop allowing invalid OCI manifest lists which include blobs in their list in version 3.0.0 Stop allowing invalid manifest lists which include blobs in their list in version 3.0.0 Jul 6, 2021
@deleteriousEffect
Copy link
Member

deleteriousEffect commented Jul 6, 2021

v3.0.0 will come with OCI artefact support, which would be the correct way for buildx to store the cache blobs.

I think "will" is probably a bit strong here, opencontainers/artifacts#41 is still in discussion and I don't believe there's an issue/PR in this project for supporting this yet. I believe this should be our intent for a 3.x.x release, though.

@justincormack
Copy link
Collaborator

I don't think thats a sufficient transition - just saying v3.0.0 will have OCI artefact support doesn't give all the users (not just buildx it seems) sufficient time to adapt to something that has worked for many years. Saying that "there are alternatives" doesn't give users time to write and test code, for users to upgrade and so on. It would also be good if OCI could clarify the spec.

@milosgajdos
Copy link
Member

It would also be good if OCI could clarify the spec.

This is something I would personally like to be strongly pushing for before any action is taken. This isn't the only "ambiguity" we have had to deal with when it comes to OCI spec.

Someone who got involved in this [and other] project(s) later than the spec had been discussed does not have the context of the original intentions and frankly, getting it by endless browsing of the [closed] issues and PRs solves nothing.

Getting the OCI spec clarified is the best step forward before we break more things for people than we fix.

@deleteriousEffect
Copy link
Member

deleteriousEffect commented Jul 6, 2021

I don't think thats a sufficient transition - just saying v3.0.0 will have OCI artefact support doesn't give all the users (not just buildx it seems) sufficient time to adapt to something that has worked for many years.

v3.0.0 should introduce breaking changes. I don't want to discourage adoption with dramatic changes by any means, but @joaodrp has done some investigation on support for this and out of DockerHub, ACR (Azure), GHCR (GitHub), ECR (Amazon), Artifactory, Quay, GCR (Google), and GitLab, only DockerHub, ACR, GHCR, and GitLab allow these cache images to be pushed. So these Image Indexes are already something without broad support.

@joaodrp
Copy link
Collaborator

joaodrp commented Jul 6, 2021

@brackendawson, thanks for creating this issue.

I want to share my opinion on a few things:

  • It's important to note that while it's currently possible to upload lists/indexes that reference blobs (things uploaded through the blobs endpoint and not the manifests endpoint) instead of manifests, this is not due to the intent behind the workaround to be reverted in Remove workaround from 2.1.1 for faulty 2.1.0 manifest links #3365 but rather an unwanted side effect of that workaround (at least that's my perception). And this side effect went unnoticed until now;

  • Although OCI artifacts may be the ultimate answer to some of these more relaxed use cases, we shouldn't hold on to that to fix this link/validation bug. Besides the time to finalize the spec, it may take a long time for it to be adopted by most providers. Meanwhile, we should do the best we can with what we have (enforcing the Docker schema 2 and OCI Image specs) IMO;

  • If some clients can't cope with the current specs and validations because they are not a good fit for them, I think it would be better if they did not use the Docker and/or OCI manifest media types. Otherwise, they trick registries into thinking they are dealing with a given format when they're not;

  • We're referring to the buildkit cache images specifically, but there might be other use cases that we don't know. If we keep this behavior (link/validation bug) indefinitely, the problem will only worsen. Besides the technical challenges these may pose to providers, this affects the UX (different behaviors across different providers).

@joaodrp
Copy link
Collaborator

joaodrp commented Jul 6, 2021

Regarding the buildkit/buildx issue, I've been having a long conversation about that in docker/buildx#173, starting at docker/buildx#173 (comment), so I won't extend myself too much on that.

I'd like to know if the backward compatible change that I proposed in docker/buildx#173 (comment) is viable or not though. I'm approaching this in a constructive way, not pointing fingers or trying to blame anyone. To me, this seems to be the path of least resistance to get instant compatibility with most registries (if not all) while preserving the UX.

@deleteriousEffect
Copy link
Member

It would also be good if OCI could clarify the spec

With regard to what an OCI Image Index should contain within the manifests array, I think this is clear enough to mean that these should be manifests — it links to the OCI Image Manifest Specification when it says this and it doesn't mention other kinds of objects that could be included.

I think this is reasonably narrow, it shouldn't be necessary to enumerate everything you shouldn't put there. It doesn't restrict the mediatypes of those manifests, but that does not mean this array is open to all types of objects which could possibly be associated with a descriptor.

@dmcgowan
Copy link
Collaborator

dmcgowan commented Jul 6, 2021

I voted 👎 since the goal of this implementation was always to be the superset of what is supported by OCI with stricter behavior gated by configuration. The supported types in OCI uses MUST support at least and unknown/unexpected should be gracefully handled. The returning 500 is a bug since it violates the principle of gracefully handling unknown/unexpected content, not because the content was accepted in the first place.

@joaodrp
Copy link
Collaborator

joaodrp commented Jul 6, 2021

The supported types in OCI uses MUST support at least and unknown/unexpected should be gracefully handled.

The "unknown" vs. "unexpected" distinction is key here. "unexpected" does not appear in the current OCI Image Index spec. In regards to "unknown", I believe this is related to the following paragraph:

Image indexes concerned with portability SHOULD use one of the above media types. Future versions of the spec MAY use a different media type (i.e., a new versioned format). An encountered mediaType that is unknown to the implementation MUST be ignored.

So registries should not refuse index references with media types that they don't know. However, using the buildkit indexes as an example (that's the only one I know):

{
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.index.v1+json",
    "manifests": [
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
            "digest": "sha256:136482bf81d1fa351b424ebb8c7e34d15f2c5ed3fc0b66b544b8312bda3d52d9",
            "size": 2427917,
            "annotations": {
                "buildkit/createdat": "2021-07-02T17:08:09.095229615Z",
                "containerd.io/uncompressed": "sha256:3461705ddf3646694bec0ac1cc70d5d39631ca2f4317b451c01d0f0fd0580c90"
            }
        },
        ...
        {
            "mediaType": "application/vnd.buildkit.cacheconfig.v0",
            "digest": "sha256:7aa23086ec6b7e0603a4dc72773ee13acf22cdd760613076ea53b1028d7a22d8",
            "size": 1654
        }
    ]
}

In this case, the manifests elements have a media type of application/vnd.oci.image.layer.v1.tar+gzip. This is not an unknown media type for OCI registries. It's well known, and it's used to describe an OCI layer, not a manifest.

So, the fact that we find these references within manifests is totally unexpected. But "unexpected" is not part of the spec (nor I think it should, cause providers want and need some level of predictability, and so do users if they want a consistent experience).

Regardless, reading the spec, we stop way before any mentions to "unknown" media types, on the first line, when the purpose of an index is described, and it says:

The image index is a higher-level manifest which points to specific image manifests

So if an index references anything besides manifests, it's not compliant. We don't even need to get to the point where we look or not at media types.

In this case (buildkit), this is not about the OCI spec, though. Before the OCI indexes were adopted, Docker manifest lists were being used, and their spec is even more strict:

The MIME type of the referenced object. This will generally be application/vnd.docker.distribution.manifest.v2+json, but it could also be application/vnd.docker.distribution.manifest.v1+json if the manifest list references a legacy schema-1 manifest.

So this was already a non-conformance issue before the OCI media types adoption.

@sudo-bmitch
Copy link
Contributor

sudo-bmitch commented Jul 6, 2021

What is the migration plan for clients with an immediate cutover on 3.0 (assuming artifact-spec gets approved and implemented)? Do clients need to support both and automatically detect which registry they are talking to? What about content in the registry that was already pushed as an OCI index? Will there be a migration plan for content between 2.x and 3.x registries? As a user I'd like to see a migration plan that doesn't break workflows during the migration.

Also my reading of the image-spec seems to be different from many of the distribution maintainers. I feel the OCI index entries with a media type that is not a known manifest should just be ignored. There seems to be a desire for the registry to do something with every entry in the index, which is counter to the image spec:

An encountered mediaType that is unknown to the implementation MUST be ignored.

Criticisms I've heard of my suggestion seem to fall into:

  1. The mediaType is known. I'd suggest it's not a known manifest media type, so any process that is expecting a manifest should not be expected to handle that descriptor.
  2. The blobs will get deleted by GC since they aren't referenced by a valid manifest. I see this as a problem for the client to deal with, and I say that as someone that works on client software. It's my responsibility to know what needs to be done to avoid having a blob get GC'd.

The knowledge that GC can and should happen on those blobs should be encouragement for many to avoid depending on this for their own "data storage in a registry" solutions. That should help prevent this behavior from sprawling to other registry client projects.

@deleteriousEffect
Copy link
Member

The blobs will get deleted by GC since they aren't referenced by a valid manifest. I see this as a problem for the client to deal with, and I say that as someone that works on client software. It's my responsibility to know what needs to be done to avoid having a blob get GC'd.

This strikes me as very strange. Allowing these OCI Image Indexes to be pushed up and returning a success status, then (possibly) deleting all of the referenced objects out from under it at some unknown point in the future, is really strange to me. Currently, the garbage collector doesn't delete unknown mediatypes, and I don't believe that we should update it to do so.

An encountered mediaType that is unknown to the implementation MUST be ignored.

application/vnd.oci.image.layer.v1.tar+gzip is a known mediatype, and it's reasonable to enforce the validity of OCI Image Indexes. We know this mediatype, we know the mediatype for OCI Image Indexes, and we know that mediatypes that indicate layers or config blobs should not be contained within the manifests array of an OCI Index according to the spec. Whether we do that via actually inspecting the mediatype is not relevant.

It's reasonable to expect a manifest list to be a list of manifests, and it's reasonable for us to enforce than moving forward with v3.0.0. We perform validation on manifest pushes, so there's clearly some intent in the design of distribution to reject manifests which are invalid.

What is the migration plan for clients with an immediate cutover on 3.0 (assuming artifact-spec gets approved and implemented)?

We'd need to determine all the breaking changes we're going to in 3.0.0 before we can really talk about migration plans. Personally, I think we need to deal with the existing manifests lists with layer references, regardless of whether we allow pushes of this nature in the future.

@sudo-bmitch
Copy link
Contributor

This strikes me as very strange. Allowing these OCI Image Indexes to be pushed up and returning a success status, then (possibly) deleting all of the referenced objects out from under it at some unknown point in the future, is really strange to me. Currently, the garbage collector doesn't delete unknown mediatypes, and I don't believe that we should update it to do so.

It's the same behavior that already exists today if we upload a blob and don't upload an associated manifest. The blob upload succeeds, but the GC can remove that blob in the future. It's my responsibility as a client to ensure that blob has a valid reference to avoid the GC. To me this is an expected behavior.

I also hesitate to use GC behavior as a leading reason to decide what to accept for push since GC doesn't run automatically in distribution/distribution. Without any GC, this ugly hack just works.

application/vnd.oci.image.layer.v1.tar+gzip is a known mediatype

It is, but it's not known to the tooling that is looking for manifests. My concern with enforcing validity on the mediaType field of an index entry is that we may be breaking future use cases in ways the spec described as a point of extensibility.

I'd also like to see this fixed with artifact-spec first, rather than focusing on blocking the existing use cases first. Otherwise we are saying "don't do that" without a better option of what should be done.

@joaodrp
Copy link
Collaborator

joaodrp commented Jul 7, 2021

I'd also like to see this fixed with artifact-spec first, rather than focusing on blocking the existing use cases first. Otherwise we are saying "don't do that" without a better option of what should be done.

I don't think this is the case. Let's look at a few examples.

Helm

{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:31fb454efb3c69fafe53672598006790122269a1b3b458607dbe106aba7059ef",
      "size": 354,
      "annotations": {
        "org.opencontainers.image.ref.name": "localhost:5000/myrepo/mychart:2.7.0"
      }
    }
  ]
}
{
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.cncf.helm.config.v1+json",
    "digest": "sha256:8ec7c0f2f6860037c19b54c3cfbab48d9b4b21b485a93d87b64690fdb68c2111",
    "size": 117
  },
  "layers": [
    {
      "mediaType": "application/tar+gzip",
      "digest": "sha256:1b251d38cfe948dfc0a5745b7af5ca574ecb61e52aed10b19039db39af6e1617",
      "size": 2487
    }
  ]
}

WASM

{
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.wasm.config.v1+json",
    "digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
    "size": 2
  },
  "layers": [
    {
      "mediaType": "application/vnd.wasm.content.layer.v1+wasm",
      "digest": "sha256:4c7915b4c1f9b0c13f962998e4199ceb00db39a4a7fa4554f40ae0bed83d9510",
      "size": 1624962
    }
  ]
}

Homebrew

{
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:6324fea418ad479592ef3c21dafa2a6ffc3188d92426420f6257f5b32a7c0841",
    "size": 225
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:4f6bf2e51a4952f4cc83760c1c732a5b0742ab693e977ae88df71f655b9dee7a",
      "size": 16236572,
      "annotations": {
        "org.opencontainers.image.title": "rclone--1.55.1.catalina.bottle.tar.gz"
      }
    }
  ],
  "annotations": {
    ...
  }
}

None of these are container images or had to wait for an artifacts spec, and none are using an index to aggregate non-manifest references. I think we'd really like to understand why others can't do the same and the technical reasons behind it.

@sudo-bmitch
Copy link
Contributor

None of these are container images or had to wait for an artifacts spec, and none are using an index to aggregate non-manifest references. I think we'd really like to understand why others can't do the same and the technical reasons behind it.

The examples there are all for a single layer. It sounds like the concerns from the buildx maintainers were that they would violate the manifest definition of layers with a multi-layer manifest:

The array MUST have the base layer at index 0. Subsequent layers MUST then follow in stack order (i.e. from layers[0] to layers[len(layers)-1]). The final filesystem layout MUST match the result of applying the layers to an empty directory.

I think sorting this one out will be best handled on the OCI side by clarifying the specs, and then distribution/distribution can follow the result. Today's OCI call discussed it, and it's included on their agenda for next week to give more people time to review the issue and attend. Some key discussion points were:

  • How do they define a "manifest" in the index definition? They don't want to link to a single example like the OCI image manifest. The definition should include a variety of possible manifest types and need to handle future compatibility. (This does not mean that every registry is expected to support every manifest media type.) For example, note the application/xml mediaType in the index example.
  • When using a descriptor, how does a client know when to use the manifest or the blob API paths to the registry? Is each client expected to maintain a list of valid media types for manifests and use blob APIs for everything else? This is the same question a registry faces when performing GC.
  • When words like "must be ignored" are used, is that a direction to clients/runtimes, the registry server, or both?

@joaodrp
Copy link
Collaborator

joaodrp commented Jul 13, 2021

I raised an issue to discuss the Buildkit use case: moby/buildkit#2251

@SteveLasker
Copy link
Collaborator

I'm a 👍 to solve this in a more consistent way, such as (moby/buildkit#2251). That said, those that run instances of this project shouldn't be blocked, so I'm torn in not wanting to suggest gitlab or others running distribution instances shouldn't find a way to meet their customers needs and driving a consistent solution.
In all fairness, ACR did enable buildx support, not because we agreed with the approach. Rather, our ACR Tasks BuildX support started breaking as the cache layers were being GCd and customers opened support tickets.
Had I realized the issue at the time, I would have engaged the buildx team to see how we could solve it then.
It just so happens our ACR eng team is super customer focused, so they just "fixed it" and it was a passive comment that I'm only now realizing the implication.

1 similar comment
@SteveLasker
Copy link
Collaborator

I'm a 👍 to solve this in a more consistent way, such as (moby/buildkit#2251). That said, those that run instances of this project shouldn't be blocked, so I'm torn in not wanting to suggest gitlab or others running distribution instances shouldn't find a way to meet their customers needs and driving a consistent solution.
In all fairness, ACR did enable buildx support, not because we agreed with the approach. Rather, our ACR Tasks BuildX support started breaking as the cache layers were being GCd and customers opened support tickets.
Had I realized the issue at the time, I would have engaged the buildx team to see how we could solve it then.
It just so happens our ACR eng team is super customer focused, so they just "fixed it" and it was a passive comment that I'm only now realizing the implication.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants