This repository has been archived by the owner on Sep 23, 2022. It is now read-only.

Issues with Proposal E #47

Closed
chris-crone opened this issue May 16, 2022 · 7 comments

@chris-crone

As discussed on the working group Slack channel, I have some concerns with the current preferred proposal, Proposal E.

Things I'm concerned about in Proposal E (C1-3):

  1. Adding another layer of object referencing (in addition to the existing OCI index/manifest list) to registries complicates the data model and GC.
  2. Annotation filtering is an interesting feature (outside of just this work) but will be non-trivial for registries with large numbers of objects to add. I'd also be for doing this more generically rather than specifically for references.
  3. The fallback of using tags as references will cause clutter on registries and require client-side cleanup, since GCs currently preserve tagged items. Registries often have a per-repo tag limit (e.g., ECR limits to 10k).

Questions I have (Q1-2):

  1. How would one move an artifact and all its metadata from one registry to another?
  2. What happens if the ecosystem does not broadly adopt the proposal and we're stuck in the fallback case?

For concerns (C1) and (C2), I think we need registry operators to chime in.

At a high level, I'm worried that requiring major changes to (properly) implement references will at best take a very long time to be adopted and at worst not be adopted broadly by the ecosystem.

@sudo-bmitch
Contributor

Questions I have (Q1-2):

  1. How would one move an artifact and all its metadata from one registry to another?

I'm doing this today with the tag-based solutions in regclient, and extending that to support the API should be rather straightforward. For every manifest copied, there's a check to see whether any tags reference it and also need to be copied. It's a recursive process.

  2. What happens if the ecosystem does not broadly adopt the proposal and we're stuck in the fallback case?

I'm hoping it's a bit of carrot and stick: registries see the value of supplying the feature to users, and users insist on not doing this by tag, which together motivate registries to eventually adopt it. I am realistic in suspecting that many registries, especially self-hosted ones, will use the fallback solution for many years to come.

@imjasonh
Member

Hey Chris, thanks for your comments! Safely rolling this out to registry implementations and clients has been a large and ongoing topic in the group even since before its inception.

To answer the questions:

  1. How would one move an artifact and all its metadata from one registry to another?

As an example, let's say I'm moving registry.biz/my-user/my-app. While the registry only supports the fallback mode, a client recursively moving attachments would:

  1. resolve that reference to its digest registry.biz/my-user/my-app@sha256:abcdef
  2. list all tags for registry.biz/my-user/my-app and find any that match the form :sha256-abcdef.sbom (or :sha256-abcdef.0f0f0f.sbom using Prop E's form)
  3. to find any attachments-on-attachments-etc, also resolve digests and list tags to find things attached to attachments, recursively until the tree is exhaustively discovered
  4. copy all found images to the destination repository or registry
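
The fallback discovery steps above can be sketched roughly as follows. This is only an illustration, not how any particular client implements it: the repository is modeled as an in-memory `tags` mapping (tag name to digest) standing in for the tag-list API, the function names are hypothetical, and the shortened digests are placeholders in the same spirit as `sha256:abcdef` above.

```python
def digest_tag_prefix(digest: str) -> str:
    """Turn 'sha256:abcdef' into the tag prefix 'sha256-abcdef'."""
    return digest.replace(":", "-", 1)


def discover_attachments(tags: dict[str, str], root_digest: str) -> list[str]:
    """Return digests of everything attached to root_digest, recursively.

    `tags` is a hypothetical stand-in for "list all tags and resolve each
    one to its digest"; a real client would call the registry instead.
    """
    found: list[str] = []
    seen = {root_digest}
    queue = [root_digest]
    while queue:
        subject = queue.pop(0)
        # Matches both 'sha256-abcdef.sbom' and 'sha256-abcdef.0f0f0f.sbom'.
        prefix = digest_tag_prefix(subject) + "."
        for tag in sorted(tags):
            if not tag.startswith(prefix):
                continue
            target = tags[tag]
            if target not in seen:
                seen.add(target)
                found.append(target)
                queue.append(target)  # look for attachments on attachments
    return found


example_tags = {
    "latest": "sha256:abcdef",
    "sha256-abcdef.sbom": "sha256:111111",
    "sha256-abcdef.sig": "sha256:222222",
    "sha256-111111.sig": "sha256:333333",  # a signature on the SBOM
    "unrelated": "sha256:999999",
}
to_copy = ["sha256:abcdef"] + discover_attachments(example_tags, "sha256:abcdef")
```

Here `to_copy` ends up holding the image plus three attachments; the `unrelated` tag is never visited. Note that every level of the tree costs another pass over the full tag list.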

Under the non-fallback case, the tag list and filter would be replaced by a trip to the new /referrers API endpoint, which would return descriptors of attachments. Clients would still have to recursively call /referrers on attachments if they wanted to exhaustively discover the full tree.
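
A minimal sketch of that client-side recursion, under stated assumptions: `list_referrers` is a hypothetical callable standing in for one trip to `/referrers` for a given digest, and the descriptor shape (a dict with at least a `digest` key) is assumed for illustration rather than taken from any proposal text.

```python
from typing import Callable

# Assumed descriptor shape; only the "digest" key is relied on below.
Descriptor = dict[str, str]


def walk_referrers(
    list_referrers: Callable[[str], list[Descriptor]],
    root_digest: str,
) -> list[Descriptor]:
    """Depth-first walk of everything that refers to root_digest."""
    seen = {root_digest}
    result: list[Descriptor] = []

    def visit(subject: str) -> None:
        for desc in list_referrers(subject):
            digest = desc["digest"]
            if digest in seen:
                continue  # guard against revisiting the same attachment
            seen.add(digest)
            result.append(desc)
            visit(digest)  # attachments can have attachments of their own

    visit(root_digest)
    return result


# In-memory fake of the endpoint, for illustration only.
FAKE_REFERRERS = {
    "sha256:abcdef": [
        {"digest": "sha256:111111", "artifactType": "sbom"},
        {"digest": "sha256:222222", "artifactType": "signature"},
    ],
    "sha256:111111": [
        {"digest": "sha256:333333", "artifactType": "signature"},
    ],
}


def fake_list_referrers(digest: str) -> list[Descriptor]:
    return FAKE_REFERRERS.get(digest, [])


tree = walk_referrers(fake_list_referrers, "sha256:abcdef")
```

Compared with the tag-based fallback, each step asks the registry only about one subject instead of scanning every tag in the repository.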

Some versions of some proposals have proposed adding recursiveness to the /referrers API from the outset, but I think in general this hasn't made it to the latest versions of various proposals.

There have also been discussions about filtering /referrers so that you can discover only signatures, or only attestations, etc., but for the most part we've been trying to defer this to future proposals, since it becomes a pretty large topic.

  2. What happens if the ecosystem does not broadly adopt the proposal and we're stuck in the fallback case?

To me, this is exactly why I think it's so important that we have a fallback case. Some proposals have basically required registries to adopt the new APIs in order to get this new behavior at all, which I think is just a non-starter for roughly the reasons you describe, around slow uptake.

If, N years from now, almost no registries have adopted the new APIs, then we're roughly in the same spot we're in today, with folks like cosign bolting the tagging convention onto unchanged APIs. The benefit, at least, would be a more rigorous and thorough specification for the tagging convention, instead of just copying whatever cosign made up one day.

To @sudo-bmitch's point, as attaching things to things takes off, registry operators may find that they can avoid a lot of unnecessary tag list API calls by implementing the new APIs. Many registry instances will never update though, and that's why fallback is so important.

@chris-crone
Author

chris-crone commented May 16, 2022

@sudo-bmitch @imjasonh on copying across registries (Q1): This is less elegant than using an OCI index (Proposal D) that you can easily copy with existing tools. You could also write the OCI index out to disk using the OCI layout for air gapped use. This is another example of how fitting into the existing data model is beneficial.

@sudo-bmitch @imjasonh on adoption (Q2): I vote more carrot and less stick 😄 Unfortunately, I'm more pessimistic than you: I think the fallback risks becoming the de facto standard, with some fragmentation as some operators implement some parts of Proposal E.

There are definitely parts of the OCI spec that I would love to change (looking at you, digests computed on compressed content). The reality is that the image spec is difficult to evolve in a backward-compatible way except by using OCI indexes or manifest lists.

@sudo-bmitch
Contributor

on copying across registries (Q1): This is less elegant than using an OCI index (Proposal D) that you can easily copy with existing tools. You could also write the OCI index out to disk using the OCI layout for air gapped use. This is another example of how fitting into the existing data model is beneficial.

We're seeing cosign use digest tags today, so existing tools that don't support this will end up stripping the signature off the image when copying it. That's not actually horrible to me: if the tooling doing the mirroring doesn't understand signatures, there's a good chance the tooling using the image on the mirror doesn't either. But if you want to copy an image with metadata, use a metadata-aware image mirroring tool.

For the OCI layout, I've been pushing the other manifests in the layout with their digest tags, same as if I were copying between repositories that don't support the API.
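
One plausible reading of this, sketched under assumptions: in an OCI image layout, tags are recorded in `index.json` through the `org.opencontainers.image.ref.name` annotation, so a digest tag can be written there like any other tag. The helper name, digests, and sizes below are hypothetical placeholders, and this is not taken from regclient's code.

```python
import json

REF_NAME = "org.opencontainers.image.ref.name"
MANIFEST_TYPE = "application/vnd.oci.image.manifest.v1+json"


def layout_index(entries: list[tuple[str, str, int]]) -> dict:
    """Build an index.json body from (tag, digest, size) tuples."""
    return {
        "schemaVersion": 2,
        "manifests": [
            {
                "mediaType": MANIFEST_TYPE,
                "digest": digest,
                "size": size,
                "annotations": {REF_NAME: tag},
            }
            for tag, digest, size in entries
        ],
    }


index = layout_index([
    ("latest", "sha256:abcdef", 1234),              # the image itself
    ("sha256-abcdef.sbom", "sha256:111111", 567),   # attachment via digest tag
])
index_json = json.dumps(index, indent=2)
```

A tool that later reads the layout can then apply the same tag-matching logic it would use against a registry that lacks the new API.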

@errordeveloper

errordeveloper commented May 17, 2022

But if you want to copy an image with metadata, use a metadata aware image mirroring tool.

That makes the proposed solution a lot less user-friendly, as not many users are aware of what that means, and they shouldn't have to be. In my understanding, the primary use cases are around signatures and other security-related metadata, not just supplementary metadata, in which case I'd think that security-for-all would be a great goal. What do folks think?

@imjasonh
Member

That makes the proposed solution a lot less user-friendly, as not many users are aware of what that means, and they shouldn't have to be. In my understanding, the primary use cases are around signatures and other security-related metadata, not just supplementary metadata, in which case I'd think that security-for-all would be a great goal. What do folks think?

It's hard to disagree with that sentiment, and I don't think anybody here would. 😄

Stuffing signatures into an index with the built image(s) would definitely work for signatures attached at build-time, and the built artifact could even be naively moved across registries using some* existing tools. But we still have the problem of appending more information to it, or updating information included in the index, without modifying the index's digest. That's fundamentally why we chose to explore moving attachments outside the referred-to thing. Once attachments are outside, tooling to move things has to expand to know about how to discover and move attachments.

*Notably, docker pull <src> && docker tag <src> <dst> && docker push <dst> still wouldn't naively work with this, unless the docker CLI learned to include attached signatures included in indexes when pulling/pushing. Today it would only pull the matching platform-specific image, and drop all the rest.
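
The digest problem with embedding attachments in the index can be illustrated with a small sketch. A manifest's digest is a hash over its bytes, so adding a signature entry to an index necessarily yields a different digest. The index below is heavily simplified, the `example.role` annotation is made up, and serializing with sorted keys is an assumption made here only to get stable bytes.

```python
import hashlib
import json


def digest_of(manifest: dict) -> str:
    """Hash a deterministic JSON encoding of the manifest."""
    payload = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()
    return "sha256:" + hashlib.sha256(payload).hexdigest()


index = {
    "schemaVersion": 2,
    "manifests": [
        {"digest": "sha256:abcdef", "platform": {"os": "linux", "architecture": "amd64"}},
    ],
}
before = digest_of(index)

# Attach a signature after the build by adding it to the same index.
index["manifests"].append(
    {"digest": "sha256:222222", "annotations": {"example.role": "signature"}}
)
after = digest_of(index)

# Anything that pinned the index by `before` no longer points at this content.
digest_changed = before != after
```

Keeping attachments outside the referred-to object leaves `before` stable, at the cost of tooling having to discover them separately.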

@jdolitsky
Member

Closing, since these discussions resulted in Proposal F.
