This repository has been archived by the owner on Sep 23, 2022. It is now read-only.

Questions on "Cherry Pick" 🍒 #63

Closed
vsoch opened this issue Jul 8, 2022 · 5 comments


@vsoch

vsoch commented Jul 8, 2022

Hiya! Per @jdolitsky email I went over the "Cherry Pick" 🍒 proposal and here are some quick questions:

Added annotations

So for annotations I noticed we are adding new ones:

https://github.com/opencontainers/wg-reference-types/blob/main/docs/proposals/PROPOSAL_E.md#annotations

But we aren't adding a simple one for something like "name." For a client that intends to store and retrieve, I think right now our main (only?) option is to save a custom annotation, but then it means it could be different based on the client. Would it hurt to allow adding a more official "name" annotation org.opencontainers.artifact.name to store an original basename, or a name relative to the upload root (no direct paths)? It would be akin to org.opencontainers.image.title but title is weird for an artifact, so maybe just name.

Architectures

We are adding org.opencontainers.platform.architecture - is this a scoped set? Is architecture == microarchitecture? E.g., if this is the granularity I have:

https://github.com/archspec/archspec-json/blob/51f75b644a7efd6b2c4dc1a07542856ee2ba6229/cpu/microarchitectures.json

Is that going to be supported?

Referrers

"referrers" is really hard to say, and somehow reminds me of the word "furry" because of the two r's (no idea why, lol). Why can't it just be "refers" ? Also, is it one directional?

Pagination

Is it implied there is a ?page parameter included in Link? https://github.com/opencontainers/wg-reference-types/blob/main/docs/proposals/PROPOSAL_E.md#ordering-and-pagination so someone doesn't always have to start at 0? Might be good to mention.

Tags when referrers not supported

Is this done so you can push something to a registry (that doesn't support referrers) that you might want to eventually push to one that does support it? If objects are now like a graph, how are these basic operations done without losing important nodes?

And I didn't read through all the requirements - but those should be some questions to get us started!

@sudo-bmitch
Contributor

> Hiya! Per @jdolitsky email I went over the "Cherry Pick" 🍒 proposal and here are some quick questions:

Huge thanks for giving this a review @vsoch!

> Added annotations
>
> So for annotations I noticed we are adding new ones:
>
> https://github.com/opencontainers/wg-reference-types/blob/main/docs/proposals/PROPOSAL_E.md#annotations
>
> But we aren't adding a simple one for something like "name." For a client that intends to store and retrieve, I think right now our main (only?) option is to save a custom annotation, but then it means it could be different based on the client. Would it hurt to allow adding a more official "name" annotation org.opencontainers.artifact.name to store an original basename, or a name relative to the upload root (no direct paths)? It would be akin to org.opencontainers.image.title but title is weird for an artifact, so maybe just name.

Annotations are relatively painless to add. Do you have an example with a type, name, and description? In the use cases we've been thinking of, we had the type and description, so I'm curious how you'd use the name field in addition to those. We're also assuming artifacts will probably have type-specific annotations, like a public key hash for image signing (to find signatures by someone you have in your trust store), or an SBOM format to know one is a cyclonedx-json and another is spdx-xml. If the name needs to be unique across the repository, we can also push an artifact by tag (that exists already even without this WG).

> Architectures
>
> We are adding org.opencontainers.platform.architecture - is this a scoped set? Is architecture == microarchitecture? E.g., if this is the granularity I have:
>
> https://github.com/archspec/archspec-json/blob/51f75b644a7efd6b2c4dc1a07542856ee2ba6229/cpu/microarchitectures.json
>
> Is that going to be supported?

These are based on the platform options in the config.json, so we could push for more options there if that's useful to you. Runtimes in the docker/k8s ecosystem wouldn't know how to use those, but you may have other runtimes that do. And since it's a string, there's nothing stopping you from setting your own values, you'll just want to coordinate with others that need that granularity.

> Referrers
>
> "referrers" is really hard to say, and somehow reminds me of the word "furry" because of the two r's (no idea why, lol). Why can't it just be "refers" ? Also, is it one directional?

The refers is one way: the signature refers to an existing image. Going the other way, we can already add more content into an image we are creating. Naming things is hard, and I'm trying to get some brainstorming to happen in #41.

> Pagination
>
> Is it implied there is a ?page parameter included in Link? https://github.com/opencontainers/wg-reference-types/blob/main/docs/proposals/PROPOSAL_E.md#ordering-and-pagination so someone doesn't always have to start at 0? Might be good to mention.

The Link header I believe comes from RFC5988, and the content of it is left up to registry implementations in that case. The tag listing API also uses this, but they included a way to pass the last parameter to request tags after a specific name. For the index, that might be a descriptor or digest value, but I'm not sure if that's useful (you'd need to know the digest in advance).
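To make the `last` style of paging concrete, here is a minimal client-side sketch. It assumes a tag listing response carries a Link header with a `rel="next"` target, as the existing tag listing API does. The host `registry.example.com` and the helper names `next_page_url` and `resume_token` are made up for illustration, and the regex only handles the simple case where `rel` is the first link parameter.

```python
import re
from urllib.parse import parse_qs, urljoin, urlsplit

# Matches one link-value such as: </v2/repo/tags/list?n=2&last=v2>; rel="next"
# Simplification: assumes rel is the first parameter after the target.
LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel="?([^";,]+)"?')


def next_page_url(current_url, link_header):
    """Return the absolute URL of the rel="next" link, or None on the last page."""
    for target, rel in LINK_RE.findall(link_header or ""):
        if rel.strip().lower() == "next":
            # The target may be relative, so resolve it against the request URL.
            return urljoin(current_url, target)
    return None


def resume_token(url):
    """Extract the `last` query value so a listing can be resumed later."""
    values = parse_qs(urlsplit(url).query).get("last")
    return values[0] if values else None


# Hypothetical request URL and response header, for illustration only.
current = "https://registry.example.com/v2/dinosaur/artifact/tags/list?n=2"
header = '</v2/dinosaur/artifact/tags/list?n=2&last=v2>; rel="next"'
nxt = next_page_url(current, header)
print(nxt)
print(resume_token(nxt))
```

A client that stores the `last` value can restart from that point instead of from the beginning, which only works because tag names are known and ordered. A digest-based equivalent would need the digest to be known in advance, as noted above.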

> Tags when referrers not supported
>
> Is this done so you can push something to a registry (that doesn't support referrers) that you might want to eventually push to one that does support it? If objects are now like a graph, how are these basic operations done without losing important nodes?

Exactly! Registries today don't have the new API, and I'm expecting it will be some time before not just the various SaaS registries are updated, but mainly the various self hosted registries, in large organizations, that are slow to upgrade. So we added the tag schema to allow the same image-spec artifact to be copied from registry to registry without needing any changes, it's just a different way to query it.
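A rough sketch of the "different way to query it" idea follows. It assumes the shapes that later appeared in the OCI distribution spec v1.1: a referrers endpoint at `/v2/<name>/referrers/<digest>` and a fallback tag derived from the subject digest by replacing `:` with `-`. The exact schema in the proposal under discussion may differ, and the helper names and the `api_supported` flag are hypothetical.

```python
def fallback_tag(subject_digest):
    """Derive a tag from a digest: 'sha256:abc...' -> 'sha256-abc...'."""
    algorithm, sep, encoded = subject_digest.partition(":")
    if not sep or not algorithm or not encoded:
        raise ValueError(f"not a digest: {subject_digest!r}")
    # Tags are limited to 128 characters, so truncate defensively.
    return f"{algorithm}-{encoded}"[:128]


def referrers_lookup(name, subject_digest, api_supported):
    """Return the registry path to query for artifacts that refer to a digest."""
    if api_supported:
        # Newer registry: ask the referrers endpoint directly.
        return f"/v2/{name}/referrers/{subject_digest}"
    # Older registry: fetch the index stored under the digest-derived tag.
    return f"/v2/{name}/manifests/{fallback_tag(subject_digest)}"


digest = "sha256:" + "a" * 64  # placeholder digest
print(referrers_lookup("dinosaur/artifact", digest, api_supported=True))
print(referrers_lookup("dinosaur/artifact", digest, api_supported=False))
```

Either path is expected to return the same kind of index listing the referring artifacts, so the artifact manifests themselves do not change when copied between registries.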

> And I didn't read through all the requirements - but those should be some questions to get us started!

I make it a point to stop reading there myself. :)

@vsoch
Author

vsoch commented Jul 8, 2022

> Annotations are relatively painless to add. Do you have an example with a type, name, and description?

I do! So for oras-py (and oras proper too, I suspect) when we push an artifact, we want to be able to pull it down to the same name. This is accomplished via https://github.com/oras-project/oras-py/blob/07ae3a1b46b245cb59dd390b50f3a900b07cd861/oras/defaults.py#L28-L29 here https://github.com/oras-project/oras-py/blob/dab9fb207009256b88553e8faca914412083c7ba/oras/provider.py#L563-L564. An interaction might look like this (for oras or oras-py):

$ oras-py push localhost:5000/dinosaur/artifact:v1 --insecure \
--manifest-config /dev/null:application/vnd.dinosaur.config \
./artifact.txt
Successfully pushed localhost:5000/dinosaur/artifact:v1

So I could (right now with the spec) keep a record of the media type, and then maybe I would have an annotation for a description. But where I'd run into trouble is pulling:

$ oras-py pull localhost:5000/dinosaur/artifact:v1

Since this is version 2 of me running this command later, there's no way in heck I can remember what I pushed to that URI! So what I need is to run that command and have artifact.txt appear magically, and that only works right now by hijacking this random annotation. So I think if org.opencontainers.artifact.name were made a proper annotation, then oras (and other tools like it) would consistently define the same thing at push time and be able to pull the same thing back (with agreement on the annotation). Does that make sense?
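A minimal sketch of the pull-side lookup described here, under stated assumptions: `org.opencontainers.artifact.name` is only the key proposed in this thread and is not part of any OCI spec, `org.opencontainers.image.title` is the existing image-spec key mentioned earlier, and `resolve_filename` is a hypothetical helper, not an oras or oras-py function.

```python
import posixpath

# The first key is only the name proposed in this thread (not in any spec);
# the second is the existing image-spec title annotation.
PROPOSED_NAME = "org.opencontainers.artifact.name"
EXISTING_TITLE = "org.opencontainers.image.title"


def resolve_filename(layer, fallback="artifact.bin"):
    """Pick a local filename for a pulled layer descriptor (hypothetical helper)."""
    annotations = layer.get("annotations") or {}
    name = annotations.get(PROPOSED_NAME) or annotations.get(EXISTING_TITLE)
    if not name:
        return fallback
    # Keep the name relative to the download root: normalize, then reject
    # absolute paths and anything that climbs out with "..".
    cleaned = posixpath.normpath(name.replace("\\", "/"))
    if cleaned in (".", "..") or cleaned.startswith(("/", "../")):
        return fallback
    return cleaned


layer = {
    "mediaType": "application/vnd.oci.image.layer.v1.tar",
    "digest": "sha256:" + "0" * 64,  # placeholder digest
    "size": 12,
    "annotations": {EXISTING_TITLE: "artifact.txt"},
}
print(resolve_filename(layer))
```

The path check reflects the "name relative to the upload root (no direct paths)" constraint from the original question. Without an agreed key, each client would need its own version of the lookup order in `resolve_filename`.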

> These are based on the platform options in the config.json, so we could push for more options there if that's useful to you. Runtimes in the docker/k8s ecosystem wouldn't know how to use those, but you may have other runtimes that do. And since it's a string, there's nothing stopping you from setting your own values, you'll just want to coordinate with others that need that granularity.

I do think for us loners in HPC space (where many don't join the conversations here) we need the granularity. What would be the best way to go about this? Should I try to get a group together to propose extending platform options - or what about if there was an optional org.opencontainers.artifact.microarch annotation with a more refined set of strings? I could try to get a group to work on that - I think a few folks in HPC land are working on archspec as a root that could be a source of these strings.

> Naming things is hard, and I'm trying to get some brainstorming to happen in #41.

I'll see if I can make some suggestions!

> The Link header I believe comes from RFC5988, and the content of it is left up to registry implementations in that case. The tag listing API also uses this, but they included a way to pass the last parameter to request tags after a specific name. For the index, that might be a descriptor or digest value, but I'm not sure if that's useful (you'd need to know the digest in advance).

okay so to be more explicit - if I come back later to some initial query and I ask for the Link URL ?page=10, can I do that? And if so, should the page parameter not be documented? Or are you saying "That's totally up to the registry we can't control it?" Because that is kind of terrible - imagine trying to get a listing of stuffs and you got up to page 10 and then something was killed and you want to start at 11 but because it's not clear or not supported to provide page=, you have to start over. Being someone that has used many APIs in many contexts I value this little param guy :)

> Exactly! Registries today don't have the new API, and I'm expecting it will be some time before not just the various SaaS registries are updated, but mainly the various self hosted registries, in large organizations, that are slow to upgrade. So we added the tag schema to allow the same image-spec artifact to be copied from registry to registry without needing any changes, it's just a different way to query it.

Gotcha, thank you!

@sudo-bmitch
Contributor

> > Annotations are relatively painless to add. Do you have an example with a type, name, and description?
>
> I do! So for oras-py (and oras proper too, I suspect) when we push an artifact, we want to be able to pull it down to the same name. This is accomplished via https://github.com/oras-project/oras-py/blob/07ae3a1b46b245cb59dd390b50f3a900b07cd861/oras/defaults.py#L28-L29 here https://github.com/oras-project/oras-py/blob/dab9fb207009256b88553e8faca914412083c7ba/oras/provider.py#L563-L564.

Gotcha, since it's part of the descriptor and not the top level of the manifest, I think you have your choice of annotations, including the description. The use cases we've been looking at are the top level of the manifest, since we want to pull those up for later filtering and for UIs that list artifacts in a registry. If ORAS and others want to collaborate, we could probably agree on a filename annotation that all of the tools use for this.

> > These are based on the platform options in the config.json, so we could push for more options there if that's useful to you. Runtimes in the docker/k8s ecosystem wouldn't know how to use those, but you may have other runtimes that do. And since it's a string, there's nothing stopping you from setting your own values, you'll just want to coordinate with others that need that granularity.
>
> I do think for us loners in HPC space (where many don't join the conversations here) we need the granularity. What would be the best way to go about this? Should I try to get a group together to propose extending platform options - or what about if there was an optional org.opencontainers.artifact.microarch annotation with a more refined set of strings? I could try to get a group to work on that - I think a few folks in HPC land are working on archspec as a root that could be a source of these strings.

I'd recommend getting that updated in the config.json and index.json descriptor specs, because I think this is needed for more than just artifacts, you also want your images themselves to have other platforms. This may be best handled as more platform variants and features so that we don't need to create new fields.
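One way to picture the "more platform variants" route is a client that matches a host against the `platform` object on index descriptors. The sketch below is only illustrative: the `PARENT` ancestry table and the variant strings are made-up stand-ins for data that might come from a source such as archspec, and `select_manifest` is a hypothetical helper rather than any runtime's actual selection logic.

```python
# Illustrative ancestry only: each name maps to the more generic target it
# extends. A real table would come from a source such as archspec.
PARENT = {
    "x86_64_v3": "x86_64_v2",
    "x86_64_v2": "x86_64",
    "x86_64": None,
}


def lineage(microarch):
    """Return the host microarchitecture followed by its more generic ancestors."""
    chain = []
    while microarch is not None:
        chain.append(microarch)
        microarch = PARENT.get(microarch)
    return chain


def select_manifest(index, os_name, architecture, host_microarch):
    """Pick the most specific descriptor the host can run, or None."""
    candidates = [
        d for d in index["manifests"]
        if d.get("platform", {}).get("os") == os_name
        and d["platform"].get("architecture") == architecture
    ]
    # Walk from most specific to most generic and take the first match.
    for target in lineage(host_microarch):
        for d in candidates:
            if d["platform"].get("variant") == target:
                return d
    # Fall back to an entry that does not declare a variant at all.
    for d in candidates:
        if "variant" not in d["platform"]:
            return d
    return None


index = {"manifests": [
    {"digest": "sha256:" + "a" * 64,
     "platform": {"os": "linux", "architecture": "amd64"}},
    {"digest": "sha256:" + "b" * 64,
     "platform": {"os": "linux", "architecture": "amd64", "variant": "x86_64_v2"}},
    {"digest": "sha256:" + "c" * 64,
     "platform": {"os": "linux", "architecture": "arm64"}},
]}
chosen = select_manifest(index, "linux", "amd64", "x86_64_v3")
print(chosen["digest"][:12])
```

Because the variant is a plain string, runtimes that do not understand the finer-grained values can still fall back to the entry without a variant, which is the coordination concern raised earlier in the thread.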

> > The Link header I believe comes from RFC5988, and the content of it is left up to registry implementations in that case. The tag listing API also uses this, but they included a way to pass the last parameter to request tags after a specific name. For the index, that might be a descriptor or digest value, but I'm not sure if that's useful (you'd need to know the digest in advance).
>
> okay so to be more explicit - if I come back later to some initial query and I ask for the Link URL ?page=10, can I do that? And if so, should the page parameter not be documented? Or are you saying "That's totally up to the registry we can't control it?" Because that is kind of terrible - imagine trying to get a listing of stuffs and you got up to page 10 and then something was killed and you want to start at 11 but because it's not clear or not supported to provide page=, you have to start over. Being someone that has used many APIs in many contexts I value this little param guy :)

Right now we don't have that option. I think it would terrify some registry operators, because paging support is a bit of a break-glass scenario. The hope is that, for a single image, the list of descriptors for artifacts associated with that image would fit in a single Index without breaking the 4MB limit. In a normal scenario a descriptor is about 100B, but I'll round that up to 500B for extra pulled-up annotations, which means you can get ~8,000 artifacts associated with one image before needing a second page of results.
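The estimate can be checked with a couple of lines of arithmetic. This sketch treats the 4MB limit as decimal megabytes and ignores the JSON overhead of the index itself. The constant and function names are made up for illustration.

```python
MAX_INDEX_BYTES = 4 * 1000 * 1000   # the 4MB limit, taken as decimal megabytes
BYTES_PER_DESCRIPTOR = 500          # the padded per-descriptor estimate


def descriptors_per_page(limit=MAX_INDEX_BYTES, per_entry=BYTES_PER_DESCRIPTOR):
    """How many descriptors fit in one index response, ignoring JSON overhead."""
    return limit // per_entry


print(descriptors_per_page())                # 8000
print(descriptors_per_page(per_entry=100))   # 40000
```

With the unpadded 100B figure the same limit would hold roughly five times as many descriptors, so the ~8,000 number is the conservative end of the estimate.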

@vsoch
Author

vsoch commented Jul 8, 2022

all of the above sounds good! Thanks @sudo-bmitch !

@jdolitsky
Member

Thanks @vsoch - closing for now!
