Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add platform.Feature #761

Closed
wants to merge 6 commits into from

Conversation

ChristianKniep
Copy link

As discussed on the mailing list a PR to bring platform.features back into the image spec.
Motivation stated in moby/moby#38715
Blog post providing context:

Not sure about the version bump tho - please advise how to handle that one.

@stevvooe
Copy link
Contributor

You need to update the spec and the schema. Also, I think we need to be explicit about how this field is matched.

@ChristianKniep
Copy link
Author

Ah - ok missed that. Will update. Thx

image-index.md Outdated
- CPU type (*haswell*, *broadwell*, *skylake*, *ryzen*)
- GPU [NVIDIA Compute Capability](https://developer.nvidia.com/cuda-gpus) (3.7, 6.1)
- Host name, to separate different types in a data center without flushing out all the specifics (*ComputeNode-20190213*, *storage-2019*)
By using this feature flag, a manifest list is able to provide specific images for certain hosts.
Copy link
Contributor

@jonjohnsonjr jonjohnsonjr Mar 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/a manifest list/an image index/

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonjohnsonjr changed it - thx for the correction

@ChristianKniep
Copy link
Author

@stevvooe sorry for the delay - a lot going on. re: your comment...
Do you think the matching should be discussed within the spec? Might be something TBD within the runtime? Even though I am not sure - just a gut feeling.

Copy link
Member

@cyphar cyphar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the need for something like this (and the previous method we used for CPU features had ... issues), but there are some pretty big things that should be resolved first (see my comments).

I would like to point out that some of these examples definitely feel like something you'd want to use internally within a particular company, but in their current form are quite questionable to be baked into a spec which requires interoperability between implementations and users. I would like to mention that you are allowed to add your own fields to the spec and use those if you really need them -- especially if you have a feature that is uber-specific to your usecase.

- CPU type (*haswell*, *broadwell*, *skylake*, *ryzen*)
- GPU [NVIDIA Compute Capability](https://developer.nvidia.com/cuda-gpus) (3.7, 6.1)
- Host name, to separate different types in a data center without flushing out all the specifics (*ComputeNode-20190213*, *storage-2019*)
By using this feature flag, an image index is able to provide specific images for certain hosts.
Copy link
Member

@cyphar cyphar Mar 29, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order for interoperability, we need to have either a standard set of well-understood names, or some way parsing well-structured feature names (cpu-feature:avx512 and cpu-type:haswell for instance) in order to be able to handle unknown names. I would opt for well-structured feature names and an explanation of how to check them -- and I believe this is the concern @stevvooe had.

I am very un-nerved by the hostname example, and would like to know how do we handle an image-spec implementation which wants to extract an image but doesn't have that hostname? Should it ignore the field? If so, how is the field mandatory if you can ignore it?

image-index.md Outdated
@@ -87,7 +87,11 @@ For the media type(s) that this document is compatible with, see the [matrix][ma

- **`features`** *array of strings*

This property is RESERVED for future versions of the specification.
This OPTIONAL property specifies an array of strings, each specifying a mandatory hardware/host feature. Examples are:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"mandatory" doesn't really have meaning in this spec, you need to explain what the behaviour needs to be and how a runtime should check the features.

@ChristianKniep
Copy link
Author

@cyphar Thank you for your feedback. The previous use of the feature flag tried (as I see it) to come up with an agreed on and objective description of hardware features; hence, the focus was on (discoverable) CPU flags.
My proposal is different in that it allows a site to define such feature flags as it fits their environment. That is what is meant be using flags like compute-20190213 - it provides a reference to a certain node and freezes his host installation in time, so that one can create site-specific or even only node type specific containers.
As discussed with @AkihiroSuda in Japan last month and as brought up by your comment; for public image repositories a naming scheme should be agreed on; similar to label best-practises or what you provided as an example.
Where I see this feature flag missing is when you want to create containers that rely on something that the host provides but are not controlled by the kernel. The prime example is GPUs, where the runtime (nvidia-docker) breaks the immutability of the container FS to allow a container workflow to take place.
Allowing runtimes to use the OCI image spec to the fullest would inform a discussion about what feature flags are used (/useful) for public repositories.
This features provides a solution for communities with a need for either host-specific dependencies or the need for hardware optimisation by allowing them to define their own flags and hand over the distribution decision of optimised images to the runtime.

@ChristianKniep
Copy link
Author

@cyphar / @stevvooe Any additional input on that?
As said; IMHO predefining all possible features strings is hard to impossible. And in an environment where mixed workloads exists I reckon that using a hardware identifier like host-type:HPE-Model2000 is a good start as it does not define which components of that host are used and optimised for. Workloads one might want to optimise for CPU+GPU and another one only for the CPU.

@ChristianKniep
Copy link
Author

I am addressing the issue in a different way - so the PR is obsolete.
http://www.qnib.org/2019/06/12/metahub/
https://github.com/qnib/metahub

@ChristianKniep ChristianKniep deleted the b1.1.0-feature branch June 13, 2019 07:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants