Support OCI Image spec field `Platform.features` #38715

Open
ChristianKniep opened this Issue Feb 12, 2019 · 9 comments


ChristianKniep commented Feb 12, 2019

Proposal: Support OCI Image spec platform features

Using node-specific features/resources (e.g. hardware) that require scheduling a different image on each system is not easily achieved today.

Similar to the platform decisions made for CPU architecture (arm64, amd64, ppc64le), a mechanism to express such features is proposed here. With this extension, the engine would choose an image based not only on architecture, operating system, and CPU variant, but also on arbitrary features configured per engine (node).

To implement this, the platform object already provides a features field, which might be used.

  • features: array of strings
    This property is RESERVED for future versions of the specification.

Generic example / what this could look like

  1. Use the example image manifest list on Docker Hub (qnib/plain-manifestlist)

    image: qnib/plain-manifestlist  
    manifests:  
      -
        image: qnib/plain-featuretest:test1
        platform:
          architecture: amd64
          os: linux
          features:
            - test1
      -
        image: qnib/plain-featuretest:test2
        platform:
          architecture: amd64
          os: linux
          features:
            - test2
    
  2. Start two daemons with different platform features

    Either by providing --platform-feature as a CLI flag:

    hostA $ dockerd --platform-feature=test1
    hostB $ dockerd --platform-feature=test2
    

    Or by setting platform features in daemon.json:

    hostA $ cat /.../daemon.json
    { … "platform-features": ["test1"] }
    hostB $ cat /.../daemon.json
    { … "platform-features": ["test2"] }
    
  3. Pull the image on both daemons

    hostA $ docker image pull qnib/plain-manifestlist
    hostA $ docker image inspect -f '{{.Id}}' qnib/plain-manifestlist
    sha256:ef0c26e6eccd9f20e1dde8dad63b95f5be2236cb27498895ed3888a5fee83c54
    hostB $ docker image pull qnib/plain-manifestlist
    hostB $ docker image inspect -f '{{.Id}}' qnib/plain-manifestlist
    sha256:25833f2c141b60d2107d33989e847b46eaff37a17d13e1220bea8106b0137a26
    

    Note: The two resulting container images have different IDs, as they were chosen according to the platform-features field.

Checklist / Proposed next steps

Practical application of example above

GPU case in which different CUDA drivers are installed

  • HostA: Server with CUDA90 requiring CUDA driver 390-30
    {.."platform-features": ["nvidia-390-30"],..}
  • HostB: Server with CUDA92 requiring CUDA driver 396-44
    {.."platform-features": ["nvidia-396-44"],..}
$ docker run -p 8888:8888 \
                      --device=/dev/nvidiactl --device=/dev/nvidia-uvm \
                      --device=/dev/nvidia0 qnib/cv-nccl-tf-jupyter:1.12.0

The command above would result in two different container images being pulled on HostA and HostB, each with the CUDA version matching the NVIDIA driver on the host.

thaJeztah (Member) commented Feb 13, 2019

Related; docker/cli#1200, #37434, and #37647

The TL;DR on the last one is that we need to define;

  • A common format on serializing (and un-serializing) the platform fields to/from a string (it should be possible to encode all the information into a string, and turn it back into a struct; taking into account that some fields can be optional).
    • I'm not sure if the string format has been finalised / agreed on?
  • A common matching algorithm and/or definition what MUST match, and what SHOULD match, e.g.;
    • Fail if no matching OS and Architecture was found
    • How to prioritise multiple matches (e.g. Multiple Windows versions are included in a manifest; pick the "highest" number, or "highest" lower than the actual version of the host, or ....)
  • Some of those matches should probably be pre-defined (i.e., if I pull a multi-arch/multi-os image with containerd or docker, both should give me the same variant of the image)
  • Implementations could add additional requirements (I think those would be around the "platform-features", as it is optional, and the content of those fields is not defined).
  • Prevent hard-coding matching algorithms for specific cases where possible; there was talk about a "pluggable" matcher, but the daemon configuration option can be defined flexibly enough to specify conditions.
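As a rough illustration of the MUST/SHOULD split in the list above (a sketch, not an implementation: the field names follow the OCI platform object, but the scoring weights and helper names are invented here):

```python
# Hypothetical sketch of the MUST/SHOULD matching split discussed above.
# "os" and "architecture" are hard requirements; "variant" and "features"
# only influence the preference score. The scoring weights are invented.

def match_score(host, candidate):
    """Return None if the candidate MUST be rejected, else a preference score."""
    # MUST match: fail if no matching OS and Architecture was found
    if candidate["os"] != host["os"] or candidate["architecture"] != host["architecture"]:
        return None
    score = 0
    # SHOULD match: prefer an exact variant match (e.g. ARM v6 vs. v7)
    if candidate.get("variant") == host.get("variant"):
        score += 2
    # SHOULD match: count the overlap with the host's configured features
    wanted = set(host.get("features", []))
    offered = set(candidate.get("features", []))
    score += len(wanted & offered)
    return score

def pick(host, manifests):
    """Choose the manifest-list entry with the highest score, or fail."""
    scored = [(match_score(host, m["platform"]), m) for m in manifests]
    scored = [(s, m) for s, m in scored if s is not None]
    if not scored:
        raise LookupError("no matching os/architecture in manifest list")
    return max(scored, key=lambda sm: sm[0])[1]
```

With the qnib/plain-manifestlist example from the proposal, a host configured with features: ["test1"] would pick the test1 entry, while a host with a different OS would fail outright.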

Also wondering;

  • if the --platform option on docker pull, docker run, and docker build would be used to specify matches, e.g.:
    • --platform=linux/amd64/cuda1:nvidia-390-30 to match linux/amd64 images with cuda1 and nvidia-390-30
    • --platform=linux/amd64/nvidia-390-30,nvidia-396-44 to match linux/amd64 images with either a nvidia-390-30 or a nvidia-396-44 driver
  • if we should take a structured / CSV notation daemon option (--platform features=test1)
    • also worth considering to make this a configuration-file only feature (i.e., only configurable through daemon.json)

I think the current approach looks sane, but I'm a bit wary of performing a strict match on the features (I guess we can extend this in the future); also (again, not sure if the format has been finalized), can feature fields have key/value pairs? (featureX=featureXValue or (e.g.) featureX.featureXValue)

As I mentioned, limiting to exact matches for now would make sense (but depends a bit on how well-defined the "features" field is)

For more flexible matching, some quick thinking from the top of my head;

{
  "manifestMatching": [
    {
      "__comment": "Straight matching of fields: images that support cuda1",
      "field": "platform.features",
      "value": "cuda1"
    },
    {
      "__comment": "Windows; only use LCOW on this machine",
      "field": "platform.os",
      "value": "linux"
    },
    {
      "__comment": "ARM machine; we only want the v7 variant, even though we can run v6",
      "field": "platform.variant",
      "value": "v7"
    },
    {
      "__comment": "RegEx matching; match images that support nvidia 390 (version 30 or up (max 99))",
      "field": "platform.features",
      "match": "nvidia-390-([3-9][0-9])"
    },
    {
      "__comment": "Alternatively; support named matches, to separate constraints; this one would match images with nvidia 390 (version 30 or up)",
      "field": "platform.features",
      "match": "nvidia-(?<type>[0-9]+)-(?<version>[0-9]+)",
      "constraints": ["type=390", "version>=30"]
    }
  ]
}
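A few lines of Python show how such a configuration could be evaluated (a sketch under assumptions: only = and >= constraints on named groups are supported, and the (?<name>…) group syntax is translated into Python's (?P<name>…)):

```python
import re

def rule_matches(rule, platform):
    """Evaluate one hypothetical manifestMatching rule against a platform dict."""
    field = rule["field"].split(".", 1)[1]   # "platform.features" -> "features"
    values = platform.get(field, [])
    if not isinstance(values, list):
        values = [values]
    if "value" in rule:                      # straight matching of fields
        return rule["value"] in values
    # RegEx matching; translate (?<name>...) into Python's (?P<name>...)
    pattern = re.compile(rule["match"].replace("(?<", "(?P<"))
    for v in values:
        m = pattern.fullmatch(v)
        if not m:
            continue
        ok = True
        for c in rule.get("constraints", []):
            if ">=" in c:                    # numeric lower bound, e.g. "version>=30"
                name, bound = c.split(">=")
                ok = ok and int(m.group(name)) >= int(bound)
            else:                            # exact group match, e.g. "type=390"
                name, want = c.split("=")
                ok = ok and m.group(name) == want
        if ok:
            return True
    return False
```

A real matcher would additionally need to combine multiple rules and decide how they interact with the pre-defined OS/architecture matching; this only covers a single rule.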
thaJeztah (Member) commented Feb 13, 2019

An alternative could perhaps be to not use the platform field, but to have annotations for this (similar to labels). For GPUs it's still platform(ish), but do we want to end up with, e.g., "network drivers" or "some-other-feature" in this field? I'm not super-familiar with the original intended purpose/scope of the "features" field.

ChristianKniep (Author) commented Feb 13, 2019

Thanks for your feedback @thaJeztah. Appreciated.

serializing / un-serializing

True, the format was just an idea to keep it simple. The --platform option is experimental and not going to be used in production setups, as this feature makes it an explicitly host-specific configuration. I used it for debugging purposes, to explicitly fetch images based on the platform feature.

moby or containerd

AFAIU containerd is not concerned with manifest lists; moby fetches the list, extracts the platform object, and applies the matcher to each entry in the list.
@thaJeztah Do you reckon that handling of manifest lists will move to containerd?
As explained below, IMHO pre-defined matches should stay out of a feature that provides a tool to fetch optimized images.

Exact match or sub-match (including empty match)?
(test1 && test2) vs. test1 || test2 vs. test1 || test2 || ''

IMHO the matcher should do an exact match.

  • exact match to make it deterministic and predictable
    If Ops configures a platform match, it has a purpose; to provide a deterministic outcome, the container image should be an exact match.
    Example: I compiled Tensorflow for CUDA 9.2 (nvidia-396-44) and an NVIDIA K80 (TF_CUDA_COMPUTE_CAPABILITY=3.7).
    The platform.features field holds two items: ["nvidia-396-44", "nv-compute-capability-3-7"]
    If the matcher allowed a sub-match, an image for CUDA 9.2 might be found that does not work on a K80. I described different image optimizations when compiling Tensorflow in [1].
    Otherwise the outcome is non-deterministic and hard to debug. If no exactly matching image exists, I reckon the pull should fail, so that Dev can build a new image for this platform.
  • No fallback to an image without any match
    Once this configuration is in place, the node no longer falls back silently (given the matcher does an exact match, as discussed above): docker pull ubuntu will not provide an image, as it does not match the platform features.

Also worth noting that the proposed feature fields are not populated via operating-system functions.
IMHO this feature gives users who need to be host-specific the tools to do so, in line with whatever conventions their domain has agreed on. Providing a silver-bullet feature list might be impossible.

The idea here is that one creates a generic image first; once the image is used often enough to be worth optimizing, the features list can grow so that different configurations get different images.

[1]: http://qnib.org/2019/02/12/optimized-images-for-aiml-hpc/

AkihiroSuda (Member) commented Feb 13, 2019

dockerd --platform-feature=test1

How will it relate to dockerd --label?

ChristianKniep (Author) commented Feb 13, 2019

@AkihiroSuda As engine labels are engine-specific, they might be used to describe such features. That would imply a certain key structure: say, the prefix node.platform.label includes the value of the key in the platform object to be matched.
If so, I am not sure whether such special labels lead to confusion, as they would be handled by / influence engine behavior depending on the key name.

AkihiroSuda (Member) commented Feb 14, 2019

👍 for using OCI platform, but platform.features was initially designed for CPU features, especially x86 ones such as SSE4 ( opencontainers/image-spec#622 (comment) ), and then the platform.features spec was cancelled due to its complexity: opencontainers/image-spec#672

Let's make sure to loop in OCI Image Spec maintainers for the definition of platform.features.

Perhaps the GPU spec can be better defined as a new field like platform.gpufeatures?

ChristianKniep (Author) commented Feb 14, 2019

I empathize with the original thought process for CPU features, but from a practical standpoint I reckon CPU features are hard to use: either the matcher does a SHOULD comparison and takes the image that matches the most CPU features, or it looks for an exact match. Since CPUs have a bunch of these features, the CI/CD pipeline needs to be slick and up to date to handle that.
I imagine the initial idea was also that the engine could detect the features at startup, as it does with Architecture, OS and, with some checking, Variant. At first sight that looks easy: just ask the kernel.

$ lscpu
Architecture:        x86_64
*snip*
Model name:          Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
*snip*
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca *snip*
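Pulling those flags out programmatically would indeed be trivial; a sketch that parses lscpu-style text (rather than querying the live machine):

```python
def cpu_flags(lscpu_output):
    """Extract the CPU feature flags from `lscpu`-style output."""
    for line in lscpu_output.splitlines():
        if line.startswith("Flags:"):
            return line.split(":", 1)[1].split()
    return []

# Abbreviated sample of the output shown above
sample = """Architecture:        x86_64
Model name:          Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca"""
```

Even this short excerpt yields 14 flags; a full lscpu listing has far more.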

But in practice I do not see many willing to build images with 50+ feature flags.

IMHO introducing a gpufeatures field has the problem of trying to be too specific.
You can have a TPU from Google or an FPGA; Xeons even have a device for rendering (/dev/dri/renderD128) that one might want to use. Let alone the fancy networking stuff like InfiniBand, OmniPath, Aries.
We will get lost pretty quickly if we assume too much (I'd even say: anything).

I'd rather allow Ops to use it if they want to, and to the degree they want. A hello-world case of this feature might just be 'ServerModel1234'.
Within the build process, you just make sure that the Dockerfile frontend grants --privileged execution so that the compile process can figure out the specifics; GCC does a great job of detecting this.
OK, the frontend for BuildKit needs to be created later this year (one piece at a time :) ).

AkihiroSuda (Member) commented Feb 16, 2019

But in practice I do not see many willing to build images with 50+ feature flags.

Can we ask OCI to predefine some common feature flags like "avx512" and "vmx"?

ServerModel1234 could not be predefined, but nvidia-396-44 perhaps could be.

ChristianKniep (Author) commented Feb 17, 2019

After talking to @AkihiroSuda yesterday, I posted a proposal on the opencontainers mailing list to get feedback from there as well.
https://groups.google.com/a/opencontainers.org/forum/#!topic/dev/PCvWK6rEcqE
