conversion: add document about image -> runtime configuration #492
vbatts merged 1 commit into opencontainers:master from cyphar:479-define-conversion-to-runtime-spec
wking
left a comment
There was a problem hiding this comment.
Do we want to specify defaults for runtime-spec fields not set in the image? E.g. in a converted Linux config.json, should linux.namespaces be set?
Also, do we have version requirements for the generated config.json? Or is the converter free to target any tagged runtime-spec release?
conversion.md
Outdated
There was a problem hiding this comment.
“components of the orthogonal components of the extraction” → “orthogonal components”?
conversion.md
Outdated
There was a problem hiding this comment.
From POSIX:
There is no meaning associated with the order of strings in the environment. If more than one string in an environment of a process has the same name, the consequences are undefined.
So I think we should replace the append/prepend rule with one about not duplicating names.
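One way to read the "no duplicate names" rule is sketched below. This is only an illustration, not wording from the spec: `merge_env` and its parameters are hypothetical names, and letting the image's `Config.Env` entry win over a converter-added entry is an assumption made for the example.

```python
def merge_env(config_env, extra_env):
    """Return Config.Env plus extra entries whose names are not already set.

    Both arguments are lists of "NAME=value" strings. Entries from
    config_env are kept untouched; an extra entry is dropped when its
    name is already present (assumption: the image value wins).
    """
    def name_of(entry):
        return entry.split("=", 1)[0]

    seen = {name_of(entry) for entry in config_env}
    merged = list(config_env)
    for entry in extra_env:
        if name_of(entry) not in seen:
            merged.append(entry)
            seen.add(name_of(entry))
    return merged
```

With this shape, no name ever appears twice in the result, so the relative order of entries carries no meaning.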
conversion.md
Outdated
There was a problem hiding this comment.
Unless we have notes entries, I'd rather drop the column. You have notes on this conversion below, so maybe add [notes](#config-user) or similar to the notes column here?
conversion.md
Outdated
There was a problem hiding this comment.
The config docs allow both numeric and named user/group values, and only mention /etc/passwd in the context of default groups. So I think we need to weaken this line to say “MAY involve parsing /etc/passwd and /etc/group” to allow for:
- Configs using numeric uid/gids (where there is no need for a lookup)
- Agents using other NSS backends (e.g. LDAP), since the config spec doesn't require a particular NSS backend.
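A rough sketch of the lookup being discussed, assuming the `user[:group]` form of `Config.User` and plain-text `/etc/passwd` and `/etc/group` contents taken from the container root filesystem. The function names are hypothetical, other NSS backends are not modeled, an empty `Config.User` is not handled, and the fallback to gid 0 for a numeric uid with no passwd entry is an assumption for illustration only.

```python
def parse_passwd(text):
    """Map user name -> (uid, gid) from /etc/passwd-style text."""
    users = {}
    for line in text.splitlines():
        fields = line.split(":")
        if len(fields) >= 4 and not line.startswith("#"):
            users[fields[0]] = (int(fields[2]), int(fields[3]))
    return users


def parse_group(text):
    """Map group name -> gid from /etc/group-style text."""
    groups = {}
    for line in text.splitlines():
        fields = line.split(":")
        if len(fields) >= 3 and not line.startswith("#"):
            groups[fields[0]] = int(fields[2])
    return groups


def resolve_user(config_user, passwd_text="", group_text=""):
    """Resolve a non-empty Config.User value to a (uid, gid) pair."""
    user, _, group = config_user.partition(":")
    users = parse_passwd(passwd_text)
    if user.isdigit():
        uid = int(user)
        # Assumption: use the passwd entry's gid if one exists, else 0.
        default_gid = next((g for u, g in users.values() if u == uid), 0)
    else:
        uid, default_gid = users[user]  # named user: a lookup is required
    if not group:
        gid = default_gid
    elif group.isdigit():
        gid = int(group)  # numeric group: no lookup needed
    else:
        gid = parse_group(group_text)[group]
    return uid, gid
```

A fully numeric value such as `1000:1000` never touches either file, which is the case the "MAY involve parsing" wording is meant to allow.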
conversion.md
Outdated
There was a problem hiding this comment.
Probably “corresponding to the user” → “corresponding to the group”.
There was a problem hiding this comment.
No, because on Linux additionalGids is defined by group association of a user. Maybe "corresponding to the value in Config.User" if you want to make it more generic.
There was a problem hiding this comment.
Ah, right. Maybe there is wording that makes “this is an NSS lookup in the container context” more obvious, so folks don't trip into the same hole I just did.
conversion.md
Outdated
There was a problem hiding this comment.
This sentence is incomplete.
conversion.md
Outdated
There was a problem hiding this comment.
Do we want to SHOULD an annotations key for this information?
There was a problem hiding this comment.
Maybe, but since we haven't defined such an annotation in the org.opencontainers. namespace I'm a bit iffy about it.
conversion.md
Outdated
There was a problem hiding this comment.
“include” → “included”.
And “SHOULD NOT allow volumes to be included in the newly created layers” sounds like it may belong here, and not in the conversion spec.
conversion.md
Outdated
There was a problem hiding this comment.
Why this ordering?
To tell you the truth, this is why I have been against adding annotations everywhere, especially when there aren't easy ways to distinguish runtime annotations/labels from those inherited from the source image.
To tell you the truth, this should probably be the opposite order. Manifests and Manifest-List tend not to survive to the runtime portion and are more oriented towards distribution.
There was a problem hiding this comment.
The ordering was based on what I thought made sense (the different blob layers are like onions, with each descriptor dereference going down a layer). But I wasn't sure why there were multi-level annotations in the first place (I assumed it was historical).
The other reason is that it means that if you share the same config between different manifests it's possible to override the config's annotations without needing to modify it -- if the ordering was in the reverse order then you wouldn't be able to do anything similar (you'd have to store two copies of the config).
There was a problem hiding this comment.
To tell you the truth, this should probably be the opposite order. Manifests and Manifest-List tend not to survive to the runtime portion and are more oriented towards distribution.
These are in order of ascending precedence, and while that's a somewhat unusual way to list precedence, it does mean that Config.Labels has the highest precedence. I think that's what you're asking for too.
There was a problem hiding this comment.
The other reason is that it means that if you share the same config between different manifests it's possible to override the config's annotations without needing to modify it…
Heh, so I read this completely wrong. Whatever we settle on, I suggest a clarifying example to avoid future confusion like mine.
There was a problem hiding this comment.
I actually meant descending order when writing this. But we should figure out what order makes sense first. I'd also understand if we don't include the annotations from manifests at all (which would be an even better idea because it would mean that the conversion is entirely done with v1.Image and doesn't require knowing what the manifest was).
That would make the distinction more clear between distribution and extraction/runtime.
There was a problem hiding this comment.
I'd also understand if we don't include the annotations from manifests at all (which would be an even better idea because it would mean that the conversion is entirely done with v1.Image and doesn't require knowing what the manifest was).
This sounds reasonable to me.
@stevvooe Can you take another look? I've switched around the wording and changed my annotation conversion description so that only
@cyphar Could you update this to reflect the removal of
Yup, it's on my list of things to do. If I get the chance I'll do it this weekend or on Monday if I'm too busy.
@stevvooe Updated.
conversion.md
Outdated
There was a problem hiding this comment.
runtime-spec has cut an rc3.
conversion.md
Outdated
There was a problem hiding this comment.
In the verbatim section, you use annotations as the runtime field in cases like this.
conversion.md
Outdated
There was a problem hiding this comment.
Maybe "... annotation to the JSON serialization of the ExposedPorts value"? And copy footnote 1 about precedence from the verbatim table. Then we could drop the ExposedPorts section below without losing specificity.
There was a problem hiding this comment.
Scratch this, because the JSON for ExposedPorts is wonky. Your current handling is fine, although copying down your precedence footnote is probably still a good idea for consistency.
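For context on why the raw JSON is awkward: `Config.ExposedPorts` is an object whose keys are the ports and whose values are empty objects, and the PR describes the annotation as a comma-separated list of those keys. A minimal sketch of that flattening follows. The function names are hypothetical, the annotation key is passed in by the caller because its spelling differs between revisions quoted in this thread, and sorting the keys is an assumption made only to get deterministic output.

```python
def exposed_ports_value(exposed_ports):
    """Join the keys of Config.ExposedPorts into one comma-separated string."""
    return ",".join(sorted(exposed_ports))


def with_exposed_ports(annotations, exposed_ports, key):
    """Return a copy of annotations with the exposed-ports hint added.

    Nothing is added when the image declares no exposed ports.
    """
    result = dict(annotations)
    if exposed_ports:
        result[key] = exposed_ports_value(exposed_ports)
    return result
```

For example, `{"8080/tcp": {}, "53/udp": {}}` becomes the single string `53/udp,8080/tcp`.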
conversion.md
Outdated
There was a problem hiding this comment.
"an image" -> "the image" to match the preceding sentence and be more accurate.
jonboulle
left a comment
There was a problem hiding this comment.
at a first pass this looks good!
Can you all PTAL? Are there any outstanding issues I need to fix up? /ping @opencontainers/image-spec-maintainers
wking
left a comment
There was a problem hiding this comment.
I think this still needs wording around the runtime-config properties which aren't related to the image config. Are they forbidden? Unspecified? Implementation-defined? For any fields not specified here, I prefer implementation-defined.
conversion.md
Outdated
There was a problem hiding this comment.
"provide a hint to the runtime what the set of container exposed ports are by using" -> "set"?
conversion.md
Outdated
There was a problem hiding this comment.
Can we just disallow NSS? Is this happening anywhere today?
There was a problem hiding this comment.
I don't see a compelling reason to disallow NSS. The point of this section is to say that the mapping of Config.User -> {uid, gid} is implementation defined, with some hints about how it should be done.
conversion.md
Outdated
There was a problem hiding this comment.
implementation defined -> implementation-defined
conversion.md
Outdated
There was a problem hiding this comment.
This doesn't sound right. Config.User should have a colon-separated user and primary group. If there is no colon, the primary group is the user id. I don't think there is any support for supplementary (somehow, these were NIH'd to "additional") in the image format.
I am not sure why runtime-spec diverged from this. Having a separate field doesn't really provide much and makes it clear what is primary version. For example, in swarmkit, we have User and Groups, where user can be user or user:group and Groups is always supplementary.
I am not sure about the utility of supplementary groups in the image format.
There was a problem hiding this comment.
I don't think there is any support for supplementary (somehow, these were NIH'd to "additional") in the image format.
The config doesn't support it directly, which is why the following line suggests a way to extract this from the root filesystem or other source (“…MAY involve resolution through NSS or parsing /etc/group…”). Resolving this via the root filesystem is the same type of thing that this conversion doc suggests for non-numeric users and groups, although @jonboulle recently questioned the usefulness of generic NSS vs. requiring /etc/passwd / /etc/group.
There was a problem hiding this comment.
@stevvooe To clarify, "corresponding to the user" refers to the concept of a user in the image as opposed to the user string. The utility is so that groups like video, dialout and audio all still work inside containers.
I actually completely agree with what you said in your first paragraph. The point is that the resolution of supplementary groups (either through NSS or /etc/group) is recommended.
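As an illustration of the `/etc/group` route (not the NSS one), supplementary groups for a user can be collected from the member lists in the container's own group file. This is a sketch with a hypothetical function name; it assumes the standard `name:password:gid:members` line format and returns the gids sorted, which is a choice made here for determinism.

```python
def additional_gids(user_name, group_text):
    """Collect gids of every group that lists user_name as a member.

    group_text is the content of the container's /etc/group file.
    """
    gids = []
    for line in group_text.splitlines():
        fields = line.strip().split(":")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        members = [m for m in fields[3].split(",") if m]
        if user_name in members:
            gids.append(int(fields[2]))
    return sorted(gids)
```

This is how memberships such as `video`, `dialout`, and `audio` would carry over into `additionalGids` for the user named in `Config.User`.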
conversion.md
Outdated
There was a problem hiding this comment.
Let's separate the active fields from the annotation fields.
Also, we need to be careful about the impact of picking up image labels into runtime annotations. I know I am having trouble convincing everyone this isn't a good idea, but it is a security hole waiting to happen.
conversion.md
Outdated
There was a problem hiding this comment.
I'll change it to Unix-like
conversion.md
Outdated
There was a problem hiding this comment.
The Config.Volumes field doesn't necessarily require a mount. That is really an implementation detail.
Config.Volumes really just marks a section of the file system as not part of the primary container filesystem.
There was a problem hiding this comment.
Okay, but is there a way of phrasing it in terms of the runtime specification to guarantee this? Maybe there is another way of pasting together a filesystem, but I'm having trouble thinking of one at the moment.
There was a problem hiding this comment.
I don't have a good suggestion here. The volume really acts as almost a "mask" of the filesystem. The implementation of the mask can vary.
conversion.md
Outdated
There was a problem hiding this comment.
MUST is too strong here when taking user input for entrypoint and args.
There was a problem hiding this comment.
My main issue with entrypoint and cmd is that it's not really clear anywhere (outside of the Docker documentation) what their individual purposes are. The semantics are also quite hazy if you read config.md (and I think there are some hard-to-parse sentences there too).
There was a problem hiding this comment.
@cyphar Entrypoint and Cmd are indeed a bit of a mess. We've shored this up slightly in the docker documentation.
If you can find a way to word it such that implementations may substitute either Entrypoint or Cmd with user input, that would shore up the problem with this statement.
There was a problem hiding this comment.
If you can find a way to word it such that implementations may substitute either Entrypoint or Cmd with user input, that would shore up the problem with this statement.
Without an API spec to pin down “user input”, I think this is going to be hard to do. Can the user-input additions be a follow-up step? So:
image config →{this translation}→ runtime config →{user input}→ runtime config
Then whether the translator handles this translation correctly can be validated without worrying about user input, and whether the user-input injection happens correctly is up to whoever specified the user-input API.
There was a problem hiding this comment.
If you need to know where the break between the entrypoint and command is, you can turn it around and have:
image config →{user input}→ image config →{this translation}→ runtime config
Then @cyphar's current append wording matches what is listed here for the case where both are arrays of strings. Or is image spec supposed to support the shell form versions too?
There was a problem hiding this comment.
Or is image spec supposed to support the shell form versions too?
Absolutely not. IMO those were a bad decision which Docker has had to stick with because of backwards compatibility issues.
There was a problem hiding this comment.
@stevvooe I'll give it a shot in terms of wording.
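The second pipeline above (user input applied to the image config first, then the translation) can be sketched as follows. This is only an illustration under stated assumptions: `process_args` and its keyword parameters are hypothetical, only the array-of-strings form is handled (no shell form), and the translation step is taken to be the append rule from this PR, `Entrypoint` followed by `Cmd`.

```python
def process_args(image_config, user_entrypoint=None, user_cmd=None):
    """Build process.args from an image config dict.

    User-provided values, when given, replace the corresponding image
    field before the translation; the translation then appends Cmd to
    Entrypoint.
    """
    entrypoint = image_config.get("Entrypoint") or []
    cmd = image_config.get("Cmd") or []
    if user_entrypoint is not None:
        entrypoint = user_entrypoint
    if user_cmd is not None:
        cmd = user_cmd
    return list(entrypoint) + list(cmd)
```

Because the substitution happens before the append, the translation itself can be checked without knowing anything about where the user input came from.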
**org.opencontainers.imageSpec.exposedPorts** is a list of comma-separated values that correspond to the [keys defined for `Config.ExposedPorts`](config.md).
## Annotations
There was a problem hiding this comment.
as pointed out in #541 - can we have an optional, but still org.opencontainers.* prefixed, annotation which specifies the signal the image author wants the container to be killed with?
There was a problem hiding this comment.
This should have already been here. Is it missing from the config?
https://github.com/docker/docker/blob/master/api/types/container/config.go#L59
I would prefer if we can get this into one of the next milestones, @opencontainers/image-spec-maintainers. WDYT?
/ping @stevvooe You said you'd take another look at this?
conversion.md
Outdated
There was a problem hiding this comment.
manifest list -> image index
conversion.md
Outdated
There was a problem hiding this comment.
probably bump to latest rc
conversion.md
Outdated
There was a problem hiding this comment.
manifest list -> image index
@cyphar bump. please ❤️
@vbatts I was having trouble tracking what issues have not already been addressed. Here is my understanding of the current issues:
The one final point which we discussed in the call and also mentioned by @crosbymichael is
The distinction in this document is not between end users and converters. It's between implementation-defined defaults and externally provided inputs. In the basic case there is no distinction, but for the spec the difference is quite important.
An implementation-defined default is something that we can test by providing a bunch of images that contain various bits of metadata and our testing will only check the outputs that were actually specified in the image (the image is the source of truth). Any extra fields or whatever don't matter. But if the converter changes fields that we specified in the image, that's where non-compliance kicks in.
When it comes to externally defined input, the way I'm thinking of it is that when an externally provided input is specified it is applied to the source image as though the source image always had the value specified. For a more concrete example, imagine that
But the important point is that a converter wouldn't be able to change
And please note that this entire document is describing the "default generated runtime configuration". The point is that if a converter decides to start overriding things it's okay as long as they specify that they aren't outputting the default generated runtime configuration (and maybe we should require that implementations provide a way to use the "default generated runtime configuration" -- I'm not sure).
However, converters SHOULD set the [`org.opencontainers.image.exposedPorts` annotation](#config.exposedports).
2. If a converter implements conversion for this field using mountpoints, it SHOULD set the `destination` of the mountpoint to the value specified in `Config.Volumes`.
The other `mounts` fields are platform and context dependent, and thus are implementation-defined.
Note that the implementation of `Config.Volumes` need not use mountpoints, as it is effectively a mask of the filesystem.
There was a problem hiding this comment.
@cyphar could you help me fully understand point 2 for Config.Volumes? The first line reads:
If a converter implements conversion for this field using mountpoints, it SHOULD set the `destination` of the mountpoint to the value specified in `Config.Volumes`.
What does that mean? Are there any other ways to implement conversion for this field other than mountpoints? If yes, which ones?
What's the expected flow to implement the field with mountpoints for runtimes? Create a host dir for that volume and bind mount it in the container?
Note that the implementation of `Config.Volumes` need not use mountpoints, as it is effectively a mask of the filesystem.
as a non-native speaker this seems in contrast with the first sentence. It now reads "don't use mountpoints!"
There was a problem hiding this comment.
Are there any other ways to implement conversion for this field other than mountpoints?
Yes, you could just copy the contents of the volume into the container. Hopefully (see #496) the purpose of volumes is to restrict how implementations should handle diff layers (allow external data that is not snapshotted).
In the case of umoci I could envisage this all being done through manifests and black-holing certain directories so umoci will simply ignore them.
What's the expected flow to implement the field with mountpoints for runtimes?
I only mention the destination. In particular what this is meant to mean is that the value of Config.Volumes should be a mountpoint (though there are reasons to not do that). The source of the mountpoint, the type of filesystem and so on can be whatever you want (it could be a tmpfs for example or NFS).
as a non-native speaker this seems in contrast with the first sentence. It now reads "don't use mountpoints!"
The first sentence says you SHOULD, but the second sentence says you don't have to. To be fair, maybe I shouldn't repeat that point but I felt worried people would make the same assumption you did.
There was a problem hiding this comment.
Yes, you could just copy the contents of the volume into the container.
alright, I guess what's still not clear to me is the definition of a volume which this doc is missing. What's a volume then? some directory/mountpoint on the host or somewhere else?
from that sentence, as you said, it can be anything.
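To make the "only the destination is specified" point concrete, here is a sketch of a converter that does choose mountpoints. The function name is hypothetical, and the `type` and `source` defaults (a tmpfs) are placeholder assumptions standing in for whatever the implementation picks; the only part taken from the text under review is that `destination` comes from the keys of `Config.Volumes`.

```python
def volumes_to_mounts(config_volumes, mount_type="tmpfs", source="tmpfs"):
    """Turn Config.Volumes keys into runtime-config style mount entries.

    Only `destination` is derived from the image; `type` and `source`
    are implementation-defined and default here to a tmpfs purely for
    illustration.
    """
    return [
        {"destination": path, "type": mount_type, "source": source}
        for path in sorted(config_volumes)
    ]
```

A converter that instead treats the volume purely as a mask of the filesystem would produce no `mounts` entries at all, which the quoted note also permits.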
However, converters SHOULD set the [`org.opencontainers.image.exposedPorts` annotation](#config.exposedports).
2. If a converter implements conversion for this field using mountpoints, it SHOULD set the `destination` of the mountpoint to the value specified in `Config.Volumes`.
The other `mounts` fields are platform and context dependent, and thus are implementation-defined.
Note that the implementation of `Config.Volumes` need not use mountpoints, as it is effectively a mask of the filesystem.
There was a problem hiding this comment.
Should this also stipulate that data from the image may be copied into the volume?
There was a problem hiding this comment.
No, because that's forbidden by the definition of Config.Volume (https://github.com/opencontainers/image-spec/blame/master/config.md#L152):
If a file or folder exists within the image with the same path as a data volume, that file or folder will be replaced by the data volume and never be merged.
There was a problem hiding this comment.
Sorry, the entire directory is masked, not just the files that are in the data volume. Maybe I should make this sentence clearer but given the Config.Volume definition I can't imagine a reasonable reading of this text would conclude that merging is okay.
There was a problem hiding this comment.
This is behavior that docker allows today: effectively, the contents of a volume can be seeded with the contents of the image.
There was a problem hiding this comment.
Okay, but that's not what the spec currently allows (I don't agree with that design either, but let's discuss that somewhere else). Would you mind if we handle that in a separate PR (it's got nothing to do with the config generation and more to do with extraction surely).
There was a problem hiding this comment.
@cyphar I'll file an issue and not hold up this PR further on this matter.
conversion.md
Outdated
There was a problem hiding this comment.
I think if we relax this to a SHOULD NOT, we can go forward here.
There was a problem hiding this comment.
Presumably you also want me to change this line too:
- The converter MAY add additional entries to `process.env` but it MUST NOT add entries that have variable names present in `Config.Env`.
?
Add documentation about how to convert from an OCI image configuration to an OCI runtime configuration. In particular, describe precisely what fields need to be filled given a particular configuration.
The fields have been grouped into several categories, because some of the image fields are not as well-thought-out as others (such as the resource limitation fields, which should really be decided by the user not by the image creator). In addition, some fields (such as Volumes) cannot be understood by a generic configuration converter.
In addition, the annotation org.opencontainers.imageSpec.exposedPorts has been defined in order to allow for hinting to runtimes what ports are exposed by a container. The same for org.containers.imageSpec.stopSignal.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
Sorry, I just squashed it. Can you LGTM again?
LGTM I'll open PRs for the two leftover touch-ups (https://github.com/opencontainers/image-spec/pull/492/files#r104840624 & https://github.com/opencontainers/image-spec/pull/492/files#r104840585)
Reference: opencontainers#492 (comment) Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
Reference: opencontainers/image-spec#492 (comment) Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
Add documentation about how to convert from an OCI image configuration
to an OCI runtime configuration. In particular, describe precisely what
fields need to be filled given a particular configuration.
The fields have been grouped into several categories, because some of
the image fields are not as well-thought-out as others (such as the
resource limitation fields, which should really be decided by the user
not by the image creator). In addition, some fields (such as Volumes)
cannot be understood by a generic configuration converter.
Fixes: #479
Signed-off-by: Aleksa Sarai <asarai@suse.de>