Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Container annotations (metadata) #9013

Closed
wants to merge 2 commits into from

Conversation

squaremo
Copy link
Contributor

@squaremo squaremo commented Nov 6, 2014

This PR describes "Container annotations", that is, metadata for containers.

Motivation

The motivation for annotations is to be able to "hang" values on a container so that clients -- e.g., plugins -- can examine them and react accordingly. For example, a weave plugin might attach a container to the weave network based on a weaveIPs attribute.

Attributes can be set while the container is running, and are visible to consumers of the remote API and to command-line tooling, but not in general from within the container.

Synopsis

Command line:

# Run a container with an annotation
INSTANCE=$(docker run -ti --annotate example.com/animal=Dog ubuntu)
# Annotate a container (whether it's running or not)
docker annotate example.com/animal Fox $INSTANCE
# Get an annotation
docker inspect --format='{{ index .Annotations "example.com" "animal" }}' $INSTANCE

NB that annotation get JSON literals, so some values will require quoting; I'll discuss that below.

Dockerfile:

ANNOTATE example.com/animal Goat

Design choices

Keys

Keys are namespaced labels <domain>/<id>, where the <id> has no structure (i.e., no dots). This is to line up with e.g., Kubernetes' idea of annotation and label keys.

Although there's no structure to keys when setting annotations, the values may be maps and arrays, and you can reach down into them with the format argument.

docker annotate example.com/animals '["deer", "otter", "seal"]' $INSTANCE
docker inspect --format='{{ index .Annotations "example.com" "animals" 0 }}' $INSTANCE

JSON values

Values are JSON literals. This will often necessitate quoting, which can be awkward. The upside is that you can have structured values, like lists of labels or IP addresses or animals.

Anything that can't be parsed as a JSON literal is interpreted as a string -- but care needs to be taken here, since null, numbers and booleans look like strings without the quotes. Anything that's not a hard-coded value, e.g., variables in scripts, should always be quoted if it's intended as a string.

Update semantics

The given semantics eschew transactions and fine-grained operations; you can inspect an annotation then set it, or you can just stomp on whatever value it has already; but there's no operation for say, adding an entry to an array-valued annotation, atomically. This is to keep things simple to implement and script, mainly.

null for delete

The literal null is used to remove an annotation. This is consistent with what you get if you ask docker inspect for a field that doesn't exist -- i.e., null means missing.

Events

There's an event "update" that gets fired whenever container annotations are changed. This is so that clients can react to changes in whatever annotations they were interested in.

It doesn't carry any additional information: clients are expected to go and look for a new value and react accordingly. Possibly the key could be included in the event, so clients don't end up spamming docker inspect.

@crosbymichael
Copy link
Contributor

Have you considered the term label instead of attributes?

@squaremo
Copy link
Contributor Author

squaremo commented Nov 6, 2014

Have you considered the term label instead of attributes?

Yes; but I didn't think "label" does justice to structured values. I could be talked around ..

@squaremo squaremo changed the title Container attribute docs Proposal: Container attributes (metadata) Nov 6, 2014

## attr

Usage: docker attr KEY VALUE
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surley there should be a container-id somewhere in there. Same mistake is in the PR text.

Signed-off-by: Michael Bridgen <mikeb@squaremobius.net>
@bfirsh
Copy link
Contributor

bfirsh commented Nov 7, 2014

+1 labels, partially because it is already an established term from Kubernetes.

@brendandburns @bgrant0607 You have expressed interest in getting Kubernetes-style labels upstream. Is this proposal something that could work for Kubernetes?

@rade
Copy link

rade commented Nov 7, 2014

@bfirsh

+1 labels, partially because it is already an established term from Kubernetes.

Only if the functionality provided here essentially matches Kubernetes. In particular, the kubernetes docs are completely silent as to what form label values take, and the only examples given are strings.

@squaremo
Copy link
Contributor Author

squaremo commented Nov 7, 2014

+1 labels, partially because it is already an established term from Kubernetes.

Having had a look at the Kubernetes labels design doc, I think using "labels" here would be a mistake. It should be possible (as in it could be a design goal) to implement Kubernetes' idea of labels using attributes, but there are disconnects:

  • Kubernetes wants labels to apply to pods (and other Kubernetes API objects), not containers. Actually it's a bit difficult to tell what's included in "other objects", but I think containers are not included.
  • Labels are identifying, that is, intended to be used to select objects; attributes as suggested here don't have that property (though some future docker API might include it for convenience)

So calling them the same thing would quite possibly induce confusion. If anything, attributes are similar to Kubernetes' idea of annotations, but as @rade suggests, I'd want to avoid conflating them unless they were really the same thing.

@squaremo
Copy link
Contributor Author

squaremo commented Nov 7, 2014

So calling [attributes] [labels] would quite possibly induce confusion.

If it turns out that what people want really is exactly Kubernetes' labels, that's a different situation. I don't think labels would necessarily align with my motivation as given above (although, perhaps selection by label would be useful ..).

@bgrant0607
Copy link

Kubernetes has 2 mechanisms, for different purposes:

  1. Labels
  2. Annotations

Labels are intended to be used for identifying attributes. There are lots of examples in the doc. They are used for filtering/selecting/matching, both in mechanisms such as our replication controllers and services, as well as operations, such as GET, DELETE, etc. We plan to index them (and reverse-index them) for fast lookup.

Annotations are intended to be used to attach arbitrary, potentially structured data. Again, there are examples in the document.

For both, keys are restricted to DNS labels. Validation code.

We haven't decided how to strict label values yet but, yes, we do expect them to be relatively simple strings.

For annotations, we'd be happy with something JSON literals for values.

For applications to communicate data upwards, we plan to introduce different mechanisms.

@squaremo
Copy link
Contributor Author

squaremo commented Nov 7, 2014

Thanks @bgrant0607. Do you think what's proposed here is congruent with Kubernetes' annotations (I think so)? And hence or otherwise, ought they have the same name?

@bgrant0607
Copy link

They are congruent in purpose and we could set the same restriction on the values.

How attached are you to Go identifiers for keys?

Are Docker container names still restricted to [a-zA-Z_-.] ?

@thockin @smarterclayton

@squaremo
Copy link
Contributor Author

squaremo commented Nov 7, 2014

They are congruent in purpose and we could set the same restriction on the values.

That it's JSON values (and null means missing)? This would line things up nicely, if it works for your purposes. I'm not sure if there will be a technical need to line up exactly (perhaps you have an opinion on that) but it won't hurt in any case.

How attached are you to Go identifiers for keys?

Not very. The motivation for this restriction is so they work nicely with using go templates in docker inspect; but, for keys with dashes, {{index ...}} can be used, so it's not a big deal really.

I'll change "attributes" to "annotations", and relax the key restriction, then we can see how we like it.

@squaremo
Copy link
Contributor Author

squaremo commented Nov 7, 2014

I'll [...] relax the key restriction

The labels design says "Valid labels follow a slightly modified RFC952 format: 24 characters or less, all lowercase, begins with alpha, dashes (-) are allowed, and ends with alphanumeric." but it's not said whether this is talking about the keys or the values, or which production in the grammar given in RFC952 is meant. I presume it's keys, and this one:

<name>  ::= <let>[*[<let-or-digit-or-hyphen>]<let-or-digit>]

I'd prefer to refer to something more precisely, or just give a regexp: [a-z]([a-z0-9-]*[a-z0-9])?

By the way, encouraging (as I do) people to prefix keys with a namespace and requiring lowercase means most keys will include dashes, and therefore not be amenable to `docker inspect --format='{{ .Annotations.key.field }}', which is a bit of a pain. What is the motivation for using the particular formulation above?

@thockin
Copy link
Contributor

thockin commented Nov 7, 2014

The motivation for using DNS names for input validation is primarily so
that we don't have a proliferation of validation formats. Names can be X,
but labels can be Y, and containers can be Z.

Everything we name is somewhere on the DNS spectrum from RFC952 to
RFC1123. We don't have to stay there, I suppose, but my next point makes
more trouble. I believe we need to add explicit namespacing of labels such
that different pieces of the system can attach labels and not collide. The
proposal that we have on the table is /. E.g.
docker.io/name or kubernetes.io/controller.

Again, we don't HAVE to use that format, but it is somewhat well
established.

Also, docker container names allow dashes anyway, so you're proposing
something that is similar to but not quite the same - which is a recipe for
user confusion.

As for values we are less opinionated - they are just strings. JSON
literals means that I would have to say things like
docker attr foo "bar"
? Puke. Can't we do better and infer quotes?

If we can find common ground on the spec, we can extract all of this into a
neutral library that we both use - I think that would be a win all around.

On Fri, Nov 7, 2014 at 8:17 AM, Michael Bridgen notifications@github.com
wrote:

I'll [...] relax the key restriction

The labels design says "Valid labels follow a slightly modified RFC952
format: 24 characters or less, all lowercase, begins with alpha, dashes (-)
are allowed, and ends with alphanumeric." but it's not said whether this is
talking about the keys or the values, or which production in the grammar
given in RFC952 is meant. I presume it's keys, and this one:

::= [*[]]

I'd prefer to refer to something more precisely, or just give a regexp:
a-z?

By the way, encouraging (as I do) people to prefix keys with a namespace
and requiring lowercase means most keys will include dashes, and
therefore not be amenable to `docker inspect --format='{{
.Annotations.key.field }}', which is a bit of a pain. What is the
motivation for using the particular formulation above?

Reply to this email directly or view it on GitHub
#9013 (comment).

@squaremo
Copy link
Contributor Author

squaremo commented Nov 7, 2014

As for values we are less opinionated - they are just strings. JSON literals means that I would have to say things like docker attr foo "bar" ? Puke. Can't we do better and infer quotes?

Yes, quoting is a pain. Inferring quotes in the case of contiguous characters probably isn't a big deal, though there's ambiguity around numbers of course, so care then has to be taken by users to quote when using values from elsewhere, etc. Anyway.

However, I would like to do better than "just strings" for values; if they are just strings, then there's an extra parsing step needed for tools to get anything useful out of structured values.

I believe we need to add explicit namespacing of labels such that different pieces of the system can attach labels and not collide. The proposal that we have on the table is /.

Namespacing is good, and that scheme is fine. As above, my concern is that this makes docker's use of templates in --format rather awkward for tooling that wants to use annotations*. If someone has a suggestion for how to reconcile these things, I'll happily consider it.

*It's my understanding that this would in general require e.g., {{index .Annotations "docker.io/foobar"}}, and nested values would need something like {{ (index .Annotations "docker.io/foobar").bar.baz }}. I'm prepared to be taught something new though ...

@thockin
Copy link
Contributor

thockin commented Nov 7, 2014

I agree that Go's format language falls down here. Could we define the
values to be YAML instead of JSON? YAML has rules for inferring strings vs
numbers, and yet JSO syntax is valid.

foo=bar # string
foo='[ bar, qux ]' # list of strings

On Fri, Nov 7, 2014 at 9:05 AM, Michael Bridgen notifications@github.com
wrote:

As for values we are less opinionated - they are just strings. JSON
literals means that I would have to say things like docker attr foo "bar"
? Puke. Can't we do better and infer quotes?

Yes, quoting is a pain. Inferring quotes in the case of contiguous
characters probably isn't a big deal, though there's ambiguity around
numbers of course, so care then has to be taken by users to quote when
using values from elsewhere, etc. Anyway.

However, I would like to do better than "just strings" for values; if they
are just strings, then there's an extra parsing step needed for tools to
get anything useful out of structured values.

I believe we need to add explicit namespacing of labels such that
different pieces of the system can attach labels and not collide. The
proposal that we have on the table is /.

Namespacing is good, and that scheme is fine. As above, my concern is that
this makes docker's use of templates in --format rather awkward for
tooling that wants to use annotations*. If someone has a suggestion for how
to reconcile these things, I'll happily consider it.

*It's my understanding that this would in general require e.g., {{index
.Annotations "docker.io/foobar"}}, and nested values would need something
like {{ (index .Annotations "docker.io/foobar").bar.baz }}. I'm prepared
to be taught something new though ...

Reply to this email directly or view it on GitHub
#9013 (comment).

@squaremo
Copy link
Contributor Author

squaremo commented Nov 7, 2014

Could we define the values to be YAML instead of JSON?

Much of docker's tooling thinks in JSON terms, so perhaps best not.

my concern is that this makes docker's use of templates in --format rather awkward

Maybe there's some way I can work namespaces in explicitly, so the "non go identifier" keys are confined to a distinct argument, and don't appear in templates. Probably this would require a docker annotation ... command, which isn't ideal -- it might even be better have the awkward templates as above. I'll have a think about it.

@errordeveloper
Copy link
Contributor

I think this is an important enabling piece that the API must have!

@errordeveloper
Copy link
Contributor

One simple example where you'd want more then labels is descriptions or comments of some sort.

Anyways, I don't see a point of even bothering to add restrictive metadata format to anything other then free-form JSON. If metadata field is to be there in the first place, why not make it map to anything as far as JSON goes?

Regarding format debate, YAML is newline sensitive format and JSON values are not newline-friendly. The Docker API is currently JSON-only, so it would be a major pain to deal with YAML.

@errordeveloper
Copy link
Contributor

I certainly agree that this could be called annotations instead of attributes, that's pretty much the same thing, if it helps to synchronise naming of stuff between Kubernetes and Docker. Although, as we all know naming is the hardest part and it's hard to agree. As of labels, the only reason I'd see them as separate feature is because you might want to show them in docker ps and that they are generally more likely to be present the annotations... However, why wouldn't you just keep Docker less opinionated about metadata and implement labels and annotations under a more generic place you could call attributes, user_properties, or even just meta?

@thaJeztah
Copy link
Member

I have a feeling this discussion is repeating the same path that lead to #8955 (a continuation of #7470) although those were for images, not containers.

Perhaps those should be taken into account, so that we can get a consistent implementation / naming?

@bfirsh
Copy link
Contributor

bfirsh commented Jan 14, 2015

@squaremo FYI, we've got a pull request going here: #9882

@dreamcat4
Copy link

+1 for this.

My views on this feature are:

  • Am guessing that Go templates are good for best performance, being language-native, which is why we have them. So perhaps if you want to offer users an alternative and simpler way of parsing, then the most obvious alternative would be to include a json library in the binary and add to docker inspect a --json option to be used in conjunction with the existing --format flag. So we could parse json format strings just like are accepted by the program jq.
  • I want this feature to work for stopped data-only containers as well as running ones. Since data-only containers should always be in stopped mode and I definitely need to have metadata on those kinds of containers too.
  • What the heck is wrong with calling it "metadata", rather than anything else? It that not what this stuff actually is? Arbitrary user-specified metadata? Why not call a stick a stick?
  • This feature should be designed in mind to be exposable via a future intospection feature. But I also need a very similar feature called intraspection. This design is already good because it treats all of the new user-set metadata just like a container's other existing metadata. ++1.

@thaJeztah
Copy link
Member

@jfrazelle should this be closed, now that a design approval was given on #9882 ?

@jessfraz
Copy link
Contributor

jessfraz commented Mar 2, 2015

I think we can close after #9882 is merged

@rade
Copy link

rade commented Mar 2, 2015

@jfrazelle

I think we can close after #9882 is merged

No, because of #9882 (comment)

@thaJeztah
Copy link
Member

@rade ah, thanks. Yes that is different;

To reiterate, a more important difference is that annotations per #9013 can be changed after a container is started.

I'm thinking what's best here; #9882 allows setting the labels, but those are immutable. Also different is that this proposal stores structured (JSON) data, whereas #9882 only supports strings. Because of the overlap and differences between the proposals, I don't think this proposal will fully make it.

Perhaps a new proposal should be created as a "follow up" to #9882, the new proposal would only include the feature to being able to change (add/remove/update) labels on running containers, e.g. "Proposal: Allow modifying labels (meta-data) on running containers".

Would that be reasonable?

@squaremo
Copy link
Contributor Author

squaremo commented Mar 2, 2015

Would that be reasonable?

Not from my point of view. As people have mentioned repeatedly above, this is a feature quite distinct from labels: #9013 (comment) identifies the different uses, and #9013 (comment) describes the various purposes to which labels, annotations, etc., can be put.

I am all for avoiding proliferation of features. If "runtime labels" can serve both purposes, without compromising either, than we can conflate them with annotations as described here. But being superficially similar isn't a good enough reason. After all, labels are pretty similar to environment entries on the face of it, aren't they.

In any case, I don't see why submitting another PR with the same use cases and so on -- rather than figuring out how to adapt this one if that's necessary -- would advance things, unless politically.

Is there any 1-design-review feedback?

@crosbymichael
Copy link
Contributor

@squaremo from what I understand your argument for this proposal over the other label proposals is that this is mutable annotations and the others are immutable?

@rade
Copy link

rade commented Mar 10, 2015

I believe there are two major differences to #9882:

  • Annotations can be set on containers in any state, including stopped and running containers. Labels can only bet set on prior/during container creation.
  • Annotations are mutable, and there is a corresponding event stream resulting from that mutation. Labels are immutable.

@rhatdan
Copy link
Contributor

rhatdan commented Mar 11, 2015

Yes, so lets get Immutable labels merged. :^)

@squaremo
Copy link
Contributor Author

@squaremo from what I understand your argument for this proposal over the other label proposals is that this is mutable annotations and the others are immutable?

I'm not arguing for this proposal over another, only that they should not be conflated (and this proposal be closed as a duplicate as a consequence), since annotations as described here are useful in ways that labels in #9882 and elsewhere are not.

The main difference is that annotations can be changed after container creation, yep. Other differences are less important to their purpose.

@rade
Copy link

rade commented Mar 13, 2015

Is there anything in #9882 that would prevent its 'labels' from getting the functionality described here, at some future point?

I would not like to see a future docker having three ways of annotating containers - env vars, labels and annotations. We could end up there by accident, e.g. what if some users employ labels in a way that requires them to be immutable?

@icecrime
Copy link
Contributor

Closed by #9882 Oh apparently it's not the same thing oO

@LK4D4
Copy link
Contributor

LK4D4 commented Mar 31, 2015

@squaremo Can we open new proposal in light of merged labels? Because I'm not sure that labels+annotations+whatever is cool way.

@icecrime
Copy link
Contributor

icecrime commented Apr 2, 2015

+1, it really needs to start fresh knowing that #9882 is now merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet