Proposal: One Meta Data to Rule Them All => Labels #9882

Merged
merged 7 commits into from Mar 17, 2015

Contributor

ibuildthecloud commented Jan 3, 2015

This is yet another meta data PR in an attempt to pull together multiple PRs hopefully into something we can get into Docker 1.5 (seriously, let's move fast).

Background

Some background... (from what I can gather). There is currently docker#8955 for adding UserData to a Dockerfile. Basically it adds the USERDATA key=value key="long value" syntax to the Dockerfile. Then there is docker#9013 which looks to add structured JSON data to a Dockerfile and a container in much the same fashion as Kubernetes' annotations. Finally there is docker#9854 which discusses the need for meta data on containers to help with Swarm and similar clustering systems. There are probably another 10 threads that @thaJeztah can find that also talk about the need to add meta data to containers.

We need this...

We need meta data; it's clear users want it.

Labels

We already have labels on Hosts (Docker daemon) today. It seems that going forward we should be able to add labels to everything: Hosts, containers, volumes, images, etc. Let's just continue with that approach. Labels are simple key/value pairs in the style of map[string]string. There have been comments to standardize the format of the keys such that we have namespaces or other means to prevent conflicts. Honestly I don't think it is all that necessary for Docker to dictate this. It's just good practice to namespace things. Whether you do "foo_bar" or "foo/bar" or "com.foo/bar", who really cares. If you're going to use "id=3", well that's a bad name and somebody may clobber your value. I'm assuming some will disagree with me on this one. That's fine, I'll agree to a namespace standard just as long as it doesn't take three weeks and 132 comments to decide that DNS format is far superior to arbitrary strings split by "/".

Labels are not structured data, and as such this PR is different from docker#9013, so that discussion can happen differently. Honestly, I'm not in favor of adding structured data to objects and having Docker maintain it. But if others disagree, so be it, we can have structured data as something else.

What about ENV vars

Yes, labels are very close to environment variables. You can add ENV to a Dockerfile and you can add them at container create. The basic difference here is that these key/values are not visible to the running processes in the container.

Lookup by Label

Another key attribute of labels is that you should be able to find an object based on its label. This should initially be kept very simple. You can either say "give me all images/containers that have key foo." Or you can say "give me all images/containers that have foo=bar".

How should we go about doing this?

I think docker#8955 is the right start. USERDATA should be renamed to LABEL. Now that we have LABEL on images we need to add --label key=value to docker run/create. This is in the same fashion as ENV and -e today.

Now the only thing left to do is to figure out how to query based on labels. We just need to add --label foo or --label foo=bar to docker images/ps.

It's just that simple folks

Okay, good? Alright, let's move forward...

Member

thaJeztah commented Jan 3, 2015

Thanks, Darren!

Honestly I don't think that is all that necesary at the Docker level to dictate this. It's just good practice to namespace things. Whether you do "foo_bar" or "foo/bar" or "com.foo/bar", who really cares.

Fully agree on this one. Docker should offer the means to store, search and retrieve the data, but have no opinion on what they are used for, or what (naming)conventions are used. If Docker itself is using meta-data for something, that is just an implementation, just like any other system using the meta-data.

The data stored in a label is just free-form text as well; if an implementation decides to use it for storing JSON, that's fine, but Docker doesn't offer special treatment for those values; no parsing, validation or nested search for JSON properties.

Indexing / performance

To be useful (for example, fetch a container via a "custom" id), querying meta-data should be fast. Useful indexes should probably be present, including "partial" matches or wild-card support, both on "keys" and "values". For example, getting all containers that have labels with a namespace/prefix starting with my.name.space.*

Scope / Visibility

We should probably ask if meta data is only accessible from "outside" containers (in case of meta-data on containers), or also from within a container; I can see use-cases where meta-data can be useful inside a container. How to control access is something to be discussed (also wrt read/read-write)

Inheritance

In case of Image and Container meta-data; should images share the same meta-data as containers running from it? Will they be inherited, but kept separate? Or are they "merged" when creating a container instance?

Contributor

phemmer commented Jan 3, 2015

I also agree that docker should not dictate how the data gets used or formatted. I personally think docker tries to dictate things a bit too often. We should recommend a best practice, but if someone wants to ignore best practice, they might have a good reason for it.

However I don't like the term 'label'. Most systems I've ever dealt with have treated a label as value-only data, not key/value (one example being github labels). Just to clarify what I mean, a label would be something like --label foo_bar, where there is no =, and thus no presence of a key.

Contributor

jessfraz commented Jan 5, 2015

I like this FWIW

Contributor

erikh commented Jan 5, 2015

Paging @vieux and @aluzzardi, as this may be very relevant to their interests.

I also think it's a really good idea. Chef search is very similar and one of its best features.

Contributor

ibuildthecloud commented Jan 5, 2015

I want to make it clear that my intention is to quickly move this forward. I want to find what we can implement now that will give the minimally viable value but also put us on clear path to adding more functionality.

@thaJeztah - Comments below

Indexing/Performance

I completely agree that fast lookup is required. That is one of the fundamental differences between environment variables and labels. For containers I think this can be easily achieved by just keeping the labels in memory in the current data structures. Searching for a label will just be iterating over the list of containers in memory. This means at its worst searching for a label will be as slow as docker ps. I think this is sufficient at first. We can improve down the line.

Images become more difficult as you have more images than containers, typically. For images a real index should be built. The problem though is that docker networks are coming soon and volumes will probably not be too far. It seems we should find one consistent approach for labels that works for all object types. As such, I would like to defer on search for images by tag. I’ve currently seen a higher demand for fast lookup of containers, but not the same for images. I’m not saying the use cases don’t exist, just that containers are a higher priority.

The way in which one can search is largely based on the underlying index. So supporting wildcard, regexp, etc. has real technical implications. I think we need a good query syntax, but for the first pass I think it is safe to support “give me all containers that have label foo” and “give me all containers that have label foo equal to bar”. The syntax would be docker ps --label foo and docker ps --label foo=bar respectively.
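The two first-pass queries over in-memory containers could be sketched roughly as below. This is only an illustration of the linear scan described above: the `Container` type and the `matchLabel` and `filterByLabel` helpers are hypothetical names, not code from this PR or from the Docker code base.

```go
package main

import (
	"fmt"
	"strings"
)

// Container is a hypothetical stand-in for the daemon's in-memory container record.
type Container struct {
	Name   string
	Labels map[string]string
}

// matchLabel reports whether labels satisfy a filter of the form "key" or "key=value".
func matchLabel(labels map[string]string, filter string) bool {
	key, want, hasValue := strings.Cut(filter, "=")
	got, ok := labels[key]
	if !ok {
		return false
	}
	// A bare key only requires presence; key=value also requires equality.
	return !hasValue || got == want
}

// filterByLabel scans every container once, so the cost grows with the
// number of containers, much like listing them all.
func filterByLabel(containers []Container, filter string) []string {
	var names []string
	for _, c := range containers {
		if matchLabel(c.Labels, filter) {
			names = append(names, c.Name)
		}
	}
	return names
}

func main() {
	containers := []Container{
		{Name: "web", Labels: map[string]string{"foo": "bar"}},
		{Name: "db", Labels: map[string]string{"foo": "baz"}},
		{Name: "cache", Labels: map[string]string{}},
	}
	fmt.Println(filterByLabel(containers, "foo"))     // [web db]
	fmt.Println(filterByLabel(containers, "foo=bar")) // [web]
}
```

Wildcard or regexp matching would replace the equality check in `matchLabel`, which is where a real index would start to matter.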

Scope/Visibility

First off, scope visibility doesn’t really matter until we have an introspection service. So this is obviously a discussion that will happen elsewhere, but regardless I’ll say what I think it should be. Labels should follow the existing pattern of ports. That being that they are private by default and must be explicitly published. Defining a label is the same as EXPOSE in the Dockerfile and then publishing the label should follow a pattern like -P and -p .... This means all labels will not be seen by the container unless you explicitly --publish-all-labels or --publish-label foo. It seems a wildcard syntax should exist like --publish-label com.example/*.
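To make the "private by default, explicitly published" idea concrete, here is a rough sketch. The flags `--publish-label` and `--publish-all-labels` are only proposed in this comment, and the helpers `matchesPattern` and `publishedLabels` are hypothetical; the only wildcard assumed is a trailing `*`.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesPattern supports an exact key or a trailing "*" prefix wildcard.
func matchesPattern(key, pattern string) bool {
	if strings.HasSuffix(pattern, "*") {
		return strings.HasPrefix(key, strings.TrimSuffix(pattern, "*"))
	}
	return key == pattern
}

// publishedLabels returns only the labels selected by at least one pattern.
// With no patterns the result is empty: labels are private by default.
func publishedLabels(labels map[string]string, patterns []string) map[string]string {
	visible := make(map[string]string)
	for key, value := range labels {
		for _, p := range patterns {
			if matchesPattern(key, p) {
				visible[key] = value
				break
			}
		}
	}
	return visible
}

func main() {
	labels := map[string]string{
		"com.example/role": "web",
		"com.example/tier": "front",
		"secret":           "x",
	}
	fmt.Println(len(publishedLabels(labels, nil)))                 // 0
	fmt.Println(publishedLabels(labels, []string{"com.example/*"})) // map[com.example/role:web com.example/tier:front]
}
```

Under this sketch, publishing all labels would simply be the pattern `*`.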

Inheritance

I hope you notice a trend in my comments in that we should just follow existing patterns. For inheritance I would expect labels follow the same approach as environment variables. I honestly haven’t given a huge amount of thought to this, but I think the ENV approach should be sufficient.

@phemmer I completely agree that label is a bad term. Unfortunately the precedent has already been set with host labels and Kubernetes labels. I have a strong opinion that it's better to be consistently wrong than inconsistently right. I think we should just stick with the ill-named “label.”

Implementation

I have every intention of pushing this through as fast as possible. I’m going to code the implementation of this hopefully today based off of @rhatdan’s existing work in docker#8955. I'm optimistic the community can come to a consensus.

Member

dnephin commented Jan 6, 2015

Labels would be awesome for fig/docker compose. Tracking images with tags would allow users to use any name for their images and containers, and would address some of the performance issues with the current version.

Inheritance

It seems to me like inheritance could be entirely client-side. A client which is creating a container from an image should be able to decide which labels to copy over to the container.

@SvenDowideit SvenDowideit commented on an outdated diff Jan 6, 2015

docs/README.md
@@ -1,3 +1,5 @@
+Okay, I'm lazy. I'll write docs if I think this PR doesn't get violently opposed
+to at first...
@SvenDowideit

SvenDowideit Jan 6, 2015

Contributor

giggle - you'll want to remove this >:}

@SvenDowideit SvenDowideit commented on the diff Jan 6, 2015

contrib/syntax/vim/syntax/dockerfile.vim
@@ -11,7 +11,7 @@ let b:current_syntax = "dockerfile"
syntax case ignore
-syntax match dockerfileKeyword /\v^\s*(ONBUILD\s+)?(ADD|CMD|ENTRYPOINT|ENV|EXPOSE|FROM|MAINTAINER|RUN|USER|VOLUME|WORKDIR|COPY)\s/
+syntax match dockerfileKeyword /\v^\s*(ONBUILD\s+)?(ADD|CMD|ENTRYPOINT|ENV|EXPOSE|FROM|MAINTAINER|RUN|USER|LABEL|VOLUME|WORKDIR|COPY)\s/
@SvenDowideit

SvenDowideit Jan 6, 2015

Contributor

Ah, this means you need to mention LABEL as one of the ONBUILD instructions in the docs and man page

@SvenDowideit SvenDowideit and 1 other commented on an outdated diff Jan 6, 2015

docs/sources/reference/builder.md
@@ -322,6 +322,17 @@ default specified in `CMD`.
> the result; `CMD` does not execute anything at build time, but specifies
> the intended command for the image.
+## LABEL
+ LABEL <key>=<value> <key>=<value> <key>=<value> ...
+
+ --The `LABEL` instruction allows you to describe the image your `Dockerfile`
+is building. `LABEL` is specified as name value pairs. This data can
+be retrieved using the `docker inspect` command
+
+
+ LABEL Description="This image is used to start the foobar executable" Vendor="ACME Products"
+ LABEL Version="1.0"
+
@SvenDowideit

SvenDowideit Jan 6, 2015

Contributor

are these (or could they be) the same kind of labels as are happening for swarm? @vieux ?

@ibuildthecloud

ibuildthecloud Jan 6, 2015

Contributor

@SvenDowideit The idea is that these labels should be conceptually the same as the host labels. Going forward we will just be able to apply labels to everything, on host, containers, images, and future networks and volumes.

@SvenDowideit

SvenDowideit Jan 8, 2015

Contributor

ie, #everythingisawesome! 👍

@phemmer phemmer and 1 other commented on an outdated diff Jan 6, 2015

docs/sources/reference/api/docker_remote_api_v1.16.md
@@ -1326,6 +1327,11 @@ Create a new image from a container's changes
"Cmd":[
"date"
],
+ "Labels": [
+ "Vendor=Acme",
+ "License=GPL",
+ "Version=1.0"
+ ],
@phemmer

phemmer Jan 6, 2015

Contributor

Why are the labels stored/accessed as a dictionary/hash everywhere else, but provided as an array through the API?
What happens if we do something like:

"Labels": [
  "foo=bar",
  "foo=baz",
]

I think it would be more appropriate to treat it as a dictionary/hash everywhere.

Also assuming the inspect data would be an array as well, what if I want to do:

docker inspect --format '{{.Config.Labels.foo}}'

...to get a specific value. With an array, you can't.
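A small sketch of what the duplicate-key question implies if the `["key=value"]` list is turned into a map. The helper `labelsToMap` is hypothetical, and "last entry wins" is an assumption made for illustration, not a behavior confirmed in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// labelsToMap converts ["key=value", ...] into a map.
// Assumption: a later entry with the same key overwrites an earlier one,
// and an entry without "=" is stored with an empty value.
func labelsToMap(list []string) map[string]string {
	labels := make(map[string]string, len(list))
	for _, entry := range list {
		key, value, _ := strings.Cut(entry, "=")
		labels[key] = value
	}
	return labels
}

func main() {
	labels := labelsToMap([]string{"foo=bar", "foo=baz"})
	fmt.Println(labels)        // map[foo:baz]
	fmt.Println(labels["foo"]) // baz
}
```

With a map, a direct lookup such as `labels["foo"]` works, which is what the `{{.Config.Labels.foo}}` template relies on; with a list, the caller has to scan and split every entry.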

@ibuildthecloud

ibuildthecloud Jan 6, 2015

Contributor

@phemmer I agree a map is better, I actually updated the code as such, but not the documentation yet.

@ibuildthecloud

ibuildthecloud Jan 6, 2015

Contributor

@vieux @phemmer Okay, I've just noticed that host labels are ["key=value"] in the API and not a map. @vieux Why was it chosen to be done this way? It does make the code a bit wonky.

@ibuildthecloud

ibuildthecloud Jan 6, 2015

Contributor

So it appears that ["key=value"] is there to support multiple keys of the same name. We will stick with that syntax.

@phemmer

phemmer Jan 7, 2015

Contributor

If we're going with that, then I think we should remove any restriction that the label content contain a =.

Internally the code won't be able to use a map since you can have duplicate keys, so it'll just be an array of strings. If it's an array of strings, I don't see any reason why a = would be required.
Doing this also supports the term 'label' as well, as it's no longer a key/value pair (it can be used as such, but it's up to the client).

@ibuildthecloud

ibuildthecloud Jan 7, 2015

Contributor

We need a notion of key and value to support search as I described. I would rather change the name to Metadata than remove the notion of =

Contributor

phemmer commented Jan 6, 2015

Do we ever anticipate labels being added/removed on an already existing container? If so, storing them in the config data precludes this possibility.

Contributor

aanand commented Jan 6, 2015

I agree with pretty much everything in this proposal - it's basically the one I would've written. As @dnephin says, Compose will be a primary consumer of this feature.

@bfirsh bfirsh added the UX label Jan 6, 2015

Contributor

rhatdan commented Jan 6, 2015

I can go along with this, just as long as something finally gets merged to allow us to add Meta Data. Even if we can just settle on a name Meta->UserData->Label...

@bfirsh bfirsh and 3 others commented on an outdated diff Jan 7, 2015

docs/sources/reference/api/docker_remote_api_v1.17.md
@@ -124,6 +124,11 @@ Create a container
"OpenStdin":false,
"StdinOnce":false,
"Env":null,
+ "Labels": [
+ "Vendor=Acme",
+ "License=GPL",
+ "Version=1.0"
+ ],
@bfirsh

bfirsh Jan 7, 2015

Contributor

Is this a list for a specific reason, or just consistency with Env? Could it be this?

"Labels": {
  "Vendor": "Acme",
  "License": "GPL",
  "Version": "1.0"
},
@thaJeztah

thaJeztah Jan 7, 2015

Member

Apparently that is to allow multiple keys with the same name (docker#9882 (comment)).

I wonder if (should it be supported) it would be better to store it as:

    "Labels": {
      "Vendor": ["Acme"],
      "License": ["GPL"],
      "Version": ["1.0"],
      "foo": ["bar","baz","bam"]
    },

Note; I'm not suggesting that users can directly provide JSON arrays as argument, but only to store multiple --label foo=bar --label foo=baz --label foo=bam

One case that might cause problems here is if one of those labels doesn't have a value (--label foo). Perhaps this should be stored as [NULL,"bar","baz","bam"]? IDK

@ibuildthecloud

ibuildthecloud Jan 7, 2015

Contributor

I think this would be a more logical data structure.

@aanand

aanand Jan 7, 2015

Contributor

@thaJeztah Labels "without values" don't make a lot of sense to me. A label with the empty string as its value, sure, but having a distinct NULL value feels messy.

@thaJeztah

thaJeztah Jan 7, 2015

Member

yes that's better; if we're treating label values as strings, then "no" value should be an empty string

@thaJeztah

thaJeztah Jan 7, 2015

Member

Lolz. Reading back your comment, you probably suggested not adding a value to the array at all?

Should probably be fine as well. There's no need to know how many times I set a label, so --label foo will result in {"foo":[]} (an empty array), right?
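The multi-value structure discussed in this review thread could be built from repeated `--label` arguments roughly as follows. `collectLabels` is a hypothetical helper written only to illustrate the `{"key": ["value", ...]}` shape, including the `{"foo":[]}` case for a label given without a value.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// collectLabels groups repeated "key=value" arguments by key.
// A bare "key" creates the key with an empty (non-nil) slice so that it
// serializes as [] rather than null.
func collectLabels(args []string) map[string][]string {
	labels := make(map[string][]string)
	for _, arg := range args {
		key, value, hasValue := strings.Cut(arg, "=")
		if _, seen := labels[key]; !seen {
			labels[key] = []string{}
		}
		if hasValue {
			labels[key] = append(labels[key], value)
		}
	}
	return labels
}

func main() {
	labels := collectLabels([]string{"Vendor=Acme", "foo=bar", "foo=baz", "empty"})
	out, err := json.Marshal(labels)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // {"Vendor":["Acme"],"empty":[],"foo":["bar","baz"]}
}
```

Note that `--label foo=` (an explicit empty string) and `--label foo` (no value) produce different results here, `[""]` versus `[]`, which is exactly the kind of distinction the comments above are debating.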

@bfirsh bfirsh commented on an outdated diff Jan 7, 2015

docs/sources/reference/commandline/cli.md
@@ -1501,6 +1504,7 @@ removed before the image is removed.
--ipc="" Default is to create a private IPC namespace (POSIX SysV IPC) for the container
'container:<name|id>': reuses another container shared memory, semaphores and message queues
'host': use the host shared memory,semaphores and message queues inside the container. Note: the host mode gives the container full access to local shared memory and is therefore considered insecure.
+ -l, --label=[] Set labels
@bfirsh

bfirsh Jan 7, 2015

Contributor

Should probably explain what labels are. E.g. "Set key/value metadata on container"

@jessfraz jessfraz added the Proposal label Jan 7, 2015

Contributor

ibuildthecloud commented Jan 7, 2015

Current approach

As it's implemented right now, it is the following

Dockerfile

FROM blah
LABEL A B
LABEL A=B C=D D="X Y"

Docker run
docker run -l A=B --label C="X Y"

Querying

docker ps -f label=x
docker ps -f label=x=y
docker images -f label=x
docker images -f label=x=y

Status

I've written the code, tests, and some documentation. I feel the approach is pretty solid but there are two remaining issues I see.

  1. What is the name? Labels or Metadata
  2. What is the internal data structure? ["key=value", "key=value2"] or { "key" : ["value","value2"] }

The name

Labels is consistent with host labels and Kubernetes. It has already been pointed out that because these labels allow multiple keys with the same name they are already different from Kubernetes. Labels is not the obvious term I believe because most people don't think of key/value pairs. Meta data seems like the more accepted term. Meta data would be inconsistent with host labels, but we could standardize on that name going forward as we apply this same approach to networks and volumes in the future.

Data Structure

The current code is using [ "key=value" ] as this was consistent with the current Docker code base. It seems more user friendly if API consumers did not have to parse/split strings so { "key" : ["value", "value1"]} seems like a better approach.

@vieux @aluzzardi @crosbymichael Opinions?

Contributor

aanand commented Jan 7, 2015

Sorry to pile on the design questions, but I'm concerned about multiple labels with the same key. My gut reaction is to not allow it, because then it's impossible to override a label - if an image specifies A=foo, there's no way for me to get rid of that. All I can do is add more A labels.

Contributor

erikh commented Jan 7, 2015

I'm guessing this is an artifact of how the ENV parser works?

-Erik

On Wed, 2015-01-07 at 08:41 -0800, Aanand Prasad wrote:

Sorry to pile on the design questions, but I'm concerned about
multiple labels with the same key. My gut reaction is to not allow it,
because then it's impossible to override a label - if an image
specifies A=foo, there's no way for me to get rid of that. All I can
do is add more A labels.



Contributor

ibuildthecloud commented Jan 7, 2015

@aanand I agree, I also don't like multiple labels with the same name. My assumption is that there was already a long discussion about this for host labels, as host labels were implemented this way. Maybe @vieux can chime in.

Rancher.io will be a consumer of this API and I would personally prefer to not allow multiple keys as it complicates interactions. Additionally, even though we say you can have multiple keys with the same name it is not possible to do that from a Dockerfile. The Dockerfile is always assuming that you are overriding the value. Come to think of it the way the docker run is implemented too, it is also not possible to add multiple keys. I should probably change that. But honestly I would like to not support multiple keys with the same name. I feel somebody with "authority" should make this decision.

Contributor

ibuildthecloud commented Jan 7, 2015

@erikh Yes it is a side effect, but I originally just wrote a different implementation to do a map structure. After I wrote that I had a discussion with @vieux in that he pointed out that they felt that having multiple keys of the same name was an important feature of host labels.

Contributor

rhatdan commented Jan 7, 2015

I also don't want multiple labels with the same key.

Collaborator

vieux commented Jan 7, 2015

@ibuildthecloud are we labelling images or containers here ? the Dockerfile example suggests you're labelling the image, and the docker run the container.

Contributor

ibuildthecloud commented Jan 7, 2015

@vieux Both images and containers. The label from the Dockerfile goes in Config.Labels and ContainerConfig.Labels of the image and then gets merged into the container in Config.Labels
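As an illustration of the merge step described here, a minimal sketch. The precedence rule (labels given at container creation override image labels with the same key) is an assumption borrowed from how ENV behaves, and `mergeLabels` is a hypothetical helper, not the function in this PR.

```go
package main

import "fmt"

// mergeLabels copies the image labels and then applies the container labels
// on top. Assumption: container values win on key conflicts, as with ENV.
func mergeLabels(image, container map[string]string) map[string]string {
	merged := make(map[string]string, len(image)+len(container))
	for k, v := range image {
		merged[k] = v
	}
	for k, v := range container {
		merged[k] = v
	}
	return merged
}

func main() {
	image := map[string]string{"Vendor": "Acme", "Version": "1.0"}     // from LABEL in the Dockerfile
	container := map[string]string{"Version": "2.0", "env": "prod"}    // from docker run --label
	fmt.Println(mergeLabels(image, container)) // map[Vendor:Acme Version:2.0 env:prod]
}
```

Neither input map is modified, so the image keeps its own labels while the container gets the merged view.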

Contributor

ibuildthecloud commented Jan 7, 2015

@shykes Can you make the oh so critical decision of whether the format should be ["key=value", "key=value2"] or { "key" : [ "value", "value2" ] } or just map[string]string. I think all the opinions and points have been shared in this thread. Additionally should this be called "Labels" or "Metadata".

Contributor

SvenDowideit commented Jan 8, 2015

I presume the container won't have access to this info?

Also, will we be able to add labels from the build cli?

ie docker build -t test --label SECRET=no . and then could the container inspect that?

Contributor

ibuildthecloud commented Jan 8, 2015

@SvenDowideit the container does not have access to this info. You currently can't add labels from the build cli, the label has to be in the Dockerfile. What would be the need to assign labels from the CLI and not from the Dockerfile?

Hello. I'm still quite new to docker. So my opinion might be a bit naive or uninformed. Personally I would be in favour of calling it Metadata instead of Labels, and having the option to allow arbitrary json structures under it. It may sound like overkill, but docker inspect already accepts a --format parameter for specifying an arbitrary go template. So that approach could work for querying or filtering arbitrary metadata structures on docker ps.

I am guessing if you implemented the feature that way, then it would be truly flexible and future-proof forever (and still should not require docker to be dictating or managing anything more either). Being able to store arbitrary metadata would be more inter-operable with external tools (that can then store their own formatted metadata structures without any restriction). So it would simplify those scenarios a lot.

And then such future intro/intra-spection feature could be more powerful too.

And I guess in fig.yml (docker compose), you would finally be able to specify arbitrary yml structures in there:

e.g. fig.yaml

service:
  image: busybox
  environment:
     - ENV_VAR1=VAL1
     - ENV_VAR2=VAL2
  metadata:
    - label1=value1
    - label2=value2
    feature1:
      sub_structure1_1:
        - arbitrary json / yaml here
    feature2:
      - metadata for a different unrelated feature / etc.

Where most people would be using the metadata key like simple labels (as you guys have already honed in on). But then some fewer more complex cases, when someone knows what they are doing and really needs it, can specify arbitrary json structure, or have 3rd party management tools populate them automatically (for the cluster and distributed management, etc).

So I guess what I am really saying is: if we can just call them metadata and not labels, then we aren't going to be limiting ourselves in the future so much. And we won't ultimately end up with yet another redundant docker-managed structure if we made the feature called labels then decided later on we needed to add a new feature called metadata: anyway. Do my concerns make sense? Are they explained well enough?

Sorry to come in with different ideas so late to the discussion.

Member

thaJeztah commented Jan 9, 2015

@dreamcat4 The problem with supporting nested JSON structures, is that will also require a lot of extra handling in Docker. While you prefer to use JSON, other people may want to store plain-text.

The next step would become that people want to modify parts of that JSON structure, or search for a deeply nested property (and want Docker to be optimized for that).

The current look at things is to make as few assumptions as possible. The value you're storing is a simple string. What you want to put in that string is up to you, or the package using it.

However, this doesn't prevent you from storing JSON, it just has to be encoded as a string;

{
    "Labels": {
        "myFancyLabel": [
            "{\"Vendor\":[\"Acme\"],\"License\":[\"GPL\"],\"Version\":[\"1.0\"],\"foo\":[\"bar\",\"baz\",\"bam\"]}"
        ]
    }
}

To use the JSON, the software that uses it, has to decode it before using.
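On the consuming side, decoding could look something like the sketch below. It assumes the multi-value `Labels` shape from the example above and takes the first value of the label; `decodeLabel` is a hypothetical helper, and nothing here is part of the Docker API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeLabel treats the first value of the named label as a JSON object
// and decodes it. Docker only sees an opaque string; the client does this.
func decodeLabel(labels map[string][]string, key string) (map[string][]string, error) {
	values, ok := labels[key]
	if !ok || len(values) == 0 {
		return nil, fmt.Errorf("label %q not set", key)
	}
	var decoded map[string][]string
	if err := json.Unmarshal([]byte(values[0]), &decoded); err != nil {
		return nil, fmt.Errorf("label %q is not valid JSON: %w", key, err)
	}
	return decoded, nil
}

func main() {
	raw := `{"Vendor":["Acme"],"License":["GPL"],"Version":["1.0"],"foo":["bar","baz","bam"]}`
	labels := map[string][]string{"myFancyLabel": {raw}}

	decoded, err := decodeLabel(labels, "myFancyLabel")
	if err != nil {
		panic(err)
	}
	fmt.Println(decoded["foo"]) // [bar baz bam]
}
```

A plain-text label passed to `decodeLabel` simply yields an error, so clients that store JSON and clients that store free-form text can coexist on the same object.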

@thaJeztah ah, i now see your original comment near the top where you first mention encoding json into a string value. Sorry I missed that the first time around.

I don't doubt your assessment about the implementation being harder. I don't think the current proposed version of it needs to be much different. I am not arguing for that. Just that the name of the feature, labels, is a very limiting concept that cannot be expanded upon any further in the future, for subsequent revisions of this feature. It seems shortsighted.

I do agree that complex search expressions do irk a lot of people. Because they already IRK me too when using complex go templates in docker inspect --format '{{ arcane go templates one-liner here }}'.

What if we could eventually turn the problem on its head? And instead simply converted any non-string objects to their json string representations, before passing them to those newly exposed string matching and search functions? Caching those json blob representations would improve the speed of successive searches.

Then they still get handled and output to stdout as the sort of embedded json strings that you showed me here in your recent example. But critically, their underlying native representation need not be forced to be a string (not right now, but eventually, when the feature is improved upon). For arbitrary structured metadata.

The same proposed text search interface(s) would then continue to work externally exactly the same as before. For in the case of complex data the output would be just json blobs for those complex data instead. Sorry. I feel I am repeating myself here.

In fig.yml, I need to express arbitrary metadata that is nested. For intro-intra-spection. In docker compose if I follow your solution to my problem then that would mean constructing a pretty horrendous encoded json blob (as per the example given above) and inserting that into string keys under a labels: yaml key for each service I declare in my human-readable yaml files. I never want to be editing json directly in a .yml document. But especially if it is a single-quoted and backslash-escaped json blob. That would be far too error prone.

Member

thaJeztah commented Jan 9, 2015

In fig.yml, I need to express arbitrary metadata that is nested [....] In docker compose [...] that would mean constructing a pretty horrendous encoded json blob

Not necessarily; that will completely depend on the software using the docker API. If Fig (or Docker Compose, or Crane, or ...) decides to support structured labels, they could allow the user to enter it as YAML or JSON. It's the responsibility of that software to convert it to a string before sending the label to the Docker API.

So (to stick with Fig/Compose as an example), you could write the fig.yml as in your earlier example. Fig/Compose would convert it to a string when sending it to the Docker API and it would end up in docker something like this;

{
    "Labels": {
        "fig.project" : ["my-project.example.com"],
        "fig.service" : ["service"],
        "fig.metadata" : ["[\"label1=value1\",\"label2=value2\",{\"feature1\":{\"sub_structure1_1\":[\"arbitrary json \/ yaml here\"]}},{\"feature2\":[\"metadata for a different unrelated feature \/ etc.\"]}]"]
    }
}

Where fig.metadata, fig.project and fig.service are just names of labels I made up. Also the structure could be really different, depending on how Fig decides to store it (perhaps "fig.metadata.label1":["value1"], "fig.metadata.label2":["value2"])?

The same would apply to (nested) searching in that "JSON", only in reverse; Docker itself doesn't support that, so Fig would have to fetch the labels from Docker and convert them back to JSON when searching.

Perhaps, in the future, it would be possible to extend this via "plugins", e.g. being able to plug-in a custom "label" storage and/or search engine. But, for now, that's out of scope for this Proposal.

+1 like the proposal (and others that are referenced in this proposal)
labels are simple yet powerful.

@ibuildthecloud wrote:

We already have labels on Hosts (Docker daemon) today. It seems that going forward we should be able to add labels to everything: Hosts, containers, volumes, images, etc.

It is probably worth stating what else is included in "etc.", for example network/network-endpoints as defined in docker#9983. Probably not endpoints, and maybe networks, but the scope of which objects can be associated with labels should be specified.

Can I also assume that docker state/libpack will distribute the labels (as it becomes available).

@thaJeztah OK then. I guess I would be happy with that.

Collaborator

vieux commented Jan 14, 2015

I would say I don't care much about labels for the images.

Why not make 2 PRs ? one for containers and one for images ?

Containers should be quite small.

Otherwise I think we should add a new column to docker ps to show the labels

Contributor

bfirsh commented Jan 14, 2015

I like this a lot. Let's make it happen.

My one concern: I think labels should have single values. Several reasons:

  • It's consistent with how other tools do it (EC2, kubernetes, openstack, etc)
  • Because it's consistent, it's not surprising. Users will expect a key to have a single value.
  • It's much simpler. A lot of complexity arises from having multiple values. At every point, you need different behaviour on whether you want a single value or multiple values. For example:
    • setting a label in an image you need to define whether you want to override all the values or add a value.
    • filters will get complicated. if a container has labels foo=5 and foo=10, does filtering by foo<7 include this container? likewise, does filtering by foo=5 include this container or just containers with a single label of foo=5? and if I wanted to filter by just containers with a single label of foo=5, how do I do that?

I agree with @bfirsh. Single values.

Contributor

rhatdan commented Jan 14, 2015

I have changed my pull request to use labels instead of UserData. This seems to follow the standard set in docker -d. We believe labels should be single value. We really need this for labeled images. We are looking at putting data into the image to instruct the commands for install, config, run. We want to make the container image contain all information to define how an application will run even if the application consists of multiple containers.

Contributor

bfirsh commented Jan 14, 2015

One thing worth noting is that if we have single-value labels here, daemon labels should also be consistent. @vieux – is this feasible, do you think?

Contributor

squaremo commented Jan 14, 2015

Labels are not structured data, and as such this PR is different from #9013, so that discussion can happen differently.

That's not what the title of this PR suggests! But in any case I agree, searchable labels have a different purpose to annotations.

Using structured data is not the only difference, however. #9013 was motivated by a design for docker plugins (extensions, proxies, whatever). Crucial to the design is that the annotations can be changed while the container is running, and interested parties (e.g., plugins) are told that this has happened. I don't see that addressed here.

For the record, I don't think the solution is to try and make this proposal encompass even more features. If anything it ought to be drastically simple -- so not only no structured values, but nothing that would make you want to use it with encoded structured values, or to rely on the values as anything other than opaque tokens.

Contributor

rhatdan commented Jan 14, 2015

Yes, let's keep this simple and get it merged. We could extend this proposal later or add alternatives in the future. For now, we just want to allow image developers to add opaque data to the image.

rade commented Jan 14, 2015

A crucial, motivating feature for #9013, and a key distinction from env vars, is the ability to add/update/delete metadata for stopped or running containers. This proposal does not appear to address that.

Collaborator

shykes commented Jan 15, 2015

Thanks Darren for taking the initiative on this! A few notes:

1) On overall design

I'm a big +1 on the need for arbitrary user data as you know.

2) On the word "label"

I'm not a huge fan of the term "label", but I don't think it contradicts another well-established term, and I don't have a strong opinion. We already use it for "host labels" (see daemon flags). Consistency matters, so I would say let's use that unless someone has a very strong opinion backed by strong evidence.

NOTE: I saw consistency with Kubernetes mentioned as an argument: that is utterly unimportant. Consistency within the project is important. Consistency with well-established standard practices is also important. Consistency with young and immature projects is not. If we start using consistency with projects like that as an argument, we're opening a Pandora's box: there will always be at least 1 bleeding edge project, somewhere, which is consistent with our pet terminology. Nobody cares.

Consistency within the same project is essential, but consistency with other projects is nice-to-have, at best, unless they're massively adopted. Kubernetes is even more immature than Docker which is saying a lot, so it really doesn't matter if we stay consistent with them. Hopefully this is not a controversial statement. We use the term "label" already at the host level (see daemon flags), so that's a more compelling argument for consistency.

3) On format of values

I agree that single string values are better, because:

  • The concept of multiple values is confusing to the user
  • Allowing arbitrary types (maps, arrays, etc.) as values doesn't seem like a hair-on-fire need right now, and we can always add it later. When in doubt, I prefer to choose the option that is easier to reverse :)

4) On namespacing

I understand your point @ibuildthecloud but I think we should still enforce at least a convention. It's important to do this now, because in practice it's the only opportunity we'll get. Once real-world users and tools start taking over the entire namespace, it will be extremely hard to add new restrictions later without breaking backward compatibility (which would go against our core principles).

So I suggest the following simple convention:

  • All third-party tools must prefix their keys with the reverse DNS notation of a domain controlled by the author of the tool. For example "sh.fig.app" or "io.rancher.organization".
  • Keys without dots are reserved for Docker's internal use. This allows using the same label system for exposing internal properties, querying them, etc. I think we could use this for a lot of properties which are currently special cases: "running, created_at" etc.

Contributor

bfirsh commented Jan 15, 2015

I think the word "label" has wide understanding as the meaning we are giving it. "Tags" is a close second. The fact we also use the term "label" for daemons is also a plus.

I have no strong feelings about namespacing. @aanand – got any thoughts on this?

Contributor

aanand commented Jan 15, 2015

I don't feel particularly strongly about namespacing, but I would like to make one point: @shykes observes correctly that it's almost always better to make a decision you can reverse later, rather than one that locks you in. In the case of namespaced keys, though, both options actually involve lock-in, just for different sets of tools:

  1. If we don't insist on namespaced keys, tools which produce labels can be locked in - should we begin to insist on namespacing later on, it's a breaking change for tools which assume they can use whatever keys they want.
  2. If we do insist on namespaced keys, tools which consume labels can be locked in - should we abandon the convention later on, tools which assume that all label keys are in a reverse-DNS format might break if they aren't.

So either way it's a risk. To me, it feels like the breakage in (1) is going to be a much more common case, which is an argument for enforcing namespacing from the start.

Member

thaJeztah commented Jan 15, 2015

I agree with @aanand. Insisting on namespaced keys is better than removing namespaces afterwards.

Some notes / questions:

  1. To keep the implementation simple and if I understand correctly, insisting on namespaces is purely convention; the code will not validate labels and "bail-out" if I set an unknown non-namespaced label. Correct?
  2. A note should be added to the "documentation" README.md to use the com.example namespace for fictitious labels in the documentation (com.example.label-name).
  3. Is there a need to reserve a "private/free to use" namespace? Just like local IP-subnets (e.g. anyone can use local.label-name, but this should not be used by software implementing labels).
  4. Document what (open source) projects should use if they don't have a domain name; should they use com.github.username.projectname or something?

Contributor

ibuildthecloud commented Jan 15, 2015

I'm glad to see this PR is moving along. Let me summarize the points so far. It seems we are very close to decisions on most of these points.

  • Name: We will go with the term "label" as this is consistent with what already exists. We recognize that this may not be the absolute best name but consistency is more important.
  • Single value per key: We will go with a single value for each key. This is more consistent with well established systems like EC2 and causes fewer corner cases. This means foo=1 and foo=2 is not valid. This implies internally we can switch to map[string]string for the data structure.
  • Structured data: Structured data (annotations) is a different discussion and as such #9013 should be for this purpose.
  • Namespacing: A convention will be useful to enforce now. We will go with reverse DNS format and keys must have at least one period in the key.
  • Labels on Images/Containers: @rhatdan presents a compelling reason to keep labels on images. Nobody has had objection to labels on containers. Labels will be on both images and containers.

Based on my summary I will be updating the code and documentation as follows

  • Switch the code to use map[string]string internally
  • Enforce that keys have at least one period in them and match the regexp [a-zA-Z0-9.-]+
  • Clearly state in the documentation that reverse DNS format is the convention.
  • Rebase on #8955 (I would still like this PR to be based on #8955 as the credit (and patience) goes to @rhatdan for this work)
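
As a rough illustration of the proposed key check (at least one period, restricted character set), here is a minimal Python sketch. The function name is hypothetical and this is not the code in the PR. The character class is written with the hyphen last so that it is read as a literal hyphen rather than as a range.

```python
import re

# Letters, digits, dots and dashes only. The hyphen is placed last in the
# class so it is a literal character, not the start of a range.
_KEY_RE = re.compile(r"[a-zA-Z0-9.-]+")


def is_valid_key(key: str) -> bool:
    """Return True if key has at least one period and only allowed characters."""
    return "." in key and _KEY_RE.fullmatch(key) is not None


for candidate in ["com.example.foo", "foo", "com.example/foo", "io.rancher.label-name"]:
    print(candidate, is_valid_key(candidate))
```

Rejecting keys that contain "_" or "/" is what leaves room to reserve those forms for internal keys later.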
Contributor

squaremo commented Jan 15, 2015

Structured data: Structured data (annotations) is a different discussion and as such #9013 should be for this purpose.

To reiterate, a more important difference is that annotations per #9013 can be changed after a container is started.

Contributor

ibuildthecloud commented Jan 15, 2015

@thaJeztah I don't see much harm in just enforcing what I just posted. This is a very light restriction. By enforcing the regexp now we can reserve space for internal keys by having keys that don't have a period, or maybe have a "_" or "/" in them.

@moxiegirl moxiegirl commented on an outdated diff Mar 16, 2015

docs/mkdocs.yml
@@ -59,6 +59,7 @@ pages:
- ['userguide/dockerimages.md', 'User Guide', 'Working with Docker Images' ]
- ['userguide/dockerlinks.md', 'User Guide', 'Linking containers together' ]
- ['userguide/dockervolumes.md', 'User Guide', 'Managing data in containers' ]
+- ['userguide/labels-custom-metadata.md', 'User Guide', 'Labels - custom meta-data in Docker' ]

moxiegirl Mar 16, 2015

Contributor

metadata

@moxiegirl moxiegirl commented on an outdated diff Mar 16, 2015

docs/sources/reference/api/docker_remote_api.md
@@ -71,15 +71,33 @@ This endpoint now returns `SystemTime`, `HttpProxy`,`HttpsProxy` and `NoProxy`.
### What's new
+**New!**

moxiegirl Mar 16, 2015

Contributor

New implies now so you can omit it. Also here you use user data but elsewhere you were using "metadata" so just be consistent.

---> try this

The build supports the LABEL command. Use this to add metadata
to an image or container. For example, you could add data describing the content of an image.

LABEL "com.example.vendor"="ACME Incorporated"

@moxiegirl moxiegirl commented on the diff Mar 16, 2015

docs/sources/reference/api/docker_remote_api.md
`POST /containers/(id)/attach` and `POST /exec/(id)/start`
**New!**
Docker client now hints potential proxies about connection hijacking using HTTP Upgrade headers.
+`POST /containers/create`
+
+**New!**
+You can set labels on container create describing the container.

moxiegirl Mar 16, 2015

Contributor

You can use labels to configure metadata for an image or a container.

ibuildthecloud Mar 16, 2015

Contributor

Labels are like env variables. They can be set on an image and then copied and applied to the container.

rade Mar 16, 2015

Perhaps the docs should clearly state somewhere how labels are different to env vars. Otherwise users may wonder why we need a new mechanism.

moxiegirl Mar 16, 2015

Contributor

@ibuildthecloud Copied and applied --- I didn't really see that in the documentation you provided. I saw you could "set LABELS" on images or containers. Where do you describe copying labels from an image to a container?

@rade I don't see where a reader would confuse a label with an env var.

ibuildthecloud Mar 16, 2015

Contributor

The key difference between labels and env vars is that labels are not visible to the processes running in the container and they allow fast lookup through the API.

I don't know if there is a place we want to add that description.

rade Mar 16, 2015

"fast lookup through the API" could presumably be added for env vars. So really the key difference is in the visibility to containers. A sentence somewhere along the lines of "Unlike env vars, labels are not visible to processes running inside a containers" would be great. Though quite where that would go I do not know. Perhaps @moxiegirl can advise.

moxiegirl Mar 17, 2015

Contributor

@rade Good point. I'll add it in after the merge.

@moxiegirl moxiegirl and 1 other commented on an outdated diff Mar 16, 2015

docs/sources/reference/api/docker_remote_api.md
`POST /containers/(id)/attach` and `POST /exec/(id)/start`
**New!**
Docker client now hints potential proxies about connection hijacking using HTTP Upgrade headers.
+`POST /containers/create`
+
+**New!**
+You can set labels on container create describing the container.
+
+`GET /containers/json`
+
+**New!**
+This endpoint now returns the labels associated with each container (`Labels`).

moxiegirl Mar 16, 2015

Contributor

This implies the labels are on the container --- aren't they on the image the container is running? Possessive works in this case.

---> try this

The endpoint returns the labels associated with a container.

ibuildthecloud Mar 16, 2015

Contributor

Similar to my previous comment, they can be set on an image and then copied and applied to the container. The labels apply to the container, but you can inspect the image separately and see the labels on the image.

@moxiegirl moxiegirl commented on an outdated diff Mar 16, 2015

docs/sources/reference/api/docker_remote_api_v1.18.md
@@ -190,6 +195,8 @@ Json Parameters:
- **OpenStdin** - Boolean value, opens stdin,
- **StdinOnce** - Boolean value, close stdin after the 1 attached client disconnects.
- **Env** - A list of environment variables in the form of `VAR=value`
+- **Labels** - A map of labels and their values that will be added to the

moxiegirl Mar 16, 2015

Contributor

Adds a map of labels to a container. To specify a map: {"key":"value"[,"key2":"value2"]}
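
For reference, a request body carrying such a map could be assembled as in the Python sketch below. The helper name is hypothetical; only the `Image` and `Labels` fields are shown, and the values are coerced to strings because labels are stored as strings.

```python
import json


def create_container_body(image: str, labels: dict) -> str:
    """Build a JSON body for POST /containers/create with a Labels map."""
    body = {
        "Image": image,
        # Keys and values are plain strings; coerce anything else.
        "Labels": {str(k): str(v) for k, v in labels.items()},
    }
    return json.dumps(body)


print(create_container_body("ubuntu", {"com.example.vendor": "ACME Incorporated"}))
```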

Contributor

moxiegirl commented Mar 16, 2015

GLOBAL: In the docs and the syntax pick key/value or name/value and use whichever you use consistently. I tried to standardize on key/value in my comments.

@moxiegirl moxiegirl commented on the diff Mar 16, 2015

docs/sources/reference/builder.md
@@ -328,6 +328,27 @@ default specified in `CMD`.
> the result; `CMD` does not execute anything at build time, but specifies
> the intended command for the image.
+## LABEL

moxiegirl Mar 16, 2015

Contributor

LABEL

LABEL <key>=<value> <key>=<value> <key>=<value> ...

The LABEL instruction adds metadata to an image. A LABEL is a
key-value pair. To include spaces within a LABEL value, use quotes and
backslashes as you would in command-line parsing.

LABEL "com.example.vendor"="ACME Incorporated"

An image can have more than one label. To specify multiple labels, separate each
key-value pair by an EOL.

LABEL com.example.label-without-value
LABEL com.example.label-with-value="foo"
LABEL version="1.0"
LABEL description="This text illustrates \
that label-values can span multiple lines."

To view an image's labels, use the docker inspect command.

ibuildthecloud Mar 16, 2015

Contributor

I added this text, I'm not sure why github isn't removing the text.

@moxiegirl moxiegirl commented on an outdated diff Mar 16, 2015

docs/sources/reference/commandline/cli.md
@@ -791,6 +791,8 @@ Creates a new container.
-h, --hostname="" Container host name
-i, --interactive=false Keep STDIN open even if not attached
--ipc="" IPC namespace to use
+ -l, --label=[] Set meta data on the container (e.g., --label=com.example.key=value)

moxiegirl Mar 16, 2015

Contributor

metadata

@moxiegirl moxiegirl commented on an outdated diff Mar 16, 2015

docs/sources/reference/commandline/cli.md
@@ -1662,6 +1665,8 @@ removed before the image is removed.
--link=[] Add link to another container
--lxc-conf=[] Add custom lxc options
-m, --memory="" Memory limit
+ -l, --label=[] Set meta data on the container (e.g., --label=com.example.key=value)

moxiegirl Mar 16, 2015

Contributor

metadata

@moxiegirl moxiegirl commented on an outdated diff Mar 16, 2015

docs/sources/reference/commandline/cli.md
@@ -1662,6 +1665,8 @@ removed before the image is removed.
--link=[] Add link to another container
--lxc-conf=[] Add custom lxc options
-m, --memory="" Memory limit
+ -l, --label=[] Set meta data on the container (e.g., --label=com.example.key=value)
+ --label-file=[] Read in a line delimited file of labels

moxiegirl Mar 16, 2015

Contributor

Read in a file of labels (EOL delimited)

@moxiegirl moxiegirl commented on an outdated diff Mar 16, 2015

docs/sources/reference/commandline/cli.md
@@ -1835,6 +1840,36 @@ An example of a file passed with `--env-file`
This will create and run a new container with the container name being
`console`.
+ $ sudo docker run -l my-label --label com.example.foo=bar ubuntu bash

moxiegirl Mar 16, 2015

Contributor

The intro sentence refers to console -- cut-n-paste error. Below, you meant to refer to the env-file but repeated the ref to label-file. Tightened presentation

-->

A label is a key=value pair that applies metadata to a container. To label a container with two labels:

$ sudo docker run -l my-label --label com.example.foo=bar ubuntu bash

The my-label key doesn't specify a value so the label defaults to an empty
string (""). To add multiple labels, repeat the label flag (-l or
--label).

The key=value must be unique. If you specify the same key multiple times
with different values, each subsequent value overwrites the previous. Docker
applies the last key=value you supply.
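
The "last value wins" and "missing value defaults to empty string" behaviour described above can be sketched in a few lines of Python. The function name is hypothetical and this is not Docker's flag parser, just the semantics as stated.

```python
def labels_from_flags(flags):
    """Turn repeated -l/--label arguments into a single key -> value map."""
    labels = {}
    for flag in flags:
        # Split on the first "=" only; a bare key gets the empty string.
        key, _, value = flag.partition("=")
        labels[key] = value  # a later flag with the same key overwrites
    return labels


print(labels_from_flags(["my-label", "com.example.foo=bar"]))
print(labels_from_flags(["sven=yes", "sven=no", "sven=yes"]))
```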

Use the --label-file flag to load multiple labels from a file. Delimit each
label in the file with an EOL mark. The example below loads labels from a
labels file in the current directory:

$ sudo docker run --label-file ./labels ubuntu bash

The label-file format is similar to the format for loading environment variables
(see --env-file above). The following example illustrates a label-file format:

com.example.label1="a label"

# this is a comment
com.example.label2=another\ label
com.example.label3

You can load multiple label-files by supplying the --label-file flag multiple
times.
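
The label-file format shown above can be read roughly as in this Python sketch. It is an assumption-laden illustration: it skips blank lines and `#` comments and splits on the first `=`, but it deliberately leaves quotes and backslashes untouched, since how Docker interprets them is not spelled out here. The function name is hypothetical.

```python
def parse_label_file(text: str) -> dict:
    """Parse EOL-delimited labels; values are kept exactly as written."""
    labels = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        key, _, value = line.partition("=")
        labels[key] = value  # a key without "=" gets the empty string
    return labels


sample = "\n".join([
    'com.example.label1="a label"',
    "",
    "# this is a comment",
    "com.example.label2=another\\ label",
    "com.example.label3",
])
print(parse_label_file(sample))
```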

For additional information on working with labels, see
Labels - custom meta-data in Docker in
the Docker User Guide.

@moxiegirl moxiegirl commented on an outdated diff Mar 16, 2015

docs/sources/userguide/labels-custom-metadata.md
@@ -0,0 +1,194 @@
+page_title: Labels - custom meta-data in Docker
+page_description: Learn how to work with custom meta-data in Docker, using labels.

moxiegirl Mar 16, 2015

Contributor

metadata

@moxiegirl moxiegirl commented on an outdated diff Mar 16, 2015

docs/sources/userguide/labels-custom-metadata.md
@@ -0,0 +1,194 @@
+page_title: Labels - custom meta-data in Docker
+page_description: Learn how to work with custom meta-data in Docker, using labels.
+page_keywords: Usage, user guide, labels, meta-data, docker, documentation, examples, annotating
+
+## Labels - custom meta-data in Docker

moxiegirl Mar 16, 2015

Contributor

page_title: Labels - custom metadata in Docker
page_description: Learn how to work with custom metadata in Docker, using labels.
page_keywords: Usage, user guide, labels, metadata, docker, documentation, examples, annotating

Labels - custom metadata in Docker

You can add metadata to your images, containers, and daemons via
labels. Metadata can serve a wide range of uses. Use labels to add notes or
licensing information to an image or to identify a host.

A label is a <key> / <value> pair. Docker stores the values as strings.
You can specify multiple labels but each <key> / <value> must be unique. If
you specify the same key multiple times with different values, each subsequent
value overwrites the previous. Docker applies the last key=value you supply.

Note: Support for daemon-labels was added in Docker 1.4.1. Labels on
containers and images are new in Docker 1.6.0.

Naming your labels - namespaces

Docker puts no hard restrictions on the label key you use. However, labels can
conflict. For example, you can categorize your images by using a chip "architecture"
label:

LABEL architecture="amd64"

LABEL architecture="ARMv7"

But a user can label images by building architectural style:

LABEL architecture="Art Nouveau"

To prevent such conflicts, Docker namespaces label keys using a reverse domain
notation. This notation has the following guidelines:

  • All (third-party) tools should prefix their keys with the
    reverse DNS notation of a domain controlled by the author. For
    example, com.example.some-label.
  • The com.docker.*, io.docker.* and com.dockerproject.* namespaces are
    reserved for Docker's internal use.
  • Keys should only consist of lower-cased alphanumeric characters,
    dots and dashes (for example, [a-z0-9-.])
  • Keys should start and end with an alphanumeric character
  • Keys may not contain consecutive dots or dashes.
  • Keys without namespace (dots) are reserved for CLI use. This allows end-
    users to add metadata to their containers and images, without having to type
    cumbersome namespaces on the command-line.
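
A tool that wants to check its own keys against these guidelines could do something like the Python sketch below. The function names are hypothetical, and one reading is assumed: "consecutive dots or dashes" is taken to also rule out a dot directly next to a dash.

```python
import re

# Lower-case alphanumeric runs separated by single dots or dashes.
# This enforces: allowed characters, alphanumeric start and end,
# and no two separators in a row.
_GUIDELINE_RE = re.compile(r"[a-z0-9]+(?:[.-][a-z0-9]+)*")


def follows_guidelines(key: str) -> bool:
    """Return True if key respects the character-level guidelines."""
    return _GUIDELINE_RE.fullmatch(key) is not None


def is_namespaced(key: str) -> bool:
    """Keys without dots are the ones set aside for CLI use."""
    return "." in key


for key in ["com.example.some-label", "architecture", "com..example", "-com.example"]:
    print(key, follows_guidelines(key), is_namespaced(key))
```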

These are guidelines; Docker does not enforce them.
Failing to follow these guidelines can result in conflicting labels. If you're
building a tool that uses labels, you should use namespaces for your label keys.

Storing structured data in labels

Label values can contain any data type as long as the value can be stored as a
string. For example, consider this JSON:

{
    "Description": "A containerized foobar",
    "Usage": "docker run --rm example/foobar [args]",
    "License": "GPL",
    "Version": "0.0.1-beta",
    "aBoolean": true,
    "aNumber" : 0.01234,
    "aNestedArray": ["a", "b", "c"]
}

You can store this struct in a label by serializing it to a string first:

LABEL com.example.image-specs="{\"Description\":\"A containerized foobar\",\"Usage\":\"docker run --rm example\\/foobar [args]\",\"License\":\"GPL\",\"Version\":\"0.0.1-beta\",\"aBoolean\":true,\"aNumber\":0.01234,\"aNestedArray\":[\"a\",\"b\",\"c\"]}"

While it is possible to store structured data in label values, Docker treats this
data as a 'regular' string. This means that Docker doesn't offer ways to query
(filter) based on nested properties.

If your tool needs to filter on nested properties, the tool itself should
implement this.
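
As a sketch of what "the tool itself should implement this" might look like, the Python below serializes a structure into a label value and then filters on a nested property on the client side. All names are hypothetical, and the `images` mapping stands in for whatever the tool has already fetched from Docker.

```python
import json


def encode_label_value(data) -> str:
    """Serialize structured data to a compact JSON string for use as a label value."""
    return json.dumps(data, separators=(",", ":"))


def filter_by_nested(images: dict, label_key: str, prop: str, expected) -> list:
    """Return names of images whose JSON label has prop == expected."""
    matches = []
    for name, labels in images.items():
        raw = labels.get(label_key)
        if raw is None:
            continue
        try:
            decoded = json.loads(raw)
        except ValueError:
            continue  # label is not JSON; Docker only sees a plain string anyway
        if isinstance(decoded, dict) and decoded.get(prop) == expected:
            matches.append(name)
    return matches


value = encode_label_value({"License": "GPL", "Version": "0.0.1-beta"})
images = {"example/foobar": {"com.example.image-specs": value}}
print(filter_by_nested(images, "com.example.image-specs", "License", "GPL"))
```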

Adding labels to images; the LABEL instruction

Adding labels to an image:

LABEL [<namespace>.]<key>[=<value>] ...

The LABEL instruction adds a label to your image, optionally setting its value.
Use surrounding quotes or backslashes for labels that contain
white space characters:

LABEL vendor=ACME\ Incorporated
LABEL com.example.version.is-beta
LABEL com.example.version="0.0.1-beta"
LABEL com.example.release-date="2015-02-12"

The LABEL instruction supports setting multiple labels in a single instruction
using this notation:

LABEL com.example.version="0.0.1-beta" com.example.release-date="2015-02-12"

Wrapping is allowed by using a backslash (\) as continuation marker:

LABEL vendor=ACME\ Incorporated \
      com.example.is-beta \
      com.example.version="0.0.1-beta" \
      com.example.release-date="2015-02-12"

Docker recommends combining labels in a single LABEL instruction instead of
using a LABEL instruction for each label. Each instruction in a Dockerfile
produces a new layer that can result in an inefficient image if you use many
labels.

You can view the labels via the docker inspect command:

$ docker inspect 4fa6e0f0c678

...
"Labels": {
    "vendor": "ACME Incorporated",
    "com.example.is-beta": "",
    "com.example.version": "0.0.1-beta",
    "com.example.release-date": "2015-02-12"
}
...

$ docker inspect -f "{{json .Labels }}" 4fa6e0f0c678

{"vendor":"ACME Incorporated","com.example.is-beta":"","com.example.version":"0.0.1-beta","com.example.release-date":"2015-02-12"}

Querying labels

Besides storing metadata, you can filter images and containers by label. To list all
running containers that have a com.example.is-beta label:

# List all running containers that have a `com.example.is-beta` label
$ docker ps --filter "label=com.example.is-beta"

List all running containers with a color label of blue:

$ docker ps --filter "label=color=blue"

List all images with vendor ACME:

$ docker images --filter "label=vendor=ACME"
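
The two filter forms used above (`label=<key>` and `label=<key>=<value>`) can be modelled as in this Python sketch. It assumes exact string equality on the value, which is not stated in this text, and the function name is hypothetical.

```python
def matches_label_filter(labels: dict, expr: str) -> bool:
    """Evaluate a 'label=<key>' or 'label=<key>=<value>' filter against labels."""
    prefix = "label="
    if not expr.startswith(prefix):
        raise ValueError("not a label filter: " + expr)
    key, sep, value = expr[len(prefix):].partition("=")
    if key not in labels:
        return False
    # No "=" after the key: presence alone is enough.
    return True if not sep else labels[key] == value


container = {"com.example.is-beta": "", "color": "blue"}
print(matches_label_filter(container, "label=com.example.is-beta"))
print(matches_label_filter(container, "label=color=blue"))
print(matches_label_filter(container, "label=color=red"))
```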

Daemon labels

docker -d \
  --dns 8.8.8.8 \
  --dns 8.8.4.4 \
  -H unix:///var/run/docker.sock \
  --label com.example.environment="production" \
  --label com.example.storage="ssd"

These labels appear as part of the docker info output for the daemon:

docker -D info
Containers: 12
Images: 672
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 697
Execution Driver: native-0.2
Kernel Version: 3.13.0-32-generic
Operating System: Ubuntu 14.04.1 LTS
CPUs: 1
Total Memory: 994.1 MiB
Name: docker.example.com
ID: RC3P:JTCT:32YS:XYSB:YUBG:VFED:AAJZ:W3YW:76XO:D7NN:TEVU:UCRW
Debug mode (server): false
Debug mode (client): true
Fds: 11
Goroutines: 14
EventsListeners: 0
Init Path: /usr/bin/docker
Docker Root Dir: /var/lib/docker
WARNING: No swap limit support
Labels:
 com.example.environment=production
 com.example.storage=ssd

Contributor

moxiegirl commented Mar 16, 2015

When I got to your final and new page on labels, I see that labels can apply to images or containers. So, in my edits just apply the proper noun for the context.

Contributor

moxiegirl commented Mar 16, 2015

@ibuildthecloud Hey if you can't see the MD in my comments let me know, I stored that last file offline just in case.

Contributor

ibuildthecloud commented Mar 16, 2015

@moxiegirl I can't see the MD source. For most of the comments I just made the appropriate changes but if you could stick docs/sources/userguide/labels-custom-metadata.md in a gist or something that would be great.

Contributor

icecrime commented Mar 16, 2015

@moxiegirl Please take another look! :-)

Documentation changes for labels
Signed-off-by: Darren Shepherd <darren@rancher.com>

Contributor

ibuildthecloud commented Mar 17, 2015

@moxiegirl I believe all comments have been addressed and I copied in your version of docs/sources/userguide/labels-custom-metadata.md

Contributor

moxiegirl commented Mar 17, 2015

LGTM

@SvenDowideit SvenDowideit commented on the diff Mar 17, 2015

docs/man/Dockerfile.5.md
@@ -143,6 +143,21 @@ A Dockerfile is similar to a Makefile.
**CMD** executes nothing at build time, but specifies the intended command for
the image.
+**LABEL**
+ -- `LABEL <key>[=<value>] [<key>[=<value>] ...]`
+ The **LABEL** instruction adds metadata to an image. A **LABEL** is a
+ key-value pair. To include spaces within a **LABEL** value, use quotes and
+ blackslashes as you would in command-line parsing.
+
+ ```
+ LABEL "com.example.vendor"="ACME Incorporated"
+ ```
+
+ An image can have more than one label. To specify multiple labels, separate each
+ key-value pair by a space.

SvenDowideit Mar 17, 2015

Contributor

what happens if my Dockerfile has more than one LABEL command?

how do LABELs in FROM images affect the end result - are they additive?

ibuildthecloud Mar 17, 2015

Contributor

They behave just the same as ENV, so they are all additive: multiple LABEL or LABEL in the FROM image. Labels of the same name override previous ones.

@SvenDowideit SvenDowideit commented on the diff Mar 17, 2015

docs/man/docker-create.1.md
@@ -102,6 +104,12 @@ IMAGE [COMMAND] [ARG...]
'container:<name|id>': reuses another container shared memory, semaphores and message queues
'host': use the host shared memory,semaphores and message queues inside the container. Note: the host mode gives the container full access to local shared memory and is therefore considered insecure.
+**-l**, **--label**=[]
+ Set metadata on the container (e.g., --label=com.example.key=value)

SvenDowideit Mar 17, 2015

Contributor

does this Set i.e., replace, or Add a label?

ibuildthecloud Mar 17, 2015

Contributor

I don't know if set or add is the right verb. The behavior is similar to ENV. The image has labels which are copied to the container and then -l can add additional ones. So maybe add is the right term.
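
The ENV-like layering described here (base image labels, then the image's own LABELs, then `-l` at create time, with same-named keys overriding earlier ones) amounts to a simple ordered merge. A minimal Python sketch, with a hypothetical function name and made-up label values:

```python
def effective_labels(*layers: dict) -> dict:
    """Merge label maps in order; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


base_image = {"vendor": "ACME", "version": "1.0"}   # labels from the FROM image
dockerfile = {"version": "2.0"}                      # LABEL in this Dockerfile
run_flags = {"com.example.env": "test"}              # -l at container create
print(effective_labels(base_image, dockerfile, run_flags))
```

Seen this way, `-l` both adds new keys and replaces existing ones, which is why neither "set" nor "add" fits perfectly.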

@SvenDowideit SvenDowideit commented on the diff Mar 17, 2015

docs/man/docker-create.1.md
@@ -102,6 +104,12 @@ IMAGE [COMMAND] [ARG...]
'container:<name|id>': reuses another container shared memory, semaphores and message queues
'host': use the host shared memory,semaphores and message queues inside the container. Note: the host mode gives the container full access to local shared memory and is therefore considered insecure.
+**-l**, **--label**=[]
+ Set metadata on the container (e.g., --label=com.example.key=value)
+
+**--label-file**=[]
+ Read in a file of labels (EOL delimited)

SvenDowideit Mar 17, 2015

Contributor

read in, and do what?

ibuildthecloud Mar 17, 2015

Contributor

That text was suggested by @moxiegirl.

@SvenDowideit SvenDowideit commented on the diff Mar 17, 2015

docs/sources/reference/builder.md
@@ -328,6 +328,27 @@ default specified in `CMD`.
> the result; `CMD` does not execute anything at build time, but specifies
> the intended command for the image.
+## LABEL
+
+ LABEL <key>=<value> <key>=<value> <key>=<value> ...
+
+The `LABEL` instruction adds metadata to an image. A `LABEL` is a
+key-value pair. To include spaces within a `LABEL` value, use quotes and
+blackslashes as you would in command-line parsing.

SvenDowideit Mar 17, 2015

Contributor

Does an image inherit the labels FROM its base image?

@SvenDowideit SvenDowideit commented on the diff Mar 17, 2015

docs/sources/reference/builder.md
@@ -328,6 +328,27 @@ default specified in `CMD`.
> the result; `CMD` does not execute anything at build time, but specifies
> the intended command for the image.
+## LABEL
+
+ LABEL <key>=<value> <key>=<value> <key>=<value> ...
+
+The `LABEL` instruction adds metadata to an image. A `LABEL` is a
+key-value pair. To include spaces within a `LABEL` value, use quotes and
+blackslashes as you would in command-line parsing.
+
+ LABEL "com.example.vendor"="ACME Incorporated"
+
+An image can have more than one label. To specify multiple labels, separate each
+key-value pair by an EOL.

SvenDowideit Mar 17, 2015

Contributor

this statement does not gel with the examples that follow it.

@SvenDowideit SvenDowideit commented on the diff Mar 17, 2015

docs/sources/reference/commandline/cli.md
@@ -1832,8 +1837,39 @@ An example of a file passed with `--env-file`
$ sudo docker run --name console -t -i ubuntu bash
-This will create and run a new container with the container name being
-`console`.
+A label is a a `key=value` pair that applies metadata to a container. To label a container with two labels:
+
+ $ sudo docker run -l my-label --label com.example.foo=bar ubuntu bash
+
+The `my-label` key doesn't specify so the label defaults to an empty

SvenDowideit Mar 17, 2015

Contributor

key doesn't specify __a value__

@SvenDowideit SvenDowideit commented on the diff Mar 17, 2015

docs/sources/reference/commandline/cli.md
@@ -1832,8 +1837,39 @@ An example of a file passed with `--env-file`
$ sudo docker run --name console -t -i ubuntu bash
-This will create and run a new container with the container name being
-`console`.
+A label is a a `key=value` pair that applies metadata to a container. To label a container with two labels:
+
+ $ sudo docker run -l my-label --label com.example.foo=bar ubuntu bash
+
+The `my-label` key doesn't specify so the label defaults to an empty
+string(`""`). To add multiple labels, repeat the label flag (`-l` or
+`--label`).
+
+The `key=value` must be unique. If you specify the same key multiple times

SvenDowideit Mar 17, 2015

Contributor

if I can specify a key multiple times, then the key=value doesn't need to be unique - please remove this sentence :)

ie, I can have docker run -l sven=yes -l sven=no -l sven=yes ubuntu bash and docker will not error out.

@SvenDowideit SvenDowideit commented on the diff Mar 17, 2015

docs/sources/reference/commandline/cli.md
-`console`.
+A label is a a `key=value` pair that applies metadata to a container. To label a container with two labels:
+
+ $ sudo docker run -l my-label --label com.example.foo=bar ubuntu bash
+
+The `my-label` key doesn't specify so the label defaults to an empty
+string(`""`). To add multiple labels, repeat the label flag (`-l` or
+`--label`).
+
+The `key=value` must be unique. If you specify the same key multiple times
+with different values, each subsequent value overwrites the previous. Docker
+applies the last `key=value` you supply.
+
+Use the `--label-file` flag to load multiple labels from a file. Delimit each
+label in the file with an EOL mark. The example below loads labels from a
+labels file in the current directory;
@SvenDowideit

SvenDowideit Mar 17, 2015

Contributor

can I have multiple --label-file flags? what happens?

(as i read on i get the answer to my question)

please move the 'You can load multiple....' sentence up to this paragraph, so the reader isn't distracted :)

@SvenDowideit SvenDowideit commented on the diff Mar 17, 2015

docs/sources/reference/commandline/cli.md
+The `my-label` key doesn't specify so the label defaults to an empty
+string(`""`). To add multiple labels, repeat the label flag (`-l` or
+`--label`).
+
+The `key=value` must be unique. If you specify the same key multiple times
+with different values, each subsequent value overwrites the previous. Docker
+applies the last `key=value` you supply.
+
+Use the `--label-file` flag to load multiple labels from a file. Delimit each
+label in the file with an EOL mark. The example below loads labels from a
+labels file in the current directory;
+
+ $ sudo docker run --label-file ./labels ubuntu bash
+
+The label-file format is similar to the format for loading environment variables
+(see `--env-file` above). The following example illustrates a label-file format;
@SvenDowideit

SvenDowideit Mar 17, 2015

Contributor

can this be a link?
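The label-file format quoted in the diff above is described only as EOL-delimited and similar to `--env-file`. A minimal sketch of a reader for such a file is below. It assumes one `key=value` entry per line, that blank lines and lines starting with `#` are skipped (as env-files allow), and that labels from later files override earlier ones; none of these details are confirmed by the text quoted here. `parse_label_file` and `load_label_files` are hypothetical names, and the functions take file contents as strings to stay self-contained.

```python
def parse_label_file(text):
    """Parse label-file contents: one key=value entry per line (assumed format)."""
    labels = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and "#" comment lines are ignored.
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        labels[key] = value
    return labels


def load_label_files(texts):
    """Merge several label files in order; later files win (assumed behaviour)."""
    merged = {}
    for text in texts:
        merged.update(parse_label_file(text))
    return merged


first = "# build metadata\ncom.example.vendor=ACME\ncom.example.tier=dev\n"
second = "com.example.tier=prod\n"
print(load_label_files([first, second]))
```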

@SvenDowideit SvenDowideit commented on the diff Mar 17, 2015

docs/sources/userguide/labels-custom-metadata.md
@@ -0,0 +1,194 @@
+page_title: Labels - custom metadata in Docker
+page_description: Learn how to work with custom metadata in Docker, using labels.
+page_keywords: Usage, user guide, labels, metadata, docker, documentation, examples, annotating
+
+## Labels - custom metadata in Docker
+
+You can add metadata to your images, containers, and daemons via
+labels. Metadata can serve a wide range of uses. Use them to add notes or
+licensing information to an image or to identify a host.
+
+A label is a `<key>` / `<value>` pair. Docker stores the values as *strings*.
+You can specify multiple labels but each `<key>` / `<value>` must be unique. If
+you specify the same `key` multiple times with different values, each subsequent
+value overwrites the previous. Docker applies the last `key=value` you supply.
@SvenDowideit

SvenDowideit Mar 17, 2015

Contributor

this sounds like it should be written as each <key> will be unique, and have one value

if you say each key=value must be unique, aren't you saying that a key can be in the store more than once?

@SvenDowideit SvenDowideit commented on the diff Mar 17, 2015

docs/sources/userguide/labels-custom-metadata.md
+
+- Keys should only consist of lower-cased alphanumeric characters,
+ dots and dashes (for example, `[a-z0-9-.]`)
+
+- Keys should start *and* end with an alpha numeric character
+
+- Keys may not contain consecutive dots or dashes.
+
+- Keys *without* namespace (dots) are reserved for CLI use. This allows end-
+ users to add metadata to their containers and images, without having to type
+ cumbersome namespaces on the command-line.
+
+
+These are guidelines and are not enforced. Docker does not *enforce* them.
+Failing following these guidelines can result in conflicting labels. If you're
+building a tool that uses labels, you *should* use namespaces for your label keys.
@SvenDowideit

SvenDowideit Mar 17, 2015

Contributor

can we make this more explicit? conflict can mean several things - when really (if i understand correctly) the last LABEL for a particular key replaces all others. (for that image layer? - but leaving the previous image label alone?)
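Since Docker does not enforce the key guidelines quoted in the diff above, a tool that wants to follow them has to check keys itself. One possible check is sketched below. It reads "no consecutive dots or dashes" as "no two adjacent separators of any kind", which is an interpretation rather than something the guidelines state; `follows_guidelines` and `is_namespaced` are hypothetical helpers.

```python
import re

# Runs of lower-case alphanumerics joined by single "." or "-" separators.
# This also guarantees the key starts and ends with an alphanumeric character.
KEY_PATTERN = re.compile(r"[a-z0-9]+(?:[.-][a-z0-9]+)*")


def follows_guidelines(key):
    """Return True if the key matches the (unenforced) naming guidelines."""
    return KEY_PATTERN.fullmatch(key) is not None


def is_namespaced(key):
    """Keys without dots are described as reserved for CLI use."""
    return "." in key


for key in ["com.example.foo", "my-label", "Com.Example", "foo..bar", "-foo"]:
    print(key, follows_guidelines(key), is_namespaced(key))
```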

@SvenDowideit SvenDowideit commented on the diff Mar 17, 2015

docs/sources/reference/builder.md
+ LABEL <key>=<value> <key>=<value> <key>=<value> ...
+
+The `LABEL` instruction adds metadata to an image. A `LABEL` is a
+key-value pair. To include spaces within a `LABEL` value, use quotes and
+blackslashes as you would in command-line parsing.
+
+ LABEL "com.example.vendor"="ACME Incorporated"
+
+An image can have more than one label. To specify multiple labels, separate each
+key-value pair by an EOL.
+
+ LABEL com.example.label-without-value
+ LABEL com.example.label-with-value="foo"
+ LABEL version="1.0"
+ LABEL description="This text illustrates \
+ that label-values can span multiple lines."
@SvenDowideit

SvenDowideit Mar 17, 2015

Contributor

Can we note that the example above will result in 4 image layers - each of which will have different sets of labels?

and if the example has the same key used twice, we can show how the conflict resolution works, and that the subsequent label doesn't affect that of the lower layer?
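To make the question concrete, here is a small model of the behaviour being asked about. It assumes that each `LABEL` instruction produces a new layer whose label set is a copy of its parent's plus the new values, so a later value replaces an earlier one only from that layer upward. That is the reading suggested in this comment, not something the quoted docs confirm, and `build_layers` is a hypothetical helper.

```python
def build_layers(instructions):
    """Return the cumulative label set after each LABEL instruction."""
    layers = []
    current = {}
    for new_labels in instructions:
        # Build a fresh dict so the label sets of earlier layers stay unchanged.
        current = {**current, **new_labels}
        layers.append(current)
    return layers


layers = build_layers([
    {"version": "1.0"},
    {"description": "first"},
    {"version": "2.0"},
])
print(layers[0])  # {'version': '1.0'}
print(layers[2])  # {'version': '2.0', 'description': 'first'}
```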

@SvenDowideit SvenDowideit commented on the diff Mar 17, 2015

docs/sources/userguide/labels-custom-metadata.md
+string. For example, consider this JSON:
+
+
+ {
+ "Description": "A containerized foobar",
+ "Usage": "docker run --rm example/foobar [args]",
+ "License": "GPL",
+ "Version": "0.0.1-beta",
+ "aBoolean": true,
+ "aNumber" : 0.01234,
+ "aNestedArray": ["a", "b", "c"]
+ }
+
+You can store this struct in a label by serializing it to a string first:
+
+ LABEL com.example.image-specs="{\"Description\":\"A containerized foobar\",\"Usage\":\"docker run --rm example\\/foobar [args]\",\"License\":\"GPL\",\"Version\":\"0.0.1-beta\",\"aBoolean\":true,\"aNumber\":0.01234,\"aNestedArray\":[\"a\",\"b\",\"c\"]}"
@SvenDowideit

SvenDowideit Mar 17, 2015

Contributor
LABEL com.example.image-specs="{\
    \"Description\":\"A containerized foobar\",\
    \"Usage\":\"docker run --rm example\\/foobar [args]\",\
    \"License\":\"GPL\",\
    \"Version\":\"0.0.1-beta\",\
    \"aBoolean\":true,\
    \"aNumber\":0.01234,\
    \"aNestedArray\":[\"a\",\"b\",\"c\"]\
}"

might be less painful.
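Rather than escaping the JSON by hand, the `LABEL` line could be generated. The sketch below serializes a dict compactly and then backslash-escapes backslashes and double quotes so the result fits inside a quoted value. It assumes the Dockerfile parser un-escapes `\"` and `\\` as in command-line parsing, as the quoted docs suggest; `label_instruction` is a hypothetical helper. Unlike the example in the diff, Python's `json.dumps` does not escape `/`.

```python
import json


def label_instruction(key, data):
    """Build a LABEL line whose value is `data` serialized as compact JSON."""
    compact = json.dumps(data, separators=(",", ":"))
    # Escape backslashes first, then double quotes, for use inside "...".
    escaped = compact.replace("\\", "\\\\").replace('"', '\\"')
    return f'LABEL {key}="{escaped}"'


specs = {"License": "GPL", "aBoolean": True, "aNestedArray": ["a", "b"]}
print(label_instruction("com.example.image-specs", specs))
# LABEL com.example.image-specs="{\"License\":\"GPL\",\"aBoolean\":true,\"aNestedArray\":[\"a\",\"b\"]}"
```

A consumer would read the label value back as a string and pass it to a JSON parser.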

Contributor

SvenDowideit commented Mar 17, 2015

yup, minor nits - basically, most of my questions are answered elsewhere, but the reader won't know that at the time.

LGTM - though if you address the nits, it'll Look even better :)

Contributor

icecrime commented Mar 17, 2015

Thanks all, and thanks Darren for your patience! @moxiegirl will take care of the final adjustments in a separate PR.

icecrime pushed a commit that referenced this pull request Mar 17, 2015

Merge pull request #9882 from ibuildthecloud/labels
Proposal: One Meta Data to Rule Them All => Labels

@icecrime icecrime merged commit b6ac111 into moby:master Mar 17, 2015

1 of 2 checks passed

windows Jenkins build Windows-PRs 469 has failed
janky Jenkins build Docker-PRs 3445 has succeeded
Contributor

ibuildthecloud commented Mar 17, 2015

@icecrime Thank you so much for helping move this along.

Member

thaJeztah commented Mar 17, 2015

Wow! Happy to see this merged. Thanks @ibuildthecloud for finally making this happen!

wyaeld commented Mar 17, 2015

huge thank-you to all the people who worked on this and the various discussions that led to it, such a useful building block

One question (not to try and increase scope here though). Do people think it makes sense for the distribution components (docker search) to eventually allow for filtering on the labels?

@TomasTomecek TomasTomecek commented on the diff Mar 17, 2015

docs/man/Dockerfile.5.md
@@ -143,6 +143,21 @@ A Dockerfile is similar to a Makefile.
**CMD** executes nothing at build time, but specifies the intended command for
the image.
+**LABEL**
+ -- `LABEL <key>[=<value>] [<key>[=<value>] ...]`
+ The **LABEL** instruction adds metadata to an image. A **LABEL** is a
+ key-value pair. To include spaces within a **LABEL** value, use quotes and
+ blackslashes as you would in command-line parsing.
@TomasTomecek

TomasTomecek Mar 17, 2015

Contributor

blackslashes → backslashes

@thaJeztah

thaJeztah Mar 17, 2015

Member

@TomasTomecek that's "dark matter" I think 😄 Do you want to make a pull-request to fix that?

@TomasTomecek

TomasTomecek Mar 17, 2015

Contributor

@thaJeztah
@icecrime said that @moxiegirl will do final adjustments; I guess this can be included in those

@thaJeztah

thaJeztah Mar 17, 2015

Member

Ah, you're right! Thanks for spotting it nevertheless :)

@dreamcat4 dreamcat4 referenced this pull request in michaelsauter/crane Mar 17, 2015

Closed

Support new Docker 1.6.0 features - Labels #160

Contributor

rhatdan commented Mar 17, 2015

This is huge. Only been waiting on this one for about a year....

Contributor

bfirsh commented Mar 17, 2015

Yaaaay. What an excellent birthday present. Thanks all for your help.

We're going to try and get Compose support in for this in the next release: docker/compose#1066

@bfirsh Well, if you implement Compose support, please consider @thaJeztah's idea from earlier in this thread ^^. Compose could auto-convert nested data structures into a JSON text blob. That would give us basically an annotations-like feature (immutable), with arbitrary or complex data structures saved into labels.

nicornk commented Aug 3, 2015

Hi Guys,
is it possible / how is it possible to add a label to an already running container?
Thanks!

Contributor

cpuguy83 commented Aug 3, 2015

@nicornk Not possible.
