Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Save "META" field in Dockerfile into image content. #7470

Closed
wants to merge 1 commit into from

Conversation

rhatdan
Copy link
Contributor

@rhatdan rhatdan commented Aug 7, 2014

This will allow a user to save META data into an image, which
can later be retrieved using:

docker inspect IMAGEID

I have copied this from the "Comment" handling in docker images.

We want to be able to add json data to an image to describe the image, and
then be able to use other tools to look at this data, to be able to do
security checks based on this data.

We are thinking about adding version names, Perhaps listing the content of the dockerfile.
Descriptions of where the code came from etc.

One option is to call this flag Meta, another would be to call it Vendor.

Docker-DCO-1.1-Signed-off-by: Dan Walsh dwalsh@redhat.com (github: rhatdan)

@SvenDowideit
Copy link
Contributor

If you're thinking about adding a json payload, wouldn't it be better to define it as always JSON hash - rather than making it a string and then hoping that the divergent uses don't cause pain? that way, the separator escaping becomes more consistent too?

@rhatdan
Copy link
Contributor Author

rhatdan commented Aug 8, 2014

Did not know I could do that, I will experiment with this today. But yes I believe this data should be in JSON format.

@thaJeztah
Copy link
Member

If multiple META instructions occur in a Dockerfile, would the last one overwrite the previous one?

I can imaging meta being used for various purposes. If data is stored as (a) JSON hash(es), I would like to have the option to update/append/remove data in that hash. Especially, if meta-data is inherited from the 'FROM' image.

@rhatdan
Copy link
Contributor Author

rhatdan commented Aug 8, 2014

Updated to require JSON Format

@rhatdan
Copy link
Contributor Author

rhatdan commented Aug 8, 2014

@thaJeztah I think that each image would have its own META data. So you could walk through the layered images to see all of the data.

@rhatdan
Copy link
Contributor Author

rhatdan commented Aug 8, 2014

Here is a little python script that would allow you to look at the Meta data of all images used to create the current image.

inspect.py
#! /usr/bin/python
from subprocess import Popen, PIPE
import sys

import json
def inspect(id):
fd = Popen("docker inspect %s" % id, shell=True, stdout=PIPE).stdout
l = fd.read()
fd.close()
js=json.loads(str(l))[0]
print id, "Meta:", js["Meta"]
if js["Parent"] != "":
inspect(js["Parent"])
inspect(sys.argv[1])

@thaJeztah
Copy link
Member

@rhatdan makes sense yeah. Would require me to loop through all layers, but I guess that's unavoidable.

Thx

@thaJeztah
Copy link
Member

@rhatdan thanks again for that example, our comments just crossed 😄

Should this usage example be part of the docs, of would that be "out of scope"? Wdyt?

@rhatdan
Copy link
Contributor Author

rhatdan commented Aug 8, 2014

Currently it looks like it appends them together.
FROM fedora:rawhide
MAINTAINER “Dan Walsh” dwalsh@redhat.com
ENV container docker
META { "Description" : "This image is used to start the foo executable, Based on the foo application", "vendor" : "ACME Products", "Version" : "1.0" }
META { "Description2" : "d1" }
CMD "/bin/sh"

docker inspect
...
"Meta": {
"Description": "This image is used to start the foo executable, Based on the foo application",
"Description2": "d1",
"Version": "1.0",
"vendor": "ACME Products"
},

@thaJeztah
Copy link
Member

Awesome! That's actually what I like!

That should definitely be included as an example in the docs, so that users know what to expect when using META multiple times.

@rhatdan
Copy link
Contributor Author

rhatdan commented Aug 8, 2014

Well the Meta Data is really only for the Image, IE If you build an image from Red Hat and then added data from "ACME" you would want Vendor of one layer to say ACME, and the base image to say Redhat

@cpuguy83
Copy link
Member

cpuguy83 commented Aug 8, 2014

IAMTM but it seems like this would be better implemented as a simple k/v store.
Value would just be a string and not unmarshalled on the server in any way, since it's meta and not really important to Docker itself.
Dockerfile would take something like "LABEL key value" with support for N number of labels
If you want a big ole hunk of json in that value you could do it, but aren't forced to.

I also think we are trying to reduce the amount of JSON exposed in the UI (Dockerfiles, CLI, etc)

Again, IAMTM so don't go changing things based on this comment.

@thaJeztah
Copy link
Member

Ok, right about that, but I would be able to obtain that info by looping through the layers?

Other option would be to organise the meta-data using the layer-id as key, e.g.

Meta: {
    aabbcc1 : { organisation: "RedHat" }
    bbccdd1 : { organisation: "ACME"}
}

But I feel that would be awkward to use and replicating the layers?

Update; comment was targeted at @rhatdan, not @cpuguy83 .. Timing, LOL

@thaJeztah
Copy link
Member

@cpuguy83 +1 on that one, only wondering;

  • Will each META instruction add a new layer? (We might run out of layers pretty fast?
  • Easier method for adding a bunch of keys (for those things, JSON is actually nice. But I do concur that JSON looks a bit foreign inside Dockerfiles)

@rhatdan
Copy link
Contributor Author

rhatdan commented Aug 12, 2014

How about if I allow both syntax

META Organization: Red Hat, version :1.2, License: ASL 2.0,

The code would just wrap this code with {} and do a json test.

and standard json

META { Organization: Red Hat, version :1.2, License: ASL 2.0 }

@rhatdan
Copy link
Contributor Author

rhatdan commented Aug 12, 2014

To give you an idea of what kind of data we would see being stored in META data fields. Look at what rpm ships

rpm -qi httpd

Name : httpd
Version : 2.4.10
Release : 2.fc22
Architecture: x86_64
Install Date: Mon 04 Aug 2014 09:45:02 AM EDT
Group : System Environment/Daemons
Size : 3967053
License : ASL 2.0
Signature : (none)
Source RPM : httpd-2.4.10-2.fc22.src.rpm
Build Date : Thu 31 Jul 2014 07:25:12 AM EDT
Build Host : buildvm-03.phx2.fedoraproject.org
Relocations : (not relocatable)
Packager : Fedora Project
Vendor : Fedora Project
URL : http://httpd.apache.org/
Summary : Apache HTTP Server
Description : The Apache HTTP Server is a powerful, efficient, and extensible web server.

@thaJeztah
Copy link
Member

@rhatdan is that JSON or a multi-line string? Build Date should be something like BuildDate for JSON format?

Automatically wrapping the string following META with {} looks a bit fragile and error prone to me, what if I did this;

META http://example.com

Which will probably become:

{
    http: "//example.com"
}

At this point, the limitations of the Dockerfile format are becoming a bit of an issue (IMO), especially when defining "structured" information.

@rhatdan
Copy link
Contributor Author

rhatdan commented Aug 13, 2014

Yes after fooling around with this a little more, I think sticking to json would be best. Writing the META content is a programmer/sysadmin job, they should be able to figure out how to write json.

I worked late into the night last night, trying to get the Dockerfile content recorded into the Meta Field, and ended up settling on encoding the dockerfile commands as a string separated by \n.

For a dockerfile that looks like:

cat ~/Dockerfile

FROM fedora:rawhide
MAINTAINER "Dan Walsh" dwalsh@redhat.com
ENV container docker
META { "Description" : "This image is used to start the foo executable", "Based on the foo application, vendor" : "ACME Products", "Version" : "1.0" }
CMD ["/bin/sh"]

I end up with

docker inspect
...
"Meta": {
"Based on the foo application, vendor": "ACME Products",
"Description": "This image is used to start the foo executable",
"Dockerfile": "FROM fedora:rawhide\nMAINTAINER "Dan Walsh" \u003cdwalsh@redhat.com\u003e\nENV container=docker\nMETA { "Description" : "This image is used to start the foo executable", "Based on the foo application, vendor" : "ACME Products", "Version" : "1.0" }\nCMD [/bin/sh]",
"Version": "1.0"
},

If someone has a better format than this, I am all ears.

@thaJeztah
Copy link
Member

Writing the META content is a programmer/sysadmin job, they should be able to figure out how to write json.

Just be aware that these;

"Hello world"
1234
12.25
[1,"234","FooBar", {hello:"World"}]

Are (afaik) technically valid JSON (so, including strings, integers and floats), but not objects. Wether or not your implementation should accept them, is probably up to you :)

@rhatdan
Copy link
Contributor Author

rhatdan commented Aug 13, 2014

Not sure why I wouldn't allow them. If an admin or programmer wants to put strange meta data into his image, not sure why I would block it.

@thaJeztah
Copy link
Member

👍 comment was mainly to confirm if your PR wouldn't fail if something like that was used, don't know if the tests currently check for those :) (haven't looked TBH)

@rhatdan
Copy link
Contributor Author

rhatdan commented Sep 2, 2014

Back from Vacation, updating Pull Requests and seeing if there is anything I can do to move these along.

@vbatts
Copy link
Contributor

vbatts commented Sep 24, 2014

since everything is metadata, including Ports and Env, perhaps META is too generic for this. Really the distinction I'm seeing for this feature, which I feel has a place in the images ecosystem, is that this concept conveys information that is bound to the image, deterministically accesible (not like a random META-INF file in the image's filesystem), and is part of the information provided from image builder to the consumer of the image.
Perhaps ANNOTATIONS would a better command name?

@rhatdan
Copy link
Contributor Author

rhatdan commented Sep 24, 2014

The other Name that came up was Vendor.


if r.Form.Get("meta") != "" {
if err := json.Unmarshal([]byte(r.Form.Get("meta")), &meta); err != nil {
return fmt.Errorf("META Data must be specified in valid json format")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could be a little bit more specific for API error messages: ideally validation error like this would list verbatim: the field, optionally their value and what was the failed. expectation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the error message, so now

grep META ~/Dockerfile
META { Da=1, "Description" : "This image is used to start the foo executable", "Vendor" : "ACME Products", "Version" : "1.0" }

Would give an error like.

Step 3 : META { Da=1, "Description" : "This image is used to start the foo executable", "Vendor" : "ACME Products", "Version" : "1.0" }
2014/10/03 13:42:09 META Data must be specified in valid json format: invalid character 'D' looking for beginning of object key string

@rhatdan rhatdan force-pushed the meta branch 5 times, most recently from 3a080a0 to 59dd2f0 Compare October 6, 2014 13:59
@TJNII
Copy link

TJNII commented Oct 10, 2014

+1 for this feature, being able to embed metadata and pull it from the API will be very useful in an enterprise environment.

This will allow a user to save META data into an image, which
can later be retrieved using:

docker inspect IMAGEID

I have copied this from the "Comment" handling in docker images.

We want to be able to add json data to an image to describe the image, and
then be able to use other tools to look at this data, to be able to do
security checks based on this data.

We are thinking about adding version names, Perhaps listing the content of the dockerfile.
Descriptions of where the code came from etc.

One option is to call this flag Meta, another would be to call it Vendor.

Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: rhatdan)
type Image struct {
ID string `json:"id"`
Parent string `json:"parent,omitempty"`
Comment string `json:"comment,omitempty"`
Meta *MetaData `json:"meta,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meta does not clearly reflect intended use. I would prefer something more explicit like userdata or maybe labels?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like userdata is the preference among maintainers so far.

@shykes
Copy link
Contributor

shykes commented Nov 4, 2014

We are reviewing design here, since the implementation itself is against the old builder implementation. If we agree on design we can go back to talking about implementation.

Comments on design:

  • Everybody seems to agree allowing arbitrary user data in an image is useful.
  • Everybody agrees simple map[string]string is sufficient and useful
  • We don't like the term "meta", preference seems to be "userdata" (makes it more clear what it's for)
  • It would be nice to have an explicit plan for overriding some userdata-based patterns with new top-level fields
  • The examples in this patch (description, vendor, version) are not good uses of userdata, we should definitely add them to top-level instead.
  • We already have an established Dockerfile syntax for setting key-values, see eg. ENV. It's better to use that than add another flavor.
  • Let's not do commit --meta and import --meta, that is better done through the planned --change

@rhatdan to avoid wasting more of your time on implementation back-and-forth, could you re-submit a docs-only patch, so we can lock down design (both UI and API)? Once that's approved, it will be easier to get your implementation through. Note: when submitting that docs patch, don't worry too much about cosmetic/form review by docs maintainers until the content itself is approved.

Closing for bookkeeping, looking forward to the v2.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants