Adds Docker Image v1 Spec Documention #9560

Merged
merged 1 commit into from Jan 12, 2015

Contributor

jlhawn commented Dec 8, 2014

Docker-DCO-1.1-Signed-off-by: Josh Hawn josh.hawn@docker.com (github: jlhawn)

Contributor Author

jlhawn commented Dec 8, 2014

From issue #9538:

If you were to complain that Docker's image format and runtime specification, as massively adopted as it is, is not appropriately documented, and it could be made easier to produce alternate implementations - then I would completely agree with you. In response, I would encourage the project maintainers to improve the specs documentation based on your suggestions.

Well, here it is 🐋

@SvenDowideit @fredlf Please review and give feedback.

@jlhawn jlhawn force-pushed the jlhawn:image_spec branch 3 times, most recently from 3997360 to 82f5917 Dec 8, 2014

Contributor

jessfraz commented Dec 8, 2014

😍

Contributor Author

jlhawn commented Dec 8, 2014

also @vbatts @dmp42 @crosbymichael @tianon @jfrazelle @unclejack @docker/distribution-trust @nathanleclaire @cpuguy83 @huslage and anyone else in the community that comes across this - please read through it and comment on anything that isn't clear or anything that requires more explanation, keeping in mind that this is not a new specification but is only documentation of how images are currently created/formatted in Docker.

I was thinking we could also generate a list of 'issues' with this specification to include in the bottom - something that could help us drive design of the next major version of the specification. Here are a few things I can think of for example:

- image IDs are an implementation detail of storage drivers in Docker and shouldn't be part of the specification.
- there is extraneous or useless info in some of the fields:
    - container id?
    - config *and* containerConfig? containerConfig seems to be only useful for the build system and nothing else.
- why is OnBuild in the runConfig and not top-level in the image JSON?
- is every field of the `runconfig.Config` struct necessary or useful?
- etc...
Contributor Author

jlhawn commented Dec 8, 2014

also @metalivedev ;-)


The execution parameters which should be used as a base when running a container using the image.

<h4>Container RunConfig Field Descriptions</h4>


icecrime Dec 8, 2014

Contributor

This container config has been a strong point of confusion for me and several others. As far as I can tell:

  • This provides default values for settings if not specified at run time (e.g., CpuShares)
  • Some of these settings are completely ignored (e.g.: Tty, Attach*, ...)
  • As far as I understand, this whole idea of "default container config" is out of v2 image format so although this is a purely v1 documentation, don't you think it might be relevant to add a "deprecation warning"?


jlhawn Dec 8, 2014

Author Contributor

good points @icecrime

I mentioned above:

  • is every field of the runconfig.Config struct necessary or useful?

I can update this document to clarify this - only I'm not entirely sure which fields are ignored. I guess I can dig up the code to find out exactly what's going on: https://github.com/docker/docker/blob/58ce0146e16e2e63b7a94d34a48722a9c7400c18/daemon/daemon.go#L418

Do you happen to know which fields are used? @erikh I think you have some expertise with runconfig, could you shed any light on this?


icecrime Dec 9, 2014

Contributor

I think it's all in runconfig.Merge. Fields used:

  • Cmd
  • CpuShares
  • Entrypoint
  • Env
  • ExposedPorts (and its legacy counterpart PortSpecs)
  • Memory
  • MemorySwap
  • User
  • Volumes
  • WorkingDir
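
As a rough illustration of the "image config as defaults" idea, here is a minimal sketch in Python. It assumes a simplified dict-based config; `merge_config` and its "empty means unspecified" rule are hypothetical and do not reproduce the finer points of the real `runconfig.Merge` (for example, how individual `Env` entries are combined).

```python
# Hypothetical sketch: fill unset container settings from the image defaults.
# Field names follow the list above; the merge rule is a simplification.
MERGED_FIELDS = ("Cmd", "CpuShares", "Entrypoint", "Env", "ExposedPorts",
                 "Memory", "MemorySwap", "User", "Volumes", "WorkingDir")


def merge_config(user_config, image_config):
    """Return a copy of user_config with unset fields taken from image_config."""
    merged = dict(user_config)
    for field in MERGED_FIELDS:
        # Missing, None, empty, and 0 values are all treated as "not specified".
        if not merged.get(field) and field in image_config:
            merged[field] = image_config[field]
    return merged
```

Fields outside `MERGED_FIELDS` (such as `Tty`) are simply not copied in this sketch, which mirrors the observation that some image settings are ignored.
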
parent <code>string</code>
</dt>
<dd>
Randomly generated, 256-bit, hexadecimal encoded. Uniquely identifies the parent image. If there is no parent image then its value is <code>""</code>.


metalivedev Dec 9, 2014

Contributor

The parent id is not randomly generated -- it is the id of the parent (so it has a definite referent and is not random).

Volumes <code>struct</code>
</dt>
<dd>
TODO: Entries are in some format... I dunno.


metalivedev Dec 9, 2014

Contributor

Who can flesh this out?

/bin/my-app-tools
```

The TarSum checksum for the archive file is then computed and placed in the JSON metadata along with the execution parameters and our image is built!


metalivedev Dec 9, 2014

Contributor

Then where does the JSON metadata go? What is the final format for the layer data + metadata?


jlhawn Dec 9, 2014

Author Contributor

Good point. We should definitely document the format used by docker load and docker save, since it describes how the metadata and layers are joined together. It's basically another Tar archive which includes the JSON metadata and archives for all of the image layers.

For example, here's what the full archive of library/busybox is (in tree format):

.
├── 5785b62b697b99a5af6cd5d0aabc804d5748abbb6d3d07da5d1d3795f2dcc83e
│   ├── VERSION
│   ├── json
│   └── layer.tar
├── a7b8b41220991bfc754d7ad445ad27b7f272ab8b4a2c175b9512b97471d02a8a
│   ├── VERSION
│   ├── json
│   └── layer.tar
├── a936027c5ca8bf8f517923169a233e391cbb38469a75de8383b5228dc2d26ceb
│   ├── VERSION
│   ├── json
│   └── layer.tar
├── f60c56784b832dd990022afc120b8136ab3da9528094752ae13fe63a2d28dc8c
│   ├── VERSION
│   ├── json
│   └── layer.tar
└── repositories

Where the content of the VERSION files is simply:

1.0

And the repositories file is another JSON file which describes names/tags:

{  
    "busybox":{  
        "latest":"5785b62b697b99a5af6cd5d0aabc804d5748abbb6d3d07da5d1d3795f2dcc83e"
    }
}
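
To make the layout above concrete, here is a minimal sketch in Python of reading such a bundle with the standard `tarfile` module. The helper name `read_bundle` is hypothetical, and the sketch assumes member names have no leading `./` and that every layer directory contains a `json` file, as in the tree shown above.

```python
import io
import json
import tarfile


def read_bundle(fileobj):
    """Return (repositories mapping, sorted layer IDs) from a saved-image tar."""
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        names = tar.getnames()
        # The top-level "repositories" file maps name -> tag -> image ID.
        repositories = json.load(tar.extractfile("repositories"))
    # A layer directory is recognized here by the presence of "<id>/json".
    layer_ids = sorted(n.split("/")[0] for n in names if n.endswith("/json"))
    return repositories, layer_ids
```

With a file on disk this could be called as `read_bundle(open("busybox.tar", "rb"))`, where `busybox.tar` is an assumed file name for a previously saved archive.
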


jlhawn Dec 10, 2014

Author Contributor

I think I'll just add the above comment to the document...


nathanleclaire Dec 10, 2014

Contributor

No exclamation point here

Deleted: /etc/my-app-config
```

It then creates a Tar Archive which contains *only* this changeset: The added and modified files in their entirety, and for each deleted item it creates an entry for an empty file at the same location but prefixes the basename of the file with `.wh.`. These `.wh.` prefixed files are known as whiteout files. The resulting Tar archive for `f60c56784b83` has the following entries:


metalivedev Dec 9, 2014

Contributor

Suggested: "The filenames prefixed with .wh. are known as "whiteout" files."

Is the name the only indication of the special nature of these files? That is, if I had a file named .wh.somename actually in my tree, would the file be unpacked to the layer? I'm kind of hoping there is some permissions bit or something set that together with the name means it is a special file.


## Loading an Image Filesystem Changeset

Loading an Image Filesystem Changeset is simply the inverse of the above operation: start with an empty directory for the rootfs of the container and extract each of the changesets of an image in order, treating a whiteout file as a sign to remove the file with the given name sans the `.wh.` prefix.


metalivedev Dec 9, 2014

Contributor

Do we have a reference implementation of packing and unpacking layers? I thought there was some really basic chroot filesystem driver. If so, and if it is easy to understand, we should reference it from this spec as an implementation example.

Contributor

jamescarr commented Dec 9, 2014

👍 this is great to see!

@jlhawn jlhawn force-pushed the jlhawn:image_spec branch 2 times, most recently from fcaaef4 to e073386 Dec 10, 2014

Contributor Author

jlhawn commented Dec 10, 2014

I've just pushed a major update to the draft spec. Please review again if you already have!

Layer
</dt>
<dd>
Refers to either one or both of the JSON metadata and filesystem changes for a single link in a chain of layers that make up a complete image. To refer to either specifically, one may use the terms `Image/Layer JSON` or `Image/Layer Metadata` to refer to its JSON metadata and `Image/Layer Filesystem Changeset` or `Image/Layer Diff` to refer to the set of filesystem changes.


tiborvass Dec 10, 2014

Collaborator

I find this pretty hard to understand.


jlhawn Dec 10, 2014

Author Contributor

Do you think it'd be okay to just delete the second sentence of this paragraph? I realize I probably went a little crazy in the second sentence... we really should agree on some common terminology though. It's a bit confusing to have a single term used loosely to refer to multiple things :(


tiborvass Dec 10, 2014

Collaborator

Do you think we could force a definition and make sure we use that everywhere?


jlhawn Dec 10, 2014

Author Contributor

I'm okay with the first sentence definition if everyone else is.


nathanleclaire Dec 10, 2014

Contributor

I would phrase it something like this:

Images are composed of "layers". "Image layer" is a general term which may be used to refer to one or both of the following:

  1. The metadata for the layer, described in the JSON format
  2. The filesystem changes described by a layer

To refer to the former specifically, the terms "Layer JSON" or "Layer Metadata" are frequently used.

To refer to the latter, the terms "Image Filesystem Changeset" or "Image Diff" are frequently used.

WDYT?


jlhawn Dec 10, 2014

Author Contributor

sounds good to me, @nathanleclaire I'll update it.

@@ -0,0 +1,532 @@
# Docker Image Specification v1.0.0

A Docker Image is an ordered collection of root filesystem changes and their corresponding execution parameters. Filesystem change sets exist as Tar archives which are extracted and applied in order starting from an empty directory. Because every image is accompanied by execution parameters, any one image layer may be run as its own Docker container as long as all ancestor changesets are applied first.


mmdriley Dec 10, 2014

Contributor

What is a "filesystem change"? Clearly it has "execution parameters", so a change is something executable? In the next sentence they're "change sets"?

What does it mean to "run" a Docker container? Are all containers "runnable"? You claim all ancestors of a Docker container are runnable, but some may not include the "CMD" or "ENTRYPOINT" commands added in later layers.

I'm worried that the goal of this effort -- adding a rigorous specification that enables independent implementations -- is at risk if we start off already assuming tons of context about how the official Docker client works in December 2014.


nathanleclaire Dec 10, 2014

Contributor

I'd specify the purpose in the first sentence, as in:

A "Docker Image" is an ordered collection of root filesystem changes and their corresponding execution parameters, for use with the Docker container runtime. Images provide the basis for executing a process which is controlled in a fine-grained way in an isolated (chroot-like) environment.

Then move the other two sentences (as they're more specific and not as general) later in the spec.


jlhawn Dec 10, 2014

Author Contributor

@mmdriley wrote:

What does it mean to "run" a Docker container? Are all containers "runnable"? You claim all ancestors of a Docker container are runnable, but some may not include the "CMD" or "ENTRYPOINT" commands added in later layers.

Any layer of an image is still runnable. It's ultimately up to users to set valid defaults for the entry point and/or command. They can always set a command to run when creating a container from any image, and they must if one isn't specified. It also depends on the user to execute a command that is valid in the first place, e.g., trying to run anything in an image with an empty filesystem would simply result in an error - "file not found".

I'm worried that the goal of this effort -- adding a rigorous specification that enables independent implementations -- is at risk if we start off already assuming tons of context about how the official Docker client works in December 2014.

A lot of things in Docker seem to have traditionally been implementation driven. I'm glad to see the project now moving towards a more specification/open-design driven way of doing things. While I think the project does a good job of documenting how to use Docker, unfortunately many details on how some components are currently designed and implemented have always been spread across various parts of the code or distributed in the knowledge of maintainers and experts of different subsystems. So, I'm not quite sure how to answer this part of your question other than by just pointing out that this isn't meant to be a rigorous new specification but just a description of how it already works.

@nathanleclaire I'll switch to use your suggested intro for now. @mmdriley do you have a suggestion for an abstract description of a container image that we could use as an introduction?

@jlhawn jlhawn force-pushed the jlhawn:image_spec branch from 5d92a6b to 8366746 Dec 10, 2014

Image ID
</dt>
<dd>
The randomly generated ID given to an image or image layer upon its creation. It is represented as a hexadecimal encoding of 256 bits, e.g., `a9561eb1b190625c9adb5a9513e72c4dedafc1cb2d4c5236c9a6957ec7dfd5a9`.


mmdriley Dec 10, 2014

Contributor

"image or image layer" -- this seems odd. Do images and image layers have IDs in different namespaces?

Need the image ID necessarily be random, or is the assertion simply that it need not have semantic meaning?

If random, must the ID be from a CSPRNG?


The commands `docker load` and `docker save` work with a single Tar archive which contains complete information about an image, including:

- repository names/tags


mmdriley Dec 10, 2014

Contributor

These are presented without definition.


jlhawn Dec 10, 2014

Author Contributor

Thanks! I've added term definitions for "Repository" and "Tag" in the latest commit.

Image
</dt>
<dd>
A collection consisting of the JSON metadata and filesystem changes of an image and those of all of its parent images.


nathanleclaire Dec 10, 2014

Contributor

This definition is recursive.

Contributor

nathanleclaire commented Dec 10, 2014

cc @jamtur01 would be great to get your input on this

Image Filesystem Changeset
</dt>
<dd>
An archive of the new or changed files and directories which a layer of an image has. This archive also contains special "whiteout" files, which have names beginning with `.wh.` and which indicate that the corresponding file or directory has been deleted from its parent image's filesystem. These archives can be made trivially by a layer-based/union filesystem such as AUFS or OverlayFS or by computing the diff of two directories (one corresponding to a snapshot of the parent image's filesystem and the other the current image's filesystem).


mmdriley Dec 10, 2014

Contributor

It seems like any description of whiteout files and their semantics is going to #include the implementation details of a specific version of aufs, with specific config/compilation flags, along with the flags Docker invokes it with. For example, this paragraph from the aufs documentation:

The whiteout is for hiding files on lower branches. Also it is applied to stop readdir going lower branches. The latter case is called ’opaque directory.’ Any whiteout is an empty file, it means whiteout is just an mark. In the case of hiding lower files, the name of whiteout is ’.wh.<filename>.’ And in the case of stopping readdir, the name is ’.wh..wh..opq’ or ’.wh.__dir_opaque.’ The name depends upon your compile configuration CONFIG_AUFS_COMPAT. All whiteouts are hardlinked, including ’/<writable branch top dir>/.wh..wh.aufs.’


nathanleclaire Dec 10, 2014

Contributor

I'd rm the bit about whiteout files (cover it later) and phrase like:

An archive of the files which have been added, changed, or deleted in an image layer. Using a layer-based or union filesystem such as AUFS, or by computing the diff from filesystem snapshots, the filesystem changeset can be used to present a series of image layers as if it were one cohesive filesystem.
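
The "diff from filesystem snapshots" approach can be sketched as follows. This is only an illustration in Python: snapshots are modeled as plain dicts mapping a path to file content, and `make_changeset` is a hypothetical helper, not code from Docker.

```python
import posixpath


def make_changeset(parent, current):
    """Diff two snapshots (path -> content) into changeset entries.

    Added or modified files are kept whole; each deleted path becomes an
    empty entry whose basename is prefixed with ".wh." (a whiteout).
    """
    changeset = {}
    for path, content in current.items():
        # New paths and paths whose content differs are copied in full.
        if parent.get(path) != content:
            changeset[path] = content
    for path in parent:
        if path not in current:
            directory, base = posixpath.split(path)
            changeset[posixpath.join(directory, ".wh." + base)] = b""
    return changeset
```

A real implementation would also have to compare file modes, ownership, and timestamps, which this sketch leaves out.
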

Env <code>array of strings</code>
</dt>
<dd>
Entries are in the format of <code>VARNAME="var value"</code>.


mmdriley Dec 23, 2014

Contributor

Are these double-quotes normative?


jlhawn Dec 24, 2014

Author Contributor

I probably shouldn't have included the quotes in this example. I believe the way this value should be interpreted is that the substring before the first = is the variable name and everything after is the value. In Docker, I think this is passed directly to the execution driver in this format. @crosbymichael could you clarify this for us please?
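
If that reading is right, parsing could look like the following minimal Python sketch. The function name `parse_env` is hypothetical, and whether Docker's execution driver does exactly this is still the open question above.

```python
def parse_env(entries):
    """Split each "NAME=value" entry at the first "=" only."""
    env = {}
    for entry in entries:
        # partition() splits on the first "=", so later "=" stay in the value.
        name, _, value = entry.partition("=")
        env[name] = value
    return env
```

Under this interpretation, any double quotes in an entry would be literal characters of the value rather than delimiters.
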

Cmd <code>array of strings</code>
</dt>
<dd>
Default arguments to the entry point of the container. These


mmdriley Dec 23, 2014

Contributor

Not just arguments but the command as well.

<dd>
A list of arguments to use as the command to execute when the
container starts. This value acts as a default and is replaced
by an entrypoint specified when creating a container.


mmdriley Dec 23, 2014

Contributor

ENTRYPOINT and CMD do serve different purposes, but by this point in the spec it's not clear how they should be interpreted differently.

An example of creating an Image Filesystem Changeset follows.

An image root filesystem is first created as an empty directory named with the
ID of the image being created. Here is the initial empty directory structure


mmdriley Dec 23, 2014

Contributor

The name of the directory doesn't matter, does it? At least, not for the most common image format.


jlhawn Dec 24, 2014

Author Contributor

Nope. I'll add this to the end of the paragraph:

Implementations need not name the rootfs directory in this way but it may be
convenient for keeping record of a large number of image layers.

A Tar Archive is then created which contains *only* this changeset: The added
and modified files in their entirety, and for each deleted item an entry for an
empty file at the same location but with the basename of the file prefixed with
`.wh.`. The filenames prefixed with .wh. are known as "whiteout" files. NOTE:


mmdriley Dec 23, 2014

Contributor

What if I delete a directory?


jlhawn Dec 24, 2014

Author Contributor

it should be the same logic and is explained in the last paragraph of this section:

  • Extract all contents of each archive.
  • Walk the directory tree once more, removing any files with the prefix
    .wh. and the corresponding file or directory named without this prefix.
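
A minimal sketch of that logic in Python, covering both deleted files and deleted directories. Changesets are modeled as dicts mapping a path to file content, `apply_changesets` is a hypothetical helper, and whiteouts are applied per changeset (before that changeset's additions) rather than in one final walk. Opaque-directory markers are not handled.

```python
import posixpath

WHITEOUT_PREFIX = ".wh."


def apply_changesets(changesets):
    """Apply changesets (path -> content) in order, starting from an empty rootfs."""
    rootfs = {}
    for changeset in changesets:
        whiteouts = [p for p in changeset
                     if posixpath.basename(p).startswith(WHITEOUT_PREFIX)]
        for path in whiteouts:
            directory, base = posixpath.split(path)
            target = posixpath.join(directory, base[len(WHITEOUT_PREFIX):])
            # Remove the target and, if it is a directory, everything under it.
            for existing in list(rootfs):
                if existing == target or existing.startswith(target + "/"):
                    del rootfs[existing]
        for path, content in changeset.items():
            if path not in whiteouts:
                rootfs[path] = content
    return rootfs
```
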

@jlhawn jlhawn force-pushed the jlhawn:image_spec branch from 5acaed2 to c816e95 Dec 24, 2014

<dd>
The username or UID which the process in the container should
run as. This acts as a default value to use when the value is
not specified when creating a container.


tianon Dec 24, 2014

Member

All the following are valid:

  • user
  • uid
  • user:group
  • uid:gid
  • uid:group
  • user:gid

If group/gid is not specified, the default group and supplementary groups of the given user/uid in /etc/passwd from the container are applied.
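
For illustration, the six forms above could be parsed with a sketch like this one in Python. `parse_user` is a hypothetical helper; treating purely numeric parts as IDs is an assumption of the sketch, and resolving names against the container's user database is left out.

```python
def parse_user(spec):
    """Split "user[:group]" into (user, group); each part is a name or numeric ID."""
    user, _, group = spec.partition(":")

    def convert(part):
        # Purely numeric parts are treated as IDs, everything else as names.
        return int(part) if part.isdecimal() else part

    # A missing group is reported as None so the caller can apply defaults.
    return convert(user), (convert(group) if group else None)
```
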


jlhawn Dec 24, 2014

Author Contributor

thanks @tianon !

Contributor

mmdriley commented Jan 1, 2015

lgtm. No doubt there are still nits to be picked, but this is a great step forward in unambiguously-described behavior. Thanks for your time and effort in compiling it.

Member

crosbymichael commented Jan 5, 2015

@jlhawn Any more edits remaining?

There is no reason for shykes to look at this if you are just documenting the reality of the current system.

Collaborator

shykes commented Jan 5, 2015

Correct, if we're documenting today's design don't feel obligated to wait for my +1

Contributor Author

jlhawn commented Jan 5, 2015

@crosbymichael I think I just need to add in the User field description from @tianon and we're set!

@jessfraz jessfraz added the docs-only label Jan 6, 2015

@jlhawn jlhawn force-pushed the jlhawn:image_spec branch from c816e95 to a1e3b04 Jan 12, 2015

Contributor

bfirsh commented Jan 12, 2015

This is brilliant. Thanks @jlhawn. I'm really glad we're taking steps towards specifying how Docker works.

Perhaps we could get this in for Docker 1.5 and shout about it a bit. ^_^

Contributor

jessfraz commented Jan 12, 2015

I don't know what we are waiting on @jlhawn

Contributor

jessfraz commented Jan 12, 2015

Maybe just a squash of commits?

Adds Docker Image v1 Spec Documention
Many iterations have gone into documenting a v1 specification of Docker's Image
format.

v1 Image spec: clarify parent field

- metalivedev pointed out that the description was ambiguous, so I've removed
  mention that it was randomly generated. It IS the ID of the parent image.

Updated v1 image specification documentation

- More complete details and deprecation notifications for each field
  in the JSON metadata of an image.
- Details on the format for packaging combined Image JSON + Filesystem
  Changeset archives for all layers of an image.

Clarify description of an image "Layer" in v1 spec

Updated intro of image v1 spec

Updated image v1 spec after more review

- Removed description of "Image" from the terminology section. The entire
  document is meant to serve this purpose.
- Updated the definition of "Image Filesystem Changeset".
- Clarified the level of randomness needed for generating image IDs.
- Updated the description of "Image Checksum".
- Added term descriptions for "Repository" and "Tag"
- Removed extraneous/implementation-specific fields from the Image JSON
  example file and field descriptions:
  - removed "container_config" and "docker_version" fields.
  - Added missing "author" field example and description.
- Removed extraneous/implementation-specific fields from the "config" struct
  example and description:
  - removed "Hostname", "Domainname", "Cpuset", "AttachStdin", "AttachStdout",
    "AttachStderr", "PortSpecs", "Tty", "OpenStdin", "StdinOnce", "Image",
    "NetworkDisabled", and "OnBuild".
- Updated example Image JSON config with better example values for "Env",
  "Cmd", "Volumes", "WorkingDir", "Entrypoint", "CpuShares", "Memory",
  "MemorySwap", and "User".
- Added notices that any fields not specified are to be considered as
  implementation specific and should be ignored by implementations which
  are unable to interpret them.
- Updated example of creating layer filesystem changesets to use less formal
  language.
- Listed more details in the section regarding extraction of a bundle of image
  layers into the root filesystem of a container.
- Updated the closing mention of Docker as an evolving implementation.

More updates to the v1 image spec

- Added line wrapping after 80 columns per line to adhere to documentation
  style guides, as pointed out by @jamtur01

- Removed references to any specific docker commands, updated a few descriptions
  or drop repeated statements, as pointed out by @cpuguy83

Cleanup image v1 spec draft after fredlf comments

Address comments by mmdriley on v1 image spec

Improve description of image v1 spec `config.User`

- Improves description of image v1 specification for the 'User' runtime
  parameter after recommendations by tianon.

Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)

@jlhawn jlhawn force-pushed the jlhawn:image_spec branch from a1e3b04 to 7991062 Jan 12, 2015

Contributor Author

jlhawn commented Jan 12, 2015

@jfrazelle All squashed!

jessfraz pushed a commit that referenced this pull request Jan 12, 2015

Jessie Frazelle
Merge pull request #9560 from jlhawn/image_spec
Adds Docker Image v1 Spec Documention

@jessfraz jessfraz merged commit 0192b6c into moby:master Jan 12, 2015

Contributor

jessfraz commented Jan 12, 2015

awesome!

Member

thaJeztah commented Jan 12, 2015

Thanks @jlhawn!

Contributor

ankushagarwal commented on image/spec/v1.md in 7991062 Jan 15, 2015

Shouldn't this be: specified at creation of the *container*?

Contributor Author

jlhawn replied Jan 15, 2015

you're right! good catch!

Contributor

ankushagarwal commented on image/spec/v1.md in 7991062 Jan 15, 2015

Pardon me, but shouldn't this be called Layer JSON? From the definition, it looks like it contains metadata about only one layer of an image. Image JSON feels like it describes the metadata of the image as a whole.

Contributor

ankushagarwal commented on image/spec/v1.md in 7991062 Jan 15, 2015

Is Image ID different from Layer ID or is Image ID just the ID of the topmost layer?


odino commented Jan 15, 2015

I was talking to @shykes about this at DockerCon in Amsterdam... great job, guys!

@jlhawn jlhawn deleted the jlhawn:image_spec branch Jul 31, 2015
