Composable docker images with Nix #11156

Merged: 1 commit, Jan 13, 2016


lucabrunox
Contributor

So I wanted a Nix function to create docker images. That's easy, just make a tarball of the closure of a derivation, and it's ready to be imported.

The problem is: how do you add more content to an existing image, by using layers?

So here's an attempt to provide a solution, even if it's very ugly.

Let's start with a simple bash container:

$ nix-build -E 'with import <nixpkgs> {}; dockerImage { drv = bashInteractive; drvCommand = "${bashInteractive}/bin/bash"; }'
# cat result/image.tar.gz|docker import - nixbash
# docker run -t -i nixbash /bin/bash
bash-4.3# 
# docker history nixbash
IMAGE               CREATED             CREATED BY          SIZE
4055d6dc456e        30 minutes ago                          186.4 MB

Now let's create another image with python only:

$ nix-build -E 'with import <nixpkgs> {}; dockerImage { drv = python; drvCommand = "${python}/bin/python"; }'
# cat result/image.tar.gz|docker import - nixpython
# docker run -t -i nixpython /bin/python
Python 2.7.10 (default, Jan 01 1970, 00:00:01) 
...
# docker history nixpython
IMAGE               CREATED             CREATED BY          SIZE
4ed68027aae5        10 seconds ago                          223.5 MB

Now comes the magic. Let's layer that python image on top of bash:

$ nix-build -E 'with import <nixpkgs> {}; dockerImage { drv = python; drvCommand = "${python}/bin/python"; baseImage = "nixbash"; }'
# cd result
# ./docker-build.sh -t nixpython2 --no-cache .
# docker history nixpython2
IMAGE               CREATED             CREATED BY                                      SIZE
51b3911260fc        17 seconds ago      /bin/sh -c hostip=$(/nix/store/wa3fimm2m26fwh   46.57 MB
4055d6dc456e        24 minutes ago                                                      186.4 MB

Note: the baseImage is there only for convenience. One might as well generate a Dockerfile on the fly, and use the same image.tar.gz. I haven't done that, but it's easy to do.

In other words, we're able to take arbitrary nix containers and layer multiple of them with minimal overhead.

That 184 MB is there because, for some reason, gcc is pulled in along with a lot of other stuff, most of which is useless for doing the layer merge. The python closure is only 36 MB, so there's a lot of room to optimize.
Consider that a Debian Docker image is around 150 MB, so we aren't that huge either.

cc @offlinehacker @datakurre @domenkozar for discussion :)

@trishume
Contributor

This is neat. Do you think it would be possible someday to have the layering work within the nix expressions? Could you have baseImage = bashImage and bashImage = dockerImage { ... } so that Nixpkgs could provide a generic baseImage attribute that lives in the nix store that other specialty images could layer on top of?

Also it sounds like with some tuning (figure out why gcc is pulled in) Nix could create the leanest container images of any OS since exactly the minimal set of dependencies for any given program would be pulled in. That would be almost micro-kernel like.

@lucabrunox
Contributor Author

@trishume I'm thinking about that and I feel that it may be possible. docker save would be the derivation output of an image. At this point, inside nix, we would run the docker daemon with custom paths, then docker load, docker build and finally docker save. It all depends on whether the docker daemon can run as a user with enough privileges to create its filesystems inside nix.
As a last resort, we could run these commands in a VM.

And yes, docker images with nix, if packages were properly optimized, would deliver the smallest images.

@datakurre
Contributor

@lethalman That 184 MB for Python sounds strange. I've been getting a 36 MB tar.gz and 108.9 MB unpacked "virtual size" already. Since I'm still learning Nix as a language, I've simply run:

nix-build python.nix
tar cvz `nix-store -qR result ` result > python.tar.gz

Also good to know: if there's no explicit PATH env defined, docker run seems to add the usual /usr/local/bin:/usr/bin:/bin etc., so that docker run -t -i nixpython python should be enough to run it.

Then there's the issue of a lot of software expecting /bin/sh or /bin/bash and a writable /tmp to exist. And usually one needs to add many derivations to PATH. I went far enough to map the expression's build result into the image root with:

mkdir tmp
tar cvz --transform="s|^result/||" tmp `nix-store -qR result` result/* > image.tar.gz

With buildEnv, that results in /bin et al. having all the expected paths. There are probably downsides too.

@lucabrunox
Contributor Author

@datakurre that's not only python, but also tools for extending the image later with layers (like curl, iproute, ...). Everything can be done, this is a PoC that nix images can compose with layers.

@lucabrunox
Contributor Author

@trishume so it seems possible to do with runInLinuxVM, however there's a drawback of docker save/load. The name of the image cannot be chosen when loading, but it can be renamed afterwards. So indeed we can generate layered images, but well that's really a docker limitation.

The other limitation is that kvm is needed in order to do the merge.

Will do a complete PR.

@datakurre
Contributor

@lethalman That explains it. You figured out quite a way to get data into the image through the filter without ADD or COPY :)

Still, could there be a way to create delta.tar.gz outside Docker (e.g. getting the paths from the base image with docker save) and use a Dockerfile with ADD delta.tar.gz / instead? According to the docs https://docs.docker.com/engine/reference/builder/#add ADD has the behavior of tar -x:

the result is the union of:

  1. Whatever existed at the destination path and
  2. The contents of the source tree, with conflicts resolved in favor of “2.” on a file-by-file basis.
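
The delta idea can be sketched with plain coreutils. This is only an illustration under stated assumptions: base.txt and new.txt stand in for the store paths of the base image and of the new closure (for example as listed by nix-store -qR), and the paths themselves are made up.

```shell
#!/bin/sh
# Sketch: work out which store paths a delta tarball would need to carry.
# The two lists are hypothetical stand-ins for real closures.
set -eu
LC_ALL=C
export LC_ALL
work=$(mktemp -d)

# Paths already present in the base image (hypothetical).
cat > "$work/base.txt" <<'EOF'
/nix/store/aaaa-glibc
/nix/store/bbbb-bash
EOF

# Closure of the image to layer on top (hypothetical).
cat > "$work/new.txt" <<'EOF'
/nix/store/aaaa-glibc
/nix/store/cccc-python
/nix/store/dddd-zlib
EOF

# comm needs sorted input; -13 keeps only lines unique to the second file.
sort "$work/base.txt" > "$work/base.sorted"
sort "$work/new.txt" > "$work/new.sorted"
comm -13 "$work/base.sorted" "$work/new.sorted" > "$work/delta.txt"

# These are the paths that are missing from the base image.
cat "$work/delta.txt"
```

With GNU tar, a list like delta.txt could then be fed to tar through -T to produce the delta.tar.gz for an ADD line, though that step is not shown or tested here.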

@lucabrunox
Contributor Author

@datakurre with a delta it could be possible yes, but it's still another step to do from the outside :( But yeah, it's indeed a step not involving an http server :D

So I think all this will be divided into two things:

  • A merge script which does the delta and then docker build. Much like this PR does, but without the http server, as suggested by @datakurre . So this already lets you create the layered image but in your docker system environment.
  • Additionally, we can create pure layered docker images with runInLinuxVM by using the above script.

This leads to a series of image manipulation tools:

  • create: drv -> image.tar.gz . It's the simple base image, and it's pure.
  • nix-docker-merge: an installable script which creates a layered image on the system from two images, one already in the system and the other a tar.gz. It's impure.
  • mergeImages: image1.tar.gz, image2.tar.gz -> merged.tar.gz, pure layered image creation, more expensive and requires kvm, but it's portable to other systems for docker load and buildable by hydra.

The above ones should be the very basic functionalities on top of which one creates other utility functions.

@lucabrunox
Contributor Author

@datakurre can you explain better? I didn't understand the image version with Nix profiles...

@datakurre
Contributor

@lethalman Sorry, I missed your update on deltas. Is there any option, which would work within Docker? (To make it possible to build a Linux Docker image on a Mac with boot2docker. E.g. some way to pass paths known to exist in baseImage so that there's no need to read them from the image with Docker.)

About the profiles: I was wondering whether it would make sense to manage named dockered apps using Nix profiles, so that it would be possible to create a delta tarball by diffing paths from the previous profile version to the current version. But I haven't thought about this much. It's probably something where a bash script works better than a nix expression. Not suitable for hydra and very impure.

@lucabrunox
Contributor Author

@datakurre the problem is docker build , so you still need impurity, hence no reason to put other metadata somewhere else... because you need impurity anyway.

I didn't understand your first question...

@datakurre
Contributor

@lethalman Agreed. I understand this now.

Also, the first question was probably a misunderstanding on my part. So, would nix-docker-merge result in an image.tar.gz, a Dockerfile and the script to create the delta for the base image? So that, to build x86 containers, I could call dockerImage in a Nix container in a VM, copy the results to the darwin host and run the script on the darwin host?

@lucabrunox
Contributor Author

nix-docker-merge creates the Dockerfile, the delta and outputs the saved image, yes. If the VM has x86 nix, yes you can do that. Note that runInLinuxVM bind mounts the host /nix, also I don't know if it works on darwin.

So if you have an x86 VM, and you create two docker images from there, you can merge the result of course regardless of the architecture. Basically that nix-docker-merge has nothing to do with nix, it only uses nix-hash :)

@lucabrunox
Contributor Author

Alright :D We have it...

with import <nixpkgs> {};

let
  image1 = dockerTools.createImage { drv = bashInteractive; };
  image2 = dockerTools.createImage { drv = python; };
in
  dockerTools.mergeImages { name = "bash-python"; inherit image1 image2; }
$ nix-build
/nix/store/br5hqbzcqcgv7fy85pp7h91smpia32nb-bash-python.tar.gz
# docker load -i result
# docker history 240c630f46d8
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
240c630f46d8        5 minutes ago       /bin/sh -c #(nop) ADD dir:f576a907f86b772f9b7   68.29 MB            
df1dafed18ab        6 minutes ago                                                       50.16 MB            Imported from -

There are three problems:

  1. When doing docker build I've passed -t bash-python, but then docker load ignored the name and imported it anonymously.
  2. But the real problem is that docker load does not print the image id :( hence mergeImages will not work with layered images... that's really something to report to docker.
  3. Still have to work more on disk size estimation, but the basics are there.

@jgeerds
Member

jgeerds commented Nov 20, 2015

@lethalman I fell in love with you 😆

That's so great! I have to test it

@domenkozar
Member

@jgeerds don't scare him away ✌️

@lucabrunox
Contributor Author

Just reported the docker load problem upstream: moby/moby#18117

@datakurre
Contributor

👍

@lucabrunox
Contributor Author

Alright. For now I've worked around that issue, and we can only deal with layered images that have a name, because we cannot easily know the image id of the loaded image.

So we can stack as many layers as we want from independent images:

with import <nixpkgs> {};

rec {
  image1 = dockerTools.createImage { drv = bash; };
  image2 = dockerTools.createImage { drv = python; };
  image3 = dockerTools.createImage { drv = perl; };
  bash-python = dockerTools.mergeImages { name = "bash-python"; baseImage = image1; image = image2; };
  bash-python-perl = dockerTools.mergeImages { name = "bash-python-perl"; baseImage = bash-python; image = image3; };
}

Do nix-build -A bash-python-perl; docker load -i result et voila!

I'm also going to add some other useful tools, like converting a flattened image to an image with one layer, and vice versa.

@lucabrunox
Contributor Author

@edolstra do you feel we can merge this? It's just tools, no harm for the build farm.

@lucabrunox
Contributor Author

Docker is taking up so much disk space (over 2 GB...) that I'm thinking of implementing the packing format of the images myself. It's quite simple, and there will be no need for runInLinuxVM, so it will be completely pure.

https://github.com/docker/docker/blob/master/image/spec/v1.md

And they even reimplemented nix-hash: https://github.com/docker/docker/blob/master/pkg/tarsum/tarsum_spec.md
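
To give a feel for why packing the format by hand is feasible, here is a rough shell sketch of the layout the v1 image spec describes: one directory per layer holding VERSION, json and layer.tar, plus a top-level repositories file. It is an illustration only. The json carries a minimal, assumed subset of fields, the layer ID is derived from a fixed string purely for this example, and the result has not been checked against docker load.

```shell
#!/bin/sh
# Sketch of a single-layer image archive in the v1 layout.
# All names and contents are illustrative assumptions.
set -eu
work=$(mktemp -d)

# A layer ID is 64 hex digits; deriving it from a fixed string keeps this
# sketch deterministic (an assumption, not what the PR does).
id=$(printf 'bash-layer' | sha256sum | cut -d' ' -f1)

# Placeholder root filesystem for the layer.
mkdir -p "$work/rootfs/tmp" "$work/image/$id"
echo hello > "$work/rootfs/hello.txt"

# Each layer directory holds VERSION, json and layer.tar.
printf '1.0' > "$work/image/$id/VERSION"
cat > "$work/image/$id/json" <<EOF
{"id": "$id", "created": "1970-01-01T00:00:01Z", "architecture": "amd64", "os": "linux"}
EOF
tar -C "$work/rootfs" -cf "$work/image/$id/layer.tar" .

# The repositories file maps name -> tag -> layer ID.
printf '{"bash": {"latest": "%s"}}\n' "$id" > "$work/image/repositories"

# Pack everything into a single archive and list what went in.
tar -C "$work/image" -cf "$work/image.tar" .
tar -tf "$work/image.tar" | sort
```

A second layer would get its own directory and name the first one through a parent field in its json, which is how the spec chains layers together.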

@lucabrunox
Contributor Author

There we go, cleaned up of all the docker impure evilness. It's all pure nix without virtual machines, and without docker. We generate compatible docker images.

with import ./. {};

rec {
  image1 = dockerTools.mkImage { drv = bash; };
  layeredImage1 = dockerTools.toLayeredImage { name = "bash"; image = image1; };
  image2 = dockerTools.mkImage { drv = python; };
  image3 = dockerTools.mkImage { drv = perl; };
  bash-python = dockerTools.mergeImages { name = "bash-python"; baseImage = layeredImage1; image = image2; };
  bash-python-perl = dockerTools.mergeImages { name = "bash-python-perl"; baseImage = bash-python; image = image3; };
}
# docker load -i result
# docker history bash-python-perl
IMAGE               CREATED               CREATED BY          SIZE
6fc05ae2aee1        45.917024 years ago                       145.2 MB
0494c69c169b        45.917024 years ago                       68.29 MB
d2ac909653e5        45.917024 years ago                       40.8 MB
# docker run -t -i bash-python-perl /nix/store/rfcsnszaw2jlkqdgg65h3phss8xa0mlp-python-2.7.10/bin/python
Python 2.7.10 (default, Jan 01 1970, 00:00:01) 
[GCC 4.9.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 

Not only that: we can also add all the various CMD, ENTRYPOINT, etc., as the spec is very simple. The ADD command can simply be replaced by a series of extra shell commands. At this point we have replaced docker build with our pure nix variant.

That mergeImages should really become dockerBuild.

Just one nitpick :D @edolstra why was I able to use /dev/urandom in a chrooted build? Shouldn't that dev be faked?

@lucabrunox lucabrunox force-pushed the docker-image branch 3 times, most recently from 46bc46a to 75f24f1 Compare November 20, 2015 17:09
@lucabrunox
Contributor Author

New syntax:

with import <nixpkgs> {};
with dockerTools;

rec {
  image1 = mkTarball { drv = bash; };
  layeredImage1 = build { name = "bash"; addTarball = image1; };
  image2 = mkTarball { drv = python; };
  image3 = mkTarball { drv = perl; };
  bash-python = build { name = "bash-python"; fromImage = layeredImage1; addTarball = image2; };
  bash-python-perl = build { name = "bash-python-perl"; fromImage = bash-python; addTarball = image3; };

  just-python = build { name = "just-python"; config = { Cmd = [ "${python}/bin/${python.executable}" ]; }; };
}

But now we also support arbitrary config and its dependencies! Look at just-python: we don't have to specify any tarball.

$ nix-build -A just-python
# docker load -i result
# docker run -t -i just-python
Python 2.7.10 (default, Jan 01 1970, 00:00:01) 
[GCC 4.9.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 

We basically have a partial docker build working. Other stuff, like handling multiple images, is not working, of course.

@lucabrunox
Contributor Author

I'm going to merge this in a few days, unless somebody objects.

@lucabrunox
Contributor Author

fixed and rebased on current master

@lucabrunox
Contributor Author

@rvl ok, the example was misleading because runAsRoot required an executable directly. Now it can be a string and your code should work.

EDIT: you'd better use #!${stdenv.shell}, because /bin/sh might not be there.

@lucabrunox lucabrunox force-pushed the docker-image branch 3 times, most recently from 07dd929 to ad57007 Compare December 14, 2015 15:48
@lucabrunox
Contributor Author

@anderspapitto I optimized the diffing part, but note that the most expensive operations are the unpacking and repacking of the whole image. I tested it: out of 15 seconds total, this simple optimization gains less than 1 second.

@rvl
Contributor

rvl commented Dec 17, 2015

Thanks @lethalman, it works well. Is there anything we can do to help getting this merged?

I reviewed the code (or tried to). The only questions I can come up with are:

  • how long will the docker hub registry /v1 continue to work?
  • is there any way of calculating a timestamp for a derivation which isn't 1970?
  • is the result let-binding needed in buildImage? (seems to work without it)
  • is an end-to-end test possible? ... e.g. build an image, load into docker, check that it runs.

@lucabrunox
Contributor Author

@rvl thanks for testing.

  • Well, they don't have a v2 so... :)
  • Derivations have to be reproducible. You have to think they are addressed by hash, not by time. That's for any derivation in nix, not only docker images. I understand that's ugly, it's as ugly as running uname on NixOS.
  • It's handy just in case we need to tweak the result further. If in the future you have to re-indent, that would break the blame.
  • Yes, it's possible with runInLinuxVM but also like with any other nixos test. There's even a docker test already for nixos.
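
The point about timestamps can be illustrated outside Nix. The sketch below assumes GNU tar and GNU touch: it packs the same tree twice with different on-disk mtimes and pins order, mtime and ownership, so both archives hash identically. The 1-second epoch offset matches the "Jan 01 1970, 00:00:01" seen in the Python banners above; the tree contents are made up.

```shell
#!/bin/sh
# Sketch: a tarball whose hash does not depend on when the files were written.
# Requires GNU tar (--sort, --mtime) and GNU touch (-d).
set -eu
work=$(mktemp -d)
mkdir -p "$work/tree/bin"
echo 'echo hi' > "$work/tree/bin/hello"

pack() {
  # Fixed entry order, fixed mtime (epoch + 1s), fixed numeric ownership.
  tar -C "$work/tree" --format=gnu --sort=name --mtime='@1' \
      --owner=0 --group=0 --numeric-owner -cf "$1" .
}

pack "$work/a.tar"
# Change only the timestamp on disk, then pack again.
touch -d '2015-12-17 00:00:00' "$work/tree/bin/hello"
pack "$work/b.tar"

hash_a=$(sha256sum "$work/a.tar" | cut -d' ' -f1)
hash_b=$(sha256sum "$work/b.tar" | cut -d' ' -f1)
echo "$hash_a"
echo "$hash_b"
```

If a real build time were recorded instead, every rebuild would yield a different hash, which is exactly what a hash-addressed store has to avoid.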

@lucabrunox
Contributor Author

As for the last bits to do:

  • Allow contents to be a list.
  • Currently in some code paths, when copying contents, there's a chance it fails due to read-only files of different nixbld users. It should suffice to do a chmod -R a+w after copying stuff.
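
A small sketch of the read-only problem and the proposed fix, assuming the intended command is chmod (chown does not take a mode such as a+w). The directory names are made up, and the read-only tree merely mimics store contents.

```shell
#!/bin/sh
# Sketch: files copied out of a read-only tree stay read-only, so later
# writes into the copy fail until write permission is added back.
set -eu
work=$(mktemp -d)
mkdir -p "$work/store/pkg/etc"
echo 'setting=1' > "$work/store/pkg/etc/conf"
chmod -R a-w "$work/store/pkg"               # mimic read-only store contents

mkdir "$work/layer"
cp -r "$work/store/pkg/." "$work/layer/"     # the copy inherits the read-only modes
chmod -R a+w "$work/layer"                   # the fix proposed above

echo 'setting=2' >> "$work/layer/etc/conf"   # now succeeds
cat "$work/layer/etc/conf"
```

Note that when run as root the write would succeed even without the chmod, so the failure only shows up for unprivileged build users.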

@lucabrunox
Contributor Author

I should have addressed the last bits too, and the docs about the default values for pullImage.

I've rebased on top of current master. Further testing is welcome before I finally merge this :)

@rvl
Contributor

rvl commented Dec 27, 2015

Great, it still works for me. A list of derivations in contents also works for me. Can I suggest rewording the sentence in doc/functions.xml:405 to "contents is one or more derivations that will be copied in the new layer of the resulting image."

@lucabrunox
Contributor Author

@rvl thanks for testing. How should it be reworded?

@rvl
Contributor

rvl commented Dec 28, 2015

Instead of "contents is a derivation that will be copied in the new layer of the resulting image", try "contents is one or more derivations that will be copied in the new layer of the resulting image." I think that would suggest using a list if necessary.

@jgillich
Member

What's the status of this?

@lucabrunox
Contributor Author

@jgillich mergeable. I'm going to rebase, retest and merge right now.

@jgillich
Member

Awesome! 👯

@lucabrunox
Contributor Author

Tested pullImage; the sha is almost always wrong. I think docker changed something again :(

@lucabrunox
Contributor Author

Nvm it's now somewhat consistent. I guess there's still some non-determinism somewhere in pullImage, but we'll see with time.

lucabrunox pushed a commit that referenced this pull request Jan 13, 2016
Composable docker images with Nix
@lucabrunox lucabrunox merged commit 5b1d1a3 into NixOS:master Jan 13, 2016
@lucabrunox
Contributor Author

Merged. No reason to keep this unmerged, it can be improved over time and already does a very good job I think.

@datakurre
Contributor

👍

@cstrahan
Contributor

Wow, this is pretty awesome looking!

@datakurre
Contributor

@lethalman Do you know whether something special is required for hydra on nixos to be able to build an image with runAsRoot (requiring kvm)? Images without runAsRoot build fine. Images with runAsRoot build fine as a user, but not by hydra (on the same machine).

@datakurre
Contributor

@lethalman Sorry for bothering you. It seems that supportedFeatures = [ "kvm" ]; for nix.buildMachines did it.
