Skip to content

Latest commit

 

History

History
332 lines (229 loc) · 11.3 KB

REPRODUCIBLE.org

File metadata and controls

332 lines (229 loc) · 11.3 KB

GNU Guix reproducible software deployment

Table of Contents

Introduction

In this document I will describe how to have reproducible software deployment with GNU Guix adding a caching server for binaries (local or on the internet).

GNU Guix gives full control over the software dependency graph. This is because Guix is a declarative software deployment system, unlike, for example, Debian, Docker and Brew which give different results depending on the order of installation and the time of installation. Guix has none of that. You get what you expect. More can be read about this in SYSCONFIG.org.

One of the great features of GNU Guix is that it supports rolling upgrades. I.e., when you do a ‘guix pull’ it will install the latest and the greatest - matching the state of the build-farm at that point. Because GNU Guix packages are isolated, i.e., packages can not overwrite each other or their dependencies, it is safe in that way.

Here we describe how to recreate the same software stack every time:

  1. Use a checked out git repository of GNU Guix packages
  2. The checked out git HASH value defines the origin of the dependency graph
  3. Build software of fetch binary substitutes against that git repo/graph

In autumn 2018 Guix pull got updated to make it reproducible in an easy way (it was possible, but harder before). That means you can run ‘guix pull’ between different machines and you get the same result!

Essentially, when you do a ‘guix pull’ it fetches the GNU Guix source code tree from a git repo and builds that. By default it fetches the latest tree. Now, when you pass the git checkout hash it will build a specific version of the tree. E.g.,

guix pull --commit=ff349415b27cc764fd7168ef35ca76c3b8b05889

This installs a new version of guix in $HOME/.config/guix/current/bin! Note that you can use substitute servers - which will fetch pre-build binaries, more on that below.

When you run ‘$HOME/.config/guix/current/bin/guix –version’ you should see the HASH ff349415b27cc764fd7168ef35ca76c3b8b05889 of above commit.

Next we create a caching server for binaries (named ‘guix publish’)

  1. Create a key for the server
  2. Publish the server on the network
  3. Tell your machines to use the server for substitute binaries

It is all rather simple, really.

A side-effect of taking this approach is that you’ll spend less time downloading and installing large binaries. GNU Guix can be quite heavy in its use of bandwidth if you opt for rolling upgrades.

Introducing a GNU Guix caching server (guix publish)

But better than the archive option is to set up a Guix publish server.

Set up server

It is important to use recent versions of the guix daemon and guix on both ends.

  1. Generate a key pair for the guix-publish service
  2. Run the guix-publish service (daemon)
  3. Either build or pull all the packages you want to distribute

For the last, get a git checkout of the guix repository as described in INSTALL.org.

Then you need to add software to the cache by either (a)

for n in `./pre-inst-env guix packages -A | cut -f1`; do
  ./pre-inst-env guix build "$n"; done

or (b)

for n in `./pre-inst-env guix packages -A | cut -f1`; do
  ./pre-inst-env guix --no-substitutes build "$n"; done

(a) pulls packages available from hydra, (b) tries to rebuild them all. You can mix the two.

These for-loops will fail altogether if a single build fails. This is probably not what you want. So try

for n in `./pre-inst-env guix packages -A | cut -f1`; do
  ./pre-inst-env guix build "$n" || true; done

And you might also want to look into the –cache-failures option for the guix-daemon. And instead of just using a for-loop you might want to use gnu parallel or something.

First generate the key in /etc/guix/signing-key.pub

guix archive --generate-key

To publish the server is a trivial

useradd guixpublisher
guix publish -p 8080 -u guixpublisher

Note that it is also possible to use the GUIX_PACKAGE_PATH to distribute pre-built binaries. Please note the section Dealing with special packages.

Use server

Example for Guix published on http://penguin.org:8080

The public key on the publishing server is defined in /etc/guix/acl

(public-key
  (ecc
    (curve Ed25519)
      (q #AFF68C4E099401E85BE2D7375C1DB5E8A29F1DB83299038122AF5C0984099CF8#)))

On the receiving machine run

sudo guix archive --authorize

so as to authorize the distributing (publishing) server. Paste in the scheme expression for the key above and finish with ctrl-D. After that you can use something like

guix package -i boost --substitute-urls="http://penguin.org:8080"

Or use it with the guix-daemon using hydra as a fallback

guix-daemon --build-users-group=guixbuild --substitute-urls="http://penguin.org:8080 http://mirror.guixsd.org http://hydra.gnu.org"

(for multiple substitutes to work make sure you are running Guix > 0.9, note that the Guix mirror automatically fetches the master too)

To test the server by hand go to the URL

curl http://penguin.org:8080/nix-cache-info

and check the contents, it should show something like

StoreDir: /gnu/store
WantMassQuery: 0
Priority: 100

Nginx as a proxy

To use Nginx as a proxy use the following settings:

server {
  listen 80;
  server_name guix.genenetwork.org;
  access_log  logs/guix.access.log;

  proxy_connect_timeout       3000;
  proxy_send_timeout          3000;
  proxy_read_timeout          3000;
  send_timeout                3000;

  location / {
      # proxy_set_header   Host $host;
      proxy_set_header   Host      $http_host;
      # proxy_redirect     off;
      proxy_set_header   Connection keep-alive;
      proxy_set_header   X-Real-IP $remote_addr;
      proxy_set_header   X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header   X-Forwarded-Host $server_name;
      proxy_pass         http://127.0.0.1:8080;
  }
}

which can probably be simplified. Start nginx with something like

/root/.guix-profile/sbin/nginx -c /etc/nginx/nginx-genenetwork.conf -p /var/spool/nginx

Trouble shooting guix publish

The guix substitute server is not very helpful giving messages - i.e., it fails silently to comply if an authorization key is missing, or if you pass in a wrong URL. Best is to test the URL, e.g.

curl http://guix.mycachingserver.org
Resource not found: /

Next, look at the output of guix publish when querying. It should show

GET /nar/vxdm2dqckv3yvwihr4hs6f886v6104az-zlib-1.2.8
GET /nar/601j6j3fa9nf37vyzy8adcaxcfddw4m1-libsm-1.2.2

Typical things to go wrong are:

  1. Webserver not visible
  2. Key not working
  3. Package tree differs
  4. Packages created with or without –no-grafts option

It is advisable to use the same versions of guix and guix-daemon at the same time.

Source reproducible graphs with Guix channels

Guix recently added an extremely good feature named ‘channels’. Channels allow you to support out-of-tree packages. Especially relevant for versions of software that are not on the GNU Guix trunk.

Source reproducible graphs using GUIX_PACKAGE_PATH

GUIX_PACKAGE_PATH is the traditional way for doing out-of-tree package builds.

Extras

Using the GNU Guix archive feature (guix archive)

With the archive option a package with all its dependencies can be copied from one machine to another. For rapid reproducible deployment this can be a useful method.

Generate the key

First, as root, generate a key for the machine:

guix archive --generate-key

Note this can take forever on a server without a keyboard so you may want to generate a key on a local machine and copy it across. Depending on how Guix was installed the key may be stored in etc/guix or usr/local/etc/guix, e.g.,

cat /usr/local/etc/guix/signing-key.pub

    (public-key
     (ecc
      (curve Ed25519)
      (q #11217788B41ADC8D5B8E71BD87EF699C65312EC387752899FE9C888856F5C769#)))

Then create a signed tar ball with

guix archive --export -r ruby > guix_ruby.nar

The NAR file is a 200Mb archive which contains the Ruby binary with all its run-time dependencies. Next on a new machine you can unpack it with

guix archive --import < guix_ruby.nar

A more advanced example could look like

env GUIX_PACKAGE_PATH=../guix-bioinformatics/ ./pre-inst-env guix archive --export --no-grafts -r $(readlink -f /usr/local/guix-profiles/gn2-2.10rc5) |ssh penguin guix archive --import

which includes a package path, a recently built guix, the profile in /usr and an install on a remote machine. A very elegant way to synchronize binary software on machines.

Reproducible software deployment using a git checkout

The above approach presents ‘guix pull’ which gives you a version of the package source tree (transparently).

Git checkout of GNU Guix repository

Note: the following method is no longer needed now we have the ‘guix pull’ exact checkout as described above. I am leaving it in, just in case you find it useful.

A reproducible software graph can be handled via a git checkout of the Guix package repository. Follow the guidelines in INSTALL.org to get a git checkout and make of the repository inside a Guix container. After that, inside the repository you should be able to run

./pre-inst-env guix package -A

At this point your source repo really defines your graph. So if you do a `git log’ you can see the SHA value which is the current version of your guix git repo/graph, e.g.

commit 96250294012c2f1520b67f12ea80bfd6b98075a2

Anywhere you install software based on the git checkout with this SHA value you will get the exact same result. For example using this version of the git repo/graph

./pre-inst-env guix package -A ruby

Will install the exact same Ruby, for x86_64 this can be

/gnu/store/pgks1l9cl696j34v9mb35lk8x6lac3b0-ruby-2.2.4

In fact, the only external outside-Guix run-time dependency is the Linux kernel API which, fortunately, does not change much. There is no (external) glibc dependency because glibc is part of the graph. GNU Guix comes with its own dependencies (i.e., batteries included).

Set up an authoritative git

For general deployment you can set up a git repo which contains the tree that gets used to deploy software in a reproducible way. Note that git branches can be helpful when dealing with different deployment versions (e.g., development, testing, staging, production).