Scidist -- Nix for scientific computing
=======================================

Introduction
------------

Here's what I want for a scientific software distribution system (or
"Scidist", which I'll henceforth dub my utopian scientific
build/distribution system):

- Non-root mode is completely necessary (and may be the reason a lot
  of the current scientific distributions exist).

- If I need to go back in time to compare with a run I did a year
  ago, it should be efficient. Jumping back and forth in time to
  compare effects (think benchmarks!) should be as quick and easy as
  jumping in git history.

- Should be easy to switch between different versions of LAPACK/BLAS,
  or different Fortran compilers, or say "I want to be mostly on a
  stable branch but live on the trunk in cosmological
  software". Again, jumping between different configurations should
  be as easy as jumping between git branches.

- Must encourage working on upstream projects. That is, a very low
  threshold for patching upstream projects -- if I fix an issue by
  sending a patch to NumPy upstream, I want to start using the fix
  right away without having to "manually maintain my own package".

- Should be easy to use to push *my own* software from my laptop to
  the cluster or to my peers. I guess this may be another instance of
  the point above.

- Must support Python well, but not be overly focused on
  Python-specific solutions, and ideally be useful outside the Python
  community as well. The software I use is mostly C/C++/Fortran;
  Python is just a nice shell around it. Anything that encourages
  bundling LAPACK with the Python extension should put up a BIG
  warning flag.

And above all, it must be *simple*. It is possible to solve a lot of
things by adding lots of complexity and features (think popular Linux
distros), but I don't think that path is viable for the scientific
community -- we just need better, simpler ideas.

Note that points 2-4 can be summarized as "must be more git-like". If
it is *trivial* and *efficient* to branch and merge an entire
scientific distribution (of course, you'd still track a stable and
tested upstream!), I feel 2-4 are almost solved, or at least it'll be a
very decent foundation to build a few utility scripts on.

Before I start rambling about Nix, here are some things I evaluated
before I fell for the Nix approach:

- I was a fan of Gentoo Prefix for a while. Pro: Relies on the huge
  Gentoo community, Sage already ported, uses the "ports" approach of
  a single repo for all package definitions. Con: Simply too
  complicated, wants to force you to "live on the trunk". Also lacks
  several items on the above list.

- Other solutions fail for much the same reasons our current approaches
  fall short, but here are some links:

  - http://0install.net/
  - http://lilypond.org/gub/
  - http://www.gnu.org/software/gsrc/


The core idea behind Nix
------------------------

Fortunately, Eelco Dolstra comes, in my opinion, close to fixing this
problem in his PhD thesis:

  http://www.st.ewi.tudelft.nl/~dolstra/pubs/phd-thesis.pdf

The result is now the Nix project,

  http://nixos.org

It is fortunately very cleanly split into four components: the Nix
language and build system, nixpkgs, NixOS, and Hydra -- I'll focus on
Nix itself for now, touch upon nixpkgs eventually, and totally ignore
NixOS and Hydra (except to mention right now that Hydra is a CI system
for Nix).

Read the docs for the details, but here's a quick tour:

Each package is described using a simple file (or small set of files)
containing a) a tarball download link or pointer to a VCS repo, b) some
metadata, c) commands to build it, d) any patches. By separating the
main download from the packaging and keeping many packages in the same
repo, one can work quite efficiently with revision control on a set of
packages. This part is similar to FreeBSD ports and Gentoo/Portage.

Each package is built and installed into its own directory, whose
name contains a hash of *all* the inputs to the package: the downloaded
tarball, the build script, the C compiler used, the Bash used, and so
on. So one would have::

    $NIXSTORE/as8f76234fasdgfas-myprogram-1.0/bin/myprog
    $NIXSTORE/as8f76234fasdgfas-myprogram-1.0/lib/libmyprog.so

and so on. The big point is that when upgrading to 1.1,
1.0 is still left in place, and 1.1 simply gets a new hash since
the downloaded tarball etc. hashes differently::

    $NIXSTORE/234asdfas1234-myprogram-1.1/bin/myprog
    $NIXSTORE/234asdfas1234-myprogram-1.1/lib/libmyprog.so

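To make the hashing idea concrete, here is a minimal Python sketch. It is
*not* how Nix actually computes its store paths; the function name
``store_path``, the input names, and the 16-character truncation are all
made up for illustration. It only shows that changing any single input
changes the directory name, while identical inputs always land in the
same place.

```python
import hashlib


def store_path(store_root, name, version, inputs):
    """Return a store directory whose name embeds a hash of all build inputs.

    `inputs` maps an input name (tarball, build script, compiler, ...) to a
    string identifying it. This is a simplification for illustration, not
    the real Nix hashing scheme.
    """
    digest = hashlib.sha256()
    for key in sorted(inputs):  # sort so that dict ordering cannot matter
        digest.update(f"{key}={inputs[key]}\n".encode("utf-8"))
    return f"{store_root}/{digest.hexdigest()[:16]}-{name}-{version}"


# Hypothetical inputs shared by both versions of the package.
common = {
    "build_script": "./configure && make && make install",
    "cc": "gcc-4.5.1",
    "bash": "bash-4.1",
}
v10 = store_path("/nix/store", "myprogram", "1.0",
                 {**common, "tarball": "checksum-of-1.0-tarball"})
v11 = store_path("/nix/store", "myprogram", "1.1",
                 {**common, "tarball": "checksum-of-1.1-tarball"})
print(v10)
print(v11)
```

Swapping ``cc`` for another compiler would give a third path in exactly
the same way, which is what lets several builds coexist in the store.
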
The idea is *not* to support using multiple versions concurrently in
the same program (which is a bad idea for any software). The big point
is that this facilitates very quickly switching between branches of
package sets, atomic upgrades, perfect rollbacks, and so on. Imagine
creating one branch of your software where everything is built with
ATLAS, another one where everything is built with GotoBLAS2, and
atomically and instantly switching between them. While not so
useful for production use, this can be *very* useful when debugging.

The "build artifact output directories" are called derivations in
Nix-speak. To describe a package (or, compute a derivation), one uses
a high-level functional programming language to describe every package
as a programmatic function. Fortunately, packages are still built
using Bash commands -- it is only processing package parameters (and
passing them on to Bash) that gets done in the functional language.
Of course, one composes a package using other functions, so one could
support installing an SPKG (for the non-Sage people, see
http://www.sagemath.org/doc/reference/sage/misc/package.html for a
description of what is meant by an SPKG) by writing a
mkSagePkgDerivation function that takes the spkg and does most of what
you need, only leaving you to connect configuration parameters with
the corresponding Sage environment variables.

Every package is purely a function of its inputs, and one is strict
about requiring all inputs that affect the build process to be passed
in. This is quite literal: If you want to build a C program, you need
to pass in which "stdenv" to use (meaning which bash, C compiler,
automake, etc.). If you want to build a Fortran program, the Fortran
compiler must be passed in as well. If you want to check out the
sources from Git instead of downloading a tarball, pass in the
"fetchGit" function that can be called to achieve this. Downloading a
tarball isn't magic, it is simply part of the functionality of "stdenv",
which you can trivially replace with your own.

So, again, if you want your setup to contain two versions of a piece
of software, e.g., compiled with different Fortran compilers, you can
mostly call the same function twice while passing in different Fortran
compilers. The results will hash differently and so be installed in
separate locations.

You only specify build-time dependencies, which are simply other
packages passed as arguments to your package-building
functions. "Hard" run-time dependencies, where you link exactly one
derivation to a specific version of another, are automatically
registered simply by grepping through the built derivation for the
hashes of the build-time dependencies. So if your program links
with a shared library, the dependency will be found. In some cases
one may need to emit the hash of a run-time dependency to a file
just to make sure it is picked up.

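The "grep for hashes" trick is simple enough to sketch in a few lines of
Python. This is only an illustration under my own assumptions:
``find_runtime_deps`` and the hash/package names are hypothetical, and
the sketch skips symlinks and reads whole files into memory.

```python
import os
import tempfile


def find_runtime_deps(output_dir, build_inputs):
    """Return names of build inputs whose hash appears in a file under output_dir.

    `build_inputs` maps a hash string to a package name. Symlinks are
    skipped to keep the sketch short.
    """
    found = set()
    for dirpath, _dirnames, filenames in os.walk(output_dir):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if os.path.islink(path):
                continue
            with open(path, "rb") as f:
                data = f.read()
            for input_hash, package in build_inputs.items():
                if input_hash.encode("ascii") in data:
                    found.add(package)
    return found


# Demo: a fake "built" package whose binary embeds the path of one input.
out = tempfile.mkdtemp()
os.makedirs(os.path.join(out, "bin"))
with open(os.path.join(out, "bin", "myprog"), "wb") as f:
    f.write(b"\x7fELF...RPATH=/store/aaa111-libfoo-2.0/lib\x00")

inputs = {"aaa111": "libfoo-2.0", "bbb222": "gcc-4.5"}
deps = find_runtime_deps(out, inputs)
print(deps)
```

Here the compiler was a build input but left no trace in the output, so
only ``libfoo-2.0`` is registered as a run-time dependency.
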
Finally, there's a garbage collector that removes unused derivations.

``nixpkgs`` builds on top of ``nix`` to actually build a distribution.
It does so by:

- Having a tree of package files (e.g.,
  ``nixpkgs/pkgs/development/interpreters/python/2.6/default.nix``)

- Having some top-level files that ``import`` the packages and string
  them together. From ``nixpkgs/pkgs/top-level/all-packages.nix``::

      rsnapshot = callPackage ../tools/backup/rsnapshot {
        logger = inetutils;
      };

      inetutils = [...]

Finally, the ``nix-env`` command is the front-end to Nix. Its
essential purpose is to create the "top-level" derivation that asks
for every other derivation that is needed (referencing them from
``all-packages.nix``, which gives you closures that also reference
build dependencies); this derivation builds a ``/local``-style tree of
symlinks into the other derivations, so that only one path needs to be
added to ``$PATH``. ``nix-env`` swaps your current set of symlinks
when you switch "profile"; potentially to an earlier "generation" (a
rollback).

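The symlink swap is what makes switching and rolling back cheap. A
minimal Python sketch of the idea follows; it assumes a POSIX file
system, and ``switch_profile`` plus the ``gen1``/``gen2`` directories are
names I made up, not anything ``nix-env`` exposes.

```python
import os
import tempfile


def switch_profile(profile_link, target_dir):
    """Atomically repoint the symlink `profile_link` at `target_dir`."""
    tmp_link = profile_link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(target_dir, tmp_link)
    # Renaming over the old link replaces it in one step on POSIX, so
    # $PATH never points at a half-updated tree.
    os.replace(tmp_link, profile_link)


root = tempfile.mkdtemp()
gen1 = os.path.join(root, "gen1")
gen2 = os.path.join(root, "gen2")
os.makedirs(gen1)
os.makedirs(gen2)
profile = os.path.join(root, "local")

switch_profile(profile, gen1)   # initial install
switch_profile(profile, gen2)   # upgrade
current = os.readlink(profile)
switch_profile(profile, gen1)   # rollback
rolled_back = os.readlink(profile)
print(current, rolled_back)
```

Nothing is copied or rebuilt during a switch; the old tree stays in the
store until it is garbage collected.
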

Needed work
-----------

My current hunch is that we should not use the nixpkgs distribution,
but build something new on top of core Nix. Reasons:

- nixpkgs is probably more complicated than what we need

- easier to get up and running by starting from scratch

- we want to focus on a very small subset of packages and have
  individual release schedules

- Nix is too difficult to use for non-experts, front-end should look
  a bit different

- ``nix-env`` in some ways duplicates what a VCS can do for us

- I must admit I really loathe the deep directory nesting of packages
  in ``nixpkgs`` and would like a flatter namespace, although this is
  not a good reason...

Of course, after having built something and got our direction
straight, one can decide that a merger with ``nixpkgs`` is in order.
This *will* however require lots of patches to ``nixpkgs`` scripts; it
is not possible to use ``nixpkgs`` out of the box.

As a general idea, I hope that as much of the "state" as possible
can be put into git repositories that are created for the user.
So the ``$SCIDIST`` distribution root directory may look like::

    $SCIDIST/conf     # local git repository with configuration,
                      # such as text file listing wanted packages
    $SCIDIST/scidist  # git repository containing .nix expressions,
                      # upgrading a package involves pull-ing this
                      # one to a new version and then rebuilding
    $SCIDIST/local    # symlink to a nix derivation, put this in $PATH
    $SCIDIST/store    # Nix state, managed entirely by basic nix system
    $SCIDIST/bin      # commands for package management -- may also link to a derivation

In release 0.1, the system works entirely by:

- Modifying configuration files in ``/conf`` in order to describe the
  system one wants (say, there's a text file of wanted packages)

- Running ``bin/scidist build [confdir] [localsymlinktarget]`` to make
  sure that ``/local`` matches ``/conf``

- A rollback to a previous configuration is done manually like this::

      (cd conf; git reset --hard HEAD^)
      bin/scidist build

- Upgrading to the next release of scidist looks like this::

      (cd scidist; git branch prevrelease; git pull)
      bin/scidist build

  And if that broke things::

      (cd scidist; git checkout prevrelease)
      bin/scidist build

- To use a Nix distribution there'll be a convenient command::

      $ bin/scidist env
      export PATH=/home/dagss/nix/local:$PATH
      # PYTHONPATH etc. as needed

  So, we have a canonical way of getting environment variables set up
  for a shell::

      $ source <(/path/to/my/scidist/bin/scidist env)

Building on this, we can start to add polish in the ``scidist`` command.


Ticket #1: Built distributions cannot be moved
''''''''''''''''''''''''''''''''''''''''''''''

This is the old question of -Wl,-rpath vs. LD_LIBRARY_PATH. The
current approach in ``nixpkgs`` is to hard-code RPATH for loading dynamic
libraries, meaning a set of Nix packages cannot easily be relocated
(and they are not made for it). Sage OTOH uses LD_LIBRARY_PATH;
however, this is broken for other reasons (it interferes with how dynamic
libraries are loaded *globally*, so that, e.g., it's impossible for me
to launch the local convenience program for sending something to the
printer in my institute from within a Sage shell).

Fortunately, in modern ``ld.so`` there's support for relative RPATHs
using ``$ORIGIN`` (type ``man ld.so``), which solves this problem if
one only builds packages diligently. As a quicker hack, the Nix team
has created the ``patchelf`` utility for rewriting RPATHs after the
fact; this could be used on older systems. Not sure about Mac or
Cygwin...

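Computing the relative RPATH entry is mostly path arithmetic. The helper
below is a hypothetical sketch (``origin_rpath`` is my own name) showing
what string one would hand to the linker, or to ``patchelf``, for a
binary and the library directory it needs.

```python
import posixpath


def origin_rpath(binary_path, lib_dir):
    """Return an RPATH entry for lib_dir relative to the directory of binary_path."""
    rel = posixpath.relpath(lib_dir, start=posixpath.dirname(binary_path))
    return "$ORIGIN" if rel == "." else "$ORIGIN/" + rel


# A library inside the same package as the binary.
print(origin_rpath("/opt/scidist/store/abc-myprogram-1.0/bin/myprog",
                   "/opt/scidist/store/abc-myprogram-1.0/lib"))
# $ORIGIN/../lib

# A library living in another package in the same store.
print(origin_rpath("/opt/scidist/store/abc-myprogram-1.0/bin/myprog",
                   "/opt/scidist/store/def-atlas-3.8/lib"))
# $ORIGIN/../../def-atlas-3.8/lib
```

Because both entries are relative to the binary, the whole store
directory can be moved as a unit without breaking library lookup.
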
Ticket #2: Nix patches its gcc etc.
'''''''''''''''''''''''''''''''''''

This one initially made me put Nix aside for a year, but I was really
wrong: The patches are AFAIK (I didn't actually read them) only about
making sure ``/usr/include`` isn't looked up by default; patching the
toolchain is not necessary for the Nix concept to work. There are many
"lightweight sandboxes" exploiting LD_PRELOAD available that can be
used instead -- not as secure, but a lot more reasonable for our
purposes. See #4.

Ticket #3: nix-env is too complicated
'''''''''''''''''''''''''''''''''''''

My current hunch is that we must make our own front-end tool instead
of ``nix-env``. Simply put, since our final goal is a bit different
from NixOS, we need a different UI. See #4.

An idea is to utilize git instead of some of the features ``nix-env``
has. We don't want profiles; instead one can have a file of the
packages one wants to have installed under (a local) git repository,
and switching profiles then means having many branches and/or clones
of that repository.

Then, of course, a nice frontend to install and remove packages, but
always just as a thin layer on top of something very simple and as
stateless as possible.

Yes, this was vague, more investigation needed. And by all means,
let's just wrap ``nix-env`` if possible.

Ticket #4: nixpkgs wants its own toolchain
''''''''''''''''''''''''''''''''''''''''''

At least on the Linux platform, ``nixpkgs`` takes things to the extreme: In
order to build its own GCC, it even downloads a binary bootstrap
tarball (with Busybox!), to really ensure that things are the same
everywhere.

Of course, ``libc.so`` in the binary bootstrap tarball
segfaults on my Uni's computers...

For our purposes we *really* don't want to be this pedantic. If
nothing else, having to build the toolchain is a major marketing
problem in getting Scidist accepted. Also there's a real problem: For
GUI components, it is likely a lot more reliable to link with the
system ``libX`` than to build our own.

However, it would be nice to not lose the integrity features;
if I build the same Scidist on two Ubuntu 10.04 computers, it'd
be nice if the hashes ended up the same, but they should end up
different on Ubuntu 10.10. At the same time, we must make sure
that users can upgrade their system without rebuilding everything.

Here's a way to do it:

- We have a script that probes the system for the presence of a set
  of predeclared "base system" packages that we want Scidist to use
  from the host system. We'll use stdenv/gcc as the example.
  When run, the script finds the current path to gcc, runs it with
  ``--version``, and records the output. The products of the script are
  Nix packages::

      $SCIDIST/host/gen1/stdenv.nix
      $SCIDIST/host/gen1/...

  Inside ``stdenv.nix``, there's code to "build" the package, which
  means simply wrapping the binaries present on the system, while
  making sure (at run-time, somehow) that ``--version`` still produces
  the same output. Also, LD_PRELOAD tricks (see #2) can be played in
  the wrappers to make sandboxed builds, which helps with writing
  packages to make sure all dependencies are explicit.

  When run again, if anything has changed (say, gcc was upgraded through
  ``apt-get install``), a new "host system generation" is created::

      $SCIDIST/host/gen1/stdenv.nix
      $SCIDIST/host/gen2/stdenv.nix
      $SCIDIST/host/gen3/stdenv.nix

  each containing a different string for their expected ``--version``
  output.

- The top-level derivation (that is managed by the package manager
  front-end) must end up looking something like this (though probably
  generated from a domain-specific language)::

      [
        (cfitsio {stdenv=gen1.stdenv}),
        (python2.7 {stdenv=gen1.stdenv}),
        (healpix {stdenv=gen2.stdenv}),
      ]

  Here, ``cfitsio`` and ``python2.7`` were installed first, then the
  host ``gcc`` was upgraded (resulting in a new host-generation being
  created), and then finally ``healpix`` was installed.

  This also solves the problem with distributing binary packages. It
  is OK to ship the "host-expressions" from one computer to another
  as long as nothing triggers them to build. So, you could
  have::

      [...]
      (cfitsio {stdenv=sageBuildFarm34.stdenv}),
      [...]

  And if you want to trigger a local build of that package instead of
  using an available binary, you change it and rebuild::

      [...]
      (cfitsio {stdenv=gen3.stdenv}),
      [...]


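The host-generation bookkeeping described above can be sketched roughly
as follows. Everything here is an assumption made for illustration: the
helper names, the fact that a plain ``gcc.version`` text file is written
instead of a real ``stdenv.nix``, and that only a single tool is tracked.

```python
import os
import subprocess
import tempfile


def probe_version(command):
    """Return what `command --version` prints on stdout."""
    result = subprocess.run([command, "--version"],
                            capture_output=True, text=True, check=True)
    return result.stdout


def record_generation(host_dir, tool, version_output):
    """Return the generation directory name matching `version_output`.

    Reuses the newest generation if its recorded output is unchanged,
    otherwise creates the next genN directory.
    """
    gens = sorted(
        (d for d in os.listdir(host_dir)
         if d.startswith("gen") and d[3:].isdigit()),
        key=lambda d: int(d[3:]),
    )
    if gens:
        record = os.path.join(host_dir, gens[-1], tool + ".version")
        if os.path.exists(record):
            with open(record) as f:
                if f.read() == version_output:
                    return gens[-1]
    new_gen = "gen%d" % (int(gens[-1][3:]) + 1 if gens else 1)
    os.makedirs(os.path.join(host_dir, new_gen))
    with open(os.path.join(host_dir, new_gen, tool + ".version"), "w") as f:
        f.write(version_output)
    return new_gen


# Demo with canned strings; on a real host one would pass
# probe_version("gcc") as the third argument instead.
host = tempfile.mkdtemp()
first = record_generation(host, "gcc", "gcc 4.4.3\n")
again = record_generation(host, "gcc", "gcc 4.4.3\n")
upgraded = record_generation(host, "gcc", "gcc 4.5.1\n")
print(first, again, upgraded)  # gen1 gen1 gen2
```

Packages built earlier keep pointing at ``gen1``, so an ``apt-get``
upgrade of gcc does not by itself force a rebuild of everything.
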
Ticket #5: Downloadable package/bootstrap scripts
'''''''''''''''''''''''''''''''''''''''''''''''''

Title says it all.

Ticket #6: Soft run-time dependencies
'''''''''''''''''''''''''''''''''''''

With Python packages, it is often the case that you depend on another
package at run-time only. Adding the package as a hard run-time
dependency is overkill; it would trigger a rebuild whenever the
dependency changes, but we know the result will be the same.

So perhaps we need something to say "if you install ``joblib``, you
want ``argparse`` as well, even if there is no explicit dependency in
the Nix expressions". This really classifies more as meta-information
about packages than a part of Nix expressions.

Perhaps Nix has this somewhere as well and I just didn't find it yet.

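One way such meta-information could work is as a plain table consulted
by the front-end before it hands the package list to Nix. The sketch
below is hypothetical (``expand_wanted`` and the ``soft_deps`` table are
my own invention): it expands the list of wanted packages without
touching any build inputs, so no hashes change and nothing is rebuilt.

```python
from collections import deque


def expand_wanted(wanted, soft_deps):
    """Return `wanted` plus every package reachable through soft_deps.

    `soft_deps` maps a package name to the packages it wants at run-time.
    Order of first appearance is kept and duplicates are dropped.
    """
    queue = deque(wanted)
    seen = set()
    ordered = []
    while queue:
        pkg = queue.popleft()
        if pkg in seen:
            continue
        seen.add(pkg)
        ordered.append(pkg)
        queue.extend(soft_deps.get(pkg, ()))
    return ordered


soft_deps = {"joblib": ["argparse"]}
print(expand_wanted(["joblib", "numpy"], soft_deps))
# ['joblib', 'numpy', 'argparse']
```
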