bac is a command-line tool written in Python that is responsible for managing the build artifact cache directory. The most important aspect of BAC is what it does not do, but instead leaves as the responsibility of the caller.
BAC is only meant to be used by package management systems (or by a package developer during debugging).
The BAC store is a directory structure on disk that looks roughly like this:
$bacstore/sources/python-2.7.1.tar.gz                           # cached downloaded tarball
$bacstore/sources/numpy.git                                     # raw git repository
$bacstore/sourcedb/hKUWhBuneltGSN4s0N-LMOpG27Q                  # see bac fetch command
$bacstore/builds/python-hvfkN-qlp-zhXR3cuerq6jd2Z7g/bin/python  # already produced builds
$bacstore/builds/...
$bacstore/tmp/numpy-6dcfXufJLW3J6S-9rRe4vUlBj5g/....            # build in progress or failed
The first feature of BAC is managing downloading source code and labeling it, so that it can be referred to in build instructions:
bac fetch /path/to/bacstore http://python.org/ftp/python/2.7.2/Python-2.7.2.tar.bz2 [optional md5 hash]
bac fetch /path/to/bacstore https://github.com/numpy/numpy.git 72185d34170369ec07e8e84ed18d2f6a814e327a
Each of these commands will
- Download the given sources to $bacstore/sources (this may be a wget, a git clone, or a git remote add ...; git fetch ...)
- Make a label (such as rDR41po8gfpi5g9cNpYWWk5easQ)
- Create a database file in $bacstore/sourcedb/$label containing the location of the downloaded source (i.e. a tarfile; or a git repo + a git commit)
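The labeling and database steps above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how a label is computed or what the database file looks like. The sketch assumes the label is a URL-safe base64 SHA-1 digest (which happens to give 27-character strings like the example label) and that the record is stored as JSON; `make_label`, `record_source` and the record keys are hypothetical names.

```python
import base64
import hashlib
import json
import os


def make_label(payload: bytes) -> str:
    """Return a 27-character URL-safe label derived from a SHA-1 digest.

    Assumption: the real label scheme is not specified in the text.
    """
    digest = hashlib.sha1(payload).digest()
    # 20 digest bytes encode to 28 base64 characters, one of which is padding.
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def record_source(bacstore: str, record: dict) -> str:
    """Write a sourcedb entry describing where a fetched source lives."""
    # Serialize deterministically so the same record always gets the same label.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    label = make_label(payload)
    db_dir = os.path.join(bacstore, "sourcedb")
    os.makedirs(db_dir, exist_ok=True)
    with open(os.path.join(db_dir, label), "w") as f:
        json.dump(record, f, sort_keys=True)
    return label
```

Deriving the label from the record itself means that fetching the same source twice yields the same label, which is one plausible way to make labels usable as stable references in build instructions.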
The sister command takes a label for a set of sources and unpacks them:
bac unpack rDR41po8gfpi5g9cNpYWWk5easQ [dest-dir]
(be aware of the git archive command when implementing this).
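A minimal sketch of what unpacking might look like, assuming a hypothetical sourcedb record with a `type` key of either `tarball` (plus `path`) or `git` (plus `repo` and `commit`). The record layout and the `unpack` function name are assumptions for illustration; only the use of git archive is suggested by the text.

```python
import io
import os
import subprocess
import tarfile


def unpack(record: dict, dest_dir: str) -> None:
    """Unpack the sources described by a (hypothetical) sourcedb record."""
    os.makedirs(dest_dir, exist_ok=True)
    if record["type"] == "tarball":
        # tarfile detects gzip/bz2 compression automatically in the default mode.
        with tarfile.open(record["path"]) as tar:
            tar.extractall(dest_dir)
    elif record["type"] == "git":
        # `git archive` exports a single commit as a tar stream, no checkout needed.
        data = subprocess.run(
            ["git", "--git-dir", record["repo"],
             "archive", "--format=tar", record["commit"]],
            check=True,
            stdout=subprocess.PIPE,
        ).stdout
        with tarfile.open(fileobj=io.BytesIO(data)) as tar:
            tar.extractall(dest_dir)
    else:
        raise ValueError("unknown source type: %r" % record["type"])
```

Going through git archive avoids creating a working tree and leaves no .git directory in the unpacked sources.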
The following command:
bac build /path/to/bacstore mybuild.json
ensures that the build specified by mybuild.json is available in the store (building it if necessary), and returns its location. The build specification file is described below.
On success, the command prints the path to the built package on standard output. The package may have been built on the fly or found already in the cache. (Some status information should also be provided along the way, so that one can tail log files and so on; all of it should be easily parseable by a scripted caller.)
In the event of a failure, a directory will be left in $bacstore/tmp that can be inspected for post-mortem debugging. It is the responsibility of the caller to remove this.
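The build-or-reuse behaviour described above can be sketched like this. It is a simplification under stated assumptions: the function name `ensure_build`, the `run_build` callback, and the idea that the directory under tmp is moved wholesale into builds are all hypothetical; the text only fixes the directory names and the rule that failed builds are left behind.

```python
import os
import shutil


def ensure_build(bacstore: str, artifact_name: str, run_build) -> str:
    """Return the artifact path, building it only if it is not cached."""
    final_dir = os.path.join(bacstore, "builds", artifact_name)
    if os.path.isdir(final_dir):
        return final_dir  # cache hit: nothing to do

    work_dir = os.path.join(bacstore, "tmp", artifact_name)
    # Fails if a previous failed attempt is still there; removing it is
    # the caller's job.
    os.makedirs(work_dir)

    # If run_build raises, work_dir is deliberately left in place so that
    # it can be inspected for post-mortem debugging.
    run_build(work_dir)

    os.makedirs(os.path.dirname(final_dir), exist_ok=True)
    shutil.move(work_dir, final_dir)
    return final_dir
```

From the caller's point of view the two code paths are indistinguishable: either way a path under builds comes back.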
bac check /path/to/bacstore mybuild.json
Checks whether the build is already present in the store.
bac debug /path/to/bacstore mybuild.json
Like build, but only goes through setting up the environment for the build, then echoes the commands that it would have executed had a build been requested, then drops into a shell.
More commands that should be the responsibility of BAC:
- Garbage collection
The build specification fully describes a build, so that it can be hashed. Example:
{ "name" : "mypackage",
  "dependencies" :
    { "numpy" : "numpy-345wr23wrfw3r4w",
      "blas" : "ATLAS-32rasdfasdfasdf",
      "gcc" : "gcc-324qwed32e2q3d",
      "python" : "python-q324rfaewfcwqrf",
      "bash" : "bash-34raewfvq23rw"
    },
  "depend-files" : ["/usr/include/foo.h"],
  "copy-files" : ["/path/to/qsnake/spkgs/mypackage.install"],
  "files" : [
    { "filename" : "build.sh",
      "contents" : [
        "source $gcc/build_env",
        "export USE_FROBNICATOR=True",
        "export CFLAGS=-O3",
        "bash mypackage.install"
      ]
    },
    { "filename" : "mypackage.ini",
      "contents" : [
        "somearg = foo"
      ]
    }
  ],
  "sources" : "6dcfXufJLW3J6S-9rRe4vUlBj5g",
  "build_cmd" : ["$bash/bin/bash", "build.sh"],
  "hash-payload": ["foo"]
}
(Note: the lists of strings under contents are multi-line strings encoded as JSON, one list element per line.)
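To make the list-of-lines encoding concrete, here is a minimal sketch of how the files entries of a specification could be written into a build directory. The function name `write_spec_files` and the choice to end each file with a trailing newline are assumptions, not something the specification states.

```python
import os


def write_spec_files(spec: dict, build_dir: str) -> None:
    """Create each entry of spec['files'] verbatim in build_dir."""
    for entry in spec.get("files", []):
        # Each list element is one line of the resulting file.
        text = "\n".join(entry["contents"]) + "\n"
        with open(os.path.join(build_dir, entry["filename"]), "w") as f:
            f.write(text)
```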
- name
- Will be prepended to the hash string in the cache (for human-readability).
- dependencies
- Specifies existing build artifacts that this build depends upon. If these are not already present, the build fails. The full expanded list of dependencies should be given, since BAC does not know that, say, numpy was itself built against Python. Each dependency will be present in the environment variables when running the build script. These strings are included in the hash.
- depend-files
- (Unordered list) The contents of these files are included in the hash.
- copy-files
- (Unordered list) Lists files reachable on the local filesystem. The contents of these files are included in the hash, and the files are copied to the build directory.
- files
- Files that should be created verbatim in the build directory.
- sources
- The label for the tarball or git commit for the sources that should be built (as passed to bac unpack). This must have been previously downloaded using bac fetch.
- build_cmd
- Command to run (with arguments) to do the build and install, provided as a list. Often this runs a script which is given in files. The command (first list element) can either be a command to be looked up in the environment PATH (bash), a full path (starts with /), or a binary from a previously built artifact (starts with $, e.g., $perl/bin/perl, provided perl is given in dependencies).
- hash-payload
- (Ordered list) Additional bytes to throw into the hash (for whatever purposes the caller decides; e.g. some system state that cannot otherwise be tracked, such as the OS version etc.). Callers are encouraged to use user-readable strings so that it is obvious to a casual developer what is being hashed.
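The field descriptions above say which parts of a specification enter the hash. One way this could be done is sketched below. The concrete scheme is an assumption: SHA-1 over canonical JSON, URL-safe base64 output, hashing file paths together with file contents, and leaving name out of the hash are all choices made for illustration, and `spec_hash`, `artifact_name` and `read_bytes` are hypothetical names.

```python
import base64
import hashlib
import json


def read_bytes(path):
    """Default file reader; replaceable so the hash can be tested offline."""
    with open(path, "rb") as f:
        return f.read()


def spec_hash(spec: dict, read_file=read_bytes) -> str:
    """Hash everything in a build spec that should trigger a rebuild."""
    file_digests = {}
    for key in ("depend-files", "copy-files"):
        # Unordered lists: sort so that the listing order does not matter.
        file_digests[key] = sorted(
            [path, hashlib.sha1(read_file(path)).hexdigest()]
            for path in spec.get(key, [])
        )
    canonical = {
        "dependencies": spec.get("dependencies", {}),
        "files": spec.get("files", []),
        "sources": spec.get("sources"),
        "build_cmd": spec.get("build_cmd", []),
        "hash-payload": spec.get("hash-payload", []),  # ordered list
        "file-digests": file_digests,
    }
    # sort_keys makes the serialization independent of dict key order.
    payload = json.dumps(canonical, sort_keys=True).encode("utf-8")
    digest = hashlib.sha1(payload).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def artifact_name(spec: dict, read_file=read_bytes) -> str:
    """Prepend the human-readable name to the hash, as in builds/python-..."""
    return "%s-%s" % (spec["name"], spec_hash(spec, read_file))
```

With a scheme like this, changing a single string in hash-payload is enough for the caller to force a different artifact directory.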
By specification, the build process should always produce the same results when run, and the caller should not care whether a build happens or the built artifact is found in cache. When a change happens on the host environment so that a rebuild is required, this should be done by making sure the hash of the build changes. (However, commands for forcing a rebuild may be provided to facilitate debugging.)
Actually achieving build isolation and reproducibility is still the responsibility of BAC's caller. Callers are encouraged to use an LD_PRELOAD sandbox to enforce this. There is no way for BAC to enforce anything.
As hinted at above in the source $gcc/build_env line, there should be ways to quickly set up the necessary build environment provided by the build artifacts one depends on. BAC may lay down some guidelines/conventions/utilities for this, or even build a full throw-away prefix for the build (details TBD).
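Since the details are still to be decided, the following is only a sketch of one possible convention: each dependency name becomes an environment variable pointing at its artifact directory, and a leading $name in build_cmd is expanded from the same mapping. The helper names `dependency_paths` and `resolve_command` are hypothetical.

```python
import os


def dependency_paths(bacstore: str, dependencies: dict) -> dict:
    """Map each dependency name to its artifact directory, failing if absent."""
    paths = {}
    for name, artifact in dependencies.items():
        path = os.path.join(bacstore, "builds", artifact)
        if not os.path.isdir(path):
            raise LookupError("dependency not built: %s" % artifact)
        paths[name] = path
    return paths


def resolve_command(build_cmd: list, paths: dict) -> list:
    """Expand a leading $name in the command using the dependency paths."""
    cmd = list(build_cmd)
    if cmd[0].startswith("$"):
        var, _, rest = cmd[0][1:].partition("/")
        if var not in paths:
            raise KeyError("%s is not listed in dependencies" % var)
        cmd[0] = os.path.join(paths[var], rest)
    # Plain names (PATH lookup) and absolute paths pass through unchanged.
    return cmd
```

A caller could then run the build with something like `subprocess.run(resolve_command(cmd, paths), env={**os.environ, **paths}, cwd=build_dir)`, so that $gcc and friends are also visible inside build.sh.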
The build artifact directories from a successful build are write-protected. Other post-processing may be deemed necessary too and should be built into BAC.
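Write-protecting a finished artifact could be as simple as clearing the write bits on every file and directory. The sketch below assumes a POSIX filesystem; `write_protect` is a hypothetical helper name, and whether BAC does exactly this is not stated.

```python
import os
import stat


def write_protect(root: str) -> None:
    """Clear all write permission bits on root and everything below it."""
    no_write = ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH)
    # Walk bottom-up so that a directory is locked after its contents.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks: chmod would change the link target instead.
            if not os.path.islink(path):
                mode = stat.S_IMODE(os.lstat(path).st_mode)
                os.chmod(path, mode & no_write)
        mode = stat.S_IMODE(os.lstat(dirpath).st_mode)
        os.chmod(dirpath, mode & no_write)
```

Note that this protects against accidental modification only; a process running as the owner can always restore the write bits.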
Borrowing terminology from Nix, a profile is a build artifact that contains symlinks into all of its dependencies.
It may make sense to either build this functionality into BAC, or to do it in a build-script that is fed to BAC the usual way.
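As a rough illustration of what building a profile involves, the sketch below mirrors the directory layout of each dependency and symlinks every file into the profile directory. The merging strategy (real directories, per-file symlinks, failing on conflicts) is an assumption, and `build_profile` is a hypothetical name.

```python
import os


def build_profile(profile_dir: str, dependency_dirs: list) -> None:
    """Symlink every file of every dependency into profile_dir."""
    for dep_root in dependency_dirs:
        dep_root = os.path.abspath(dep_root)  # keep link targets absolute
        for dirpath, dirnames, filenames in os.walk(dep_root):
            rel = os.path.relpath(dirpath, dep_root)
            target_dir = os.path.normpath(os.path.join(profile_dir, rel))
            os.makedirs(target_dir, exist_ok=True)
            for name in filenames:
                link = os.path.join(target_dir, name)
                if os.path.lexists(link):
                    # Two dependencies provide the same file.
                    raise FileExistsError("conflict: %s" % link)
                os.symlink(os.path.join(dirpath, name), link)
```

The same loop could just as well be emitted as a shell script of ln -s commands and fed to BAC as an ordinary build.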
One may wish to have fake build artifacts for system software simply to make the build process simpler for depending packages. If a package needs perl to build, it could access $perl/bin/perl, and then one would need a fake perl build artifact if one wishes to use the system perl.
This should probably be implemented without any support in BAC. E.g., one can call perl -v and put the results in hash-payload or files, and then let the build script be a series of ln -s commands.
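A caller-side helper for this might generate a build specification along the following lines. This is a sketch under assumptions: `fake_system_artifact_spec` is a hypothetical name, and `$PREFIX` stands in for however the build script learns its install location, which the text does not define.

```python
def fake_system_artifact_spec(name: str, system_path: str, version_output: str) -> dict:
    """Build a spec whose 'build' only symlinks a system binary into place."""
    return {
        "name": name,
        "dependencies": {},
        "files": [
            {
                "filename": "build.sh",
                "contents": [
                    # $PREFIX is a placeholder for the artifact directory.
                    'mkdir -p "$PREFIX/bin"',
                    'ln -s %s "$PREFIX/bin/%s"' % (system_path, name),
                ],
            }
        ],
        "build_cmd": ["bash", "build.sh"],
        # The version banner goes into the hash, so that upgrading the
        # system tool changes the hash and invalidates the fake artifact.
        "hash-payload": version_output.splitlines(),
    }
```

The `version_output` argument would typically be the captured standard output of perl -v, obtained by the caller before writing the specification to disk.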