HashDist is a tool for building software stacks from source in a reproducible manner. The stacks are described in [YAML](https://yaml.org) and consist of a *profile* and all the *package definitions* required by the profile.

Proteus uses HashDist to build the software stack it depends on, so it ships its own copy, which requires only a Python interpreter to run:

In [4]:
!ls ../proteus/hashdist

bin			  doc	    LICENSE.txt  setup.py  tox.ini
distlib-CONTRIBUTORS.txt  hashdist  README.rst	 share


HashDist has a command-line interface (CLI) called `hit`:

In [6]:
!../proteus/hashdist/bin/hit -h

usage: hit [-h] [--config-file CONFIG_FILE] [--ipdb] [--log LOG]
           {bdir,build,build-postprocess,build-unpack-sources,build-whitelist,build-write-files,clearsources,cp,create-links,develop,fetch,fetchgit,gc,help,init-home,load,mv,purge,push,remote,rm,self-check,shell,show,skeleton-pypi,status,unpack}
           ...

Entry-point for various HashDist command-line tools

optional arguments:
  -h, --help            show this help message and exit
  --config-file CONFIG_FILE
                        Location of hashdist configuration file (default:
                        /home/cekees/.hashdist/config.yaml)
  --ipdb                Enable IPython debugging on error

subcommands:
  {bdir,build,build-postprocess,build-unpack-sources,build-whitelist,build-write-files,clearsources,cp,create-links,develop,fetch,fetchgit,gc,help,init-home,load,mv,purge,push,remote,rm,self-check,shell,show,skeleton-pypi,status,unpack}
    bdir                Create the build directory, ready 

Most users only need a few commands.

In [7]:
!cd ../proteus/stack && ../hashdist/bin/hit build default.yaml

[profile] Building profile/byicuf3i4yct, follow log with:
[profile]   tail -f /home/cekees/.hashdist/tmp/profile-byicuf3i4yct/_hashdist/build.log
Up to date, link at: default


In [8]:
!ls ../proteus/stack/default

artifact.json  blaze	   build.log.gz  etc  include  man
bin	       build.json  doc		 id   lib      share


`hit build` builds read-only stacks. If you want to install additional software into your profile, use `hit develop` instead:

In [10]:
!cd ../proteus/stack && ../hashdist/bin/hit develop default.yaml default_dev

[Profile dependencies are up to date]
[profile] Building profile/hrtwdodbu7tt, follow log with:
[profile]   tail -f /home/cekees/proteus/stack/default_dev/_hashdist/build.log
Development profile build /home/cekees/proteus/stack/default_dev successful


In [12]:
!PATH=../proteus/stack/default_dev/bin:$PATH pip install pandas

Collecting pandas
  Using cached pandas-0.21.0.tar.gz
Installing collected packages: pandas
  Running setup.py install for pandas ... done
Successfully installed pandas-0.21.0
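
The `PATH=...` prefix in the cell above is what makes `pip` resolve to the copy inside the development profile. When driving the profile from Python rather than from a shell, the same effect can be sketched as below. The helper name `env_with_profile` is hypothetical and not part of HashDist; it only builds an environment dictionary and does not run anything.

```python
import os

def env_with_profile(profile_dir, base_env=None):
    """Return a copy of the environment with <profile_dir>/bin first on PATH.

    Hypothetical helper for illustration; HashDist does not provide it.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Use an absolute path so the entry does not depend on the working directory.
    bin_dir = os.path.abspath(os.path.join(profile_dir, "bin"))
    old_path = env.get("PATH", "")
    env["PATH"] = bin_dir + (os.pathsep + old_path if old_path else "")
    return env

env = env_with_profile("../proteus/stack/default_dev")
print(env["PATH"].split(os.pathsep)[0])  # the profile's bin directory
```

On POSIX systems, passing this dictionary as `env=` to `subprocess.run(["pip", "install", "pandas"], env=env)` should then pick up the profile's `pip`, mirroring the shell cell.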


The default profile in default.yaml looks something like this:

```
# This profile file controls your <#> (HashDist) build environment.

# In the future, we'll provide better incorporation of
# automatic environment detection.  For now, have a look
# at the YAML files in the top-level directory and choose
# the most *specific* file that matches your environment.

extends:
- file: debian.yaml

# The packages list specifies all the packages that you
# require installed.  <#> will ensure that all packages
# and their dependencies are installed when you build this
# profile.

packages:
  recordtype:
  launcher:
  cmake:
```
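
To make the structure of the profile concrete, here is a rough sketch that lists the package names in a profile using only the Python standard library. It assumes the simple layout shown above: a top-level `packages:` key followed by indented `name:` entries. It is a naive line scanner, not a YAML parser, and the function name `list_profile_packages` is made up for this example; a real YAML library such as PyYAML would be the more robust choice if one is available in your environment.

```python
import re

# Same shape as the profile above, shortened; used only as demo input.
PROFILE = """\
extends:
- file: debian.yaml

# comment lines are ignored
packages:
  recordtype:
  launcher:
  cmake:
"""

def list_profile_packages(profile_text):
    """Return the names listed directly under the top-level `packages:` key.

    Naive line scanner (an assumption of this sketch), not a YAML parser.
    """
    names = []
    in_packages = False
    entry_indent = None  # indentation of the first entry under `packages:`
    for raw in profile_text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # drop comments and trailing spaces
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip())
        if indent == 0:
            # Any top-level line either opens or closes the packages section.
            in_packages = line == "packages:"
            entry_indent = None
            continue
        if not in_packages:
            continue
        if entry_indent is None:
            entry_indent = indent
        match = re.match(r"\s+([A-Za-z0-9_.+-]+):", line)
        # Deeper-indented lines are per-package settings, not package names.
        if match and indent == entry_indent:
            names.append(match.group(1))
    return names

print(list_profile_packages(PROFILE))
```

For the demo input this prints `['recordtype', 'launcher', 'cmake']`.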

Notice that `#` starts a comment and that the packages are simply listed as keys under `packages`. Each listed package must have a definition in the `pkgs` subdirectory of the profile. For example, `pkgs/pandas.yaml` contains:

```
extends: [setuptools_package]
dependencies:
  build: [numpy, python-dateutil, pytz]
  run: [numpy, python-dateutil, pytz]

sources:
  - url: https://pypi.python.org/packages/source/p/pandas/pandas-0.13.0.tar.gz
    key: tar.gz:loz7pjpsj7ucqdue4vah3qjjgzhhqjol
```

This structure describes the source code for the package, any dependencies, and the build rules (often just extended from some base package). There are some commands to help you populate this file for a new package. If the package is on the Python Package Index (PyPI), you can use `hit skeleton-pypi`:

In [15]:
!cd ../proteus/stack && ../hashdist/bin/hit skeleton-pypi pymor

In [18]:
!cat ../proteus/stack/pkgs/pymor.yaml

extends: [setuptools_package]

dependencies:
  build: []
  run: []

sources:
 - key: tar.gz:56s4imabx77kskp3ddxqmdyimgnjbdvh
   url: https://pypi.python.org/packages/34/38/b62853b58ccdb89118d8ac4d09a7aa6f2a49d588921725d1dde571457099/pymor-0.4.2.tar.gz


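Now you can add `pymor` to `default.yaml`, but be careful: the generated skeleton leaves the `build` and `run` dependency lists empty, so you may have to track down the package's dependencies and add them yourself.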
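
Tracking down dependencies partly means checking that every name in a package's `build` and `run` lists has a definition of its own. The sketch below illustrates one way to do that with the standard library. Both helper names are hypothetical, the regular expression only understands inline lists such as `build: [a, b]`, and it assumes a flat `pkgs/<name>.yaml` layout, which may not match how your stack organizes its definitions.

```python
import re
import tempfile
from pathlib import Path

def declared_dependencies(package_text):
    """Collect names from inline `build: [...]` and `run: [...]` lists."""
    deps = set()
    pattern = r"^[ \t]+(?:build|run):[ \t]*\[(.*?)\]"
    for match in re.finditer(pattern, package_text, re.MULTILINE):
        deps.update(n.strip() for n in match.group(1).split(",") if n.strip())
    return sorted(deps)

def missing_definitions(names, pkgs_dir):
    """Return the names with no `<name>.yaml` directly inside pkgs_dir."""
    pkgs_dir = Path(pkgs_dir)
    return [n for n in names if not (pkgs_dir / f"{n}.yaml").is_file()]

# Demo input copied from the dependencies block of pkgs/pandas.yaml above.
PANDAS_YAML = """\
extends: [setuptools_package]
dependencies:
  build: [numpy, python-dateutil, pytz]
  run: [numpy, python-dateutil, pytz]
"""

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in pkgs directory that defines only two of the three dependencies.
    for name in ("numpy", "pytz"):
        (Path(tmp) / f"{name}.yaml").write_text("extends: [setuptools_package]\n")
    deps = declared_dependencies(PANDAS_YAML)
    print(deps)
    print(missing_definitions(deps, tmp))
```

With the stand-in directory, the first line printed is `['numpy', 'python-dateutil', 'pytz']` and the second is `['python-dateutil']`. Pointing `missing_definitions` at your real `pkgs` directory would report the definitions you still need to write or generate.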

# Exercises
1. Build a development profile
2. Add a package from `pkgs` that is not originally in your profile and try rebuilding the profile
3. Try adding a new package from PyPI (i.e., one that is not already in `pkgs`)
4. Use `git` to see what commit of `hashstack` you are on (in `proteus/stack`)